HMV hmantovani
Data Engineering & BI | Real Case

Automated SAP data warehouse without native API

Overview

Built a complete data ecosystem that automates 40 complex SAP extraction loops, moving the organization from fragmented, manual Excel-based reporting to an automated, auditable infrastructure that feeds real-time BI and executive dashboards.

The Challenge

Enterprise IT constraints and the absence of native API access required teams to rely on manual SAP report extraction across dozens of workflows, supported by fragmented Excel logic and inconsistent reporting structures. This created operational bottlenecks, reduced auditability, and consumed significant weekly effort to maintain reliable data outputs.

The Solution

Developed a Python automation layer with Selenium and PyAutoGUI, running on a dedicated VM, that simulated human interaction with the SAP system throughout the day. I architected an "always-ready" data warehouse that maintained historical daily and monthly archives alongside a root-level "latest version" for instant Python and Power BI consumption. I also implemented a monitoring layer that validated 30–50 variables per loop and sent bug reports to Microsoft Teams in real time to protect data integrity.
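The archive-plus-latest layout can be illustrated with a short sketch. This is not the production code: the folder names, the `publish_extract` helper, and the choice to overwrite the monthly file on every run (so it ends up holding the last extract of the month) are all assumptions made for illustration.

```python
import shutil
from datetime import date
from pathlib import Path


def publish_extract(source: Path, warehouse: Path, report: str, run_date: date) -> dict:
    """Copy one extracted report into the archives and refresh the latest copy.

    Hypothetical layout (an assumption, not the original structure):
      <warehouse>/archive/daily/<report>/YYYY-MM-DD.<ext>
      <warehouse>/archive/monthly/<report>/YYYY-MM.<ext>
      <warehouse>/<report>.<ext>   <- root-level "latest version"
    """
    suffix = source.suffix
    targets = {
        "daily": warehouse / "archive" / "daily" / report / f"{run_date:%Y-%m-%d}{suffix}",
        "monthly": warehouse / "archive" / "monthly" / report / f"{run_date:%Y-%m}{suffix}",
        "latest": warehouse / f"{report}{suffix}",
    }
    for target in targets.values():
        target.parent.mkdir(parents=True, exist_ok=True)
        # Copy to a temporary name first, then rename over the target so a
        # reader opening the file never sees a half-written copy.
        tmp = target.with_name(target.name + ".tmp")
        shutil.copy2(source, tmp)
        tmp.replace(target)
    return targets
```

Under these assumptions, a BI tool or script always reads the same root-level path (for example `<warehouse>/stock.csv`), while the dated copies remain available for audits.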

Results & Impact

Eliminated waiting times for data extraction, replacing 15-minute Excel "freezes" during report refreshes with near-instant updates. The system also improved data reliability by flagging suspected human input errors in real time, and it established a unified, 100% auditable data layer for company-wide reporting.
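A minimal sketch of how per-loop validation and a Teams alert could fit together is shown below. The check names, the `Check` and `run_checks` helpers, and the metric keys are hypothetical. The sketch also assumes a Teams incoming webhook that accepts a simple JSON body with a `text` field; the actual notification mechanism is not described here.

```python
import json
import urllib.request
from dataclasses import dataclass
from typing import Callable


@dataclass
class Check:
    """One validation rule applied to the metrics of an extraction loop."""
    name: str
    passed: Callable[[dict], bool]


def run_checks(metrics: dict, checks: list) -> list:
    """Return the names of all checks that fail for the given metrics."""
    failures = []
    for check in checks:
        try:
            ok = check.passed(metrics)
        except Exception:
            # A missing key or an unexpected type counts as a failed check.
            ok = False
        if not ok:
            failures.append(check.name)
    return failures


def build_teams_payload(loop_name: str, failures: list) -> dict:
    """Build a plain-text message body (assumed webhook format)."""
    lines = "\n".join(f"- {name}" for name in failures)
    return {"text": f"Extraction loop '{loop_name}' failed {len(failures)} check(s):\n{lines}"}


def send_to_teams(webhook_url: str, payload: dict) -> int:
    """POST the payload to a webhook URL supplied by the caller."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

A loop would compute its metrics after each extraction, call `run_checks`, and call `send_to_teams` only when the returned list is non-empty, so a clean run produces no message.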

Tech Stack

Python, Selenium, PyAutoGUI, ETL, SAP, Automation, Data Governance
GitHub — Coming Soon