A Strategic Guide for Data-Driven Process Improvement
A comprehensive guide to leveraging data for effective process improvement and business transformation.
The Bottom Line
Data extraction shouldn’t take longer than getting insights. This article explains why we deliberately chose not to use out-of-the-box connectors, and how our data template approach gets you to value faster.
Looking for data templates? Check our continuous process improvement guides with ready-to-use templates for processes like Purchase to Pay, Order to Cash, and Accounts Payable.
“Data is 80% of the work in process mining.”
You’ve probably heard this claim. We’ve certainly lived it. But here’s the uncomfortable truth: data being 80% of the work is not a natural law—it’s a symptom of doing things wrong.
When data extraction becomes a months-long project, something has gone off track. The goal of process mining isn’t perfect data pipelines. It’s insights. It’s discovering that invoices sit untouched for 12 days before anyone looks at them. It’s finding that 40% of orders require manual intervention because of a misconfigured system.
Data is a means to an end, not the end itself.
So why do so many process mining projects get stuck in the data phase? Often, it comes down to one decision: choosing out-of-the-box connectors.
Let’s be fair. The promise is compelling.
Out-of-the-box connectors are pre-built integrations that claim to extract process data from systems like SAP, Salesforce, ServiceNow, or Oracle. The pitch goes something like this: point the connector at your system, and clean process data flows out automatically, with first insights in a matter of weeks.
In an ideal world, this would be transformative. And to be honest, we’ve seen it work—in demos. Sometimes even in simple production environments with vanilla system configurations.
But here’s the thing: your company is not a demo environment.
We’ve spent years helping organizations with process mining. We’ve seen what works and what doesn’t. And time after time, we’ve watched out-of-the-box connectors create more problems than they solve.
One of our clients started their process mining journey excited about a vendor’s SAP connector. “Three weeks to first insights,” they were promised.
Eighteen months later, they were still debugging data quality issues.
This isn’t an anomaly. We’ve seen connector projects drag on for over a year, consuming budgets and enthusiasm in equal measure. The “simple” setup becomes anything but simple when reality meets the sales pitch.
Your system isn’t standard. Connectors are built for specific versions of specific software. But companies customize. They add fields, modify workflows, change table structures. That ERP system you’re running? It’s probably been evolving for a decade. The connector expects a textbook installation; you have a living, breathing system shaped by years of business decisions.
Connectors try to solve everything. A connector designed to handle every possible process mining use case becomes incredibly complex. It needs to support edge cases you’ll never encounter while potentially not supporting the specific analysis you actually need. This one-size-fits-all approach means you’re fighting through complexity that adds no value to your project.
The multi-table maze. Most connectors work with multiple source tables, which sounds powerful—and it is. But setting up and maintaining these table relationships is complex. You need to understand not just your business process, but the connector’s data model, the source system’s data model, and how they’re supposed to map together. That’s a lot of models.
Black boxes breed mistrust. When something goes wrong with a connector (and something always goes wrong), you can’t see inside to understand why. Is the data wrong in the source system? Is the connector transforming it incorrectly? Is there a configuration issue? You’re debugging blind, which is frustrating for everyone involved—and makes it nearly impossible to explain results to stakeholders.
Duplicate work. Here’s an irony: many organizations already have clean, prepared data in their data warehouses or lakes. Your data engineers have spent years building pipelines, resolving data quality issues, and creating reliable datasets. A connector ignores all that work and starts from scratch, potentially introducing inconsistencies with the data your organization already trusts.
Generic names that confuse everyone. Connectors use standardized terminology that may not match how your organization talks about its processes. When the connector calls something “Order Confirmation” but your team calls it “Sales Acknowledgment,” you’ve created a translation layer that slows down every conversation.
The ETL trap. Most connectors force you into the vendor’s ETL tooling. That means learning proprietary systems, depending on vendor-specific features, and building expertise that doesn’t transfer to other tools. Meanwhile, your company probably already has ETL infrastructure and people who know how to use it.
Vendor lock-in by design. Let’s be direct: complex, proprietary data pipelines are a feature, not a bug, from the vendor’s perspective. Once you’ve invested six months building a connector-based infrastructure, switching to a different tool becomes enormously expensive. That’s not an accident.
Beyond the technical challenges, there are real costs to the connector approach: consulting fees, proprietary tooling to learn and maintain, and months of budget and enthusiasm consumed before a single insight appears.
We took a different path. Instead of building connectors, we created data templates.
The philosophy is simple: prescribe what data is needed in a format that’s close to your process and easy to generate.
Yes, really. You can start with Excel.
New to event logs? Our guide on how to create a process mining event log walks you through building your first event log step by step, with examples in both Excel and SQL.
Our data templates define exactly what columns you need for each type of analysis. For many processes, that’s a Case ID, a Timestamp, and an Activity name—plus whatever additional attributes matter to your business.
You can create this yourself. Today. No waiting for IT. No procurement process. No consultant engagement. Export some data from your system, arrange it in our template format, upload it, and you’re analyzing your process.
Is it perfect? No. Is it fast? Absolutely. And fast matters more than perfect when you’re trying to understand your process.
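As a sketch of how small such a dataset can be, here is a minimal single-table event log built and exported with pandas. The column names and invoice data are illustrative assumptions, not ProcessMind’s required headers:

```python
import pandas as pd

# Hypothetical single-table event log: one row per activity occurrence,
# identified by a case ID, an activity name, and a timestamp.
events = pd.DataFrame({
    "case_id":  ["INV-001", "INV-001", "INV-001", "INV-002", "INV-002"],
    "activity": ["Invoice Received", "Invoice Approved", "Payment Sent",
                 "Invoice Received", "Invoice Approved"],
    "timestamp": pd.to_datetime([
        "2024-03-01 09:00", "2024-03-13 14:30", "2024-03-15 10:00",
        "2024-03-02 11:15", "2024-03-04 16:45",
    ]),
})

# Sort so each case reads as an ordered trace, then export for upload.
events = events.sort_values(["case_id", "timestamp"]).reset_index(drop=True)
events.to_csv("event_log.csv", index=False)
```

An export like this, whether it comes from Excel, a SQL query, or a quick script, is already enough to start analyzing a process.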
We deliberately use a single-table format for most analyses. This seems limiting until you understand the reasoning: a single table is easy to generate, easy to validate, and easy to explain to stakeholders.
You can still work with multiple tables in ProcessMind—different perspectives, different processes, even object-centric process mining approaches. But you don’t have to start there. Start simple, add complexity only when it delivers value.
Here’s something we’ve learned: the connector approach often bypasses your data engineers. That’s a mistake.
Your data team knows your systems. They know where the data quality issues are. They know which fields to trust and which to verify. They’ve probably already solved many of the problems a connector would encounter.
Our data templates create a common language. Hand the template to your data engineer and they’ll immediately understand what’s needed. They can use the tools they already know—SQL, their existing ETL platform, their data warehouse—to generate exactly the data required.
No new systems to learn. No proprietary vendor training. Just clear requirements that experienced data people can meet quickly.
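As an illustration of how a data engineer might meet such a template with plain SQL, here is a sketch that flattens two source tables into the single-table shape. The table and column names are invented for the example and do not come from any real ERP schema:

```python
import sqlite3

# Two hypothetical source tables: an order header table and a
# status-change table, as a data engineer might find in a warehouse.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE orders (order_id TEXT, created_at TEXT);
    CREATE TABLE order_status (order_id TEXT, status TEXT, changed_at TEXT);
    INSERT INTO orders VALUES ('SO-1', '2024-03-01 09:00');
    INSERT INTO order_status VALUES
        ('SO-1', 'Confirmed', '2024-03-02 10:00'),
        ('SO-1', 'Shipped',   '2024-03-05 08:30');
""")

# One UNION ALL flattens both tables into the case/activity/timestamp shape.
rows = con.execute("""
    SELECT order_id AS case_id, 'Order Created' AS activity,
           created_at AS timestamp
    FROM orders
    UNION ALL
    SELECT order_id, status, changed_at FROM order_status
    ORDER BY case_id, timestamp
""").fetchall()
```

The same query runs unchanged on the warehouse and ETL tools your team already uses; nothing here is vendor-specific.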
Most organizations already have significant data infrastructure: data warehouses, data lakes, ETL pipelines, and engineers who know how to run them.
The template approach lets you leverage all of this. Don’t rebuild what already exists. Don’t create parallel data pipelines. Use the investments you’ve already made.
This also means the data you prepare for process mining can be reused elsewhere. Build a dataset once, use it for process mining, machine learning, traditional BI, and whatever comes next. That’s efficiency.
Even with clear templates, we know data doesn’t always arrive perfectly formatted. That’s why we’ve built AI-powered data mapping into ProcessMind.
Upload your data, and our system will often understand it automatically—recognizing which column is the case ID, which is the timestamp, which contains activity names. If something doesn’t map correctly, you can adjust it manually with a few clicks.
The goal is removing friction between you and insights.
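To make the idea concrete, here is a deliberately naive sketch of column recognition. It is an illustration of the principle only, not ProcessMind’s actual AI mapping, and every name in it is hypothetical:

```python
import pandas as pd

def guess_mapping(df: pd.DataFrame) -> dict:
    """Naive heuristic: guess which columns hold the timestamp,
    the case ID, and the activity name."""
    mapping = {}
    for col in df.columns:
        name = col.lower()
        # A column that parses entirely as dates is likely the timestamp.
        if mapping.get("timestamp") is None and pd.to_datetime(
                df[col], errors="coerce").notna().all():
            mapping["timestamp"] = col
        # A column whose name mentions "case" or "id" is likely the case ID.
        elif "case" in name or "id" in name:
            mapping.setdefault("case_id", col)
    # Whatever column remains is most likely the activity name.
    leftover = [c for c in df.columns if c not in mapping.values()]
    if leftover:
        mapping["activity"] = leftover[0]
    return mapping
```

A real mapper needs far more signals than this, but the user experience is the same: upload, confirm the guessed mapping, adjust with a few clicks if needed.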

BPMN-Based Process Mining: A Data Quality Advantage
Not all process mining approaches are equal when it comes to data requirements.
Traditional, pure data-driven process mining tools need to infer everything about your process from the event log. Every gateway, every decision point, every parallel path must be encoded in the data. If your data has gaps or imperfections, the algorithm struggles—or produces misleading results.
BPMN-based process mining works differently. Because the process structure is defined in a model, the tool can handle gaps in the data more gracefully. Missing events don’t necessarily break the analysis. The model provides context that pure data approaches lack.
This is one reason we built ProcessMind around BPMN modeling. Real-world data is messy. Your process mining tool should work with that reality, not against it.
Daily Updates vs. Smart Updates
“Real-time data updates” sounds impressive in a sales presentation. But consider what daily updates actually mean for analysis: your baseline shifts with every refresh, so before-and-after comparisons never hold still.
For most process mining use cases, stable datasets analyzed periodically work better. Run your analysis monthly or quarterly. Establish clear comparison points. Make changes and measure impact against a fixed baseline.
Update your data when it makes sense for your analysis cycle, not because the technology allows constant refreshes. Focus your effort on insights and improvements, not pipeline maintenance.
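A sketch of what a fixed-baseline comparison looks like in practice, using an invented event log and pandas (the case IDs, dates, and durations are all illustrative):

```python
import pandas as pd

# Hypothetical event log: two cases from a baseline quarter,
# two from a follow-up quarter after a process change.
events = pd.DataFrame({
    "case_id": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "timestamp": pd.to_datetime([
        "2024-01-02", "2024-01-14",   # baseline case A: 12 days
        "2024-01-05", "2024-01-15",   # baseline case B: 10 days
        "2024-04-01", "2024-04-07",   # follow-up case C:  6 days
        "2024-04-03", "2024-04-11",   # follow-up case D:  8 days
    ]),
})

# Cycle time per case: last event minus first event, in days.
durations = (events.groupby("case_id")["timestamp"]
             .agg(lambda ts: (ts.max() - ts.min()).days))

baseline = durations[["A", "B"]].mean()
followup = durations[["C", "D"]].mean()
improvement = baseline - followup  # measured against a fixed baseline
```

Because the baseline dataset is frozen, the improvement number means something; with constantly refreshing data, both sides of the comparison would keep moving.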
Here’s how we recommend approaching process mining data:
Define your goal first. What process question are you trying to answer? What improvement are you hoping to find?
Identify available data. What’s already in your data warehouse? What can you export from systems today? Start with what’s accessible.
Use our templates. Download the appropriate template from our continuous process improvement guides. The format is simple and documented.
Start in Excel. Export some data, format it to match the template, and upload. You can have insights in an hour, not in months.
Iterate. Your first dataset won’t be perfect. That’s fine. Learn what’s missing, improve the data, and run again. Each cycle takes days, not months.
Automate later. Once you know exactly what data you need and have proven the value of the analysis, then consider automation. Work with your data team to build a sustainable pipeline using tools they already know.
Keep it simple. Resist the temptation to add complexity before you need it. Every additional data source, every extra transformation, is maintenance burden and potential failure point.
Technology isn’t the answer to process mining success. The vendors selling complex connector infrastructure want you to believe otherwise, but our experience tells a different story.
What actually matters: clear questions, data you can access today, fast iteration, and a relentless focus on business outcomes.
The organizations that succeed with process mining are rarely the ones with the most sophisticated data infrastructure. They’re the ones who stay focused on business outcomes and don’t let data extraction become a goal unto itself.
We’ve built ProcessMind around these principles. Simple data requirements. Fast time to value. You stay in control.
Explore our continuous process improvement guides to find data templates for common processes. Each guide includes a ready-to-use template and step-by-step instructions for processes like Purchase to Pay, Order to Cash, and Accounts Payable.
Or just start a free trial and upload some data. You might be surprised how quickly insights appear when you’re not waiting for a connector to be configured.
Data isn’t the goal. Understanding your process is. Let’s get there faster.