How to Create a Process Mining Event Log

What You'll Learn

In this guide, you’ll learn how to create a process mining event log from scratch. We’ll cover the three essential columns every event log needs, walk through a real-world example, and show you how to build your first event log using both Excel and SQL.

What is a Process Mining Event Log?


A process mining event log is simply a table of data that records what happened in your business process. Think of it as a diary that tracks every step of every case that flows through your system. Process mining software reads this diary and creates visual diagrams showing how your process actually works.

Every event log needs three essential pieces of information for each row:

| Column | What It Means | Example |
|---|---|---|
| Case ID | A unique identifier that groups related events together | Order #12345 |
| Timestamp | When the event happened | 2025-01-15 09:30:00 |
| Activity | What happened | “Order Placed” |

That’s it. With just these three columns, you can start doing process mining. Everything else, such as customer names, order values, or employee IDs, is an optional extra called an “attribute” that makes your analysis richer.
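To make the structure concrete, here is a minimal event log as a Python sketch. The order IDs, timestamps, and activities are made up for illustration:

```python
# A minimal event log: one row per event, with the three required columns.
# All values here are illustrative, not real data.
event_log = [
    {"case_id": "12345", "timestamp": "2025-01-15 09:30:00", "activity": "Order Placed"},
    {"case_id": "12345", "timestamp": "2025-01-15 09:31:10", "activity": "Payment Received"},
    {"case_id": "12346", "timestamp": "2025-01-15 09:45:00", "activity": "Order Placed"},
]

# Each unique case_id is one case; each row is one event in that case.
cases = {row["case_id"] for row in event_log}
print(f"{len(event_log)} events across {len(cases)} cases")
```

That single flat list is all a process mining tool needs to reconstruct the flow of each case.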

Understanding the Difference: Events vs Activities

Before we dive deeper, let’s clear up a common point of confusion.

An activity is a type of action, like “Order Shipped” or “Payment Received.” Think of it as a category or label.

An event is a specific occurrence of that activity. When Order #12345 gets shipped on January 15th at 2:30 PM, that’s an event.

Your event log contains events, but each event has an activity name. In practice, people often use these terms interchangeably, and that’s okay. Just remember: activities are the “what” and events are the “when it happened to whom.”

Our Example: The Pizza Palace Order System

To make this guide practical, let’s work with a fictional system. Imagine you manage Pizza Palace, a local pizzeria with an online ordering system. Customers place orders through the website, staff prepare the pizzas, and drivers deliver them.

The Pizza Palace system has several database tables that track different parts of the order process:

  • orders - Basic order information (order ID, customer, order time)
  • order_items - What was ordered (pizzas, sides, drinks)
  • kitchen_queue - When orders enter and leave the kitchen
  • delivery_assignments - Driver assignments and delivery tracking
  • payments - Payment processing records

Your goal: create an event log that shows the complete journey of each order from placement to delivery.

Types of Events: Direct vs Inferred

When building an event log, you’ll encounter two types of events:

Direct Events

Direct events are recorded explicitly in your system. Someone clicked a button or the system logged an action, and there’s a timestamp right there in the database.

Examples from Pizza Palace:

  • Order placed (timestamp in the orders table)
  • Payment received (timestamp in the payments table)
  • Delivery completed (timestamp in the delivery_assignments table)

Inferred Events

Inferred events don’t have their own timestamp, but you can figure out when they happened based on other data.

Examples from Pizza Palace:

  • “Order Assigned to Driver” might not have its own timestamp, but the delivery_assignments table has a created_at field that tells you when the assignment was made
  • “Pizza Ready” might be inferred from when the kitchen queue status changed to “completed”

The key difference: direct events are explicitly recorded, while inferred events require you to interpret other data fields. Both are valid and useful for process mining.

Planning Your Event Log

Before extracting data, plan which events you want to capture. For Pizza Palace, let’s track these activities:

  1. Order Placed - Customer submits the order
  2. Payment Received - Payment is processed successfully
  3. Order Sent to Kitchen - Order enters the preparation queue
  4. Order Ready - Kitchen marks the order as complete
  5. Assigned to Driver - A driver is assigned for delivery
  6. Delivery Completed - Order is delivered to the customer

For each event, identify:

  • Which table contains the data
  • Which field provides the timestamp
  • What the Case ID is (in our case, the order ID)

Here’s our mapping:

| Activity | Source Table | Timestamp Field | Case ID Field |
|---|---|---|---|
| Order Placed | orders | created_at | id |
| Payment Received | payments | payment_time | order_id |
| Order Sent to Kitchen | kitchen_queue | queue_entry_time | order_id |
| Order Ready | kitchen_queue | completed_time | order_id |
| Assigned to Driver | delivery_assignments | assigned_at | order_id |
| Delivery Completed | delivery_assignments | delivered_at | order_id |
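A mapping like this can also be captured as a small data structure that drives your extraction scripts. This sketch uses the fictional Pizza Palace table and field names from the mapping above:

```python
# The activity-to-source mapping, as data that an extraction script can loop over.
# Table and field names are the fictional Pizza Palace ones.
EVENT_MAPPING = [
    {"activity": "Order Placed",          "table": "orders",               "timestamp_field": "created_at",       "case_id_field": "id"},
    {"activity": "Payment Received",      "table": "payments",             "timestamp_field": "payment_time",     "case_id_field": "order_id"},
    {"activity": "Order Sent to Kitchen", "table": "kitchen_queue",        "timestamp_field": "queue_entry_time", "case_id_field": "order_id"},
    {"activity": "Order Ready",           "table": "kitchen_queue",        "timestamp_field": "completed_time",   "case_id_field": "order_id"},
    {"activity": "Assigned to Driver",    "table": "delivery_assignments", "timestamp_field": "assigned_at",      "case_id_field": "order_id"},
    {"activity": "Delivery Completed",    "table": "delivery_assignments", "timestamp_field": "delivered_at",     "case_id_field": "order_id"},
]

# Print the plan: one line per event type.
for m in EVENT_MAPPING:
    print(f'{m["activity"]}: {m["table"]}.{m["timestamp_field"]}')
```

Keeping the mapping as data makes it easy to add or remove event types later without rewriting your extraction logic.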

Adding Case and Event Attributes

While Case ID, Timestamp, and Activity are required, attributes make your analysis much more powerful. Attributes are additional columns that provide context.

Case Attributes

Case attributes describe the entire case (order) and are the same for all events in that case:

  • Customer name
  • Order total value
  • Delivery address
  • Number of items ordered

Event Attributes

Event attributes are specific to each event:

  • Driver name (only relevant for delivery events)
  • Payment method (only relevant for payment events)
  • Kitchen station (only relevant for kitchen events)

Pro tip: It’s perfectly fine to include all attributes in every row, even if some don’t apply to certain events. For example, your “Order Placed” row can have a “Driver Name” column that’s empty. This keeps your event log in a simple, flat table format that process mining tools love.

Building the Event Log: The Simple Structure

Your final event log should be one table where each row represents one event. Here’s what our Pizza Palace event log will look like:

| Case ID | Timestamp | Activity | Customer | Order Value | Driver | Payment Method |
|---|---|---|---|---|---|---|
| 1001 | 2025-01-15 18:30:00 | Order Placed | John Smith | 45.99 | | |
| 1001 | 2025-01-15 18:30:15 | Payment Received | John Smith | 45.99 | | Credit Card |
| 1001 | 2025-01-15 18:31:00 | Order Sent to Kitchen | John Smith | 45.99 | | |
| 1001 | 2025-01-15 18:45:00 | Order Ready | John Smith | 45.99 | | |
| 1001 | 2025-01-15 18:46:00 | Assigned to Driver | John Smith | 45.99 | Maria Garcia | |
| 1001 | 2025-01-15 19:05:00 | Delivery Completed | John Smith | 45.99 | Maria Garcia | |
| 1002 | 2025-01-15 18:35:00 | Order Placed | Jane Doe | 28.50 | | |
| 1002 | 2025-01-15 18:35:20 | Payment Received | Jane Doe | 28.50 | | PayPal |

Notice how case attributes (Customer, Order Value) repeat for every event in the same case. This duplication is intentional and makes the data easy to work with.
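Repeating the case attributes onto every event is a simple join. Here is a Python sketch with the same illustrative Pizza Palace data:

```python
# Case attributes live once per case; copy them onto every event row
# so the final log is one flat table. Data here is illustrative.
case_attributes = {
    "1001": {"customer": "John Smith", "order_value": 45.99},
    "1002": {"customer": "Jane Doe",   "order_value": 28.50},
}
events = [
    {"case_id": "1001", "timestamp": "2025-01-15 18:30:00", "activity": "Order Placed"},
    {"case_id": "1002", "timestamp": "2025-01-15 18:35:00", "activity": "Order Placed"},
]

# Merge each event with its case's attributes into one flat row.
flat_log = [{**event, **case_attributes[event["case_id"]]} for event in events]
print(flat_log[0])
```

In SQL the same effect comes from joining each event query back to the orders table, as shown in the query later in this guide.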

Method 1: Creating an Event Log in Excel

If you can export your data to spreadsheets, you can build an event log manually. This method works great for small datasets or when you’re learning.

Step 1: Export Each Event Type to a Separate Sheet

Create one worksheet per activity type:

Sheet 1: Order Placed

| Case ID | Timestamp | Activity | Customer | Order Value |
|---|---|---|---|---|
| 1001 | 2025-01-15 18:30:00 | Order Placed | John Smith | 45.99 |
| 1002 | 2025-01-15 18:35:00 | Order Placed | Jane Doe | 28.50 |

Sheet 2: Payment Received

| Case ID | Timestamp | Activity | Customer | Order Value | Payment Method |
|---|---|---|---|---|---|
| 1001 | 2025-01-15 18:30:15 | Payment Received | John Smith | 45.99 | Credit Card |
| 1002 | 2025-01-15 18:35:20 | Payment Received | Jane Doe | 28.50 | PayPal |

Step 2: Standardize the Columns

Make sure all sheets have the same columns in the same order. Add empty columns where needed:

Sheet 1: Order Placed (updated)

| Case ID | Timestamp | Activity | Customer | Order Value | Driver | Payment Method |
|---|---|---|---|---|---|---|
| 1001 | 2025-01-15 18:30:00 | Order Placed | John Smith | 45.99 | | |

Step 3: Combine All Sheets

Create a new “Event Log” sheet. Copy and paste all rows from each activity sheet into this combined sheet, one after another.

Step 4: Sort by Case ID, Then Timestamp

Select all your data and sort by:

  1. Case ID (ascending)
  2. Timestamp (ascending)

This puts events in chronological order within each case, making it easy to follow the journey of each order.
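If you ever script this step instead of sorting in Excel, the same two-level sort is a one-liner. ISO-formatted timestamp strings (YYYY-MM-DD HH:MM:SS) sort correctly as plain strings, which is one reason that format is recommended below:

```python
# Sort a combined log by case, then time. The rows are illustrative.
events = [
    {"case_id": "1002", "timestamp": "2025-01-15 18:35:00", "activity": "Order Placed"},
    {"case_id": "1001", "timestamp": "2025-01-15 18:45:00", "activity": "Order Ready"},
    {"case_id": "1001", "timestamp": "2025-01-15 18:30:00", "activity": "Order Placed"},
]

# Two-level sort: Case ID ascending, then Timestamp ascending within each case.
events.sort(key=lambda e: (e["case_id"], e["timestamp"]))

for e in events:
    print(e["case_id"], e["timestamp"], e["activity"])
```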

Step 5: Export to CSV

Save your combined sheet as a CSV file. This format works with virtually every process mining tool.

Excel Tips:

  • Use VLOOKUP or XLOOKUP to pull in case attributes (like customer name) from your orders sheet
  • Use consistent date formats (YYYY-MM-DD HH:MM:SS works best)
  • Remove any duplicate events before exporting

Method 2: Creating an Event Log with SQL

For larger datasets or regular extractions, SQL is more efficient and repeatable. The key technique is using UNION ALL to combine multiple queries into one result set.

Understanding UNION ALL

UNION ALL stacks the results of multiple SELECT statements on top of each other. Each SELECT becomes a set of rows in your final result. All SELECTs must have the same number of columns with compatible data types.
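You can see the stacking behavior with a tiny in-memory SQLite database. The tables here are simplified two-column stand-ins, not the full Pizza Palace schema:

```python
import sqlite3

# A toy in-memory database to demonstrate how UNION ALL stacks SELECTs.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE orders   (id INTEGER, created_at TEXT);
    CREATE TABLE payments (order_id INTEGER, payment_time TEXT);
    INSERT INTO orders   VALUES (1001, '2025-01-15 18:30:00');
    INSERT INTO payments VALUES (1001, '2025-01-15 18:30:15');
""")

# Two SELECTs with matching columns, stacked into one result set.
rows = con.execute("""
    SELECT id       AS case_id, created_at   AS ts, 'Order Placed'     AS activity FROM orders
    UNION ALL
    SELECT order_id AS case_id, payment_time AS ts, 'Payment Received' AS activity FROM payments
    ORDER BY case_id, ts
""").fetchall()

for row in rows:
    print(row)
```

Note that the column names of the combined result come from the first SELECT, which is why the full query below aliases every column consistently in each block.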

Complete SQL Example

Here’s a SQL query that creates a Pizza Palace event log:

-- Event Log Extraction for Pizza Palace
-- This query combines multiple event types into a single event log
-- Each SELECT block represents one activity type

-- Event 1: Order Placed
-- Source: orders table
-- This captures when customers submit their orders
SELECT 
    o.id AS case_id,                          -- The order ID is our case identifier
    o.created_at AS timestamp,                -- When the order was placed
    'Order Placed' AS activity,               -- The activity name (hardcoded)
    o.customer_name AS customer,              -- Case attribute: who ordered
    o.total_amount AS order_value,            -- Case attribute: order value
    NULL AS driver,                           -- Not applicable for this event
    NULL AS payment_method                    -- Not applicable for this event
FROM orders o
WHERE o.created_at >= '2025-01-01'            -- Filter to your desired date range

UNION ALL

-- Event 2: Payment Received
-- Source: payments table
-- This captures successful payment processing
SELECT 
    p.order_id AS case_id,
    p.payment_time AS timestamp,
    'Payment Received' AS activity,
    o.customer_name AS customer,              -- Join to get case attributes
    o.total_amount AS order_value,
    NULL AS driver,
    p.payment_method AS payment_method        -- Event-specific attribute
FROM payments p
JOIN orders o ON p.order_id = o.id            -- Join to get order details
WHERE p.payment_time >= '2025-01-01'
  AND p.status = 'successful'                 -- Only include successful payments

UNION ALL

-- Event 3: Order Sent to Kitchen
-- Source: kitchen_queue table
-- This captures when the kitchen starts working on the order
SELECT 
    k.order_id AS case_id,
    k.queue_entry_time AS timestamp,
    'Order Sent to Kitchen' AS activity,
    o.customer_name AS customer,
    o.total_amount AS order_value,
    NULL AS driver,
    NULL AS payment_method
FROM kitchen_queue k
JOIN orders o ON k.order_id = o.id
WHERE k.queue_entry_time >= '2025-01-01'

UNION ALL

-- Event 4: Order Ready
-- Source: kitchen_queue table (different timestamp field)
-- This is an inferred event based on when the kitchen marked it complete
SELECT 
    k.order_id AS case_id,
    k.completed_time AS timestamp,            -- Different timestamp than entry
    'Order Ready' AS activity,
    o.customer_name AS customer,
    o.total_amount AS order_value,
    NULL AS driver,
    NULL AS payment_method
FROM kitchen_queue k
JOIN orders o ON k.order_id = o.id
WHERE k.completed_time >= '2025-01-01'
  AND k.completed_time IS NOT NULL            -- Only include completed orders

UNION ALL

-- Event 5: Assigned to Driver
-- Source: delivery_assignments table
-- This captures when a driver is assigned to deliver the order
SELECT 
    d.order_id AS case_id,
    d.assigned_at AS timestamp,
    'Assigned to Driver' AS activity,
    o.customer_name AS customer,
    o.total_amount AS order_value,
    d.driver_name AS driver,                  -- Event-specific attribute
    NULL AS payment_method
FROM delivery_assignments d
JOIN orders o ON d.order_id = o.id
WHERE d.assigned_at >= '2025-01-01'

UNION ALL

-- Event 6: Delivery Completed
-- Source: delivery_assignments table (different timestamp field)
-- This captures when the order was delivered to the customer
SELECT 
    d.order_id AS case_id,
    d.delivered_at AS timestamp,
    'Delivery Completed' AS activity,
    o.customer_name AS customer,
    o.total_amount AS order_value,
    d.driver_name AS driver,
    NULL AS payment_method
FROM delivery_assignments d
JOIN orders o ON d.order_id = o.id
WHERE d.delivered_at >= '2025-01-01'
  AND d.delivered_at IS NOT NULL              -- Only include completed deliveries

-- Final ordering: by case, then by time
-- This makes the event log easy to read and follow
ORDER BY case_id, timestamp;

How to Extend This Query

To add more events to your log:

  1. Copy one of the SELECT blocks as a template
  2. Change the table name to your source table
  3. Update the timestamp field to the correct column
  4. Change the activity name to describe the event
  5. Adjust the attributes as needed
  6. Add appropriate WHERE conditions to filter the data

For example, to add a “Delivery Attempted” event:

UNION ALL

-- Event 7: Delivery Attempted
-- Add this to track failed delivery attempts
SELECT 
    d.order_id AS case_id,
    d.attempt_time AS timestamp,
    'Delivery Attempted' AS activity,
    o.customer_name AS customer,
    o.total_amount AS order_value,
    d.driver_name AS driver,
    NULL AS payment_method
FROM delivery_attempts d
JOIN orders o ON d.order_id = o.id
WHERE d.attempt_time >= '2025-01-01'

Best Practices for Event Log Creation

1. Start Simple, Add Complexity Later

Begin with the three required columns and just a few key activities. Once you’ve successfully created a basic event log and loaded it into a process mining tool, you can go back and add more events and attributes.

2. Validate Your Data

Before diving into analysis, check your event log for common issues:

  • Missing timestamps - Events without timestamps break process mining
  • Duplicate events - The same event recorded twice will skew your results
  • Out-of-order events - An “Order Ready” before “Order Placed” indicates data quality issues
  • Orphan events - Events with case IDs that don’t appear in other activities
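The first two checks are easy to automate. This sketch runs them over a small sample log that contains deliberate problems:

```python
from collections import Counter

# Basic sanity checks on an event log before analysis.
# The sample data deliberately includes a duplicate and a missing timestamp.
log = [
    {"case_id": "1001", "timestamp": "2025-01-15 18:30:00", "activity": "Order Placed"},
    {"case_id": "1001", "timestamp": "2025-01-15 18:30:00", "activity": "Order Placed"},  # duplicate
    {"case_id": "1002", "timestamp": None,                  "activity": "Order Placed"},  # missing timestamp
]

# Check 1: events without timestamps.
missing_ts = [e for e in log if not e["timestamp"]]

# Check 2: the same (case, timestamp, activity) recorded more than once.
counts = Counter((e["case_id"], e["timestamp"], e["activity"]) for e in log)
duplicates = [key for key, n in counts.items() if n > 1]

print(f"{len(missing_ts)} event(s) with missing timestamps")
print(f"{len(duplicates)} duplicated event(s)")
```

Out-of-order and orphan events need per-case logic (compare each activity's timestamp against "Order Placed", and check case IDs across activity types), but the same dictionary-and-loop approach applies.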

3. Document Your Extraction

Keep notes on:

  • Which tables you used
  • What filters you applied
  • When the extraction was run
  • Any assumptions you made

This documentation is invaluable when you need to update or troubleshoot your event log later.

4. Use Consistent Naming

Keep activity names consistent across extractions:

  • “Order Placed” is better than sometimes using “Order Created” and sometimes “New Order”
  • Decide on a naming convention and stick to it

5. Handle Time Zones

If your data comes from multiple systems or regions, make sure all timestamps are in the same time zone. UTC is often the safest choice for consistency.
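In Python, normalizing a local timestamp to UTC looks like this. The time zone name is an example, not something specific to Pizza Palace:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# A timestamp recorded in a local zone (example: US Eastern, which is UTC-5 in January).
local = datetime(2025, 1, 15, 18, 30, tzinfo=ZoneInfo("America/New_York"))

# Convert to UTC before combining with data from other systems.
utc = local.astimezone(ZoneInfo("UTC"))

print(utc.strftime("%Y-%m-%d %H:%M:%S"))  # 2025-01-15 23:30:00
```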

Common Challenges and Solutions

Challenge: Same Event from Multiple Sources

Sometimes the same event is logged in multiple tables. For example, both your order system and your ERP might record when an order ships.

Solution: Pick one source as the “source of truth” and use only that one. Document your choice.


Challenge: Events Without Timestamps

Some events might not have their own timestamp. For example, an “Order Approved” event might exist only as a boolean flag.

Solution: Look for related timestamps. Maybe there’s an “approved_at” field, or you can use the “modified_at” timestamp from when the approved flag changed.

Challenge: Very High Event Volume

If you have many millions of events, your extraction queries might run slowly or fail outright.

Solution:

  • Add date filters to limit the extraction period
  • Extract in batches (one month at a time) and combine the files later
  • Consider using dedicated ETL tools for large-scale extractions
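For the batching approach, a small helper that generates month boundaries keeps the extraction loop simple. The year here is an arbitrary example:

```python
from datetime import date

# Generate month-by-month date ranges for batched extraction.
def month_batches(year):
    for month in range(1, 13):
        start = date(year, month, 1)
        end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
        # Half-open interval: include timestamps where start <= ts < end,
        # so consecutive batches never overlap or leave gaps.
        yield start, end

batches = list(month_batches(2025))
print(batches[0])
print(len(batches))
```

Each (start, end) pair plugs into the date filter of the extraction query; using half-open intervals avoids double-counting events that fall exactly on a month boundary.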

What’s Next? Load Your Event Log into a Process Mining Tool

Once you’ve created your event log as a CSV file or database export, you’re ready to load it into a process mining tool. Most tools follow a similar process:

  1. Upload your file or connect the process mining tool to your extracted data
  2. Map your columns (Case ID, Timestamp, Activity)
  3. Configure any additional attributes
  4. Generate your process map

Modern process mining tools like ProcessMind make this process straightforward. Simply upload your event log, and the tool automatically visualizes your process, revealing bottlenecks, variations, and improvement opportunities.

Conclusion

Creating a process mining event log doesn’t require specialized tools or deep technical knowledge. At its core, you’re simply organizing your data into a table with three essential columns: Case ID, Timestamp, and Activity.

Whether you use Excel for smaller datasets or SQL for larger, more complex extractions, the principles remain the same:

  1. Identify the events you want to track
  2. Find the timestamp for each event type
  3. Combine everything into a single table
  4. Add attributes to enrich your analysis

The hardest part isn’t the technical extraction; it’s understanding your process well enough to know which events matter. Start with the obvious events (order placed, order completed) and gradually add more detail as you learn what insights your process mining tool reveals.

Ready to go deeper? Explore our continuous process improvement pages where you’ll find detailed information about activities and data requirements for popular processes like Purchase to Pay, Order to Cash, and Accounts Payable. These resources include data templates for well-known systems like SAP, Oracle, and Microsoft Dynamics, giving you a head start on your event log creation journey.

Get Started Today

Don’t wait for the perfect event log. Start with what you have, learn from the process maps you create, and iterate. Even a simple event log with basic activities can reveal surprising insights about how your processes really work.
