Supported Data Formats for Event Logs

File support for data upload

ProcessMind supports the following file formats for uploading event logs:

  • XLS: Legacy Excel format, still supported by many systems.
  • XLSX: The most commonly used and modern format for Excel spreadsheets.
  • XLSB: A binary Excel format that offers faster load times and reduced file size. We recommend using XLSB for faster processing of large event logs.
  • CSV: Comma-separated values file, a simple and commonly used text format for storing tabular data.
  • TSV: Tab-separated values file, a format similar to CSV but with tab characters used to separate columns.
  • TXT: Plain text file, which can use various delimiters such as commas, tabs, or semicolons to structure data.

General File Structure Requirements

For successful process mining in ProcessMind, your uploaded files must follow the structural guidelines below, whether they are Excel files (XLS, XLSX, XLSB) or text-based files (CSV, TSV, TXT). This ensures that ProcessMind can interpret the data correctly and analyze it accurately.

1. Header Row

  • The file must start with a header row on the first line (cell A1 for Excel files, line 1 for CSV, TSV, or TXT files). The header defines the column names, which should clearly indicate the type of data in each column (e.g., “Case ID,” “Activity,” “Timestamp”).
  • For CSV, TSV, and TXT files, delimiters and quote characters are autodetected, so you do not need to specify these settings manually.
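
If you want to check locally which delimiter and header a text file is likely to yield before uploading it, the following minimal sketch may help. It uses Python's standard csv.Sniffer and is not ProcessMind's own detection logic; the function name read_header and the sample data are illustrative assumptions.

```python
import csv
import io


def read_header(text):
    """Guess the delimiter of a text sample and return it with the header row."""
    # Restrict the guess to the delimiters named in this article.
    dialect = csv.Sniffer().sniff(text, delimiters=",;\t")
    reader = csv.reader(io.StringIO(text), dialect)
    header = next(reader)  # the first line is expected to be the header row
    return dialect.delimiter, [name.strip() for name in header]


# Hypothetical semicolon-delimited event log.
sample = (
    "Case ID;Activity;Timestamp\n"
    "1001;Order Created;2024-01-05 09:00:00\n"
    "1001;Payment Processed;2024-01-05 09:30:00\n"
)
delimiter, header = read_header(sample)
print(repr(delimiter), header)
```

If the detected header does not match the columns you expect, the file probably does not start with a header row or mixes delimiters.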

2. Minimum Set of Attributes

To form a valid process mining event log, your file must contain at least the following attributes (columns):

  • Case ID: This column uniquely identifies each process instance (or case). Every row corresponding to the same process instance must have the same Case ID.
  • Activity: This column should describe the specific activity or event being recorded (e.g., “Order Created,” “Payment Processed”).
  • Timestamp: Each activity must be associated with a timestamp that marks the exact time or date the event occurred.
    • Note: ProcessMind autodetects the timestamp format where possible. Common formats such as yyyy-MM-dd HH:mm:ss and MM/dd/yyyy are recognized automatically.
  • Optional Attributes: Beyond the required columns, you may include additional columns to enhance your analysis, such as:
    • Resource: Identifies who performed the activity (e.g., user, department).
    • Cost: Any costs associated with the activity.
    • Other Custom Data: You can include custom fields that are relevant to your specific process, as long as the required columns are present.
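
As an illustration of this structure, the sketch below parses a small, made-up event log, checks that the three required columns are present, and groups activities by Case ID. The header names are the examples used in this article (ProcessMind may accept other names), and the function group_by_case is hypothetical, not part of ProcessMind.

```python
import csv
import io
from collections import defaultdict

# Example header names taken from this article; treat them as an assumption.
REQUIRED = ("Case ID", "Activity", "Timestamp")


def group_by_case(text):
    """Check the required columns and return the activities of each case in file order."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [name for name in REQUIRED if name not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    cases = defaultdict(list)
    for row in reader:
        # Rows that share a Case ID belong to the same process instance.
        cases[row["Case ID"]].append(row["Activity"])
    return dict(cases)


# Hypothetical log with two optional columns (Resource, Cost).
log = """Case ID,Activity,Timestamp,Resource,Cost
1001,Order Created,2024-01-05 09:00:00,Alice,0
1001,Payment Processed,2024-01-05 09:30:00,Bob,2.50
1002,Order Created,2024-01-06 14:10:00,Alice,0
"""
traces = group_by_case(log)
print(traces)
```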

3. Data Formatting

  • Ensure that your data is formatted consistently within each column:
    • Timestamps should be in a standard, recognizable format (e.g., yyyy-MM-dd HH:mm:ss). ProcessMind will try to autodetect other date formats.
    • Avoid blank rows between data entries, as this may disrupt the import process.
    • Ensure that numeric data (e.g., costs, durations) is formatted as numbers in Excel, or correctly formatted in text-based files (CSV, TSV, TXT).
  • For CSV, TSV, and TXT files, ProcessMind autodetects the delimiter (comma, tab, semicolon, etc.) and handles quoted text, so no manual configuration is needed.
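
The formatting rules above can be applied in a small pre-upload cleanup step. The following is a minimal sketch, assuming a four-column layout (Case ID, Activity, Timestamp, Cost) and a short, illustrative list of input timestamp formats. The helper names normalize_timestamp and clean_rows are hypothetical.

```python
from datetime import datetime

# Input formats to try, in order; this list is an assumption for the example.
FORMATS = ("%Y-%m-%d %H:%M:%S", "%m/%d/%Y %H:%M", "%m/%d/%Y")


def normalize_timestamp(value):
    """Return the timestamp rewritten as yyyy-MM-dd HH:mm:ss."""
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue  # try the next candidate format
        return parsed.strftime("%Y-%m-%d %H:%M:%S")
    raise ValueError(f"unrecognized timestamp: {value!r}")


def clean_rows(rows):
    """Drop blank rows, normalize timestamps, and convert the cost to a number."""
    cleaned = []
    for row in rows:
        if not any(cell.strip() for cell in row):
            continue  # skip blank rows between data entries
        case_id, activity, timestamp, cost = row
        cleaned.append(
            [case_id.strip(), activity.strip(), normalize_timestamp(timestamp), float(cost)]
        )
    return cleaned


# Hypothetical rows with mixed timestamp formats and one blank row.
rows = [
    ["1001", "Order Created", "01/05/2024 09:00", "0"],
    ["", "", "", ""],
    ["1001", "Payment Processed", "2024-01-05 09:30:00", "2.50"],
]
cleaned = clean_rows(rows)
print(cleaned)
```

Rejecting unrecognized timestamps, rather than passing them through, makes inconsistencies visible before the upload.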

4. Sheet Selection (For Excel Files Only)

  • ProcessMind reads data only from the first sheet in your Excel file (XLS, XLSX, or XLSB), regardless of the sheet's name. Make sure the event log is on the first sheet, because all other sheets are ignored during import.
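
To check which sheet comes first in an XLSX file without opening Excel, the sketch below may be useful. It relies on an XLSX file being a ZIP archive whose xl/workbook.xml lists the sheets, and it assumes that the order of those entries matches the sheet order ProcessMind uses. It does not apply to XLS or XLSB files, and the function name sheet_names is hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the main spreadsheet schema used in standard .xlsx files.
MAIN_NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def sheet_names(xlsx_file):
    """Return the sheet names of an .xlsx file (path or file-like object) in workbook order."""
    with zipfile.ZipFile(xlsx_file) as archive:
        root = ET.fromstring(archive.read("xl/workbook.xml"))
    return [sheet.get("name") for sheet in root.iter(MAIN_NS + "sheet")]


# Demo on a stripped-down in-memory archive that contains only workbook.xml.
# It is not a complete workbook, but it is enough to show the lookup.
workbook_xml = (
    '<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    '<sheets><sheet name="Event Log" sheetId="1"/><sheet name="Notes" sheetId="2"/></sheets>'
    "</workbook>"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("xl/workbook.xml", workbook_xml)
buffer.seek(0)

names = sheet_names(buffer)
print(names[0])
```

With a real file, call sheet_names("your_file.xlsx") and confirm that the first name is the sheet holding the event log.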

5. Tips

Performance Tip: Use XLSB Format for Faster Processing

While ProcessMind can upload and process all supported formats, we strongly recommend XLSB for Excel files. XLSB stores the workbook in binary form, which reduces file size and gives faster loading and processing than XLS or XLSX, especially for large datasets.