
Extracting Data from SAP for Purchase-to-Pay (P2P) Process Mining

To analyze the Purchase-to-Pay process with process mining, you need to extract a high-quality event log from SAP. Below are the typical steps and structures involved.

1. Define Your P2P Process Scope

Before extracting data, clarify the process steps you want to include. A standard P2P process consists of:

  1. Purchase Requisition
  2. Purchase Order
  3. Goods Receipt
  4. Invoice Receipt
  5. Payment
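These five steps imply an expected "happy path" ordering for each case. As a quick sketch in plain Python (activity names are illustrative), you can later check whether a case's trace respects this order:

```python
# Expected happy-path order of P2P activities (names are illustrative)
HAPPY_PATH = [
    "Purchase Requisition",
    "Purchase Order",
    "Goods Receipt",
    "Invoice Receipt",
    "Payment",
]

def follows_happy_path(trace):
    """Return True if the activities in `trace` occur as an in-order
    subsequence of the expected happy path (skipped steps are allowed)."""
    remaining = iter(HAPPY_PATH)
    return all(activity in remaining for activity in trace)

print(follows_happy_path(["Purchase Order", "Goods Receipt", "Payment"]))  # True
print(follows_happy_path(["Goods Receipt", "Purchase Order"]))             # False
```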

2. Identify Key SAP Tables

You’ll usually pull data from the following tables:

Process Step | SAP Table | Description
Purchase Requisition | EBAN | Purchase Requisition Header
Purchase Order | EKKO / EKPO | PO Header (EKKO) & Items (EKPO)
Goods Receipt | MKPF / MSEG | Material Document Header & Line Items
Invoice Receipt | BKPF / BSEG | Accounting Document Header & Line Items
Payment | PAYR / REGUH | Payment Document Tables

3. Identify Required Fields for Event Log

The following fields are typically required for process mining. You can adjust based on your specific needs.

Field | SAP Field(s) | Description
Case ID | PO Number / Document No. | Groups all events in a single process instance
Activity | Derived (e.g., "PO Created") | Name of the process step
Timestamp | AEDAT, BUDAT, CPUDT | When the step occurred
User/Agent | ERNAM, USNAM, etc. | Who executed the action
Amount | WRBTR, DMBTR | Value associated with the step
Vendor / Material | LIFNR, MATNR | Partner or material info
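As a small sketch of how these fields map onto event-log columns, a renaming helper in plain Python (the mapping below is illustrative; which SAP field feeds which column depends on the activity, e.g. a goods receipt's timestamp comes from BUDAT rather than AEDAT):

```python
# Illustrative mapping from SAP field names to event-log columns
FIELD_MAP = {
    "EBELN": "case_id",    # PO number groups events into a case
    "AEDAT": "timestamp",  # PO creation date
    "ERNAM": "user",       # user who created the document
    "WRBTR": "amount",     # document currency amount
    "LIFNR": "vendor",
}

def to_event(raw_row, activity):
    """Rename SAP fields to event-log columns and attach the activity label."""
    event = {FIELD_MAP[k]: v for k, v in raw_row.items() if k in FIELD_MAP}
    event["activity"] = activity
    return event

po_row = {"EBELN": "4500000123", "AEDAT": "2024-01-05", "ERNAM": "JDOE"}
print(to_event(po_row, "PO Created"))
# {'case_id': '4500000123', 'timestamp': '2024-01-05', 'user': 'JDOE', 'activity': 'PO Created'}
```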

4. Extract the Raw Data

SAP ECC (ABAP stack)

  • Use ABAP Reports to extract data directly from the relevant tables.
  • Or create custom ABAP programs to join, filter and export relevant event data.
  • You can export the result as a CSV or flat file.

SAP S/4HANA (HANA stack)

  • Use CDS Views (Core Data Services) to define logical models and extract data.
  • Or use SAP Fiori apps or SQL scripts (e.g., in the database explorer) to query the tables directly.
  • Use OData APIs, SAP Datasphere, or SAP Data Intelligence for better automation.

5. Build the Event Log

Minimum columns needed for process mining

Column | Description | Example
Case ID | Unique identifier for the process instance | Purchase Order ID
Activity | Name of the process step | "Create PO", "Post Invoice"
Timestamp | Date and time of the event | 2025-04-17 10:22:12

Optional columns to enrich the event log

Column Name | Description
User | User who performed the action
Amount | Monetary value associated with the event
Vendor / Material | Vendor or material involved in the event
Company Code | Company code associated with the event
Currency | Currency of the transaction
Document Type | Type of document (e.g., PO, GR, Invoice)
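As a minimal illustration of the mandatory structure (plain Python, invented sample values), the three required columns can be assembled and written out as a semicolon-separated file:

```python
import csv
import io

# Invented sample events: one P2P case with the three mandatory columns
events = [
    {"case_id": "4500000123", "activity": "Create PO",     "timestamp": "2025-04-17 10:22:12"},
    {"case_id": "4500000123", "activity": "Goods Receipt", "timestamp": "2025-04-20 08:05:00"},
    {"case_id": "4500000123", "activity": "Post Invoice",  "timestamp": "2025-04-25 14:41:30"},
]

# Write to an in-memory buffer; swap for open("event_log.csv", "w") in practice
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["case_id", "activity", "timestamp"],
                        delimiter=";")
writer.writeheader()
writer.writerows(events)
print(buffer.getvalue())
```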

tip

Unpivot your data to turn multiple date fields into one row per activity if needed.
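The unpivot step can be sketched with pandas (column names here are illustrative): each milestone date column of a wide, one-row-per-PO extract becomes its own event row.

```python
import pandas as pd

# Wide extract: one row per PO, one date column per milestone (illustrative names)
wide = pd.DataFrame({
    "ebeln":         ["4500000123", "4500000124"],
    "po_created":    ["2024-01-05", "2024-01-07"],
    "goods_receipt": ["2024-01-12", "2024-01-15"],
})

# Unpivot: one row per (case, activity) with its timestamp
log = wide.melt(id_vars="ebeln", var_name="activity", value_name="timestamp")
log = log.sort_values(["ebeln", "timestamp"]).reset_index(drop=True)
print(log)
```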

6. Transform and Clean Data

Once you have the raw data, you may need to transform it into a format suitable for process mining. This often involves:

  • Normalizing timestamps (e.g., converting to UTC)
  • Labeling activities (e.g., “PO Created” instead of MIGO code)
  • Ensuring one row per event (long format)
  • Cleaning up NULL values and duplicates
  • Consistent Case IDs (unique per instance)
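The cleaning steps above can be sketched in pandas on a tiny invented sample: drop events without a case ID, remove exact duplicates, parse and normalize timestamps to UTC, then sort into case order.

```python
import pandas as pd

# Invented raw extract: one duplicate row and one event without a case ID
raw = pd.DataFrame({
    "case_id":   ["4500000123", "4500000123", "4500000123", None],
    "activity":  ["Create PO", "Create PO", "Goods Receipt", "Goods Receipt"],
    "timestamp": ["2024-01-05 10:00:00", "2024-01-05 10:00:00",
                  "2024-01-12 08:30:00", "2024-01-13 09:00:00"],
})

clean = (
    raw.dropna(subset=["case_id"])   # events without a case ID are unusable
       .drop_duplicates()            # remove exact duplicate rows
       .assign(timestamp=lambda df:  # parse and normalize to UTC
               pd.to_datetime(df["timestamp"]).dt.tz_localize("UTC"))
       .sort_values(["case_id", "timestamp"])
       .reset_index(drop=True)
)
print(clean)
```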

Tools you can use:

  • SQL scripts to clean and transform data
  • Excel or Power Query
  • Python + Pandas
  • KNIME, Alteryx, or ETL tools

Data Cleaning and Preparation

Data cleaning and preparation is a crucial step in the data analysis process. It involves identifying and correcting errors, inconsistencies, and inaccuracies in the data to ensure its quality and reliability.

ABAP Example code template for extracting data
" Example ABAP code to extract Purchase-to-Pay event log data
REPORT z_p2p_event_log_export.

TABLES: ekko, mseg, mkpf, rbkp, rseg.

DATA: BEGIN OF event_log OCCURS 0,
        case_id        TYPE ekko-ebeln,
        activity       TYPE char40,
        timestamp      TYPE sy-datum,
        username       TYPE sy-uname,
        amount         TYPE ekpo-netwr,
      END OF event_log.

DATA: filename TYPE string VALUE '/usr/sap/tmp/p2p_event_log.csv',
      line     TYPE string.

START-OF-SELECTION.

  " 1. Extract Purchase Order creation
  "    (net value lives at item level in EKPO, so amount stays initial here)
  SELECT ebeln aedat ernam
    INTO (event_log-case_id, event_log-timestamp, event_log-username)
    FROM ekko
    WHERE aedat >= '20240101'.
    event_log-activity = 'Create Purchase Order'.
    CLEAR event_log-amount.
    APPEND event_log.
  ENDSELECT.

  " 2. Extract Goods Receipt (movement type 101)
  SELECT mseg~ebeln mkpf~budat mkpf~usnam mseg~dmbtr
    INTO (event_log-case_id, event_log-timestamp, event_log-username,
          event_log-amount)
    FROM mseg
    INNER JOIN mkpf ON mseg~mblnr = mkpf~mblnr
                   AND mseg~mjahr = mkpf~mjahr
    WHERE mseg~bwart = '101' AND mkpf~budat >= '20240101'.
    event_log-activity = 'Goods Receipt'.
    APPEND event_log.
  ENDSELECT.

  " 3. Extract Invoice Posting (logistics invoice verification)
  SELECT rseg~ebeln rbkp~budat rbkp~usnam rseg~wrbtr
    INTO (event_log-case_id, event_log-timestamp, event_log-username,
          event_log-amount)
    FROM rseg
    INNER JOIN rbkp ON rseg~belnr = rbkp~belnr
                   AND rseg~gjahr = rbkp~gjahr
    WHERE rbkp~budat >= '20240101'.
    event_log-activity = 'Invoice Posted'.
    APPEND event_log.
  ENDSELECT.

  " 4. Export results to file (optional, if permissions allow)
  OPEN DATASET filename FOR OUTPUT IN TEXT MODE ENCODING DEFAULT.
  LOOP AT event_log.
    " String templates convert the date and packed amount to text (NW 7.02+)
    line = |{ event_log-case_id };{ event_log-activity };| &&
           |{ event_log-timestamp DATE = ISO };{ event_log-username };| &&
           |{ event_log-amount }|.
    TRANSFER line TO filename.
  ENDLOOP.
  CLOSE DATASET filename.

  WRITE: / 'Event log exported successfully.'.

Note on the ABAP code

Section | Purpose
ekko | Header table for PO creation
mseg + mkpf | Goods Receipt (GR) entries
rseg + rbkp | Invoice Receipt (IR) entries
OPEN DATASET | Writes the event log as a CSV file to the application server (check permissions)
bwart = '101' | Filters for the goods receipt movement type

Example SQL queries to understand the data extraction process

You can use SQL queries to extract data from the relevant tables. Below are some example queries for each step of the P2P process. Adjust the queries based on your specific requirements and SAP version.

Here’s a simple example

-- Purchase Order creation
SELECT
  EKKO.EBELN AS CaseID,
  'PO Created' AS Activity,
  EKKO.AEDAT AS EventTime,
  EKKO.ERNAM AS UserName,
  EKKO.LIFNR AS Vendor,
  EKKO.WAERS AS Currency,
  EKKO.BUKRS AS CompanyCode
FROM EKKO
WHERE EKKO.BSART = 'NB'; -- Standard PO type

-- Goods Receipt
SELECT
  MSEG.EBELN AS CaseID,
  'Goods Receipt' AS Activity,
  MKPF.BUDAT AS EventTime,
  MKPF.USNAM AS UserName,
  MSEG.MATNR AS Material,
  MSEG.MENGE AS Quantity
FROM MSEG
JOIN MKPF ON MSEG.MBLNR = MKPF.MBLNR
WHERE MSEG.BWART = '101'; -- Goods receipt movement type

-- Invoice Receipt
SELECT
  BSEG.EBELN AS CaseID, -- PO number links the invoice line to the case
  'Invoice Posted' AS Activity,
  BKPF.BUDAT AS EventTime,
  BKPF.USNAM AS UserName,
  BSEG.WRBTR AS Amount
FROM BSEG
JOIN BKPF ON BSEG.BELNR = BKPF.BELNR
WHERE BKPF.BLART = 'RE'; -- Invoice type

You can unify these logs into a single event log table for process mining:

-- Combine event logs (UNION ALL keeps duplicate rows; they are distinct events)
SELECT * FROM (
  SELECT EBELN AS CaseID, 'PO Created' AS Activity, AEDAT AS EventTime
    FROM EKKO
  UNION ALL
  SELECT MSEG.EBELN, 'Goods Receipt', MKPF.BUDAT
    FROM MSEG JOIN MKPF ON MSEG.MBLNR = MKPF.MBLNR
  UNION ALL
  SELECT BSEG.EBELN, 'Invoice Posted', BKPF.BUDAT
    FROM BSEG JOIN BKPF ON BSEG.BELNR = BKPF.BELNR
) AS EventLog
ORDER BY CaseID, EventTime;
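The same union can also be done outside the database, for instance in pandas once the per-activity extracts have been exported (sample frames with invented values):

```python
import pandas as pd

# Per-activity extracts, each already shaped as CaseID / Activity / Timestamp
po = pd.DataFrame({"CaseID": ["4500000123"], "Activity": ["PO Created"],
                   "Timestamp": ["2024-01-05"]})
gr = pd.DataFrame({"CaseID": ["4500000123"], "Activity": ["Goods Receipt"],
                   "Timestamp": ["2024-01-12"]})
inv = pd.DataFrame({"CaseID": ["4500000123"], "Activity": ["Invoice Posted"],
                    "Timestamp": ["2024-01-20"]})

# Stack the extracts into one event log (UNION ALL semantics) and order by case
event_log = (
    pd.concat([po, gr, inv], ignore_index=True)
      .sort_values(["CaseID", "Timestamp"])
      .reset_index(drop=True)
)
print(event_log)
```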

7. Export as Event Log

Once you have a unified structure with at least:

  • Case ID
  • Activity
  • Timestamp

you can export it as a CSV (or XES) file and load it into your process mining tool.

Pro Tip

  • Normalize timestamps to the same timezone.
  • Make sure your case IDs are consistent across all events.
  • Clean the data to remove test or cancelled documents.
  • Use filters like company code, document type, or fiscal year to narrow the scope.