Data Template: Incident Management
Your Incident Management Data Template
- Recommended attributes to collect
- Key activities to track
- Extraction guidance for BMC Helix
Incident Management Attributes
| Name | Description | ||
|---|---|---|---|
|
Activity
ActivityName
|
The name of the specific action or event that occurred at a point in the incident's lifecycle. | ||
|
Description
The Activity Name describes a step in the incident management process, such as 'Incident Categorized', 'Assigned to Support Group', or 'Incident Resolved'. These activities form the nodes in the discovered process map. Analyzing the sequence and frequency of these activities is central to process mining. It helps uncover the actual process flow, identify bottlenecks between steps, detect deviations from the standard operating procedure, and measure the duration of specific stages within the incident lifecycle.
Why it matters
This attribute defines the steps in the process, enabling the visualization of the process map and the analysis of flows, bottlenecks, and deviations.
Where to get
Typically derived from status changes, audit logs, or specific event records associated with the incident in the 'HPD:HelpDesk_AuditLogSystem' or similar audit tables.
Examples
Incident Reported, Assigned to Support Group, Investigation Started, Incident Resolved, Incident Closed
|
|||
|
Incident ID
IncidentId
|
The unique identifier for each reported incident, serving as the primary key for tracking the incident's lifecycle. | ||
|
Description
The Incident ID is the cornerstone of incident management analysis. It uniquely identifies each case, allowing for the aggregation of all related events, status changes, and activities into a single, cohesive process instance. In process mining, this ID links every step, from 'Incident Reported' to 'Incident Closed', enabling a complete end-to-end view of the incident's journey. It is essential for calculating case-level metrics such as total resolution time, number of reassignments, and identifying process variants.
Why it matters
This attribute is the fundamental case identifier, making it possible to trace the entire lifecycle of an incident and analyze its flow through the management process.
Where to get
This is a core field in the primary Incident Management module or form, often labeled as 'Incident Number' or 'Incident ID'.
Examples
INC000012345678, INC000009876543, INC000011223344
|
|||
|
Start Time
EventTimestamp
|
The precise date and time when a specific activity or event occurred for an incident. | ||
|
Description
The Event Timestamp records the moment each activity took place. This temporal data is critical for ordering events chronologically and constructing the process flow accurately. In analysis, timestamps are used to calculate durations between activities, measure the total cycle time of incidents, and identify delays or waiting times in the process. Comparing timestamps against service level agreements (SLAs) is also essential for performance monitoring and compliance checks.
Why it matters
Timestamps provide the chronological order of events and are essential for all time-based analysis, including performance measurement, bottleneck identification, and SLA compliance.
Where to get
This information is found in the audit log tables, such as 'HPD:HelpDesk_AuditLogSystem', corresponding to each logged action or status change.
Examples
2023-10-26T10:00:00Z, 2023-10-26T11:30:00Z, 2023-10-27T14:45:00Z
|
|||
|
Last Data Update
LastDataUpdate
|
The timestamp indicating the last time the data for this record was refreshed from the source system. | ||
|
Description
This attribute provides the timestamp of the most recent data extraction. It is a metadata field that is essential for understanding the freshness of the data being analyzed. Knowing when the data was last updated helps analysts and business users trust the insights derived from the process mining tool. It confirms whether the analysis reflects the most current state of operations or is based on older data.
Why it matters
Ensures transparency about data freshness, which is critical for making timely and accurate business decisions based on the process analysis.
Where to get
This timestamp is generated and added during the data extraction, transformation, and loading (ETL) process.
Examples
2023-11-01T02:00:00Z, 2023-11-02T02:00:00Z
|
|||
|
Source System
SourceSystem
|
The system from which the incident data was extracted. | ||
|
Description
This attribute identifies the origin of the data, which is crucial in environments where data from multiple systems is consolidated for analysis. It helps ensure data lineage and provides context for the data's structure and content. For BMC Helix, this would typically be a static value identifying the specific instance or environment, for example 'BmcHelix_Production'. It is useful for filtering and segmenting data if multiple source systems are ever integrated.
Why it matters
Provides traceability and context for the data's origin, which is important for data governance and troubleshooting in multi-system environments.
Where to get
This is typically a static value added during the data extraction, transformation, and loading (ETL) process to identify the data source.
Examples
BmcHelix, BmcHelix_Prod_EU, ITSM_Platform_A
|
|||
|
Assigned Agent
AssignedAgent
|
The individual support agent assigned to handle the incident. | ||
|
Description
The Assigned Agent is the specific person responsible for the incident at a given time. This provides a more granular level of detail than the support group, enabling analysis of individual performance and workload. This attribute is essential for the 'Team Workload Distribution' dashboard and the 'Activity Volume per Agent StdDev' KPI. By analyzing the volume and types of incidents handled by each agent, managers can identify overburdened individuals, ensure equitable workload distribution, and spot coaching opportunities.
Why it matters
Enables granular analysis of individual workload and performance, helping to optimize resource allocation and identify top performers or those needing support.
Where to get
A standard field on the 'HPD:Help Desk' form, commonly named 'Assignee'.
Examples
Alice Smith, Bob Johnson, Charlie Brown
|
|||
|
Assigned Support Group
AssignedSupportGroup
|
The support team or group responsible for working on the incident. | ||
|
Description
This attribute identifies the team assigned to an incident. As an incident progresses, it may be reassigned to different support groups, such as the Service Desk, Network Team, or Application Support. Tracking changes in this attribute is fundamental for the 'Incident Reassignment Analysis' dashboard. It allows for visualizing handoffs between teams, measuring the number of transfers per incident, and identifying bottlenecks or knowledge gaps that lead to excessive reassignments. It also supports workload distribution analysis across teams.
Why it matters
This attribute is key to analyzing team performance, workload distribution, and the efficiency of incident routing and handoffs.
Where to get
A standard field on the 'HPD:Help Desk' form, typically named 'Assigned Group'.
Examples
Service Desk, Network Operations, Database Administration, Application Support Tier 2
|
|||
|
Incident Category
IncidentCategory
|
The classification of the incident, typically organized in a hierarchical structure. | ||
|
Description
Incident Category provides a structured way to classify incidents based on the affected service, component, or type of issue, for example 'Hardware', 'Software', or 'Network'. This categorization is crucial for routing incidents to the correct team and for later analysis of incident trends. The 'Incident Categorization Accuracy' dashboard relies on this attribute, often comparing its initial value to its value upon resolution to measure the quality of initial triage. Accurate categorization helps in identifying recurring problems and enables more effective problem management.
Why it matters
Proper categorization is vital for efficient routing, trend analysis, and assessing the accuracy of the initial diagnosis.
Where to get
These are standard fields on the 'HPD:Help Desk' form, often a set of cascading fields like 'Categorization Tier 1', 'Categorization Tier 2', etc.
Examples
Software > Enterprise Apps > ERP, Hardware > Laptop > Keyboard, Network > Connectivity > Wi-Fi
|
|||
|
Incident Status
IncidentStatus
|
The current or historical state of the incident within its lifecycle, such as 'New', 'In Progress', or 'Closed'. | ||
|
Description
The Incident Status indicates the stage of an incident at any given time. It is often the source for generating the 'Activity' log, where a change in status corresponds to a process step. Analyzing the status allows for filtering incidents by their current state, understanding how much time is spent in different statuses, and identifying incidents that are stuck. For example, a dashboard could highlight incidents that have been in a 'Pending' status for an unusually long time.
Why it matters
It provides a snapshot of an incident's progress and is crucial for identifying stalled incidents and analyzing time spent in various stages.
Where to get
A core field on the main incident form, typically 'HPD:Help Desk'. The field is often named 'Status'.
Examples
New, Assigned, In Progress, Pending, Resolved, Closed
|
|||
|
Priority
Priority
|
The priority level of the incident, which determines the required speed of resolution. | ||
|
Description
Priority is a key attribute that dictates the urgency of an incident. It is often derived from the incident's impact and urgency and is used to allocate resources and define service level agreement (SLA) targets. In process mining analysis, priority is used to segment incidents to compare the process flows of high-priority versus low-priority cases. This helps verify whether critical incidents are truly being handled faster and identify any deviations in their process paths, which is essential for the 'High-Priority Incident Performance' dashboard and related KPIs.
Why it matters
This attribute is critical for segmenting analysis to ensure that high-urgency incidents are handled according to defined procedures and SLAs.
Where to get
A standard field on the 'HPD:Help Desk' form, typically named 'Priority'.
Examples
Critical, High, Medium, Low
|
|||
|
Severity
Severity
|
The measure of the incident's business impact. | ||
|
Description
Severity defines how significantly an incident affects business operations. It is a critical input, along with urgency, for determining the incident's priority and associated SLAs. Analyzing incidents by severity helps in understanding the process performance for the most impactful issues. It is used in dashboards like the 'Critical SLA Breach Overview' to categorize and focus on incidents that pose the greatest risk to the business.
Why it matters
Severity helps quantify the business impact of incidents, allowing for analysis focused on mitigating the most significant operational risks.
Where to get
Consult BMC Helix documentation. This is often a standard field on the incident form, possibly named 'Impact' or 'Severity'.
Examples
1-Extensive/Widespread, 2-Significant/Large, 3-Moderate/Limited, 4-Minor/Localized
|
|||
|
SLA Status
SLAStatus
|
Indicates whether the incident is within its service level agreement (SLA) targets or has breached them. | ||
|
Description
The SLA Status provides a clear indicator of service performance for each incident. It typically shows states like 'Within Target', 'Warning', or 'Breached'. This is often a dynamically calculated field within BMC Helix itself. This attribute is essential for the 'Critical SLA Breach Overview' dashboard and the 'Critical Incident SLA Breach Rate' KPI. It allows for immediate identification and prioritization of incidents that are failing to meet service commitments, enabling a focused analysis on the causes of delays.
Why it matters
Directly measures performance against commitments and is fundamental for compliance monitoring and identifying processes that cause SLA breaches.
Where to get
This is often a computed field within BMC Helix, derived from the incident's priority and age. The status may be stored in a related SLA management module.
Examples
Within Target, Warning, Breached
|
|||
|
Business Service
BusinessService
|
The business service or application affected by the incident. | ||
|
Description
This attribute links an incident to a specific business service defined in the Configuration Management Database (CMDB), such as 'Email Service' or 'ERP System'. Analyzing incidents by the affected business service is crucial for understanding the impact on the organization. It helps prioritize problem management efforts on services that generate the most incidents or suffer the most downtime. This view is essential for reporting on IT's performance from a business-centric standpoint.
Why it matters
Connects technical incidents to their business impact, enabling analysis that prioritizes work based on what is most critical to the organization.
Where to get
This is a Configuration Item (CI) field on the incident form, which links to the CMDB. The field might be labeled 'Service' or 'CI'.
Examples
Corporate Email Service, SAP ERP Financials, Customer Relationship Management
|
|||
|
Channel
Channel
|
The method or channel through which the incident was reported. | ||
|
Description
The Channel attribute specifies how an incident was initiated, for example, via phone call, email, self-service portal, or automated monitoring. Analyzing the process by channel can reveal important differences in resolution times or process paths. For instance, incidents reported via the self-service portal might be resolved faster due to better initial data quality. This analysis can inform decisions about which channels to promote or improve.
Why it matters
Helps understand how the reporting method impacts the incident lifecycle, revealing opportunities to optimize specific channels for efficiency.
Where to get
A standard field on the 'HPD:Help Desk' form, often named 'Reported Source'.
Examples
Email, Phone, Self-service Portal, System Monitoring
|
|||
|
Is Reopened
IsReopened
|
A flag that indicates if an incident has been reopened after being resolved. | ||
|
Description
This calculated attribute is a boolean flag that is true if an incident has an 'Incident Reopened' activity in its history. A high rate of reopened incidents can signal issues with the quality of resolutions or premature closing of tickets. This flag is used directly in calculating the 'Incident Re-opening Rate' KPI and for the 'Incident Re-opening Rate Trend' dashboard. It allows analysts to easily filter for and investigate reopened incidents to understand the root causes, such as incomplete fixes or miscommunication with the user.
Why it matters
This flag directly measures resolution quality and customer satisfaction, highlighting cases where the initial fix was not effective.
Where to get
This is a calculated field, derived during data transformation by checking if an incident's event log contains a 'Reopened' activity or status transition.
Examples
true, false
|
|||
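The check described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the event log has already been extracted into (IncidentId, ActivityName) pairs; the sample rows and the function name `compute_is_reopened` are hypothetical and not part of any BMC Helix API.

```python
from collections import defaultdict

# Hypothetical event log rows: (IncidentId, ActivityName).
events = [
    ("INC000000000001", "Incident Reported"),
    ("INC000000000001", "Incident Resolved"),
    ("INC000000000001", "Incident Reopened"),
    ("INC000000000001", "Incident Resolved"),
    ("INC000000000002", "Incident Reported"),
    ("INC000000000002", "Incident Resolved"),
]


def compute_is_reopened(event_rows):
    """Return {IncidentId: bool}; True when the case contains an
    'Incident Reopened' activity anywhere in its history."""
    flags = defaultdict(bool)  # every incident starts as not reopened
    for incident_id, activity in event_rows:
        flags[incident_id] = flags[incident_id] or activity == "Incident Reopened"
    return dict(flags)


is_reopened = compute_is_reopened(events)
```

Because the flag is computed per case, it can be joined back onto every event row of the incident during transformation.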
|
Is SLA Breached
IsSlaBreached
|
A boolean flag indicating if the incident resolution time exceeded the SLA target. | ||
|
Description
This is a simplified flag derived from the 'SLAStatus' attribute, where 'true' indicates the SLA was breached. This provides a clear, binary outcome for easy filtering and aggregation. This attribute is used to calculate the 'Critical Incident SLA Breach Rate' KPI. It allows for straightforward segmentation of all incidents into two groups, breached and not breached, to analyze the process characteristics that are most common among incidents that fail to meet service targets.
Why it matters
Provides a simple, binary outcome for SLA compliance, making it easy to calculate breach rates and analyze the common paths of non-compliant cases.
Where to get
Derived from the 'SLAStatus' attribute during data transformation. If 'SLAStatus' is 'Breached', this flag is set to true.
Examples
true, false
|
|||
|
Reassignment Count
ReassignmentCount
|
The total number of times an incident was transferred between different support groups. | ||
|
Description
This calculated attribute counts how many times the 'AssignedSupportGroup' field changed during an incident's lifecycle. A high number of reassignments, often called 'ticket ping-pong', indicates process inefficiencies, such as incorrect initial routing or knowledge gaps within support teams. This metric is the core of the 'Average Reassignments per Incident' KPI and the 'Incident Reassignment Analysis' dashboard. Tracking this count helps to identify opportunities to improve first-call resolution rates and streamline the routing process, ultimately reducing resolution time.
Why it matters
Quantifies process inefficiency and friction, highlighting incidents that suffer from being passed between teams, which delays resolution.
Where to get
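Calculated during data transformation by counting the changes to the 'AssignedSupportGroup' field over the lifecycle of each incident, excluding the initial assignment. Count changes rather than distinct values, so that an incident returned to a group it has already visited is still counted as a transfer.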
Examples
0, 1, 3, 5
|
|||
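As a rough sketch of this calculation, the Python below counts group changes per incident. It assumes the audit history has been flattened into (IncidentId, EventTimestamp, AssignedSupportGroup) rows; the sample data and the names `parse_ts` and `reassignment_counts` are hypothetical. It counts changes rather than distinct groups, so a ticket that returns to an earlier group is counted again.

```python
from datetime import datetime

# Hypothetical audit rows: (IncidentId, EventTimestamp, AssignedSupportGroup).
rows = [
    ("INC1", "2023-10-26T10:00:00Z", "Service Desk"),
    ("INC1", "2023-10-26T11:30:00Z", "Network Operations"),
    ("INC1", "2023-10-26T12:00:00Z", "Network Operations"),  # unchanged group
    ("INC1", "2023-10-27T14:45:00Z", "Service Desk"),        # back to first group
    ("INC2", "2023-10-26T09:00:00Z", "Service Desk"),
]


def parse_ts(value):
    """Parse the ISO-style UTC timestamps used in the examples."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def reassignment_counts(audit_rows):
    """Return {IncidentId: number of group changes after the first assignment}."""
    counts = {}
    last_group = {}
    ordered = sorted(audit_rows, key=lambda r: (r[0], parse_ts(r[1])))
    for incident_id, _ts, group in ordered:
        if incident_id not in last_group:
            counts[incident_id] = 0          # initial assignment is not a transfer
        elif group != last_group[incident_id]:
            counts[incident_id] += 1         # group value changed: one transfer
        last_group[incident_id] = group
    return counts


counts = reassignment_counts(rows)
```

In this sample, INC1 moves Service Desk → Network Operations → Service Desk, which gives two reassignments even though only two distinct groups are involved.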
|
Resolution Code
ResolutionCode
|
A code or category indicating how the incident was ultimately resolved. | ||
|
Description
The Resolution Code captures the final outcome or the nature of the solution applied to an incident. This structured data is valuable for root cause analysis and understanding the types of solutions that are most frequently required. In analysis, this attribute can be compared with the initial 'IncidentCategory' to assess categorization accuracy. It also provides insights into common fixes, helping to build a knowledge base or identify areas where automation could be applied.
Why it matters
Provides structured data on incident outcomes, supporting root cause analysis and the improvement of knowledge management and automation.
Where to get
Consult BMC Helix documentation. This field is typically found on the resolution tab or section of the incident form.
Examples
User Error - Training Provided, Software Patch Applied, Hardware Replaced, No Fault Found
|
|||
|
Resolution Duration
ResolutionDuration
|
The total time elapsed from when an incident was first reported to when it was resolved. | ||
|
Description
This metric measures the duration from the 'Incident Reported' activity to the 'Incident Resolved' activity. It is a key performance indicator for the efficiency of the entire incident management process. This calculated attribute is the basis for the 'Average Incident Resolution Time' KPI and the 'Incident Resolution Cycle Time' dashboard. Analyzing this duration across different incident categories, priorities, or teams helps identify systemic sources of delay and measure the impact of process improvement initiatives.
Why it matters
This is a primary measure of process efficiency and customer experience, directly reflecting how long it takes to restore service for users.
Where to get
Calculated during data transformation by finding the time difference between the timestamp of the 'Incident Resolved' activity and the 'Incident Reported' activity for each case.
Examples
259200000, 86400000, 604800000 (milliseconds)
|
|||
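The Resolution Duration calculation can be illustrated with a short Python sketch. It assumes each case is a list of (EventTimestamp, ActivityName) pairs and that the result is stored in milliseconds. When a case has several 'Incident Resolved' events (for example after a reopen), this sketch uses the last one, which is an assumption rather than something the template prescribes. The function names are hypothetical.

```python
from datetime import datetime, timedelta


def parse_ts(value):
    """Parse the ISO-style UTC timestamps used in the examples."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def resolution_duration_ms(case_events):
    """Milliseconds from 'Incident Reported' to the last 'Incident Resolved'.

    Returns None when either activity is missing from the case.
    """
    reported = [parse_ts(ts) for ts, act in case_events if act == "Incident Reported"]
    resolved = [parse_ts(ts) for ts, act in case_events if act == "Incident Resolved"]
    if not reported or not resolved:
        return None
    # Integer division of timedeltas yields an exact millisecond count.
    return (max(resolved) - min(reported)) // timedelta(milliseconds=1)


# Hypothetical case: reported and resolved exactly three days apart.
case = [
    ("2023-10-26T10:00:00Z", "Incident Reported"),
    ("2023-10-26T11:30:00Z", "Assigned to Support Group"),
    ("2023-10-29T10:00:00Z", "Incident Resolved"),
]
duration = resolution_duration_ms(case)
```

Returning `None` for unresolved cases keeps open incidents from skewing averages with zero or negative durations.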
Incident Management Activities
| Activity | Description | ||
|---|---|---|---|
|
Assigned to Support Group
|
This activity signifies the initial assignment of the incident to a specific support group for investigation and resolution. It represents the first handoff from the service desk to a technical team. | ||
|
Why it matters
This is a critical milestone for tracking first-touch resolution rates and initial response times. It helps identify delays in getting the incident to the right team.
Where to get
Captured by tracking the first population of the 'Assigned Group' field in the incident's audit history (HPD:HelpDesk_AuditLogSystem).
Capture
Inferred from the first recorded change to the 'Assigned Group' field in the audit logs.
Event type
inferred
|
|||
|
Incident Closed
|
The final activity in the lifecycle, where the incident record is formally closed and becomes a read-only historical record. This often occurs automatically after a set period in the 'Resolved' state. | ||
|
Why it matters
This activity marks the definitive end of the process. The time between 'Resolved' and 'Closed' can highlight delays in administrative processes or user confirmation windows.
Where to get
This is a distinct event inferred from the timestamp when the 'Status' field is updated to 'Closed'. This is tracked in the audit history.
Capture
Filter audit logs for the status change to 'Closed'.
Event type
inferred
|
|||
|
Incident Reported
|
Marks the creation of a new incident record in the system. This is the starting point of the incident lifecycle, typically triggered by a user submission via a portal, email, or a service desk agent manually creating the ticket. | ||
|
Why it matters
This activity is the primary start event for the process. Analyzing the time from this event to resolution is fundamental for measuring overall incident lifecycle duration and identifying upstream delays.
Where to get
This is an explicit creation event captured from the 'Submit Date' or 'Reported Date' timestamp on the HPD:Help Desk form. It is one of the most reliable and fundamental events in the system.
Capture
Captured from the creation timestamp of the incident record in the HPD:Help Desk table.
Event type
explicit
|
|||
|
Incident Resolved
|
Indicates that a resolution has been implemented and the service has been restored for the user. This is a key milestone, typically captured by a status change to 'Resolved'. | ||
|
Why it matters
This is a primary milestone for measuring the core resolution time. It signifies the end of the active work phase and often starts the clock for user confirmation or auto-closure procedures.
Where to get
This is a distinct event inferred from the timestamp when the 'Status' field is updated to 'Resolved'. This change is recorded in the audit history.
Capture
Filter audit logs for the status change to 'Resolved'.
Event type
inferred
|
|||
|
Resolution Identified
|
Represents the moment a support agent has found a solution and documented it in the incident record. The incident is now ready to be moved to a 'Resolved' state. | ||
|
Why it matters
This milestone marks the end of the technical investigation phase. The duration from this point to closure can reveal bottlenecks in communication, verification, and administrative processes.
Where to get
This is often inferred from the timestamp when the resolution details are entered and saved, just before the status is changed to 'Resolved'.
Capture
Use the timestamp of the last modification before the status changes to 'Resolved'.
Event type
inferred
|
|||
|
Workaround Implemented
|
Signifies that a temporary solution has been provided to the user, restoring service functionality while a permanent fix is being developed. This is often recorded by setting a specific flag or status. | ||
|
Why it matters
Tracking this helps measure the speed of service restoration, which is critical for user satisfaction. It separates temporary fixes from permanent resolutions in process analysis.
Where to get
This can be inferred from the timestamp when the 'Workaround' field in the incident resolution details is populated or when a specific 'Workaround Provided' status is used.
Capture
Use the timestamp of a status change to a 'Workaround' state or when resolution notes indicating a workaround are first saved.
Event type
inferred
|
|||
|
Incident Categorized
|
Represents the initial classification of the incident, including setting its category, type, and item. This activity is typically performed by a Level 1 service desk agent shortly after the incident is reported. | ||
|
Why it matters
Tracking this activity helps analyze the accuracy of initial classifications and its impact on routing and resolution. Changes to categorization later in the process indicate rework and potential knowledge gaps.
Where to get
Inferred from the first time the operational and product categorization fields ('OpCat', 'ProdCat') are populated in the incident's audit log (HPD:HelpDesk_AuditLogSystem).
Capture
Identify the first timestamp where categorization fields are set in the audit log.
Event type
inferred
|
|||
|
Incident Put On Hold
|
This activity occurs when the progress on an incident is paused, typically while waiting for information from the user or an external vendor. This is usually reflected by a status change to 'Pending'. | ||
|
Why it matters
This activity is crucial for accurately calculating resolution times. The time spent in a 'Pending' state should often be excluded from SLA calculations to fairly measure support team performance.
Where to get
Inferred from a change in the 'Status' field to a 'Pending' state. The audit log tracks the timestamp of this change.
Capture
Filter audit logs for status changes to 'Pending' or a similar on-hold status.
Event type
inferred
|
|||
|
Incident Reopened
|
Occurs when a previously resolved or closed incident is returned to an active state. This usually happens when the user reports that the issue has recurred. | ||
|
Why it matters
A high re-open rate indicates problems with the quality of resolutions. Tracking this rework loop is essential for identifying root causes of ineffective fixes and improving first-call resolution.
Where to get
Inferred from a status change from 'Resolved' or 'Closed' back to an active state like 'In Progress' or 'Assigned'. This transition is logged in the audit history.
Capture
Filter audit logs for a status change from a resolved/closed state to an open state.
Event type
inferred
|
|||
|
Investigation Started
|
Indicates that an assigned agent has started actively working on the incident. This is often represented by a status change from 'Assigned' to 'In Progress' or a similar state. | ||
|
Why it matters
Measuring the time between assignment and the start of investigation reveals queueing delays and helps assess agent responsiveness and workload capacity.
Where to get
This is inferred from a change in the 'Status' field from 'Assigned' to 'In Progress'. The timestamp of this status change is recorded in the audit log.
Capture
Filter audit logs for the first status change to 'In Progress' after an assignment.
Event type
inferred
|
|||
|
Pending Status Resumed
|
Marks the point where an incident that was on hold is reactivated. This happens when the required information is received, and is typically shown by the status moving from 'Pending' back to 'In Progress'. | ||
|
Why it matters
This event, paired with 'Incident Put On Hold', allows for the precise measurement of hold times. Analyzing long hold times can highlight communication issues with users or third parties.
Where to get
Inferred from a status change from 'Pending' to an active state like 'In Progress'. The timestamp is available in the audit log.
Capture
Filter audit logs for a status change from 'Pending' to 'In Progress' or 'Assigned'.
Event type
inferred
|
|||
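Pairing 'Incident Put On Hold' with 'Pending Status Resumed' to measure hold time might look like the following Python sketch. It assumes one case is given as (EventTimestamp, ActivityName) pairs; a hold that is never resumed is ignored here, which is a simplifying assumption. The function names are hypothetical.

```python
from datetime import datetime, timedelta


def parse_ts(value):
    """Parse the ISO-style UTC timestamps used in the examples."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def total_hold_time(case_events):
    """Sum the time between each hold and the next resume for one incident."""
    total = timedelta(0)
    hold_started = None
    for ts, activity in sorted(case_events, key=lambda e: parse_ts(e[0])):
        if activity == "Incident Put On Hold" and hold_started is None:
            hold_started = parse_ts(ts)              # open a hold interval
        elif activity == "Pending Status Resumed" and hold_started is not None:
            total += parse_ts(ts) - hold_started     # close the interval
            hold_started = None
    return total


# Hypothetical case with two hold intervals: 2h30m and 30m.
case = [
    ("2023-10-26T10:00:00Z", "Incident Reported"),
    ("2023-10-26T11:00:00Z", "Incident Put On Hold"),
    ("2023-10-26T13:30:00Z", "Pending Status Resumed"),
    ("2023-10-26T15:00:00Z", "Incident Put On Hold"),
    ("2023-10-26T15:30:00Z", "Pending Status Resumed"),
    ("2023-10-26T16:00:00Z", "Incident Resolved"),
]
hold = total_hold_time(case)
```

The resulting total can be subtracted from the raw resolution time when hold periods should not count against the support team.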
|
SLA Breach Detected
|
A calculated event that occurs when the time taken to respond or resolve an incident exceeds the targets defined in its Service Level Agreement (SLA). This is not a direct system event. | ||
|
Why it matters
Identifying SLA breaches is crucial for compliance monitoring and prioritizing critical incidents. This helps pinpoint which stages of the process contribute most to breaches.
Where to get
This event is calculated by comparing the timestamp of the 'Incident Resolved' activity (or other SLA milestones) against the 'Reported Date' and the defined SLA target for that incident's priority.
Capture
Derive by comparing the resolution timestamp to the start timestamp plus the SLA duration.
Event type
calculated
|
|||
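A minimal sketch of deriving this calculated event in Python follows. The SLA targets per priority are purely hypothetical placeholders; real targets come from your own SLA definitions. Using the deadline as the event's timestamp is also an assumption of this sketch, and time spent in 'Pending' is not excluded here.

```python
from datetime import datetime, timedelta

# Hypothetical SLA resolution targets per priority (replace with real values).
SLA_TARGETS = {
    "Critical": timedelta(hours=4),
    "High": timedelta(hours=8),
    "Medium": timedelta(hours=24),
    "Low": timedelta(hours=72),
}


def parse_ts(value):
    """Parse the ISO-style UTC timestamps used in the examples."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def sla_breach_timestamp(reported, resolved, priority):
    """Return the deadline as the breach timestamp, or None if the SLA was met.

    The deadline is the reported time plus the SLA duration for the priority.
    """
    deadline = parse_ts(reported) + SLA_TARGETS[priority]
    if parse_ts(resolved) > deadline:
        return deadline
    return None


# Resolved after five hours: breaches a 4-hour target, meets an 8-hour one.
breach_critical = sla_breach_timestamp(
    "2023-10-26T10:00:00Z", "2023-10-26T15:00:00Z", "Critical"
)
breach_high = sla_breach_timestamp(
    "2023-10-26T10:00:00Z", "2023-10-26T15:00:00Z", "High"
)
```

When a breach timestamp is returned, an 'SLA Breach Detected' row can be appended to the case's event log at that time.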
|
Transferred to Another Group
|
Represents the reassignment of an incident from one support group to another. This typically occurs when the initial group cannot resolve the issue and requires expertise from a different team. | ||
|
Why it matters
Frequent transfers, or 'ping-ponging', are a major source of delay and inefficiency. Analyzing these activities helps identify routing issues, skill gaps, and process bottlenecks.
Where to get
Inferred from a change in the value of the 'Assigned Group' field in the incident's audit history, subsequent to the initial assignment.
Capture
Each change to the 'Assigned Group' field in the audit log after the first assignment represents a transfer.
Event type
inferred
|
|||
|
User Confirmation Received
|
Represents the user acknowledging that the provided solution has fixed their issue. This is often an optional step and may be captured through a portal action or by an agent. | ||
|
Why it matters
Analyzing the rate and speed of user confirmations helps evaluate communication effectiveness and resolution quality. A low confirmation rate might lead to a higher re-open rate.
Where to get
This is difficult to capture directly and may need to be inferred. It could be a specific status like 'Resolved - Confirmed' or a note added to the incident work log.
Capture
Requires system analysis. Look for work log entries or status changes indicating user feedback.
Event type
inferred
|
|||
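Several of the inferred activities above come from status transitions in the audit log. The Python sketch below shows one possible way to map an (old status, new status) pair to an activity name. The status values follow the examples in this template; the function name and the choice to return `None` for unmapped transitions are assumptions, and real installations may use different status labels.

```python
# Statuses treated as "active" when detecting reopen and resume transitions.
OPEN_STATES = {"Assigned", "In Progress"}


def classify_transition(old_status, new_status):
    """Map a status change to an activity name, or None if it is not mapped."""
    if new_status == "Closed":
        return "Incident Closed"
    if new_status == "Resolved":
        return "Incident Resolved"
    if new_status == "Pending":
        return "Incident Put On Hold"
    if old_status in {"Resolved", "Closed"} and new_status in OPEN_STATES:
        return "Incident Reopened"
    if old_status == "Pending" and new_status in OPEN_STATES:
        return "Pending Status Resumed"
    if old_status == "Assigned" and new_status == "In Progress":
        return "Investigation Started"
    return None


# Hypothetical sequence of status changes for one incident.
transitions = [
    ("New", "Assigned"),
    ("Assigned", "In Progress"),
    ("In Progress", "Pending"),
    ("Pending", "In Progress"),
    ("In Progress", "Resolved"),
    ("Resolved", "In Progress"),
    ("In Progress", "Resolved"),
    ("Resolved", "Closed"),
]
activities = [classify_transition(old, new) for old, new in transitions]
```

Activities that depend on other fields, such as 'Assigned to Support Group' or 'Incident Categorized', would need separate rules based on the 'Assigned Group' and categorization fields.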