|
Process Mining |
3 |
|
|
Preface |
6 |
|
|
Acknowledgements |
9 |
|
|
Contents |
13 |
|
|
Part I: Introduction |
18 |
|
|
Chapter 1: Data Science in Action |
20 |
|
|
1.1 Internet of Events |
20 |
|
|
1.2 Data Scientist |
27 |
|
|
1.3 Bridging the Gap Between Process Science and Data Science |
32 |
|
|
1.4 Outlook |
37 |
|
|
Chapter 2: Process Mining: The Missing Link |
41 |
|
|
2.1 Limitations of Modeling |
41 |
|
|
2.2 Process Mining |
46 |
|
|
2.3 Analyzing an Example Log |
51 |
|
|
2.4 Play-In, Play-Out, and Replay |
57 |
|
|
2.5 Positioning Process Mining |
60 |
|
|
2.5.1 How Process Mining Compares to BPM |
60 |
|
|
2.5.2 How Process Mining Compares to Data Mining |
62 |
|
|
2.5.3 How Process Mining Compares to Lean Six Sigma |
62 |
|
|
2.5.4 How Process Mining Compares to BPR |
65 |
|
|
2.5.5 How Process Mining Compares to Business Intelligence |
65 |
|
|
2.5.6 How Process Mining Compares to CEP |
66 |
|
|
2.5.7 How Process Mining Compares to GRC |
66 |
|
|
2.5.8 How Process Mining Compares to ABPD, BPI, WM, … |
67 |
|
|
2.5.9 How Process Mining Compares to Big Data |
68 |
|
|
Part II: Preliminaries |
69 |
|
|
Chapter 3: Process Modeling and Analysis |
71 |
|
|
3.1 The Art of Modeling |
71 |
|
|
3.2 Process Models |
73 |
|
|
3.2.1 Transition Systems |
74 |
|
|
3.2.2 Petri Nets |
75 |
|
|
3.2.3 Workflow Nets |
81 |
|
|
3.2.4 YAWL |
82 |
|
|
3.2.5 Business Process Modeling Notation (BPMN) |
84 |
|
|
3.2.6 Event-Driven Process Chains (EPCs) |
86 |
|
|
3.2.7 Causal Nets |
88 |
|
|
3.2.8 Process Trees |
94 |
|
|
3.3 Model-Based Process Analysis |
99 |
|
|
3.3.1 Verification |
99 |
|
|
3.3.2 Performance Analysis |
101 |
|
|
3.3.3 Limitations of Model-Based Analysis |
104 |
|
|
Chapter 4: Data Mining |
105 |
|
|
4.1 Classification of Data Mining Techniques |
105 |
|
|
4.1.1 Data Sets: Instances and Variables |
106 |
|
|
4.1.2 Supervised Learning: Classification and Regression |
108 |
|
|
4.1.3 Unsupervised Learning: Clustering and Pattern Discovery |
110 |
|
|
4.2 Decision Tree Learning |
110 |
|
|
4.3 k-Means Clustering |
116 |
|
|
4.4 Association Rule Learning |
120 |
|
|
4.5 Sequence and Episode Mining |
123 |
|
|
4.5.1 Sequence Mining |
123 |
|
|
4.5.2 Episode Mining |
125 |
|
|
4.5.3 Other Approaches |
127 |
|
|
4.6 Quality of Resulting Models |
128 |
|
|
4.6.1 Measuring the Performance of a Classifier |
129 |
|
|
4.6.2 Cross-Validation |
131 |
|
|
4.6.3 Occam's Razor |
134 |
|
|
Part III: From Event Logs to Process Models |
138 |
|
|
Chapter 5: Getting the Data |
140 |
|
|
5.1 Data Sources |
140 |
|
|
5.2 Event Logs |
143 |
|
|
5.3 XES |
153 |
|
|
5.4 Data Quality |
159 |
|
|
5.4.1 Conceptualizing Event Logs |
160 |
|
|
5.4.2 Classification of Data Quality Issues |
163 |
|
|
5.4.3 Guidelines for Logging |
166 |
|
|
5.5 Flattening Reality into Event Logs |
168 |
|
|
Chapter 6: Process Discovery: An Introduction |
178 |
|
|
6.1 Problem Statement |
178 |
|
|
6.2 A Simple Algorithm for Process Discovery |
182 |
|
|
6.2.1 Basic Idea |
182 |
|
|
6.2.2 Algorithm |
186 |
|
|
6.2.3 Limitations of the α-Algorithm |
189 |
|
|
6.2.4 Taking the Transactional Life-Cycle into Account |
192 |
|
|
6.3 Rediscovering Process Models |
193 |
|
|
6.4 Challenges |
197 |
|
|
6.4.1 Representational Bias |
198 |
|
|
6.4.2 Noise and Incompleteness |
200 |
|
|
6.4.2.1 Noise |
200 |
|
|
6.4.2.2 Incompleteness |
201 |
|
|
6.4.2.3 Cross-Validation |
202 |
|
|
6.4.3 Four Competing Quality Criteria |
203 |
|
|
6.4.4 Taking the Right 2-D Slice of a 3-D Reality |
207 |
|
|
Chapter 7: Advanced Process Discovery Techniques |
210 |
|
|
7.1 Overview |
210 |
|
|
7.1.1 Characteristic 1: Representational Bias |
212 |
|
|
7.1.2 Characteristic 2: Ability to Deal With Noise |
213 |
|
|
7.1.3 Characteristic 3: Completeness Notion Assumed |
214 |
|
|
7.1.4 Characteristic 4: Approach Used |
214 |
|
|
7.1.4.1 Direct Algorithmic Approaches |
214 |
|
|
7.1.4.2 Two-Phase Approaches |
214 |
|
|
7.1.4.3 Divide-and-Conquer Approaches |
215 |
|
|
7.1.4.4 Computational Intelligence Approaches |
215 |
|
|
7.1.4.5 Partial Approaches |
216 |
|
|
7.2 Heuristic Mining |
216 |
|
|
7.2.1 Causal Nets Revisited |
216 |
|
|
7.2.2 Learning the Dependency Graph |
217 |
|
|
7.2.3 Learning Splits and Joins |
220 |
|
|
7.3 Genetic Process Mining |
222 |
|
|
7.4 Region-Based Mining |
227 |
|
|
7.4.1 Learning Transition Systems |
227 |
|
|
7.4.2 Process Discovery Using State-Based Regions |
231 |
|
|
7.4.3 Process Discovery Using Language-Based Regions |
233 |
|
|
7.5 Inductive Mining |
237 |
|
|
7.5.1 Inductive Miner Based on Event Log Splitting |
237 |
|
|
7.5.2 Characteristics of the Inductive Miner |
244 |
|
|
7.5.3 Extensions and Scalability |
248 |
|
|
7.6 Historical Perspective |
251 |
|
|
Part IV: Beyond Process Discovery |
256 |
|
|
Chapter 8: Conformance Checking |
258 |
|
|
8.1 Business Alignment and Auditing |
258 |
|
|
8.2 Token Replay |
261 |
|
|
8.3 Alignments |
271 |
|
|
8.4 Comparing Footprints |
278 |
|
|
8.5 Other Applications of Conformance Checking |
283 |
|
|
8.5.1 Repairing Models |
283 |
|
|
8.5.2 Evaluating Process Discovery Algorithms |
284 |
|
|
8.5.3 Connecting Event Log and Process Model |
287 |
|
|
Chapter 9: Mining Additional Perspectives |
290 |
|
|
9.1 Perspectives |
290 |
|
|
9.2 Attributes: A Helicopter View |
292 |
|
|
9.3 Organizational Mining |
296 |
|
|
9.3.1 Social Network Analysis |
297 |
|
|
9.3.2 Discovering Organizational Structures |
302 |
|
|
9.3.3 Analyzing Resource Behavior |
303 |
|
|
9.4 Time and Probabilities |
305 |
|
|
9.5 Decision Mining |
309 |
|
|
9.6 Bringing It All Together |
312 |
|
|
Chapter 10: Operational Support |
316 |
|
|
10.1 Refined Process Mining Framework |
316 |
|
|
10.1.1 Cartography |
318 |
|
|
10.1.2 Auditing |
319 |
|
|
10.1.3 Navigation |
320 |
|
|
10.2 Online Process Mining |
320 |
|
|
10.3 Detect |
322 |
|
|
10.4 Predict |
326 |
|
|
10.5 Recommend |
331 |
|
|
10.6 Processes Are Not in Steady State! |
333 |
|
|
10.6.1 Daily, Weekly and Seasonal Patterns in Processes |
333 |
|
|
10.6.2 Contextual Factors |
333 |
|
|
10.6.3 Concept Drift in Processes |
335 |
|
|
10.7 Process Mining Spectrum |
336 |
|
|
Part V: Putting Process Mining to Work |
337 |
|
|
Chapter 11: Process Mining Software |
339 |
|
|
11.1 Process Mining Not Included! |
339 |
|
|
11.2 Different Types of Process Mining Tools |
341 |
|
|
11.3 ProM: An Open-Source Process Mining Platform |
345 |
|
|
11.3.1 Historical Context |
345 |
|
|
11.3.2 Example ProM Plug-Ins |
347 |
|
|
11.3.3 Other Non-commercial Tools |
351 |
|
|
11.3.3.1 PMLAB |
351 |
|
|
11.3.3.2 CoBeFra |
351 |
|
|
11.3.3.3 RapidProM |
352 |
|
|
11.4 Commercial Software |
353 |
|
|
11.4.1 Available Products |
353 |
|
|
11.4.2 Strengths and Weaknesses |
359 |
|
|
11.4.2.1 Limited Support for Concurrency |
359 |
|
|
11.4.2.2 Limited Support for Conformance Checking |
361 |
|
|
11.4.2.3 Performance Perspective is Well Supported |
362 |
|
|
11.4.2.4 Data Perspective Not in Models |
362 |
|
|
11.4.2.5 Organizational Perspective |
362 |
|
|
11.4.2.6 Growing Support for XES |
363 |
|
|
11.4.2.7 Getting Event Data from Other Sources |
363 |
|
|
11.4.2.8 Filtering |
363 |
|
|
11.4.2.9 No Automatic Clustering |
363 |
|
|
11.4.2.10 Reporting and Animation |
364 |
|
|
11.4.2.11 Links to Other Tools |
365 |
|
|
11.4.2.12 Operational Support |
365 |
|
|
11.4.2.13 Scalability |
365 |
|
|
11.5 Outlook |
366 |
|
|
Chapter 12: Process Mining in the Large |
367 |
|
|
12.1 Big Event Data |
367 |
|
|
12.1.1 N = All |
368 |
|
|
12.1.2 Hardware and Software Developments |
370 |
|
|
12.1.2.1 In-Memory Databases and Analytics |
373 |
|
|
12.1.2.2 Columnar Databases |
374 |
|
|
12.1.2.3 Large-Scale Distributed File Systems |
375 |
|
|
12.1.3 Characterizing Event Logs |
378 |
|
|
12.2 Case-Based Decomposition |
382 |
|
|
12.2.1 Conformance Checking Using Case-Based Decomposition |
383 |
|
|
12.2.2 Process Discovery Using Case-Based Decomposition |
384 |
|
|
12.3 Activity-Based Decomposition |
387 |
|
|
12.3.1 Conformance Checking Using Activity-Based Decomposition |
388 |
|
|
12.3.2 Process Discovery Using Activity-Based Decomposition |
390 |
|
|
12.4 Process Cubes |
392 |
|
|
12.5 Streaming Process Mining |
395 |
|
|
12.6 Beyond the Hype |
398 |
|
|
Chapter 13: Analyzing "Lasagna Processes" |
400 |
|
|
13.1 Characterization of "Lasagna Processes" |
400 |
|
|
13.2 Use Cases |
404 |
|
|
13.3 Approach |
405 |
|
|
13.3.1 Stage 0: Plan and Justify |
406 |
|
|
13.3.2 Stage 1: Extract |
408 |
|
|
13.3.3 Stage 2: Create Control-Flow Model and Connect Event Log |
408 |
|
|
13.3.4 Stage 3: Create Integrated Process Model |
409 |
|
|
13.3.5 Stage 4: Operational Support |
409 |
|
|
13.4 Applications |
410 |
|
|
13.4.1 Process Mining Opportunities per Functional Area |
410 |
|
|
13.4.2 Process Mining Opportunities per Sector |
411 |
|
|
13.4.3 Two Lasagna Processes |
415 |
|
|
13.4.3.1 RWS Process |
415 |
|
|
13.4.3.2 WOZ Process |
417 |
|
|
Chapter 14: Analyzing "Spaghetti Processes" |
423 |
|
|
14.1 Characterization of "Spaghetti Processes" |
423 |
|
|
14.2 Approach |
427 |
|
|
14.3 Applications |
430 |
|
|
14.3.1 Process Mining Opportunities for Spaghetti Processes |
430 |
|
|
14.3.2 Examples of Spaghetti Processes |
432 |
|
|
14.3.2.1 ASML |
432 |
|
|
14.3.2.2 Philips Healthcare |
433 |
|
|
14.3.2.3 AMC Hospital |
436 |
|
|
Part VI: Reflection |
440 |
|
|
Chapter 15: Cartography and Navigation |
442 |
|
|
15.1 Business Process Maps |
442 |
|
|
15.1.1 Map Quality |
443 |
|
|
15.1.2 Aggregation and Abstraction |
443 |
|
|
15.1.3 Seamless Zoom |
445 |
|
|
15.1.4 Size, Color, and Layout |
449 |
|
|
15.1.5 Customization |
451 |
|
|
15.2 Process Mining: TomTom for Business Processes? |
452 |
|
|
15.2.1 Projecting Dynamic Information on Business Process Maps |
452 |
|
|
15.2.2 Arrival Time Prediction |
455 |
|
|
15.2.3 Guidance Rather than Control |
455 |
|
|
Chapter 16: Epilogue |
457 |
|
|
16.1 Process Mining as a Bridge Between Data Mining and Business Process Management |
457 |
|
|
16.2 Challenges |
459 |
|
|
16.3 Start Today! |
461 |
|
|
References |
463 |
|
|
Index |
473 |
|