Exploring newland: Process mining and probability theory


Talking with Arik Senderovich

Process mining aims to extract valuable knowledge from data. In the real world, both processes and data (let alone the knowledge extracted from them!) are subject to uncertainty. In this context, probability theory naturally fits the challenge. We talk about research at the crossroads of process mining and probability with Arik Senderovich, one of the most renowned scholars investigating this fertile middle ground and PC Chair of the upcoming edition of the process mining conference, ICPM 2023.

Tell us a bit about yourself and your research institute, Arik!

I am happily married to Liya and a proud father of two wonderful children, Loren (note the Italian name) and Alice (like the girl from Wonderland). We currently live in Canada, where I work as an Assistant Professor at York University, in the northern part of Toronto. My position is at the School of Information Technology, which belongs to the Faculty of Liberal Arts and Professional Studies. We offer a strong program on the border between Computer Science and Information Systems, competing with other York programs such as Computer Science and Business Administration. I am also cross-appointed with the Rotman School of Management at the University of Toronto, where I serve as a status-only Assistant Professor. Recently, I joined the Canadian Process Mining Community (CPMC), a new initiative proposed and very well managed by Najah El-Gharib from the University of Ottawa (you can read more about the CPMC community in this edition of the newsletter).

A bit of history: after completing my PhD in process mining at the Technion (Israel), I came to Canada in 2018 for a postdoctoral fellowship, working with Chris Beck (University of Toronto, Industrial Engineering), an expert in AI scheduling. The family fell in love with Toronto the minute we landed, and we decided to stay in Canada for the long term.

My bachelor’s degree was in Industrial Engineering at the Technion, which was a great program, in my opinion, because it was inherently interdisciplinary: it offered courses on optimization, probability, stats, supply chain management, information systems, behavioral (and non-behavioral) economics, industrial psychology, and marketing. Obviously, I was far more attracted to the quantitative courses, like statistics, data mining, queueing theory, and simulation.

I learned these courses from world-class faculty members like Aharon Ben-Tal, Haya Kaspi, Avishai Mandelbaum, and others. One member of this excellent list you all know very well, as he is a well-established member of our community: Avigdor (Avi) Gal, a world expert in databases, uncertain data management, and process mining. He taught me Data Structures & Algorithms (DSA) during my second year – he did not remember me when he considered recruiting me for a PhD position, but we will get to that part later :)

In the final stages of my Industrial Engineering degree, I took some of the more quantitative courses. One of the courses in which I did particularly well was Service Engineering (which was mainly about – surprise, surprise – queueing theory and probability). This was a turning point in my path, as I was spotted for grad studies by Avishai (Avi) Mandelbaum, a world-class expert in applied probability, queueing theory, and statistics, who later became my master’s supervisor. I did my master’s in stats while serving in the military for six years and completed it only after my discharge in 2012.

Avi Mandelbaum then became my PhD co-supervisor, alongside Avi Gal. Yes, I had two Avis (or Avi², as the joke went) as my supervisors, and I am happy to still be collaborating with both of them even after graduation, both because they are awesome and kind people, and because they are the perfect living combination of data & processes on one side (AviG) and queueing theory & applied probability on the other (AviM). This intersection is still one of my biggest passions in research.

When and why did you first come up with the idea to do research in process mining?

As I was completing my master’s thesis back in 2012, I was working on a novel chapter that tried to understand the behavior of call center agents using event data that was available at the SEELab (a data-oriented lab managed by AviM). What I proposed was to model the pathways of these agents using directed graphs, or Markov chains if you wish (as the graph was annotated with probabilities and durations), which could then be visualized and analyzed. Interestingly, 10 years later, a research line that I am exploring together with Artem Polyvyanyy, Andrei Tour, and Anna Kalenkova involves constructing similar models called agent graphs.
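To make this concrete, here is a minimal sketch of how such an annotated graph could be estimated from event data. Everything in it (the log, the activity names, the timestamps) is a hypothetical toy stand-in, not the SEELab data:

```python
from collections import defaultdict

# Toy event log as (case_id, activity, timestamp in minutes) -- hypothetical
# data, sorted by case and time, standing in for real call-center events.
log = [
    ("c1", "greet", 0), ("c1", "lookup", 2), ("c1", "resolve", 7),
    ("c2", "greet", 0), ("c2", "resolve", 3),
    ("c3", "greet", 0), ("c3", "lookup", 1), ("c3", "escalate", 9),
]

# Group events into traces, one per case.
traces = defaultdict(list)
for case_id, activity, ts in log:
    traces[case_id].append((activity, ts))

counts = defaultdict(int)      # (a, b) -> number of observed a->b transitions
durations = defaultdict(list)  # (a, b) -> observed step durations
out_totals = defaultdict(int)  # a -> total outgoing transitions from a

for trace in traces.values():
    for (a, t_a), (b, t_b) in zip(trace, trace[1:]):
        counts[(a, b)] += 1
        durations[(a, b)].append(t_b - t_a)
        out_totals[a] += 1

# The annotated graph: maximum-likelihood transition probabilities
# plus mean durations on every edge.
for (a, b), n in sorted(counts.items()):
    print(f"{a} -> {b}: p={n / out_totals[a]:.2f}, "
          f"mean duration={sum(durations[(a, b)]) / n:.1f} min")
```

Each edge of the resulting graph carries a transition probability and a mean duration, which is exactly the annotation that turns a plain directed graph into a (timed) Markov chain.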

When I was reviewing the relevant literature, I came across a hardcover green book with a very interesting name. You probably guessed it by now: “Process Mining: Discovery, Conformance and Enhancement of Business Processes” by Wil van der Aalst. And then the real fun began. I showed the book to AviM, who got excited and saw my research as an opportunity to integrate algorithms from process mining into queueing theory, and vice versa. AviM organized a meeting with AviG, who back then supervised a young, talented, and ambitious postdoc, later to be known as the (still young, talented, and ambitious) Prof. Dr. Matthias Weidlich. The four of us met at AviM’s lab and decided to work together on a weird yet novel topic that later became known as queue mining. That was really the beginning of it, and I’ve been doing queue mining (along with other related topics) ever since.

To me, process mining is the bridge that can connect the different communities that influenced me, namely the operations research community and the business process management community. I enjoy building and walking that bridge every day alongside my wonderful colleagues.

How does probability theory inform process mining and management at large?

Even though some of us hate to admit it, processes in real life are inherently uncertain. Pathways are often random (an activity can be skipped because the employee was not paying attention), durations vary between cases, even for the same activity, resource assignments can depend on factors beyond our control, and so on. In my opinion, the most natural way to quantify this uncertainty is via probability theory. Think of probability as a measuring tape that can gauge the amount of uncertainty in a decision, a duration, or any other variable in your process. Models that lack probabilities can be useful (resource assignment using deterministic scheduling is still very common in practice) but do not capture the full picture. Therefore, to make our methods and algorithms more realistic, I believe that we must take a probabilistic view. This has many implications for how we analyze the models (with and without the data).

Firstly, in probabilistic models, we can ask a richer collection of highly relevant queries. We can ask about the probability that the process terminates within K steps (and not only whether it will terminate eventually); we can ask about the time it will take for the model to terminate under uncertainty – and then the answer is not a number but a distribution. And, if we take the queueing (not queuing, although the latter is also grammatically correct) perspective, we can ask whether, in the long run, our system will remain stable or not (i.e., whether the number of cases will keep growing on average – a soundness problem in the probabilistic sense).
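The first of these queries, for example, reduces to simple matrix iteration. Below is a minimal sketch with a made-up four-state chain (not a model discovered from any real log):

```python
import numpy as np

# A made-up four-state absorbing Markov chain: start, work, rework, done.
# "done" is absorbing; all probabilities are toy numbers for illustration.
P = np.array([
    [0.0, 1.0, 0.0, 0.0],  # start  -> work
    [0.0, 0.0, 0.3, 0.7],  # work   -> rework (0.3) or done (0.7)
    [0.0, 1.0, 0.0, 0.0],  # rework -> work
    [0.0, 0.0, 0.0, 1.0],  # done   -> done (absorbing)
])

def p_done_within(P, start, done, k):
    """P(the chain reaches the absorbing state within k steps)."""
    dist = np.zeros(P.shape[0])
    dist[start] = 1.0
    for _ in range(k):
        dist = dist @ P
    # Because "done" is absorbing, its mass equals the probability
    # of having terminated at some point during the first k steps.
    return dist[done]

for k in (2, 4, 8, 16):
    print(f"K={k:2d}: P(terminated) = {p_done_within(P, 0, 3, k):.4f}")
```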

Secondly, our models can capture richer process behavior: we can attach probabilities to a decision instead of leaving it to non-determinism. We can consider the probability of a patient missing their appointment, of a doctor being late for clinic hours, or of a nurse mixing up medications, requiring a restart of the procedure.

Lastly, the way we look at data changes, as probabilistic models open the door to their best friends: statistical models. The data that we see in our event logs can be viewed as just one realization, or a single sample, of the model, since a probabilistic model can generate many sample paths. “Is the discovered model good enough given the data that we used to generate it?” is a question that is currently mostly answered (from a probabilistic perspective) based on a single realization of the true model, which is very narrow and insufficient from a statistical standpoint. Does that mean that we always need more data samples to answer this question? Not necessarily. We may “cut” the data into chunks that are (under assumptions) sufficiently independent, and then treat them as multiple samples from the model. Such questions were recently addressed in our ICPM 2022 paper, where, together with Alessio Cecconi and Claudio Di Ciccio, we used the data as a realization of stochastic declarative models to quantify the interestingness of traces and logs.
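As a toy illustration of this chunking idea, suppose (purely as an assumption for the sketch) that cases arriving in different weeks are approximately independent. One statistic per chunk then gives a spread, and hence a confidence interval, rather than a single point estimate:

```python
import random
import statistics

random.seed(7)

# Hypothetical per-case cycle times (hours), grouped by arrival week.
# In a real log, these would be derived from event timestamps; here they
# are synthetic, and weekly independence is assumed purely for the sketch.
weeks = [[random.gauss(24, 6) for _ in range(50)] for _ in range(10)]

# One statistic per (approximately independent) chunk ...
chunk_means = [statistics.mean(week) for week in weeks]

# ... so the spread across chunks quantifies our uncertainty,
# instead of a single point estimate over the whole log.
overall = statistics.mean(chunk_means)
stderr = statistics.stdev(chunk_means) / len(chunk_means) ** 0.5
print(f"mean cycle time ~ {overall:.1f}h +/- {1.96 * stderr:.1f}h (rough 95% CI)")
```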

As you can see, adding probability to our models opens new horizons and sets exciting paths for process mining to follow. As for process management more generally, I believe that a probabilistic view should be adopted in other parts of the BPM life-cycle as well. When optimizing, we must start considering stochasticity (e.g., when solving scheduling problems). A recent challenge we’ve been working on with Remco Dijkman looks at process optimization from a probabilistic perspective, asking: how would bad (or accurate) predictions inform stochastic resource assignment problems?

Moreover, decision-making, at run-time or offline, should be informed by probabilistic causal models such as causal Bayesian networks. In recent work with my excellent master’s student Ali Alaee, joint with Avigdor Gal and Matthias Weidlich, we study how interventions on contextual variables would impact decisions.

How do you see the interplay of probability theory & process mining in the future? In other words, what are the typical challenges that darken the nights of a process miner dealing with probability?

The main challenges and questions at the interplay of probability and process mining that I foresee as highly interesting – and that are truly keeping me up at night drinking a lot of strong black coffee, because we are working on papers and tools to develop these topics – include, but are of course not limited to, the following.

How to automatically build stochastic, queueing-aware models of the system from event logs? Together with my colleagues Dmitry Krass and Opher Baron from the Rotman School of Management at the University of Toronto, we recently patented ServiceMiner, a tool that tries to answer this question. It is based on AI and process mining, but its uniqueness is that it brings together insights from operations management (e.g., queueing theory), ML (e.g., clustering of cases into types), and process mining (e.g., variant discovery) to provide a better solution to businesses that experience high congestion in their daily operations. In a related project, together with David Chapela and Marlon Dumas from Tartu, we investigate the discovery of simulation models with an emphasis on the question: how do we compare probabilistic models after they have been discovered by simulation mining methods like ServiceMiner? We propose a non-trivial collection of measures that test various aspects of the underlying stochastic model, including the arrival process, the control flow, and congestion.
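The following is not one of the measures from the paper, but it sketches one basic ingredient of such a comparison: the distance between the cycle-time distribution each discovered model generates and the one observed in held-out data, here with synthetic samples and the earth mover’s (Wasserstein) distance:

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(42)

# Cycle-time samples (hours) generated by two discovered simulation models --
# synthetic stand-ins for the output of two simulation-mining methods.
model_a = rng.exponential(scale=10.0, size=5000)
model_b = rng.gamma(shape=2.0, scale=5.0, size=5000)  # same mean, less variance

# "Ground truth": cycle times observed in a held-out part of the event log
# (also synthetic here, for illustration only).
observed = rng.exponential(scale=10.0, size=2000)

for name, sample in (("model A", model_a), ("model B", model_b)):
    d = wasserstein_distance(observed, sample)
    print(f"{name}: earth mover's distance to observed cycle times = {d:.2f}")
```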

How to best integrate causality and statistics into process management and mining? Probabilistic models open the door for statistics, but simply mining the data without considering causality may lead to bad results (especially for what-if and prescriptive analytics). In process mining and predictive monitoring, we often implicitly assume some form of causality, e.g., by saying that if A is always followed by B, then A is the cause of B, or by lumping all available features together in regression models, thus assuming that the features have an equal influence on the outcome. However, causality is also probabilistic. In the work I already mentioned on causal decision support, we explain, using examples and formal arguments, that we must be more careful about what we assume, and that we need additional steps to integrate causality into BPM and process mining beyond the existing methods that do so.
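A tiny simulation (entirely synthetic, constructed only to illustrate the pitfall) shows how a hidden confounder can make an activity look predictive of an outcome it does not cause:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hidden confounder: case complexity drives BOTH the occurrence of an
# extra-check activity A and the bad outcome. By construction, A itself
# has no causal effect on the outcome in this toy model.
complexity = rng.random(n)
a = rng.random(n) < complexity            # complex cases tend to include A
bad_outcome = rng.random(n) < complexity  # complex cases tend to end badly

# The naive "mining" view: A looks strongly associated with bad outcomes ...
print(f"P(bad | A)    = {bad_outcome[a].mean():.3f}")
print(f"P(bad | no A) = {bad_outcome[~a].mean():.3f}")

# ... yet intervening on A (forcing it on or off) would change nothing,
# since bad_outcome depends only on complexity. A prescriptive rule
# "skip A to avoid bad outcomes" learned from this correlation would fail.
```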

How to choose the best probabilistic formalism for various types of analyses? There are many probabilistic models out there. Colored stochastic Petri nets are very useful for representing complex simulation models of business processes, but they can only be analyzed via simulation, and they often do not enable insights, as they become a bit like black-box models (with numerous parameters and configurations). On the other hand, Markovian queueing and Petri net models (such as Jackson networks and stochastic Petri nets) can be analyzed via applied probability in closed form, without requiring simulation (for many of the KPIs). These models make multiple simplifying assumptions to enable very simple yet impactful insights. However, the question remains: is there more out there? Are there models that are more complex than the Markovian families, but can still be used in process mining to drive insights without resorting to simulation?
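As a small example of what closed-form Markovian analysis buys us, the textbook M/M/1 queue yields its key KPIs directly from two parameters, with no simulation at all (toy rates below):

```python
# Closed-form KPIs for the textbook M/M/1 queue: Poisson arrivals at rate
# lam, exponential service at rate mu, a single server. Rates are toy values.
lam, mu = 4.0, 5.0   # 4 arrivals/hour, 5 service completions/hour

rho = lam / mu       # utilization; the system is stable iff rho < 1
L = rho / (1 - rho)  # expected number of cases in the system
W = 1 / (mu - lam)   # expected sojourn time; Little's law gives L = lam * W
Wq = rho / (mu - lam)  # expected waiting time in queue (before service)

print(f"utilization={rho:.2f}, E[cases in system]={L:.2f}, "
      f"E[sojourn]={W:.2f}h, E[wait in queue]={Wq:.2f}h")
```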

In a recent research thread, together with Marco Montali, we look at recent developments in queueing theory that we would like to connect to process mining, and see whether we can perform data-driven scheduling and solve other optimization problems that are highly relevant for object-centric process mining. I think that these three threads (among many others) would be very cool directions for the community to tackle in the years to come. I certainly intend to do so :)

A few words on ICPM 2023 as a PC Chair? Any news, whispers, or suggestions to prospective authors...?

Please let me start by saying that I am very grateful for the exciting opportunity to serve the community as a PC Chair for this year’s ICPM edition. I am extremely lucky to have Stefanie and Jorge next to me. With them as co-PC Chairs, and with Claudio and Andrea as our general chairs, I feel very confident that this year’s ICPM will be a huge success (not to mention the fact that it is in Rome, the cradle of civilization as we know it). The call for papers has been out for quite some time now, and we hope that the talented researchers in our community are already making plans and advancing the state of the art towards the deadline (June 13, 2023 for papers; abstracts are due, as usual, one week earlier). We expect to receive high-quality submissions, and of course, only the fittest will survive the careful selection process. We have top-notch members in our program committee, with several new faces joining, who will diligently and objectively assess the papers and select only the best of breed. I am confident that the scientific program will be a blast, especially with so much happening in industry, informing us researchers with novel use cases and ideas.

My suggestion to the authors – and this is purely my take on PM research – is to try to lay down the foundations of your approach in a well-formed and (ideally) well-defined fashion. It is nice when your approach can solve a practical problem, but one would also want to know how the method works conceptually (not only that it works), what the alternatives to your proposal could be, and why those alternatives did not work. Often the difference between research and engineering is that engineering provides a single path that solves the problem, while research explores multiple paths, focuses on one of them, and explains why it is the best one to follow. This research-driven selection process allows us to learn, generalize, and expand the body of knowledge. It’s not that the next predictive monitoring method is unimportant, but if it works well, we want to know why, and we want to know what did not work as well as your method while you were developing it. We want to know what we can learn from your paper, beyond the (important yet less informative) 5% improvement in RMSE on BPIC2017 :)

Good luck to all the authors, and as I said, I am sure that we will have a fantastic ICPM edition in Rome!