Exploring newland: process mining and automated planning

challenge and study conference room exploring newland

Talking with Andrea Marrella

Over the years, automated planning revealed a handy Swiss knife for process mining. We speak about the research on this and more with Andrea Marrella, one of the pioneers of this cross-border area.

Andrea, tell us a bit about yourself and your research institute!

In October 2013, I acquired my PhD in Computer Science and Engineering. I had my postdoc till March 2019 and currently am an assistant professor at the Sapienza University of Rome. I had the luck to start teaching as a tutor already for Bachelor courses since I was a Master student, and also the privilege of participating in an EU research project in 2006 (WORKPAD). The interest in processes began with that very project. The aim of the project was emergency management. Procedures were not really standard or repetitive. Emergency management required ad-hoc strategies and adaptiveness. Hence my interest from the early stages in loosely-structured processes. That project pioneered the field as nowadays the management of what we would call “knowledge-intensive processes” is rather widespread. The adaptation to contingencies of a single case was key.

My PhD topic was specifically about dynamic runtime process adaptiveness. Specifically, I developed a novel approach and a concrete system implementation, called SmartPM, which allows to automatically adapt running processes when unanticipated runtime exceptions and exogenous events occur without explicitly defining recovery policies at design-time. The design and execution engine of SmartPM is backed by a formal model based on two well known action-based formalisms in AI (situation calculus and IndiGolog), while I resorted to planning to find recovery procedures on-the-fly. In other words, I had non-predefined exception handlers but let the planner generate a strategy to recover from a starting point to the goal (that is, turning the observed status into the expected one).

So my interest in planning versus processes came about.

During my PhD studies, I spent 9 months in Canada, working with Yves Lespérance, one of the most renowned experts in Golog. We applied planning for automated process synthesis (we called it template design), not starting from the event log but from the knowledge of the context and the goal. Planning allowed us to build a plan that led from the world status to the goal status. That plan was the process. Notice that, typically, a plan is a sequence of steps. Instead, our technique catered for partial orders and this allowed us to encompass parallelism and choice points. A particularly interesting design of that technique was that we did not need to know the whole world status: partial knowledge was sufficient (see here).

When and why did you first come up with the idea to do research in process mining? Why does planning offer a good toolkit for process mining?

The leap towards process mining happened in 2016. The collaborations with Massimiliano de Leoni and Fabrizio Maria Maggi, thanks to their strong interest in the field, pushed me towards the investigation of planning in a conformance checking context. With planning, we reduced trace alignment (let it be declarative or imperative) to an automated planning problem. The procedure was not so straightforward: We developed an ad-hoc ground algorithm to turn alignment into classic planning. Classic planning offered mature general-purpose tools and techniques, which we could employ for our intents and purposes. We have published a paper at AAAI’17 and an article in Expert Systems with Applications about this technique. As it turns out, it’s lightning-fast! It scales well also in presence of traces and models of considerable size.

This opportunity led to solving a number of additional problems. For example, we tackled alignments in the presence of coarse-grained timestamps – such that some events are marked as if they were contemporary, against the typical event log assumption. Very recently, we began working on a paper on alignments of declarative constraints with data employing planning techniques.

In short: planning offers a plethora of techniques and tools that can boost process mining and conformance checking in particular.

What are the typical challenges that darken the nights of a planner in process mining?

I see two main challenges here. First, planning shows quite some practical issues when it comes to handling integer parameters. Which might seem negligible, but it’s not. Temporal planners can handle integers well but non-classic planning algorithms are mostly designed for satisfiability checking (that is, showing that a plan exists) rather than for reaching an optimum. In conformance checking, though, one looks for the optimal alignment, not just one alignment. A practical drawback is that with a sub-optimal solution, a trace alignment might include many moves that are completely unnecessary. To avoid this effect, we should resort to classic planning, for which mature and efficient heuristics exist. Notice that theoretical results for non-classic planning are noteworthy, no doubt, but tools are not that mature at present. They would be in a few years, perhaps.

Another challenge is, planning is inherently a computationally intractable problem in the worst case, even for the simplest models, so handle it with care! Remember that planners are general-purpose: they are not bound to specific domains or problems. The price for generality is computational. However, if the problem reduction from trace alignment is smart, then it can practically scale quite well. One of the most expensive steps for planners is the creation of the search space (crossing actions and domain states for possible evolutions). So, the more one flattens the operations down to their instantiations, the more the overall performance improves. A sort of ex-ante propositionalization or grounding, if you like. Preprocessing can help to flatten the problem inside the domain. This trick is not a big deal to realise and the planner can take enormous advantage of a pre-digested input, as the time saved typically results in a performance boost. However, it implies tailoring the problem to the system at hand.

How do you see the interplay of planning & process mining in the future?

There are a number of exciting challenges ahead of us. Interested readers can have a look at this article, which discusses how automated planning techniques can be leveraged to enable new levels of automation and support for solving concrete problems in the BPM field that were previously tackled with hard-coded solutions. I myself have been trying to use planning for process model repair based on rules for years, for example. To understand if a process complies with a set of given rules, for instance, we should unfold it into its linearisations. It is quite a complex passage and is prone to performance disruption especially when the process has an elevated degree of concurrency. In our paper at ICSOC 2018 we made quite a number of assumptions to make this problem tractable (for instance, well-structuredness and acyclicity).

Also, we would like to use it for robotic process automation – in particular, discovering routines from event logs and segmenting the event stream into separate cases. It smells like undecidability if assumptions are not in place (think of how to distinguish to which case multiple copy-pasting operations are there), but we are working on it.

One day, new planning heuristics that are ad-hoc for process mining should see the light.

If you are interested in these topics, you are welcome to attend our tutorial at BPM 2021 with Tathagata Chakraborti!

As an organiser of BPM 2021 workshops, what’s your take on the upcoming events?

This year we follow the long-standing tradition of BPM workshops with a few new entries. We host the same events as last year, including the BPI workshop and three new ones, which could be of inspiration for new investigations in process mining: PROBLEMS, on the problems to solve in BPM before we die, and the workshops on BPM governance for and beyond digital transformation (BPMGOV’21) and on BPM and Routine Dynamics (BPM&RD’21), very suitable for the management track. We expect a great involvement of the audience as always, though with a totally new format: the blended modality, with a number of researchers and practitioners joining us in person in Rome and a wide audience connected on-line.