Academic stories: Artem Polyvyanyy


Narrated by Artem Polyvyanyy

My name is Artem Polyvyanyy. I am a Senior Lecturer at the University of Melbourne, Australia. The University was founded in 1853 and is consistently ranked among the leading universities in the world. Within the University, I belong to the School of Computing and Information Systems (CIS) in the Melbourne School of Engineering, a place full of brilliant academics and gifted students.

Before joining CIS, I was, until 2018, a Research Fellow and later a Lecturer at the School of Information Systems, Queensland University of Technology, Australia, where I landed in 2012 after completing my PhD in the Business Process Technology Group led by Mathias Weske at the Hasso Plattner Institute, Potsdam, Germany. If you are puzzled by the etymology of my name, let me help you. I was born in Mariupol, Ukraine, and my English name is a transliteration of the Cyrillic Артем Полив’яний. My ancestors come from Polyvyane, a small village in the picturesque Poltava region of Ukraine. Later, in Kyiv, Ukraine, I took my first steps in Computer Science (2000-2004) at the National University of Kyiv-Mohyla Academy, one of the oldest universities in Eastern Europe.

At the University of Melbourne, I coordinate and (co-)teach several subjects, including Foundations of Algorithms at the bachelor's level, a class that recently attracted as many as 800 students, and Modelling Complex Software Systems at the master's level. I also coordinate and teach Data Warehousing at the Melbourne Business School. I co-supervise graduate research students and coordinate the Master of Information Technology course. Finally, once all the above duties are under control, together with my colleagues and PhD students, I enjoy solving complex puzzles in the areas of Process Mining and Process Querying.

I followed developments in Process Mining from the beginning of my PhD studies in Potsdam at the end of 2007, but my research in this area started relatively recently. During one of my initial meetings with Mathias Weske, my PhD advisor, I was told that doing a PhD is similar to running a start-up, meaning everything is in your control. You need to identify a problem, demonstrate its significance, and propose and evaluate effective and efficient solutions to this problem, Mathias said. My original plan was to start working on ideas at the intersection of Information Retrieval and Business Process Management, two areas of strength for me at that time, but Ahmed Awad was already driving a similar PhD project in the Group. The day after my meeting with Mathias, I came back with another idea. Imagine there is no process model, I said. Then, eventually, we observe a first execution of the process. For example, this first execution could start with task A, followed by task B, and conclude with task C. So, we construct the corresponding model that captures the observed ordering constraints between the tasks. Later, we observe a second execution of the same process, which, for instance, evidences that tasks A and B can be followed by an execution of task D. Consequently, we update the first version of the model to specify that task B should be followed by an exclusive choice between tasks C and D. In this way, by observing and incorporating constraints from subsequent process executions, we incrementally learn the model of the process. Mathias listened to this idea patiently and then told me that it reminded him very much of Process Mining, that researchers in Eindhoven, The Netherlands (specifically the research group at the Eindhoven University of Technology, led by Wil van der Aalst at that time), were experts in this area, and that we should probably think of a different PhD topic. Indeed, Wil had coined the concept of Process Mining at least ten years before that chat of ours in Potsdam.
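For illustration only, here is a minimal sketch in Python of the idea described above: for each task, record which tasks have been observed to directly follow it, and update that relation with every new execution. The names are hypothetical, and the sketch glosses over everything that makes real process discovery hard, such as concurrency, loops, and noise.

```python
# Illustrative sketch only: incrementally learn ordering constraints
# (a direct-follows relation) from observed executions of a process.

def update_model(model, trace):
    """Add the ordering constraints observed in one execution (trace)
    to the model, a mapping from a task to the set of tasks observed
    to directly follow it."""
    for current_task, next_task in zip(trace, trace[1:]):
        model.setdefault(current_task, set()).add(next_task)
    return model

model = {}
update_model(model, ["A", "B", "C"])  # first observed execution
update_model(model, ["A", "B", "D"])  # second observed execution

# After two executions, B may be followed by either C or D, i.e., an
# exclusive choice between C and D (set printing order may vary):
print(model)  # {'A': {'B'}, 'B': {'C', 'D'}}
```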

Over the next two to three years, I worked on techniques for modeling flexible processes using hypergraphs, abstracting (or simplifying) process models, parsing workflow graphs, and structuring process models. I decided that the latter topic would form the core of my dissertation exactly three years after I started my PhD project, on the day our paper on structuring, co-authored with Luciano García-Bañuelos and Marlon Dumas, won the best paper award at the 8th International Conference on Business Process Management in Hoboken, NJ, United States. The problem of structuring process models deals with transforming an arbitrarily structured model into a behaviourally equivalent model composed of four structural patterns: sequence, choice, repetition, and the option of parallel execution. Thus, it generalizes the problem of eliminating GOTO statements in computer programs by addressing programs, or process models, that support concurrent execution of instructions.

I wrote my first paper in the area of Process Mining in 2016–2017. The paper was co-authored with Wil van der Aalst, Arthur ter Hofstede, and Moe Wynn, and contributed to solving the problem of automated repair of a process model based on real-world event data. In early 2016, Wil spent a month with our research group in Brisbane, triggering the collaboration. This collaboration was my first true immersion in the fundamental problems and challenges of the Process Mining area. However, my story here is not about process model repair, but about my more recent results in conformance checking based on the concept of entropy.

The idea of using entropy for process analysis was “planted” in my mind in December 2015 during my visit to the research group of Jan Mendling at the Vienna University of Economics and Business, Austria. In 2015, Andreas Solti, Claudio Di Ciccio, and Jan, together with Matthias Weidlich from the Humboldt University of Berlin, Germany, were iterating on ideas for applying the concept of entropy from Information Theory in Process Mining. I must admit I did not pay much attention to those ideas in 2015, as my research agenda was occupied with initial ideas in Process Querying – my “hobby” research topic. Only in June 2017, when designing quality criteria for the outcomes of Process Querying operations, did I realize that to assess the quality of combining two process models into one, for instance using an SQL-like INSERT or UPDATE statement, I might want to measure the possibly infinite collections of process traces described by the two models being combined and by the resulting model. It was then that the idea of applying the concept of entropy to that end “germinated” in me.
I contacted Andreas to check on the progress of their 2015 endeavors and whether they had published on the topic. Andreas replied that those initial thoughts had not led to a publication and that the work was on hold. But he did share with me a battery of papers on entropy they had collected in Vienna while “plowing” the field. So, there I was, sitting at my desk in Brisbane on June 14, 2017 (I still have my notes from that day), going over this collection of papers one by one, eventually reaching the paper on the entropy of regular languages by Tullio Ceccherini-Silberstein et al. Once I read page 3 of the paper, specifically Theorem 1 and the accompanying discussion, it became clear to me that the notion of topological entropy presented in that paper could be used to define quality criteria for Process Querying operations, and similar criteria in Process Mining, that satisfy the properties I was after. A complete solution to the problem I was trying to solve became apparent in the blink of an eye. It was a true eureka moment.

I decided to invest my efforts in contributing the idea of using topological entropy for measuring precision and recall between designed process models and recorded event logs, the two fundamental criteria in Process Mining for assessing the quality of automatically discovered process models. This decision had two pragmatic reasons. First, in 2017, Process Mining was already a well-established field, so the chances for a new approach to conformance checking to generate an impact looked appealing. Second, it was already apparent to me that the envisaged entropy-based precision measure satisfied all the properties put forward by Niek Tax, Xixi Lu, Natalia Sidorova, Dirk Fahland, and Wil in their work on the imprecisions of existing precision measures in Process Mining. That work captured my attention because it showed that none of the existing precision measures satisfied the five intuitive properties the authors proposed.
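To give a flavor of the idea, here is a rough sketch of the construction; the precise definitions, including the adjustments needed so that the entropies are well-defined for the finite collections of traces induced by event logs, are left to the papers. For a language $L$ over the set of tasks, let $C_n(L)$ denote the set of words in $L$ of length at most $n$. The topological entropy of $L$ is
$$\operatorname{ent}(L) = \limsup_{n \to \infty} \frac{\log |C_n(L)|}{n}.$$
If $E$ is the collection of traces recorded in an event log and $M$ is the collection of traces described by a process model, then, roughly,
$$\mathit{recall}(E, M) = \frac{\operatorname{ent}(E \cap M)}{\operatorname{ent}(E)}, \qquad \mathit{precision}(E, M) = \frac{\operatorname{ent}(E \cap M)}{\operatorname{ent}(M)},$$
that is, recall quantifies the share of the log behaviour captured by the model, and precision quantifies the share of the model behaviour observed in the log.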

With all the acquired results, I went back to Andreas, Matthias, Claudio, and Jan and invited them to join forces and publish a work on the new entropy-based precision and recall measures. We put the first version of the technical report on this topic online in early 2018. That year, Wil published a paper in which he refined the repertoire of existing properties for measures that assess the quality of discovered models into 21 propositions. It quickly became clear to me that our entropy-based measures also satisfy all the new propositions specific to precision and recall. This claim was confirmed in 2019 in the work by Anja Syring, Niek, and Wil. The final version of our work on the entropy-based quality measures, in its full “blossom”, was accepted for publication in the ACM Transactions on Software Engineering and Methodology journal on March 11, 2020, and now appears in its July issue. It is worth mentioning that this publication demonstrates that our entropy-based conformance measures satisfy further properties that allow them to discriminate between (possibly infinite) collections of traces that differ in a single trace.

At the end of 2018, I invited Anna Kalenkova to collaborate on generalizing the original entropy-based approach to allow measuring partial commonalities between the traces captured in the compared model and event log. The paper that resulted from this collaboration won the best paper award at the 1st International Conference on Process Mining in Aachen, Germany, in June 2019. Since August 2019, Anna has been a Research Fellow in Process Mining at the University of Melbourne, where she works under my supervision on our Discovery Project with Marcello La Rosa. The project is funded by the Australian Research Council.

Recently, Anna, Marcello, and I published a paper showing how entropy can support the assessment of the simplicity of automatically discovered process models. In addition, in 2019 and early 2020, I worked on two approaches for stochastic conformance checking using entropy, one with Sander Leemans and one with Alistair Moffat and Luciano (accepted at ICPM 2020, in press). In the future, I will contribute new techniques in Stochastic, or, as I also like to call it, Statistical, Process Mining, as well as Data-Aware Process Mining. I strongly believe that a discovered process model should be annotated with decision probabilities and data-informed decision rules that closely reflect the real-world data the model was learned from. Finally, I would like to further contribute to the discussion on the desired properties of conformance measures in Process Mining. I believe much is still to be said on this topic.

For those who would like to take a closer look at some key publications I mentioned above, here are a few references:

You can use our open-source tool, described at https://arxiv.org/abs/2008.09558, to compute the various conformance measures I discussed.

If you would like to stay updated on my academic life, follow my Twitter (https://twitter.com/uartem) or LinkedIn (https://www.linkedin.com/in/polyvyanyy/) account, or visit my University of Melbourne Find an Expert profile (https://findanexpert.unimelb.edu.au/profile/817239-artem-polyvyanyy).