<?xml version="1.0" encoding="utf-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">INFORMATICA</journal-id>
<journal-title-group><journal-title>Informatica</journal-title></journal-title-group>
<issn pub-type="epub">1822-8844</issn><issn pub-type="ppub">0868-4952</issn><issn-l>0868-4952</issn-l>
<publisher>
<publisher-name>Vilnius University</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">INFO1157</article-id>
<article-id pub-id-type="doi">10.15388/Informatica.2017.146</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Research Article</subject></subj-group></article-categories>
<title-group>
<article-title>Decision Support Using Belief Network Constructed from Business Process Event Log</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Savickas</surname><given-names>Titas</given-names></name><email xlink:href="titas.savickas@vgtu.lt">titas.savickas@vgtu.lt</email><xref ref-type="aff" rid="j_info1157_aff_001">1</xref><xref ref-type="corresp" rid="cor1">∗</xref><bio>
<p><bold>T. Savickas</bold> received a master’s degree in information systems engineering in 2013 and is currently pursuing a doctorate in informatics engineering at Vilnius Gediminas Technical University. His current research focuses on process mining and its application in business process analysis and simulation.</p></bio>
</contrib>
<contrib contrib-type="author">
<name><surname>Vasilecas</surname><given-names>Olegas</given-names></name><email xlink:href="olegas.vasilecas@mii.vu.lt">olegas.vasilecas@mii.vu.lt</email><email xlink:href="olegas.vasilecas@vgtu.lt">olegas.vasilecas@vgtu.lt</email><xref ref-type="aff" rid="j_info1157_aff_001">1</xref><xref ref-type="aff" rid="j_info1157_aff_002">2</xref><bio>
<p><bold>O. Vasilecas</bold> is a full professor in the Information Systems Department of Vilnius Gediminas Technical University (VGTU) and a researcher at the Vilnius University Institute of Mathematics and Informatics. He has many years of practical and research experience in information system development. His current research areas include business, information and software systems engineering; knowledge-based information systems; business process modelling and simulation; systems theory and engineering; and modern databases.</p></bio>
</contrib>
<aff id="j_info1157_aff_001"><label>1</label>Information Systems Department, <institution>Vilnius Gediminas Technical University</institution>, <country>Lithuania</country></aff>
<aff id="j_info1157_aff_002"><label>2</label>Institute of Mathematics and Informatics, <institution>Vilnius University</institution>, <country>Lithuania</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2017</year></pub-date><pub-date pub-type="epub"><day>1</day><month>1</month><year>2017</year></pub-date><volume>28</volume><issue>4</issue><fpage>687</fpage><lpage>701</lpage><history><date date-type="received"><month>1</month><year>2017</year></date><date date-type="accepted"><month>9</month><year>2017</year></date></history>
<permissions><copyright-statement>© 2017 Vilnius University</copyright-statement><copyright-year>2017</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>Information systems contain a lot of data regarding business process execution history. Use of this data, in the form of an event log, can greatly support business process management. The paper presents an approach to construct a Bayesian belief network from an event log that could facilitate decision support in business process execution. The approach is evaluated against multiple event logs by inferring the probabilities of data occurring in the business processes. The results show that the approach is suitable for the task and could be used in decision support, with future research focused on prediction and simulation of business processes.</p>
</abstract>
<kwd-group>
<label>Key words</label>
<kwd>event log</kwd>
<kwd>Bayesian belief network</kwd>
<kwd>decision support</kwd>
<kwd>probability inference</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="j_info1157_s_001">
<label>1</label>
<title>Introduction</title>
<p>Information systems are at the core of any organization in the information age. A large part of business-related data is now at least partially stored electronically and reflects how business processes are executed in the organization. This data is used not only for controlling business processes, but also for discovering previously unknown knowledge, for example, business rules that are not explicitly documented and can only be found by applying data mining methods. The research area that analyses data about business process execution in information systems is called process mining. Process mining approaches can be used to discover business process models, enhance them, or check the conformance of process execution against existing models (van der Aalst <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1157_ref_003">2012</xref>). Another use of process mining approaches is to facilitate decision support.</p>
<p>While there are multiple methods for decision support, each focuses on a specific task, such as predicting duration, predicting follow-up activities or identifying anomalies. To solve complex tasks, such as simulation model creation, the methods have to be combined and applied to each of the subtasks (Martin <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1157_ref_011">2015</xref>). Since processes are stochastic by nature (Kellner <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1157_ref_010">1999</xref>) or their behaviour is unpredictable (van der Aalst <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1157_ref_001">2010</xref>), probabilistic models could be applied to model business process data and its occurrence. This could provide a single model for multi-perspective analysis of process data behaviour. Bayesian belief networks have been applied in multiple areas where there is a need to work with probabilities and dependencies between temporal events or data attributes (Arroyo-Figueroa and Sucar, <xref ref-type="bibr" rid="j_info1157_ref_004">1999</xref>), but they have seen limited use in process mining (see Section <xref rid="j_info1157_s_002">2</xref>); it is therefore necessary to investigate how they can be applied for decision support in process mining.</p>
<p>In this paper, an approach to create a Bayesian belief network from an event log is presented. The discovered Bayesian belief network is used to infer the probabilities of data occurring in business processes, which could then be used for decision support; e.g. a manager could check the probability that an activity in the process receives the status <italic>failed</italic> and react appropriately. In future work, the Bayesian belief network is planned to be used for data generation and behaviour inference in business process simulation that is based not on statistics but on probabilistic analysis; it could also be used to create initial simulation models.</p>
<p>This paper consists of five sections. It starts with the introduction, which gives the problem statement and the proposed approach. Section <xref rid="j_info1157_s_002">2</xref> reviews related literature. Section <xref rid="j_info1157_s_003">3</xref> introduces the components of the approach and the method for constructing the Bayesian belief network from an event log, and ends with a description of how inference is done using the created components. Section <xref rid="j_info1157_s_009">4</xref> evaluates the approach: the experiment is defined, the experimental environment is introduced and the results of the experiments are presented. The paper ends with conclusions, further research and applications of the presented approach.</p>
</sec>
<sec id="j_info1157_s_002">
<label>2</label>
<title>Related Work</title>
<p>Decision support is becoming a widely used application of process mining methods. It has been used for activity or parameter prediction, decision rule mining and anomaly detection. Time prediction is one of the most widely used applications of process mining and employs different prediction methods. In van Dongen <italic>et al.</italic> (<xref ref-type="bibr" rid="j_info1157_ref_006">2008</xref>), regression equations based on event logs are used to prepare a model that predicts when a process instance (case) will finish; alternatively, a transition system is generated from an event log and used for time prediction of a case (van der Aalst <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1157_ref_002">2011</xref>). Rogge-Solti and Kasneci (<xref ref-type="bibr" rid="j_info1157_ref_016">2014</xref>) use non-Markovian stochastic Petri nets with the time elapsed since the last observed event to predict the most probable duration of the follow-up event. Finally, Verenich <italic>et al.</italic> (<xref ref-type="bibr" rid="j_info1157_ref_023">2016</xref>) used an SVM prediction model to eliminate overprocessing by detecting redundant activities.</p>
<p>Process mining has also been applied in decision analysis. In Rozinat and van der Aalst (<xref ref-type="bibr" rid="j_info1157_ref_017">2016</xref>) the authors attempt to extract rules for decision points in the control flow of the process model based on data in event logs. The rules are extracted using classification algorithms such as C4.5. In de Leoni and van der Aalst (<xref ref-type="bibr" rid="j_info1157_ref_012">2013</xref>), the authors use alignments in business processes to extract data flow rules between activities. Besides knowledge discovery, there are also methods for real-time decision support, such as Liu <italic>et al.</italic> (<xref ref-type="bibr" rid="j_info1157_ref_013">2012</xref>), where it is proposed to simulate discovered models for use in decision support. Process mining has also been applied to domain-specific decision analysis, as in Sarno <italic>et al.</italic> (<xref ref-type="bibr" rid="j_info1157_ref_018">2015</xref>), where an approach that uses process mining and association rule learning for fraud detection is presented.</p>
<p>Bayesian probabilistic models have previously been used in the process mining area, but not for general decision support. Ping <italic>et al.</italic> (<xref ref-type="bibr" rid="j_info1157_ref_014">2010</xref>) presented an approach to build Bayesian networks with data about event sequences and their temporal probabilities as additional nodes. The approach is specific to temporal anomalies and does not provide insight on how to detect general anomalies. In Rogge-Solti <italic>et al.</italic> (<xref ref-type="bibr" rid="j_info1157_ref_015">2013</xref>) the authors combine stochastic Petri nets, alignments and Bayesian networks for repairing event logs. The approach uses Bayesian inference to detect the most likely timestamp for each event, but it uses only the durations of activities and does not build a general model of the whole process. Sutrisnowati <italic>et al.</italic> (<xref ref-type="bibr" rid="j_info1157_ref_021">2015</xref>) present a general Bayesian belief network together with an approach for building its CPTs. The built model is used to detect when a process in a ship port will be late. The approach uses Heuristic Miner to discover a dependency graph and removes loops to generate the directed acyclic graph. It does not employ stateful information. In van der Spoel <italic>et al.</italic> (<xref ref-type="bibr" rid="j_info1157_ref_020">2012</xref>), the authors use multiple probabilistic methods for process prediction, one of which is the naive Bayes classifier. The authors neither build a full Bayesian belief network nor explain the data used, but for a very complex log the approach shows an average prediction rate between 30% and 45%.</p>
</sec>
<sec id="j_info1157_s_003">
<label>3</label>
<title>Bayesian Belief Network Construction</title>
<p>Business processes are by nature complex and stochastic, therefore it is useful to analyse them using probabilistic methods, which do not operate on clear-cut rules. The answers they provide carry some degree of certainty rather than being absolute. This way of thinking reflects real-life decision making, where not all conditions are known and process execution is governed not only by business rules but also by the context of the process.</p>
<p>The usual approach for building decision support systems is to collect expert knowledge and create a model that can be used for decision support. Collecting expert knowledge is manual labour that should be automated. For this reason, data on historical business process execution can be employed for automated knowledge discovery. Process mining deals with re-using the data on business process execution that exists in information systems to discover previously unknown knowledge. This section describes how a Bayesian belief network is constructed from an event log and used for inference to support the decision process.</p>
<sec id="j_info1157_s_004">
<label>3.1</label>
<title>Bayesian Belief Network</title>
<p>One of the probabilistic methods available for such analysis is the Bayesian belief network (Darwiche, <xref ref-type="bibr" rid="j_info1157_ref_005">2008</xref>). It represents a set of variables and their conditional dependencies via a directed acyclic graph. For example, given some known information about a client, it can be used to estimate whether the process will execute successfully.</p><statement id="j_info1157_stat_001"><label>Definition 1.</label>
<p>A Bayesian network over variables <italic>X</italic> is a pair <inline-formula id="j_info1157_ineq_001"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="normal">Θ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$(G,\Theta )$]]></tex-math></alternatives></inline-formula> where <italic>G</italic> is a directed acyclic graph (DAG) over variables <italic>X</italic> and Θ is a set of conditional probability tables (CPTs).</p></statement>
<p>The Bayesian belief network could be used to answer questions important to business process owners by exploiting general probability calculations:</p>
<list>
<list-item id="j_info1157_li_001">
<label>•</label>
<p>Probability of evidence <inline-formula id="j_info1157_ineq_002"><alternatives><mml:math>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$P(X|e)$]]></tex-math></alternatives></inline-formula> can be used to answer questions such as “<italic>What’s the chance for insurance claim to be declined for someone aged 20–30 years old?</italic>”</p>
</list-item>
<list-item id="j_info1157_li_002">
<label>•</label>
<p>Most probable explanation <inline-formula id="j_info1157_ineq_003"><alternatives><mml:math>
<mml:mi mathvariant="italic">MPE</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mo movablelimits="false">arg</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mo movablelimits="false">max</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">x</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mi mathvariant="italic">r</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\mathit{MPE}(e)=\arg {\max _{x}}Pr(x,e)$]]></tex-math></alternatives></inline-formula> can be used to answer “<italic>What is the probability for process to end now given the current state?</italic>”</p>
</list-item>
<list-item id="j_info1157_li_003">
<label>•</label>
<p>Maximum a posteriori hypothesis <inline-formula id="j_info1157_ineq_004"><alternatives><mml:math>
<mml:mi mathvariant="italic">MAP</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">M</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mo movablelimits="false">arg</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mo movablelimits="false">max</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">m</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mi mathvariant="italic">r</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\mathit{MAP}(e,M)=\arg {\max _{m}}Pr(m,e)$]]></tex-math></alternatives></inline-formula> could be used to answer “<italic>What’s the most probable outcome of a claim check if the claimant is 23 years old and made the claim in Vilnius?</italic>”</p>
</list-item>
</list>
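<p>As a minimal sketch of how such queries can be computed, the following Python fragment evaluates a posterior probability, a most probable explanation and a single-variable MAP query by brute-force enumeration. It relies on the fact that the joint probability of a Bayesian network is the product of each variable's CPT entry given its parents. The three-variable network (AgeGroup, City, Outcome), its structure, the function names and all probability values are hypothetical illustrations that are not taken from any event log, and enumeration is practical only for very small networks.</p>
<preformat>
```python
from itertools import product

# Hypothetical network: Outcome has two parents, AgeGroup and City.
# All probabilities below are made-up illustration values.
DOMAINS = {
    "AgeGroup": ["20-30", "other"],
    "City": ["Vilnius", "other"],
    "Outcome": ["declined", "accepted"],
}
PARENTS = {"AgeGroup": [], "City": [], "Outcome": ["AgeGroup", "City"]}
CPTS = {
    "AgeGroup": {(): {"20-30": 0.3, "other": 0.7}},
    "City": {(): {"Vilnius": 0.4, "other": 0.6}},
    "Outcome": {
        ("20-30", "Vilnius"): {"declined": 0.6, "accepted": 0.4},
        ("20-30", "other"): {"declined": 0.4, "accepted": 0.6},
        ("other", "Vilnius"): {"declined": 0.2, "accepted": 0.8},
        ("other", "other"): {"declined": 0.1, "accepted": 0.9},
    },
}


def joint(assignment):
    """Chain rule: product of each variable's CPT entry given its parents."""
    p = 1.0
    for var, parents in PARENTS.items():
        key = tuple(assignment[q] for q in parents)
        p *= CPTS[var][key][assignment[var]]
    return p


def worlds(evidence):
    """Yield every full assignment that agrees with the evidence."""
    names = list(DOMAINS)
    for values in product(*(DOMAINS[n] for n in names)):
        world = dict(zip(names, values))
        if all(world[k] == v for k, v in evidence.items()):
            yield world


def posterior(var, evidence):
    """P(var | evidence): sum the joint over consistent worlds, then normalise."""
    totals = {value: 0.0 for value in DOMAINS[var]}
    for world in worlds(evidence):
        totals[world[var]] += joint(world)
    norm = sum(totals.values())
    return {value: p / norm for value, p in totals.items()}


def mpe(evidence):
    """Most probable full assignment consistent with the evidence."""
    return max(worlds(evidence), key=joint)


def map_value(var, evidence):
    """MAP for a single query variable: its most probable value given evidence."""
    dist = posterior(var, evidence)
    return max(dist, key=dist.get)
```
</preformat>
<p>With these made-up values, <monospace>posterior("Outcome", {"AgeGroup": "20-30"})</monospace> assigns probability 0.48 to <italic>declined</italic>, and <monospace>map_value("Outcome", {"AgeGroup": "20-30", "City": "Vilnius"})</monospace> returns <italic>declined</italic>. Practical tools replace the enumeration with more efficient inference algorithms.</p>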
<p>The belief network contains two components: a DAG that represents variable dependencies and CPTs that represent probabilities. This structure maps naturally onto an event log: the log stores event sequences, so each unique event can be represented as a node in the DAG and the data dependencies of events can be represented by the CPTs. While transforming events into nodes is not hard, creating the arcs between nodes is more difficult, because the arcs must reflect both conditional dependencies in the Bayesian belief network and transitions between events in the business process; this makes DAG creation a complex task.</p>
</sec>
<sec id="j_info1157_s_005">
<label>3.2</label>
<title>Event Log</title>
<p>Process mining methods focus on applying data mining methods to data existing in information systems that represents the historical execution of business processes (Verbeek <italic>et al.</italic>, <xref ref-type="bibr" rid="j_info1157_ref_022">2010</xref>). This data comes in the form of an event log, which consists of data collected from various sources. There are a few ways to represent event logs, but the most common one is the <italic>XES</italic> file format (Günther and Verbeek, <xref ref-type="bibr" rid="j_info1157_ref_009">2014</xref>), a standardized and extensible file format that allows the addition of domain-specific data about business process execution. The event log contains general information on the execution of the business process, such as a trace identifier that identifies the process instance and a list of events with an occurrence timestamp and identifier (Table <xref rid="j_info1157_tab_001">1</xref>). Each trace and event might also contain additional data related to the behaviour, e.g. client names, ages, locations, or system-specific information such as subsystem or server.</p>
<table-wrap id="j_info1157_tab_001">
<label>Table 1</label>
<caption>
<p>Fragment of an exemplary event log with data.</p>
</caption>
<table>
<thead>
<tr>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Trace ID</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Event</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Timestamp</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Attribute resource</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Attribute claimant</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Attribute status</td>
</tr>
</thead>
<tbody>
<tr>
<td style="vertical-align: top; text-align: left">1</td>
<td style="vertical-align: top; text-align: left">Incoming_claim</td>
<td style="vertical-align: top; text-align: left">2014.01.05 08:05</td>
<td style="vertical-align: top; text-align: left"/>
<td style="vertical-align: top; text-align: left">1</td>
<td style="vertical-align: top; text-align: left"/>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">1</td>
<td style="vertical-align: top; text-align: left">Register_claim</td>
<td style="vertical-align: top; text-align: left">2014.01.05 08:30</td>
<td style="vertical-align: top; text-align: left">A</td>
<td style="vertical-align: top; text-align: left">1</td>
<td style="vertical-align: top; text-align: left"/>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">1</td>
<td style="vertical-align: top; text-align: left">End</td>
<td style="vertical-align: top; text-align: left">2014.01.05 13:57</td>
<td style="vertical-align: top; text-align: left">B</td>
<td style="vertical-align: top; text-align: left">1</td>
<td style="vertical-align: top; text-align: left">Reject</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">2</td>
<td style="vertical-align: top; text-align: left">Incoming_claim</td>
<td style="vertical-align: top; text-align: left">2014.01.07 13:07</td>
<td style="vertical-align: top; text-align: left"/>
<td style="vertical-align: top; text-align: left">2</td>
<td style="vertical-align: top; text-align: left">New client</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">2</td>
<td style="vertical-align: top; text-align: left">Register_claim</td>
<td style="vertical-align: top; text-align: left">2014.01.07 13:37</td>
<td style="vertical-align: top; text-align: left">A</td>
<td style="vertical-align: top; text-align: left">2</td>
<td style="vertical-align: top; text-align: left"/>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">2</td>
<td style="vertical-align: top; text-align: left">Initiate_payment</td>
<td style="vertical-align: top; text-align: left">2014.01.10 11:15</td>
<td style="vertical-align: top; text-align: left">B</td>
<td style="vertical-align: top; text-align: left">2</td>
<td style="vertical-align: top; text-align: left">Payed</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">2</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">End</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">2014.01.10 11:17</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">B</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">2</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">Closed</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>In the scope of this paper, the event log definition used for the transformation is based on van Dongen <italic>et al.</italic> (<xref ref-type="bibr" rid="j_info1157_ref_006">2008</xref>), adapted from previous work (Savickas and Vasilecas, <xref ref-type="bibr" rid="j_info1157_ref_019">2014</xref>), and is as follows:</p><statement id="j_info1157_stat_002"><label>Definition 2.</label>
<p>An event log over a set of activities <italic>A</italic> and time domain <inline-formula id="j_info1157_ineq_005"><alternatives><mml:math>
<mml:mi mathvariant="italic">T</mml:mi>
<mml:mi mathvariant="italic">D</mml:mi></mml:math><tex-math><![CDATA[$TD$]]></tex-math></alternatives></inline-formula> is defined as <inline-formula id="j_info1157_ineq_006"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">A</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">TD</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">E</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">C</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">M</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">V</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">μ</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">α</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">γ</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">β</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo stretchy="false">≻</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${L_{A,\mathit{TD}}}=(E,C,M,V,\mu ,\alpha ,\gamma ,\beta ,\succ )$]]></tex-math></alternatives></inline-formula>, where: 
<list>
<list-item id="j_info1157_li_004">
<label>•</label>
<p><italic>E</italic> is a finite set of events,</p>
</list-item>
<list-item id="j_info1157_li_005">
<label>•</label>
<p><italic>C</italic> is a finite set of cases (traces),</p>
</list-item>
<list-item id="j_info1157_li_006">
<label>•</label>
<p><italic>N</italic> is a finite set of attribute names,</p>
</list-item>
<list-item id="j_info1157_li_007">
<label>•</label>
<p><italic>V</italic> is a value space of attributes,</p>
</list-item>
<list-item id="j_info1157_li_008">
<label>•</label>
<p><inline-formula id="j_info1157_ineq_007"><alternatives><mml:math>
<mml:mi mathvariant="italic">M</mml:mi>
<mml:mo stretchy="false">⊆</mml:mo>
<mml:mi mathvariant="italic">N</mml:mi>
<mml:mo>×</mml:mo>
<mml:mi mathvariant="italic">V</mml:mi></mml:math><tex-math><![CDATA[$M\subseteq N\times V$]]></tex-math></alternatives></inline-formula> is a finite set of attributes,</p>
</list-item>
<list-item id="j_info1157_li_009">
<label>•</label>
<p><inline-formula id="j_info1157_ineq_008"><alternatives><mml:math>
<mml:mi mathvariant="italic">μ</mml:mi>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">E</mml:mi>
<mml:mo stretchy="false">→</mml:mo>
<mml:mi mathvariant="italic">M</mml:mi></mml:math><tex-math><![CDATA[$\mu :E\to M$]]></tex-math></alternatives></inline-formula> is a function assigning each event with attributes and their values,</p>
</list-item>
<list-item id="j_info1157_li_010">
<label>•</label>
<p><inline-formula id="j_info1157_ineq_009"><alternatives><mml:math>
<mml:mi mathvariant="italic">α</mml:mi>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">E</mml:mi>
<mml:mo stretchy="false">→</mml:mo>
<mml:mi mathvariant="italic">A</mml:mi></mml:math><tex-math><![CDATA[$\alpha :E\to A$]]></tex-math></alternatives></inline-formula> is a function assigning each event to an activity,</p>
</list-item>
<list-item id="j_info1157_li_011">
<label>•</label>
<p><inline-formula id="j_info1157_ineq_010"><alternatives><mml:math>
<mml:mi mathvariant="italic">γ</mml:mi>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">E</mml:mi>
<mml:mo stretchy="false">→</mml:mo>
<mml:mi mathvariant="italic">T</mml:mi>
<mml:mi mathvariant="italic">D</mml:mi></mml:math><tex-math><![CDATA[$\gamma :E\to TD$]]></tex-math></alternatives></inline-formula> is a function assigning each event to a time stamp,</p>
</list-item>
<list-item id="j_info1157_li_012">
<label>•</label>
<p><inline-formula id="j_info1157_ineq_011"><alternatives><mml:math>
<mml:mi mathvariant="italic">β</mml:mi>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">E</mml:mi>
<mml:mo stretchy="false">→</mml:mo>
<mml:mi mathvariant="italic">C</mml:mi></mml:math><tex-math><![CDATA[$\beta :E\to C$]]></tex-math></alternatives></inline-formula> is a surjective function assigning each event to a case,</p>
</list-item>
<list-item id="j_info1157_li_013">
<label>•</label>
<p><inline-formula id="j_info1157_ineq_012"><alternatives><mml:math>
<mml:mi mathvariant="italic">name</mml:mi>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">E</mml:mi>
<mml:mo stretchy="false">→</mml:mo>
<mml:mi mathvariant="italic">N</mml:mi></mml:math><tex-math><![CDATA[$\mathit{name}:E\to N$]]></tex-math></alternatives></inline-formula> is a function identifying the name of an event and <inline-formula id="j_info1157_ineq_013"><alternatives><mml:math>
<mml:mi mathvariant="italic">name</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">v</mml:mi>
<mml:mo>:</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">v</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="italic">V</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="italic">N</mml:mi>
<mml:mo>:</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">v</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="italic">μ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>∧</mml:mo>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo>=</mml:mo>
<mml:mtext>“</mml:mtext>
<mml:mi mathvariant="italic">concept</mml:mi>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">name</mml:mi>
<mml:mtext>”</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\mathit{name}(e)=v:(v\in V,n\in N:(v,n)\in \mu (e)\wedge n=\text{``}\mathit{concept}:\mathit{name}\text{''})$]]></tex-math></alternatives></inline-formula>,</p>
</list-item>
<list-item id="j_info1157_li_014">
<label>•</label>
<p><inline-formula id="j_info1157_ineq_014"><alternatives><mml:math>
<mml:mo mathvariant="normal">&gt;</mml:mo>
<mml:mo stretchy="false">⊆</mml:mo>
<mml:mi mathvariant="italic">E</mml:mi>
<mml:mo>×</mml:mo>
<mml:mi mathvariant="italic">E</mml:mi></mml:math><tex-math><![CDATA[$>\subseteq E\times E$]]></tex-math></alternatives></inline-formula> is the succession relation, which imposes a direct ordering of the events in <italic>E</italic>,</p>
</list-item>
<list-item id="j_info1157_li_015">
<label>•</label>
<p><inline-formula id="j_info1157_ineq_015"><alternatives><mml:math>
<mml:mo stretchy="false">≻</mml:mo>
<mml:mo stretchy="false">⊆</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal">&gt;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>+</mml:mo>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[$\succ \subseteq {>^{+}}$]]></tex-math></alternatives></inline-formula> is the succession relation, which imposes a total ordering of the events in <italic>E</italic>.</p>
</list-item>
</list>
</p></statement>
</sec>
<sec id="j_info1157_s_006">
<label>3.3</label>
<title>State Transition System</title>
<p>One of the components of a Bayesian belief network is a directed acyclic graph. Business processes exhibit complex behaviour, such as parallelism, repeated execution of activities and loops. Standard process models based on graph theory reproduce this cyclic behaviour, so they are not acyclic and cannot be used directly as the graph of a Bayesian belief network.</p>
<p>Event logs contain data on events that have occurred in the process. The sequencing of specific events is implicit in the log and must be interpreted to determine which events can and cannot follow each other. To transform those event sequences into a directed acyclic graph, a labelled state transition system can be used. In this transition system, each occurrence of a repeated event receives its own unique label, so no cycles are formed. Each event sequence is thus represented as a unique path between states in which no state is visited twice. For example, sequences <inline-formula id="j_info1157_ineq_016"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">a</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">b</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">b</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[$\{a,b,c,b,d,e\}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_info1157_ineq_017"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">a</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">b</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[$\{a,b,c,d,e\}$]]></tex-math></alternatives></inline-formula> in an event log would be represented as state transition sequences <inline-formula id="j_info1157_ineq_018"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">a</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">b</mml:mi>
<mml:mo fence="true" stretchy="false">}</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">b</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mo fence="true" stretchy="false">}</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">b</mml:mi>
<mml:mn>1</mml:mn>
<mml:mo fence="true" stretchy="false">}</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">b</mml:mi>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo fence="true" stretchy="false">}</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo fence="true" stretchy="false">}</mml:mo>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[$\{\{a,b\},\{b,c\},\{c,b1\},\{b1,d\},\{d,e\}\}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_info1157_ineq_019"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">a</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">b</mml:mi>
<mml:mo fence="true" stretchy="false">}</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">b</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mo fence="true" stretchy="false">}</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo fence="true" stretchy="false">}</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo fence="true" stretchy="false">}</mml:mo>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[$\{\{a,b\},\{b,c\},\{c,d\},\{d,e\}\}$]]></tex-math></alternatives></inline-formula>.</p>
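<p>The relabelling in the example above can be sketched in a few lines of Python. This is only an illustration and not the authors' implementation: the function names <monospace>relabel_repeats</monospace> and <monospace>transitions</monospace> are hypothetical, and the numeric-suffix scheme is inferred from the label b1 in the example, assuming that no activity in the log already carries such a suffixed name.</p>
<preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def relabel_repeats(trace):
    """Give every repeated occurrence of an activity its own label."""
    seen = Counter()
    labelled = []
    for activity in trace:
        count = seen[activity]
        # The first occurrence keeps its name; later ones get a numeric suffix.
        labelled.append(activity if count == 0 else f"{activity}{count}")
        seen[activity] += 1
    return labelled


def transitions(trace):
    """Pair each relabelled event with its direct successor."""
    labelled = relabel_repeats(trace)
    return list(zip(labelled, labelled[1:]))


print(transitions(["a", "b", "c", "b", "d", "e"]))
```
]]></preformat>
<p>For the trace a, b, c, b, d, e the sketch yields the pairs (a, b), (b, c), (c, b1), (b1, d) and (d, e). Because every label occurs at most once per trace, the resulting path cannot revisit a node.</p>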
<p>Van der Aalst <italic>et al.</italic> (<xref ref-type="bibr" rid="j_info1157_ref_002">2011</xref>) used a state transition system to predict process instance duration, where each state represented either the set of events that had occurred or the sequence of events that had occurred. We believe that process workflow analysis is best represented by event sequences; we therefore represent each state as a sequence of events and their data.</p><statement id="j_info1157_stat_003"><label>Definition 3.</label>
<p>Given a state representation function <inline-formula id="j_info1157_ineq_020"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">state</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${l^{\mathit{state}}}$]]></tex-math></alternatives></inline-formula>, an event representation function <inline-formula id="j_info1157_ineq_021"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">event</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${l^{\mathit{event}}}$]]></tex-math></alternatives></inline-formula> and a partial trace <italic>σ</italic>, a labelled transition system is defined as <inline-formula id="j_info1157_ineq_022"><alternatives><mml:math>
<mml:mi mathvariant="italic">TS</mml:mi>
<mml:mo>=</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">Y</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">E</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">T</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\mathit{TS}=(Y,E,T)$]]></tex-math></alternatives></inline-formula> where <inline-formula id="j_info1157_ineq_023"><alternatives><mml:math>
<mml:mi mathvariant="italic">Y</mml:mi>
<mml:mo>=</mml:mo>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">state</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">h</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">σ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">σ</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="italic">L</mml:mi>
<mml:mo>∧</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>⩽</mml:mo>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">σ</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[$Y=\{{l^{\mathit{state}}}(h{d^{k}}(\sigma ))|\sigma \in L\wedge 0\leqslant k\leqslant |\sigma |\}$]]></tex-math></alternatives></inline-formula> is the state space and <inline-formula id="j_info1157_ineq_024"><alternatives><mml:math>
<mml:mi mathvariant="italic">h</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">σ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$h{d^{k}}(\sigma )$]]></tex-math></alternatives></inline-formula> is the “head” of trace <italic>σ</italic>, i.e. the sequence of its first <italic>k</italic> events. <inline-formula id="j_info1157_ineq_025"><alternatives><mml:math>
<mml:mi mathvariant="italic">E</mml:mi>
<mml:mo>=</mml:mo>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">event</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">σ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">σ</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="italic">L</mml:mi>
<mml:mo>∧</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>⩽</mml:mo>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">σ</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[$E=\{{l^{\mathit{event}}}(\sigma (k))|\sigma \in L\wedge 1\leqslant k\leqslant |\sigma |\}$]]></tex-math></alternatives></inline-formula> is the set of event labels, and <inline-formula id="j_info1157_ineq_026"><alternatives><mml:math>
<mml:mi mathvariant="italic">T</mml:mi>
<mml:mo stretchy="false">⊆</mml:mo>
<mml:mi mathvariant="italic">Y</mml:mi>
<mml:mo>×</mml:mo>
<mml:mi mathvariant="italic">E</mml:mi>
<mml:mo>×</mml:mo>
<mml:mi mathvariant="italic">Y</mml:mi></mml:math><tex-math><![CDATA[$T\subseteq Y\times E\times Y$]]></tex-math></alternatives></inline-formula> with <inline-formula id="j_info1157_ineq_027"><alternatives><mml:math>
<mml:mi mathvariant="italic">T</mml:mi>
<mml:mo>=</mml:mo>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">state</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">h</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">σ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">event</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">σ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">state</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">h</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">σ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">σ</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="italic">L</mml:mi>
<mml:mo>∧</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo mathvariant="normal">&lt;</mml:mo>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">σ</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[$T=\{({l^{\mathit{state}}}(h{d^{k}}(\sigma )),{l^{\mathit{event}}}(\sigma (k+1)),{l^{\mathit{state}}}(h{d^{k+1}}(\sigma )))|\sigma \in L\wedge 0\leqslant k<|\sigma |\}$]]></tex-math></alternatives></inline-formula> is the transition relation. <inline-formula id="j_info1157_ineq_028"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">start</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>=</mml:mo>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">state</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mo fence="true" stretchy="false">⟨</mml:mo>
<mml:mo fence="true" stretchy="false">⟩</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[${Y^{\mathit{start}}}=\{{l^{\mathit{state}}}(\langle \rangle )\}$]]></tex-math></alternatives></inline-formula> is the singleton set containing the initial state, and <inline-formula id="j_info1157_ineq_029"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>=</mml:mo>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">state</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">σ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">σ</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="italic">L</mml:mi>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[${Y^{end}}=\{{l^{\mathit{state}}}(\sigma )|\sigma \in L\}$]]></tex-math></alternatives></inline-formula> is the set of final states.</p></statement>
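<p>Definition 3 can be illustrated with a minimal Python sketch. It rests on simplifying assumptions that are not part of the definition: both representation functions are taken to be the identity (a state is the tuple of activity names seen so far, an event label is the activity name), event attributes are ignored, and the function name <monospace>build_transition_system</monospace> is hypothetical. The index k runs over the positions of a trace that still have a next event.</p>
<preformat preformat-type="code"><![CDATA[
```python
def build_transition_system(log):
    """Build (Y, E, T, Y_start, Y_end) from a log given as a list of traces.

    Assumption: l_state is the identity on trace prefixes (stored as tuples)
    and l_event is the identity on activity names.
    """
    start = ()
    states, events, trans, end = {start}, set(), set(), set()
    for trace in log:
        for k in range(len(trace)):
            source = tuple(trace[:k])       # hd^k(sigma)
            target = tuple(trace[:k + 1])   # hd^(k+1)(sigma)
            event = trace[k]                # sigma(k+1) in the 1-based notation
            states.update((source, target))
            events.add(event)
            trans.add((source, event, target))
        end.add(tuple(trace))
    return states, events, trans, {start}, end


log = [list("abcbde"), list("abcde")]
Y, E, T, Y_start, Y_end = build_transition_system(log)
print(len(Y), len(E), len(T), len(Y_end))
```
]]></preformat>
<p>For the two example traces the sketch produces 9 states, 5 event labels, 8 transitions and 2 final states. Every transition leads from a prefix of length k to a prefix of length k + 1, so the resulting graph contains no cycles even though the activity b is repeated.</p>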
<p>Trace state and event state are not clearly defined in van der Aalst <italic>et al.</italic> (<xref ref-type="bibr" rid="j_info1157_ref_002">2011</xref>); we therefore introduce definitions for use in the CPT construction.</p>
<p>An event state describes the attributes and values that belong to a specific occurrence of an event in a trace; we can therefore reuse <italic>μ</italic> from the event log definition:</p><statement id="j_info1157_stat_004"><label>Definition 4.</label>
<p>Event state is defined as <inline-formula id="j_info1157_ineq_030"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">event</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">μ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="italic">E</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">μ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="italic">M</mml:mi></mml:math><tex-math><![CDATA[${l^{\mathit{event}}}(e)=\mu (e),e\in E,\mu (e)\in M$]]></tex-math></alternatives></inline-formula>; it describes the attributes of a specific event and their values.</p></statement>
<p>A state of a trace is a collection of event states, therefore it can be defined as:</p><statement id="j_info1157_stat_005"><label>Definition 5.</label>
<p>Trace state for a partial trace is represented as a set of event states <inline-formula id="j_info1157_ineq_031"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">state</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">σ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">M</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">previous</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">σ</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="italic">T</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="italic">E</mml:mi>
<mml:mo>∧</mml:mo>
<mml:mo>∀</mml:mo>
<mml:mi mathvariant="italic">α</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">M</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="italic">M</mml:mi>
<mml:mo>∧</mml:mo>
<mml:mi mathvariant="italic">μ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">M</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">previous</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">&gt;</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[${l^{\mathit{state}}}(\sigma )=\{(e,{M_{e}},{e_{\mathit{previous}}})|\sigma \in T,e\in E\wedge \forall \alpha (e)=t,{M_{e}}\in M\wedge \mu (e)={M_{e}},{e_{\mathit{previous}}}>e\}$]]></tex-math></alternatives></inline-formula>.</p></statement>
<p>Although the labelled state transition system in the referenced paper is used for general attribute prediction, it does not provide prediction functions for attributes or sequences. We therefore use it only as the directed acyclic graph representation in the Bayesian belief network. The state representations serve as the observations in the conditional probability calculations.</p>
</sec>
<sec id="j_info1157_s_007">
<label>3.4</label>
<title>Event Log Transformation to Bayesian Belief Network</title>
<p>To construct a Bayesian belief network, we start with an event log. Process execution data is usually stored in information systems in many different places and forms, so collecting data about business processes is always a context-dependent task: each organization has its own information system implementations, and business processes differ from one organization to another. For this reason, our approach does not address the data collection task and assumes that an event log is available as defined in Section <xref rid="j_info1157_s_005">3.2</xref>.</p>
<p>Once the data has been collected and an event log created, construction of the belief network can begin. The overall approach is depicted in Fig. <xref rid="j_info1157_fig_001">1</xref> and consists of the following sequence of steps:</p>
<list>
<list-item id="j_info1157_li_016">
<label>1.</label>
<p>From the event log <italic>l</italic>, a state transition system <inline-formula id="j_info1157_ineq_032"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${T_{l}}$]]></tex-math></alternatives></inline-formula> is discovered;</p>
</list-item>
<list-item id="j_info1157_li_017">
<label>2.</label>
<p>A DAG <italic>G</italic> is extracted from the state transition system <inline-formula id="j_info1157_ineq_033"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${T_{l}}$]]></tex-math></alternatives></inline-formula> by removing all state data from the state transition system, which leaves a directed acyclic graph;</p>
</list-item>
<list-item id="j_info1157_li_018">
<label>3.</label>
<p>Conditional Probability Tables Θ are constructed;</p>
</list-item>
<list-item id="j_info1157_li_019">
<label>4.</label>
<p>The DAG <italic>G</italic> and CPTs Θ are combined into a Bayesian belief network <inline-formula id="j_info1157_ineq_034"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="normal">Θ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$(G,\Theta )$]]></tex-math></alternatives></inline-formula>.</p>
</list-item>
</list>
<fig id="j_info1157_fig_001">
<label>Fig. 1</label>
<caption>
<p>Bayesian belief network construction.</p>
</caption>
<graphic xlink:href="info1157_g001.jpg"/>
</fig>
<p>A CPT aggregates the data in the event log for each event and its parents into a single table in which each attribute combination is assigned a probability. It is therefore defined as:</p><statement id="j_info1157_stat_006"><label>Definition 6.</label>
<p>A Conditional Probability Table of an event is defined as <inline-formula id="j_info1157_ineq_035"><alternatives><mml:math>
<mml:mi mathvariant="italic">θ</mml:mi>
<mml:mo>=</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">V</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">ω</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\theta =({A_{e}},{V_{e}},\omega )$]]></tex-math></alternatives></inline-formula> where <inline-formula id="j_info1157_ineq_036"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo>:</mml:mo>
<mml:mo>∃</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>∃</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">|</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="italic">E</mml:mi>
<mml:mo>∧</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="italic">E</mml:mi>
<mml:mo>∧</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">≺</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>∧</mml:mo>
<mml:mi mathvariant="italic">β</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">β</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>∧</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">μ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[${A_{e}}=\{x:\exists {e_{i}}\exists {e_{j}}|{e_{j}}\in E\wedge {e_{i}}\in E\wedge {e_{j}}\prec {e_{i}}\wedge \beta ({e_{i}})=\beta ({e_{j}})\wedge x=\mu ({e_{j}})\}$]]></tex-math></alternatives></inline-formula> is the attribute space of the event and its predecessors in the event log, <inline-formula id="j_info1157_ineq_037"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">V</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo>:</mml:mo>
<mml:mo>∃</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">|</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="italic">E</mml:mi>
<mml:mo>∧</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">μ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[${V_{e}}=\{x:\exists {e_{j}}|{e_{j}}\in E\wedge x=\mu ({e_{j}})\}$]]></tex-math></alternatives></inline-formula> is the set of values that belong to the attributes of the events and <inline-formula id="j_info1157_ineq_038"><alternatives><mml:math>
<mml:mi mathvariant="italic">ω</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">v</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">V</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">a</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\omega =P(v\in {V_{e}}|a\in {A_{e}})$]]></tex-math></alternatives></inline-formula> is the probability function that gives the probability of each possible attribute value of a node conditional on the attribute value set of its parent node.</p></statement>
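<p>One common way to fill in the probability function <italic>ω</italic> is relative-frequency (maximum likelihood) estimation over the observations in the log. The Python sketch below shows that idea only; whether it matches the estimator used here is an assumption, the function name <monospace>estimate_cpt</monospace> is hypothetical, and the attribute values in the usage example are invented for illustration.</p>
<preformat preformat-type="code"><![CDATA[
```python
from collections import Counter, defaultdict


def estimate_cpt(observations):
    """Estimate P(value | parent_value) from (parent_value, value) pairs."""
    counts = defaultdict(Counter)
    for parent_value, value in observations:
        counts[parent_value][value] += 1
    cpt = {}
    for parent_value, value_counts in counts.items():
        total = sum(value_counts.values())
        # Each row is normalised so that its probabilities sum to one.
        cpt[parent_value] = {v: n / total for v, n in value_counts.items()}
    return cpt


# Hypothetical observations: (attribute value of parent, attribute value of event).
observations = [
    ("gold", "approve"),
    ("gold", "approve"),
    ("gold", "reject"),
    ("basic", "reject"),
]
cpt = estimate_cpt(observations)
print(cpt["gold"]["approve"])
```
]]></preformat>
<p>Each key of the returned dictionary corresponds to one combination of parent attribute values, and the nested dictionary is the matching row of the table. Combinations that never occur in the log receive no row, so a practical implementation would need a policy for unseen cases, such as smoothing.</p>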
<p>The CPTs for the DAG are constructed as follows:</p>
<list>
<list-item id="j_info1157_li_020">
<label>1.</label>
<p>Data attributes and their values <inline-formula id="j_info1157_ineq_039"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">M</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${M_{e}}$]]></tex-math></alternatives></inline-formula> from an event log are collected for each event <italic>e</italic> with identical partial traces <italic>σ</italic>;</p>
</list-item>
<list-item id="j_info1157_li_021">
<label>2.</label>
<p>For each event <inline-formula id="j_info1157_ineq_040"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">previous</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${e_{\mathit{previous}}}$]]></tex-math></alternatives></inline-formula>, data attributes and their values <inline-formula id="j_info1157_ineq_041"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">M</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">previous</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${M_{\mathit{previous}}}$]]></tex-math></alternatives></inline-formula> from an event log are collected;</p>
</list-item>
<list-item id="j_info1157_li_022">
<label>3.</label>
<p>CPT <inline-formula id="j_info1157_ineq_042"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\theta _{e}}$]]></tex-math></alternatives></inline-formula> is constructed where each row represents a unique set of attribute subsets <inline-formula id="j_info1157_ineq_043"><alternatives><mml:math>
<mml:mi mathvariant="italic">N</mml:mi>
<mml:mo>×</mml:mo>
<mml:mi mathvariant="italic">V</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">M</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>∪</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">M</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">previous</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[$N\times V\in {M_{e}}\cup {M_{\mathit{previous}}}$]]></tex-math></alternatives></inline-formula> and each row has a probability of <inline-formula id="j_info1157_ineq_044"><alternatives><mml:math>
<mml:mi mathvariant="italic">ω</mml:mi>
<mml:mo>=</mml:mo><mml:mstyle displaystyle="false">
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="italic">count</mml:mi>
<mml:mi mathvariant="normal">ˍ</mml:mi>
<mml:mi mathvariant="italic">of</mml:mi>
<mml:mi mathvariant="normal">ˍ</mml:mi>
<mml:mi mathvariant="italic">times</mml:mi>
<mml:mi mathvariant="normal">ˍ</mml:mi>
<mml:mi mathvariant="italic">seen</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">total</mml:mi>
<mml:mi mathvariant="normal">ˍ</mml:mi>
<mml:mi mathvariant="italic">count</mml:mi>
<mml:mi mathvariant="normal">ˍ</mml:mi>
<mml:mi mathvariant="italic">of</mml:mi>
<mml:mi mathvariant="normal">ˍ</mml:mi>
<mml:mi mathvariant="italic">event</mml:mi>
<mml:mi mathvariant="normal">ˍ</mml:mi>
<mml:mi mathvariant="italic">occurrences</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mstyle></mml:math><tex-math><![CDATA[$\omega =\frac{\mathit{count}\_ \mathit{of}\_ \mathit{times}\_ \mathit{seen}}{\mathit{total}\_ \mathit{count}\_ \mathit{of}\_ \mathit{event}\_ \mathit{occurrences}}$]]></tex-math></alternatives></inline-formula>.</p>
</list-item>
</list>
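<p>The three steps above can be sketched in code. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function name <monospace>build_cpt</monospace>, the dictionary representation of attribute values and the sample occurrences are all hypothetical. Each row is keyed by the combined attribute values of the previous event and of the event itself, and its probability is the relative frequency from step 3.</p>
<preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def build_cpt(occurrences):
    """Estimate the rows of a CPT for one event by relative frequency.

    occurrences: list of (previous_values, own_values) pairs, one per
    occurrence of the event in the log; both items are dicts mapping
    attribute names to values (an assumed representation).
    Returns {(previous_row, own_row): probability}.
    """
    rows = Counter(
        (frozenset(previous.items()), frozenset(own.items()))
        for previous, own in occurrences
    )
    total = len(occurrences)  # total count of event occurrences
    return {row: count / total for row, count in rows.items()}


# Hypothetical occurrences of one event with its predecessor's attributes.
occurrences = [
    ({"type": "car"}, {"status": "declined"}),
    ({"type": "car"}, {"status": "approved"}),
    ({"type": "car"}, {"status": "approved"}),
    ({"type": "home"}, {"status": "approved"}),
]
cpt = build_cpt(occurrences)
```
]]></preformat>
<p>With this sample data the table has three rows, and the row probabilities sum to one because every occurrence contributes to exactly one row.</p>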
</sec>
<sec id="j_info1157_s_008">
<label>3.5</label>
<title>Business Process Inference</title>
<p>Business processes, once automated in an information system, have a controlled workflow. During this workflow, performers of the process generate data about the process execution, such as location, organizational resource or other domain-specific data, e.g. student group, faculty, etc. Taken as a whole, this data makes it possible to detect causality between events or between data parameters.</p>
<p>The usual approach to analysing business process execution is to use statistical data, such as averages, maximums, minimums, sums, frequencies and others. While this does provide a means to infer how the process behaves, it can be superficial, because it might not take into account conditional dependencies between data. For this reason, we believe that Bayesian inference could be used for decision support, because it provides reasonable expectations.</p>
<p>Bayesian inference derives the posterior probability using the well-known Bayes’ rule (Darwiche, <xref ref-type="bibr" rid="j_info1157_ref_005">2008</xref>): 
<disp-formula id="j_info1157_eq_001">
<alternatives><mml:math display="block">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">H</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">H</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>×</mml:mo>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">H</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ P(H|D)=\frac{P(D|H)\times P(H)}{P(D)}.\]]]></tex-math></alternatives>
</disp-formula>
</p>
<p>Here, <inline-formula id="j_info1157_ineq_045"><alternatives><mml:math>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">H</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$P(H|D)$]]></tex-math></alternatives></inline-formula> is the posterior probability of a hypothesis <italic>H</italic> based on evidence <italic>D</italic>, which is a consequence of two antecedents, the prior probability <inline-formula id="j_info1157_ineq_046"><alternatives><mml:math>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">H</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$P(H)$]]></tex-math></alternatives></inline-formula> and a likelihood <inline-formula id="j_info1157_ineq_047"><alternatives><mml:math>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">H</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$P(D|H)$]]></tex-math></alternatives></inline-formula> with a marginal likelihood <inline-formula id="j_info1157_ineq_048"><alternatives><mml:math>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$P(D)$]]></tex-math></alternatives></inline-formula>.</p>
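<p>As a small numeric illustration of Bayes’ rule, the sketch below computes a posterior from a likelihood, a prior and a marginal likelihood. The function name <monospace>posterior</monospace> and the input values are hypothetical and chosen only to make the arithmetic easy to follow.</p>
<preformat preformat-type="code"><![CDATA[
```python
def posterior(likelihood, prior, marginal):
    """Bayes' rule: P(H|D) = P(D|H) * P(H) / P(D)."""
    if marginal <= 0:
        raise ValueError("the marginal likelihood P(D) must be positive")
    return likelihood * prior / marginal


# Hypothetical inputs: P(D|H) = 0.5, P(H) = 0.25, P(D) = 0.5.
p_h_given_d = posterior(likelihood=0.5, prior=0.25, marginal=0.5)
```
]]></preformat>
<p>With these inputs the posterior equals the prior, 0.25, because the likelihood and the marginal likelihood coincide, i.e. the evidence is no more probable under the hypothesis than in general.</p>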
<p>For business processes, the hypothesis is any set of event attributes and values whose probability we would like to infer, for example, “<italic>what is the probability of the</italic> <underline><italic>claim</italic></underline> <underline><italic>status</italic></underline> <italic>to be</italic> <underline><italic>declined</italic></underline><italic>?</italic>”, where <italic>claim</italic> is the event, <italic>status</italic> is the attribute and <italic>declined</italic> is the value.</p><statement id="j_info1157_stat_007"><label>Definition 7.</label>
<p>A hypothesis of a business process is defined as <inline-formula id="j_info1157_ineq_049"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">t</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="italic">N</mml:mi>
<mml:mo>×</mml:mo>
<mml:mi mathvariant="italic">V</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>∃</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="italic">E</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">h</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">t</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">h</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="italic">μ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${H_{t}}\in N\times V,\exists {e_{i}}\in E,h\in {H_{t}}:h\in \mu ({e_{i}})$]]></tex-math></alternatives></inline-formula>, i.e. a set of event attribute-value pairs that have been observed in the log.</p></statement>
<p>The hypothesis is not limited to a single <inline-formula id="j_info1157_ineq_050"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[$\{e,m\}$]]></tex-math></alternatives></inline-formula> tuple. Since business processes can drift and mutate over time, we limit the possible choices for the hypothesis to those attribute-value pairs that have been seen before in the trace.</p>
<p>Since a hypothesis may contain multiple elements, the prior probability is calculated as the product of the unconditional probabilities of its attribute values, which treats the elements as independent, i.e. 
<disp-formula id="j_info1157_eq_002">
<alternatives><mml:math display="block">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">H</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∏</mml:mo></mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∏</mml:mo></mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mi mathvariant="italic">ω</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ P(H)=\prod \limits_{i}P({h_{i}})=\prod \limits_{i}\omega ({h_{i}}).\]]]></tex-math></alternatives>
</disp-formula>
</p>
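<p>The product above can be illustrated with a short sketch. The function name <monospace>prior_probability</monospace>, the tuple encoding of hypothesis elements and the values stored in <monospace>omega</monospace> are assumptions made for this example; in practice the unconditional probabilities would come from the event log.</p>
<preformat preformat-type="code"><![CDATA[
```python
from math import prod


def prior_probability(hypothesis, omega):
    """Multiply the unconditional probabilities of all hypothesis elements.

    hypothesis: iterable of (event, attribute, value) tuples (assumed encoding).
    omega: dict mapping each such tuple to its unconditional probability.
    """
    return prod(omega[h] for h in hypothesis)


# Hypothetical unconditional probabilities of two attribute values.
omega = {
    ("claim", "status", "declined"): 0.25,
    ("claim", "type", "car"): 0.5,
}
hypothesis = [("claim", "status", "declined"), ("claim", "type", "car")]
p_h = prior_probability(hypothesis, omega)
```
]]></preformat>
<p>Here the prior is 0.25 × 0.5 = 0.125. Every additional element can only keep the prior the same or lower it, so larger hypotheses start from smaller priors.</p>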
<p>In standard statistical methods, only the number of times when <inline-formula id="j_info1157_ineq_051"><alternatives><mml:math>
<mml:mi mathvariant="italic">H</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">D</mml:mi></mml:math><tex-math><![CDATA[$H|D$]]></tex-math></alternatives></inline-formula> occurred would be used for inference. This is of limited use for decision support, because it ignores the marginal likelihood: it shows how often the hypothesis has been seen before, but not how likely it is to be seen given the current evidence. Therefore, for inference, we need to use the evidence likelihood. We assume that the inference is done in the context of a single process instance, so the evidence is the current state of the trace of the process.</p><statement id="j_info1157_stat_008"><label>Definition 8.</label>
<p>Given a partial trace <italic>σ</italic> and the current state of a process <inline-formula id="j_info1157_ineq_052"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mi mathvariant="italic">a</mml:mi>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">σ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${l_{state}}(\sigma )$]]></tex-math></alternatives></inline-formula>, the evidence for a hypothesis is defined as a set of events that have occurred in the current partial trace and their attribute value pairs <inline-formula id="j_info1157_ineq_053"><alternatives><mml:math>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo>=</mml:mo>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">m</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo stretchy="false">|</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="italic">σ</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">m</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="italic">E</mml:mi>
<mml:mo>×</mml:mo>
<mml:mi mathvariant="italic">M</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="italic">σ</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">m</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="italic">μ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[$D=\{({e_{i}},{m_{i}})|{e_{i}}\in \sigma ,({e_{i}},{m_{i}})\in E\times M,{e_{i}}\in \sigma ,{m_{i}}\in \mu ({e_{i}})\}$]]></tex-math></alternatives></inline-formula>.</p></statement>
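<p>Definition 8 can be read as collecting every attribute-value pair of every event in the current partial trace. The sketch below shows one possible encoding; the function name <monospace>collect_evidence</monospace>, the list-of-tuples trace representation and the sample trace are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
def collect_evidence(partial_trace):
    """Return the evidence set D for a partial trace.

    partial_trace: list of (event_name, attributes) tuples in execution
    order, where attributes is a dict of attribute names to values
    (an assumed representation of the trace).
    """
    return {
        (event, (name, value))
        for event, attributes in partial_trace
        for name, value in attributes.items()
    }


# Hypothetical partial trace of a running process instance.
partial_trace = [
    ("register", {"type": "car", "channel": "web"}),
    ("claim", {"status": "declined"}),
]
evidence = collect_evidence(partial_trace)
```
]]></preformat>
<p>Because the result is a set, an event that repeats with identical attribute values contributes each pair only once.</p>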
<p>Given the definition of the evidence, the marginal likelihood can be calculated as the sum of the probabilities of the evidence occurring with each subset of the hypothesis, i.e. 
<disp-formula id="j_info1157_eq_003">
<alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right left" columnspacing="0pt">
<mml:mtr>
<mml:mtd class="align-odd">
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
</mml:mtd>
<mml:mtd class="align-even">
<mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>×</mml:mo>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mo maxsize="2.03em" minsize="2.03em" fence="true" mathvariant="normal">(</mml:mo>
<mml:mi mathvariant="italic">f</mml:mi>
<mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∏</mml:mo></mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">|</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo maxsize="2.03em" minsize="2.03em" fence="true" mathvariant="normal">)</mml:mo>
<mml:mo>×</mml:mo>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd class="align-odd">
<mml:mo>=</mml:mo>
</mml:mtd>
<mml:mtd class="align-even">
<mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mo maxsize="2.03em" minsize="2.03em" fence="true" mathvariant="normal">(</mml:mo>
<mml:mo maxsize="2.03em" minsize="2.03em" fence="true" mathvariant="normal">(</mml:mo>
<mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∏</mml:mo></mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mi mathvariant="italic">ω</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" stretchy="false">/</mml:mo>
<mml:mi mathvariant="italic">ω</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo maxsize="2.03em" minsize="2.03em" fence="true" mathvariant="normal">)</mml:mo>
<mml:mo>×</mml:mo>
<mml:mi mathvariant="italic">ω</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo maxsize="2.03em" minsize="2.03em" fence="true" mathvariant="normal">)</mml:mo>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[\begin{aligned}{}P(D)=& \sum \limits_{i}P(D|{h_{i}})\times P({h_{i}})=\sum \limits_{i}\bigg(f\prod \limits_{j}P({d_{j}}|{h_{i}})\bigg)\times P({h_{i}})\\ {} =& \sum \limits_{i}\bigg(\bigg(\prod \limits_{j}\omega ({d_{j}})/\omega ({h_{i}})\bigg)\times \omega ({h_{i}})\bigg).\end{aligned}\]]]></tex-math></alternatives>
</disp-formula>
</p>
<p>Finally, the likelihood is the probability of seeing the evidence given the hypothesis, and it can be calculated as <inline-formula id="j_info1157_ineq_054"><alternatives><mml:math>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">H</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo><mml:mstyle displaystyle="false">
<mml:mfrac>
<mml:mrow>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo>×</mml:mo>
<mml:mi mathvariant="italic">H</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="italic">M</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:mstyle></mml:math><tex-math><![CDATA[$P(D|H)=\frac{|D\times H\in M|}{|D|}$]]></tex-math></alternatives></inline-formula>, i.e. the number of times the evidence has been seen together with the hypothesis divided by the number of times the evidence has been seen in general.</p>
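<p>The counting in this ratio can be sketched as follows. The example assumes, purely for illustration, that each historical trace is stored as a set of attribute-value pairs and that “seen” means the pairs are contained in that set; the function name <monospace>cooccurrence_ratio</monospace> and the sample history are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
def cooccurrence_ratio(history, evidence, hypothesis):
    """Count-based ratio: traces containing evidence and hypothesis,
    divided by traces containing the evidence.

    history: list of sets of (attribute, value) pairs, one set per trace.
    evidence, hypothesis: sets of (attribute, value) pairs.
    """
    with_evidence = [seen for seen in history if evidence <= seen]
    if not with_evidence:
        return 0.0  # evidence never observed, so no estimate is available
    together = sum(1 for seen in with_evidence if hypothesis <= seen)
    return together / len(with_evidence)


# Hypothetical history of five completed traces.
history = [
    {("type", "car"), ("status", "declined")},
    {("type", "car"), ("status", "declined")},
    {("type", "car"), ("status", "approved")},
    {("type", "car")},
    {("type", "home"), ("status", "approved")},
]
ratio = cooccurrence_ratio(
    history,
    evidence={("type", "car")},
    hypothesis={("status", "declined")},
)
```
]]></preformat>
<p>In this sample, four traces contain the evidence and two of them also contain the hypothesis, giving a ratio of 0.5. Returning zero when the evidence has never been observed is a design choice of the sketch; raising an error would be equally defensible.</p>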
<p>Having all of the components, we get the final inference formula: 
<disp-formula id="j_info1157_eq_004">
<alternatives><mml:math display="block">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">H</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo><mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:mstyle displaystyle="false">
<mml:mfrac>
<mml:mrow>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">ω</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo>∩</mml:mo>
<mml:mi mathvariant="italic">H</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo stretchy="false">|</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">ω</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo stretchy="false">|</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mo>×</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mo largeop="false" movablelimits="false">∏</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mi mathvariant="italic">ω</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mo largeop="false" movablelimits="false">∑</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mo largeop="false" movablelimits="false">∏</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mi mathvariant="italic">ω</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" stretchy="false">/</mml:mo>
<mml:mi mathvariant="italic">ω</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>×</mml:mo>
<mml:mi mathvariant="italic">ω</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ P(H|D)=\frac{\frac{|\omega (D\cap H)|}{|\omega (D)|}\times {\textstyle\prod _{i}}\omega ({h_{i}})}{{\textstyle\sum _{i}}(({\textstyle\prod _{j}}\omega ({d_{j}})/\omega ({h_{i}}))\times \omega ({h_{i}}))}.\]]]></tex-math></alternatives>
</disp-formula>
</p>
</sec>
</sec>
<sec id="j_info1157_s_009">
<label>4</label>
<title>Evaluation of the Approach</title>
<p>The presented approach is intended for decision support and makes it possible to identify the most probable execution path of a process in advance. In real-life scenarios, there are usually some hypotheses known beforehand whose probabilities should be determined in order to understand whether the execution is going in the “right” direction. Example hypotheses include whether some state will be reached, such as the event <italic>end</italic>, or whether the state will contain some data, such as <inline-formula id="j_info1157_ineq_055"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mtext>“</mml:mtext>
<mml:mi mathvariant="italic">status</mml:mi>
<mml:mtext>”</mml:mtext>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mtext>“</mml:mtext>
<mml:mi mathvariant="italic">successful</mml:mi>
<mml:mtext>”</mml:mtext>
<mml:mo fence="true" stretchy="false">}</mml:mo>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="italic">μ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">done</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\{\text{``}\mathit{status}\text{''},\text{``}\mathit{successful}\text{''}\}\in \mu ({e_{\mathit{done}}})$]]></tex-math></alternatives></inline-formula>.</p>
<sec id="j_info1157_s_010">
<label>4.1</label>
<title>Experiment Definition</title>
<p>To verify the approach objectively, domain-specific questions should be set aside. It was therefore decided to test the approach on event logs by calculating the probabilities of already known values and checking whether it assigns high probabilities to them.</p>
<p>The event logs used in the experiments should cover multiple complexity levels. For this, a synthetic log (SL) and two publicly available logs were chosen (Table <xref rid="j_info1157_tab_002">2</xref>). The publicly available logs were a Dutch Financial Institution event log (DL) taken from the Business Process Intelligence Challenge 2012 (van Dongen, <xref ref-type="bibr" rid="j_info1157_ref_007">2015a</xref>) and a Municipality event log (ML) from the Business Process Intelligence Challenge 2015 (van Dongen, <xref ref-type="bibr" rid="j_info1157_ref_008">2015b</xref>). The ML log contains timestamp or otherwise unrelated attributes <italic>activityNameNL</italic>, <italic>dateFinished</italic>, <italic>dueDate</italic>, <italic>planned</italic>, <italic>datestop</italic>, which are ignored during the experiments.</p>
<table-wrap id="j_info1157_tab_002">
<label>Table 2</label>
<caption>
<p>Parameters of the used event logs.</p>
</caption>
<table>
<thead>
<tr>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Log</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Traces</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Unique events</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Total events</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Attributes</td>
</tr>
</thead>
<tbody>
<tr>
<td style="vertical-align: top; text-align: left">SL</td>
<td style="vertical-align: top; text-align: left">3512</td>
<td style="vertical-align: top; text-align: left">9</td>
<td style="vertical-align: top; text-align: left">20339</td>
<td style="vertical-align: top; text-align: left">2–6</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">DL</td>
<td style="vertical-align: top; text-align: left">13087</td>
<td style="vertical-align: top; text-align: left">36</td>
<td style="vertical-align: top; text-align: left">262200</td>
<td style="vertical-align: top; text-align: left">3–4</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">ML</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">1156</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">289</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">59083</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">12</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The experiments were done independently for each event log as follows:</p>
<list>
<list-item id="j_info1157_li_023">
<label>1.</label>
<p>The event log is split into two subsets, 80% and 20%;</p>
</list-item>
<list-item id="j_info1157_li_024">
<label>2.</label>
<p>The 80% subset is used for discovering the belief network;</p>
</list-item>
<list-item id="j_info1157_li_025">
<label>3.</label>
<p>The remaining 20% subset is used for experimental testing;</p>
</list-item>
<list-item id="j_info1157_li_026">
<label>4.</label>
<p>The average probability and standard deviation for the experiment are calculated;</p>
</list-item>
<list-item id="j_info1157_li_027">
<label>5.</label>
<p>The experiment is repeated 4 more times with different subsets of the event log to obtain <inline-formula id="j_info1157_ineq_056"><alternatives><mml:math>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mtext>-fold</mml:mtext>
<mml:mo>=</mml:mo>
<mml:mn>5</mml:mn></mml:math><tex-math><![CDATA[$k\text{-fold}=5$]]></tex-math></alternatives></inline-formula> results.</p>
</list-item>
</list>
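<p>The five steps above can be sketched in code. The following Python sketch is only an illustration, not the BBNGs implementation: the names <italic>k_fold_splits</italic> and <italic>evaluate</italic> and the callbacks <italic>fit</italic> and <italic>score</italic> are hypothetical, and it assumes that the log is split by trace, that the folds do not overlap, and that the population standard deviation is meant.</p>
<preformat preformat-type="code"><![CDATA[
```python
import random
import statistics


def k_fold_splits(traces, k=5, seed=0):
    """Yield (train, test) pairs; each test fold holds about 1/k of the traces."""
    shuffled = list(traces)
    random.Random(seed).shuffle(shuffled)
    # Slicing with a stride of k gives k disjoint folds of near-equal size.
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [t for j, fold in enumerate(folds) if j != i for t in fold]
        yield train, test


def evaluate(traces, fit, score, k=5, seed=0):
    """Return one (mean, standard deviation) pair per fold.

    fit(train) builds a model from the training traces and
    score(model, trace) returns a list of probabilities for one test trace.
    Both callbacks are hypothetical stand-ins for network discovery and
    inference.
    """
    results = []
    for train, test in k_fold_splits(traces, k, seed):
        model = fit(train)
        probs = [p for trace in test for p in score(model, trace)]
        if probs:  # a fold without any usable inference is skipped
            results.append((statistics.mean(probs), statistics.pstdev(probs)))
    return results
```
]]></preformat>
<p>With k = 5, each test fold holds roughly 20% of the traces and the other 80% are used for discovery, which corresponds to the split described in the list.</p>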
<p>The experiment itself is performed by imitating the execution of a business process. The system iterates through each event creating a partial trace with a state <inline-formula id="j_info1157_ineq_057"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">state</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">σ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${l^{\mathit{state}}}(\sigma )$]]></tex-math></alternatives></inline-formula>, where <italic>σ</italic> is the part of the trace in the event log iterated so far. Knowing what the last event with a state <inline-formula id="j_info1157_ineq_058"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">event</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">σ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${l^{\mathit{event}}}(\sigma (k))$]]></tex-math></alternatives></inline-formula> is, we calculate <inline-formula id="j_info1157_ineq_059"><alternatives><mml:math>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">event</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">σ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo stretchy="false">|</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">state</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">h</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">σ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$P({l^{\mathit{event}}}(\sigma (k))|{l^{\mathit{state}}}(h{d^{k-1}}(\sigma )))$]]></tex-math></alternatives></inline-formula>, i.e. the probability of the event’s state given the events that have already occurred in the partial trace. Probability calculations are done only when <inline-formula id="j_info1157_ineq_060"><alternatives><mml:math>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">σ</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
<mml:mo mathvariant="normal">&gt;</mml:mo>
<mml:mn>0</mml:mn></mml:math><tex-math><![CDATA[$|\sigma |>0$]]></tex-math></alternatives></inline-formula>, i.e. when at least one event is in the partial trace. The first event in the trace is excluded because its probability has no conditional dependencies and therefore does not test the approach.</p>
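<p>The replay of a single trace can be sketched as follows. This is a minimal illustration under stated assumptions: the function name <italic>replay</italic> and the callback <italic>probability</italic> are hypothetical, the callback stands in for inference on the discovered belief network, and events are assumed to be hashable descriptions of an event together with its state.</p>
<preformat preformat-type="code"><![CDATA[
```python
def replay(trace, probability):
    """Return (event, probability) pairs for every event except the first.

    trace is a list of event descriptions and probability(prefix, event) is
    a hypothetical callback that returns the probability of the event given
    the events that have already occurred.
    """
    results = []
    # Index 0 is skipped: the first event has an empty prefix and therefore
    # no conditional dependencies to test.
    for k in range(1, len(trace)):
        prefix = tuple(trace[:k])  # the partial trace observed so far
        event = trace[k]           # the event whose probability is queried
        results.append((event, probability(prefix, event)))
    return results
```
]]></preformat>
<p>A trace with a single event yields no results, which corresponds to the condition that at least one event must already be in the partial trace.</p>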
<p>After the experiment has been completed 5 times for each event log, the average probability is calculated for each event. This shows the general capability of the approach. Results where the probability equals 0 and events that occurred fewer than 5 times are rejected as noise. A probability of 0 is rejected because it means the inference involved data that never occurred in the training set; decision support is impossible in such cases, and they say nothing about the quality of the approach. Rarely occurring events are rejected because they do not appear frequently enough to give reliable results.</p>
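<p>The rejection rules and the per-event averaging can be sketched as follows. The function name <italic>summarise</italic> is hypothetical, and the sketch rests on two assumptions that the text does not settle: occurrences are counted before zero probabilities are removed, and the population standard deviation is used.</p>
<preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict
import statistics

MIN_OCCURRENCES = 5  # threshold for rarely occurring events, as stated in the text


def summarise(results, min_occurrences=MIN_OCCURRENCES):
    """Map each event name to (mean, standard deviation) of its probabilities.

    results is an iterable of (event_name, probability) pairs collected over
    all runs. Events with fewer than min_occurrences results and results
    with a probability of 0 are rejected as noise.
    """
    by_event = defaultdict(list)
    for name, p in results:
        by_event[name].append(p)

    summary = {}
    for name, probs in by_event.items():
        if len(probs) < min_occurrences:
            continue  # too rare for a reliable estimate
        kept = [p for p in probs if p > 0]  # drop inferences on unseen data
        if kept:
            summary[name] = (statistics.mean(kept), statistics.pstdev(kept))
    return summary
```
]]></preformat>
<p>Counting occurrences after removing zero probabilities would be an equally defensible reading and would reject slightly more events.</p>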
</sec>
<sec id="j_info1157_s_011">
<label>4.2</label>
<title>Experimental Environment</title>
<p>The selected experimental procedure shows how the proposed approach behaves with different attribute counts and event logs of differing complexity. The approach was implemented in a prototype tool called BBNGs (Business process Belief Network enGine), built on the .NET framework. BBNGs receives an event log of a business process as input, transforms it into a belief network and allows inferences on that network. The overall architecture is shown in Fig. <xref rid="j_info1157_fig_002">2</xref>.</p>
<fig id="j_info1157_fig_002">
<label>Fig. 2</label>
<caption>
<p>Architecture of the prototype implementing the proposed approach.</p>
</caption>
<graphic xlink:href="info1157_g002.jpg"/>
</fig>
<p>The main component responsible for the behaviour of the tool is the controller, which exposes this behaviour to the graphical user interface (GUI). Users work with the GUI to perform actions such as setting source files, making observations, and previewing extracted graphs and inference results.</p>
<p>The initial task is to receive input in the XES format from an external system. This input format was chosen because, as described previously, information systems usually have no clear event logs of business processes and the data might be heterogeneous. The <italic>XES</italic> parser component loads the data into BBNGs and makes the event log accessible in memory.</p>
<p>Each of the steps described in the previous section then has its own component: the DAG Extractor extracts the labelled transition system and the directed acyclic graphs from an event log, the CPT Builder creates the conditional probability tables, and the Inference Engine observes variables and performs inferences on the generated belief network. The Inference Engine is used by the UI component to make the required inferences and to extract knowledge about business processes.</p>
</sec>
<sec id="j_info1157_s_012">
<label>4.3</label>
<title>Experimental Results</title>
<table-wrap id="j_info1157_tab_003">
<label>Table 3</label>
<caption>
<p>Probability inference results.</p>
</caption>
<table>
<thead>
<tr>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Log</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Inferences taken into account</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Total inferences/total events in the log</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Events observed/events in the log</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Precision</td>
</tr>
</thead>
<tbody>
<tr>
<td style="vertical-align: top; text-align: left">SL</td>
<td style="vertical-align: top; text-align: left">13262</td>
<td style="vertical-align: top; text-align: left">16811/20339</td>
<td style="vertical-align: top; text-align: left">8/9</td>
<td style="vertical-align: top; text-align: left"><inline-formula id="j_info1157_ineq_061"><alternatives><mml:math>
<mml:mn>0.77</mml:mn>
<mml:mo>±</mml:mo>
<mml:mn>0.26</mml:mn></mml:math><tex-math><![CDATA[$0.77\pm 0.26$]]></tex-math></alternatives></inline-formula></td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">DL</td>
<td style="vertical-align: top; text-align: left">95097</td>
<td style="vertical-align: top; text-align: left">147070/262200</td>
<td style="vertical-align: top; text-align: left">33/36</td>
<td style="vertical-align: top; text-align: left"><inline-formula id="j_info1157_ineq_062"><alternatives><mml:math>
<mml:mn>0.52</mml:mn>
<mml:mo>±</mml:mo>
<mml:mn>0.35</mml:mn></mml:math><tex-math><![CDATA[$0.52\pm 0.35$]]></tex-math></alternatives></inline-formula></td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">ML</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">11787</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">43435/59083</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">42/289</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin"><inline-formula id="j_info1157_ineq_063"><alternatives><mml:math>
<mml:mn>0.95</mml:mn>
<mml:mo>±</mml:mo>
<mml:mn>0.16</mml:mn></mml:math><tex-math><![CDATA[$0.95\pm 0.16$]]></tex-math></alternatives></inline-formula></td>
</tr>
</tbody>
</table>
</table-wrap>
<fig id="j_info1157_fig_003">
<label>Fig. 3</label>
<caption>
<p>Inference results of (a) SL, (b) DL, (c) ML.</p>
</caption>
<graphic xlink:href="info1157_g003.jpg"/>
</fig>
<p>The experiments resulted in a total of 17750 trace runs with a total of 207316 events (Table <xref rid="j_info1157_tab_003">3</xref>). Of all the probability inferences, 87170 were rejected as unsuitable because they had data not available in the training set, were anomalous with <inline-formula id="j_info1157_ineq_064"><alternatives><mml:math>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">H</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mn>0</mml:mn></mml:math><tex-math><![CDATA[$P(D|H)=0$]]></tex-math></alternatives></inline-formula> or because they were the first event in the trace. The inference results are visualized in Fig. <xref rid="j_info1157_fig_003">3</xref>.</p>
<p>Among the inferences taken into account, the synthetic log yielded a high average probability, as expected, because the underlying process is rather simple. The average inferred probability was 77%, and 4 of the events had an average inferred probability higher than 99% with a deviation &lt;1%. The other events had average probabilities ranging from 31% to 82% with deviations ranging from ±38% to ±49%.</p>
<p>The other processes were much more complex, with many more possible data variations. In the DL log, the events contained only 3–4 data attributes, so the causal dependency between them is arguable. This resulted in a lower average probability of 52%, although 10 out of 36 events had an average inference probability higher than 80%. Three events in the DL log were ignored in the inferences because they were either always the first event or occurred fewer than 5 times in each of the test sets.</p>
<p>The ML log has the lowest share of calculations taken into account (11787 out of 43435), and its process is the most complex: the log has 289 unique events but only 1156 traces in total. With such a complex structure and so little data, the belief network is under-trained, and usable inferences were obtained for only 42 of the 289 possible events. For those events, however, the average probability was 95%, and 39 of the 42 events taken into account had an average probability higher than 80%.</p>
<p>To sum up, the experiments show that the approach is usable but relies heavily on data: if attribute values occur that were never observed before, or if the historical observations are limited and do not fully cover the process behaviour, the approach is of limited use. For events whose behaviour is expressed in the event log, however, the proposed approach infers high probabilities and makes it possible to answer questions important to the execution of processes, such as whether events are expected in the process instance and what data might occur there.</p>
</sec>
</sec>
<sec id="j_info1157_s_013">
<label>5</label>
<title>Conclusions</title>
<p>The paper presents an approach for constructing a Bayesian belief network from an event log and performing inferences on the constructed network for business process decision support. The approach takes an event log, creates a state transition system, uses it to create a directed acyclic graph, and combines the directed acyclic graph with the data in the event log to construct a Bayesian belief network. The created network was evaluated using 3 event logs of differing complexity to test whether inferences are reliable and can be used for decision support. The main conclusions of the paper are: 
<list>
<list-item id="j_info1157_li_028">
<label>•</label>
<p>The presented approach allows construction of a Bayesian belief network from an event log;</p>
</list-item>
<list-item id="j_info1157_li_029">
<label>•</label>
<p>When used for decision support, the approach assigns average probabilities of 52% to 95% to the actual data, indicating that it can be used for inferences;</p>
</list-item>
<list-item id="j_info1157_li_030">
<label>•</label>
<p>The approach depends on the quantity of data in the event log and on its expressiveness.</p>
</list-item>
</list> 
As can be seen, the approach provides satisfactory probability inferences which can be used for decision support. The approach still needs to be improved to make it more suitable for very complex processes, where there can be many event types but a relatively small number of events in the log. Further research is planned on how the approach can be applied to automatically predict business process behaviour, i.e. what events will occur and with what data attributes. It is also planned to research how the approach can be used to generate initial business process simulation models, thereby reducing the human labour required to create such models.</p>
</sec>
</body>
<back>
<ref-list id="j_info1157_reflist_001">
<title>References</title>
<ref id="j_info1157_ref_001">
<mixed-citation publication-type="chapter"><string-name><surname>van der Aalst</surname>, <given-names>W.M.P.</given-names></string-name>, <string-name><surname>Nakatumba</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Rozinat</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Russell</surname>, <given-names>N.</given-names></string-name> (<year>2010</year>). <chapter-title>Business process simulation: how to get it right?</chapter-title> In: <source>Handbook on Business Process Management</source>. <publisher-name>Springer</publisher-name>, <publisher-loc>Berlin, Heidelberg</publisher-loc>, pp. <fpage>313</fpage>–<lpage>338</lpage>.</mixed-citation>
</ref>
<ref id="j_info1157_ref_002">
<mixed-citation publication-type="journal"><string-name><surname>van der Aalst</surname>, <given-names>W.M.P.</given-names></string-name>, <string-name><surname>Schonenberg</surname>, <given-names>M.H.</given-names></string-name>, <string-name><surname>Song</surname>, <given-names>M.</given-names></string-name> (<year>2011</year>). <article-title>Time prediction based on process mining</article-title>. <source>Information Systems</source>, <volume>36</volume>(<issue>2</issue>), <fpage>450</fpage>–<lpage>475</lpage>.</mixed-citation>
</ref>
<ref id="j_info1157_ref_003">
<mixed-citation publication-type="chapter"><string-name><surname>van der Aalst</surname>, <given-names>W.M.P.</given-names></string-name>, <string-name><surname>Adriansyah</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Medeiros</surname>, <given-names>A.K.A.</given-names></string-name>, <string-name><surname>Arcieri</surname>, <given-names>F.</given-names></string-name>, <string-name><surname>Baier</surname>, <given-names>T.</given-names></string-name>, <string-name><surname>Blickle</surname>, <given-names>T.</given-names></string-name>, <string-name><surname>Bose</surname>, <given-names>J.C.</given-names></string-name>, <string-name><surname>van den Brand</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Brandtjen</surname>, <given-names>R.</given-names></string-name>, <string-name><surname>Buijs</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Burattin</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Carmona</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Castellanos</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Claes</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Cook</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Costantini</surname>, <given-names>N.</given-names></string-name>, <string-name><surname>Curbera</surname>, <given-names>F.</given-names></string-name>, <string-name><surname>Damiani</surname>, <given-names>E.</given-names></string-name>, <string-name><surname>de Leoni</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Delias</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>van Dongen</surname>, <given-names>B.F.</given-names></string-name>, <string-name><surname>Dumas</surname>, <given-names>M.</given-names></string-name>, 
<string-name><surname>Dustdar</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Fahland</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>Ferreira</surname>, <given-names>D.R.</given-names></string-name>, <string-name><surname>Gaaloul</surname>, <given-names>W.</given-names></string-name>, <string-name><surname>van Geffen</surname>, <given-names>F.</given-names></string-name>, <string-name><surname>Goel</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Günther</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Guzzo</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Harmon</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>ter Hofstede</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Hoogland</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Ingvaldsen</surname>, <given-names>J.E.</given-names></string-name>, <string-name><surname>Kato</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Kuhn</surname>, <given-names>R.</given-names></string-name>, <string-name><surname>Kumar</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>La Rosa</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Maggi</surname>, <given-names>F.</given-names></string-name>, <string-name><surname>Malerba</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>Mans</surname>, <given-names>R.S.</given-names></string-name>, <string-name><surname>Manuel</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>McCreesh</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Mello</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Mendling</surname>, <given-names>J.</given-names></string-name>, 
<string-name><surname>Montali</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Motahari-Nezhad</surname>, <given-names>H.R.</given-names></string-name>, <string-name><surname>zur Muehlen</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Munoz-Gama</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Pontieri</surname>, <given-names>L.</given-names></string-name>, <string-name><surname>Ribeiro</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Rozinat</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Seguel Pérez</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Seguel Pérez</surname>, <given-names>R.</given-names></string-name>, <string-name><surname>Sepúlveda</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Sinur</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Soffer</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Song</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Sperduti</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Stilo</surname>, <given-names>G.</given-names></string-name>, <string-name><surname>Stoel</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Swenson</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Talamo</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Tan</surname>, <given-names>W.</given-names></string-name>, <string-name><surname>Turner</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Vanthienen</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Varvaressos</surname>, <given-names>G.</given-names></string-name>, <string-name><surname>Verbeek</surname>, <given-names>E.</given-names></string-name>, 
<string-name><surname>Verdonk</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Vigo</surname>, <given-names>R.</given-names></string-name>, <string-name><surname>Wang</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Weber</surname>, <given-names>B.</given-names></string-name>, <string-name><surname>Weidlich</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Weijters</surname>, <given-names>T.</given-names></string-name>, <string-name><surname>Wen</surname>, <given-names>L.</given-names></string-name>, <string-name><surname>Westergaard</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Wynn</surname>, <given-names>M.</given-names></string-name> (<year>2012</year>). <chapter-title>Process mining manifesto</chapter-title>. In: <source>Business Process Management Workshops</source>. <publisher-name>Springer</publisher-name>, <publisher-loc>Berlin, Heidelberg</publisher-loc>, pp. <fpage>169</fpage>–<lpage>194</lpage>.</mixed-citation>
</ref>
<ref id="j_info1157_ref_004">
<mixed-citation publication-type="chapter"><string-name><surname>Arroyo-Figueroa</surname>, <given-names>G.</given-names></string-name>, <string-name><surname>Sucar</surname>, <given-names>L.E.</given-names></string-name> (<year>1999</year>). <chapter-title>A temporal Bayesian network for diagnosis and prediction</chapter-title>. In: <source>Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence</source>. <publisher-name>Morgan Kaufmann Publishers Inc.</publisher-name>, pp. <fpage>13</fpage>–<lpage>20</lpage>.</mixed-citation>
</ref>
<ref id="j_info1157_ref_005">
<mixed-citation publication-type="chapter"><string-name><surname>Darwiche</surname>, <given-names>A.</given-names></string-name> (<year>2008</year>). <chapter-title>Chapter 11. Bayesian Networks</chapter-title>. In: <string-name><surname>van Harmelen</surname>, <given-names>F.</given-names></string-name>, <string-name><surname>Lifschitz</surname>, <given-names>V.</given-names></string-name>, <string-name><surname>Porter</surname>, <given-names>B.</given-names></string-name> (Eds.), <source>Handbook of Knowledge Representation</source>, Vol. <volume>3</volume>. <publisher-name>Elsevier</publisher-name>, <publisher-loc>Amsterdam</publisher-loc>, pp. <fpage>467</fpage>–<lpage>509</lpage>.</mixed-citation>
</ref>
<ref id="j_info1157_ref_006">
<mixed-citation publication-type="chapter"><string-name><surname>van Dongen</surname>, <given-names>B.F.</given-names></string-name>, <string-name><surname>Crooy</surname>, <given-names>R.A.</given-names></string-name>, <string-name><surname>van der Aalst</surname>, <given-names>W.M.P.</given-names></string-name> (<year>2008</year>). <chapter-title>Cycle time prediction: when will this case finally be finished?</chapter-title> In: <source>Lecture Notes in Computer Science</source>. <publisher-name>Springer</publisher-name>, <publisher-loc>Berlin, Heidelberg</publisher-loc>, pp. <fpage>319</fpage>–<lpage>336</lpage>.</mixed-citation>
</ref>
<ref id="j_info1157_ref_007">
<mixed-citation publication-type="book"><string-name><surname>van Dongen</surname>, <given-names>B.F.</given-names></string-name> (<year>2015</year>a). <source>BPI Challenge 2012</source>. <publisher-name>Eindhoven University of Technology. Dataset</publisher-name>.</mixed-citation>
</ref>
<ref id="j_info1157_ref_008">
<mixed-citation publication-type="book"><string-name><surname>van Dongen</surname>, <given-names>B.F.</given-names></string-name> (<year>2015</year>b). <source>BPI Challenge 2015 Municipality 5</source>. <publisher-name>Eindhoven University of Technology</publisher-name>.</mixed-citation>
</ref>
<ref id="j_info1157_ref_009">
<mixed-citation publication-type="other"><string-name><surname>Günther</surname>, <given-names>C.W.</given-names></string-name>, <string-name><surname>Verbeek</surname>, <given-names>E.</given-names></string-name> (2014). <italic>XES Standard 2.0</italic>. Eindhoven.</mixed-citation>
</ref>
<ref id="j_info1157_ref_010">
<mixed-citation publication-type="journal"><string-name><surname>Kellner</surname>, <given-names>M.I.</given-names></string-name>, <string-name><surname>Madachy</surname>, <given-names>R.J.</given-names></string-name>, <string-name><surname>Raffo</surname>, <given-names>D.M.</given-names></string-name> (<year>1999</year>). <article-title>Software process simulation modeling: why? what? how?</article-title> <source>Journal of Systems and Software</source>, <volume>46</volume>(<issue>2</issue>), <fpage>91</fpage>–<lpage>105</lpage>.</mixed-citation>
</ref>
<ref id="j_info1157_ref_011">
<mixed-citation publication-type="chapter"><string-name><surname>Martin</surname>, <given-names>N.</given-names></string-name>, <string-name><surname>Depaire</surname>, <given-names>B.</given-names></string-name>, <string-name><surname>Caris</surname>, <given-names>A.</given-names></string-name> (<year>2015</year>). <chapter-title>The use of process mining in a business process simulation context: overview and challenges</chapter-title>. In: <source>2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM 2014)</source>, pp. <fpage>381</fpage>–<lpage>388</lpage>.</mixed-citation>
</ref>
<ref id="j_info1157_ref_012">
<mixed-citation publication-type="chapter"><string-name><surname>de Leoni</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>van der Aalst</surname>, <given-names>W.M.P.</given-names></string-name> (<year>2013</year>). <chapter-title>Data-aware process mining: discovering decisions in processes using alignments</chapter-title>. In: <source>Proceedings of the 28th Annual ACM Symposium on Applied Computing</source>. <publisher-name>ACM</publisher-name>, pp. <fpage>1454</fpage>–<lpage>1461</lpage>.</mixed-citation>
</ref>
<ref id="j_info1157_ref_013">
<mixed-citation publication-type="journal"><string-name><surname>Liu</surname>, <given-names>Y.</given-names></string-name>, <string-name><surname>Zhang</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Li</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Jiao</surname>, <given-names>R.J.</given-names></string-name> (<year>2012</year>). <article-title>Workflow simulation for operational decision support using event graph through process mining</article-title>. <source>Decision Support Systems</source>, <volume>52</volume>(<issue>3</issue>), <fpage>685</fpage>–<lpage>697</lpage>.</mixed-citation>
</ref>
<ref id="j_info1157_ref_014">
<mixed-citation publication-type="journal"><string-name><surname>Ping</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Chen</surname>, <given-names>Y.</given-names></string-name>, <string-name><surname>Chen</surname>, <given-names>B.</given-names></string-name>, <string-name><surname>Howboldt</surname>, <given-names>K.</given-names></string-name> (<year>2010</year>). <article-title>A robust statistical analysis approach for pollutant loadings in urban rivers</article-title>. <source>Journal of Environmental Informatics</source>, <volume>16</volume>, <fpage>35</fpage>–<lpage>42</lpage>.</mixed-citation>
</ref>
<ref id="j_info1157_ref_015">
<mixed-citation publication-type="chapter"><string-name><surname>Rogge-Solti</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Mans</surname>, <given-names>R.S.</given-names></string-name>, <string-name><surname>van der Aalst</surname>, <given-names>W.M.</given-names></string-name>, <string-name><surname>Weske</surname>, <given-names>M.</given-names></string-name> (<year>2013</year>). <chapter-title>Improving documentation by repairing event logs</chapter-title>. In: <source>IFIP Working Conference on The Practice of Enterprise Modeling</source>. <publisher-name>Springer</publisher-name>, <publisher-loc>Berlin, Heidelberg</publisher-loc>, pp. <fpage>129</fpage>–<lpage>144</lpage>.</mixed-citation>
</ref>
<ref id="j_info1157_ref_016">
<mixed-citation publication-type="chapter"><string-name><surname>Rogge-Solti</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Kasneci</surname>, <given-names>G.</given-names></string-name> (<year>2014</year>). <chapter-title>Temporal anomaly detection in business processes</chapter-title>. In: <source>Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)</source>, Vol. <volume>8659</volume>. <publisher-name>Springer</publisher-name>, pp. <fpage>234</fpage>–<lpage>249</lpage>.</mixed-citation>
</ref>
<ref id="j_info1157_ref_017">
<mixed-citation publication-type="other"><string-name><surname>Rozinat</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>van der Aalst</surname>, <given-names>W.M.P.</given-names></string-name> (<year>2006</year>). Decision mining in business processes. <italic>Beta, Research School for Operations Management and Logistics</italic>.</mixed-citation>
</ref>
<ref id="j_info1157_ref_018">
<mixed-citation publication-type="journal"><string-name><surname>Sarno</surname>, <given-names>R.</given-names></string-name>, <string-name><surname>Dewandono</surname>, <given-names>R.D.</given-names></string-name>, <string-name><surname>Ahmad</surname>, <given-names>T.</given-names></string-name>, <string-name><surname>Naufal</surname>, <given-names>M.F.</given-names></string-name> (<year>2015</year>). <article-title>Hybrid association rule learning and process mining for fraud detection</article-title>. <source>IAENG International Journal of Computer Science</source>, <volume>42</volume>, <fpage>59</fpage>–<lpage>72</lpage>.</mixed-citation>
</ref>
<ref id="j_info1157_ref_019">
<mixed-citation publication-type="chapter"><string-name><surname>Savickas</surname>, <given-names>T.</given-names></string-name>, <string-name><surname>Vasilecas</surname>, <given-names>O.</given-names></string-name> (<year>2014</year>). <chapter-title>Bayesian belief network application in process mining</chapter-title>. In: <source>Proceedings of the 15th International Conference on Computer Systems and Technologies – CompSysTech ’14</source>, Vol. <volume>883</volume>. <publisher-name>ACM</publisher-name>, pp. <fpage>226</fpage>–<lpage>233</lpage>.</mixed-citation>
</ref>
<ref id="j_info1157_ref_020">
<mixed-citation publication-type="chapter"><string-name><surname>van der Spoel</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Van Keulen</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Amrit</surname>, <given-names>C.</given-names></string-name> (<year>2012</year>). <chapter-title>Process prediction in noisy data sets: a case study in a Dutch hospital</chapter-title>. In: <source>International Symposium on Data-Driven Process Discovery and Analysis</source>. <publisher-name>Springer</publisher-name>, <publisher-loc>Berlin, Heidelberg</publisher-loc>, pp. <fpage>60</fpage>–<lpage>83</lpage>.</mixed-citation>
</ref>
<ref id="j_info1157_ref_021">
<mixed-citation publication-type="journal"><string-name><surname>Sutrisnowati</surname>, <given-names>R.A.</given-names></string-name>, <string-name><surname>Bae</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Song</surname>, <given-names>M.</given-names></string-name> (<year>2015</year>). <article-title>Bayesian network construction from event log for lateness analysis in port logistics</article-title>. <source>Computers &amp; Industrial Engineering</source>, <volume>89</volume>, <fpage>53</fpage>–<lpage>66</lpage>.</mixed-citation>
</ref>
<ref id="j_info1157_ref_022">
<mixed-citation publication-type="chapter"><string-name><surname>Verbeek</surname>, <given-names>H.M.W.</given-names></string-name>, <string-name><surname>Buijs</surname>, <given-names>J.C.</given-names></string-name>, <string-name><surname>van Dongen</surname>, <given-names>B.F.</given-names></string-name>, <string-name><surname>van der Aalst</surname>, <given-names>W.M.</given-names></string-name> (<year>2010</year>). <chapter-title>XES, XESame, and ProM 6</chapter-title>. In: <source>Forum at the Conference on Advanced Information Systems Engineering (CAiSE)</source>. <publisher-name>Springer</publisher-name>, <publisher-loc>Berlin, Heidelberg</publisher-loc>, pp. <fpage>60</fpage>–<lpage>75</lpage>.</mixed-citation>
</ref>
<ref id="j_info1157_ref_023">
<mixed-citation publication-type="chapter"><string-name><surname>Verenich</surname>, <given-names>I.</given-names></string-name>, <string-name><surname>Dumas</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>La Rosa</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Maggi</surname>, <given-names>F.M.</given-names></string-name>, <string-name><surname>di Francescomarino</surname>, <given-names>C.</given-names></string-name> (<year>2016</year>). <chapter-title>Minimizing overprocessing waste in business processes via predictive activity ordering</chapter-title>. In: <source>Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)</source>, Vol. <volume>9694</volume>. <publisher-name>Springer</publisher-name>, pp. <fpage>186</fpage>–<lpage>202</lpage>.</mixed-citation>
</ref>
</ref-list>
</back>
</article>