Extraction of encodings from event logs. In total, 22 different encodings are implemented, divided into four categories: baseline, process mining based, text based, and graph based.
Clone this repo to your local machine using:

```shell
git clone https://github.com/gbrltv/business_process_encoding.git
```

Create an environment and activate it:

```shell
conda create --name bpe python=3.10
conda activate bpe
```

Install the dependencies:

```shell
python -m pip install -r requirements.txt
```
To generate the encodings, call the main script and provide the desired arguments. Example:

```shell
python main.py --dataset=event_logs/scenario1_1000_attribute_0.05.xes --encoding=onehot
```
The computed encodings are returned and can then be used for downstream tasks.
| Parameter | Description | Options |
|---|---|---|
| dataset | path to the event log | - |
| encoding | encoding method used to extract embeddings | onehot, count2vec, alignment, logskeleton, tokenreplay, doc2vec, hash2vec, tfidf, word2vec, boostne, deepwalk, diff2vec, glee, grarep, hope, laplacianeigenmaps, netmf, nmfadmm, node2vec, nodesketch, role2vec, walklets |
| vector_size | number of desired dimensions for the encoding (some encoding methods do not allow configuring this option) | - |
| aggregation | how to aggregate activity encodings into a single trace representation (only for some methods of the text and graph encoding families) | average, max |
| embed_from | whether to extract encodings from nodes or edges (graph-based encodings only) | nodes, edges |
| edge_operator | how to aggregate edge embeddings (graph-based encodings only) | average, hadamard, weightedl1, weightedl2 |
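The edge_operator options follow the operators commonly used in the node-embedding literature to turn two endpoint node vectors into one edge vector. The sketch below shows their usual definitions; this is an assumption about their semantics, not code from this repository, and the exact implementation here may differ.

```python
# Common definitions of the four edge-embedding operators applied to the
# two endpoint node vectors u and v (assumed semantics; the repository's
# implementation may differ).

def edge_embedding(u, v, operator):
    if operator == "average":
        return [(a + b) / 2 for a, b in zip(u, v)]      # elementwise mean
    if operator == "hadamard":
        return [a * b for a, b in zip(u, v)]            # elementwise product
    if operator == "weightedl1":
        return [abs(a - b) for a, b in zip(u, v)]       # absolute difference
    if operator == "weightedl2":
        return [(a - b) ** 2 for a, b in zip(u, v)]     # squared difference
    raise ValueError(f"unknown operator: {operator}")

u, v = [1.0, 2.0], [3.0, 6.0]
print(edge_embedding(u, v, "average"))     # [2.0, 4.0]
print(edge_embedding(u, v, "hadamard"))    # [3.0, 12.0]
print(edge_embedding(u, v, "weightedl1"))  # [2.0, 4.0]
print(edge_embedding(u, v, "weightedl2"))  # [4.0, 16.0]
```

All four operators keep the edge vector the same length as the node vectors, so the choice of operator does not change the dimensionality of the resulting encoding.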
- Gabriel Marques Tavares, Postdoc at LMU München
- Rafael Seidi Oyamada, PhD candidate at Università degli Studi di Milano
- Paolo Ceravolo, Associate Professor at Università degli Studi di Milano
- Sylvio Barbon Junior, Associate Professor at Università degli Studi di Trieste