This project contains scripts to create various data files/analyses for our experiment:
- Write the files needed to create graphics and run the MultiMatch tool for all participants.
- Calculate distribution and transition matrices for each of the phases for all participants.
- Calculate conventional eye-tracking metrics based on fixations and saccades.
- Given similarity scores from MultiMatch, separate participants into clusters.
For our experiment, we split the fixation data into three phases:
- Finding initial focus points
- Building on those points
- Fixing the bug.
This script will separate the data into groups for each bug and for each phase in each bug. For each of those, it will create files to generate the following graphics and computations: alpscarf tool, radial transition graph tool, scatter plots, and MultiMatch tool.
It creates these CSV files from the merged_data.csv file that is created by the iTrace-post project.
This script will also create Distribution and Transition csv files for each of the bugs for all of the participants. The Distribution csv file contains the percentage of time the participant spent on each of the software entities listed below for each of the three phases. The Transition csv file contains a transition matrix between each of the entities for each phase for all of the participants.
- Comments
- Method_Body
- Member_Variable
- Bug_Report
- Class_Signature
- Method_Signature
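As a rough illustration of what the Distribution and Transition files contain, the sketch below computes both from a list of fixations for one phase. It is not the code in genVisualPhases.py: the `(entity, duration_ms)` input shape, the function names, and the choice to count raw transitions (including transitions from an entity to itself) are assumptions made for the example.

```python
from collections import Counter

ENTITIES = ["Comments", "Method_Body", "Member_Variable",
            "Bug_Report", "Class_Signature", "Method_Signature"]


def distribution(fixations):
    """Return the percentage of total fixation time spent on each entity.

    fixations: list of (entity, duration_ms) tuples (assumed input shape).
    """
    totals = Counter()
    for entity, duration in fixations:
        if entity in ENTITIES:
            totals[entity] += duration
    total_time = sum(totals.values())
    if total_time == 0:
        return {entity: 0.0 for entity in ENTITIES}
    return {entity: 100.0 * totals[entity] / total_time for entity in ENTITIES}


def transition_matrix(fixations):
    """Count transitions between the entities of consecutive fixations."""
    matrix = {src: {dst: 0 for dst in ENTITIES} for src in ENTITIES}
    sequence = [entity for entity, _ in fixations if entity in ENTITIES]
    for src, dst in zip(sequence, sequence[1:]):
        matrix[src][dst] += 1
    return matrix


# Example fixation sequence for one participant and one phase (made-up values).
phase_fixations = [("Bug_Report", 300), ("Method_Body", 100),
                   ("Method_Body", 100), ("Comments", 500)]
dist = distribution(phase_fixations)
trans = transition_matrix(phase_fixations)
```

Whether the real transition matrix is normalized or excludes self-transitions is not stated here, so treat those details as open.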
This program requires a path to a processed_data directory that contains all of the output files from iTrace-post. This directory must have the following structure:
processed_data/
├── P100
│ ├── bug1
│ │ └── merged_data.csv
│ └── bug2
│ └── merged_data.csv
├── P101
│ ├── bug1
│ │ └── merged_data.csv
│ └── bug2
│ └── merged_data.csv
├── Rest of Participants...
- Note: If a certain participant does not have any data for a specific bug, the subdirectory for that bug is not required.
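The layout above can be traversed with a few lines of Python. This is only a sketch of one way to locate the input files; the helper name `find_merged_files` is hypothetical, and it simply skips any participant/bug combination that has no merged_data.csv, matching the note above.

```python
from pathlib import Path


def find_merged_files(processed_dir):
    """Yield (participant, bug, path) for every merged_data.csv that exists."""
    root = Path(processed_dir)
    participant_dirs = sorted(p for p in root.iterdir() if p.is_dir())
    for participant_dir in participant_dirs:
        for bug in ("bug1", "bug2"):
            merged = participant_dir / bug / "merged_data.csv"
            # A missing bug subdirectory is allowed, so skip it quietly.
            if merged.is_file():
                yield participant_dir.name, bug, merged
```

For example, `list(find_merged_files("./processed_data"))` would give one entry per participant and bug that has data.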
To split the data into phases, two epoch times (in ms) are needed to mark the boundaries between the three phases. These times should be stored in a CSV file with the following header:
Participant,Trial,endPhase1,endPhase2
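A minimal sketch of how two boundary times yield three phases is shown below. The function names are hypothetical, the sample `Trial` value `bug1` is a guess at the file's contents, and treating a fixation that falls exactly on a boundary as part of the earlier phase is an assumption; the actual script may handle these differently.

```python
import csv
import io


def load_phase_changes(csv_file):
    """Map (participant, trial) -> (endPhase1, endPhase2) in epoch ms."""
    changes = {}
    for row in csv.DictReader(csv_file):
        key = (row["Participant"], row["Trial"])
        changes[key] = (int(row["endPhase1"]), int(row["endPhase2"]))
    return changes


def phase_of(timestamp_ms, end_phase1, end_phase2):
    """Return 1, 2, or 3 depending on where the timestamp falls."""
    if timestamp_ms <= end_phase1:   # assumption: boundary belongs to phase 1
        return 1
    if timestamp_ms <= end_phase2:   # assumption: boundary belongs to phase 2
        return 2
    return 3


# Made-up example data in the documented header format.
sample = io.StringIO(
    "Participant,Trial,endPhase1,endPhase2\n"
    "P100,bug1,1000,2000\n"
)
changes = load_phase_changes(sample)
end1, end2 = changes[("P100", "bug1")]
phases = [phase_of(t, end1, end2) for t in (500, 1500, 2500)]
```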
The program also creates two output directories, one for each bug. These directories can be specified with the -o and -t command line options.
Each of these directories must have the following structure:
Bug1_Output/
├── Phase1
├── Phase2
└── Phase3
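If the phase subdirectories need to be prepared by hand, a small helper like the following would build the structure shown above. The function name `make_output_dirs` is hypothetical and is not part of genAllPhases.py.

```python
from pathlib import Path


def make_output_dirs(bug_output_dir):
    """Create Phase1, Phase2, and Phase3 under the given bug output directory."""
    root = Path(bug_output_dir)
    created = []
    for phase in ("Phase1", "Phase2", "Phase3"):
        phase_dir = root / phase
        # exist_ok avoids an error when the directory is already present.
        phase_dir.mkdir(parents=True, exist_ok=True)
        created.append(phase_dir)
    return created
```

Calling `make_output_dirs("./Bug1_Output")` and `make_output_dirs("./Bug2_Output")` would cover both bugs.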
Below is the command line interface for genAllPhases.py:
- Required Arguments
- -p, --processed : a path to the processed directory that has the structure described above
- -c, --changes : a path to a csv file that contains the times of the phases changes
- -o, --one : a path to a directory to store the bug1 output
- -t, --two : a path to a directory to store the bug2 output
- Optional Arguments
- -d, --dictPhaseFile : a path to a text file listing the options used when creating the output files. By default, the following options are selected:
- isRadial:entity
- isAlpScarf:entity
- isStimulus:Phase
- isColors:mapEntityColor.txt
The file must have the following format.
OptionForPhaseDict:Value
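For illustration, a file in this `Option:Value` format could be read as follows. This is a sketch rather than the script's actual parser: the function name `parse_phase_dict` is hypothetical, and the idea that options missing from the file fall back to the defaults listed above is an assumption.

```python
DEFAULTS = {
    "isRadial": "entity",
    "isAlpScarf": "entity",
    "isStimulus": "Phase",
    "isColors": "mapEntityColor.txt",
}


def parse_phase_dict(lines):
    """Parse 'Option:Value' lines, starting from the documented defaults."""
    options = dict(DEFAULTS)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"Expected 'Option:Value', got: {line!r}")
        options[key.strip()] = value.strip()
    return options


# 'function' is a made-up value used only for this example.
options = parse_phase_dict(["isRadial:function\n", "\n"])
```

In practice the lines would come from the file passed with -d, for example `parse_phase_dict(open(path))`.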
- Arguments for PhaseDict Text File
- isAlpScarf: a string indicating that the alpscarf plot data should be generated using the passed-in value as the AOI column
- isRadial: a string indicating that the radial data should be generated using the passed-in value as the AOIName column
- isStimulus: a string indicating that the radial data should be generated using the passed-in value as the stimulus column
- isColors: a text file that contains a mapping of functions to hex colors. This file must have the following format:
ColUsedForIsAlpScarf-Color
python3 genAllPhases.py -p ./processed_data -c ./PhaseChanges.csv -o ./Bug1_Output -t ./Bug2_Output
*Note: this script requires the genVisualPhases.py script since it uses the createMergeDF(), parseMergeCSV(), createSinglePhase(), createTransMatrix(), and createDistMatrix() functions.
This script will compute pairwise scanpath comparisons of all participants for a single phase for a single bug using the MultiMatch tool. It will create a CSV file with the following header:
Part1_ID,Part2_ID,Shape,Length,Direction,Position,Duration
Because MultiMatch can take a long time to run, caching has been implemented to save time. After every successful comparison, the script writes the result to the CSV file passed in. The next time the script is called, it skips all of the already-completed comparisons and picks up where it left off.
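The resume-from-cache behavior can be sketched as below. This is not the code in calcMultiScore.py: the function names are hypothetical, and `compare` is a placeholder callable standing in for the real MultiMatch call, assumed to return the five scores in header order.

```python
import csv
import os
from itertools import combinations

HEADER = ["Part1_ID", "Part2_ID", "Shape", "Length",
          "Direction", "Position", "Duration"]


def load_done_pairs(cache_path):
    """Return the set of (Part1_ID, Part2_ID) pairs already in the cache."""
    if not os.path.exists(cache_path):
        return set()
    with open(cache_path, newline="") as f:
        return {(row["Part1_ID"], row["Part2_ID"]) for row in csv.DictReader(f)}


def compare_all(participants, cache_path, compare):
    """Run every missing pairwise comparison, appending each result at once."""
    done = load_done_pairs(cache_path)
    needs_header = (not os.path.exists(cache_path)
                    or os.path.getsize(cache_path) == 0)
    with open(cache_path, "a", newline="") as f:
        writer = csv.writer(f)
        if needs_header:
            writer.writerow(HEADER)
        for first, second in combinations(sorted(participants), 2):
            if (first, second) in done:
                continue  # already computed in an earlier run
            scores = compare(first, second)  # placeholder for MultiMatch
            writer.writerow([first, second, *scores])
            f.flush()  # keep the result on disk even if a later pair fails
```

Writing and flushing after each pair means an interrupted run loses at most the comparison that was in progress.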
This script requires that the files used for the comparison are generated from the genAllPhases.py script.
python3 calcMultiScore.py <directoryToTSVFiles> <outputName> <pathToCachedCSV>