Machine Learning to actively classify Android applications
Given a set of PCAP files, this project builds feature vectors from them and trains a decision tree classifier that can then classify other PCAP files.
The apps are: FruitNinja, GoogleNews, Wikipedia, WeatherChannel, and Youtube. Sample PCAPs generated from the Android VM can be found in the Samples folder at https://github.com/sjcomeau43543/MLforAndroidApps.
scikit-learn (latest release)
pyshark==0.3.8
To create the features file from the PCAPs, run:
python logFlows.py -d Samples/
To train and use a model, run:
python classifyFlows.py -t traffic.csv -e SAMPLE.PCAP
where traffic.csv is the features file generated by logFlows.py and SAMPLE.PCAP is the PCAP you want to classify.
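As a rough illustration of the train-then-classify step, the sketch below fits a scikit-learn decision tree on a few hand-written flow feature vectors and predicts labels for two unseen flows. The feature columns (mean packet size, packet count) and all values are hypothetical and are not the actual layout of traffic.csv; this is a minimal sketch, not the code in classifyFlows.py.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical per-flow features: [mean packet size in bytes, packet count].
# The real columns written by logFlows.py may differ.
X_train = [
    [120.0, 10], [130.0, 12],     # flows labelled 1
    [900.0, 200], [950.0, 220],   # flows labelled 5
]
y_train = [1, 1, 5, 5]

# Fit a decision tree on the labelled flows.
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)

# Classify flows that were not part of the training data.
unknown_flows = [[125.0, 11], [920.0, 210]]
predictions = clf.predict(unknown_flows).tolist()

# score() returns the mean accuracy against the labels it is given.
train_accuracy = clf.score(X_train, y_train)
print(predictions, train_accuracy)
```

Note that score() compares predictions against whatever labels are passed in, so the number is only informative when those labels are the true ones.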
In our tests, accuracy ranged from 75% to 90%. For test logs, look in Testing.
The reported mean accuracy is only meaningful when the true label is known. When given a PCAP file with no label, the classifier assumes the label is 0 rather than one of the numbers 1, 2, 3, 4, 5 that represent the app names. To measure accuracy, we passed in PCAPs from Samples/APPNAME/dump.pcap, so that the label is taken from the APPNAME directory. For example, to see the accuracy for a specific file, run
python classifyFlows.py -t traffic.csv -e Samples/FruitNinja/dump_a1.pcap
The sample files can be obtained by cloning https://github.com/sjcomeau43543/MLforAndroidApps.
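To make the labelling behaviour concrete, here is a minimal sketch of deriving a label from the parent directory name and of why the mean accuracy is uninformative when the label falls back to 0. The helper names label_from_path and mean_accuracy are hypothetical, and the assignment of 1 to 5 in the order the apps are listed is an assumption; the repository may order them differently.

```python
from pathlib import PurePosixPath

# Assumed mapping: the README only says that 1-5 represent the app names.
APP_LABELS = {
    "FruitNinja": 1,
    "GoogleNews": 2,
    "Wikipedia": 3,
    "WeatherChannel": 4,
    "Youtube": 5,
}


def label_from_path(pcap_path):
    """Return the label for Samples/APPNAME/file.pcap, or 0 if APPNAME is unknown."""
    app_name = PurePosixPath(pcap_path).parent.name
    return APP_LABELS.get(app_name, 0)


def mean_accuracy(true_labels, predicted_labels):
    """Fraction of flows whose predicted label equals the true label."""
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


# A file under Samples/FruitNinja/ gets a real label; a bare file name gets 0.
labelled = label_from_path("Samples/FruitNinja/dump_a1.pcap")
unlabelled = label_from_path("SAMPLE.PCAP")

# Hypothetical predictions for three flows from the same capture.
predicted = [1, 1, 5]
print(mean_accuracy([labelled] * 3, predicted))    # compared against the true label
print(mean_accuracy([unlabelled] * 3, predicted))  # compared against the fallback 0
```

Because the classifier only ever predicts 1 to 5, comparing against the fallback label 0 can never produce a match, which is why accuracy should be read only for files placed under an APPNAME directory.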