Comments (4)
Good question.
In fact, LibRadar can be divided into two parts: a 'clustering' part and an 'instant detection' part.
I downloaded more than 1 million apps and extracted static features from them. Then I clustered them into groups, recorded the groups that have more than 1000 items, and took those as libraries. Some volunteers and I tagged some of the items, which is how we got tgst5.dat. You can refer to the LibRadar paper (ICSE 2016) for more details.
I don't think anyone wants to use the 'clustering' part of the code to do this work again, because it took a dozen servers a month to create the data. Also, the code I used is ugly, just patches on top of patches = _ =.
Therefore, I released only the instant detection part on my GitHub as LibRadar. I hope that's enough for users.
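As a rough illustration of the grouping step described above, here is a minimal sketch in Python. It is not the actual LibRadar clustering code: it assumes each package has already been reduced to a single feature hash, and it treats identical hashes as one group. The function name `find_library_candidates`, the input layout, and the `min_apps` parameter are hypothetical and chosen only for this example.

```python
from collections import Counter, defaultdict


def find_library_candidates(app_features, min_apps=1000):
    """Group packages by feature hash and keep the frequently seen groups.

    app_features: dict mapping an app id to an iterable of
        (package_name, feature_hash) pairs (hypothetical layout).
    min_apps: a group is kept only if it appears in MORE than this many apps.

    Returns a dict: feature_hash -> (most common package name, app count).
    """
    app_counts = Counter()            # feature_hash -> number of apps containing it
    name_votes = defaultdict(Counter)  # feature_hash -> package names seen for it

    for app_id, packages in app_features.items():
        seen_in_app = set()
        for package_name, feature_hash in packages:
            # Count each feature hash at most once per app.
            if feature_hash in seen_in_app:
                continue
            seen_in_app.add(feature_hash)
            app_counts[feature_hash] += 1
            name_votes[feature_hash][package_name] += 1

    candidates = {}
    for feature_hash, count in app_counts.items():
        if count > min_apps:
            # Use the most frequent package name as a provisional label;
            # manual tagging would still be needed afterwards.
            likely_name = name_votes[feature_hash].most_common(1)[0][0]
            candidates[feature_hash] = (likely_name, count)
    return candidates
```

With the default `min_apps=1000`, only groups seen in more than 1000 apps survive, which mirrors the threshold mentioned above; the manual tagging that produces tgst5.dat is a separate step that this sketch does not cover.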
from libradar.
Ok. Thank you so much for your fast reply!
from libradar.
I just found your awesome project this week. Separating first party and third party code is exactly what I have been looking for! My previous whitelist approach, as your paper clearly shows, is a losing approach.
I am very interested in the clustering code, however unpolished, since it would allow me and others to extend and maintain the instant detection database. It would be very cool if there were a way to capture the manual part of the tagging under version control so it can be reused, extended, and updated in a collaborative fashion. At least for my use case, dozens of servers are not a disqualifying requirement.
from libradar.
@jevinskie Glad to hear that.
The project https://github.com/pkumza/lib-detector is what generates raw_data. It is unpolished, though, and difficult to use.
In the dev branch of https://github.com/pkumza/LibRadar, I used 5 steps to filter and tag raw_data into tgst5.dat.
By the way, the APK files I used to generate the data are getting old, so this approach is losing coverage too. Therefore, I am trying to create a new version of LibRadar that updates the data automatically as I put new apps into this machine.
from libradar.
Related Issues (20)
- LibRadar.rdb is not available anymore
- LibRadar isn't working
- Redis integration
- Online trail http://radar.pkuos.org/ not working.
- Command "python setup.py egg_info" failed with error code 1
- Link rot "http://lxwiki.oss-cn-beijing.aliyuncs.com/lite_dataset_10.csv"
- 'int' object has no attribute 'iteritems'
- job_dispatching.py does not update the Redis Database
- Python Script Not working
- https://www.dropbox.com/s/ljtzw74twt8xgy6/d.tar.gz?dl=0 broken
- could the tools be run locally?
- any idea about the redis error when get data?
- Multiple .dex files not considered
- Link not working
- Is this project extensible?
- website not found
- How to create lite_dataset_10.csv?
- Running into an error
- download data issue
- runtime error