endgameinc / dga_predict
License: GNU General Public License v2.0
I get this error when running on Colab, please help me. Thanks
can't multiply sequence by non-int of type 'float'
TypeError Traceback (most recent call last)
*******\dga_predict-master\run.py in <module>
93
94 if __name__ == "__main__":
---> 95 create_figs(nfolds=1) # Run with 1 to make it fast
*******\dga_predict-master\run.py in create_figs(isbigram, islstm, nfolds, force)
32 # Generate results if needed
33 if force or (not os.path.isfile(RESULT_FILE)):
---> 34 bigram_results, lstm_results = run_experiments(isbigram, islstm, nfolds)
35
36 results = {'bigram': bigram_results, 'lstm': lstm_results}
*******\dga_predict-master\run.py in run_experiments(isbigram, islstm, nfolds)
21
22 if isbigram:
---> 23 bigram_results = bigram.run(nfolds=nfolds)
24
25 if islstm:
*******\dga_predict-master\dga_classifier\bigram.py in run(max_epoch, nfolds, batch_size)
21 def run(max_epoch=50, nfolds=10, batch_size=128):
22 """Run train/test on logistic regression model"""
---> 23 indata = data.get_data()
24
25 # Extract data and labels
*******\dga_predict-master\dga_classifier\data.py in get_data(force)
128 def get_data(force=False):
129 """Returns data and labels"""
--> 130 gen_data(force)
131
132 return pickle.load(open(DATA_FILE))
*******\dga_predict-master\dga_classifier\data.py in gen_data(force)
118 """
119 if force or (not os.path.isfile(DATA_FILE)):
--> 120 domains, labels = gen_malicious(10000)
121
122 # Get equal number of benign/malicious
*******\dga_predict-master\dga_classifier\data.py in gen_malicious(num_per_dga)
46 segs_size = max(1, num_per_dga/len(banjori_seeds))
47 for banjori_seed in banjori_seeds:
---> 48 domains += banjori.generate_domains(segs_size, banjori_seed)
49 labels += ['banjori']*segs_size
50
*******\dga_predict-master\dga_classifier\dga_generators\banjori.py in generate_domains(nr_domains, seed)
14 ret = []
15
---> 16 for i in range(nr_domains):
17 seed = next_domain(seed)
18
TypeError: 'float' object cannot be interpreted as an integer
I keep getting this error. Can anyone please help me? I'm not familiar with Python coding. Thanks in advance.
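The traceback points at `segs_size = max(1, num_per_dga/len(banjori_seeds))` in data.py. Under Python 3, `/` always returns a float, so `segs_size` reaches both `range()` and `['banjori']*segs_size` as a float, which would explain the two messages quoted above. A minimal sketch of the likely fix, assuming integer division was intended; the seed list below is a stand-in, not the repo's actual seeds:

```python
# Stand-in values; the real seeds are defined in data.py.
banjori_seeds = ["seed-a", "seed-b", "seed-c"]
num_per_dga = 10000

# Python 3: "/" is true division and yields a float.
float_size = max(1, num_per_dga / len(banjori_seeds))

# "//" is floor division and keeps the result an int.
segs_size = max(1, num_per_dga // len(banjori_seeds))

# Both operations that failed in the traceback accept an int.
labels = ["banjori"] * segs_size
indices = list(range(segs_size))
```

The same `/` pattern appears wherever `segs_size` is computed in `gen_malicious`, so each occurrence would probably need the same change.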
Hello,
The lengths of the domain names generated by the Simda generator are wrong: they run from 0 to 23 (that is, range(32 - 8)) instead of from 8 to 31. As a result, the dataset used for training the model is slightly corrupted.
To fix this issue, just replace this piece of code in data.py:
simda_lengths = range(8, 32)
segs_size = max(1, num_per_dga/len(simda_lengths))
for simda_length in range(len(simda_lengths)):
    domains += simda.generate_domains(segs_size,
                                      length=simda_length,
                                      tld=None,
                                      base=random.randint(2, 2**32))
    labels += ['simda']*segs_size
with this one:
simda_lengths = range(8, 32)
segs_size = max(1, num_per_dga/len(simda_lengths))
for simda_length in simda_lengths:
    domains += simda.generate_domains(segs_size,
                                      length=simda_length,
                                      tld=None,
                                      base=random.randint(2, 2**32))
    labels += ['simda']*segs_size
The only difference is that the new loop iterates over simda_lengths itself instead of over its indices.
I hope this helps!
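To make the difference concrete, here is a small standalone check of what each loop header actually iterates over. It uses only the `range(8, 32)` value from the snippet and does not call the Simda generator:

```python
simda_lengths = range(8, 32)

# Original loop header: iterates over the indices of simda_lengths.
buggy = list(range(len(simda_lengths)))

# Proposed loop header: iterates over the lengths themselves.
fixed = list(simda_lengths)

print(min(buggy), max(buggy))  # 0 23
print(min(fixed), max(fixed))  # 8 31
```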
Hi,
Could you please post a few lines of sample code that check a domain name against the trained model and return the result (generated/non-generated)?
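One possible shape for such a check, as a sketch rather than the repo's own API. It assumes a character-level model, a `valid_chars` mapping from characters to positive integers built at training time, a fixed input length `maxlen`, and a `predict_proba` callable that wraps the trained model. All of these names are assumptions for illustration, and the stub scorer below only stands in for a real model:

```python
def encode(domain, valid_chars, maxlen):
    """Map characters to integer ids and left-pad with zeros to maxlen."""
    ids = [valid_chars.get(ch, 0) for ch in domain.lower()][-maxlen:]
    return [0] * (maxlen - len(ids)) + ids


def is_generated(domain, predict_proba, valid_chars, maxlen, threshold=0.5):
    """Return True if the model's score for the domain reaches the threshold."""
    score = predict_proba(encode(domain, valid_chars, maxlen))
    return score >= threshold


# Hypothetical vocabulary; a real one must match the one used for training.
alphabet = "abcdefghijklmnopqrstuvwxyz0123456789-."
valid_chars = {ch: i + 1 for i, ch in enumerate(alphabet)}


def stub_predict_proba(encoded):
    """Placeholder scorer (NOT a trained model): share of digit characters."""
    digit_ids = {valid_chars[d] for d in "0123456789"}
    nonzero = [i for i in encoded if i != 0]
    return sum(i in digit_ids for i in nonzero) / max(1, len(nonzero))


result = is_generated("google.com", stub_predict_proba, valid_chars, maxlen=20)
```

With a real Keras model, `predict_proba` would wrap something like a call to the model's `predict` on a batch containing the encoded domain; the encoding and padding must match whatever was used during training.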
Thanks!
I ran dga_predict on Windows. The environment is below:
(tensorflow) C:\JT\deeplearning\dga\dga_predict-master>pip freeze
backports.weakref==1.0rc1
bleach==1.5.0
certifi==2018.1.18
chardet==3.0.4
cycler==0.10.0
h5py==2.7.0
html5lib==0.9999999
idna==2.6
Keras==2.1.3
Markdown==2.6.11
matplotlib==2.0.2
numpy==1.14.0
protobuf==3.5.1
pyparsing==2.2.0
python-dateutil==2.6.1
pytz==2017.2
PyYAML==3.12
requests==2.18.4
requests-file==1.4.3
scikit-learn==0.19.1
scipy==0.19.1
six==1.11.0
sklearn==0.0 (I notice this looks strange!)
tensorflow==1.2.1
tldextract==2.2.0
urllib3==1.22
Werkzeug==0.14.1
wincertstore==0.2
And the error information is below:
(tensorflow) C:\JT\deeplearning\dga\dga_predict-master>python run.py
Using TensorFlow backend.
C:\Users\Admin\Anaconda3\envs\tensorflow\lib\site-packages\sklearn\cross_validation.py:41: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
  "This module will be removed in 0.20.", DeprecationWarning)
Traceback (most recent call last):
  File "run.py", line 95, in <module>
    create_figs(nfolds=1) # Run with 1 to make it fast
  File "run.py", line 34, in create_figs
    bigram_results, lstm_results = run_experiments(isbigram, islstm, nfolds)
  File "run.py", line 23, in run_experiments
    bigram_results = bigram.run(nfolds=nfolds)
  File "C:\JT\deeplearning\dga\dga_predict-master\dga_classifier\bigram.py", line 23, in run
    indata = data.get_data()
  File "C:\JT\deeplearning\dga\dga_predict-master\dga_classifier\data.py", line 130, in get_data
    gen_data(force)
  File "C:\JT\deeplearning\dga\dga_predict-master\dga_classifier\data.py", line 120, in gen_data
    domains, labels = gen_malicious(10000)
  File "C:\JT\deeplearning\dga\dga_predict-master\dga_classifier\data.py", line 48, in gen_malicious
    domains += banjori.generate_domains(segs_size, banjori_seed)
  File "C:\JT\deeplearning\dga\dga_predict-master\dga_classifier\dga_generators\banjori.py", line 16, in generate_domains
    for i in range(nr_domains):
TypeError: 'float' object cannot be interpreted as an integer
The Python version is 3.5.4, and I used 2to3 to convert the code. Any response will be appreciated. Thanks!
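Separately from the `TypeError` above, a Python 3 port will probably stumble next on `pickle.load(open(DATA_FILE))` in `get_data`: Python 3's pickle reads and writes bytes, so the file has to be opened in binary mode. A minimal sketch, assuming the data is a list of (label, domain) pairs; the path and sample values here are hypothetical stand-ins:

```python
import os
import pickle
import tempfile

# Hypothetical path; the repo defines its own DATA_FILE.
DATA_FILE = os.path.join(tempfile.mkdtemp(), "traindata.pkl")

# Placeholder records, not real training data.
sample = [("benign", "a.example"), ("banjori", "b.example")]

# Write in binary mode ("wb") ...
with open(DATA_FILE, "wb") as fh:
    pickle.dump(sample, fh)

# ... and read back in binary mode ("rb").
with open(DATA_FILE, "rb") as fh:
    loaded = pickle.load(fh)
```

Any matching `pickle.dump` call in the repo would need `"wb"` in the same way.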
The function gen_data takes the combined count of malicious and benign domains and appends that many labels to the existing malicious labels, so the label list ends up longer than the domain list.
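A small reproduction of the mismatch described here, as a sketch under assumptions: `get_benign` and the list contents are hypothetical stand-ins, and the actual code in gen_data may be arranged differently. The point is only that the count has to be taken before the domain list is extended:

```python
def get_benign(n):
    """Stand-in for the repo's benign-domain loader (hypothetical)."""
    return ["benign%d.example" % i for i in range(n)]


# Stand-in for the output of gen_malicious().
domains = ["mal%d.example" % i for i in range(5)]
labels = ["malicious"] * 5

# Record the malicious count before extending the domain list.
n_benign = len(domains)
domains += get_benign(n_benign)

# Buggy variant: len(domains) now counts malicious + benign entries.
buggy_labels = labels + ["benign"] * len(domains)

# Fixed variant: one benign label per benign domain.
labels += ["benign"] * n_benign

print(len(domains), len(buggy_labels), len(labels))  # 10 15 10
```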