I would like the dataset, please!

Could you provide the data used in the project? Thanks (malware-prediction-rnn, CLOSED)

ForeverRuri commented on July 4, 2024

I would just like to know whether you could provide the data used in the project. Thanks!

from malware-prediction-rnn.

Comments (5)

mprhode commented on July 4, 2024

@ForeverZH0204 We hope to release the dataset, but we are waiting for a review of the paper before release. I will post here when it's out.

ForeverRuri commented on July 4, 2024

thanks a lot!

I also read the related paper. I have several questions, listed here for whenever you are available:

1. How should I understand the phrase 'a particular time into file execution'?

If I have a record 20 seconds long and sample it at an interval of 5 seconds, what is the value of 'a particular time into file execution'? I suppose it is 4, in other words the actual amount of data used in the model. Is that right?

2. Confusion about Figure 7, Table X and Table XI

As I read in the readme.md and the code, Figure 7 and Table X come from a setting that trains on the whole training set and then tests on a test set with one or more features omitted? And Table XI comes from omitting features from the whole dataset? The difference in the impact score of total processes is then explored.
If so, the conclusion that

'impact score increases relative to others as more features are
omitted, this may indicate that total processes are combined
with other inputs to create discriminating features, though the
input is not highly impactful alone.'

is really hard for me to accept.
I hope that I have made a mistake.

Thanks for your reply!
Wish you a good day.

@mprhode

mprhode commented on July 4, 2024

Hi @ForeverZH0204 - in answer to your questions:

1. By "a particular time into file execution" we mean the real time since the start of the execution of the sample. We are arguing that the number of snapshots (i.e. the amount of data) has a higher correlation with accuracy than the real time since the file began executing.
2. You are right, Fig 7 and Table X look at omission of data in the test set, and Table XI looks at omission during both training and testing. We are looking at the impact of all the features, but in the discussion of the total processes feature we argue that its average impact score grows as more features are omitted (the impact score is the fall in accuracy / number of features omitted). For some features, the impact does not really change whether just this single feature is omitted, this feature + one other feature, or this feature + 2 other features. This implies that for these features the impact of their omission is not really affected by co-omission of other features. Because the impact score of "total processes" increases with the number of features omitted at the same time, we believe this indicates that total processes is combined with other features in the RNN to give distinguishing representations between malicious and benign samples. In Table XI the omission of total processes sees one of the biggest falls in accuracy, so we think it is a useful feature for the model but that its usefulness is realised when combined with other data. We can train further models with different combinations of inputs to test this (but we have not yet done so for this paper).
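
As a rough illustration of the impact score described above (fall in accuracy divided by the number of features omitted), here is a minimal sketch. The function name `impact_score` and all accuracy values are hypothetical and are not taken from the paper or the repository; any averaging over different feature combinations is left out for brevity.

```python
def impact_score(baseline_accuracy: float, omitted_accuracy: float, n_omitted: int) -> float:
    """Fall in accuracy per omitted feature (hypothetical helper, not from the repo)."""
    if n_omitted < 1:
        raise ValueError("at least one feature must be omitted")
    return (baseline_accuracy - omitted_accuracy) / n_omitted


# Hypothetical accuracies, for illustration only.
baseline = 0.90

# Accuracy when the feature of interest is omitted together with
# 0, 1, or 2 other features (keys are the total number of omitted features).
accuracy_when_omitted = {1: 0.89, 2: 0.86, 3: 0.81}

scores = {
    n: impact_score(baseline, acc, n)
    for n, acc in accuracy_when_omitted.items()
}

for n, score in scores.items():
    print(f"{n} feature(s) omitted: impact score = {score:.3f}")
```

With these made-up numbers the per-feature score rises as more features are omitted together, which is the pattern described for "total processes". A feature whose score stays roughly flat would instead suggest that its contribution does not depend on the co-omitted features.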

Thank you for your questions and I hope that has made it a little more clear - I will work on a presentation of the work which explains these points more clearly.

ForeverRuri commented on July 4, 2024

Thanks for your reply!
But for question 2, if we want to explore the relationship between the difference and a certain variable, I think we need to keep the other conditions unchanged.

vinayakumarr commented on July 4, 2024

When exactly will the dataset be released for further research?
