Comments (17)
I am facing the same issues of multiple tilib.exe errors
from flare-ida.
I used BeautifulSoup version 3.*, then debugged msdn_crawler.py and fixed some errors, for example:
constant_names = re.findall(
    "<dl><dt>(.*?)</dt>", descriptions[i])
if not constant_names:
    continue
constant_names = [strip_html(unicode(c, 'utf-8'))
                  .encode('utf-8') for c in constant_names]
parsed_html = BeautifulSoup(descriptions[i])
constant_descriptions = []
for string in parsed_html.findAll(width='60%'):
    constant_descriptions.append(strip_html(string.text.encode('ascii')).encode('utf-8'))
The result is:
Parsed 341278 files
Extracted information about 34218 functions
ERROR processing 197 files
The size of msdn_data_nn.xml is 33.7 MB. What about others?
I'll upload this file to my blog so people can download it. Now I'm focusing on the remaining files that failed.
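The constant-name extraction in the snippet above can be tried in isolation. The following is a minimal sketch written for Python 3 (the script itself targets Python 2). The HTML fragment is made up for illustration, and strip_tags is a hypothetical stand-in for the script's strip_html helper, whose real implementation is not shown in this thread.

```python
import re

# Hypothetical HTML fragment shaped like the markup the regex targets.
description = (
    "<dl><dt><b>GENERIC_READ</b></dt><dt>0x80000000</dt></dl>"
    "<dl><dt><b>GENERIC_WRITE</b></dt><dt>0x40000000</dt></dl>"
)


def strip_tags(fragment):
    """Hypothetical stand-in for strip_html: remove anything inside <...>."""
    return re.sub(r"<[^>]+>", "", fragment)


# Same pattern as in the thread: capture the first <dt> of every <dl>.
raw_names = re.findall(r"<dl><dt>(.*?)</dt>", description)
constant_names = [strip_tags(name) for name in raw_names]
print(constant_names)
```

If the pattern matches nothing, raw_names is an empty list, which is the case the original code skips with continue.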
It still doesn't work, even though I changed the .py file.
@flypuma can you paste the error output in your post?
@niklaus520
Traceback (most recent call last):
  File "C:\flare-ida-master\MSDN_crawler\msdn_crawler.py", line 414, in <module>
    main()
  File "C:\flare-ida-master\MSDN_crawler\msdn_crawler.py", line 399, in main
    (file_counter, results) = parse_files(msdn_directory, tilib_exe, til_dir)
  File "C:\flare-ida-master\MSDN_crawler\msdn_crawler.py", line 372, in parse_files
    result = parse_file(os.path.join(root, file), const_enum)
  File "C:\flare-ida-master\MSDN_crawler\msdn_crawler.py", line 277, in parse_file
    return parse_new_style(file, content, const_enum)
  File "C:\flare-ida-master\MSDN_crawler\msdn_crawler.py", line 185, in parse_new_style
    constant_descriptions.append(strip_html(string.text.encode('ascii')).encode('utf-8'))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 64: ordinal not in range(128)
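The failure in the traceback above comes from encoding text that contains a non-breaking space (u'\xa0', the character that &nbsp; in an HTML page decodes to) with the ASCII codec. Here is a minimal sketch that reproduces it, written for Python 3 rather than the Python 2 the script uses. The sample string is made up for illustration.

```python
# Hypothetical sample text; "\xa0" is a non-breaking space.
text = "Value\xa0description"

# The ASCII codec only covers code points 0-127, so this raises.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Option 1: encode as UTF-8, which can represent every character.
utf8_bytes = text.encode("utf-8")

# Option 2: replace the non-breaking space with a plain space first,
# if strictly ASCII output is wanted.
ascii_bytes = text.replace("\xa0", " ").encode("ascii")

print(ascii_failed, utf8_bytes, ascii_bytes)
```

Which option fits depends on what the consumer of the generated XML expects. Since the surrounding code already calls .encode('utf-8') on the result, UTF-8 looks like the more natural choice here.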
@flypuma which lines did you change? How many files failed during processing?
@niklaus520 Just the lines you pasted, starting at line 176. If I don't change these lines, I get the same problem as in #2. I didn't note the number of failed files.
@niklaus520 I ran the original msdn_crawler.py and extracted about 33984 functions, close to your result. How can I get the file from your blog?
@flypuma http://blog.depressedmarvin.com/upload/2015/02/09/msdn_data_nn.xml
You can just wget it.
@niklaus520 Thanks a lot. Could you upload the file msdn_crawler.py?
http://blog.depressedmarvin.com/upload/2015/02/10/msdn_crawler.py
Now you can try my script and see whether there are still errors.
Then you can compare the two scripts; some lines may differ.
I get the following error when running the Python script annotate_IDB_MSDN. Please help me.
Traceback (most recent call last):
  File "C:/Program Files/IDAPro6.6/python/flare/annotate_IDB_MSDN.py", line 117, in on_ok_button
    IDB_MSDN_Annotator.main(config)
  File "C:/Program Files/IDAPro6.6/python/flare\IDB_MSDN_Annotator\__init__.py", line 523, in main
    functions_map = parse_xml_data_files(msdn_data_dir)
  File "C:/Program Files/IDAPro6.6/python/flare\IDB_MSDN_Annotator\__init__.py", line 486, in parse_xml_data_files
    additional_functions = xml_parser.parse(xml_file)
  File "C:/Program Files/IDAPro6.6/python/flare\IDB_MSDN_Annotator\xml_parser.py", line 283, in parse
    parser.parse(xmlfile)
  File "C:\Program Files\IDAPro6.6\lib\xml\sax\expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "C:\Program Files\IDAPro6.6\lib\xml\sax\xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "C:\Program Files\IDAPro6.6\lib\xml\sax\expatreader.py", line 211, in feed
    self._err_handler.fatalError(exc)
  File "C:\Program Files\IDAPro6.6\lib\xml\sax\handler.py", line 38, in fatalError
    raise exception
xml.sax._exceptions.SAXParseException: C:\Program Files\IDAPro6.6\python\flare\annotate_IDB_MSDN.py:1:2: not well-formed (invalid token)
Thank you very much!
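The SAXParseException above names annotate_IDB_MSDN.py at line 1, column 2, which suggests the XML parser was handed the Python script itself rather than an XML data file, possibly because the configured data directory points at the script folder. As a rough way to check a data directory, the sketch below (Python 3, standard library only) parses only the *.xml files and reports the ones that are not well-formed. The function name find_malformed_xml and the demo files are hypothetical and not part of flare-ida.

```python
import os
import tempfile
import xml.sax


def find_malformed_xml(data_dir):
    """Return (filename, message) for each *.xml file that fails to parse."""
    problems = []
    for name in sorted(os.listdir(data_dir)):
        if not name.lower().endswith(".xml"):
            continue  # skip scripts and other non-XML files
        path = os.path.join(data_dir, name)
        try:
            with open(path, "rb") as handle:
                # A bare ContentHandler ignores all events; we only
                # care whether parsing succeeds.
                xml.sax.parse(handle, xml.sax.ContentHandler())
        except xml.sax.SAXParseException as exc:
            problems.append((name, exc.getMessage()))
    return problems


# Demo on a throwaway directory with made-up files.
with tempfile.TemporaryDirectory() as demo_dir:
    samples = {
        "good.xml": b"<msdn><function/></msdn>",
        "bad.xml": b"import idaapi\n",
        "script.py": b"import idaapi\n",
    }
    for filename, payload in samples.items():
        with open(os.path.join(demo_dir, filename), "wb") as out:
            out.write(payload)
    problems = find_malformed_xml(demo_dir)

print([name for name, _ in problems])
```

In the demo, script.py is skipped because of its extension, while bad.xml is reported because its content is not XML.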
@niklaus520
If you change this statement, you can eliminate all the Unicode-related errors:
for string in parsed_html.findAll(width='60%'):
    try:
        constant_descriptions.append(strip_html(string.text.encode('ascii')).encode('utf-8'))
    except Exception, e:
        constant_descriptions.append(strip_html(string.text.encode('utf-8')))
Please add this to your code and upload it so everyone can download it.
Ivan
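The try/except fallback suggested above can be sketched in Python 3 as a small helper. The name encode_with_fallback is hypothetical. Because ASCII is a subset of UTF-8, both branches produce the same bytes as encoding with UTF-8 directly, so the fallback could arguably be reduced to a single .encode('utf-8') call.

```python
def encode_with_fallback(text):
    """Try ASCII first and fall back to UTF-8, mirroring the suggestion."""
    try:
        return text.encode("ascii")
    except UnicodeEncodeError:
        return text.encode("utf-8")


# Made-up sample strings: one pure ASCII, one with a non-breaking space.
samples = ["plain text", "Value\xa0description"]

fallback_results = [encode_with_fallback(s) for s in samples]
direct_results = [s.encode("utf-8") for s in samples]

# Both approaches yield identical bytes for every sample.
print(fallback_results == direct_results)
```

Catching UnicodeEncodeError specifically, rather than every Exception as in the snippet above, keeps unrelated bugs from being silently swallowed.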
@I-VANN Here is the code: http://blog.depressedmarvin.com/upload/2015/02/10/msdn_crawler.py and here is the data file: http://blog.depressedmarvin.com/upload/2015/02/09/msdn_data_nn.xml
I've already modified my copy of the file with my suggestion; I mentioned it only for the benefit of other people.
So if you think this change is acceptable, you can apply it to your modified file for everyone.
Thank you for your help.
@I-VANN cool, thanks for your suggestion
Closing this old issue. Please check if the following file works for you after unzipping it.
https://github.com/mr-tz/flare-ida/blob/master/MSDN_data/msdn_data.zip
Please reopen this issue if you need further assistance.