anupdhml / url_inspect
Emulate link previews from Google search
By anupdhml

Dependencies: bs4 (Beautiful Soup 4), lxml.
Install from pip/easy_install OR the SRC folder (has bs4)

A module meant to emulate Google's preview (or 'inspect') feature for search
results (see file google_feature.png)

# make_tables.py #############################################################

usage:
    python make_tables.py <url or html file>
    python make_tables.py -f <file with list of urls or html file paths>

Extract all the link sets from the urls/html provided. In batch mode (-f
option), also generate a file with a table of all the link labels, arranged by
their frequency over all the pages. The link labels are stemmed using the
Porter stemming algorithm.

# hij_inspect.py #############################################################

usage:
    python hij_inspect.py <table filename> <url or html file>

Use the table generated by make_tables.py to calculate a score for each link
set present in the specified url/html file. The link set with the best score is
the one shown in the search results, as a preview.

Make sure that the table file provided is appropriate for the url/html file
used (eg: you probably want a news table for a site like cnn)

# TESTING ####################################################################

For testing, try these:
    python make_tables.py testfiles/serafina_mod.html
    python make_tables.py http://www.nytimes.com/
    python make_tables.py -f testfiles/urls_test

    # file table_testfiles-urls_test is the result of the last command in the above list...
    python hij_inspect.py table_testfiles-urls_test http://www.cnn.com

Sample output of these commands can be seen in the SAMPLE_RUN.txt file
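
The two-step flow described above (build a label frequency table over many pages, then score the link sets of one page against it) can be illustrated with a minimal sketch. This is not the code in make_tables.py or hij_inspect.py. The function names are hypothetical, the scoring rule (mean table frequency of a set's labels) is an assumption, and a plain lowercase normalization stands in for Porter stemming so the example needs only the standard library.

```python
from collections import Counter


def normalize(label):
    # Stand-in for Porter stemming (assumption): lowercase and collapse whitespace.
    return " ".join(label.lower().split())


def build_label_table(pages):
    """Count label occurrences over all pages.

    pages: list of pages; each page is a list of link sets;
    each link set is a list of link label strings.
    """
    table = Counter()
    for link_sets in pages:
        for link_set in link_sets:
            table.update(normalize(label) for label in link_set)
    return table


def score_link_set(link_set, table):
    # Assumed scoring rule: mean table frequency of the labels in the set.
    if not link_set:
        return 0.0
    return sum(table[normalize(label)] for label in link_set) / len(link_set)


def best_link_set(link_sets, table):
    # The highest-scoring link set is the one that would be shown as the preview.
    return max(link_sets, key=lambda s: score_link_set(s, table))


# Hypothetical link sets, already extracted from two "training" pages.
pages = [
    [["World", "Politics", "Sports"], ["Privacy", "Terms"]],
    [["World", "Sports", "Opinion"], ["Terms"]],
]
table = build_label_table(pages)

# Link sets from the page to preview.
candidate = [["World", "Sports", "Weather"], ["Login", "Terms"]]
preview = best_link_set(candidate, table)
print(table.most_common())  # labels arranged by frequency
print(preview)
```

In this sketch the navigation-like set wins because its labels recur across the training pages, which is also why the table should come from pages similar to the one being inspected.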