Twint core component; new lite version due to Twitter Legacy removal
twintproject / twint-ng
License: MIT License
I would like to start off by proposing (and agreeing on) a functional design of the new tool. I have already written something up and will post it in this issue over the next day or so.
I think the combination of a package and a command line tool is something we absolutely want to keep.
Below is my braindump. Please respond in the comments with what you think!
Output will be written to stdout (default) or file.
-o <filename> output to file
--csv Write as .csv format
--json Write as .jsonlines format (independent json object on every line)
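A minimal sketch of how the output options above could behave, assuming scraped Tweets are available as plain dicts. The function names `render` and `emit` and the field names in the usage note are hypothetical, not part of any agreed design.

```python
import csv
import io
import json
import sys


def render(records, fmt="json"):
    """Serialize an iterable of dicts as JSON Lines ("json") or CSV ("csv")."""
    records = list(records)
    out = io.StringIO()
    if fmt == "json":
        for record in records:
            # One independent JSON object per line.
            out.write(json.dumps(record, ensure_ascii=False) + "\n")
    elif fmt == "csv":
        if records:
            # Assumption: every record has the same keys as the first one.
            writer = csv.DictWriter(
                out, fieldnames=list(records[0]), lineterminator="\n"
            )
            writer.writeheader()
            writer.writerows(records)
    else:
        raise ValueError(f"unknown format: {fmt}")
    return out.getvalue()


def emit(records, fmt="json", filename=None):
    """Write to stdout by default, or to a file when -o <filename> is given."""
    text = render(records, fmt)
    if filename is None:
        sys.stdout.write(text)
    else:
        with open(filename, "w", encoding="utf-8", newline="") as handle:
            handle.write(text)
```

For example, `emit([{"id": 1, "text": "hi"}], fmt="csv")` would print a header row followed by one data row to stdout.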
Errors and informational messages will be output to stderr (default) or file.
-v enable verbose output (loglevel info)
-vv enable debug logging (loglevel debug)
-q disable error output completely (loglevel none)
-l <filename> logfile instead of stderr
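The logging flags above map fairly directly onto Python's `logging` levels. The sketch below is one possible wiring with `argparse`; the program name `twint-ng`, the logger name `twint_ng`, and the errors-only default are assumptions.

```python
import argparse
import logging
import sys


def build_parser():
    parser = argparse.ArgumentParser(prog="twint-ng")  # hypothetical program name
    # action="count" makes -v yield 1 and -vv yield 2.
    parser.add_argument("-v", dest="verbosity", action="count", default=0)
    parser.add_argument("-q", dest="quiet", action="store_true")
    parser.add_argument("-l", dest="logfile", metavar="FILENAME")
    return parser


def log_level(verbosity, quiet):
    """Translate the -v / -vv / -q flags into a logging level."""
    if quiet:
        # Above every standard level, so nothing is emitted.
        return logging.CRITICAL + 1
    if verbosity >= 2:
        return logging.DEBUG
    if verbosity == 1:
        return logging.INFO
    return logging.ERROR  # assumed default: errors only


def configure_logging(args):
    """Send log records to the file given with -l, or to stderr otherwise."""
    if args.logfile:
        handler = logging.FileHandler(args.logfile)
    else:
        handler = logging.StreamHandler(sys.stderr)
    logger = logging.getLogger("twint_ng")  # hypothetical logger name
    logger.handlers = [handler]
    logger.setLevel(log_level(args.verbosity, args.quiet))
    return logger
```

Keeping the level mapping in its own function makes it easy to reuse from the package side, where there is no command line.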
--count Display number of Tweets scraped at the end of session.
--stats Show number of replies, retweets, and likes.
The tool might need to be able to circumvent most measures taken by Twitter.
-ua <user agent>
-uafile <filename> (with ua strings, one per line, tool will rotate through them)
-proxy <proxyurl>
-proxyfile <filename> (with proxyurls, one per line, tool will rotate through them)
TBD rate limiting, for instance backoff exponent, min/max/random wait time
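To make the rotation and rate-limiting ideas above concrete, here is a rough sketch. Everything in it is an assumption for discussion: the function names, the parameter names (`base`, `exponent`, `max_wait`, `jitter`), and their defaults are hypothetical, since the rate-limiting options are still TBD.

```python
import itertools
import random


def load_rotation(lines):
    """Return an endless iterator over the non-empty, stripped lines.

    `lines` can be an open -uafile or -proxyfile, or any iterable of strings.
    """
    values = [line.strip() for line in lines if line.strip()]
    if not values:
        raise ValueError("rotation list is empty")
    return itertools.cycle(values)


def backoff_delay(attempt, base=1.0, exponent=2.0, max_wait=60.0, jitter=0.0):
    """Seconds to wait before retry number `attempt` (0-based).

    The delay grows as base * exponent ** attempt, is capped at max_wait,
    and gets a random extra wait of up to `jitter` seconds on top.
    """
    delay = min(max_wait, base * exponent ** attempt)
    return delay + random.uniform(0, jitter)
```

Each request would then call `next()` on the user agent and proxy rotations, and sleep for `backoff_delay(attempt)` after a failed or throttled response.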
I consolidated all command line args that have to do with searching and filtering. I think we need to keep the search params (i.e. those that send a different request to Twitter) and remove the filters (i.e. those that only remove things from the output). Filtering can be done by an external program.
Can someone with more internal knowledge maybe split these args into those two groups?
--to USERNAME Search Tweets to a user.
--all USERNAME Search all Tweets associated with a user.
--favorites Scrape Tweets a user has liked.
-nr, --native-retweets
Filter the results for retweets only.
--min-likes MIN_LIKES
Filter the tweets by minimum number of likes.
--min-retweets MIN_RETWEETS
Filter the tweets by minimum number of retweets.
--min-replies MIN_REPLIES
Filter the tweets by minimum number of replies.
--links LINKS Include or exclude tweets containing one or more links.
If not specified you will get tweets both with and
without links.
--source SOURCE Filter the tweets for specific source client.
--members-list MEMBERS_LIST
Filter the tweets sent by users in a given list.
-fr, --filter-retweets
Exclude retweets from the results.
--videos Display only Tweets with videos.
--images Display only Tweets with images.
--media Display Tweets with only images or videos.
--retweets Include user's Retweets (Warning: limited).
--email Filter Tweets that might have email addresses
--phone Filter Tweets that might have phone numbers
--verified Display Tweets only from verified users (Use with -s).
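As an illustration of filtering by an external program, the sketch below reproduces something like --min-likes on top of JSON Lines output. The field name `likes_count` is an assumption about the output schema, which has not been defined yet.

```python
import json


def filter_min_likes(lines, min_likes, field="likes_count"):
    """Yield records from JSON Lines input that have at least `min_likes` likes.

    `field` is a hypothetical key; records that lack it are treated as 0 likes.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        if record.get(field, 0) >= min_likes:
            yield record
```

Wrapped in a small script that reads `sys.stdin` and prints each surviving record with `json.dumps`, this could sit at the end of a shell pipeline after the scraper.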