Comments (6)
It looks like twitter-intact-stream does some manipulation to the JSON object it receives from Tweepy, so its output isn't in the format of the Tweet JSON specification supplied by Twitter. I'd recommend using twarc to rehydrate the tweets from a list of the IDs in the data.
Note that not all the information birdspotter requires is in the JSON supplied by twitter-intact-stream.
from birdspotter.
Thanks for your reply. Could you please be a bit more specific about how to prepare the input for birdspotter? As per your answer, birdspotter cannot use the output of twitter-intact-stream directly and it has to go through twarc, doesn't it? If possible, could you provide some examples?
On closer inspection, it looks like twitter-intact-stream does produce the correct format, assuming no post-processing has been done. I'll need to investigate further why one or more of the JSON objects don't have a "text" or "full_text" field.
It looks like the twitter-intact-stream output contains lines that indicate rate limiting, such as:
{"limit":{"track":283540,"timestamp_ms":"1483189188944"}}
Normally, birdspotter would ignore corrupted lines that aren't valid JSON, but this line is valid JSON.
The workaround at the moment is to filter out lines that look like the one above and then feed the result into birdspotter.
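As a rough sketch of that workaround, assuming the stream was saved with one JSON object per line, something like the following would drop the limit notices before the file goes to birdspotter. The function names are made up for illustration and are not part of birdspotter or twitter-intact-stream.

```python
import json


def is_limit_notice(line):
    """Return True if the line parses as a JSON object that carries a
    "limit" key and no tweet text."""
    try:
        obj = json.loads(line)
    except json.JSONDecodeError:
        # Not valid JSON: leave it alone, birdspotter already skips these.
        return False
    return (
        isinstance(obj, dict)
        and "limit" in obj
        and "text" not in obj
        and "full_text" not in obj
    )


def filter_limit_notices(lines):
    """Yield every non-blank line that is not a limit notice."""
    for line in lines:
        if line.strip() and not is_limit_notice(line):
            yield line


def clean_file(src_path, dst_path):
    """Copy src_path to dst_path without the limit notices (hypothetical helper)."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(filter_limit_notices(src))
```

Calling `clean_file("stream.jsonl", "stream_clean.jsonl")` (both file names are placeholders) would write a copy that can then be passed to birdspotter in the usual way.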
I think it would be better if birdspotter handled cases like this more gracefully. I'm going to put a temporary fix in so that it ignores objects without a text or full_text field.
I'll leave this open till that is implemented.
Rate limit messages are normal when using the streaming API; they give the number of tweets that were not delivered. You would expect them when using other Twitter API tools too, so it would be good if birdspotter filtered them out automatically.
db93307 in the development branch should fix this problem, at least temporarily. There should probably be a more robust check of the JSON object to verify its validity, but we'll wait till there is more interest.
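For anyone who needs a stricter check in the meantime, here is a minimal sketch of what one might look like. It is not what db93307 does. Requiring a "user" object with an "id_str" is an assumption about which fields matter, and the function name is hypothetical.

```python
def looks_like_tweet(obj):
    """Heuristic check that a parsed JSON value resembles a tweet object.

    Assumption: a usable tweet is a dict with a string "full_text" or
    "text" field and a "user" dict that has an "id_str".
    """
    if not isinstance(obj, dict):
        return False
    # Prefer "full_text" (extended tweets) and fall back to "text".
    text = obj.get("full_text", obj.get("text"))
    if not isinstance(text, str):
        return False
    user = obj.get("user")
    return isinstance(user, dict) and "id_str" in user
```

A rate-limit notice such as the one quoted earlier fails the first field check, since it has neither "text" nor "full_text".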