Comments (5)

eh-93 commented on May 22, 2024

My thoughts below:

1) The package & CLI options are a good way of exposing Twint's APIs
Agreed - the combination is enough to allow the majority of use cases
2) Propose to remove databases, ES, translations
Agreed - we should keep this package lean and purpose-specific
3) Do we really need all the async stuff?
Synchronous requests should suffice
4) Python3 only
Definitely :)

Output should be a file or stdout
The module should expose a generator interface which can iterate between requests - this will allow "streaming" of results
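A rough sketch of what that generator interface could look like, to make the "streaming" idea concrete. Everything named here is hypothetical and not taken from Twint: `iter_tweets`, `write_tweets`, the `fetch_page` callable, and the page shape (a list of tweet dicts plus a cursor for the next page) are assumptions for illustration only. The actual request is left to whatever callable is passed in, so the sketch makes no claim about URLs or parsing.

```python
import json
import sys
from typing import Callable, Iterable, Iterator, Optional, TextIO, Tuple

# Assumed page shape: (list of tweet dicts, cursor for the next page or None).
Page = Tuple[list, Optional[str]]


def iter_tweets(fetch_page: Callable[[Optional[str]], Page],
                limit: Optional[int] = None) -> Iterator[dict]:
    """Yield tweets one at a time, requesting the next page only when needed."""
    cursor = None
    yielded = 0
    while limit is None or yielded < limit:
        # One synchronous request per page (hypothetical callable).
        tweets, cursor = fetch_page(cursor)
        for tweet in tweets:
            yield tweet
            yielded += 1
            if limit is not None and yielded >= limit:
                return
        # Stop when the server reports no further page or an empty page.
        if cursor is None or not tweets:
            return


def write_tweets(tweets: Iterable[dict], out: Optional[TextIO] = None) -> int:
    """Write tweets as JSON lines to a file object, or stdout if none is given."""
    stream = sys.stdout if out is None else out
    count = 0
    for tweet in tweets:
        stream.write(json.dumps(tweet) + "\n")
        count += 1
    return count
```

Because the generator is lazy, a caller can stop after the first few results without triggering further requests, and the same iterator can feed either a file or stdout.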

The CLI options all make sense

I will set up the project scaffold this weekend to get us started

from twint-ng.

o7n commented on May 22, 2024

I think you forgot to push the branch.
We can port all the logic concerning which URLs to use and which HTML elements to look at. But since we're going to use (synchronous) Requests, I don't think there is any use in porting the entire scraper.

In the tests I ran this weekend, I did not find any reason to do things like rotating user agents. So we should keep the code very simple, only adding bells and whistles when we really need them.

eh-93 commented on May 22, 2024

Scaffolding done in a separate branch - let me know what you guys think of the tooling choices

How much is portable from the current Twint package? I would assume the scraper can be moved across

o7n commented on May 22, 2024

> The module should expose a generator interface which can iterate between requests - this will allow "streaming" of results

Yeah, second that, that would be really neat.

pielco11 commented on May 22, 2024

Totally agree 🎉
