py-web-search

Join the chat at https://gitter.im/rohithpr/py-web-search

A Python module to fetch and parse results from different search engines.

Warning: Do not send queries in rapid succession! The search engines' servers may block you.
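One way to respect this warning when running several queries is to enforce a minimum gap between calls. The following is a minimal sketch, not part of py-web-search: `throttled`, `min_interval`, and `fake_search` are hypothetical names introduced here, and `fake_search` is only a stand-in for a real call such as `Google.search`.

```python
import time


def throttled(search_fn, min_interval=2.0):
    """Wrap search_fn so consecutive calls start at least min_interval seconds apart."""
    last_call = None  # monotonic timestamp of the previous call, None before the first

    def wrapper(*args, **kwargs):
        nonlocal last_call
        if last_call is not None:
            remaining = min_interval - (time.monotonic() - last_call)
            if remaining > 0:
                time.sleep(remaining)  # wait out the rest of the interval
        last_call = time.monotonic()
        return search_fn(*args, **kwargs)

    return wrapper


def fake_search(query, num=10, start=0):
    """Hypothetical stub standing in for a real search function."""
    return {'query': query, 'num': num, 'start': start}


# A short interval keeps the demo quick; use a larger one for real queries.
polite_search = throttled(fake_search, min_interval=0.2)
started = time.monotonic()
responses = [polite_search(q, 5) for q in ('hello world', 'github', 'python')]
elapsed = time.monotonic() - started
```

The first call goes through immediately; each later call sleeps only for whatever part of the interval has not already passed.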

Related project

Use the search-api to get results in JSON format over HTTP requests. (Does not need Python.)

Search engines supported

Google and Bing (the two engines used in the usage examples below).

Installation

Requires Python 3. Install using pip:

    pip install py-web-search

Usage

Web search

    from pws import Google
    from pws import Bing

    print(Google.search('hello world', 5, 2))
    print(Bing.search('hello world', 5, 2))
    
    # Arguments:
    # search(query, num, start, sleep, recent)
    # query: Required. The keyword that will be searched.
    # num: Default 10. The number of results returned.
    # start: Default 0. The number of top results that are to be ignored.
    # sleep: Default True. If True, the program will wait for a second, when applicable, to avoid overwhelming the servers.
    # recent: Default None. The following values are allowed: 'h': hour, 'd': day, 'w': week, 'm': month and 'y': year. (Buggy)

Prints 5 results starting from the third result (the first 2 are skipped) in the following format.

    {
        'url': '...',
        'expected_num': 5,
        'received_num': 5,  # Lower than expected_num when there are not enough results
        'start': 2,
        'search_engine': 'google',
        'total_results': ...,
        'results':
        [
            {
                'link': '...',
                'link_text': '...',
                'link_info': '...',
                'related_queries': [...],
                'additional_links':
                {
                    linktext: link,
                    ...
                }
            },
            ...
        ]
    }
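The structure above can be consumed with ordinary dictionary access. The following is a minimal sketch, assuming the response is a plain dict in the documented shape: `summarize` is a hypothetical helper, and `sample` is hand-written, made-up data (containing only the keys the helper reads plus a few for context) rather than real search output.

```python
def summarize(response):
    """Return (shortfall, [(link_text, link), ...]) for a web search response."""
    # A positive shortfall means fewer results came back than were requested.
    shortfall = response['expected_num'] - response['received_num']
    pairs = [(r['link_text'], r['link']) for r in response['results']]
    return shortfall, pairs


# Made-up sample in the documented shape, for illustration only.
sample = {
    'expected_num': 5,
    'received_num': 2,
    'start': 2,
    'search_engine': 'google',
    'results': [
        {'link': 'https://example.com/a', 'link_text': 'Result A'},
        {'link': 'https://example.com/b', 'link_text': 'Result B'},
    ],
}

shortfall, pairs = summarize(sample)
for text, link in pairs:
    print(text, '->', link)
if shortfall > 0:
    print('Received', shortfall, 'fewer results than requested')
```

With a real response, pass the return value of `Google.search(...)` or `Bing.search(...)` in place of `sample`.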

News search

    from pws import Bing
    from pws import Google

    print(Bing.search_news('github', 10, 0, True, 'h'))
    print(Google.search_news('github', 10, 0, True, 'd'))
    
    # Arguments:
    # search_news(query, num, start, sleep, recent)
    # query: Required. The keyword that will be searched.
    # num: Default 10. The number of results returned.
    # start: Default 0. The number of top results that are to be ignored.
    # sleep: Default True. If True, the program will wait for a second, when applicable, to avoid overwhelming the servers.
    # recent: Default None. The following values are allowed: 'h': hour, 'd': day, 'w': week, 'm': month and 'y': year. (Buggy)

Prints 10 results starting from the first result (none are skipped) in the following format.

    {
        'url': '...',
        'num': 10,
        'start': 0,
        'search_engine': 'bing',
        'results':
        [
            {
                'link': '...',
                'link_text': '...',
                'link_info': '...',
                'source': '...',
                'time': '...',
                'additional_links': {},  # Always empty for Bing.
            },
            ...
        ]
    }
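News results carry `source` and `time` fields alongside the link, so they can be turned into one-line headlines. The following is a minimal sketch, assuming a response dict in the documented shape: `format_news` is a hypothetical helper, and `sample_news` is made-up data (the exact format of the `time` value is an assumption).

```python
def format_news(response):
    """Return one '[source, time] link_text <link>' line per news result."""
    lines = []
    for item in response['results']:
        # str.format ignores extra keys, so the whole result dict can be passed.
        lines.append('[{source}, {time}] {link_text} <{link}>'.format(**item))
    return lines


# Made-up sample in the documented shape, for illustration only.
sample_news = {
    'num': 10,
    'start': 0,
    'search_engine': 'bing',
    'results': [
        {
            'link': 'https://example.com/news/1',
            'link_text': 'GitHub ships a feature',
            'link_info': 'Short description of the story.',
            'source': 'Example News',
            'time': '2 hours ago',
            'additional_links': {},
        },
    ],
}

headlines = format_news(sample_news)
for line in headlines:
    print(line)
```

With a real response, pass the return value of `Bing.search_news(...)` or `Google.search_news(...)` in place of `sample_news`.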

Todo

  • Support for other search engines
  • Image search, etc.

Contribution

Feel free to add any features that you think might be useful.

Contributors

rohithpr, colwilson, gitter-badger
