Code Monkey home page Code Monkey logo

keywordfinder's Introduction

keywordFinder

Python script iterates through a series or URLs looking for the instances of a keyword provided by the user. It will return a new text file with all the times the keyword was found on each URL, along with a certain number of characters (also provided by user) before and after the keyword, to give an idea of the context that keyword was used on.


This scripts accepts 3 arguments over the command line:

1 - Text file containing the list of URLs

2 - Number of characters to be displayed

3 - Keyword being looked for (can be a phrase, as long as it is inside a "string")

The program will look on the text file for the list of URLs, go to each of them, and look for all the times the keyword is found on the page. Once the keyword is found, a fragment of the text in where it is found will be extracted from the website. The size of the fragment will be dictated by the number of characters specified on the command line. Half of the characters will go before the keyword and half after, returning a string of a length of the number of characters specified + the length of the keyword. The script will create a file called "output.txt" where it will print the URL, then on the next line print "FOUND" or "NOT FOUND", depending if the keyword was found at least once on the URL, and the string containing the keyword.

Example: python keywordFinder.py mypages.txt 200 "literature"

This will look on the "mypages.txt" file, which has the list of URLs, go to those websites and look for the word "literature". If it finds it it will return a string of 100 characters before the word "literature", then the actual word "literature", and after another 100 characters.

keywordfinder's People

Contributors

fabianhr8 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.