
flight_scraper's Introduction

Hi there!

About Me

  • I'm passionate about computers, security, compilers, open source and technology in general
  • I love to hack
  • I did my PhD at Columbia University
  • I'm always up for a tech talk

Contact

See bio. Feel free to contact me at any time!

flight_scraper's People

Contributors

joeyo, nickruiz, rockg


flight_scraper's Issues

Add daily notifications for saved search criteria

My main reason for implementing the calendar search is that I'm looking for a flight with very flexible dates and want a daily notification of the prices of all flights. I looked around and did not see any functionality like this. I think it will require a few things:

  1. Saved search criteria, so that the search runs every day and sends out the notification.
  2. Saved email account information for sending the email.
  3. A way to modify search criteria, including turning a search on or off once the flight no longer needs to be searched.
  4. Command-line support (already on your list), since the main interaction for this would not be through the web page.
  5. A way to automate the daily run: a cron job or something else?

All this sounds relatively straightforward. What do you think?
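The items above could be sketched roughly as follows. This is a minimal illustration, not the project's code: `SavedSearch`, `build_notification`, `daily_run`, the field names, and the sender address are all hypothetical, and the price lookup is passed in as a callable so the sketch does not depend on the scraper itself.

```python
from dataclasses import dataclass
from datetime import date
from email.message import EmailMessage


@dataclass
class SavedSearch:
    """Hypothetical saved search; the field names are illustrative only."""
    origin: str
    destination: str
    depart: date
    return_date: date
    recipient: str
    enabled: bool = True  # item 3: switch a search off without deleting it


def build_notification(search, prices, sender="scraper@example.com"):
    """Build the daily email for one saved search.

    `prices` maps a label (for example a carrier code) to a fare.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = search.recipient
    msg["Subject"] = "Fares %s-%s, %s to %s" % (
        search.origin,
        search.destination,
        search.depart.isoformat(),
        search.return_date.isoformat(),
    )
    # Cheapest fare first, one fare per line.
    ordered = sorted(prices.items(), key=lambda item: item[1])
    lines = ["%s: %.2f" % (label, price) for label, price in ordered]
    msg.set_content("\n".join(lines) or "No fares found today.")
    return msg


def daily_run(searches, fetch_prices):
    """Return one message per enabled search.

    `fetch_prices` is supplied by the caller and takes a SavedSearch.
    """
    return [build_notification(s, fetch_prices(s)) for s in searches if s.enabled]
```

The returned messages could then be handed to `smtplib.SMTP.send_message`, and a crontab entry such as `0 7 * * * python notify.py` (script name hypothetical) would be one way to cover item 5.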

Matrix requests changed

It seems like there has been an overhaul of the ITA Matrix search request format. The following request, taken from an earlier closed issue, used to work on www.hurl.it but now returns an error:

Follow redirects: On
POST

Headers---
Host: matrix.itasoftware.com
Content-Type:application/x-www-form-urlencoded
Cache-Control: no-cache
Content-Length: 0

Parameters---
name: specificDatesSlice
summarizers: sliceSelections,carrierStopMatrixSlice,currencyNotice,solutionListSlice,priceSliderSlice,carrierListSlice,departureTimeRangesSlice,arrivalTimeRangesSlice,durationSliderSlice,originsSlice,destinationsSlice,stopCountListSlice,warningsSlice
format: JSON
inputs: {"slices":[{"origins":["SEA"],"originPreferCity":true,"destinations":["NYC"],"destinationPreferCity":true,"date":"2014-09-19","isArrivalDate":false,"dateModifier":{"minus":0,"plus":0},"timeRanges":[{"min":"17:00","max":"21:00"},{"min":"21:00","max":"23:59"}]},{"destinations":["SEA"],"destinationPreferCity":true,"origins":["NYC"],"originPreferCity":true,"date":"2014-09-26","isArrivalDate":false,"dateModifier":{"minus":0,"plus":0},"timeRanges":[{"min":"17:00","max":"21:00"},{"min":"21:00","max":"23:59"}]}],"pax":{"adults":1},"cabin":"COACH","changeOfAirport":true,"checkAvailability":true,"sliceIndex":0,"page":{"size":30},"sorts":"default"}

The error returned is

500 Server Error

The server encountered an error and could not complete your request.

Capturing the search request from the new ITA Matrix shows that the JSON in the request now looks something like this:

{"method":"search","params":{"2":["carrierStopMatrix","currencyNotice","solutionList","itineraryPriceSlider","itineraryCarrierList","itineraryDepartureTimeRanges","itineraryArrivalTimeRanges","durationSliderItinerary","itineraryOrigins","itineraryDestinations","itineraryStopCountList","warningsItinerary"],"3":{"4":{"1":1,"2":30},"5":{"1":1},"7":[{"3":["NYC"],"5":["LHR"],"8":"2015-01-10","9":1,"11":0},{"3":["LHR"],"5":["NYC"],"8":"2015-01-17","9":0,"11":1}],"8":"COACH","9":1,"10":1,"15":"SUNDAY","22":"default"},"4":"specificDates"}}
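For experimenting with the new format, the captured request can be turned into a small builder. This is only a sketch: the function name `build_search_payload` is hypothetical, and the reading of the numeric keys inside each slice ("3" as origins, "5" as destinations, "8" as the date) is a guess made by comparing the capture with the old request. Every other value is copied unchanged from the capture because its meaning is unknown.

```python
import json

# Summarizer names copied verbatim from the captured request.
SUMMARIZERS = [
    "carrierStopMatrix", "currencyNotice", "solutionList",
    "itineraryPriceSlider", "itineraryCarrierList",
    "itineraryDepartureTimeRanges", "itineraryArrivalTimeRanges",
    "durationSliderItinerary", "itineraryOrigins",
    "itineraryDestinations", "itineraryStopCountList",
    "warningsItinerary",
]


def build_search_payload(origin, destination, depart, return_date):
    """Fill the captured request with a different round trip.

    Only airports and dates vary. The key meanings are inferred, not
    documented, so treat the result as an experiment.
    """
    return {
        "method": "search",
        "params": {
            "2": list(SUMMARIZERS),
            "3": {
                "4": {"1": 1, "2": 30},  # copied as-is; 30 may be the page size
                "5": {"1": 1},           # copied as-is; may be the passenger count
                "7": [
                    # Outbound slice; "9" and "11" are copied without interpretation.
                    {"3": [origin], "5": [destination], "8": depart, "9": 1, "11": 0},
                    # Return slice, with the airports swapped.
                    {"3": [destination], "5": [origin], "8": return_date, "9": 0, "11": 1},
                ],
                "8": "COACH",
                "9": 1,
                "10": 1,
                "15": "SUNDAY",
                "22": "default",
            },
            "4": "specificDates",
        },
    }


if __name__ == "__main__":
    payload = build_search_payload("NYC", "LHR", "2015-01-10", "2015-01-17")
    print(json.dumps(payload, separators=(",", ":")))
```

Called with the airports and dates from the capture, the builder reproduces the captured JSON, which gives a baseline before changing any field.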

Flights in separate mongodb collection?

I'm making some additions to your codebase and I noticed that only one "Document" is stored in the DB: Solution. I have been weighing the tradeoffs of eliminating flight redundancy by adding a flight "Document" as well. I'm new to mongoengine. I was about to store flights separately, but I think it might affect other areas of your code.

What are your thoughts?
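To make the tradeoff concrete, here is a plain-Python sketch of what storing flights separately would mean. It does not use mongoengine, and the field names (`flights`, `carrier`, `number`, `departure`, `price`) are hypothetical stand-ins rather than the project's actual Solution schema.

```python
def split_flights(solutions):
    """Separate embedded flights from their solutions.

    Returns (flights, slim_solutions): each distinct flight is stored
    once, and every solution keeps only the keys of its flights.
    """
    flights = {}
    slim_solutions = []
    for solution in solutions:
        keys = []
        for flight in solution["flights"]:
            # Assumed natural key for a flight; adjust to the real schema.
            key = (flight["carrier"], flight["number"], flight["departure"])
            flights.setdefault(key, flight)  # keep the first copy only
            keys.append(key)
        slim_solutions.append({"price": solution["price"], "flight_keys": keys})
    return flights, slim_solutions
```

In mongoengine terms, the embedded form would correspond to an `EmbeddedDocument` inside Solution and the split form to a separate Document pointed to by a `ReferenceField`. The split form removes duplicates, but any code that reads flights directly off a Solution would then need an extra lookup.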

No JSON object could be decoded

Hi, I am trying out your app and I think it is very interesting. However, I hit the following issue when I search for flights. With web_app.py running correctly, I set up a valid search in ITA Matrix and the web app returns the error below. Could you help me solve this error?

Thanks

Traceback (most recent call last):
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1836, in __call__
    return self.wsgi_app(environ, start_response)
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1820, in wsgi_app
    response = self.make_response(self.handle_exception(e))
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1403, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1817, in wsgi_app
    response = self.full_dispatch_request()
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1477, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1381, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1475, in full_dispatch_request
    rv = self.dispatch_request()
  File "/Library/Python/2.7/site-packages/flask/app.py", line 1461, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/Users/jimenezster/workspaces/flight_scraper/web_app.py", line 74, in flight_query
    v = [d[0].isoformat(), d[1].isoformat(), flight_scraper.search_flights()]
  File "/Users/jimenezster/workspaces/flight_scraper/flight_scraper/scraper.py", line 19, in search_flights
    return ita_driver.build_solutions()
  File "/Users/jimenezster/workspaces/flight_scraper/flight_scraper/engines/ita_matrix/driver.py", line 82, in build_solutions
    response_json = json.loads(response.text[4:])
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 326, in loads
    return _default_decoder.decode(s)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 360, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 378, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
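The traceback ends in `json.loads(response.text[4:])`, so the body after the first four characters was not valid JSON. A defensive wrapper like the one below would at least show what the server sent back. It is a sketch only: `parse_matrix_response` is a hypothetical helper, not part of driver.py, and the four-character prefix is taken from the traceback without knowing what it contains.

```python
import json


def parse_matrix_response(text, prefix_len=4):
    """Decode a response body, skipping a fixed-length prefix.

    prefix_len=4 mirrors `response.text[4:]` in the traceback. If the
    remainder is not JSON, raise an error that includes the start of
    the raw body so an HTML error page is easy to spot.
    """
    body = text[prefix_len:]
    try:
        return json.loads(body)
    except ValueError:
        snippet = text[:200]
        raise ValueError("Response was not JSON; body starts with: %r" % snippet)
```

If the server answers with an error page instead of search results, the message will now contain the beginning of that page rather than only "No JSON object could be decoded".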

Is the ita_scrapper still working?

Hi,

I am trying to use the approach in your scraper to fetch fares.
I tested the request below at http://www.hurl.it/
using POST:

http://matrix.itasoftware.com/xhr/shop/search?name=specificDates&inputs={"slices":[{"origins":["BOS"],"originPreferCity":true,"destinations":["SFO"],"destinationPreferCity":false,"date":"2014-09-10","isArrivalDate":false,"dateModifier":{"minus":0,"plus":0}},{"destinations":["BOS"],"destinationPreferCity":true,"origins":["SFO"],"originPreferCity":false,"date":"2014-09-18","isArrivalDate":false,"dateModifier":{"minus":0,"plus":0}}],"pax":{"adults":1},"cabin":"COACH","changeOfAirport":true,"checkAvailability":true,"page":{"size":30},"sorts":"default"}&summarizers=carrierStopMatrix,currencyNotice,solutionList,itineraryPriceSlider,itineraryCarrierList,itineraryDepartureTimeRanges,itineraryArrivalTimeRanges,durationSliderItinerary,itineraryOrigins,itineraryDestinations,itineraryStopCountList,warningsItinerary&format=JSON

with these headers:

Host matrix.itasoftware.com
Content-Type application/x-www-form-urlencoded
Cache-Control no-cache
Content-Length 0

But I got a response like:

Alternate-Protocol: 80:quic
Content-Length: 0
Content-Type: text/plain; charset=UTF-8
Date: Tue, 19 Aug 2014 14:29:33 GMT
Location: http://matrix.itasoftware.com/xhr/shop/search?name=specificDates&summarizers=carrierStopMatrix%2CcurrencyNotice%2CsolutionList%2CitineraryPriceSlider%2CitineraryCarrierList%2CitineraryDepartureTimeRanges%2CitineraryArrivalTimeRanges%2CdurationSliderItinerary%2CitineraryOrigins%2CitineraryDestinations%2CitineraryStopCountList%2CwarningsItinerary&format=JSON&inputs=%7B%22slices%22%3A%5B%7B%22origins%22%3A%5B%22BOS%22%5D%2C%22originPreferCity%22%3Atrue%2C%22destinations%22%3A%5B%22SFO%22%5D%2C%22destinationPreferCity%22%3Afalse%2C%22date%22%3A%222014-09-10%22%2C%22isArrivalDate%22%3Afalse%2C%22dateModifier%22%3A%7B%22minus%22%3A0%2C%22plus%22%3A0%7D%7D%2C%7B%22destinations%22%3A%5B%22BOS%22%5D%2C%22destinationPreferCity%22%3Atrue%2C%22origins%22%3A%5B%22SFO%22%5D%2C%22originPreferCity%22%3Afalse%2C%22date%22%3A%222014-09-18%22%2C%22isArrivalDate%22%3Afalse%2C%22dateModifier%22%3A%7B%22minus%22%3A0%2C%22plus%22%3A0%7D%7D%5D%2C%22pax%22%3A%7B%22adults%22%3A1%7D%2C%22cabin%22%3A%22COACH%22%2C%22changeOfAirport%22%3Atrue%2C%22checkAvailability%22%3Atrue%2C%22page%22%3A%7B%22size%22%3A30%7D%2C%22sorts%22%3A%22default%22%7D&cc
Server: Apache
Set-Cookie: PREF="ID=893bb06e583439e19fc9ab4a5096d4e5ba047dfa7df8c2c5fd841d7abe95ac56e54bd54689c1e3b0aff4edbddcfb3a5ca581d7ade8ba772efdeb8d833d10b453:TM=1408458573:S=6HGECk7Jzb068IDO"; Version=1

Do you know if there is something wrong with my POST request?

Thanks a lot.
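The `Location` header in the response carries the same query string in percent-encoded form, whereas the request above put raw JSON in the URL. One thing worth ruling out is the encoding, by building the URL programmatically. The sketch below assumes the endpoint and parameter names shown in the request; `build_search_url` is a hypothetical helper, and this may or may not be the cause of the empty response.

```python
import json
from urllib.parse import urlencode

# Endpoint copied from the request above.
BASE_URL = "http://matrix.itasoftware.com/xhr/shop/search"


def build_search_url(inputs, summarizers, name="specificDates"):
    """Return the search URL with every query parameter percent-encoded.

    `inputs` is the search dictionary (slices, pax, cabin, ...), and
    `summarizers` is a list of summarizer names.
    """
    query = urlencode({
        "name": name,
        "summarizers": ",".join(summarizers),
        "format": "JSON",
        # Compact separators keep the JSON free of spaces.
        "inputs": json.dumps(inputs, separators=(",", ":")),
    })
    return BASE_URL + "?" + query
```

With `requests`, passing the same dictionary as `params=` performs this encoding automatically, so the manual step is mainly useful for tools like hurl.it that take a raw URL.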
