relevant-search-book's Introduction

Relevant Search

Code and Examples for Relevant Search by Doug Turnbull and John Berryman. Published by Manning Publications.

Relevant Search is all about leveraging Solr and Elasticsearch to build more intelligent search applications with intuitive results!

How to run

Install Python

Examples for this book are written in Python 2.7 and use IPython Notebook. The first thing you'll need to do is install Python and pip (the Python package installer).

  1. Install Python for your platform here. For Windows we recommend the ActivePython distribution.
  2. Install pip, the Python installer, by simply running easy_install pip

Install Elasticsearch

The examples expect Elasticsearch to be hosted at localhost:9200, so you'll need to install Elasticsearch to work with them. There are two ways to install Elasticsearch:

Recommended: Vagrant

Vagrant is a tool for installing and provisioning virtual machines locally for development purposes. If you've never used vagrant, you can follow the installation instructions here. OpenSource Connections maintains a basic Elasticsearch vagrant box here.

To use the vagrant box

  1. Install vagrant

  2. Clone the Elasticsearch vagrant box from Github locally

    git clone git@github.com:o19s/elasticsearch-vagrant.git
    
  3. Provision the Vagrant box (this installs Elasticsearch and turns the box on)

    cd elasticsearch-vagrant
    vagrant up --provision
    
  4. Confirm Elasticsearch is running

    curl -XGET http://localhost:9200

or visit this URL in your browser.

You should see JSON returned from the Elasticsearch instance. Something like:

   {
     "name" : "Mary Zero",
     "cluster_name" : "elasticsearch",
     "version" : {
       "number" : "2.0.0-rc1",
       "build_hash" : "4757962b01a4d837af282f90df9e1fbdb68b524e",
       "build_timestamp" : "2015-10-01T10:06:08Z",
       "build_snapshot" : false,
       "lucene_version" : "5.2.1"
     },
     "tagline" : "You Know, for Search"
   }

  5. When you're done working with examples, turn off the Vagrant box

    vagrant halt
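
The check in step 4 can also be scripted. The following is a minimal sketch rather than part of the book's code: it uses only the Python 3 standard library (an assumption, since the book's examples target Python 2.7 with requests), and the helper names `summarize_root_response` and `fetch_root` are hypothetical.

```python
import json
from urllib.request import urlopen

ES_URL = "http://localhost:9200"  # address the examples expect


def summarize_root_response(body):
    """Pull the version fields out of the JSON Elasticsearch returns at /."""
    info = json.loads(body)
    version = info["version"]
    return "Elasticsearch %s (Lucene %s)" % (
        version["number"], version["lucene_version"])


def fetch_root(url=ES_URL, timeout=5):
    """Fetch the root endpoint; raises URLError if nothing is listening."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `summarize_root_response(fetch_root())` while the box is up should produce a one-line summary; for the sample response shown above it would read `Elasticsearch 2.0.0-rc1 (Lucene 5.2.1)`.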

Locally on Your Machine

Follow Elasticsearch's instructions to install Elasticsearch on your machine.

Running The Python Examples

The examples are written in Python 2.7 as IPython notebooks and depend on only a few basic libraries. The only external library needed is the requests HTTP library. Some of the external APIs require API keys (for example, TMDB; you can obtain one here).

To run the IPython Notebook Examples

  1. First ensure you have git, python 2.7 and pip installed and in your PATH

  2. Then use the following commands to install the required dependencies

    git clone git@github.com:o19s/relevant-search-book.git
    cd relevant-search-book
    pip install requests
    pip install jupyter
    cd ipython/

  3. Launch!

    ipython notebook

  4. Play!

Switch to your default browser, where the IPython examples are ready for you to experiment with. Keep in mind that many examples are order dependent, so you can't just jump to an interesting listing and run it: indexing commands with particular settings need to be run first. Be sure to run the earlier IPython notebook commands too!

Happy Searching!

relevant-search-book's People

Contributors

jnbrymn-eb, softwaredoug

relevant-search-book's Issues

Error setting up Ipython NB on Mac OSX Yosemite

From Valentin

On my iMac (Yosemite), I went through the explanations on the GitHub wiki, and when starting ipython I got an error "No module named notebook.notebookapp". Googled and landed on SO at http://stackoverflow.com/questions/31397421/ipython-server-cant-launch-no-module-named-notebook-notebookapp and then ran "pip install jupyter" as advised. I then got "No module named functools32". I could finally start ipython after also running "pip install functools32".
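
Both errors above are failed imports, so they can be detected in one pass before launching the notebook. Here is a minimal sketch, assuming the module names quoted in this report; `missing_modules` and `REQUIRED` are hypothetical names, not part of the repository.

```python
import importlib


def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# Modules named in the errors above, plus requests from the README.
# functools32 is a Python 2 backport, so on Python 3 it is expected
# to show up as missing.
REQUIRED = ["requests", "notebook.notebookapp", "functools32"]
```

Running `print(missing_modules(REQUIRED))` in the same environment that starts ipython lists whatever still needs a `pip install`.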

Permission denied (publickey)

Getting permission denied message when cloning relevant-search-book.git. How can this be resolved?
See detailed message below:

$ git clone git@github.com:o19s/relevant-search-book.git
Cloning into 'relevant-search-book'...
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
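
This message generally means GitHub did not accept an SSH key from the machine doing the clone. SSH remotes need a key registered with a GitHub account, whereas a public repository can be cloned over HTTPS without one. As a rough sketch (the helper name `ssh_to_https` is hypothetical, and it assumes the remote uses the usual `git@host:owner/repo.git` form), the README's clone URL can be rewritten like this:

```python
import re

# Matches scp-style SSH remotes such as git@github.com:owner/repo.git
_SSH_REMOTE = re.compile(r"^git@(?P<host>[^:]+):(?P<path>.+)$")


def ssh_to_https(remote):
    """Rewrite an scp-style SSH remote as an HTTPS URL; other URLs pass through."""
    match = _SSH_REMOTE.match(remote)
    if match is None:
        return remote
    return "https://%s/%s" % (match.group("host"), match.group("path"))
```

Passing the rewritten URL to `git clone` avoids the key check; the alternative is to add an SSH key to the GitHub account and keep the original remote.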

Error in getting analyzer code running

Tried using the analyzer code from the chapter 3 examples and got the following error. I am unable to understand what's wrong. ES version is 6.5. Any ideas on how to solve this will be appreciated. Thanks!

error:
  root_cause:
  - type: "illegal_argument_exception"
    reason: "Failed to parse request body"
  type: "illegal_argument_exception"
  reason: "Failed to parse request body"
  caused_by:
    type: "json_parse_exception"
    reason: "Unrecognized token 'Fire': was expecting ('true', 'false' or 'null')\n
      \ at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@1d20d049;
      \ line: 1, column: 6]"
status: 400
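
The `json_parse_exception` indicates that Elasticsearch tried to read the request body as JSON and hit the bare word `Fire`, which is consistent with the text to analyze being sent as a raw string. Elasticsearch 6.x expects the `_analyze` body to be a JSON object with the text under a `text` key. Below is a minimal sketch of building such a request; the index name `tmdb`, the field name `title`, and the sample text are assumptions for illustration, and `build_analyze_request` is a hypothetical helper.

```python
import json


def build_analyze_request(index, field, text, host="http://localhost:9200"):
    """Return (url, body) for an _analyze call that sends the text as JSON."""
    url = "%s/%s/_analyze" % (host, index)
    # The text goes inside a JSON object instead of being the raw body.
    body = json.dumps({"field": field, "text": text})
    return url, body


# Hypothetical values; substitute the index, field and text from the listing.
url, body = build_analyze_request("tmdb", "title", "Fire with Fire")
```

The pair can then be sent with `requests.get(url, data=body, headers={"Content-Type": "application/json"})`; Elasticsearch 6.x also rejects request bodies that lack an explicit content type.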

TMDB ssl errors

Running the code in Appendix A gives errors about trusting SSL on my Ubuntu 14.04 machine.

If I run the following "hello world" script:

import requests
import os

# you'll need to have an API key for TMDB
# to run these examples,
# run export TMDB_API_KEY=<YourAPIKey>
tmdb_api_key = os.environ["TMDB_API_KEY"]
tmdb_api = requests.Session()
tmdb_api.params={'api_key': tmdb_api_key}

httpResp = tmdb_api.get('https://api.themoviedb.org/3/movie/top_rated')

I receive the error

(venv)doug@76$~/ws/relevant-search-book/ipython(ma) $ python tmdb_hello_world.py 
/home/doug/workspace/relevant-search-book/ipython/venv/local/lib/python2.7/site-packages/requests/packages/urllib3/util/ssl_.py:79: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
Traceback (most recent call last):
  File "tmdb_hello_world.py", line 11, in <module>
    httpResp = tmdb_api.get('https://api.themoviedb.org/3/movie/top_rated')
  File "/home/doug/workspace/relevant-search-book/ipython/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 476, in get
    return self.request('GET', url, **kwargs)
  File "/home/doug/workspace/relevant-search-book/ipython/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 464, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/doug/workspace/relevant-search-book/ipython/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 576, in send
    r = adapter.send(request, **kwargs)
  File "/home/doug/workspace/relevant-search-book/ipython/venv/local/lib/python2.7/site-packages/requests/adapters.py", line 431, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: [Errno 1] _ssl.c:510: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
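
The InsecurePlatformWarning in this output is urllib3 reporting that the interpreter's ssl module has no SSLContext, which is the case for Python 2 releases older than 2.7.9. Whether that alone explains the certificate failure is not established here, but it is a cheap thing to check first. The sketch below is one way to do so; `lacks_ssl_context` and `ssl_report` are hypothetical helper names.

```python
import ssl
import sys


def lacks_ssl_context(version_info):
    """True if the given Python version is older than 2.7.9."""
    return tuple(version_info[:3]) < (2, 7, 9)


def ssl_report():
    """Describe the running interpreter's SSL support in one line."""
    return "Python %d.%d.%d, %s, SSLContext available: %s" % (
        sys.version_info[0], sys.version_info[1], sys.version_info[2],
        ssl.OPENSSL_VERSION, hasattr(ssl, "SSLContext"))
```

If `lacks_ssl_context(sys.version_info)` is true, the commonly suggested remedies are upgrading Python to 2.7.9 or later, or installing the optional extras with `pip install requests[security]`.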

Typo in Listing 6.2.1

In listing 6.2.1, the index names in the comments are wrong:

# PUT albinoelphant/docs/1
# { "title":"albino", "body": "elephant"}
# PUT albinoelphant/docs/1
# { "title":"elephant", "body": "elephant"}

should be

# PUT albinoelephant/docs/1
              ^
# { "title":"albino", "body": "elephant"}
# PUT albinoelephant/docs/1
              ^
# { "title":"elephant", "body": "elephant"}

HTTP 429, too many requests

When looping through the top rated movies I get an HTTP 429 error, i.e. I performed too many requests in a given time. I was able to work around that by sleeping for a few seconds on every tenth request.

So basically I added

import time

and then

for page in range(1, numPages + 1):
    if page % 10 == 0:
        time.sleep(3)  # Sleep for 3 seconds every tenth request
    httpResp = tmdb_api.get('https://api.themoviedb.org/3/movie/top_rated', params={'page': page})  #(1)

But I am not sure if that's the best way to do it.
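
An alternative to a fixed pause is to retry only when a 429 actually arrives. The sketch below assumes the server may send a `Retry-After` header giving a delay in seconds and falls back to exponential backoff when it does not; `get_with_backoff` is a hypothetical helper, not part of the book's code.

```python
import time


def get_with_backoff(fetch, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it returns a non-429 response or attempts run out.

    fetch is any zero-argument callable returning an object with
    status_code and headers, such as a requests response.
    """
    resp = fetch()
    for attempt in range(max_attempts - 1):
        if resp.status_code != 429:
            break
        # Prefer the server's hint; otherwise back off exponentially.
        try:
            delay = float(resp.headers.get("Retry-After"))
        except (TypeError, ValueError):
            delay = base_delay * (2 ** attempt)
        sleep(delay)
        resp = fetch()
    return resp
```

Inside the loop above this could be used as `httpResp = get_with_backoff(lambda: tmdb_api.get('https://api.themoviedb.org/3/movie/top_rated', params={'page': page}))`, so the script only waits when the API asks it to.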

Can't connect to Elasticsearch on port 9200 - get "connection refused"

Hello, I followed the setup instructions in the README, installing Elasticsearch with Vagrant and VirtualBox (all on a MacBook). When going through the setup instructions, I did the following (from the command line):

vagrant up --provision

[I can see the VirtualBox virtual machine start up]

curl -XGET http://localhost:9200

I get:
curl: (7) Failed to connect to localhost port 9200: Connection refused

This used to work for me (as well as going to localhost:9200 in Chrome).

Any tips for troubleshooting this error? Thanks
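
"Connection refused" means nothing on the host accepted the TCP connection on port 9200. Possible causes include Elasticsearch not having finished starting inside the VM, or the port not being forwarded to the host. One way to tell a slow start from a persistent failure is to poll the port for a while; the sketch below does that with the standard library, and the names `port_is_open` and `wait_for_port` are hypothetical.

```python
import socket
import time


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except socket.error:
        return False
    conn.close()
    return True


def wait_for_port(host, port, attempts=30, delay=1.0):
    """Poll until the port accepts connections or attempts run out."""
    for _ in range(attempts):
        if port_is_open(host, port):
            return True
        time.sleep(delay)
    return False
```

If `wait_for_port("localhost", 9200)` still returns False after `vagrant up --provision`, the problem is more likely in the box's port forwarding or in Elasticsearch failing to start than in timing.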

Steps to run ipython notebook or python scripts for setup of TMDB

Hi. I've just received the book and am working through examples. I'm comfortable with elasticsearch, but have never used python. I've followed the readme for setting up ipython notebook (now jupyter), but can't find out how to run the cells. I've also tried to use the python directly as scripts, but that is also elusive at the moment. Since time is of the essence, might you have a set of steps to run and verify the setup of the tmdb data? I tried to use _bulk directly in elastic, but the format isn't compatible. Any help greatly appreciated.
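
On the `_bulk` format: the bulk API expects newline-delimited JSON, with an action line followed by a source line for each document and a newline at the very end. Here is a minimal sketch of producing that body, assuming the movies are held in a dict keyed by id and that the Elasticsearch version in use still accepts `_type` (both assumptions); `to_bulk_body` is a hypothetical helper.

```python
import json


def to_bulk_body(docs, index, doc_type):
    """Build a _bulk request body: one action line plus one source line per doc."""
    lines = []
    for doc_id, doc in docs.items():
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"
```

The resulting string can be posted to `http://localhost:9200/_bulk`. As for the notebooks themselves, a cell is run by selecting it and pressing Shift+Enter, working from the top of the notebook down.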
