
Harry Potter and the Elasticsearch Engine

This project covers search use cases on Harry Potter text databases, with a focus on Python integrations.

Phase 1: Intro to Elasticsearch (Notebooks 0 --> 4)

Create an index where each document is a Harry Potter character with their attributes. This index can then be used to build customized search queries that identify subsets of characters with particular properties.

This example project covers the basic introductory concepts of Elasticsearch and Kibana.
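As a rough illustration of such a customized query, the sketch below builds an Elasticsearch `bool` query body as a plain Python dictionary. The field names `house`, `species`, and `name` are assumptions for illustration; the actual attribute names depend on the character dataset, and the `term` filters assume those fields are mapped as `keyword`.

```python
from typing import Any, Dict, List, Optional


def build_character_query(house: Optional[str] = None,
                          species: Optional[str] = None,
                          name_text: Optional[str] = None) -> Dict[str, Any]:
    """Build a query body selecting a subset of characters.

    All field names here are hypothetical examples.
    """
    # Exact-value conditions go in "filter": they do not affect scoring.
    filters: List[Dict[str, Any]] = []
    if house is not None:
        filters.append({"term": {"house": house}})
    if species is not None:
        filters.append({"term": {"species": species}})

    # Full-text conditions go in "must": they contribute to the score.
    must: List[Dict[str, Any]] = []
    if name_text:
        must.append({"match": {"name": name_text}})

    # With no conditions at all, fall back to matching every document.
    if not filters and not must:
        return {"match_all": {}}

    bool_query: Dict[str, Any] = {}
    if must:
        bool_query["must"] = must
    if filters:
        bool_query["filter"] = filters
    return {"bool": bool_query}


query = build_character_query(house="Gryffindor", name_text="Weasley")
print(query)
```

The resulting dictionary can be pasted into the Kibana console as the `query` part of a `_search` request, or passed as the `query` argument of the Python client's `search` method.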

Phase 2: Notebooks 5 --> 9

Introduce the Python client to communicate with the Elasticsearch engine via code. Create an index from the first Harry Potter movie script to use for more complex, natural language queries. Use Hugging Face models to add Sentiment Analysis and Embeddings for Semantic Search. Combine multiple models for hybrid search; compare to the native functionality of ELSER (knn Search).
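The text does not say how the results of several models are combined for hybrid search. One common option is reciprocal rank fusion (RRF), sketched below in plain Python as an assumption rather than as the method used in the notebooks. The document IDs and the constant `k=60` are illustrative.

```python
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1 for the top hit.
    """
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists: one from a keyword query, one from a semantic query.
lexical_hits = ["d1", "d2", "d3"]
semantic_hits = ["d3", "d1", "d4"]

fused = reciprocal_rank_fusion([lexical_hits, semantic_hits])
print(fused)
```

Documents that rank well in both lists (`d1` and `d3` here) rise above documents that appear in only one of them.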

Phase 3: Files 11

Build a simple Flask app as a user interface for search. Introduce a new index to store historical searches as they are run; we can use this for observability & tracking. Separate out some helper_functions that we can reuse.
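A minimal sketch of what a stored search record could look like is shown below. The function name `build_search_log` and the fields `query`, `hit_count`, and `took_ms` are hypothetical; the actual helper functions and document layout in the project may differ.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def build_search_log(query_text: str,
                     hit_count: int,
                     took_ms: int,
                     now: Optional[datetime] = None) -> Dict[str, Any]:
    """Build one document describing a search that was just run.

    The field names are illustrative assumptions, not the project's schema.
    """
    timestamp = now or datetime.now(timezone.utc)
    return {
        "@timestamp": timestamp.isoformat(),  # ISO 8601 string, parseable as a date field
        "query": query_text,
        "hit_count": hit_count,
        "took_ms": took_ms,
    }


log_doc = build_search_log(
    "who found the stone",
    hit_count=3,
    took_ms=12,
    now=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)
print(log_doc)
```

In a Flask route, a document like this could be indexed right after the search call returns, so that every user query leaves a record for later dashboards.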

Implemented features and planned additions

  • HP characters index & search
  • HP characters index - python client interface for search
  • HP sentiment analysis on movie subtitles
  • Embeddings and semantic search with ELSER
  • Python-DSL client
  • Web APP with Flask
  • Observability & Monitoring

Watch the video

Setup Environment

Requirements: an installation of Elasticsearch (either local or in the cloud); see docs.

For the Python environment, setting up a virtual environment is recommended; see docs. Requirements: pandas.

Harry Potter Characters Index | Intro to Elasticsearch

Python notebook for some essential data cleaning with pandas dataframes.

Instructions for adding data to the Elastic cluster.

Short intro to Dashboards and visualizations in Kibana.

Short intro to Discover and KQL.

Working with Console / dev tools, intro to data types in elastic.

Building requests and intro to queries.
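To make the data types topic above more concrete, here is a small sketch of an index mapping written as a Python dictionary. The field names and types are assumptions chosen for illustration; the real character index may use different attributes.

```python
from typing import Any, Dict, List

# Hypothetical mapping for a characters index.
# "text" fields are analyzed for full-text search; "keyword" fields hold exact
# values suited to filtering and aggregations.
CHARACTER_MAPPINGS: Dict[str, Any] = {
    "properties": {
        "name": {"type": "text", "fields": {"keyword": {"type": "keyword"}}},
        "house": {"type": "keyword"},
        "species": {"type": "keyword"},
        "year_of_birth": {"type": "integer"},
        "alive": {"type": "boolean"},
    }
}


def fields_of_type(mappings: Dict[str, Any], es_type: str) -> List[str]:
    """Return the sorted top-level field names mapped to the given type."""
    return sorted(
        name
        for name, spec in mappings["properties"].items()
        if spec.get("type") == es_type
    )


print(fields_of_type(CHARACTER_MAPPINGS, "keyword"))
```

A dictionary of this shape matches the `mappings` section of a create-index request in the Kibana console.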

Harry Potter Movie Dialogue Index | Intro to Elasticsearch Python Client

Working with the python client to build an index and mapping, bulk ingest documents, and run queries.
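The bulk ingest step can be sketched as a generator of actions in the dictionary format accepted by `elasticsearch.helpers.bulk`. The index name `hp_dialogue`, the row fields, and the sample rows are illustrative assumptions; the code below only builds the actions and does not contact a cluster.

```python
from typing import Any, Dict, Iterable, Iterator


def generate_actions(index_name: str,
                     rows: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield one bulk action per row, using the row position as document ID."""
    for position, row in enumerate(rows):
        yield {
            "_index": index_name,
            "_id": position,
            "_source": row,
        }


# Illustrative sample rows; a real run would read them from the script dataset.
sample_rows = [
    {"character": "Hagrid", "dialogue": "You're a wizard, Harry."},
    {"character": "Harry", "dialogue": "I'm a what?"},
]

actions = list(generate_actions("hp_dialogue", sample_rows))
print(actions[0])
```

With a connected client, the generator could be handed directly to `helpers.bulk(client, generate_actions(...))`, which consumes the actions lazily without building the full list in memory.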

6 TBD - Elasticsearch Python DSL Client

Use the Eland client to import models from Hugging Face and run Sentiment Analysis on the data

Create embeddings for semantic (natural language) search

Compare with the ELSER model built by Elastic
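For the embedding-based search in the list above, a request typically carries a query vector and asks for the nearest stored vectors. The sketch below builds such a request body following the `knn` option of the Elasticsearch 8.x search API. The field name `dialogue_embedding` and the toy three-dimensional vector are assumptions; a real query vector would come from the same embedding model used at index time.

```python
from typing import Any, Dict, Sequence


def build_knn_search(query_vector: Sequence[float],
                     field: str = "dialogue_embedding",
                     k: int = 5,
                     num_candidates: int = 50) -> Dict[str, Any]:
    """Build a kNN search body for a dense vector field.

    `field` is a hypothetical name for the field holding the embeddings.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if num_candidates < k:
        raise ValueError("num_candidates must be greater than or equal to k")
    return {
        "knn": {
            "field": field,
            "query_vector": [float(x) for x in query_vector],
            "k": k,                            # number of nearest neighbours to return
            "num_candidates": num_candidates,  # candidates examined per shard
        }
    }


# Toy vector for illustration only; real embeddings have hundreds of dimensions.
body = build_knn_search([0.1, 0.2, 0.3], k=3, num_candidates=20)
print(body)
```

A larger `num_candidates` generally trades speed for recall, so it is a natural parameter to vary when comparing against ELSER results.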

See blog for our Advent Calendar here

Phase 3

Using Flask for a simple user interface allowing users to search (for a live demo). Added historical query tracking in a new index to later use for observability.

Contributors

iuliaferoli
