This repository provides an introductory example of using the Elasticsearch Go client to find documents in Elasticsearch. Specifically, it covers three types of search:
- Traditional keyword search.
- Vector search, making use of the sentence-transformers/msmarco-MiniLM-L-12-v3 model from Hugging Face to generate the embeddings.
- Hybrid search combining the keyword and vector approaches.
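As a rough illustration of how the three search types differ at the request level, the sketch below builds Elasticsearch search bodies using only the Go standard library. The field names (`title`, `text_embedding.predicted_value`) and the model ID are assumptions for illustration, not necessarily the names used in this repository; adjust them to match your own index mapping and deployed model.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical names: change these to match your own index and model.
const (
	textField   = "title"                                         // assumed text field
	vectorField = "text_embedding.predicted_value"                // assumed dense_vector field
	modelID     = "sentence-transformers__msmarco-minilm-l-12-v3" // assumed deployed model ID
)

// keywordQuery builds a match query against the text field.
func keywordQuery(term string) map[string]any {
	return map[string]any{
		"match": map[string]any{textField: term},
	}
}

// knnQuery builds a kNN clause. The query_vector_builder asks Elasticsearch
// to generate the query embedding from the text with the deployed model.
func knnQuery(text string, k, numCandidates int) map[string]any {
	return map[string]any{
		"field":          vectorField,
		"k":              k,
		"num_candidates": numCandidates,
		"query_vector_builder": map[string]any{
			"text_embedding": map[string]any{
				"model_id":   modelID,
				"model_text": text,
			},
		},
	}
}

// hybridBody places the keyword and kNN clauses in one search request body.
func hybridBody(term string) map[string]any {
	return map[string]any{
		"query": keywordQuery(term),
		"knn":   knnQuery(term, 10, 100),
	}
}

func main() {
	body, err := json.MarshalIndent(hybridBody("gopher"), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

A keyword-only request would send just the `query` clause and a vector-only request just the `knn` clause; the hybrid body sends both and lets Elasticsearch combine the scores.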
The quickest way to set up your own cluster is to register for a free trial of Elastic Cloud. You'll also need to complete the following steps:
- Note your Cloud ID
- Generate an API Key
- Populate your instance with data in the same format as the pages listed in the Sources section below
- Upload your model from Hugging Face using Eland
- Enrich your ingested documents using an ingest pipeline
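To make the enrichment step more concrete, here is a minimal sketch of an ingest pipeline body containing a single inference processor, built with the Go standard library. The model ID, the source field `body_content`, and the target field `text_embedding` are assumptions chosen for illustration; use the values that match your uploaded model and crawled documents.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// inferencePipeline returns the body of an ingest pipeline with one
// inference processor. field_map maps the document field holding the text
// onto the model's expected input field ("text_field").
func inferencePipeline(modelID, sourceField, targetField string) map[string]any {
	return map[string]any{
		"description": "Generate text embeddings at ingest time",
		"processors": []map[string]any{
			{
				"inference": map[string]any{
					"model_id":     modelID,
					"target_field": targetField,
					"field_map":    map[string]any{sourceField: "text_field"},
				},
			},
		},
	}
}

func main() {
	// All three arguments are hypothetical example values.
	pipeline := inferencePipeline(
		"sentence-transformers__msmarco-minilm-l-12-v3",
		"body_content",
		"text_embedding",
	)
	body, err := json.MarshalIndent(pipeline, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

The printed JSON could be sent as the body of a request that creates the pipeline, after which documents indexed through that pipeline gain an embedding under the target field.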
The required environment variables must be set before running the script. I recommend using a tool such as direnv, invoked via .envrc, and adding the variables to a top-level .env file. Alternatively, you can set the environment variables explicitly in your current session, according to your operating system.
The following environment variables are required:
ELASTIC_CLOUD_ID=<MY_INSTANCE_CLOUD_ID>
ELASTIC_API_KEY=<MY_API_KEY>
Running server.go will start a net/http server on port 80 that you can use to query Elasticsearch:
cd server
go run .
Navigate to the URLs below to obtain the Gopher search results for each search type:
- Keyword: http://localhost/gophers
- Vector: http://localhost/vector-gophers
- Vector with keyword filter: http://localhost/filtered-gophers
- Vector with query embeddings generated using the Hugging Face inference API: http://localhost/filtered-gophers
- Hybrid search with manual boosting: http://localhost/hybrid-gophers
- Hybrid search with RRF: http://localhost/rrf-gophers
The slides from the Women Who Go meetup @ Elastic are available in the docs/slides folder.
The following rodent-focused Wikipedia pages have been extracted to Elasticsearch using the Elastic Web Crawler:
- Rodent | Wikipedia
- Gopher | Wikipedia
- Rat | Wikipedia
- Prairie Dog | Wikipedia
- Porcupine | Wikipedia
- Guinea Pig | Wikipedia
- Hamster | Wikipedia
- Capybara | Wikipedia
- Pedetes | Wikipedia
- Beaver | Wikipedia
- House Mouse | Wikipedia
- Squirrel | Wikipedia
If you're new to Go and would like to build your own web crawler, I recommend trying this exercise in the Tour of Go, where you build a concurrent web crawler.
Check out the resources below to learn more about Elasticsearch, keyword search, and vector search.