
clothes-in-space's People

Contributors

bigluck, jacopotagliabue


clothes-in-space's Issues

Getting an error when uploading docs to Elasticsearch

def upload_docs_to_es(index_name, docs):
    """
    index_name is a string
    docs is a map doc id -> doc as a Python dictionary (in our case SKU -> product)
    """
    # first we delete an index with the same name if any
    # ATTENTION: IF YOU USE THIS CODE IN THE REAL WORLD THIS LINE WILL DELETE THE INDEX
    if es_client.indices.exists(index=index_name):
        print("Deleting {}".format(index_name))
        es_client.indices.delete(index=index_name)
    # next we define our index: text fields analyzed with LANGUAGE,
    # plus a dense_vector field of EMBEDDING_DIMS dimensions
    body = {
        "settings": {
            "number_of_shards": 1,
            "number_of_replicas": 0
        },
        "mappings": {
            "properties": {
                "name": {"type": "text", "analyzer": LANGUAGE},
                "target": {"type": "text", "analyzer": LANGUAGE},
                "image": {"type": "text", "analyzer": LANGUAGE},
                "vector": {
                    "type": "dense_vector",
                    "dims": EMBEDDING_DIMS
                }
            }
        }
    }
    # create index
    print(body)
    res = es_client.indices.create(index=index_name, body=body)
    # finally, we bulk upload the documents
    actions = [{
        "_index": index_name,
        "_id": sku,
        "_source": doc
    } for sku, doc in docs.items()]
    # bulk upload
    res = helpers.bulk(es_client, actions)

    return res


def query_with_es(index_name, search_query, n=5):
    # full-text match on "name", re-scored with a script;
    # note: the script expects a numeric 'popularity' field on each indexed doc
    search_body = {
        "from": 0,
        "size": n,
        "query": {
            "script_score": {
                "query": {
                    "match": {
                        "name": {
                            "query": search_query
                        }
                    }
                },
                "script": {
                    "source": "doc['popularity'].value / 10"
                }
            }
        }
    }
    res = es_client.search(index=index_name, body=search_body)
    print("Total hits: {}, returned {}\n".format(res['hits']['total']['value'], len(res['hits']['hits'])))

    return [(hit["_source"]['sku'], hit["_source"]['image']) for hit in res['hits']['hits']]


def query_and_display_results_with_es(index_name, search_query, n=5):
    res = query_with_es(index_name, search_query, n=n)
    return display_image(res)


def display_image(skus, n=5):
    # skus is a list of (sku, image URL) tuples
    for (s, image) in skus[:n]:
        print('{} - {}\n'.format(s, image))
        display(Image(image, width=150, unconfined=True))


def query_and_rerank_and_display_results_with_es(index_name, search_query, n, session_vector):
    res = query_with_es(index_name, search_query, n=n)
    skus = [r[0] for r in res]
    re_ranked_sku = re_rank_results(session_vector, skus)

    return display_image([(sku, res[skus.index(sku)][1]) for sku in re_ranked_sku])
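
For context, the functions above assume an es_client object and the helpers module from the official elasticsearch Python package, plus LANGUAGE and EMBEDDING_DIMS constants. A minimal setup sketch follows; the host and the constant values are placeholders, not the repo's actual configuration:

from elasticsearch import Elasticsearch, helpers

LANGUAGE = 'english'    # analyzer for the text fields (placeholder)
EMBEDDING_DIMS = 50     # must match the length of the product vectors (placeholder)

es_client = Elasticsearch(['http://localhost:9200'])
# ping() returns True only if the cluster is actually reachable,
# which is worth checking before calling upload_docs_to_es
assert es_client.ping(), "Elasticsearch is not reachable"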

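The mapping above also declares a dense_vector field, even though query_with_es only rescores by popularity. As a hedged sketch, on recent Elasticsearch 7.x the stored vector can be scored directly with the built-in cosineSimilarity function in a script_score query; query_by_vector is a hypothetical helper and query_vector stands for any embedding of EMBEDDING_DIMS floats:

def query_by_vector(index_name, query_vector, n=5):
    # score every document by cosine similarity to query_vector,
    # using the "vector" dense_vector field from the mapping above
    body = {
        "size": n,
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    "source": "cosineSimilarity(params.query_vector, 'vector') + 1.0",
                    "params": {"query_vector": query_vector}
                }
            }
        }
    }
    return es_client.search(index=index_name, body=body)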

Getting an error when uploading docs to Elasticsearch even though the connection to Elasticsearch works

INDEX_NAME = 'catalog'  # index_name is a string

I also want to know whether the 'catalog' here is the catalog file path, or what exactly has to be given in INDEX_NAME.
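
For what it is worth, an Elasticsearch index name is just an identifier for the index inside the cluster, not a file path; 'catalog' is simply the name the documents are indexed under. A hedged sketch of how the docs map could be built from catalog.csv and passed in (the column names sku, name, target and image are assumptions based on the mapping below, not the file's confirmed schema):

import csv

# sketch: build the SKU -> product map expected by upload_docs_to_es
docs = {}
with open('catalog.csv', newline='') as f:
    for row in csv.DictReader(f):
        docs[row['sku']] = {
            'sku': row['sku'],
            'name': row['name'],
            'target': row['target'],
            'image': row['image']
            # the 'vector' and 'popularity' fields would be added here as well
        }

upload_docs_to_es(INDEX_NAME, docs)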

I got an error in the following portion of code:
def upload_docs_to_es(index_name, docs):
    """
    index_name is a string
    docs is a map doc id -> doc as a Python dictionary (in our case SKU -> product)
    """
    # first we delete an index with the same name if any
    # ATTENTION: IF YOU USE THIS CODE IN THE REAL WORLD THIS LINE WILL DELETE THE INDEX
    if es_client.indices.exists(index=index_name):
        print("Deleting {}".format(index_name))
        es_client.indices.delete(index=index_name)
    # next we define our index
    body = {
        "settings": {
            "number_of_shards": 1,
            "number_of_replicas": 0
        },
        "mappings": {
            "properties": {
                "name": {"type": "text", "analyzer": LANGUAGE},
                "target": {"type": "text", "analyzer": LANGUAGE},
                "image": {"type": "text", "analyzer": LANGUAGE},
                "vector": {
                    "type": "dense_vector",
                    "dims": EMBEDDING_DIMS
                }
            }
        }
    }
    # create index
    print(body)
    res = es_client.indices.create(index=index_name, body=body)
    # finally, we bulk upload the documents
    actions = [{
        "_index": index_name,
        "_id": sku,
        "_source": doc
    } for sku, doc in docs.items()]
    # bulk upload
    res = helpers.bulk(es_client, actions)

    return res


def query_with_es(index_name, search_query, n=5):
    search_body = {
        "from": 0,
        "size": n,
        "query": {
            "script_score": {
                "query": {
                    "match": {
                        "name": {
                            "query": search_query
                        }
                    }
                },
                "script": {
                    "source": "doc['popularity'].value / 10"
                }
            }
        }
    }
    res = es_client.search(index=index_name, body=search_body)
    print("Total hits: {}, returned {}\n".format(res['hits']['total']['value'], len(res['hits']['hits'])))
    return [(hit["_source"]['sku'], hit["_source"]['image']) for hit in res['hits']['hits']]


def query_and_display_results_with_es(index_name, search_query, n=5):
    res = query_with_es(index_name, search_query, n=n)
    return display_image(res)


def display_image(skus, n=5):
    for (s, image) in skus[:n]:
        print('{} - {}\n'.format(s, image))
        display(Image(image, width=150, unconfined=True))


def query_and_rerank_and_display_results_with_es(index_name, search_query, n, session_vector):
    res = query_with_es(index_name, search_query, n=n)
    skus = [r[0] for r in res]
    re_ranked_sku = re_rank_results(session_vector, skus)

    return display_image([(sku, res[skus.index(sku)][1]) for sku in re_ranked_sku])
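
re_rank_results is called above but not defined in this snippet. A minimal sketch of what such a re-ranker could look like, assuming a sku_to_vector lookup from SKU to its product embedding and plain cosine similarity with numpy (both the lookup and the function body are assumptions, not the repo's actual implementation):

import numpy as np

def re_rank_results(session_vector, skus):
    # sketch: sort the retrieved SKUs by cosine similarity between the
    # session vector and each product vector (sku_to_vector is a hypothetical
    # dict mapping SKU -> embedding of EMBEDDING_DIMS floats)
    def cosine(u, v):
        u, v = np.asarray(u), np.asarray(v)
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

    return sorted(skus,
                  key=lambda sku: cosine(session_vector, sku_to_vector[sku]),
                  reverse=True)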


Doubt about the sessions.txt file

In an e-commerce platform,
"sessions.txt is a TAB separated text file storing a session on each line; each session has a numerical id first and then the list of SKUs (matching the content of catalog.csv) that were viewed in that session."
I am unsure whether:

  1. the sessions.txt file is a separate text file, unique and different for each user,
     or
  2. the sessions.txt file is a single file for the whole e-commerce platform, containing the SKUs of products viewed by different users, one session per line.
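
For reference, a minimal sketch of how a file matching the quoted description could be parsed (one session per line, TAB separated, numerical id first, then the SKUs); the helper name is hypothetical and not part of the repo:

def load_sessions(path='sessions.txt'):
    # one session per line: <session_id> TAB <sku_1> TAB <sku_2> ...
    sessions = {}
    with open(path) as f:
        for line in f:
            parts = line.rstrip('\n').split('\t')
            if len(parts) < 2:
                continue
            session_id, skus = parts[0], parts[1:]
            sessions[session_id] = skus
    return sessions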
