
ibm / watson-discovery-food-reviews


Combine Watson Knowledge Studio and Watson Discovery to discover customer sentiment from product reviews

Home Page: https://developer.ibm.com/patterns/get-customer-insights-from-product-reviews/

License: Apache License 2.0

JavaScript 92.91% CSS 7.09%
ibmcode watson-discovery-service ibm-cloud watson discovery enrichment data-enrichment sentiment keyword

watson-discovery-food-reviews's Introduction


Discover customer sentiment from product reviews

In this code pattern, we walk you through a working example of a web application that queries and manipulates data from the Watson Discovery Service. With the aid of a custom model built with Watson Knowledge Studio, the data receives additional enrichments that provide improved insights for user analysis.

This web app contains multiple UI components that you can use as a starting point for developing your own Watson Discovery and Knowledge Studio service applications.

The main benefit of using the Watson Discovery Service is its powerful analytics engine that provides cognitive enrichments and insights into your data. This app provides examples of how to showcase these enrichments through the use of filters, lists and graphs. The key enrichments that we will focus on are:

  • Entities: people, companies, organizations, cities, and more.
  • Categories: classification of the data into a hierarchy of categories up to 5 levels deep.
  • Concepts: identified general concepts that aren't necessarily referenced in the data.
  • Keywords: important topics typically used to index or search the data.
  • Entity Types: the classification of the discovered entities, such as person, location, or job title.
  • Sentiment: the overall positive or negative sentiment of each document.
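
As a rough illustration of how the enrichments listed above might be consumed by a UI, the sketch below condenses a single query result into display-ready lists. The `enriched_text.*` paths, the `summarizeEnrichments` helper, and the sample document are assumptions made for illustration; check the field layout against your own query results.

```javascript
// Hypothetical helper: condenses one Discovery result into fields a UI might show.
// The `enriched_text.*` paths are assumptions about the enrichment output layout.
function summarizeEnrichments(doc) {
  const enriched = doc.enriched_text || {};
  // Pull one property out of every item in a (possibly missing) array.
  const pluck = (items, key) => (items || []).map((item) => item[key]);
  const sentiment = (enriched.sentiment && enriched.sentiment.document) || {};
  return {
    sentiment: sentiment.label || 'unknown',
    entities: pluck(enriched.entities, 'text'),
    entityTypes: [...new Set(pluck(enriched.entities, 'type'))], // de-duplicated
    categories: pluck(enriched.categories, 'label'),
    concepts: pluck(enriched.concepts, 'text'),
    keywords: pluck(enriched.keywords, 'text'),
  };
}

// Made-up sample document, for illustration only.
const sampleDoc = {
  text: 'The organic coffee from Acme was delicious.',
  enriched_text: {
    sentiment: { document: { label: 'positive', score: 0.87 } },
    entities: [{ text: 'Acme', type: 'Company' }],
    keywords: [{ text: 'organic coffee', relevance: 0.95 }],
  },
};

console.log(summarizeEnrichments(sampleDoc));
```

Missing enrichment sections simply yield empty lists, so the same helper can be used whether or not every enrichment is enabled on the collection.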

With Watson Knowledge Studio, a machine learning annotator can be trained to recognize mentions of custom entity and relation types which can then be incorporated into the Discovery application enrichment process.

This code pattern uses data containing food reviews from Amazon. See the Kaggle dataset for further information.

When the reader has completed this code pattern, they will understand how to:

  • Use Watson Knowledge Studio to create a custom annotator.
  • Deploy a Watson Knowledge Studio model to Watson Discovery.
  • Load and enrich data in the Watson Discovery Service.
  • Query and manipulate data in the Watson Discovery Service.
  • Create UI components to represent enriched data created by the Watson Discovery Service.
  • Build a complete web app that utilizes popular JavaScript technologies to feature Watson Discovery Service data and enrichments.

architecture

Flow

  1. A sample set of review documents is loaded into Watson Knowledge Studio for annotation.
  2. A Watson Knowledge Studio model is created.
  3. The Watson Knowledge Studio model is applied to a Watson Discovery service instance.
  4. The food review JSON files are added to the Discovery collection.
  5. The user interacts with the backend server via the app UI. The frontend app UI uses React to render search results and can reuse all of the views that the backend uses for server-side rendering. The frontend uses semantic-ui-react components and is responsive.
  6. User input is processed and routed to the backend server, which is responsible for server-side rendering of the views displayed in the browser. The backend server is written using express and uses the express-react-views engine to render views written in React.
  7. The backend server sends user requests to the Watson Discovery Service. It acts as a proxy server, forwarding queries from the frontend to the Watson Discovery Service API while keeping sensitive API keys concealed from the user.
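
The proxy role described in the last step can be sketched with two small pure functions: one that merges browser input with identifiers held only on the server, and one that decides which response fields go back to the browser. The helper names, the placeholder values, and the parameter names (`environmentId`, `collectionId`, `naturalLanguageQuery`, `count`) are assumptions for illustration and are not taken from the app's source.

```javascript
// Server-side settings. In a real deployment these would come from the .env file.
// Placeholder values only.
const serverConfig = {
  apiKey: 'server-side-secret',
  environmentId: 'env-123',
  collectionId: 'coll-456',
};

// Hypothetical helper: combines untrusted browser input with server-held IDs.
// The API key is deliberately NOT part of the returned object; it would only be
// used when the server authenticates its own request to the service.
function toDiscoveryParams(userInput, config) {
  return {
    environmentId: config.environmentId,
    collectionId: config.collectionId,
    naturalLanguageQuery: String(userInput.query || '').trim(),
    count: Math.min(Number(userInput.count) || 10, 100), // cap the page size
  };
}

// Hypothetical helper: only explicitly listed fields are returned to the browser.
function toClientPayload(discoveryResponse) {
  return {
    matchingResults: discoveryResponse.matching_results || 0,
    results: discoveryResponse.results || [],
  };
}

const params = toDiscoveryParams({ query: '  subway ', count: 500 }, serverConfig);
console.log(params.naturalLanguageQuery, params.count);
```

Keeping both helpers free of network calls makes them easy to unit test, and the actual service call can be placed between them inside a route handler.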

NOTE: see DEVELOPING.md for project structure.

Included components

  • Watson Discovery: A cognitive search and content analytics engine for applications to identify patterns, trends, and actionable insights.
  • Watson Knowledge Studio: Teach Watson the language of your domain with custom models that identify entities and relationships unique to your industry, in unstructured text. Use the models in Watson Discovery, Watson Natural Language Understanding, and Watson Explorer.

Featured technologies

  • Node.js: An open-source JavaScript run-time environment for executing server-side JavaScript code.
  • React: A JavaScript library for building User Interfaces.
  • Express: A popular and minimalistic web framework for creating an API and Web server.
  • Semantic UI React: React integration of Semantic UI components.
  • Chart.js: JavaScript charting package.
  • Jest: A JavaScript test framework.

Watch the Video

video

Steps

  1. Clone the repo
  2. Create IBM Cloud services
  3. Create a Watson Knowledge Studio workspace
  4. Upload Type System
  5. Import Corpus Documents
  6. Create the model
  7. Deploy the machine learning model to Watson Discovery
  8. Create Discovery Collection
  9. Deploy the application

1. Clone the repo

git clone https://github.com/IBM/watson-discovery-food-reviews

2. Create IBM Cloud services

Create the following services:

  • Watson Discovery
  • Watson Knowledge Studio

3. Create a Watson Knowledge Studio workspace

Launch the Watson Knowledge Studio tool and click on Create entities and relations workspace.

create_wks_workspace

Enter a unique name and press Create.

4. Upload Type System

A type system allows us to define things that are specific to review documents, such as product and brand names. The type system controls how content can be annotated by defining the types of entities that can be labeled and how relationships among different entities can be labeled.

To upload our pre-defined type system, from the Assets -> Entity Types panel, press the Upload button to import the Type System file data/types-2aa46ad0-31da-11e8-89a9-efc0f3b77492.json found in the local repository.

upload_type_system

Press the Upload button. This will upload a set of Entity Types and Relation Types:

wks_entity_types

wks_relation_types

5. Import Corpus Documents

Corpus documents are required to train our machine-learning annotator component. For this code pattern, the corpus documents will contain sample review documents.

From the Assets -> Documents panel, press the Upload Document Sets button to import a Document Set file. Use the corpus documents file data/watson-discovery-food-reviews/data/corpus-2aa46ad0-31da-11e8-89a9-efc0f3b77492.zip found in the local repository.

NOTE: Select the option to "upload corpus documents and include ground truth (upload the original workspace's type system first)"

import_corpus

Once uploaded, you should see a set of documents:

wks_document_set

6. Create the model

Since the corpus documents that were uploaded were already pre-annotated and included ground truth, it is possible to build the machine learning annotator directly without the need for performing human annotations.

Go to the Machine Learning Model -> Performance panel, and press the Train and Evaluate button.

wks_training_sets

From the Document Set name list, select the annotation sets Docs28.csv and Docs122V2.csv. Also, make sure that the option Run on existing training, test and blind sets is checked. Press the Train & Evaluate button.

This process may take several minutes to complete. Progress will be shown in the upper right corner of the panel.

You can view the log files of the process by clicking the View Log button.

Once complete, you will see the results of the train and evaluate process:

wks_training_complete

7. Deploy the machine learning model to Watson Discovery

Now we can deploy our new model to the already created Watson Discovery service. Navigate to the Versions menu on the left and press Create Version.

wks_snapshot_page

The new version will now be available for deployment to Watson Discovery.

wks_model_version

To start the process, click the Deploy button associated with your version.

wks_deployment_options

Select the option to deploy to Discovery.

wks_deployment_location

Enter your IBM Cloud account information to locate your Discovery service to deploy to.

Once deployed, a Model ID will be created. Keep note of this value as it will be required later when configuring your credentials.

wks_deployment_model

NOTE: You can also view this Model ID by clicking the Deployed Models link under the model version.

8. Create Discovery Collection

Launch the Watson Discovery tool. Create a new data collection by clicking the Upload your own data button. Enter a unique name to create your collection.

disco_create_collection

Creating the Discovery Collection and populating the .env file with the appropriate credentials is all that is required to deploy and run the app. Once started, the app will load all of the data files into your collection. For details on how to do this manually, go to the Discovery collection configuration details section below.

To locate your environment_id and collection_id values for your collection, click the drop-down button at the top of your collection panel.

find_disco_ids

To locate the service credentials for your discovery service, click on the Service Credentials tab.

get_disco_creds
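
Because the app depends on several credentials being present, it can help to fail fast when one is missing. The sketch below is one possible check, using the variable names from the project's sample environment file; the `loadConfig` helper and its placeholder detection are assumptions for illustration, not part of the app.

```javascript
// Variable names as they appear in the project's sample environment file.
const REQUIRED_VARS = [
  'DISCOVERY_URL',
  'DISCOVERY_IAM_APIKEY',
  'DISCOVERY_ENVIRONMENT_ID',
  'DISCOVERY_COLLECTION_ID',
  'WKS_MODEL_ID',
];

// Hypothetical helper: throws when a credential is missing or is still an
// unedited "<add_...>" placeholder copied from the sample file.
function loadConfig(env) {
  const missing = REQUIRED_VARS.filter((name) => {
    const value = env[name];
    return !value || /^<.*>$/.test(value);
  });
  if (missing.length > 0) {
    throw new Error(`Missing credentials: ${missing.join(', ')}`);
  }
  const config = {};
  for (const name of REQUIRED_VARS) {
    config[name] = env[name];
  }
  config.PORT = Number(env.PORT) || 3000; // 3000 is the documented default port
  return config;
}

// Typical call site (left commented out so the sketch has no side effects):
// const config = loadConfig(process.env);
```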

9. Deploy the application

There are several ways to deploy the app. Each requires that you provide the necessary credentials for both your Watson Discovery and Watson Knowledge Studio services (see above for how to retrieve the credentials).

Click on one of the options below for instructions on deploying the app.

  • openshift
  • public
  • local

Sample UI layout

sample_output

Discovery collection configuration details

For reference, the following screenshots detail how to set up a collection configuration and load data files. In this code pattern, this process is completed for you when the application is first started, but it is important to know what is happening in the background.

If you were to create the configuration manually, these are the steps you would take:

Launch the Watson Discovery tool. Create a new data collection by clicking the Upload your own data button. Enter a unique name to create your collection.

disco_create_collection

From the new collection data panel, click the Configure Data button at the top of the panel. Then select the Enrich fields tab.

enrich_fields_panel

By default, several enrichments are applied to your data collection, but we need to add to this list.

Click on Add enrichments.

At the top of the list, select Keyword Extraction.

keyword_extraction

At the bottom of the list, select both Entity Extraction and Relation Extraction. Enter the Model ID that we created in Watson Knowledge Studio.

Close the enrichments window.
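
When the application creates the configuration for you at startup, the enrichment choices made in the panel above would be expressed as data. The sketch below shows one plausible shape for that data. The field names (`source_field`, `destination_field`, `enrichment`, `options.features`) and the `buildEnrichments` helper are assumptions about the Discovery configuration format and should be verified against the API reference before use.

```javascript
// Hypothetical helper: builds the enrichment section of a collection configuration.
// Field names are assumptions about the configuration format, not confirmed here.
function buildEnrichments(wksModelId) {
  return [
    {
      source_field: 'text',
      destination_field: 'enriched_text',
      enrichment: 'natural_language_understanding',
      options: {
        features: {
          // Added on top of the defaults, as described in the steps above.
          keywords: {},
          // Entity and relation extraction use the custom Knowledge Studio model.
          entities: { model: wksModelId },
          relations: { model: wksModelId },
          // Enrichments assumed to be part of the default set.
          sentiment: { document: true },
          categories: {},
          concepts: {},
        },
      },
    },
  ];
}

console.log(JSON.stringify(buildEnrichments('my-model-id'), null, 2));
```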

Click Apply changes to collection to start the process of loading the discovery files.

select_disco_files

Use the "Drag and drop your documents here or browse" area to load the collection with the JSON files located in data/food_reviews.

NOTE: If using the Discovery Lite plan, you are limited to loading up to 1000 files into your discovery service. This limit is not per collection, but the combined number for all collections in your service.

Troubleshooting

  • Error when loading files into Discovery

    Loading all 1000 document files at one time into Discovery can sometimes lead to "busy" errors. If this occurs, start over and load a small number of files at a time.

  • No keywords appear in the app

    This can be due to not having a proper configuration file assigned to your data collection. See the Discovery collection configuration details section above.
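
The advice in the first troubleshooting item, loading a small number of files at a time, can be sketched as a simple batching loop. The `uploadOne` callback is a hypothetical, caller-supplied function that uploads a single file; the batch size and pause are illustrative values, not tuned recommendations.

```javascript
// Splits a list into consecutive batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Uploads one batch at a time, optionally pausing between batches so the
// service is not flooded. Returns the number of files handed to `uploadOne`.
async function uploadInBatches(files, batchSize, uploadOne, pauseMs = 0) {
  let uploaded = 0;
  for (const batch of chunk(files, batchSize)) {
    await Promise.all(batch.map((file) => uploadOne(file)));
    uploaded += batch.length;
    if (pauseMs > 0) {
      await new Promise((resolve) => setTimeout(resolve, pauseMs));
    }
  }
  return uploaded;
}

console.log(chunk(['a.json', 'b.json', 'c.json'], 2));
```

A call such as `uploadInBatches(fileNames, 25, uploadOne, 1000)` would process 25 files at a time with a one-second pause between batches.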

Links

Learn more

  • Artificial Intelligence Code Patterns: Enjoyed this code pattern? Check out our other AI Code Patterns.
  • AI and Data Code Pattern Playlist: Bookmark our playlist with all of our code pattern videos
  • With Watson: Want to take your Watson app to the next level? Looking to utilize Watson Brand assets? Join the With Watson program to leverage exclusive brand, marketing, and tech resources to amplify and accelerate your Watson embedded commercial solution.

License

This code pattern is licensed under the Apache Software License, Version 2. Separate third party code objects invoked within this code pattern are licensed by their respective providers pursuant to their own separate licenses. Contributions are subject to the Developer Certificate of Origin, Version 1.1 (DCO) and the Apache Software License, Version 2.

Apache Software License (ASL) FAQ


watson-discovery-food-reviews's Issues

Use latest Watson SDK

Similar to recent changes to https://github.com/IBM/watson-discovery-ui:

  • switch from watson-developer-cloud to ibm-watson 4.0.2 (or newer)
  • update Discovery version date to '2019-03-25'
  • modify param list when creating a discovery configuration
  • modify 'size' to use 'LT' when creating a discovery environment

On Cloud Pak for Data, can Watson Knowledge Studio be connected to Watson Discovery?

Q.1 Can this food-reviews project be executed on Cloud Pak for Data on-prem?
Q.2 If yes, how can the .env file be created, and what parameters can I pass in it?
Q.3 If no, can you please provide sample apps for Watson Discovery on Cloud Pak for Data, including connecting Watson Knowledge Studio to the Watson Discovery service?

I am deploying on my local machine.
I am using IBM Cloud Pak for Data on-prem, 3.0.1 Enterprise.
Example:
(This is the environment for IBM Cloud)

# Copy this file to .env and replace the credentials with
# your own before starting the app.

# Watson Discovery
DISCOVERY_URL=<add_discovery_url>
DISCOVERY_IAM_APIKEY=<add_discovery_iam_apikey>
DISCOVERY_ENVIRONMENT_ID=<add_discovery_environment_id>
DISCOVERY_COLLECTION_ID=<add_discovery_collection_id>

# Watson Knowledge Studio
WKS_MODEL_ID=<add_wks_model_id>

# Run locally on a non-default port (default is 3000)
PORT=3000

On Cloud Pak for Data on-prem there is no option to deploy a Knowledge Studio machine learning model. We can only export the Knowledge Studio machine learning model as a .zip file and import it into Watson Discovery as a machine learning model.
Therefore WKS_MODEL_ID can't be created.
Similarly, the DISCOVERY_IAM_APIKEY, DISCOVERY_ENVIRONMENT_ID, and DISCOVERY_COLLECTION_ID parameters can't be created on Cloud Pak for Data on-prem.

The only parameters I have are the ones shown under Access Information when launching Watson Discovery:

  1. URL
  2. Bearer Token

Can the .env file be created with just these parameters, or do I need to pass more?

Sort by Rating doesn't work

All other sorting methods are good, but the highest/lowest rated sorting seems not to work.
high_low_rated

Can you please look into it? Thanks.

Help with breaking the reviews file into json

I am working on a project with a CSV file of reviews that I want to convert to JSON. I have been able to do that and now have one large JSON file, but I was wondering how you split your JSON so that each review, from 1 to 1000, is in its own well-labelled file.
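
One possible approach to the question above is sketched below, assuming the large JSON file holds an array of review objects. The `splitReviews` and `writeReviewFiles` helpers and the `review_N.json` naming scheme are hypothetical and do not describe how the repository's data files were actually produced.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: turns an array of review objects into one file
// descriptor per review, numbered from 1 and zero-padded so names sort correctly.
function splitReviews(reviews, prefix = 'review') {
  const width = String(reviews.length).length;
  return reviews.map((review, index) => ({
    filename: `${prefix}_${String(index + 1).padStart(width, '0')}.json`,
    contents: JSON.stringify(review, null, 2),
  }));
}

// Writes the descriptors to `outputDir`, creating the directory if needed.
function writeReviewFiles(outputDir, files) {
  fs.mkdirSync(outputDir, { recursive: true });
  for (const file of files) {
    fs.writeFileSync(path.join(outputDir, file.filename), file.contents);
  }
}

// Made-up sample data, for illustration only.
const files = splitReviews([{ Summary: 'Great' }, { Summary: 'Too sweet' }]);
console.log(files.map((file) => file.filename));
```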

Input 'Time' won't work with timeslice queries. Need better workaround.

Looks like we need the JSON to either use a number timestamp or a quoted string formatted date, but csvtojson gave us a quoted timestamp. This is currently fixed in the lib when we addDoc, but we probably should fix it in JSON so files can be added using the Disco UI (and remove the current tweak).
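
A minimal sketch of the first option mentioned above, converting the quoted timestamp into a number before the document is stored. The `normalizeTime` helper is hypothetical, and it assumes the field is named `Time` and holds a string of digits, as the issue title suggests.

```javascript
// Hypothetical helper: returns a copy of the document whose `Time` field is a
// number when the original value was a quoted, all-digit timestamp.
// Documents with any other `Time` value are returned unchanged.
function normalizeTime(doc) {
  const raw = doc.Time;
  if (typeof raw === 'string' && /^\d+$/.test(raw.trim())) {
    return { ...doc, Time: Number(raw.trim()) };
  }
  return doc;
}

// Made-up sample document, for illustration only.
console.log(normalizeTime({ Summary: 'Tasty', Time: '1303862400' }));
```

Applying this once to the JSON files themselves, rather than at load time, would let the same files be added through the Discovery UI.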

Search features might not work normally

I clicked Interactive Queries at the top of the main page and typed "subway" in the search bar, but the result seems to give me all the reviews instead of only those that include "subway".

Here's what I tried to search:
search_bar

And here's the result. I used Ctrl+F and could not find the "subway" I was looking for:
search_result

Can you please look into it and advise? Thanks.
