prototype-content-consumer's Introduction

prototype-content-consumer - API

Background

This project lets users capture data from web pages and store it so that it can be catalogued and organized for later use.

Getting Started

Follow the steps below to set up a development environment. This project expects a flavor of Unix that supports a standard Node.js environment.

  1. Clone repository

    $ git clone git@github.com:chadwpry/prototype-content-consumer.git

  2. Set the NODE_ENV variable to development. The export can also be added to your .bash_profile or .bashrc.

    $ export NODE_ENV=development

  3. Install node packages

    $ npm install

  4. Create configuration

    $ cp config/config.json.example config/config.json

    Edit config/config.json so that the credentials path points to your Google Cloud credentials.

  5. Create a self-signed SSL certificate

    $ openssl req -nodes -x509 -newkey rsa:2048 -keyout credentials/key.pem -out credentials/cert.pem -days 365

  6. Seed datastore

    $ npm run seeds

  7. Start server

    $ npm start

API Endpoints

Request Example

https://localhost/api/v1/suppliers/www.lumens.com

Response Example

{
  "data": {
    "type": "Selector",
    "id": 5649391675244544,
    "attributes": {
      "source_id": {
        "attribute": "data-pid",
        "selector": "[itemscope][itemtype='http://schema.org/Product'] > [data-pid]:first"
      },
      "product_url": {
        "attribute": "content",
        "selector": "[itemscope][itemtype='http://schema.org/Product'] [itemprop='url']:first"
      },
      "product_image_url": {
        "attribute": "content",
        "selector": "[itemscope][itemtype='http://schema.org/Product'] .productimages meta[property='og:image']"
      },
      "product_brand_name": {
        "attribute": "content",
        "selector": "[itemscope][itemtype='http://schema.org/Product'] [itemscope][itemtype='http://schema.org/Brand'] meta[itemprop='name']:first"
      },
      "product_brand_url": {
        "attribute": "href",
        "selector": "[itemscope][itemtype='http://schema.org/Product'] [itemscope][itemtype='http://schema.org/Brand'] a[itemprop='url']:first"
      },
      "product_offer": {
        "attribute": "content",
        "selector": "[itemscope][itemtype='http://schema.org/Product'] [itemscope][itemtype='http://schema.org/Offer'] meta[itemprop='price']:first"
      },
      "product_offer_currency": {
        "attribute": "content",
        "selector": "[itemscope][itemtype='http://schema.org/Product'] [itemscope][itemtype='http://schema.org/Offer'] meta[itemprop='priceCurrency']:first"
      }
    }
  },
  "jsonapi": {
    "version": "1.0.0"
  }
}
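
As a rough illustration of how a client might consume the response above, the sketch below walks the attributes map and reads each value from a page. The function name extractFields and its root parameter are hypothetical and not part of this project. Because :first is a jQuery extension rather than standard CSS, the sketch strips a trailing :first and relies on querySelector returning the first match; treating the two as equivalent is an assumption.

```javascript
// Hypothetical helper: apply a selector map (the "attributes" object from the
// response above) to a DOM-like root that implements querySelector().
function extractFields(attributes, root) {
  const result = {};
  for (const [field, rule] of Object.entries(attributes)) {
    // ":first" is jQuery-only syntax; querySelector() already returns the
    // first matching element, so a trailing ":first" is dropped (assumption).
    const selector = rule.selector.replace(/:first$/, '');
    const element = root.querySelector(selector);
    // Fields whose selector matches nothing are reported as null.
    result[field] = element ? element.getAttribute(rule.attribute) : null;
  }
  return result;
}
```

In a browser this could be called as extractFields(response.data.attributes, document), where response is the parsed JSON shown above.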

Sample Resource

To create a sample, send a POST request to the URI below. The request must include the following key/value pairs: hostname, payload and payload['nonce'].

Request Example

POST /api/v1/samples

body: {
  "hostname": "www.ikea.com",
  "nonce": "kjhs9182791hakajh19",
  "payload": {
    "product_id": "kj9128kk",
    "product_name": "Boots", ...
  }
}
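
The required keys could be checked on the server before anything is stored. The sketch below is one possible shape for that check; validateSampleBody is a hypothetical name and does not come from this project. The prose lists payload['nonce'] while the example shows nonce at the top level of the body, so the sketch accepts either location as an explicit assumption.

```javascript
// Hypothetical validator for the body of POST /api/v1/samples.
// Returns a list of problems; an empty list means the body looks acceptable.
function validateSampleBody(body) {
  if (!body || typeof body !== 'object') {
    return ['body must be an object'];
  }
  const errors = [];
  if (typeof body.hostname !== 'string' || body.hostname === '') {
    errors.push('hostname is required');
  }
  const payload = body.payload;
  if (!payload || typeof payload !== 'object') {
    errors.push('payload is required');
  }
  // Assumption: accept the nonce inside payload (as the prose says) or at the
  // top level of the body (as the request example shows).
  const nonce = (payload && payload.nonce) || body.nonce;
  if (typeof nonce !== 'string' || nonce === '') {
    errors.push('nonce is required');
  }
  return errors;
}
```

A request handler could respond with a 4xx status whenever the returned list is non-empty.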

Contributing

  • Check out the latest master to make sure the feature hasn't been implemented or the bug hasn't been fixed yet
  • Check out the issue tracker to make sure someone hasn't already requested it and/or contributed it
  • Fork the project
  • Start a feature/bugfix branch
  • Commit and push until you are happy with your contribution
  • Make sure to add tests for it. This is important so that future versions are not broken unintentionally.

prototype-content-consumer's People

Contributors

chadwpry

Watchers

James Cloos

Forkers

robskrob

prototype-content-consumer's Issues

Create Selector API persistence

  • Add Sequelize model and migration
  • Add seed selector (lumens.com) + more?
  • Integrate /api/v1/selectors endpoint with model

Supplier API persistence

  • Create an API endpoint that accepts the host as a URL parameter
  • Integrate Google Datastore read queries for selectors
  • Create JSON:API serialization for Google Datastore entities
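
The serialization step could be as small as wrapping a stored entity in the envelope used by the README's response example. The sketch below assumes the entity's type, id and attributes have already been read from the datastore; toJsonApi is a hypothetical name, and no datastore client calls are shown.

```javascript
// Hypothetical serializer: wrap one stored entity in the envelope used by the
// response example in the README ("data" plus a "jsonapi" version member).
function toJsonApi(type, id, attributes) {
  return {
    data: { type, id, attributes },
    jsonapi: { version: '1.0.0' },
  };
}
```

For example, toJsonApi('Selector', 5649391675244544, selectors) would produce the same top-level shape as the documented supplier response.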

Sample API persistence

  • Create the API endpoint
  • Integrate Google Datastore write queries for samples
  • Create JSON:API serialization for Google Datastore entities

Research product detection techniques

Review multiple techniques for detecting an in-page product and its properties. The chosen approach should support all of the expectations below.

Expectations:

  1. supports unique product properties per domain
  2. upgradeable without deploying a new extension version
  3. supports pushing captured content to collector api
  4. self-reliant code libraries; does not rely on libraries already present on the host domain

Possibilities:

  1. assign a css selector based on domain host
  2. assign a parser script based on domain host
  3. ???
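
The first possibility, a CSS selector assigned per domain host, can be sketched as a lookup keyed by hostname. The names selectorsByHost and selectorsForLocation are hypothetical. The registry is hard-coded here only for illustration; fetching it from the API instead is one way the second expectation (upgrading without a new extension version) might be met.

```javascript
// Hypothetical registry mapping a page's host to its selector definitions.
const selectorsByHost = {
  'www.lumens.com': {
    product_url: {
      attribute: 'content',
      selector: "[itemscope][itemtype='http://schema.org/Product'] [itemprop='url']:first",
    },
  },
};

// Look up the selector definitions for the host of a page URL.
// Returns null when no selectors are registered for that host.
function selectorsForLocation(location, registry) {
  const host = new URL(location).hostname;
  return Object.prototype.hasOwnProperty.call(registry, host) ? registry[host] : null;
}
```

The lookup uses only the standard URL class, so it does not depend on any library loaded by the host page.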

Create selector for lumens

Use this Lumens URL as an example and as a place to learn more about the details below.

host: www.lumens.com
location: http://www.lumens.com/masters-chair-by-kartell-uu373098.html#cgid=%0A%09%0A%09%09172%0A%09%0A&&tileIndex=1

Use the itemscope HTML attribute to help with CSS selection. Keep in mind that the Lumens data schema contains nested entities.

Define the appropriate selector for each of the entity properties shown on the Lumens page:

product_id
product_url
product_name
product_brand_name
product_brand_url
product_image_url
product_offer
product_offer_currency

Create initial gcloud datastore skeleton

  • Describe the gcloud integration in the README, including details about keys
  • Create a models abstraction file
  • Implement a savePayload method with the signature (body, callbackSuccess, callbackError)
  • Implement a validateProduct method that savePayload calls; it should always return true for now
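
A minimal sketch of the two methods follows, keeping the (body, callbackSuccess, callbackError) signature from the task list. The store object and its insert(body, callback) method are hypothetical stand-ins for the real datastore client, and makeSavePayload is a hypothetical factory used only so the store can be swapped out in tests.

```javascript
// Placeholder described in the task list: always reports the product as valid.
function validateProduct(product) {
  return true;
}

// Build savePayload(body, callbackSuccess, callbackError) around a store.
// "store" is a hypothetical dependency exposing insert(body, callback) with a
// Node-style (err, result) callback; it is not the real datastore API.
function makeSavePayload(store) {
  return function savePayload(body, callbackSuccess, callbackError) {
    if (!validateProduct(body)) {
      callbackError(new Error('invalid product'));
      return;
    }
    store.insert(body, (err, saved) => {
      if (err) {
        callbackError(err);
      } else {
        callbackSuccess(saved);
      }
    });
  };
}
```

Passing the store in as an argument keeps the models file independent of any particular client library until the integration is chosen.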

Create Collector API persistence with Google Cloud Datastore

Description: The API needs to have an endpoint for persisting selector data.

Expectations:

  1. Clients can request dynamic responses from server for Selector data on a page
  2. Clients can POST to a specific endpoint with data from any HTML page to persist collector data.

Implementation:

  1. Install gcloud and its corresponding platform service Google Cloud Datastore.
  2. Activate and configure application to use the above document storage solution.
  3. Define endpoint for the POST route to save the selector data into Google Cloud Datastore.
  4. Define endpoint for a request to GET selector data by hostname
  5. Define draw route and define endpoint for POST collector data
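
The endpoint definitions in the steps above can be pictured as a small route table. The sketch below covers only the two paths documented in the README (GET /api/v1/suppliers/<hostname> and POST /api/v1/samples); the route names and the matchRoute helper are hypothetical, and a web framework's own router would normally take their place.

```javascript
// Hypothetical route table for the two endpoints documented in the README.
const routes = [
  { method: 'GET', pattern: /^\/api\/v1\/suppliers\/([^/]+)$/, name: 'getSelectorsByHostname' },
  { method: 'POST', pattern: /^\/api\/v1\/samples$/, name: 'createSample' },
];

// Return the matching route name and its decoded path parameters,
// or null when no route matches the method and path.
function matchRoute(method, path) {
  for (const route of routes) {
    const match = route.pattern.exec(path);
    if (route.method === method && match) {
      const params = match.slice(1).map((part) => decodeURIComponent(part));
      return { name: route.name, params };
    }
  }
  return null;
}
```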
