
commons's People

Contributors

adarsh, gtourtellot, nsanta, ptrikutam, t-dnzt


commons's Issues

Re-create the elasticsearch / rake db:seed error.

When running rake db:seed for the first time on production, Elasticsearch threw an error about a document not being found.

After running it once, the error didn't happen again. I'd like to reproduce the issue consistently and make sure it doesn't happen again. I suspect once you create the documents in ES it doesn't happen but maybe on the first try? Not sure.

ElasticSearch Rake task fails

Occasionally, the rake task will fail because it can't find the class Resource. This can be solved by adding :environment to the rake task definition.

Allow for OR boolean searches in API

I bet CloudSearch already supports this. Maybe we should just create a set of API docs/examples to show things like OR and NOT and whatnot(?).

Getting Faraday::ConnectionFailed: Broken pipe when running Resource.import

I've had the issue both in local and in production. Anytime I run Resource.import from the console or from a rake task, I'm getting Faraday::ConnectionFailed: Broken pipe.

D, [2017-01-13T13:35:53.791263 #4] DEBUG -- :   Resource Load (1112.1ms)  SELECT  "resources".* FROM "resources" ORDER BY "resources"."id" ASC LIMIT $1  [["LIMIT", 1000]]
rake aborted!
Faraday::ConnectionFailed: Broken pipe

I'm not sure what's wrong, and I haven't looked into it yet since it doesn't prevent the application from working (it only makes the reset_resource_index task fail).

I'm planning to do some research and see if I can figure it out. Any insights appreciated!

Search API

Overview

We'd like to start building out an external-facing JSON RESTful API. Let's start the discussion & development of said API for the Search endpoint. We're choosing this endpoint because it's probably the most useful, and also won't require any kind of authentication mechanism.

Requirements

General API Requirements

  • For the time being, don't worry about authenticating or verifying the requester. Anyone can hit this endpoint and get results.
  • The API should return a JSON response.
  • The API should be versioned (putting everything under v1 is good enough for now).

Search API Requirements

These requirements are presented in order of priority:

  • First, allow searching via a query string. At its simplest, the API should return a JSON response for a request like this: GET https://greencommons.herokuapp.com/api/v1/search.json?q=<query>
  • The results should return whatever information is contained in a Summary card initially, as well as a link to the actual resource / group / whatever itself.
  • The API should allow for paginating the results
  • The API should allow for sorting the results
  • We should allow faceting via the API (i.e. filter by Resource, Group, Article, etc)
  • The API should also return "You may also like" suggestions
  • The API should allow the user to filter what fields are returned
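
As a rough illustration of the query, pagination, and JSON requirements above, here is a minimal sketch of a paginated response body. The `page` and `per_page` parameters, the `meta` envelope, and the `build_search_response` helper are all assumptions made for this example, not an agreed design.

```ruby
require 'json'

# Hypothetical helper: slices an already-sorted result list into one page
# and wraps it in a JSON envelope with pagination metadata.
def build_search_response(results, page: 1, per_page: 10)
  page = [page.to_i, 1].max
  per_page = [per_page.to_i, 1].max
  total_pages = (results.size / per_page.to_f).ceil
  # Array#slice returns nil when the offset is past the end, hence the fallback.
  items = results.slice((page - 1) * per_page, per_page) || []

  JSON.generate(
    results: items,
    meta: { page: page, per_page: per_page, total: results.size, total_pages: total_pages }
  )
end

# Example: 25 stand-in results, second page of 10.
results = (1..25).map { |i| { type: 'Resource', title: "Result #{i}" } }
puts build_search_response(results, page: 2, per_page: 10)
```

Keeping the pagination details in a `meta` key leaves room to add sorting or facet information later without changing the shape of `results`.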

Open to discussion

  • I'm open to how we implement it (i.e. using a gem like Grape, or manually)
  • I think the API should be logically separated somehow from the controllers, but I haven't fully thought through what that looks like. Got any suggestions on how best to go about this?
  • How should we allow for additional parameters in the search API for faceting, pagination, etc? Should we accept a JSON payload in a POST or should we accept query parameters?

Feel free to submit each of the requirements in separate PRs (smaller is better!). Also, if you think we've missed anything, please raise the issue here.

Common Date Field for Groups/Resources/Lists/Users

As mentioned in PR #117, we need to have a unique column containing the date that will be used to sort the records when different models are returned.

For all of them, we can use the date present in the metadata if there's one, or fall back to the created_at date. I guess we could call the field date or published_at maybe?
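
The fallback could look roughly like the sketch below. The `sortable_date` helper name is hypothetical, and the sketch assumes the metadata is a Hash with a string `"date"` key, as in the import logs.

```ruby
require 'time'

# Hypothetical helper: prefer the date found in the metadata and fall back
# to created_at when it is missing or cannot be parsed.
def sortable_date(metadata, created_at)
  raw = metadata.to_h['date'].to_s.strip
  return created_at if raw.empty?

  Time.parse(raw)
rescue ArgumentError
  created_at
end

created_at = Time.now
puts sortable_date({ 'date' => '2012-06-15' }, created_at) # parsed metadata date
puts sortable_date({}, created_at)                         # falls back to created_at
```

The result could be stored in a single column at save time, so that sorting across models never has to inspect the metadata.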

Let me know if you want me to proceed with this.

Epub import pipeline fails when a book cannot be opened

I was happily running the import when I (sadly) ran into this issue:

NoMethodError: undefined method `fetch' for #<String:0x007f9cc589aa80>

Here are the logs:

2017-01-13T14:39:52.627365+00:00 app[run.3772]: "Title: The Dominant Animal"
2017-01-13T14:39:52.627427+00:00 app[run.3772]: "Metadata: "
2017-01-13T14:39:52.627659+00:00 app[run.3772]: {
2017-01-13T14:39:52.627672+00:00 app[run.3772]:      "creators" => "Paul R. Ehrlich",
2017-01-13T14:39:52.627673+00:00 app[run.3772]:          "date" => "2012-06-15",
2017-01-13T14:39:52.627674+00:00 app[run.3772]:     "publisher" => "Island Press"
2017-01-13T14:39:52.627674+00:00 app[run.3772]: }
2017-01-13T14:39:52.631014+00:00 app[run.3772]: "Content: \n\n\n\n\tThe Dominant Animal\n\t\n\n\n\n\n\n\n\n\n\n\n\n\tThe Dominant Animal\n\t\n\n\n\n\n\n\n\n\n\n\n\tThe Dominant Animal\n\t\n\n\n\n..."
2017-01-13T14:39:52.682561+00:00 app[run.3772]: "Error opening epub: unable to locate end-of-central-directory record"
2017-01-13T14:39:52.682821+00:00 app[run.3772]: rake aborted!
2017-01-13T14:39:52.686129+00:00 app[run.3772]: NoMethodError: undefined method `fetch' for #<String:0x007f9cc589aa80>

And the code responsible for the error:

require 'epub/parser'

class TransformEpub
  def process(input_epub)
    @input_epub = input_epub

    {
      title: title,
      content: PageContentExtractor.new(parsed_book).start,
      metadata: {
        creators: creators,
        date: date,
        publisher: publisher,
      }
    }

  rescue => error
    ap "Error opening epub: #{error}"
  end
end

class CreateNewResourceRecord
  
  # Hidden

  def title
    attributes.fetch(:title)
  end
end

When an error is raised, the transformer will return a string instead of a hash. I'm not sure why it happens exactly yet since the transformer shouldn't pass anything if an error occurs.
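
One possible explanation: a Ruby method that handles an exception returns the value of its `rescue` clause, so if `ap` returns the string it printed, that string becomes the return value of `process`. The sketch below shows one way to avoid this by returning `nil` explicitly and skipping failed books. `SafeTransformSketch` and `parse_title` are hypothetical stand-ins, not the project's classes.

```ruby
# Stand-in transformer: the class name and parse step are hypothetical.
class SafeTransformSketch
  def process(input)
    { title: parse_title(input) }
  rescue StandardError => error
    warn "Error opening epub: #{error.message}"
    nil # explicit nil, so callers never receive the log message
  end

  private

  # Simulates a parser that fails on an unreadable file.
  def parse_title(input)
    raise ArgumentError, 'unable to locate end-of-central-directory record' if input.nil?

    input.to_s
  end
end

transformer = SafeTransformSketch.new
# compact drops the nil produced by the failed book before the next step runs.
records = ['The Dominant Animal', nil].map { |book| transformer.process(book) }.compact
p records
```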

Update seeds.rb to use FactoryGirl

Reduces the number of places we need to update when we update a model.

Please try to follow a similar structure:

  • Create multiple users with 1 - 2 lists (use a random sampling of 10 - 50 resources for each).
  • Create multiple groups with multiple users.
  • For each group, create 3 - 5 lists (with a random sampling of 30 - 80 resources for each list)

Resource View Page

Wireframes: https://invis.io/BH8GKH38D#/185307862_RESOURCE_-_View

We'd like to build out the base UI for the Resource page.

Most of it should be straightforward, but for sake of specificity I'd like to clarify a few things:

  • For the time being, please ignore (don't display) the "Discuss" section
  • Please skip the "Content" section for now.
  • Try to extract the actual metadata for the Resource in the "About" section -- currently we're pulling Title, Creators, Date (publish date), Publisher, and a "Metadata" field. Can you investigate what type of data is included in the "Metadata" field and see if we can extract that to be displayed in the Resource view?
  • For the "Explore" section, let's go for things that share the same tags to start. Ideally, this should use a similar mechanism as the one described in #98 in the "You May Also Like" section. Basically, we want a centralized place to recommend similar / related (directly or tangentially) items for a given resource, list, group, etc. Happy to talk through this part.

Wrong ElasticSearch index name

The following code in the Resource model might not be generating the index name we expect.

class Resource < ApplicationRecord
  include Elasticsearch::Model
  index_name SearchIndex.index_name(self)

Since self here is the class, and given the following code in the SearchIndex class:

  def self.index_name(record)
    "#{record.class.name.pluralize.downcase}-#{Rails.env}"
  end

We end up with the following index: classes-development (or classes-production).
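
A minimal sketch of a version that accepts either the model class or an instance follows. It is written in plain Ruby so it runs on its own: the `env:` argument stands in for `Rails.env`, and the naive "+s" pluralization stands in for ActiveSupport's `pluralize`. Both are assumptions for illustration.

```ruby
# Sketch only: accepts either a model class or an instance of it.
module SearchIndexSketch
  def self.index_name(record_or_class, env: 'development')
    klass = record_or_class.is_a?(Class) ? record_or_class : record_or_class.class
    # Naive "+s" pluralization stands in for ActiveSupport's pluralize.
    "#{klass.name.downcase}s-#{env}"
  end
end

class Resource; end

puts SearchIndexSketch.index_name(Resource)     # resources-development
puts SearchIndexSketch.index_name(Resource.new) # resources-development
```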

Also, in the reset_resource_index rake task, we are still using the old index name instead of relying on the new method to generate it:

namespace :elasticsearch do
  desc 'Deletes the "resource" index and regenerates it with all records currently in Resource'
  task reset_resource_index: :environment do
    client = Elasticsearch::Client.new(
      url: ENV.fetch('BONSAI_URL', 'http://localhost:9200'), log: true
    )
    client.indices.delete index: 'resources' # <--- HERE
    Resource.__elasticsearch__.create_index!
    Resource.import
  end
end

Add group view page

Based on the wireframes for the Group View page, please start coding up a skeleton for the Group view.

Some points about functionality:

  • A Group can have many Lists. Each of these Lists will have many resources -- anyone in a group can modify a List that belongs to the Group.
  • In the "What's New" section, we want to display recent items that have been added to any List owned by the Group. This might require creating some kind of helper method / scope on Group to access all of its Resources through its various Lists. Open to discussion on the best way to approach this.
  • Clicking the "See More" link should display the rest of the Group's metadata, such as the description, website, created date, etc.
  • Anyone can view a group, even if they're not a member
  • Anyone can view a group's member list, even if they're not a member
  • The "Leave Group" button should only appear if you're a member of the group
  • Please leave out the "Add to List" button for now.

ElasticSearch indexing fails in production

Currently, Sidekiq fails to add new resources to the ElasticSearch index in production with the following error:

Elasticsearch::Transport::Transport::Errors::BadRequest: [400] 
{"error":
  {
    "root_cause":
      [{"type":"mapper_parsing_exception","reason":"object mapping for [content] tried to parse field [content] as object, but found a concrete value"}],
    "type":"mapper_parsing_exception","reason":"object mapping for [content] tried to parse field [content] as object, but found a concrete value"
  },
 "status":400
}

This might be related to issue 97 (or maybe not).

Load books from Amazon S3 into index on production

Please run the rake task rake etl:import_s3_epub on production (https://greencommons.herokuapp.com).

To do it, I imagine we need to do the following steps:

  • Add @T-Dnzt to Heroku
  • Provide @T-Dnzt with S3 access
  • Update the environment variables for S3 on Heroku
  • Clear the database of any existing List and Resource objects
  • Run the rake task to import the actual books
  • Do a quick check in Sidekiq to make sure things are being indexed properly
  • Do a quick check on the site by searching for one of the books and ensuring it comes up in the search results.
  • Create some dummy lists in the back end for the Groups in the database with some of these Resources

Create a Group "Summary Card"

We have the concept of "Summary Cards" in a number of places in the wireframes.

Examples:

Essentially, we want to create reusable components that we can drop into anywhere on the site to display a Group, List, or Resource. We'd like them to have a similar structure, but differ based on the object they're representing.

Initially, we don't really have a ton of info or metadata to display for a Group summary card. Please keep it in a good place in the views / directory structure, and make it easy to pass a Group object to it so that it will know what info to populate in the view.

This is an example of a Group summary card: https://cl.ly/122E0e0d001e

Add a nightly rake task to re-index everything in ElasticSearch

Reason for Change

  • In the unlikely event we have missed something in indexing various resources, it's probably a good idea to re-index things nightly.
  • This is "a good idea" until we have way too much stuff.

Proposed Changes

  • A rake task which re-indexes everything under ElasticSearch's control.
  • Use something like Heroku Scheduler to run it nightly (simple, free)

Search Results Page

We want to build out the Search results page: https://invis.io/BH8GKH38D#/185307864_SEARCH

This will involve a few different things:

  • Taking the search term entered and searching across Group, Resource, and List to begin with. The search page should return a list of all items combined together.
  • If they don't already exist, we need to create summary cards for different search result items.
  • Building out the left-hand filter list. We should discuss what types of filters actually belong in this section. At the minimum, we should be able to filter by Group, Resource, and List.
  • Allow sorting by Date or "Relevance" (see "Stackscore / Relevancy" below)
  • Ensure we can paginate search results

Stackscore / Relevancy

@dweinberger wants to assign a "Stackscore" or "Relevancy" number to every search result we get. This can be a number between 0 and 100 or something similar, but ultimately is a "smart" determination of what an item's relevancy is to what you've searched for. Initially, we can make this very simple, but we should build a stub or foundation to expand and make this more sophisticated as we go along. I'm happy to walk through this via an audio / video chat.

"You may also like"

Not pictured in the wireframes, we want to include a small section below the search results that provides suggestions on other things a user may be interested in. Initially, we can keep it simple, and the recommendations can be rather uninteresting, but it'd be good to set up a way to see "related items" based on a search term or an actual object in the database. This is somewhat similar in idea to "Stackscore / Relevancy," so it's worth discussing what makes sense here.

NOTE: This is a rather large task, so if it makes sense, feel free to break this up into smaller issues and re-order as you see fit.

Convert all remaining .erb views to .slim

  • layouts/application.html.erb
  • layouts/mailer.html.erb
  • devise/registrations/new.html.erb
  • devise/sessions/new.html.erb
  • devise/shared/_links.html.erb
  • pages/style.html.erb
  • shared/_alerts.html.erb
  • shared/_nav.html.erb

Run elasticsearch callbacks asynchronously.

Originally discussed here: #25 (comment)

If we set these callbacks to run asynchronously (e.g. spin up a Sidekiq job to change the search index), we can have a bit more control over when we actually update the index.

For example, we could handle errors more gracefully. Or just not execute the job in development or test environments unless some ENV is set.

https://github.com/elastic/elasticsearch-rails/tree/master/elasticsearch-model#asynchronous-callbacks
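
The environment gating mentioned above could be sketched as a small predicate. The `ENABLE_SEARCH_INDEX_CALLBACKS` variable name is hypothetical, and the environment is passed in as an argument here so the sketch does not depend on Rails.

```ruby
# Hypothetical predicate: index in production-like environments by default,
# and in development/test only when an opt-in variable is set.
def search_index_callbacks_enabled?(rails_env, env = ENV)
  return true unless %w[development test].include?(rails_env.to_s)

  env['ENABLE_SEARCH_INDEX_CALLBACKS'] == 'true'
end

puts search_index_callbacks_enabled?('production')
puts search_index_callbacks_enabled?('test', {})
puts search_index_callbacks_enabled?('test', 'ENABLE_SEARCH_INDEX_CALLBACKS' => 'true')
```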

Bug: Wrong record match

The name "Bhaskar Chakravorti" is in exactly one record of the json data provided thus far, namely, in the contents of a guardian article from the data file: guardian-sustainability_OR_sustainable_AND_environment-1.json; that article's title is "Sustainable business and sustainable development: two sides of the same coin".

The single match returned by the site is for a different article, namely: the guardian article titled "Davos 2013: new vision for agriculture is old news for farmers | David Nally and Bhaskar Vira" that comes from data file: guardian-sustainability_OR_sustainable_AND_environment-22.json

Try it, here:
https://greencommons.herokuapp.com/search?utf8=✓&query=Bhaskar+Chakravorti

Alternatively, a search for "Sustainable business and sustainable development: two sides of the same coin" does bring up the right record, so at least for this record, it seems that matching based on title works correctly but matching based on content doesn't.

Add the "Add Member" feature

  • Show autocomplete dropdown when a user starts typing an email (and a name later on)
  • Allow a user to submit the form once an email has been found and add the specified user to the group

Add Capybara tests for the Group Functionality

There are currently no complete feature tests to ensure that the various parts of the Group Functionality work as expected.

Here are the happy path scenarios that have to be implemented as acceptance tests:

  • scenario 'users can create a group'
  • scenario 'users can update a group'
  • scenario 'users can add members'
  • scenario 'users can remove members'
  • scenario 'users can make other members admins'
  • scenario 'users can remove admin from other members'

Some of these scenarios might get merged together to have faster tests.

Add Group Summary card to /style

Can you add an example of the group summary card (and a code snippet of its usage in Rails) on the /style page? I'd like to keep an up-to-date list of all the reusable components we create.

Rake task to clear the ElasticSearch index

Sometimes when running rake db:reset the index doesn't get reset. Would be good to have a rake task to handle this. All it needs to do is:

client = Elasticsearch::Client.new log: true
client.indices.delete index: 'resources'

Add Group member management functionality

Member Management

Groups can have many admins and many regular members. Admin members are allowed to add/remove people from the group or grant admin rights to the group, but regular members cannot. For the time being, that's the only difference.

  • Add the necessary migrations & changes to the User/Group models to support Admin / Regular members
  • Add any necessary routes / controller actions to implement the feature.
  • Ensure the route to manage or view the members of a group is: /groups/id/members
  • Implement an admin view and regular member view. It should be on the same route, only displaying options on the page if an Admin of the group is logged in. Make sure you use the style guide as a base for building the UI. The UI doesn't have to be perfect, but should be close to what we see in the wireframes.
  • By clicking "Make Admin" below a user's card you should be able to turn a user into an admin for the current group. By clicking "Remove" you should be able to remove a user from the group. Again, these options are only available to admins.

Deleting Resources has a race condition

I ran into this issue (after previously manually deleting record 100): https://cl.ly/2x2c1r1I0W26

So that pic is showing errors where the job is failing because it's trying to find record 100 but it doesn't exist. That's a bug - we are asking ES to remove a record from the index which does not exist in the database anymore :(

I'm opening an issue for this. Solution is probably a separate delete job, something like "don't try to rehydrate the AR record when you are deleting it - just use what the job was sent."
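
A minimal sketch of that idea follows, written as a plain Ruby class so it runs on its own. In the app this would presumably be a Sidekiq job, and the error class passed in would be the Elasticsearch client's not-found error. The class name and constructor arguments are hypothetical.

```ruby
# Plain-Ruby stand-in for a background job. It relies only on the arguments
# it was sent and never loads the database row, so a record that is already
# gone cannot make it fail.
class RemoveFromIndexJobSketch
  def initialize(client, not_found_error:)
    @client = client
    @not_found_error = not_found_error
  end

  def perform(index_name, document_type, id)
    @client.delete(index: index_name, type: document_type, id: id)
    :deleted
  rescue @not_found_error
    :already_absent # nothing left to remove from the index
  end
end
```

Passing the index name, type, and id at enqueue time means the job still has everything it needs after the row has been destroyed.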

Add a "My Groups" link to dropdown at top right

This is a task to add the "index" or "View all" view for the Group model.

  • Add a "My Groups" link to the dropdown at the top right of the page: https://cl.ly/460E3e1g162y. This should only appear when logged in, and should take you to a very simple page where you view a list of groups.
  • The page should be full-width -- it doesn't need to be paginated for now. Please use Group "summary cards" to display each group.

Use update_document method to re-index updated record

I think we should separate the create/update callback here and start using the update_document from there. I'm worried about duplicates and was able to produce a bug by updating a record, searching for its old name and still finding it even though it should have been de-indexed.

It would also be interesting to unify the way we interact with ES by not using the Elasticsearch::Model.client directly and instead relying on the methods provided by elasticsearch-rails.

delete_document could replace the three lines below for example.

  def remove
    if search_index_callbacks_enabled?
      Elasticsearch::Model.client.delete(
        index: model_name.constantize.index_name,
        type: model_name.downcase,
        id: id,
      )
    else
      log_callback_warning
    end
  end

@ptrikutam Let me know if that's something you'd like me to work on or not.

Ensure only group admins can update or delete a group

There are currently no permission checks in the GroupsController for update and destroy. These actions should only be performed by group admins, and therefore, we need to ensure that the current_user is an admin before allowing the changes.
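
The check itself can be a small predicate on the group, sketched below with plain Ruby stand-ins. `Membership` and `GroupSketch` are hypothetical names, and the real models may store admin rights differently. A `before_action` limited to `update` and `destroy` could then call such a predicate with the current user and reject the request when it returns false.

```ruby
# Minimal stand-ins; the real app would use its ActiveRecord models.
Membership = Struct.new(:user_id, :admin)

class GroupSketch
  def initialize(memberships)
    @memberships = memberships
  end

  # True only when the user belongs to the group with admin rights.
  def admin?(user_id)
    @memberships.any? { |m| m.user_id == user_id && m.admin }
  end
end

group = GroupSketch.new([Membership.new(1, true), Membership.new(2, false)])
puts group.admin?(1) # true
puts group.admin?(2) # false
puts group.admin?(3) # false
```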

ruby epub parsing issues

FYI, for me, 16 of the Island Press epubs have parse issues using ruby 'epub/parser'.
Three of these files are legitimately bad (corresponding to the message "unable to locate end-of-central-directory record"), the other 13 epubs can actually parse correctly (as witnessed elsewhere).

"Error opening epub 9781597265171.epub: unable to locate end-of-central-directory record"
"Error opening epub 9781597265935.epub: unable to locate end-of-central-directory record"
"Error opening epub 9781610914529.epub: undefined method `refines=' for #<EPUB::Metadata::UnsupportedModel:0x007fa704b56e10>\nDid you mean? readlines"
"Error opening epub 9781610914574.epub: undefined method `refines=' for #<EPUB::Metadata::UnsupportedModel:0x007fa703b02ac8>\nDid you mean? readlines"
"Error opening epub 9781610914802.epub: undefined method `refines=' for #<EPUB::Metadata::UnsupportedModel:0x007fa703949920>\nDid you mean? readlines"
"Error opening epub 9781610915007.epub: undefined method `refines=' for #<EPUB::Metadata::UnsupportedModel:0x007fa70301ff98>\nDid you mean? readlines"
"Error opening epub 9781610915403.epub: unable to locate end-of-central-directory record"
"Error opening epub 9781610915762.epub: undefined method `refines=' for #<EPUB::Metadata::UnsupportedModel:0x007fa7049005f8>\nDid you mean? readlines"
"Error opening epub 9781610915861.epub: undefined method `refines=' for #<EPUB::Metadata::UnsupportedModel:0x007fa704ab8c88>\nDid you mean? readlines"
"Error opening epub 9781610916639.epub: undefined method `refines=' for #<EPUB::Metadata::UnsupportedModel:0x007fa7048522f0>\nDid you mean? readlines"
"Error opening epub 9781610916677.epub: undefined method `refines=' for #<EPUB::Metadata::UnsupportedModel:0x007fa704ac47e0>\nDid you mean? readlines"
"Error opening epub 9781610916684.epub: undefined method `refines=' for #<EPUB::Metadata::UnsupportedModel:0x007fa703a8b888>\nDid you mean? readlines"
"Error opening epub 9781610916691.epub: undefined method `refines=' for #<EPUB::Metadata::UnsupportedModel:0x007fa704a71888>\nDid you mean? readlines"
"Error opening epub 9781610916707.epub: undefined method `refines=' for #<EPUB::Metadata::UnsupportedModel:0x007fa70387ad28>\nDid you mean? readlines"
"Error opening epub 9781610916714.epub: undefined method `refines=' for #<EPUB::Metadata::UnsupportedModel:0x007fa703a90270>\nDid you mean? readlines"
"Error opening epub 9781610916745.epub: undefined method `refines=' for #<EPUB::Metadata::UnsupportedModel:0x007fa7028c6940>\nDid you mean? readlines"

Alignment issues for Explore / Group List

Currently, if one of the summary cards is too big, it will create an empty space like in the screenshot below:

[screenshot: screen shot 2560-02-03 at 12 43 15 pm]

We can fix it by adding a .row class every 2 records. That also means we cannot display more cards per row in the future (but one card per row for the mobile version is fine).
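
Grouping the records in pairs is straightforward with `each_slice`. The sketch below uses placeholder strings in place of Group records; in the view, each inner array would be rendered inside its own .row wrapper.

```ruby
cards = %w[group-1 group-2 group-3 group-4 group-5]

# Pairs of cards, with a shorter final row when the count is odd.
rows = cards.each_slice(2).to_a
p rows
```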

Let me know if you want me to proceed with this.

Elasticsearch Rake task fails if resource index doesn't exist

2016-12-09 09:57:22 -0800: < {"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"resources","index_uuid":"_na_","index":"resources"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"resources","index_uuid":"_na_","index":"resources"},"status":404}
2016-12-09 09:57:22 -0800: [404] {"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"resources","index_uuid":"_na_","index":"resources"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"resources","index_uuid":"_na_","index":"resources"},"status":404}
rake aborted!
Elasticsearch::Transport::Transport::Errors::NotFound: [404] {"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"resources","index_uuid":"_na_","index":"resources"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"resources","index_uuid":"_na_","index":"resources"},"status":404}
/Users/pavan/.rvm/gems/ruby-2.3.2@commons/gems/elasticsearch-transport-5.0.0/lib/elasticsearch/transport/transport/base.rb:201:in `__raise_transport_error'
/Users/pavan/.rvm/gems/ruby-2.3.2@commons/gems/elasticsearch-transport-5.0.0/lib/elasticsearch/transport/transport/base.rb:312:in `perform_request'
/Users/pavan/.rvm/gems/ruby-2.3.2@commons/gems/elasticsearch-transport-5.0.0/lib/elasticsearch/transport/transport/http/faraday.rb:20:in `perform_request'
/Users/pavan/.rvm/gems/ruby-2.3.2@commons/gems/elasticsearch-transport-5.0.0/lib/elasticsearch/transport/client.rb:128:in `perform_request'
/Users/pavan/.rvm/gems/ruby-2.3.2@commons/gems/elasticsearch-api-5.0.0/lib/elasticsearch/api/namespace/common.rb:21:in `perform_request'
/Users/pavan/.rvm/gems/ruby-2.3.2@commons/gems/elasticsearch-api-5.0.0/lib/elasticsearch/api/actions/indices/delete.rb:44:in `delete'
/Users/pavan/Development/delete/commons/lib/tasks/elasticsearch.rake:7:in `block (2 levels) in <top (required)>'
/Users/pavan/.rvm/gems/ruby-2.3.2@commons/gems/rake-11.3.0/exe/rake:27:in `<top (required)>'
/Users/pavan/.rvm/gems/ruby-2.3.2@commons/bin/ruby_executable_hooks:15:in `eval'
/Users/pavan/.rvm/gems/ruby-2.3.2@commons/bin/ruby_executable_hooks:15:in `<main>'
Tasks: TOP => elasticsearch:reset_resource_index
(See full trace by running task with --trace)

Probably just need to rescue the error and move to the next step.
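
A minimal sketch of that rescue follows. The `delete_index_if_present` helper and its `not_found_error:` argument are hypothetical. In the rake task, the error class would be Elasticsearch::Transport::Transport::Errors::NotFound, as seen in the trace above.

```ruby
# Hypothetical helper: deletes an index and treats "no such index" as a
# non-fatal outcome so the task can move on to recreating the index.
def delete_index_if_present(client, index_name, not_found_error:)
  client.indices.delete(index: index_name)
  true
rescue not_found_error
  false # the index was already missing, nothing to delete
end
```

Rescuing only the not-found error keeps other failures, such as a connection problem, visible instead of silently swallowing them.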

Create capybara tests to verify auth flows

There's no need to test the registration / auth controllers, but in case we've messed something up on the views end we should add a few integration tests to verify the auth flow works as we expect.
