The valkyrie from samvera

Solr adapter can't save values that are really long.

I think this is because it's storing them in string fields.

rake tasks fail because tmp directory doesn't exist

The README says to run rake server:development, but this fails because the tmp directory is not included in the git clone.

    $ rake server:development
    Loading configuration from /Users/jcoyne/.solr_wrapper.yml
    Unable to copy /var/folders/9t/rygbnddx0b1ckw6tjs3m18qm0000gq/T/d20170706-12948-s9kond to tmp/blacklight-core: No such file or directory @ dir_s_mkdir - tmp/blacklight-core
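
One possible workaround, until the directory is tracked in git, is to create tmp before the task runs. This is only a sketch: the helper name `ensure_tmp_dir` is hypothetical and not part of Valkyrie, and it assumes the task resolves tmp relative to the project root.

```ruby
require "fileutils"

# Hypothetical helper: make sure the directory the rake task copies into exists.
# FileUtils.mkdir_p does nothing when the directory is already present.
def ensure_tmp_dir(root = Dir.pwd)
  path = File.join(root, "tmp")
  FileUtils.mkdir_p(path)
  path
end
```

Calling `ensure_tmp_dir` from the top of the Rakefile (or running `mkdir -p tmp` by hand) should be enough to get past the error above.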

Isolate Valkyrie model code into `lib` for extraction into gem later.

Use DynamicClass instead of DynamicKlass?

Recommended by @botimer in #59

Document memory adapters, link to them from other adapters.

Formalize a `transaction buffer` pattern

I have no idea what the interface for this might be like, but the speed was dramatically different and it helped things a lot. A transaction buffer (as is possible now) looks something like this:

    memory_adapter = Valkyrie::Persistence::Memory::Adapter.new
    adapter = Valkyrie::AdapterContainer.new(
      persister: CompositePersister.new(Valkyrie::Adapter.find(:postgres).persister, memory_adapter.persister),
      query_service: Valkyrie::Adapter.find(:postgres).query_service
    )
    # Save a bunch of stuff via adapter.persister.save(model: book)
    #
    # Now that you have a bunch of saved objects in the database,
    # you can DRASTICALLY speed up Solr indexing (18 mins -> a few seconds for 2600) by doing them all in one call
    Valkyrie::Adapter.find(:solr).persister.save_all(models: memory_adapter.query_service.find_all)
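
As a starting point for discussion, the pattern could be wrapped in a block interface. The sketch below is plain Ruby and does not use Valkyrie itself: `BufferPersister`, `FanOutPersister`, and `with_buffer` are all hypothetical names, and the only assumptions are that persisters respond to `save(model:)` and the index responds to `save_all(models:)`.

```ruby
# Hypothetical stand-in for the memory adapter's persister: remembers what was saved.
class BufferPersister
  attr_reader :saved

  def initialize
    @saved = []
  end

  def save(model:)
    @saved << model
    model
  end
end

# Hypothetical composite: forwards each save to every wrapped persister, in order,
# passing the result of one save on to the next.
class FanOutPersister
  def initialize(*persisters)
    @persisters = persisters
  end

  def save(model:)
    @persisters.inject(model) { |current, persister| persister.save(model: current) }
  end
end

# Hypothetical block interface: everything saved inside the block goes to the
# primary persister immediately and is sent to the index in a single call afterwards.
def with_buffer(primary:, index:)
  buffer = BufferPersister.new
  yield FanOutPersister.new(primary, buffer)
  index.save_all(models: buffer.saved)
end
```

Usage would then read `with_buffer(primary: postgres_persister, index: solr_persister) { |persister| persister.save(model: book) }`, where both persister variables are placeholders for whatever the adapters return.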

Write documentation on how ValueMapper works.

named param in find_by_id(id: id) is redundantly redundant

If the method name is find_by_id, I don't think a named id parameter provides any extra clarity.

.find_by_id(id: id)

vs.

.find(id: id)

or

.find_by_id(id)

Would prefer either of the latter two forms.

Rename Fedora adapter to ActiveFedora Adapter?

I think it's more honest.

File metadata on storage adapters

How do we store mime_type and filename? In Fedora these are stored with the binary.

Documentation standards?

The shared specs are a good step, but as we start to solidify some of the interfaces we should probably find a good way to add proper documentation.

Identify MVP Use Cases

Which features of an example repository would, when fulfilled, show that this is a viable pattern for Hydra (and/or Hyrax)?

Ideas I'd like feedback on:

Collections
Access controls
File upload/download
Derivative generation
  1. Storage of derivatives in any of the multiple backends.
Re-ordering of membered resources
Custom work types
  1. I don't think we need to write a generator, but a set of steps to add a new one which could be programmatic.
Load a pre-existing AF model from Hyrax into a Valkyrie model
  1. This would be the ideal migration strategy, in that there's no data migration.

I have a branch now which has two folders in this repository. However, now that I think about it, I wonder if we can turn off autoloading of the lib directory and just have an entire gem structure in lib/valkyrie

named parameters in persisters

Convert methods to use named parameters in persister classes.

Add support for `member_ids` having strings and not just local URIs

Supporting multiple types per property will be important for use cases such as controlled in-repository terms. Need some way to distinguish "3" as a remote ID from "3" the string.
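
One way to make that distinction, sketched here in plain Ruby, is to wrap identifiers in a small value object so that the type carries the meaning. `ResourceID` is a hypothetical name used for illustration and is not a claim about how Valkyrie does or will implement this.

```ruby
# Hypothetical value object marking a value as a resource identifier.
class ResourceID
  attr_reader :id

  def initialize(id)
    @id = id.to_s
  end

  # Two IDs are equal only when both are ResourceIDs with the same value,
  # so ResourceID.new("3") never equals the plain string "3".
  def ==(other)
    other.is_a?(ResourceID) && other.id == id
  end
  alias eql? ==

  def hash
    [self.class, id].hash
  end

  def to_s
    id
  end
end

# A property holding both kinds of value can be split by type.
member_ids = [ResourceID.new("3"), "3"]
ids, strings = member_ids.partition { |value| value.is_a?(ResourceID) }
```

Each persister would still need its own way of serializing the two types differently, which is the harder part of the problem.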

Support for GlobalIDs?

When we work on #53 we're going to need to store the user's identifier. Originally I thought that was going to be the username or email or whatever devise said was the primary key, but I realized it might be better to simply support GlobalIDs as a data type in Valkyrie.

That way you could just have GlobalID turn them into objects if you wanted that, and there'd be a difference between "tpend" and "gid://app/User/1"
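
To illustrate the difference, here is a minimal sketch that only checks whether a stored string has the shape of a GlobalID URI. It uses Ruby's standard `uri` library rather than the globalid gem, and `global_id?` is a hypothetical helper; in a Rails app the gem itself would do the parsing and lookup.

```ruby
require "uri"

# Hypothetical check: does this value look like gid://app/Model/id?
# Anything that is not a parseable URI with the "gid" scheme, an app name,
# and exactly two path segments is treated as an ordinary string.
def global_id?(value)
  uri = URI.parse(value.to_s)
  segments = uri.path.to_s.split("/").reject(&:empty?)
  uri.scheme == "gid" && !uri.host.to_s.empty? && segments.length == 2
rescue URI::InvalidURIError
  false
end
```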

Generic support for ordered properties?

Virtus objects can have metadata attached to properties - ordered: true seems like one we could add.

However, this will probably be annoyingly difficult to implement for the AF adapter.

Implement configurable ID generation

For things like NOIDs. I think this is necessary - especially for migration.
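
A minimal sketch of what "configurable" could mean: the persister takes any callable as its ID generator and only uses it when the model has no ID yet, so migrated records keep theirs. `SketchPersister` and the `id_generator:` option are hypothetical, and models are plain hashes here to keep the example self-contained.

```ruby
require "securerandom"

# Hypothetical persister that accepts any callable as its ID generator.
class SketchPersister
  def initialize(id_generator: -> { SecureRandom.uuid })
    @id_generator = id_generator
  end

  # Assigns an ID only when the model lacks one, so existing IDs survive a migration.
  def save(model:)
    model[:id] ||= @id_generator.call
    model
  end
end
```

A NOID minter, or any other scheme, would then plug in as `SketchPersister.new(id_generator: -> { minter.mint })`, where `minter` stands for whatever object the application provides.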

Storage adapter for Fedora

It's in the charter.

Update Valkyrie gem readme

Run characterization on files during upload.

Look into Shrine as a replacement for storage adapters

I haven't dug into it a lot, but it seems to have a lot of good and similar opinions to Valkyrie, with a lot more work put into it:

https://github.com/janko-m/shrine

Determine Prototype Storage Adapters

There are currently two: Disk and Memory.

I think we'll need at least three for a valid prototype:

Disk
Memory
Fedora

In the future I'd like to look at

AWS
Content-Addressable Storage (Whether this is a disk-offshoot that stores based on fixity, IPFS, or both, I dunno.)

Get deep nesting + mixed nesting working for AF Adapter

The alternative is to fix the raw Fedora adapter's performance problem (#72).

This would be nested resources (using hash code URIs) for the edm:TimeSpan use case (UCSB).

indexing_persister configured adapter is a persister, not an adapter.

Implement Access Controls

I think this basically means make blacklight-access-controls work. What's the difference between that and HydraAccessControls? @jcoyne ?

Demonstrate Carrierwave Usage?

File upload gems tend to be pretty locked into ActiveRecord norms. It would be nice if we could prove that Carrierwave could be used with a Valkyrie model without too much interference.

https://coderwall.com/p/e9d_ja/using-carrierwave-uploader-for-tableless-model-in-rails relevant?

Implement a dashboard to show all objects uploaded by "me"

Question here about where one should draw the line between "query powered by the fact that you have a Solr index" and "query necessary for the backend to support."

Proposal: StorageAdapter#upload should not return a file

I suggest we return void. Returning a File could be expensive and we may not use the result.

Identify supported data types.

It takes work in each persister to convert back and forth between native Ruby data types and the data store. We need to document which data types we support.

Right now all that's supported is Internal IDs, language-tagged RDF Literals, and strings. Dates? Times? Integers? ::RDF::URIs?

"save_all"?

In bulk migration use cases, it might be more efficient to load up a lot of resources, change them in memory, and then persist them all at once (at least for solr/postgres.) The implementations can sometimes be complex (postgres in particular), and it's not efficient for all adapters (AF for instance). Do we want this?
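
One option that keeps the cost low for adapters that cannot batch: provide a default `save_all` that simply loops over `save`, and let adapters with a real bulk path override it. The sketch below is plain Ruby with hypothetical names (`DefaultSaveAll`, `OneAtATimePersister`, `BatchPersister`); the `writes` counter stands in for round trips to the backend.

```ruby
# Hypothetical mixin: a default save_all that falls back to one save per model.
module DefaultSaveAll
  def save_all(models:)
    models.map { |model| save(model: model) }
  end
end

# Stands in for an adapter with no bulk path: every model costs one write.
class OneAtATimePersister
  include DefaultSaveAll
  attr_reader :writes

  def initialize
    @writes = 0
  end

  def save(model:)
    @writes += 1
    model
  end
end

# Stands in for an adapter that can batch: save_all is overridden and costs one write.
class BatchPersister
  include DefaultSaveAll
  attr_reader :writes

  def initialize
    @writes = 0
  end

  def save(model:)
    save_all(models: [model]).first
  end

  def save_all(models:)
    @writes += 1
    models
  end
end
```

With this shape, callers can always use `save_all` and each adapter decides whether it is actually faster.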

Virtus gem not actively maintained

Piotr Solnica, the main dev behind Virtus, wrote this comment a while ago: https://www.reddit.com/r/ruby/comments/3sjb24/virtus_to_be_abandoned_by_its_creator/ and there hasn't been much activity on the gem recently.

This might not be an issue and Virtus might be stable enough for our needs, but we might have to eliminate Virtus at some point.

Generate a document of recommendations on how to implement the code in the core of the Samvera stack.

Document how to use shared specs

Created/Modified Date

Add rake tasks to readme for setting up solr/Fedora

Custom Work Type Implementation

Create a short list of steps on what it takes to add a new work type (IE Book or Page). Consider a generator.

Figure out how to talk about the lack of dirty tracking

I think our forms have dirty tracking, but our models don't (on purpose). We should find a way to document that.
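
For the documentation, a small example might help show the split. The sketch below is plain Ruby and makes no claim about how Valkyrie's forms are actually built: `BookModel` and `BookForm` are hypothetical, with the model as a bare value holder and the form comparing its current values against the ones it started with.

```ruby
# Hypothetical model: a plain value holder with no change tracking.
BookModel = Struct.new(:title, keyword_init: true)

# Hypothetical form: remembers the values it started with and reports differences.
class BookForm
  def initialize(model)
    @model = model
    @original = model.to_h
    @attributes = model.to_h
  end

  def title=(value)
    @attributes[:title] = value
  end

  def changed?
    @attributes != @original
  end

  # Only the attributes whose value differs from the original.
  def changes
    @attributes.reject { |key, value| @original[key] == value }
  end

  # Copies the form's values onto the model; the model itself never knows what changed.
  def sync
    @attributes.each { |key, value| @model[key] = value }
    @model
  end
end
```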

When migration from Hyrax to another adapter is possible, document how to do it.

Probably going to be something along the lines of

    fedora_adapter = Valkyrie::Adapter.find(:fedora)
    postgres_adapter = Valkyrie::Adapter.find(:postgres)
    book = fedora_adapter.query_service.find_by(id: "myid")
    book.id = nil
    new_book = postgres_adapter.persister.save(model: book)

Implement derivative generation

Try really hard to use hydra-derivatives here.

Hyrax Adapter?

This would be an adapter which is proven to be able to interact with the way Hyrax stores data in Fedora/Solr. It will probably be difficult, and isn't actually part of the charter.

Nested resources?

The use case exists in Hyrax, and at least two institutions I know of use it (UCSB & CHF):

I have a record which has complex metadata as one of the properties - IE, a date range where it's important that the beginning and the end of the range are stored together.

Possible implementation:

  it "can save nested resources" do
    book = resource_class.new(title: "Sub-nested")
    book2 = resource_class.new(title: "Nested", nested_resource: book)
    book3 = persister.save(model: resource_class.new(nested_resource: book2))

    reloaded = query_service.find_by(id: book3.id)
    expect(reloaded.nested_resource.first.title).to eq ["Nested"]
    expect(reloaded.nested_resource.first.nested_resource.first.title).to eq ["Sub-nested"]
  end

Now, the problem: Getting that test to pass with the postgres & memory adapters took about 20 LOC. Both natively support the concept of nesting and the abstractions are already written and debugged. However, for the other two adapters:

ActiveFedora: There's no interface for "here's a nested resource, build out the hash URIs and handle this for me please." I can't imagine how to write one, either. I could see this working out with something lower level, IE a Fedora persister which directly integrates with LDP, but I don't think that's an option ATM. Maybe the solution here is to reach out to the institutions that have implemented this and see what they've done, so we can at least have a compatibility layer.

Solr: There is no such thing as nesting. You can add "child documents", but they're indexed independently, require an ID, and don't have the same lifespan as their parents (https://issues.apache.org/jira/browse/SOLR-6096).

So I'm inclined to say we either:

Figure out how to get those two adapters to do nested objects.
Find a workaround - IE, recommend explicitly creating those nested objects as independent things and coming up with a good way for form objects to handle that.
Start pushing for the solutions which were easy (I don't think this is politically feasible)
Declare the experiment a failure because of the difficulty of abstraction of nested resource behavior.

Add support for DateTime data type

schema_migrations_table_name is deprecated

DEPRECATION WARNING: schema_migrations_table_name is deprecated and will be removed from Rails 5.2 (called from block (2 levels) in <top (required)> at /Users/jcoyne/workspace/valkyrie/spec/support/database_cleaner.rb:4)
