chnm / datascribe-module
An Omeka S module for the transcription of structured data.
License: GNU General Public License v3.0
We need to find a place to display the guidelines for transcribing a dataset. Logically, they belong somewhere on the record add/edit form, but I'm not sure how we display them. What makes the most sense for a transcriber? A separate tab? A modal window? Always available at the bottom of the page?
Currently, it's possible for me (as admin, and presumably other users as well) to start trying to add a record when there isn't a form specified yet. Probably the simplest fix is just to add an alert message when there is no form, possibly with a link to the form builder for the users who have permission to access the form builder?
When I was trying to build my test form (which is long... there are over a hundred parishes in each bill of mortality) I kept wanting to save my progress, which booted me back to the main screen where I had to click several times to get back to the form builder. This is especially important since I need to save before the name field displays, which will be important to keep my place while I'm adding fields once I switch over to using actual parish names.
I've applied some changes to the dashboard on the dashboard branch:
Pulled browse/add projects actions out as buttons: I found we were repeating these actions in the top bar and in the main content area. There are only two of them, and both have short strings.
Included recently modified projects: I created a simplified listing of the 10 most recently modified projects. It links to the project and shows when it was last modified.
Re-organized filters within My Projects: I tried to organize the filters to be more scannable at a glance while also reducing unique strings.
I would appreciate folks trying out these new styles, especially those who have many projects within their installation.
Right now, we seem to be optimized for horizontal images, and vertical images aren't displaying very well. Even when I minimize the sidebars to maximize the view, the image is way too small to transcribe without zooming in, which then requires a lot of mousework to fiddle with and readjust the placement as I transcribe.
Currently dataset names must be unique across all projects. Since users in different projects are likely to create datasets with identical names, names should only need to be unique within an individual project.
For example, let's say I have two projects: "Project One" and "Project Two". "Project One" already has a dataset named "My Dataset". When I try to create a "My Dataset" in "Project Two" I get an error: "The name 'My Dataset' is already taken."
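A minimal sketch of the project-scoped check described above. It assumes datasets are plain objects with hypothetical `projectId` and `name` fields; it illustrates the rule only and is not the module's actual validation code.

```javascript
// Hypothetical in-memory datasets; the field names are assumptions for illustration.
const datasets = [
    { projectId: 1, name: 'My Dataset' },
];

// A name counts as taken only if a dataset in the SAME project already uses it.
function isNameTaken(existing, projectId, name) {
    return existing.some((d) => d.projectId === projectId && d.name === name);
}

console.log(isNameTaken(datasets, 1, 'My Dataset')); // true: duplicate within Project One
console.log(isNameTaken(datasets, 2, 'My Dataset')); // false: Project Two may reuse the name
```

In a database this would typically correspond to a uniqueness constraint over the pair (project, name) rather than over the name alone.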
The Checkbox data type is an outlier in that it's the only data type that does not have a NULL state. (All DataScribe data types should account for NULL values, as is typical for spreadsheet applications.) I've made several attempts to apply a NULL state, but it's proven to be awkward when using a checkbox (a true boolean form element).
We should be able to add MultiRadio and MultiCheckbox data types that'll fulfill every use of Checkbox plus many more uses. We just have to add a mandatory NULL option for each to account for a NULL state.
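As a rough illustration of the mandatory NULL option, here is a sketch that builds the option list for a radio group. The `NULL_OPTION` label and the `buildRadioOptions` helper are hypothetical names, not part of DataScribe.

```javascript
// Sentinel option representing "no value"; the label text is an assumption.
const NULL_OPTION = { label: '[No value]', value: null };

// Prepend the mandatory NULL option to the options the form builder defined.
function buildRadioOptions(labels) {
    return [NULL_OPTION, ...labels.map((label) => ({ label, value: label }))];
}

const options = buildRadioOptions(['Yes', 'No']);
console.log(options.map((o) => o.label)); // [ '[No value]', 'Yes', 'No' ]
```

A "Yes"/"No" radio group built this way covers what a Checkbox does (true/false) while still letting the transcriber leave the value unset.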
I get the following error when I try to save the form. I've tried uninstalling and re-installing the module.
Doctrine\DBAL\DBALException
Unknown column type "json" requested. Any Doctrine type that you use has to be registered with \Doctrine\DBAL\Types\Type::addType(). You can get a list of all the known types with \Doctrine\DBAL\Types\Type::getTypesMap(). If this error occurs during database introspection then you might have forgot to register all database types for a Doctrine Type. Use AbstractPlatform#registerDoctrineTypeMapping() or have your custom types implement Type#getMappedDatabaseTypes(). If the type name is empty you might have a problem with the cache or forgot some mapping information.
Details:
Doctrine\DBAL\DBALException: Unknown column type "json" requested. Any Doctrine type that you use has to be registered with \Doctrine\DBAL\Types\Type::addType(). You can get a list of all the known types with \Doctrine\DBAL\Types\Type::getTypesMap(). If this error occurs during database introspection then you might have forgot to register all database types for a Doctrine Type. Use AbstractPlatform#registerDoctrineTypeMapping() or have your custom types implement Type#getMappedDatabaseTypes(). If the type name is empty you might have a problem with the cache or forgot some mapping information. in /Users/kim/Sites/omeka-s/vendor/doctrine/dbal/lib/Doctrine/DBAL/DBALException.php:240
Stack trace:
#0 /Users/kim/Sites/omeka-s/vendor/doctrine/dbal/lib/Doctrine/DBAL/Types/Type.php(172): Doctrine\DBAL\DBALException::unknownColumnType('json')
#1 /Users/kim/Sites/omeka-s/vendor/doctrine/dbal/lib/Doctrine/DBAL/Statement.php(111): Doctrine\DBAL\Types\Type::getType('json')
#2 /Users/kim/Sites/omeka-s/vendor/doctrine/orm/lib/Doctrine/ORM/Persisters/Entity/BasicEntityPersister.php(277): Doctrine\DBAL\Statement->bindValue(6, Array, 'json')
#3 /Users/kim/Sites/omeka-s/vendor/doctrine/orm/lib/Doctrine/ORM/UnitOfWork.php(1014): Doctrine\ORM\Persisters\Entity\BasicEntityPersister->executeInserts()
#4 /Users/kim/Sites/omeka-s/vendor/doctrine/orm/lib/Doctrine/ORM/UnitOfWork.php(378): Doctrine\ORM\UnitOfWork->executeInserts(Object(Doctrine\ORM\Mapping\ClassMetadata))
#5 /Users/kim/Sites/omeka-s/vendor/doctrine/orm/lib/Doctrine/ORM/EntityManager.php(356): Doctrine\ORM\UnitOfWork->commit(NULL)
#6 /Users/kim/Sites/omeka-s/application/src/Api/Adapter/AbstractEntityAdapter.php(369): Doctrine\ORM\EntityManager->flush()
#7 /Users/kim/Sites/omeka-s/application/src/Api/Manager.php(233): Omeka\Api\Adapter\AbstractEntityAdapter->update(Object(Omeka\Api\Request))
#8 /Users/kim/Sites/omeka-s/application/src/Api/Manager.php(136): Omeka\Api\Manager->execute(Object(Omeka\Api\Request))
#9 /Users/kim/Sites/omeka-s/application/src/Mvc/Controller/Plugin/Api.php(152): Omeka\Api\Manager->update('datascribe_data...', '1', Array, Array, Array)
#10 /Users/kim/Sites/omeka-s/modules/Datascribe/src/Controller/Admin/DatasetController.php(66): Omeka\Mvc\Controller\Plugin\Api->update('datascribe_data...', '1', Array)
#11 /Users/kim/Sites/omeka-s/vendor/zendframework/zend-mvc/src/Controller/AbstractActionController.php(78): Datascribe\Controller\Admin\DatasetController->editAction()
#12 /Users/kim/Sites/omeka-s/vendor/zendframework/zend-eventmanager/src/EventManager.php(322): Zend\Mvc\Controller\AbstractActionController->onDispatch(Object(Zend\Mvc\MvcEvent))
#13 /Users/kim/Sites/omeka-s/vendor/zendframework/zend-eventmanager/src/EventManager.php(179): Zend\EventManager\EventManager->triggerListeners(Object(Zend\Mvc\MvcEvent), Object(Closure))
#14 /Users/kim/Sites/omeka-s/vendor/zendframework/zend-mvc/src/Controller/AbstractController.php(106): Zend\EventManager\EventManager->triggerEventUntil(Object(Closure), Object(Zend\Mvc\MvcEvent))
#15 /Users/kim/Sites/omeka-s/vendor/zendframework/zend-mvc/src/DispatchListener.php(138): Zend\Mvc\Controller\AbstractController->dispatch(Object(Zend\Http\PhpEnvironment\Request), Object(Zend\Http\PhpEnvironment\Response))
#16 /Users/kim/Sites/omeka-s/vendor/zendframework/zend-eventmanager/src/EventManager.php(322): Zend\Mvc\DispatchListener->onDispatch(Object(Zend\Mvc\MvcEvent))
#17 /Users/kim/Sites/omeka-s/vendor/zendframework/zend-eventmanager/src/EventManager.php(179): Zend\EventManager\EventManager->triggerListeners(Object(Zend\Mvc\MvcEvent), Object(Closure))
#18 /Users/kim/Sites/omeka-s/vendor/zendframework/zend-mvc/src/Application.php(332): Zend\EventManager\EventManager->triggerEventUntil(Object(Closure), Object(Zend\Mvc\MvcEvent))
#19 /Users/kim/Sites/omeka-s/index.php(21): Zend\Mvc\Application->run()
#20 {main}
Users need a way to seamlessly browse through an item's media on the record add, edit, and show pages. On the initial load, this component should render pagination and load the first media in the viewer. The viewer should use existing file renderers to generate markup for each page, except that images should include pan/zoom options. It will need to account for possibly dozens or hundreds of media per item, so it cannot load every media at once. This means it will need a way to fetch individual media pages asynchronously. It should also be modular, meaning it should be rendered on any page that has an item using a view helper. This helper can be provided by the DataScribe module or a separate module altogether (though we need to discuss the pros/cons of adding a dependency).
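To make the "fetch individual media pages asynchronously" requirement concrete, here is a small sketch of a pager that loads one page on demand and caches it. `createMediaPager` and `fetchPage` are hypothetical names; in a browser, `fetchPage` might wrap `fetch()` against whatever endpoint ends up rendering a single media, which is not defined here.

```javascript
// Loads one media page at a time and remembers pages that were already fetched.
// `fetchPage` is a hypothetical async function: page number -> rendered markup.
function createMediaPager(totalPages, fetchPage) {
    const cache = new Map();
    return {
        totalPages,
        async load(page) {
            if (page < 1 || page > totalPages) {
                throw new RangeError(`Page ${page} is out of range`);
            }
            if (!cache.has(page)) {
                cache.set(page, await fetchPage(page));
            }
            return cache.get(page);
        },
    };
}

// Stub fetcher so the sketch runs without a server.
const pager = createMediaPager(200, async (page) => `<div>media ${page}</div>`);

// Initial load: render pagination from pager.totalPages, then show the first media.
pager.load(1).then((markup) => console.log(markup));
```

Only the requested page is ever fetched, so an item with hundreds of media costs one request per page the user actually visits.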
The media-viewer-save-state branch is an attempt to resolve #37 by making the "Save progress" and "Add+" workflows more seamless. The idea here is to preserve the state of the media viewer and page layout after submitting the page. I think the most jarring part of the current workflow is that the user has to reestablish the pan, zoom, rotate, focus mode, and layout each time. This should do it for them.
Test this in the record add and edit pages. After submitting the form using "Save progress" or "Add+", any adjustments to the media viewer, focus mode, or layout should be preserved on reload. Note that this will not preserve page scroll.
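One way such state could survive a page reload is to write it to the browser's `sessionStorage` before submitting and read it back on load. The sketch below assumes that approach; the branch may well do it differently, and the key name, state fields, and helper names are all hypothetical.

```javascript
const STATE_KEY = 'datascribe_viewer_state'; // hypothetical storage key

// Serialize the viewer/layout state before the form is submitted.
function saveViewerState(storage, state) {
    storage.setItem(STATE_KEY, JSON.stringify(state));
}

// On page load, merge any saved state over the defaults.
function restoreViewerState(storage, defaults) {
    const raw = storage.getItem(STATE_KEY);
    return raw === null ? defaults : { ...defaults, ...JSON.parse(raw) };
}

// Stand-in for window.sessionStorage so the sketch runs outside a browser.
function createMemoryStorage() {
    const items = new Map();
    return {
        getItem: (key) => (items.has(key) ? items.get(key) : null),
        setItem: (key, value) => { items.set(key, String(value)); },
    };
}

const storage = createMemoryStorage();
const defaults = { zoom: 1, rotation: 0, panX: 0, panY: 0, focusMode: false };
saveViewerState(storage, { zoom: 2.5, rotation: 90, focusMode: true });
console.log(restoreViewerState(storage, defaults));
```

Page scroll is deliberately absent from the state object, matching the note above that scroll position is not preserved.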
It's worth noting that the alternative solution (update page state) will be far more difficult to implement and maintain, and potentially more liable to result in unexpected behavior.
When I finish a Record and click Add+ while in Focus mode, it resets so I leave Focus mode to New Record. Focus mode suggests to me that I could stay in this setting and not see the new record lines until I leave so I don't lose my place or have to restart Focus mode for every record.
I'm not sure if this is the parameters of the project setup or not. @jmotis this might be related to setting number of officers to "is missing" per the transcription notes. Setting any of the required fields to null makes the data invalid and does not save that I checked the box. "Is missing" and "is illegible" are saved and marked null in the record.
I'm guessing setting any of the numeric fields (Number of Officers) to have blank data as missing means that fields (Shillings/Pence) left blank to indicate 0 are following the same valid/invalid input rules.
Hello, I would like to use this module - but how do I get started?
Downloading? Importing to an already existing module?
Thank you!
The form builder (i.e. the dataset edit form) has some peculiarities that may-or-may-not need to be addressed as we move forward:
Currently we rely on browser and API validation to avoid these issues, which is probably good enough for now. In any case, these problems are not uncommon for dynamic forms, and watertight validation is arguably not as important here as it would be in, say, the record form.
Even so, the most viable solution is probably to submit the form data in a background request and handle any validation errors that the server returns. This will save the form's state, but we will lose the form's ability to embed error messages into the form markup. Here's a very simplified proof of concept:
On the client:
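$('#datasetform').on('submit', function(e) {
    // Stop the normal submission, which would reload the page.
    e.preventDefault();
    fetch('path-to-dataset-edit', {
        method: 'post',
        mode: 'cors',
        body: new FormData(this)
    })
    .then((response) => {
        if (response.ok) {
            // redirect to dataset show
            window.location.replace('path-to-dataset-show');
            // A successful response has no body to parse.
            return null;
        }
        // Validation failed: the server sends the errors as JSON.
        return response.json();
    })
    .then((errors) => {
        if (errors) {
            // handle error messages
        }
    });
});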
On the server:
if ($this->getRequest()->isPost()) {
    $postData = $this->params()->fromPost();
    $form->setData($postData);
    $response = $this->getResponse();
    if ($form->isValid()) {
        $postData['o:item_set'] = ['o:id' => $postData['o:item_set']];
        try {
            $this->api(null, true)->update('datascribe_datasets', $this->params('dataset-id'), $postData);
            $response->setStatusCode(200); // OK
        } catch (\Omeka\Api\Exception\ValidationException $e) {
            $errorStore = $e->getErrorStore();
            $response->setStatusCode(422); // Unprocessable Entity
            $response->setContent(json_encode($errorStore->getErrors()));
        }
    } else {
        $response->setStatusCode(422); // Unprocessable Entity
        $response->setContent(json_encode($form->getMessages()));
    }
    return $response;
}
We don't have a plan for this page that I've forgotten about, do we? Because we have a lot of whitespace here we should probably fill, I'm thinking with a bit of grounding text, e.g. "Welcome to the DataScribe module. To get started, create a new project using the menu in the upper right-hand corner. For full module documentation, see ."
The required-fields branch introduces required fields. Form builders can now mark fields as required, which makes null values invalid. (Previously, all null values were valid.)
To test this feature, open the form builder, add a field, and check the "Field is required" checkbox. Then go to add a record and submit an empty value for the required field. The value should be marked as invalid on the browse page and when you go to edit the record. Once you enter a non-null value, the value should not be marked as invalid anymore.
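The rule being tested can be sketched as follows. This assumes an empty submission is stored as `null` (or an empty string) and ignores any data-type-specific validation; `isValueValid` is a hypothetical helper, not the module's code.

```javascript
// A null (empty) value is invalid only when the field is marked as required.
// Treating '' the same as null is an assumption made for this sketch.
function isValueValid(value, isRequired) {
    const isNull = value === null || value === '';
    return !(isRequired && isNull);
}

console.log(isValueValid(null, true));  // false: required field left empty
console.log(isValueValid(null, false)); // true: null is still valid when not required
console.log(isValueValid('12', true));  // true: required field has a value
```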
Note that you'll need to uninstall and reinstall the module after pulling the branch since the data model changed.
To the extent possible, I need you to clean up the item-show branch. I will begin implementing the record wireframes therein and need to start from a prepared state. This includes removing cruft (unused CSS and JS), cleaning up existing markup, and ensuring that we're using design patterns consistent with Omeka's administrative interface. This does not mean I want a definitive version of the interface. I just want some assurance that the wireframes are prepared for me to copy over to the master branch.
The record-position branch introduces a way for users to adjust the position of records in relation to each other. Now, when adding or editing a record, a reviewer or locked transcriber can select to insert the record before or after another record. The default sorting for the record browse page is now by position rather than by ID.
This is a powerful feature, but also a potentially fragile one, given the precise adjustments that need to be made to retain consecutive positions. So we need to test this thoroughly, with close attention to detail. Make position changes to your records using both "Insert before" and "Insert after". Move records from low positions to higher positions and from high positions to lower positions. If you find errors, it will be very hard to troubleshoot unless you make notes about what happened before the error.
One important caveat: the upgrade will make changes to your database that can't easily be downgraded. This shouldn't be a problem if you're okay uninstalling the module when switching between the branch version (0.2.0) and the master version (0.1.0). If you already have a bunch of records you're using for testing on master, and you'd prefer not to lose them, I recommend you use a separate Omeka S installation to test this feature.
(This addresses #36)
When I attempted to add records in Safari, using focus mode, I could not find any save or save progress buttons and had to leave focus mode to finish adding the record or start adding a new one.
The export branch introduces the export dataset feature. You'll need to uninstall and reinstall the module for this to work. To export a dataset, go to the dataset show / item browse page and click on the "Export dataset" action in the top right. Once this is done, you should see a link to the CSV file at the bottom of the dataset sidebar. Note that
To facilitate data entry and review, a record should have adjustable position relative to other records.
For example, let's say an item already has two records and a user needs to insert another record between those two. Currently there is no way to do this - the user would have to add the record at the end of the list. With adjustable record position, the user would be able to insert the record immediately after the first record (or immediately before the second).
This could be as simple as providing an optional "Position" input on the record add/edit page, which would accept position 1-n, where n is the total number of records. Once submitted, the module would update the position of each record according to the position of the created/updated record.
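The position update described above can be sketched on a plain ordered list of record IDs, where a record's position is its index plus one. `setRecordPosition` is a hypothetical helper for illustration; a real implementation would persist the resulting positions.

```javascript
// Moves (or inserts) a record ID to a 1-based position and returns the new order.
// Every other record keeps its relative order, so positions stay consecutive.
function setRecordPosition(orderedIds, recordId, position) {
    const remaining = orderedIds.filter((id) => id !== recordId);
    // Clamp to the valid range 1..n, where n includes the record being placed.
    const clamped = Math.min(Math.max(position, 1), remaining.length + 1);
    remaining.splice(clamped - 1, 0, recordId);
    return remaining;
}

// Insert new record 12 between existing records 10 and 11.
console.log(setRecordPosition([10, 11], 12, 2)); // [ 10, 12, 11 ]
// Move record 10 from the first position to the last.
console.log(setRecordPosition([10, 12, 11], 10, 3)); // [ 12, 11, 10 ]
```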
I created a record, saved my progress, and then came back to it. I can't then scroll the form, though I can scroll the right hand sidebar. If I resize the browser window to make it taller, I can see more of the form below.
Here's the item in question on the testing site. Screenshot:
I've done this a few times now where I type in a number, but because my cursor is still over the Number box and the box is still selected, it auto scrolls for me when I try to scroll down the page. Since it's just a small mouse movement to take the cursor out of the box, I've often moved the mouse enough so that the numbers scroll to 1 or 2 off my typed entry and I've moved down the page without noticing.
This feature is helpful for preset options like years in the Account Books and Bills of Mortality, but for other sections like Number of officers, it causes problems with ensuring the data is accurately counted. Is this just dependent on the project setup, or on how the Number input functions?
The record-batch-edit branch introduces the ability for reviewers to batch edit record values. On the record browse page, the "Edit selected" and "Edit all" batch actions should take you to a page that now renders a record form that's modified for bulk data entry. You should be able to bulk-mark values as missing and illegible; and you should be able to bulk-edit the values themselves, using the same form elements used in the normal record add/edit pages. The one necessary difference is that you have to flag intent to edit a particular value by checking "Edit value?"
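The role of the "Edit value?" flag can be illustrated with a short sketch: only flagged fields are written to the selected records, and everything else is left alone. The data shapes and the `applyBatchEdit` helper are assumptions for illustration.

```javascript
// Applies a batch edit to every record, touching only fields flagged for editing.
// `edits` maps a field name to { edit: boolean, value: any } (hypothetical shape).
function applyBatchEdit(records, edits) {
    return records.map((record) => {
        const updated = { ...record };
        for (const [field, change] of Object.entries(edits)) {
            if (change.edit) {
                updated[field] = change.value;
            }
        }
        return updated;
    });
}

const records = [
    { year: '1665', parish: 'St Giles' },
    { year: '1666', parish: 'St Olave' },
];
const edits = {
    year: { edit: true, value: '1665' },
    parish: { edit: false, value: '' }, // unflagged, so the empty input is ignored
};
console.log(applyBatchEdit(records, edits));
```

Without the flag there would be no way to tell "leave this value as it is" apart from "set this value to empty" on a form that starts out blank.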
The plain-text branch makes some far-reaching changes to the way DataScribe stores values. Please test each data type (including the new Date and Time data types). There should be no change from previous behavior.
In the record-form branch, I've styled the record form's common element checkboxes ("is missing", "is illegible", "set to null") to be more vertically compact and better represent the "set to null" option, currently labeled as "reset value". Other notes:
How do we feel about these changes?
When I have the browser (Edge and Chrome) in half screen, I am missing columns and the last one is cut off. There are cutoffs when not on the last page of the columns too. Here it should end with columns Pounds (L), Shillings (s), Pence (d). However, this doesn't happen the first time I use the arrows to go to the last columns. It happens after I go to the end, navigate back to the left columns, and try to return to the end of the line.
Maybe it's related to the cell overflow issues referred to in #46
tl;dr datascribe is trying to display html media from the Omeka item - it doesn't work.
I brought in items using the Tropy export to Omeka S, which sends along any notes you make as an HTML media item. I just looked at one of these in Datascribe, and it's treating the html media as a page to load in the dropdown/pagination on the Add Record page.
If I page to the html media, I get a broken image symbol.
It seems to me there ought to be a way to stop it from trying to load non-image media in the first place?
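One possible approach, sketched below, is to filter an item's media by MIME type before building the pagination. The `mediaType` field is a hypothetical stand-in for however the media type is exposed, and the assumption that HTML media carry no image type is mine. Since the viewer is meant to reuse existing file renderers, an actual fix might instead keep non-image media and render them properly.

```javascript
// True only for media whose MIME type marks them as images.
function isImageMedia(media) {
    return typeof media.mediaType === 'string' && media.mediaType.startsWith('image/');
}

// Hypothetical media list for an item imported from Tropy.
const itemMedia = [
    { id: 1, mediaType: 'image/jpeg' },
    { id: 2, mediaType: null },        // e.g. an HTML note with no file type
    { id: 3, mediaType: 'image/tiff' },
];

const pages = itemMedia.filter(isImageMedia);
console.log(pages.map((m) => m.id)); // [ 1, 3 ]
```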
As far as I can tell, if I skip or miss a line in the item, I can only add it as the newest record, whenever I get around to correcting the mistake. As far as creating data, this doesn't seem like a problem since the data will still be created, but will this cause challenges on the review end? Does an 'insert record' option, like an insert row in Excel, make sense for transcribers to have?
The "Add +" and "Save progress" buttons should save the record asynchronously, without reloading the page. This is primarily so the media viewer does not lose state, causing users to lose their place when adding another record or saving progress on the current one.
This shouldn't be too difficult for "Save progress" since the state of the surrounding page doesn't change, but it will be quite tricky to implement for "Add +" because the surrounding page's state will have to change, including the record form itself, the sidebar inputs, and the "Before" table (and the "After" table should we implement record position).
The magnifying glass buttons work, but I did not find them helpful to use based on how it functioned. Here is an example where I zoomed in on the corner and then when I zoomed out it brought me into the white void. Zooming in/out and navigating the image with the mouse was more controlled. I'm not sure if you would be able to limit the zoom abilities to within the confines of the item's image, but it would make the zoom buttons more effective for the user.
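Keeping the view inside the image usually comes down to clamping the pan offset after every zoom or drag. The sketch below shows the arithmetic for one axis. It is independent of whichever pan/zoom library the viewer uses, and `clampOffset` and its parameters are hypothetical.

```javascript
// Clamps the image's edge offset (in viewport pixels) on one axis so the
// viewport never shows empty space beyond the image.
function clampOffset(offset, imageSize, viewportSize, scale) {
    const scaled = imageSize * scale;
    if (scaled <= viewportSize) {
        // The image fits entirely: center it instead of letting it drift.
        return (viewportSize - scaled) / 2;
    }
    const min = viewportSize - scaled; // far edge of image meets far edge of viewport
    const max = 0;                     // near edge of image meets near edge of viewport
    return Math.min(Math.max(offset, min), max);
}

// A 1000px-wide image at 2x zoom in an 800px viewport.
console.log(clampOffset(-5000, 1000, 800, 2)); // -1200: panned too far, pulled back
console.log(clampOffset(300, 1000, 800, 2));   // 0: would expose white space on the left
console.log(clampOffset(0, 1000, 800, 0.5));   // 150: zoomed out, so the image is centered
```

Applying the same function to the vertical axis after each zoom-out would stop the view from landing in the white void described above.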
To prepare for #37 I've consolidated the markup of the add record and edit record pages. Nothing should have changed visually or functionally, but the pages should still be thoroughly tested to confirm this.
The library we're currently using for panning and zooming media in the records views no longer uses jQuery, which has made troubleshooting issues like #33 cumbersome.
I was working on a test transcription where I did three records, then left, then came back and tried to add a new record. Right now, the preview of the three completed records takes up the majority of the screen space (and vanishes entirely if I have the screen a bit smaller) so that I have to scroll down to get to it. I'm assuming that's because I have a fairly long pair of text fields at the beginning of the record. I don't know if there's a way we can "cap" the percentage of screen space given to the prior records to make it adjustable by screen size?
Deleting a Record removes the number of the record in the listing. I.e. I created records 39, 40, and 41. Deleted 41. Added a new Record that was assigned 42.
I can see this as helpful for not messing up transcriber notes: if I delete Record 20, the later records aren't renumbered, so any references to them stay intact. However, my thought process is that if I need to find line 15, I can't assume that Record 15 corresponds to it.
The following actions have a cascading destructive effect, so they probably should come with sufficient warnings prior to their execution:
Currently, modules can add custom DataScribe data types using the datascribe_data_types configuration. What we need now is a real-world example of a data type that tests the limits of DataScribe's capabilities. Specifically, can data types accommodate dynamic content? Can they use external sources of data to populate, validate, and render content?