Comments (9)
You can't install individual plugins, but this extension is basically centred around the DCAT harvester so you want to install the whole lot anyway.
I've added some install instructions:
https://github.com/ckan/ckanext-dcat#install
from ckanext-dcat.
Thank you. I've installed it and configured a harvester for an RDF/XML source, but gather_consumer.log shows this error: ERROR [ckanext.harvest.queue] No harvester could be found for source type dcat_xml
It seems that the queue can't find the harvester for RDF_XML.
from ckanext-dcat.
Sorry, my fault. I didn't restart the supervisor...
Now the JSON harvester is working, but the XML one always crashes with this error in the fetch_consumer:
ValueError: The provided document does not seem to contain a dcat:Dataset element
I've also tried with the example files. Thanks in advance.
Full trace:
  File "/usr/lib/ckan/default/bin/paster", line 9, in <module>
    load_entry_point('PasteScript==1.7.5', 'console_scripts', 'paster')()
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/paste/script/command.py", line 104, in run
    invoke(command, command_name, options, args[1:])
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/paste/script/command.py", line 143, in invoke
    exit_code = runner.run(args)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/paste/script/command.py", line 238, in run
    result = self.command()
  File "/usr/lib/ckan/default/src/ckanext-harvest/ckanext/harvest/commands/harvester.py", line 127, in command
    fetch_callback(consumer, method, header, body)
  File "/usr/lib/ckan/default/src/ckanext-harvest/ckanext/harvest/queue.py", line 294, in fetch_callback
    fetch_and_import_stages(harvester, obj)
  File "/usr/lib/ckan/default/src/ckanext-harvest/ckanext/harvest/queue.py", line 311, in fetch_and_import_stages
    success_import = harvester.import_stage(obj)
  File "/usr/lib/ckan/default/src/ckanext-dcat/ckanext/dcat/harvesters.py", line 305, in import_stage
    package_dict, dcat_dict = self._get_package_dict(harvest_object)
  File "/usr/lib/ckan/default/src/ckanext-dcat/ckanext/dcat/harvesters.py", line 398, in _get_package_dict
    dcat_dict = dataset.read_values()
  File "/usr/lib/ckan/default/src/ckanext-dcat/ckanext/dcat/formats/xml.py", line 26, in read_values
    tree = self.get_xml_tree()
  File "/usr/lib/ckan/default/src/ckanext-dcat/ckanext/dcat/formats/xml.py", line 58, in get_xml_tree
    raise ValueError('The provided document does not seem to contain a {0} element'.format(self.base_class))
ValueError: The provided document does not seem to contain a dcat:Dataset element
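For anyone hitting the same error, a quick way to rule out the document itself is to check whether it contains a dcat:Dataset element at all. The following is only a minimal standalone sketch using the Python standard library; it is not how ckanext-dcat parses documents. The function name find_datasets and the sample document are hypothetical, and the check only recognises the typed-node form (<dcat:Dataset>), not datasets declared through rdf:Description plus rdf:type.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the DCAT vocabulary.
DCAT_NS = "http://www.w3.org/ns/dcat#"


def find_datasets(xml_text):
    """Return every dcat:Dataset element found in an RDF/XML string.

    Hypothetical helper for illustration, not part of ckanext-dcat.
    """
    root = ET.fromstring(xml_text)
    # iter() also yields the root itself, so a document whose top-level
    # element is dcat:Dataset is matched as well.
    return list(root.iter("{%s}Dataset" % DCAT_NS))


# Made-up sample document, only here to exercise the helper.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcat="http://www.w3.org/ns/dcat#"
         xmlns:dct="http://purl.org/dc/terms/">
  <dcat:Dataset rdf:about="http://example.com/dataset/1">
    <dct:title>Example dataset</dct:title>
  </dcat:Dataset>
</rdf:RDF>"""

datasets = find_datasets(SAMPLE)
if datasets:
    print("Found %d dcat:Dataset element(s)" % len(datasets))
else:
    print("No dcat:Dataset element found")
```

If a check like this finds datasets in a file and the harvester still raises the ValueError above, the problem is more likely in the harvester's parsing than in the document.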
from ckanext-dcat.
@montxo5 this looks like a bug in the XML parsing. I'll try and push the fix in the next couple of days
from ckanext-dcat.
@montxo5 can you see if the latest changes in d289c58 fix the issue?
from ckanext-dcat.
Thank you very much! Now it's working perfectly with your example.
I'm now trying with another DCAT file, from the Open Data Portal of Madrid. The import of datasets works fine, but the resources are ignored, so it only creates empty datasets.
The RDF is here: http://datos.madrid.es/egob/catalogo.rdf
Could it be that the RDF they publish is not correct?
Thanks.
from ckanext-dcat.
Hi @montxo5.
In DCAT land, the distributions are defined using the dcat:Distribution class. So for example, if you are using XML/RDF:
<dcat:distribution>
  <dcat:Distribution>
    <dct:title xml:lang="es">Consultas ciudadanas (2004-2013)</dct:title>
    <!-- ... -->
  </dcat:Distribution>
</dcat:distribution>
Note that the Madrid portal is using the dcat:Download class, which AFAICT does not exist:
<dcat:distribution>
  <dcat:Download>
    <dct:title xml:lang="es">Consultas ciudadanas (2004-2013)</dct:title>
    <!-- ... -->
  </dcat:Download>
</dcat:distribution>
We followed the recommendations of the DCAT Application Profile for Data Portals in Europe as the basis for our support for harvesting DCAT-based documents, in case you want to have a reference.
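To see which classes a catalogue actually uses for its distributions before harvesting it, something along these lines may help. This is a rough sketch with the Python standard library only; the function name distribution_classes is hypothetical, the sample document is made up to mirror the Madrid snippet, and it assumes the nested RDF/XML form where the class element sits directly inside dcat:distribution.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace URI of the DCAT vocabulary.
DCAT_NS = "http://www.w3.org/ns/dcat#"


def distribution_classes(xml_text):
    """Count the element types nested directly inside dcat:distribution.

    Hypothetical helper for illustration, not part of ckanext-dcat.
    """
    root = ET.fromstring(xml_text)
    counts = Counter()
    for prop in root.iter("{%s}distribution" % DCAT_NS):
        for node in prop:
            counts[node.tag] += 1
    return counts


# Made-up document that mimics the dcat:Download usage described above.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcat="http://www.w3.org/ns/dcat#"
         xmlns:dct="http://purl.org/dc/terms/">
  <dcat:Dataset rdf:about="http://example.com/dataset/1">
    <dcat:distribution>
      <dcat:Download>
        <dct:title xml:lang="es">Consultas ciudadanas (2004-2013)</dct:title>
      </dcat:Download>
    </dcat:distribution>
  </dcat:Dataset>
</rdf:RDF>"""

expected = "{%s}Distribution" % DCAT_NS
for tag, count in distribution_classes(SAMPLE).items():
    status = "ok" if tag == expected else "unexpected"
    print(tag, count, status)
```

Any tag reported as unexpected is a candidate explanation for datasets that are imported without resources.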
Also, check the examples folder of this extension to see the serializations supported.
Hope this helps.
from ckanext-dcat.
Thank you very much! You're right. I will try to contact Madrid's Open Data portal to explain it.
For your information, we're trying to use this extension for a BigOpenPlatform to use it in a datathon event with Madrid's city hall called MADdata. If you are interested, or if you know someone, please check this page: http://maddata.es/
If we finally use this extension, we will mention it in the presentation.
Thanks.
from ckanext-dcat.
That looks great @montxo5, hope it's a good one in Madrid!
Closing the issue now
from ckanext-dcat.