culturehack / data-tool
A collection of cultural data sets and sources & a website to browse them.
License: MIT License
Obviously it's possible/likely that the copy describing each bit of data will be a bit of a moveable feast, but looking at the site now, I'm not sure which bits might have been placeholder text put in by Frankie and which are ones put in by you. This one is a case in point: http://data.culturehack.org.uk/dataset/37251027-Pepys-Diary All for a jolly tone of voice, but not sure about the "It's great" and the typo.
Currently http://www.culturehack.org.uk/data resolves to http://culturehack.org.uk/2012/06/16/data-sets-by-type/
it should probably be changed to resolve to
http://data.culturehack.org.uk
Assigning to @jamesjefferies for whenever he has a moment; this is low priority.
Creative Commons licences, as mentioned on /about, are not really suitable for data; they are licences for content. See http://opendatacommons.org/licenses/ for some suitable data licences, and http://opendatacommons.org/faq/licenses/#Why_Not_Use_a_Creative_Commons_or_FreeOpen_Source_Software_License_for_Databases for the explanation on CC. Or OSM's move from CC to ODbL: http://www.osmfoundation.org/wiki/License/We_Are_Changing_The_License#Why_are_we_changing_the_license.3F
How many data set entries do we want / need?
Question for Rachel, really: would she prefer 50 really rich well described ones, or 200 less well described ones?
Caper strategy document mentions 200...
How do we cross reference between data tool and editorial on main CH site?
In order to pull WP posts through / evidence hacks, how do we best combine this with the data tool entry points?
Currently, our categories are (I think) defined in site.rb lines 6-14:
CATEGORIES = [
  'Art',
  'Literature',
  'Music',
  'Performance',
  'Fashion',
  'Media',
  'History'
]
They're also then listed out in _prose.yml, from line 18 onwards:
- name: "categories"
  field:
    element: "multiselect"
    label: "Categories"
    options:
      - name: "Art"
        value: "Art"
QUESTIONS:
To add a new category, or rename an existing category, is it just a case of editing them in those two places?
Can Categories take spaces? If so, do they need to be surrounded by quotes in site.rb and any of the source.md files?
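On the quoting question, a minimal sketch, assuming the source files carry YAML front matter with a categories list. The category names with spaces are made up for illustration. Ruby string literals in site.rb always need quotes, with or without spaces, whereas a plain YAML scalar may contain spaces, so quotes in the front matter are optional. This does not check whether the rest of the site (URLs, filters) copes with spaces.

```ruby
require 'yaml'

# Hypothetical categories; 'Visual Art' and 'Natural History' are not in the real list.
categories = [
  'Art',
  'Visual Art',       # Ruby strings are always quoted, spaces or not
  'Natural History'
]

# Hypothetical front matter for a source file.
front_matter = <<~YAML
  title: Example source
  categories:
    - Visual Art          # unquoted: spaces are fine in a plain YAML scalar
    - "Natural History"   # quoted: parses to the same kind of string
YAML

parsed  = YAML.load(front_matter)
unknown = parsed['categories'] - categories

puts parsed['categories'].inspect
puts "Categories missing from the list: #{unknown.inspect}"
```

A check like the last two lines would also catch a category that was renamed in one place but not the other.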
Is it possible to make the licensing info a click through to something? I'm not sure what PD means, for instance, on this page http://data.culturehack.org.uk/dataset/37251027-Pepys-Diary and there's no way of finding out.
I forgot to add any analytics tracking to the site.
Probably best to use the same tracking account as the main Culture Hack site, I’d guess? (that way journeys between the two bits of the site could be tracked).
Can you take this out of the footer and add it to the About page pls?
Initial work done, needs proper data adding.
the html file sent
Need to copy-paste the actual page contents!
See the SOCH merge notes.
we're looking for a post-doc researcher to work with
In one file there is a media: data pair in the YAML frontmatter (it also appears as media: text in 37251018-British-Museum-object-catalog.md)
media: doesn't seem to be defined in _prose.yml
Q: What was media:? What were we going to do with it? Did we define a list of options?
ALSO
Am I correct in saying that YAML is flexible and doesn't mind if you add additional values in there? So we could just make up fields on the fly?
This would be pretty useful.
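As a rough illustration of the point about made-up fields: a YAML parser hands back whatever keys are in the file, so an extra field such as media: parses without complaint. It only has an effect if the site code actually reads it. In this sketch only media comes from the issue above; the other field names and the list of known fields are assumptions.

```ruby
require 'yaml'

# Hypothetical front matter; only `media` is taken from the issue text.
front_matter = <<~YAML
  title: Example source
  media: data
  made_up_field: anything
YAML

# Assumption: the fields the site already knows how to display.
known_fields = %w[title categories]

fields = YAML.load(front_matter)
extra  = fields.keys - known_fields

puts "Parsed keys: #{fields.keys.inspect}"
puts "Parsed but not used by the site: #{extra.inspect}"
```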
Currently unsure on approach.
Could import all the entries into postgres upon launch and use the postgres full text search feature. Has the advantage of built-in features like stemming and spelling correction. Disadvantages: another dependency, makes site more complicated to install, etc.
Alternatively, could implement some basic in-memory text searching. Wouldn't be too tricky to simply return matching results, but wouldn't be as sophisticated.
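The in-memory option could be as small as the sketch below. It assumes each entry is a hash with a title and a description; those field names and the sample records are hypothetical. It does plain case-insensitive substring matching, so there is no stemming or spelling correction.

```ruby
# Hypothetical records standing in for the entries loaded at startup.
datasets = [
  { title: "Pepys' Diary", description: 'Example description of a diary' },
  { title: 'British Museum object catalog', description: 'Example description of a collection' }
]

# Return every dataset whose title or description contains all the query words.
def search(datasets, query)
  words = query.downcase.split
  return [] if words.empty?

  datasets.select do |d|
    haystack = "#{d[:title]} #{d[:description]}".downcase
    words.all? { |w| haystack.include?(w) }
  end
end

puts search(datasets, 'museum catalog').map { |d| d[:title] }.inspect
```

Requiring every word to match keeps results tight; switching `all?` to `any?` would make the search more forgiving.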
It'd be useful if you could include links to sample data files (eg CSV, JSON) on the dataset pages.
Could be hosted externally, or within the project.
schedule a meeting with the four of us soon (and James, if you're planning to be in London at any point?) as would be useful to discuss some of these as we consider next steps/develop the strategy.
Rachel needs to tell KP and FR what the requirements for the TSB report are
Are there any existing methods of describing data / datasets etc with such a simple YAML format? Can we point to anything?
Related - how do we then integrate with other data sources in future - CKAN interoperability?
Would be useful to have this on the dataset pages...
https://twitter.com/GuWa/status/400647398302547969
Not really the right place to put this, but it's pretty cool
Manually creating a text file is a bit of a PITA
We know that files will follow a standard template
The process would be roughly
Is this possible to script? It would make sense.
It might be possible to create a new file in github via the API...
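A local script is one way to do it. The sketch below builds a filename and a templated body for a new source. It assumes the `<number>-<Title-With-Hyphens>.md` naming seen in the existing dataset URLs; the numeric prefix is passed in because how it is chosen is not stated here, and the front matter fields are placeholders rather than the real template.

```ruby
require 'yaml'

# Turn "Pepys' Diary" into "Pepys-Diary" for use in a filename.
def slug(title)
  title.strip.gsub(/[^A-Za-z0-9]+/, '-').gsub(/\A-|-\z/, '')
end

# Build the filename and file body for a new source.
# The fields in the hash are placeholders for whatever the standard template needs.
def new_source(id, title, categories)
  front_matter = { 'title' => title, 'categories' => categories }
  body = front_matter.to_yaml + "---\n\nDescription goes here.\n"
  ["#{id}-#{slug(title)}.md", body]
end

filename, body = new_source(12345678, 'Example Source', ['Art'])
puts filename
puts body
```

Writing the result out is then a single `File.write(filename, body)`; creating the file through the GitHub API instead would replace only that last step.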
Would be easy to add. Not sure how much of a priority this is though.
The filtering is a bit confusing; my instinct is always to click through the top category list, and each time I find it a bit weird that more categories, rather than different categories, are being displayed. Could you change this to toggle through the list please, rather than add each category to the display? And then keep the small/medium/large as a filter on each category.
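One way to read the request, sketched under assumptions: a single selected category at a time (clicking another replaces it), with small/medium/large applied as an additional filter within that category. The record shape and sample entries are hypothetical.

```ruby
# Hypothetical records; `size` uses the small/medium/large values mentioned above.
datasets = [
  { title: 'A', categories: ['Art'],            size: 'small'  },
  { title: 'B', categories: ['Art', 'History'], size: 'large'  },
  { title: 'C', categories: ['Music'],          size: 'medium' }
]

# One category at a time (nil means all), narrowed by any number of sizes.
def filter(datasets, category: nil, sizes: [])
  datasets.select do |d|
    (category.nil? || d[:categories].include?(category)) &&
      (sizes.empty? || sizes.include?(d[:size]))
  end
end

results = filter(datasets, category: 'Art', sizes: %w[small medium])
puts results.empty? ? 'No matching data sources yet.' : results.map { |d| d[:title] }.inspect
```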
Ta
Hello
Travis is helpfully telling me the most recent builds are failing, but not giving me tremendously useful feedback about why.
Build #62 was broken. 31 seconds
Kim Plowright 2140777 Changeset →
Add DigitalNZ, edit other Sources
Cooper Hewiitt and Open Library with more data. Digital NZ apis added
https://travis-ci.org/culturehack/data-tool/builds/13781983
Build #63 is still failing. 33 seconds
Kim Plowright 08a08e5 Changeset →
correct empty yaml value
Attempting to fix the Travis Error being thrown. Unclear *which* file is making it barf, other than that it's a line 19. This doc has an empty value at line 19 and is in the commit that made it barf. perhaps this is the culprit. (NB build is not failing for me locally)
https://travis-ci.org/culturehack/data-tool/builds/13800862
Seems to be a parse error in site.rb -
`parse': (): found unexpected document indicator while scanning a quoted scalar at line 19 column 22 (Psych::SyntaxError)
Other possibility - the number in the name of the test file, 9999999999-whatereveritwas, is what's causing the problem.
Any ideas? LMK what the solution is so I can fix for myself in future too!
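To find which file is at fault, a small checker can parse each file's front matter on its own and report the ones that fail. This is a sketch, not how site.rb loads files: the sources/ directory and the assumption that front matter sits between the first two `---` lines are both guesses. An "unexpected document indicator while scanning a quoted scalar" error usually points to a quote that was opened and never closed before the next `---`.

```ruby
require 'yaml'

# Return [path, error message] pairs for files whose front matter fails to parse.
def broken_front_matter(paths)
  paths.each_with_object([]) do |path, failures|
    text = File.read(path)
    # Assumption: front matter sits between the first two `---` lines.
    front = text[/\A---\s*\n(.*?)^---\s*$/m, 1]
    next if front.nil?   # no recognisable front matter, skip the file

    begin
      YAML.parse(front)  # syntax check only; does not build Ruby objects
    rescue Psych::SyntaxError => e
      failures << [path, e.message]
    end
  end
end

# Assumption: source files live under sources/ as .md files.
broken_front_matter(Dir.glob('sources/**/*.md')).each do |path, message|
  puts "#{path}: #{message}"
end
```

Running this locally before pushing would name the file and line, which the Travis output above does not.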
Should add some cache headers so that the pages can be stored in public proxy caches, for even speedier loading.
Suggest expiry of 10 mins?
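If the site runs as a Rack app (an assumption; the framework is not named here), the header could be added by a small middleware like the sketch below. The class name is made up, and 600 seconds corresponds to the suggested 10 minutes.

```ruby
# Minimal Rack-style middleware: marks successful responses as publicly cacheable.
class PublicCache
  def initialize(app, max_age: 600) # 600 seconds = 10 minutes
    @app = app
    @max_age = max_age
  end

  def call(env)
    status, headers, body = @app.call(env)
    # Only cache successful responses; errors pass through untouched.
    headers = headers.merge('cache-control' => "public, max-age=#{@max_age}") if status == 200
    [status, headers, body]
  end
end

# Stand-in app for demonstration; the real site would sit here instead.
demo_app = ->(_env) { [200, { 'content-type' => 'text/html' }, ['<p>hello</p>']] }
_status, headers, _body = PublicCache.new(demo_app).call({})
puts headers['cache-control']
```

In a Rack setup this would typically be enabled with `use PublicCache` in config.ru.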
Need to write some code that takes the data sources and creates webpages for each of them.
In the copy box at the top, it would be useful to indicate how many bits of data are in the database at any one time. How much of a faff would it be to automate this?
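Since the entries are already held in memory, the count is just the size of that list. A minimal sketch, assuming the intro copy is rendered from a template (ERB is used here for illustration; the sample list and wording are placeholders):

```ruby
require 'erb'

# Hypothetical list standing in for the sources loaded at startup.
sources = [{ title: 'Example A' }, { title: 'Example B' }, { title: 'Example C' }]

intro = ERB.new('Search or filter our list of <%= count %> data sources.')
copy  = intro.result_with_hash(count: sources.size)
puts copy
```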
Obviously I don't expect every category to be populated, but filtering "art" by "small" and "medium" returns 0 entries, which might be seen to look a bit bad at launch, as it's the first set of filters. Is it possible to pop something in here for cosmetic purposes pls?!
Would it be a nicer UX thing if the "visit site" opened in a new window, or would that just be annoying in a different way?
This would be nice. Not sure how necessary it really is though, as the site is pretty speedy already (given that there are no database calls involved, and all the data is in memory).
Caper: check UX sketches and feed back
In the UX folder of Dropbox - UX and IA ideas.pdf
@we-are-caper Rachel - you were looking at the intro copy on the main page. Can you confirm what wording you'd like here?
Currently:
Explore open data about arts and culture, and the creative things people have done with it. Find out more →
Previous version you were checking with Katy:
Culture Hack Data is a simple way to explore open data about arts and culture, and the creative things people do with it. To get started, search or filter our list of data sources using the categories to the left.
Find Out More →
Suggest we can reword that slightly
Culture Hack Data is a simple way to explore open data about arts and culture, and the creative things people do with it. Search or filter our list of XX data sources, or contribute a new entry
Find Out More →