ccs-amsterdam / digitaltrackingworkshop18
Github page for the Digital Tracking Workshop in Amsterdam, 2018
To ensure the replicability of our results when researching tracking data, the data need to be stored securely for at least 5 years. I would be interested in best practices for data storage.
A major challenge in my experience is that, even with large samples, the most interesting behavior is very rare. That is, most people do not visit a given website on a given day, let alone read the same article. I'd like to discuss strategies to cope with and/or address this problem.
Anyone interested in continuing the conversation about Roxy 3.0 can join this mailing list: http://eepurl.com/dLHfXs
A presentation by Ericka Menchen-Trevino and Chris Karr
We will discuss the open-source tools we have already developed to incorporate web browsing history into social science methods (interviews, surveys, experiments), and sketch out our plans for a new tool that incorporates mobile data.
Maybe we should also discuss what kinds of (collaborative) funding opportunities there are, and whether it makes sense to work toward a special issue or something similar?
Notwithstanding our best intentions, there can be legal complications with collecting and storing digital tracking data. As such, it might be good to have a round table discussion, led by participants who have dealt with (or are dealing with) this issue.
I have already heard from some participants who have experience in this matter. Who would be willing to take the lead on this, and are there any points in particular that we really should discuss?
When working with the browser tracking data that Judith and colleagues collected, for a while I considered relying primarily on concepts (and code libraries/tools) from network analysis, such as igraph. For various reasons we didn't really do this, with the result that our code feels somewhat home-brewed to me (in essence, we turned logs of visited sites into "grams" such as "google.com -> facebook.com" and then worked with those).
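The "grams" idea mentioned above could be sketched roughly as follows. This is only an illustration, not the code we actually wrote: the log layout `(user_id, timestamp, domain)`, the sample rows, and the function name `transition_grams` are all assumptions made up for the example.

```python
from collections import Counter
from itertools import groupby
from operator import itemgetter

# Hypothetical log rows: (user_id, timestamp, domain). Illustrative data only.
LOG = [
    ("u1", 1, "google.com"),
    ("u1", 2, "facebook.com"),
    ("u1", 3, "google.com"),
    ("u1", 4, "facebook.com"),
    ("u2", 1, "google.com"),
    ("u2", 2, "nytimes.com"),
]


def transition_grams(log):
    """Count consecutive domain pairs ("grams") within each user's visits."""
    counts = Counter()
    # Sort by user, then by time, so each user's visits are contiguous and ordered.
    rows = sorted(log, key=itemgetter(0, 1))
    for _, visits in groupby(rows, key=itemgetter(0)):
        domains = [domain for _, _, domain in visits]
        # Pair every visit with the one that follows it for the same user.
        for src, dst in zip(domains, domains[1:]):
            counts[f"{src} -> {dst}"] += 1
    return counts


grams = transition_grams(LOG)
for gram, n in grams.most_common():
    print(gram, n)
```

Grams are counted per user, so the last visit of one user is never paired with the first visit of the next. A real pipeline would probably also need session boundaries (for example, a maximum time gap), which this sketch leaves out.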
I would be interested in whether people at the workshop are seriously considering relying on network analysis for studying browser clickstream data and, if not, what alternative strategies they have. In retrospect, what we did feels like reinventing the wheel, and I personally feel better relying on well-developed packages for some of the very generic data processing issues involved; on the other hand, graph analysis has its own caveats. I know that Sandra González-Bailón and colleagues have used such an approach with ComScore data, but I think in that particular case the approach fit well with their interest in the centrality of news sources.
In any case, I think a dedicated library for turning clickstream information in table format into something more meaningful, along with some pretty plots, would be useful. We wrote a few functions that point in this direction, but nothing comprehensive yet.
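To make the network framing concrete, here is a minimal sketch that treats domain-to-domain transitions as weighted directed edges and ranks domains by incoming weight. It is not the method used in any of the studies mentioned; in-strength is only a crude stand-in for the centrality measures a package such as igraph provides, and the transition counts and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical transition counts: (source domain, target domain) -> count.
TRANSITIONS = {
    ("google.com", "facebook.com"): 2,
    ("facebook.com", "google.com"): 1,
    ("google.com", "nytimes.com"): 1,
}


def in_strength(transitions):
    """Sum the weights of incoming edges for every domain in the graph."""
    strength = defaultdict(int)
    for (src, dst), weight in transitions.items():
        strength[dst] += weight
        strength[src] += 0  # ensure pure sources still appear, with strength 0
    return dict(strength)


def rank_nodes(transitions):
    """Rank domains by in-strength (descending), breaking ties by name."""
    strengths = in_strength(transitions)
    return sorted(strengths.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_nodes(TRANSITIONS)
for domain, strength in ranking:
    print(domain, strength)
```

The same edge dictionary could be handed to a graph library instead, which is where measures such as betweenness or PageRank would come from.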
Many of us have used and/or developed tools for collecting digital trace data. It would be great if one of the outcomes of this workshop is a curated list of good tools, and a general overview of what types of tools work well for what purposes. Also, for the tools being developed by participants, it would be great to share ideas regarding future plans, and to combine efforts where possible.
We can combine this with tool demos (but please let us know beforehand so we can make a schedule).
It seems especially tricky to track mobile devices (in particular if we are interested in more than the top-level domain). I have done some exploring with regard to the options and am curious how far others got.
Who will present: Vincent van Hees (NLeSC) & The Amsterdam Team
Presentation topic: Tracking techniques: what works, what doesn't? [link to paper]
Linked to: #2
The literature list compiled by @fe_loe :
Having talked to many communication scholars about tracking data in recent years, I have noticed that one major obstacle to engaging with computational communication science is that it looks far too difficult. So it would be worthwhile to map useful interfaces and visualization options for non-coding researchers interested in tracking data.