datasets.simula.no

A collection of open datasets published by Simula Research Laboratory and SimulaMet.

Currently, we have published the following datasets:

Medical and Biology Datasets

Depresjon, The Depresjon Dataset. [ publication ]
HyperKvasir, The Largest Gastrointestinal Dataset. [ publication ]
HYPERAKTIV, A Motor Activity Database of Patients with ADHD. [ publication ]
KvasirCapsule SEG, A Capsule Endoscopy Segmentation Dataset. [ publication ]
Cellular, A cell autophagy dataset. [ publication ]
GastroVision, A multicenter dataset. [ publication ]
Nerthus, A Bowel Preparation Quality Video Dataset. [ publication ]
Kvasir Capsule, The largest gastrointestinal PillCAM dataset. [ publication ]
Kvasir Instrument, A gastrointestinal instrument Dataset. [ publication ]
Kvasir SEG, Segmented Polyp Dataset for Computer Aided Gastrointestinal Disease Detection. [ publication ]
Kvasir, A Multi-Class Image-Dataset for Computer Aided Gastrointestinal Disease Detection. [ publication ]
Psykose, A Motor Activity Database of Patients with Schizophrenia. [ publication ]
VISEM QC, A sperm quality control dataset.
VISEM, A Multimodal Video Dataset of Human Spermatozoa. [ publication ]

Sport Datasets

Alfheim, Soccer video and player position dataset. [ publication ]
ARX, A Text-Classification Dataset Consisting of Norwegian Soccer Articles from VG and TV2. [ publication ]
Heimdallr, A Dataset For Sport Analysis.
ScopeSense, An 8.5-month sport, nutrition, and lifestyle lifelogging dataset.
Soccer Summarization, Soccer game captions and summary in English for game summarization. [ publication ]
SoccerMon, Subjective and objective data collected over two years from two different elite women's soccer teams.
SoccerSum, The SoccerSum Dataset for Automated Detection, Segmentation, and Tracking of Objects on the Soccer Pitch. [ publication ]
SoccerNet-Echoes, A Soccer Game Audio Commentary Dataset. [ publication ]
PMData, A lifelogging dataset of 16 persons during 5 months using Fitbit, Google Forms and PMSys.
TACDEC, Dataset of Tackle Events in Soccer Game Videos. [ publication ]

Other Datasets

Anarchy Online, Server-side Network Traffic from Anarchy Online: Analysis, Statistics and Applications. [ publication ]
European Cloud Cover, A dataset containing reanalysis data from ERA5 and satellite retrievals from Meteosat Second Generation. [ publication ]
Eye Tracker, A Serious Game Based Dataset. [ publication ]
HSDPA, HSDPA-bandwidth logs for mobile HTTP streaming scenarios.
HTAD, A Home-Tasks Activities Dataset with Wrist-accelerometer and Audio Features. [ publication ]
Image Sentiment, A dataset for image sentiment analysis. [ publication ]
Njord, A fishing boat dataset.
Right Inflight, A Dataset for Exploring the Automatic Prediction of Movies Suitable for a Watching Situation.
THREAT, A Large Annotated Corpus for Detection of Violent Threats.
Toadstool, A Dataset for Training Emotional and Intelligent Machines Playing Super Mario Bros. [ publication ]
WICO Graph Dataset, A Labeled Dataset of Twitter Subgraphs based on Conspiracy Theory and 5G-Corona Misinformation Tweets. [ publication ]
WICO Text, A labeled dataset of conspiracy theory and 5G-corona misinformation tweets. [ publication ]

How to contribute

To add a new dataset, follow these steps:

Fork the Repository: Fork this repository to your GitHub account.
Create a Markdown File: In your forked repository, navigate to the datasets folder and create a new Markdown file (.md) for your dataset. The file name should be descriptive of the dataset.

Add Dataset Information: Copy and paste the following template into your Markdown file:

---
title: <dataset name>
desc: <dataset description>
thumbnail: <dataset thumbnail>
publication: <link to publication>
github: <link to github>
tags:
  - <list of tags>
---

Fill in the template with the appropriate information about your dataset.

Add a Dataset Thumbnail: Add a thumbnail for the dataset; it will be displayed on the main page. The thumbnail should use a 16:9 aspect ratio, such as 320 x 180 or 640 x 360 pixels, and be placed under public/thumbnails.
Update the README: Add the new dataset to this README under one of the categories above. Include links to the publication, code, or other resources that may be useful.
Create a Pull Request: Once you have added the Markdown file, the thumbnail, and the README entry, commit your changes and push them to your forked repository. Then create a pull request to merge your changes into the main repository.
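
As an illustration of the template and thumbnail steps above, here is a minimal Python sketch, not part of the repository's tooling. The helper names `render_front_matter` and `is_16_by_9` and all example values are hypothetical. It assumes the field values contain no characters that need YAML quoting (such as a colon followed by a space) and that the thumbnail field takes a file name; check an existing file in the datasets folder for the exact convention.

```python
def render_front_matter(title, desc, thumbnail, publication, github, tags):
    """Fill in the front-matter template from this README with concrete values.

    Assumption: values are plain strings that need no YAML quoting.
    """
    lines = [
        "---",
        f"title: {title}",
        f"desc: {desc}",
        f"thumbnail: {thumbnail}",
        f"publication: {publication}",
        f"github: {github}",
        "tags:",
    ]
    # One indented list item per tag, matching the template's layout.
    lines += [f"  - {tag}" for tag in tags]
    lines.append("---")
    return "\n".join(lines) + "\n"


def is_16_by_9(width, height):
    """Return True when width:height is exactly 16:9 (integer arithmetic only)."""
    return width > 0 and height > 0 and width * 9 == height * 16


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    text = render_front_matter(
        title="Example Dataset",
        desc="A short description of the example dataset.",
        thumbnail="example-dataset.png",
        publication="https://example.org/paper",
        github="https://example.org/code",
        tags=["example", "demo"],
    )
    print(text)
    print(is_16_by_9(320, 180))
    print(is_16_by_9(640, 480))
```

The printed text can be pasted into the new Markdown file in the datasets folder, and the aspect-ratio check can be run on the thumbnail's pixel dimensions before it is placed under public/thumbnails.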

Contact

If you have any questions or need assistance, please open an issue in the repository or contact [email protected].
