This repository contains the data and code created under the project NLP4Rare-cm-uc3m.
You can find a description of the RareDis corpus in:
The RareDis corpus: a corpus annotated with rare diseases, their signs and symptoms. Claudia Martínez-deMiguel, Isabel Segura-Bedmar, Esteban Chacón-Solano, Sara Guerrero-Aspizua. https://arxiv.org/abs/2108.01204
- The folder corpus contains the RareDis corpus as well as a script to obtain some statistics of the corpus.
- The PutbTator2Brat.zip archive contains a Python script that uses the PubTator tool to annotate a sample of abstracts about rare skin diseases. It also contains a script to obtain some statistics about this sample.
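
As a rough illustration of the kind of statistics mentioned above, here is a minimal sketch that counts annotations per entity type. It assumes the annotations are stored in BRAT standoff format (`.ann` files), which this README does not state explicitly. The function name `count_entity_types` and the labels in the sample (`RAREDISEASE`, `SIGN`, `Produces`) are hypothetical and are not taken from the scripts in this repository.

```python
from collections import Counter


def count_entity_types(ann_text):
    """Count text-bound annotations per entity type in BRAT standoff text.

    Assumption: entity lines look like "T1<TAB>TYPE START END<TAB>text".
    """
    counts = Counter()
    for line in ann_text.splitlines():
        # Only text-bound annotations start with "T"; relations (R),
        # attributes (A) and notes (#) are skipped.
        if not line.startswith("T"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # malformed line, ignore it
        # The second field holds the type followed by the character offsets.
        entity_type = fields[1].split(" ", 1)[0]
        counts[entity_type] += 1
    return counts


# Hypothetical sample, only for illustration.
sample = (
    "T1\tRAREDISEASE 0 15\tCystic fibrosis\n"
    "T2\tSIGN 30 35\tcough\n"
    "R1\tProduces Arg1:T1 Arg2:T2\n"
)
print(count_entity_types(sample))
```

To apply this to a whole folder, the same function could be called on the contents of each `.ann` file and the resulting counters summed.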