An open repository of code for training language models on program evaluations.
git clone https://github.com/casualcomputer/evaluation-ai.git
You can run any of the scripts with `-h` to see its help message.
cd evaluation-ai/src/data/
python 00_load_raw_data.py
python 01_extract_text.py
Source Name | Source Link | Number of Extracted Reports |
---|---|---|
ESDC | Link | 177 |
CRA | Link | 196 |
Health Canada | Link | 129 |
Natural Resources Canada | Link | 119 |
- Reports are named as `<department acronym>_<id>_<title>.<extension>`.
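
The naming convention above can be split back into its parts when you need to group or filter reports. The snippet below is only a minimal sketch and is not part of the repository: `ReportName` and `parse_report_name` are hypothetical names, the example file name is made up, and it assumes that neither the department acronym nor the id contains an underscore (the title may).

```python
from pathlib import Path
from typing import NamedTuple


class ReportName(NamedTuple):
    """Hypothetical container for the parts of a report file name."""
    department: str
    report_id: str
    title: str
    extension: str


def parse_report_name(filename: str) -> ReportName:
    """Split '<department acronym>_<id>_<title>.<extension>' into its parts.

    Assumes the acronym and id contain no underscores; the title may.
    """
    path = Path(filename)
    # Split on the first two underscores only, so underscores inside
    # the title are preserved.
    parts = path.stem.split("_", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"Unexpected report file name: {filename!r}")
    department, report_id, title = parts
    return ReportName(department, report_id, title, path.suffix.lstrip("."))


if __name__ == "__main__":
    # Made-up file name, used only to illustrate the pattern.
    print(parse_report_name("ESDC_12_Example_Report_Title.pdf"))
```

Splitting with a limit of two keeps the title intact even when it contains underscores; a file name that does not match the pattern raises `ValueError` instead of being silently mis-parsed.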