Making some cool maps about what is happening in Ukraine right now.
#SupportUkraine
image_preprocessing.ipynb
- Requires:
  - PyMuPDF and Pillow: pip3 install PyMuPDF Pillow
- Input: data (see Google Drive for all files; sample on GitHub)
- Output: data_resized (see Google Drive for all files; sample on GitHub)
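A minimal sketch of what the resizing step could look like, assuming the notebook shrinks each image in data so that its longest side fits a pixel limit and writes the result to data_resized. The 1600-pixel limit, the PNG-only file pattern, and the function names `target_size` and `resize_folder` are hypothetical and not taken from the notebook.

```python
from pathlib import Path


def target_size(width: int, height: int, max_side: int = 1600) -> tuple[int, int]:
    """Scale (width, height) so the longer side is at most max_side, keeping aspect ratio."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough; never upscale
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def resize_folder(src: str = "data", dst: str = "data_resized", max_side: int = 1600) -> int:
    """Resize every PNG in src into dst and return the number of files written."""
    from PIL import Image  # Pillow; imported lazily so target_size works without it

    out_dir = Path(dst)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = 0
    for path in sorted(Path(src).glob("*.png")):
        with Image.open(path) as img:
            new_size = target_size(img.width, img.height, max_side)
            img.resize(new_size, Image.LANCZOS).save(out_dir / path.name)
        written += 1
    return written
```

Keeping the size calculation separate from the file loop makes it easy to check the scaling rule on its own before running it over the full data folder.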
text_extraction.ipynb
- Requires:
  - PyMuPDF and Pillow: pip3 install PyMuPDF Pillow
  - PyTesseract: pip install pytesseract
  - Tesseract: brew install tesseract
- Input: data_resized (see Google Drive for all files; sample on GitHub)
- Output: raw_text (see Google Drive for all files; sample on GitHub)
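A rough sketch of the extraction step, assuming each image in data_resized is passed to Tesseract through `pytesseract.image_to_string` and the result is saved as one text file per image in raw_text. The helper names, the PNG-only file pattern, and the default `lang="eng"` are assumptions; the notebook may use a different language pack or file layout.

```python
from pathlib import Path


def text_path_for(image_path: Path, out_dir: Path) -> Path:
    """Map an image file to the .txt file that will hold its OCR output."""
    return out_dir / (image_path.stem + ".txt")


def extract_folder(src: str = "data_resized", dst: str = "raw_text", lang: str = "eng") -> int:
    """Run OCR on every PNG in src, write the text to dst, and return the file count."""
    import pytesseract  # Python wrapper; needs the tesseract binary installed
    from PIL import Image

    out_dir = Path(dst)
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in sorted(Path(src).glob("*.png")):
        with Image.open(path) as img:
            text = pytesseract.image_to_string(img, lang=lang)
        text_path_for(path, out_dir).write_text(text, encoding="utf-8")
        count += 1
    return count
```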
text_processing.ipynb
- Input: raw_text (see Google Drive for all files; sample on GitHub)
- Output: cleaned_text (see Google Drive for all files; sample on GitHub)
- Run Named Entity Recognition on GPU:
  - Input: cleaned_text (see Google Drive for all files; sample on GitHub)
  - Output: charts in Notebook and on website
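To illustrate the step from raw_text to cleaned_text, here is a small sketch of typical OCR cleanup using only the standard library. The specific rules (rejoining hyphenated line breaks, collapsing whitespace, dropping form feeds) and the name `clean_text` are assumptions about what such a step might do, not a description of the notebook's actual processing.

```python
import re


def clean_text(raw: str) -> str:
    """Normalize OCR output: rejoin hyphenated line breaks and collapse whitespace."""
    text = raw.replace("\x0c", "\n")              # page-separator form feeds that OCR output can contain
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)  # "informa-\ntion" -> "information"
    text = re.sub(r"[ \t]+", " ", text)           # runs of spaces or tabs -> one space
    text = re.sub(r" ?\n ?", "\n", text)          # trim spaces around newlines
    text = re.sub(r"\n{3,}", "\n\n", text)        # keep at most one blank line in a row
    return text.strip()
```

Cleaning before Named Entity Recognition matters because words split across lines or padded with stray whitespace tend to be missed or mislabeled by the recognizer.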