John Giorgi's Projects
A Python framework for performing information retrieval experiments, building on http://terrier.org/
Create PyTerrier-compatible dense indices using any sentence_transformers model
A PyTorch framework for an image retrieval task including implementation of N-pair Loss (NIPS 2016) and Angular Loss (ICCV 2017).
PyTorch deep learning projects made easy.
Forked from the original QuickThought repo and modified to be pip installable.
This repository contains the code for the NAACL 2022 paper "Re-Examining System-Level Correlations of Automatic Summarization Evaluation Metrics."
Reach Biomedical Information Extraction
Tools for working with the S800 corpus
Saber is a deep-learning-based tool for information extraction in the biomedical domain. Pull requests are welcome! Note: this is a work in progress. Many things are broken, and the codebase is not stable.
Sample module for Python-Guide.org.
Code for Paper: SBERT-WK: A Sentence Embedding Method By Dissecting BERT-based Word Models
SciRepEval benchmark training and evaluation scripts
A full spaCy pipeline and models for scientific/biomedical documents.
Scientific Discourse Tagging with Large Language Models
This repository is meant to serve as an opinionated, pedagogical guide on software engineering best practices for those of us in machine learning. Follow along with the guide here: https://johngiorgi.github.io/se_best_practices_ml_perspective/
Sentence Embeddings with BERT & XLNet
A Python tool for evaluating the quality of sentence embeddings.
A simple and extensible package for evaluating models against the SentEval benchmark.
The corresponding code for our paper: A sequence-to-sequence approach for document-level relation extraction.
This is a companion repository to seq2rel (https://github.com/JohnGiorgi/seq2rel) which aims to make it easy to generate training data.
A Python framework for sequence labeling evaluation (named-entity recognition, POS tagging, etc.)
Automate the process of downloading, trimming and mapping SRA files from a GEO data set
Conversion from brat-flavored standoff to CoNLL format
Data loaders and abstractions for text and NLP
TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/
COLING2016: Table Filling Multi-Task Recurrent Neural Network for Joint Entity and Relation Extraction
Submission to the Timestamp Microservice project (part of the APIs and Microservices Projects) on freecodecamp.org.