This pipeline is designed to work with Holocaust-related documents. It allows users to identify Holocaust-specific entities, such as CAMP and GHETTO. Its vectors are also trained on Holocaust-specific data.
The `project.yml` defines the data assets required by the project, as well as the available commands and workflows. For details, see the [spaCy projects documentation](https://spacy.io/usage/projects).
The following commands are defined by the project. They can be executed using `spacy project run [name]`. Commands are only re-run if their inputs have changed.
| Command | Description |
| --- | --- |
| `build_floret` | Create the floret embeddings for the `md` model |
| `floret2spacy` | Create a base spaCy pipeline with the floret embeddings |
| `build_rules` | Build the rule-based components of the pipeline |
| `train` | Train the model |
| `package` | Package the pipeline |
| `push2hub` | Push the new version to the Hugging Face Hub |
| `build_corpus` | Download the collection of oral testimonies from Hugging Face and create a `corpus.txt` file for training the floret embeddings |
| `build_env` | Build the environment for training on GPU |
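For example, a typical session might fetch the assets and then train the model. This is a sketch only; it assumes spaCy and the project's requirements are installed and that you are in the project root:

```shell
# Fetch the data assets declared in project.yml
python -m spacy project assets

# Run a single named command; it is skipped if its inputs are unchanged
python -m spacy project run train
```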
The following workflows are defined by the project. They can be executed using `spacy project run [name]` and will run the specified commands in order. Commands are only re-run if their inputs have changed.
| Workflow | Steps |
| --- | --- |
| `all-vectors` | `train` → `package` → `push2hub` |
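A workflow is invoked the same way as a single command. A sketch, assuming a trained setup and that Hugging Face credentials are already configured for the push step:

```shell
# Runs train, package and push2hub in order; steps whose
# inputs have not changed are skipped automatically.
python -m spacy project run all-vectors
```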
The following assets are defined by the project. They can be fetched by running `spacy project assets` in the project directory.
| File | Source | Description |
| --- | --- | --- |
| `assets/train.json` | Local | Demo training data adapted from the `ner_demo` project |
| `assets/dev.json` | Local | Demo development data |
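spaCy's `ner_demo` project, which this data is adapted from, stores annotations as offset-based records: each entry pairs a text with `(start, end, label)` entity spans. A minimal sketch of that layout, assuming this project keeps the same format (the example text and offsets below are hypothetical, not taken from `assets/train.json`):

```python
import json

# Hypothetical record in the (start, end, label) offset layout;
# the actual contents of assets/train.json may differ.
record = ["He was deported to Auschwitz.", {"entities": [[19, 28, "CAMP"]]}]

# Unpack the text and its entity annotations
text, annotations = record
start, end, label = annotations["entities"][0]

# The character offsets slice out the annotated span
assert text[start:end] == "Auschwitz" and label == "CAMP"

# The record round-trips through JSON unchanged
assert json.loads(json.dumps(record)) == record
```

Character offsets must match the raw text exactly, which is why demo projects usually generate them with a converter script rather than by hand.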