🙂 Vincent D. Warmerdam
┣━━ 📦 Open Source Packages
┃   ┣━━ bulk - simple bulk labelling interface
┃   ┣━━ embetter - embeddings ready for sklearn
┃   ┣━━ doubtlab - suite of tools to help find bad labels
┃   ┣━━ drawdata - draw datasets in jupyter
┃   ┣━━ scikit-lego - lego bricks for sklearn
┃   ┣━━ scikit-partial - partial_fit() pipelines for sklearn
┃   ┣━━ scikit-bloom - bloom transformers for sklearn
┃   ┣━━ fh-matplotlib - matplotlib for FastHTML
┃   ┣━━ fh-altair - altair for FastHTML
┃   ┣━━ human-learn - rule-based components for sklearn
┃   ┣━━ sentence-models - a different take on textcat
┃   ┣━━ mktestdocs - turn markdown files into pytest tests
┃   ┣━━ lazylines - lightweight utils for .jsonl wrangling
┃   ┣━━ cluestar - inspiration for your first text labels
┃   ┣━━ durations - pytest duration insights
┃   ┣━━ tuilwindcss - tailwindcss for textual tui apps
┃   ┣━━ memo - saves a whole log of time
┃   ┣━━ skedulord - makes cron a bit more fun
┃   ┣━━ icepickle - cool and safe storage for linear models
┃   ┗━━ evol - grammar for genetic heuristics
┣━━ 👍 Project Contributions
┃   ┣━━ fairlearn - contributed the CorrelationFilter
┃   ┣━━ polars - contributed the .pipe() method
┃   ┗━━ BERTopic - added lightweight sklearn pipeline support
┣━━ ⭐ Online Projects
┃   ┣━━ calmcode.io - intermediate developer education
┃   ┣━━ koaning.io - personal blog
┃   ┗━━ dearme.email - reflection via a 30 day delay
┣━━ 🎙️ Popular Talks
┃   ┣━━ Natural Intelligence is All You Need
┃   ┣━━ Group-by statements that save the day
┃   ┣━━ Tools to Improve Training Data
┃   ┣━━ Optimal on Paper, Broken in Reality
┃   ┣━━ Playing by the Rules-Based-Systems
┃   ┣━━ How to Constrain Artificial Stupidity
┃   ┣━━ The Profession of Solving the Wrong Problem
┃   ┣━━ Winning with Simple, even Linear, Models
┃   ┗━━ Untitled12.ipynb
┣━━ 🔬 Random Experiments
┃   ┣━━ scikit-prune - prune scikit learn pipelines
┃   ┣━━ gitlit - tracking github action times across open source
┃   ┣━━ sentimany - many sentiment models, one repo
┃   ┣━━ tokenwiser - sklearn token tricks
┃   ┣━━ clumper - functional API for lists of dicts
┃   ┗━━ whatlies - exploration tools for word embeddings
┗━━ 👨‍💻 Employer
    ┣━━ 🎲 :probabl. - scikit-learn and friends
    ┃   ┣━━ scikit-churn - safety rails for churn work
    ┃   ┣━━ scikit-playtime - rethinking pipelines
    ┃   ┗━━ scikit-mdn - mixture density networks
    ┣━━ 💥 Explosion - developer tools for nlp
    ┃   ┣━━ prodigy-hf - Prodigy integration for the HuggingFace stack
    ┃   ┣━━ prodigy-pdf - Annotate PDFs via Prodigy
    ┃   ┣━━ prodigy-ann - ANN techniques to find relevant subsets
    ┃   ┣━━ prodigy-segment - Prodigy integration for Segment Anything
    ┃   ┣━━ prodigy-lunr - Search techniques to find relevant subsets
    ┃   ┣━━ prodigy-whisper - Transcribe audio with OpenAI's whisper models
    ┃   ┣━━ prodigy-tui - Prodigy from the terminal
    ┃   ┗━━ cluestar - inspiration for your first text labels
    ┗━━ 🤖 Rasa - conversational software provider
        ┣━━ nlu examples - custom nlu components for Rasa
        ┣━━ taipo - data augmentation tools
        ┗━━ algo whiteboard - nlp education

Follow me on twitter @fishnets88
koaning
Name: vincent d warmerdam
Type: User
Company: @explosion
Bio: Solving problems involving data. Mostly NLP these days. AskMeAnything[tm].
Twitter: fishnets88
Location: Amsterdam
Blog: https://koaning.io