llm_botanist's Introduction

Overview/Background

During my work as a data scientist/consultant for Conservation International, the monitoring team expressed interest in being able to automatically classify plants as native, alien, or invasive in certain landscapes.

Automating the classification of invasives proved a little more straightforward, since there is a centralized database for invasive species. I developed a tool that accomplishes this by combining web scraping with API calls to databases. I packaged it into a Shiny app, which you can see the source code for here or use directly on Shiny.

The challenge in doing something like this for 'nativeness' or 'alienness' is that there isn't a centralized database that tracks native plants all over the world. Some exist for certain regions, but since the projects I was analyzing were based in dozens of different countries, I needed another solution.

The LLM-powered Classifier

My solution involved writing a script to process species-country pairings that essentially does the following:

1.) Query Wikipedia for information about the plant.

2.) Filter the returned page content so that only sentences with relevant keywords are included (e.g. native, alien, endemic, range, invasive).

3.) Use LangChain to engineer prompts and bring an LLM into the loop, parsing this Wikipedia context for specific, relevant information such as:

  • Native range

  • Alien range

4.) Using only the extracted information, make a classification decision.
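Steps 2 and 4 can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the function names `filter_relevant_sentences` and `build_prompt` are hypothetical, the page text is hardcoded instead of being fetched from Wikipedia, and the prompt is a plain Python string with assumed wording rather than a LangChain template.

```python
import re

# Keywords from step 2; the exact list used by the project may differ.
KEYWORDS = ("native", "alien", "endemic", "range", "invasive")


def filter_relevant_sentences(text, keywords=KEYWORDS):
    """Return only the sentences that contain a word starting with a keyword."""
    # \b anchors the match at a word start, so "ranges" and "non-native"
    # match, while "arrange" or "orange" do not.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")",
        re.IGNORECASE,
    )
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if pattern.search(s)]


def build_prompt(species, country, sentences):
    """Assemble a prompt that limits the model to the filtered context.

    The wording here is an assumption made for illustration only.
    """
    context = "\n".join(f"- {s}" for s in sentences)
    return (
        f"Context about {species}:\n{context}\n\n"
        f"Using only the context above, state the native range and the "
        f"alien range of {species}. Then classify {species} in {country} "
        f"as native, alien, or invasive. If the context is insufficient, "
        f"answer 'unknown'."
    )


if __name__ == "__main__":
    # Hardcoded stand-in for the text of a Wikipedia page.
    page_text = (
        "Lantana camara is a species of flowering plant. "
        "It is native to the American tropics. "
        "It has spread widely. "
        "It is considered invasive in many countries."
    )
    relevant = filter_relevant_sentences(page_text)
    print(build_prompt("Lantana camara", "India", relevant))
```

Filtering before prompting keeps the context short and focused, which matters when the prompt tells the model to rely only on what it is given.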

I talk about the process in a bit more depth here, and I put it all into a Streamlit app with a runnable demo here. Aside from the demo, which you can run by clicking the 'run' demo button at the top, you can also input your own text to process a single pairing, or provide your own CSV with columns for 'species' and 'country' to process multiple pairings.
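For the multi-pairing case, the input CSV needs a 'species' column and a 'country' column. The sketch below shows one way such a file could be read and validated before processing. The helper name `read_pairings` is hypothetical, and how the app itself parses uploads is not described here, so treat this as an assumption about the expected shape of the data.

```python
import csv
import io


def read_pairings(csv_file):
    """Read (species, country) pairs from an open CSV file object.

    Raises ValueError if either required column is missing.
    Rows with an empty species or country are skipped.
    """
    reader = csv.DictReader(csv_file)
    fieldnames = reader.fieldnames or []
    missing = [c for c in ("species", "country") if c not in fieldnames]
    if missing:
        raise ValueError(f"CSV is missing required column(s): {missing}")

    pairings = []
    for row in reader:
        species = (row.get("species") or "").strip()
        country = (row.get("country") or "").strip()
        if species and country:
            pairings.append((species, country))
    return pairings


if __name__ == "__main__":
    # In-memory example; a real file would be opened with open(path, newline="").
    sample = io.StringIO(
        "species,country\n"
        "Lantana camara,India\n"
        "Acacia mearnsii,South Africa\n"
    )
    for species, country in read_pairings(sample):
        print(species, "|", country)
```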

llm_botanist's People

Contributors

johannesnelson
