mage-ai's Introduction

Intro

Mage is an open-source data management platform that helps you clean data and prepare it for training AI/ML models.

Join us on Slack.

Table of contents

  1. Quick start
  2. Features
  3. Roadmap
  4. Contributing
  5. Community

🏃‍♀️ Quick start

1. Install Mage

$ pip install mage-ai

2. Load and connect data

import mage_ai
from mage_ai.sample_datasets import load_dataset


df = load_dataset('titanic_survival.csv')
mage_ai.connect_data(df, name='titanic dataset')

3. Launch tool

mage_ai.launch()

Open http://localhost:5789 in your browser to access the tool locally.

If you’re launching Mage in a notebook, the tool will render in an iframe inside the notebook.

4. Clean data

After building a data cleaning pipeline from the UI, you can clean your data anywhere you can execute Python code:

mage_ai.clean(df, pipeline_uuid='pipeline name')

Demo video (2 min)

Mage quick start demo

More resources

Here is a 🗺️ step-by-step guide on how to use the tool.

  1. Jupyter notebook example
  2. Google Colaboratory (Colab) example

Check out the 📚 tutorials to quickly become a master of magic.

🔮 Features

  1. Data visualizations
  2. Reports
  3. Cleaning actions
  4. Data cleaning suggestions

1. Data visualizations

Inspect your data using different charts, such as time series, bar charts, and box plots.

Here’s a list of available charts.

2. Reports

Quickly diagnose data quality issues with summary reports.

Here’s a list of available reports.
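As a rough illustration of what a summary report captures, the sketch below computes per-column missing-value ratios and a duplicate-row count in plain Python. It is not Mage's implementation; the function name `summarize_quality` and the sample rows are hypothetical.

```python
from collections import Counter


def summarize_quality(rows, columns):
    """Return row count, per-column missing ratios, and duplicate row count.

    Hypothetical helper for illustration only; not part of mage_ai.
    """
    total = len(rows)
    report = {"row_count": total, "missing_ratio": {}, "duplicate_rows": 0}
    for col in columns:
        # Treat None and empty strings as missing values.
        missing = sum(1 for row in rows if row.get(col) in (None, ""))
        report["missing_ratio"][col] = missing / total if total else 0.0
    # A row counts as a duplicate when an identical row appeared earlier.
    seen = Counter(tuple(row.get(col) for col in columns) for row in rows)
    report["duplicate_rows"] = sum(count - 1 for count in seen.values())
    return report


rows = [
    {"name": "Ann", "age": 34},
    {"name": "Bob", "age": None},
    {"name": "Ann", "age": 34},
    {"name": "", "age": 51},
]
report = summarize_quality(rows, ["name", "age"])
print(report)
```

A high missing ratio or a non-zero duplicate count is the kind of signal a report surfaces before you pick a cleaning action.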

3. Cleaning actions

Easily add common cleaning functions to your pipeline with a few clicks. Cleaning actions include imputing missing values, reformatting strings, removing duplicates, and many more.

If a cleaning action you need doesn’t exist in the library, you can write and save custom cleaning functions in the UI.

Here’s a list of available cleaning actions.
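To make the three cleaning actions named above concrete (imputing missing values, reformatting strings, removing duplicates), here is a minimal plain-Python sketch. These helpers are hypothetical and only illustrate the idea; they are not the functions Mage runs, and the sample data is made up.

```python
from statistics import mean


def reformat_strings(rows, column):
    """Trim whitespace and lowercase string values in `column`."""
    return [
        {**row, column: row[column].strip().lower()
         if isinstance(row[column], str) else row[column]}
        for row in rows
    ]


def remove_duplicates(rows):
    """Drop rows identical to an earlier row, keeping the first occurrence."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def impute_mean(rows, column):
    """Fill missing numeric values in `column` with the mean of known values."""
    known = [row[column] for row in rows if row[column] is not None]
    fill = mean(known) if known else None
    return [
        {**row, column: fill if row[column] is None else row[column]}
        for row in rows
    ]


rows = [
    {"city": " Paris ", "temp": 20.0},
    {"city": "paris", "temp": 20.0},
    {"city": "Oslo", "temp": None},
    {"city": "Rome", "temp": 26.0},
]
# Order matters: normalizing strings first lets the duplicate be detected,
# and removing it before imputing keeps it from skewing the mean.
cleaned = impute_mean(remove_duplicates(reformat_strings(rows, "city")), "temp")
print(cleaned)
```

Each step takes rows in and returns new rows, which is one simple way to think about chaining actions into a pipeline.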

4. Data cleaning suggestions

The tool will automatically suggest different ways to clean your data and improve quality metrics.

Here’s a list of available suggestions.
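The sketch below shows one simple way suggestions could be derived from data quality checks. The rules, the threshold, and the name `suggest_cleaning_actions` are assumptions made for illustration; Mage's actual suggestion logic is not described here and may differ.

```python
def suggest_cleaning_actions(rows, columns, max_missing_ratio=0.5):
    """Return (action, column) suggestions from simple, hypothetical rules."""
    suggestions = []
    total = len(rows)
    for col in columns:
        missing = sum(1 for row in rows if row.get(col) is None)
        if total and missing / total > max_missing_ratio:
            # Mostly empty columns carry little signal.
            suggestions.append(("remove_column", col))
        elif missing:
            suggestions.append(("impute_missing_values", col))
    keys = [tuple(row.get(col) for col in columns) for row in rows]
    if len(set(keys)) < len(keys):
        suggestions.append(("remove_duplicate_rows", None))
    return suggestions


rows = [
    {"id": 1, "email": None, "score": 0.9},
    {"id": 2, "email": None, "score": None},
    {"id": 3, "email": None, "score": 0.4},
    {"id": 3, "email": None, "score": 0.4},
]
suggestions = suggest_cleaning_actions(rows, ["id", "email", "score"])
print(suggestions)
```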

🗺️ Roadmap

Big features that are being worked on or are in the design phase:

  1. Encoding actions (e.g. one-hot encoding, label hasher, ordinal encoding, embeddings)
  2. Data quality monitoring and alerting
  3. Apply cleaning actions to columns and values that match a condition
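For readers unfamiliar with the encoding actions on the roadmap, here is a minimal plain-Python sketch of one-hot and ordinal encoding. It only illustrates the general techniques; the function names are hypothetical and say nothing about how Mage will implement these actions.

```python
def one_hot_encode(values):
    """Map each value to a dict of 0/1 indicator fields, one per category."""
    categories = sorted(set(values))
    return [
        {f"is_{category}": int(value == category) for category in categories}
        for value in values
    ]


def ordinal_encode(values, order):
    """Map each value to its position in an explicitly ordered category list."""
    rank = {category: index for index, category in enumerate(order)}
    return [rank[value] for value in values]


sizes = ["small", "large", "medium", "small"]
one_hot = one_hot_encode(sizes)
ordinal = ordinal_encode(sizes, order=["small", "medium", "large"])
print(one_hot[0])
print(ordinal)
```

One-hot encoding suits categories with no natural order, while ordinal encoding preserves an order you supply.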

Here’s a detailed list of 🪲 features and bugs that are in progress or upcoming.

🙋‍♀️ Contributing

We welcome all contributions to Mage, from small UI enhancements to brand new cleaning actions. We love seeing community members level up and give people power-ups!

Check out the 🎁 contributing guide to get started by setting up your development environment and exploring the code base.

Got questions? Live chat with us in Slack.

Anything you contribute, the Mage team and community will maintain. We’re in it together!

🧙 Community

We love the community of Magers (/ˈmājər/), a group of mages who help each other realize their full potential!

To live chat with the Mage team and community, please join the free Mage Slack channel.

For real-time news and fun memes, check out the Mage Twitter.

To report bugs or add your awesome code for others to enjoy, visit GitHub.

🪪 License

See the LICENSE file for licensing information.

