

OpenDataology



Overview


OpenDataology is a project for AI model training with trusted dataset compliance. It helps users of publicly available datasets, and users who curate a dataset from multiple data sources (particularly for use in machine learning models), identify potential license compliance risks. The project comprises three key components.

  • A dataset license compliance analysis workflow that determines the final allowed rights and required obligations associated with using, for any purpose, a publicly available dataset or a dataset curated from multiple data sources. For more details, please refer to the paper "Can I use this publicly available dataset to build commercial AI software? - A Case Study on Publicly Available Image Datasets".
  • A growing database and a web portal that document the final rights and obligations (determined after the license compliance analysis is conducted) associated with the datasets and data sources analyzed in our project. The database also documents the metadata collected and used to conduct the compliance workflow.
  • An online license generation toolkit that lets creators of datasets generate custom licenses based on the exact rights and obligations they want to allow (instead of having to rely on the limited set of existing dataset-specific licenses).
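
To make the idea of "final allowed rights and required obligations" for a dataset curated from multiple sources more concrete, here is a minimal sketch in Python. It is not the project's actual workflow or API. It assumes a deliberately simplified rule: a right survives only if every source grants it, and an obligation applies if any source imposes it. The names `SourceLicense` and `combine_sources`, and the right and obligation labels, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Tuple


@dataclass(frozen=True)
class SourceLicense:
    """Hypothetical record of what one data source allows and requires."""
    name: str
    rights: FrozenSet[str]
    obligations: FrozenSet[str]


def combine_sources(
    sources: Iterable[SourceLicense],
) -> Tuple[FrozenSet[str], FrozenSet[str]]:
    """Combine sources under the simplified rule assumed in this sketch.

    Rights: intersection across sources (every source must grant the right).
    Obligations: union across sources (any source can impose one).
    """
    sources = list(sources)
    if not sources:
        raise ValueError("at least one data source is required")
    rights = frozenset.intersection(*[s.rights for s in sources])
    obligations = frozenset().union(*[s.obligations for s in sources])
    return rights, obligations


if __name__ == "__main__":
    # Illustrative, made-up sources; not real license analyses.
    source_a = SourceLicense(
        name="source-a",
        rights=frozenset({"research", "commercial-use", "redistribution"}),
        obligations=frozenset({"attribution"}),
    )
    source_b = SourceLicense(
        name="source-b",
        rights=frozenset({"research", "redistribution"}),
        obligations=frozenset({"attribution", "share-alike"}),
    )
    rights, obligations = combine_sources([source_a, source_b])
    print("rights:", sorted(rights))
    print("obligations:", sorted(obligations))
```

Real license analysis involves interpretation that a set operation like this cannot capture, so this sketch only illustrates why a curated dataset can end up with fewer rights and more obligations than any single source.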

OpenDataology's recommendations do not constitute legal advice.

Getting Involved


Contributing


We love contributions in various forms. To contribute to OpenDataology, please see CONTRIBUTING.md.

Governance


OpenDataology is a project hosted by the LF AI and Data Foundation. The project governance details can be found at GOVERNANCE.md.

Reporting a Problem


To report a problem, open an issue in the repository against the specific workflow. If the issue is sensitive or security-related, please do not report it in the issue tracker; instead, email [email protected].

Meeting Schedule and Minutes


Here are the records of the meeting notes and progress.

License


OpenDataology is licensed under CC-BY-4.0

Copyright (c) 2022 The OpenDataology Authors

All rights reserved.

opendataology's People

Contributors

jmertic, li-clement, rgopikrishnan91, zichengqu

opendataology's Issues

Different Licenses

Why do different repositories adopt different licenses? The OpenDataology repository uses the MIT license, but the other repositories, including metadata-API, portal-frontend, and license-generator, use the Apache-2.0 license.

About the database

I have a suggestion about the database mentioned in the Overview. If visitors were allowed to edit the database, they could help check the licenses and upload their own analyses of new licenses. The database would then grow faster, and those visitors could become contributors.

No SSL cert

Describe the bug
Not really a bug, but a shortcoming. The main website does not have an SSL certificate. Please add one from Let's Encrypt.

To Reproduce
Steps to reproduce the behavior:

  1. Go to www.opendataology.com
  2. Observe that the site does not have an SSL certificate

Expected behavior
The browser should show a lock icon to indicate a secure connection.

Desktop (please complete the following information):

  • Any browser

Create a new repo for dataset tools API

Is your feature request related to a problem? Please describe.
The OpenDataology project should have an infrastructure repository, or a repository for a set of tool services, to store the various infrastructure and tool modules of the OpenDataology platform.

Describe the solution you'd like
We have developed a new service for dataset review that can be integrated into the dataset metadata and license-sharing platform.

Describe alternatives you've considered
We will develop more tools beyond this dataset review service, so we would like to know whether it is OK to create a new repository for them. Would it be better to name the new repository infrastructure, service-set, or something else?

A sandbox project charter is needed

Is your feature request related to a problem? Please describe.
No

Describe the solution you'd like
A sandbox project charter is needed to define the TSC and its responsibilities.
