
dn4il-dataset's Introduction

This repository provides DN4IL, a dataset for the Domain-Incremental Learning (Domain-IL) setting in Continual Learning (CL). It was introduced in the TMLR paper "Dual Cognitive Architecture: Incorporating Biases and Multi-Memory Systems for Lifelong Learning" by Shruthi Gowda, Elahe Arani, and Bahram Zonooz (https://github.com/NeurAI-Lab/DUCA).

DN4IL is a subset of the standard DomainNet dataset used in domain adaptation. It consists of six different domains: real, clipart, infograph, painting, quickdraw, and sketch. The shift in distribution between domains is challenging. A few examples and statistics of the dataset can be seen below.

Dataset Usage

The annotations in this repository define the DN4IL subset. Use them together with the images from the DomainNet dataset for training and evaluation in Continual Learning.
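As a minimal sketch of how such annotations could be read, the snippet below parses lines of the form `relative/path.jpg label`. That line format is an assumption borrowed from the split files distributed with DomainNet; check the annotation files in this repository before relying on it. The function name `parse_annotations` and the `root` parameter are hypothetical.

```python
import os
from typing import Iterable, List, Tuple


def parse_annotations(lines: Iterable[str], root: str = "") -> List[Tuple[str, int]]:
    """Turn 'relative/path.jpg label' lines into (image_path, label) pairs.

    ASSUMPTION: each line holds an image path relative to the DomainNet root,
    followed by whitespace and an integer class label.
    """
    samples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last whitespace run so paths containing spaces survive.
        rel_path, label = line.rsplit(maxsplit=1)
        samples.append((os.path.join(root, rel_path), int(label)))
    return samples
```

With a real file, the lines would come from something like `open(annotation_file)`, and `root` would point at the directory where the DomainNet images were extracted.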

Dataset statistics

The original DomainNet consists of 59k samples with 345 classes in each domain. Many of these classes are redundant, and evaluating on the whole dataset can be computationally expensive in a CL setting. The DN4IL version was built with several criteria in mind: relevance of classes, uniform sample distribution, computational complexity, and ease of benchmarking for CL.

All classes were grouped into semantically similar supercategories. From these, we selected a subset of classes that are relevant to domain shift and that overlap as much as possible with other standard datasets such as CIFAR, to facilitate out-of-distribution analyses. We chose 20 supercategories with 5 classes each, for a total of 100 classes.

To provide a balanced dataset, we also performed class-wise sampling. First, we sample images per class in each supercategory and maintain class balance. Second, we choose samples per domain, so that the resulting dataset has a near-uniform distribution across all classes and domains. The final DN4IL dataset is succinct, more balanced, and more computationally efficient for benchmarking, thus facilitating research in CL. The challenging distribution shift between domains makes it an apt dataset for testing the capability of CL methods in the Domain-IL setting.
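The sampling idea described above can be illustrated with a short sketch. This is not the authors' script: it simply caps the number of samples in every (class, domain) group, which is one straightforward way to approach a near-uniform distribution. The tuple layout, the function name `balanced_subset`, and the `per_group` cap are all hypothetical.

```python
import random
from collections import defaultdict
from typing import List, Tuple

# ASSUMPTION: each sample is described as (image_path, class_name, domain).
Sample = Tuple[str, str, str]


def balanced_subset(samples: List[Sample], per_group: int, seed: int = 0) -> List[Sample]:
    """Keep at most `per_group` randomly chosen samples per (class, domain) pair."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    groups = defaultdict(list)
    for sample in samples:
        _, class_name, domain = sample
        groups[(class_name, domain)].append(sample)

    subset = []
    for key in sorted(groups):  # sorted keys give a deterministic order
        items = groups[key]
        k = min(per_group, len(items))  # small groups are kept whole
        subset.extend(rng.sample(items, k))
    return subset
```

Groups with fewer than `per_group` samples stay smaller than the rest, which is why such a procedure yields a near-uniform rather than an exactly uniform distribution.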

Figures: details on the supercategories and classes in the DN4IL dataset.

Cite Our Work

@article{
  gowda2023dual,
  title={Dual Cognitive Architecture: Incorporating Biases and Multi-Memory Systems for Lifelong Learning},
  author={Shruthi Gowda and Bahram Zonooz and Elahe Arani},
  journal={Transactions on Machine Learning Research},
  issn={2835-8856},
  year={2023},
  url={https://openreview.net/forum?id=PEyVq0hlO3}
}

License

This project is licensed under the terms of the MIT license.
