Code Monkey home page Code Monkey logo

imageinwords's Introduction

ImageInWords: Unlocking Hyper-Detailed Image Descriptions

arXiv: https://arxiv.org/abs/2405.02793

Please visit the webpage for all the information about the IIW project, data, visualizations, and much more. The data can be downloaded directly from the datasets/ folder, as well as from Huggingface (see below).

Please reach out to [email protected] for thoughts/feedback/questions/collaborations.

License: CC-BY-4.0

Other resources

🤗Hugging Face🤗

  • IIW-Benchmark Eval Dataset
  • from datasets import load_dataset
    
    # `name` can be one of: IIW-400, DCI_Test, DOCCI_Test, CM_3600, LocNar_Eval
    # refer: https://github.com/google/imageinwords/blob/main/datasets/README.md
    dataset = load_dataset('google/imageinwords', token=None, name="IIW-400", trust_remote_code=True)
  • Dataset-Explorer
  • Cite

    If you use our data or refer to our work, please include the following citation

    @misc{garg2024imageinwords,
          title={ImageInWords: Unlocking Hyper-Detailed Image Descriptions}, 
          author={Roopal Garg and Andrea Burns and Burcu Karagol Ayan and Yonatan Bitton and Ceslee Montgomery and Yasumasa Onoe and Andrew Bunner and Ranjay Krishna and Jason Baldridge and Radu Soricut},
          year={2024},
          eprint={2405.02793},
          archivePrefix={arXiv},
          primaryClass={cs.CV}
    }
    

    imageinwords's People

    Contributors

    aburns4 avatar roopalgarg avatar yonatanbitton-google avatar

    Stargazers

     avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

    Watchers

     avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

    imageinwords's Issues

    Datasets' Images

    Can you provide a link to download the packaged dataset images directly?

    Recommend Projects

    • React photo React

      A declarative, efficient, and flexible JavaScript library for building user interfaces.

    • Vue.js photo Vue.js

      🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

    • Typescript photo Typescript

      TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

    • TensorFlow photo TensorFlow

      An Open Source Machine Learning Framework for Everyone

    • Django photo Django

      The Web framework for perfectionists with deadlines.

    • D3 photo D3

      Bring data to life with SVG, Canvas and HTML. 📊📈🎉

    Recommend Topics

    • javascript

      JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

    • web

      Some thing interesting about web. New door for the world.

    • server

      A server is a program made to process requests and deliver data to clients.

    • Machine learning

      Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

    • Game

      Some thing interesting about game, make everyone happy.

    Recommend Org

    • Facebook photo Facebook

      We are working to build community through open source technology. NB: members must have two-factor auth.

    • Microsoft photo Microsoft

      Open source projects and samples from Microsoft.

    • Google photo Google

      Google ❤️ Open Source for everyone.

    • D3 photo D3

      Data-Driven Documents codes.