⚓ Minato

A Unified File I/O Library for Python

Minato is a Python library that provides a unified and simple interface to work with local and remote files, as well as compressed and archived files. With Minato, you can seamlessly read and write files from various sources such as the local filesystem, HTTP(S), Amazon S3, Google Cloud Storage, and Hugging Face Hub. It also supports reading and writing compressed files (gzip, bz2, lzma) and directly accessing files inside archives such as zip and tar.

One of Minato's key features is its built-in caching mechanism, which allows you to cache remote files locally, and manage the cache with a provided CLI. The cache is automatically updated based on ETag headers, ensuring that you always work with the latest version of the files.
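The idea behind ETag-based cache updates can be sketched in a few lines. This is only an illustration of the general technique, not Minato's implementation: the `ETagCache` class, its `fetch_etag` / `fetch_body` callbacks, and the fake `server` dictionary are all hypothetical names introduced here so the example runs without network access.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class CacheEntry:
    etag: Optional[str]
    content: bytes


class ETagCache:
    """Toy in-memory cache keyed by URL (hypothetical, not Minato's API)."""

    def __init__(
        self,
        fetch_etag: Callable[[str], Optional[str]],
        fetch_body: Callable[[str], bytes],
    ) -> None:
        self._fetch_etag = fetch_etag
        self._fetch_body = fetch_body
        self._entries: Dict[str, CacheEntry] = {}
        self.downloads = 0  # counts how often the body was actually fetched

    def get(self, url: str) -> bytes:
        remote_etag = self._fetch_etag(url)
        entry = self._entries.get(url)
        # Reuse the cached copy only when both ETags are known and equal.
        if entry is not None and remote_etag is not None and entry.etag == remote_etag:
            return entry.content
        # Otherwise download again and remember the new ETag.
        content = self._fetch_body(url)
        self.downloads += 1
        self._entries[url] = CacheEntry(etag=remote_etag, content=content)
        return content


# A fake "server" mapping URL -> (etag, body), standing in for HTTP requests.
server = {"http://example.com/data.txt": ("v1", b"first")}
cache = ETagCache(
    fetch_etag=lambda url: server[url][0],
    fetch_body=lambda url: server[url][1],
)
```

Calling `cache.get(url)` twice downloads the body once; changing the ETag stored in `server` makes the next call download it again. A real client would typically obtain the ETag from a HEAD request or send it back in an `If-None-Match` header.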

Features

  • Unified file I/O for local and remote files (HTTP(S), S3, GCS, Hugging Face Hub)
  • Support for reading and writing compressed files (gzip, bz2, lzma)
  • Direct access to files inside archives (zip, tar)
  • Local caching of remote files with cache management CLI
  • Automatic cache updates based on ETag headers

Installation

Install Minato using pip:

pip install minato                   # minimal installation for only local/http(s) file I/O
pip install minato[s3]               # for Amazon S3
pip install minato[gcs]              # for Google Cloud Storage
pip install minato[huggingface-hub]  # for Hugging Face Hub
pip install minato[all]              # for all supported file I/O

Usage

Quick Start

Here's a simple example demonstrating how to write a file to online storage:

import minato

# Write a file to an S3 bucket
s3_path = "s3://your_bucket/path/to/file"
with minato.open(s3_path, "w") as f:
    f.write("Create a new file on AWS S3!")

Access cached online resources in local storage:

# Cache a remote file and get its local path
remote_path = "http://example.com/path/to/archive.zip!inner/path/to/file"
local_filename = minato.cached_path(remote_path)

Access files inside archives such as zip by joining the archive path and the inner file path with an exclamation mark (!), as in the example above.
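To make the `!` convention concrete, here is a minimal sketch of how such a path could be split and resolved against a zip archive using only the standard library. It is not how Minato does it internally; `split_archive_path` and `read_zip_member` are hypothetical helpers, and the archive is built in memory rather than downloaded.

```python
import io
import zipfile
from typing import Optional, Tuple


def split_archive_path(path: str) -> Tuple[str, Optional[str]]:
    """Split 'archive!inner' into (archive, inner); inner is None without '!'."""
    archive, sep, inner = path.partition("!")
    return archive, (inner if sep else None)


def read_zip_member(archive_bytes: bytes, inner_path: str) -> bytes:
    """Read one member from a zip archive held in memory."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return zf.read(inner_path)


# Build a small zip in memory so the sketch needs no network or disk access.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("inner/path/to/file", "hello from the archive")

archive, inner = split_archive_path(
    "http://example.com/path/to/archive.zip!inner/path/to/file"
)
```

Here `archive` holds the part before the exclamation mark and `inner` the member path, which can then be passed to `read_zip_member(buffer.getvalue(), inner)`.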

Automatically decompress files with gzip / lzma / bz2 compression:

with minato.open("data.txt.gz", "rt", decompress=True) as f:
    content = f.read()

In the example above, Minato will automatically detect the file format based on the file's content or filename and decompress the file accordingly.
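Detection by content or filename could look roughly like the sketch below, which checks well-known magic bytes first and falls back to the file extension. This is an assumption about the general approach, not Minato's actual code; `pick_decompressor` and `auto_decompress` are hypothetical names, and only the standard `gzip`, `bz2`, and `lzma` modules are used.

```python
import bz2
import gzip
import lzma
from typing import Callable, Optional

# Leading "magic" bytes of each format, paired with a matching decompressor.
_MAGIC = (
    (b"\x1f\x8b", gzip.decompress),      # gzip
    (b"BZh", bz2.decompress),            # bzip2
    (b"\xfd7zXZ\x00", lzma.decompress),  # xz container
)

# Fallback when the content is inconclusive (e.g. legacy .lzma has no fixed magic).
_SUFFIXES = {
    ".gz": gzip.decompress,
    ".bz2": bz2.decompress,
    ".xz": lzma.decompress,
    ".lzma": lzma.decompress,
}


def pick_decompressor(data: bytes, filename: str = "") -> Optional[Callable[[bytes], bytes]]:
    """Return a decompress function chosen by content first, then by filename."""
    for magic, func in _MAGIC:
        if data.startswith(magic):
            return func
    for suffix, func in _SUFFIXES.items():
        if filename.endswith(suffix):
            return func
    return None


def auto_decompress(data: bytes, filename: str = "") -> bytes:
    """Decompress data if a format is detected; otherwise return it unchanged."""
    func = pick_decompressor(data, filename)
    return func(data) if func else data
```

Note that in this sketch a misleading extension on uncompressed data would cause the chosen decompressor to raise an error; a more defensive version might catch that and return the raw bytes.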

Cache Management

❯ poetry run minato --help
usage: minato

positional arguments:
  {cache,list,remove,update}
    cache               cache remote file and return cached local file path
    list                show list of cached files
    remove              remove cached files
    update              update cached files

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit

Contributors

altescy, dependabot[bot]
