gpu-csv-merger's Introduction

CSV Merge Tool

This repository contains a CSV merging tool with GPU (CUDA) and CPU versions for consolidating multiple CSV files into one. The GPU version uses the RAPIDS cuDF and Dask libraries for acceleration.

Features

  • GPU (CUDA) and CPU Versions: Choose the appropriate version based on your hardware configuration.
  • Efficient Merging: Consolidate multiple CSV files into a single merged CSV file.
  • Flexible Naming: Customize the naming convention for the merged CSV file.
  • Support for Large Files: Handles large CSV files with ease.

Additional Features for GPU Version:

  • CUDA Acceleration: Uses CUDA to merge CSV files significantly faster than CPU-only computation.

Usage

GPU Version (Linux, NVIDIA GPU with CUDA support required):

python merge_gpu.py --folder_path /path/to/csv/files

CPU Version:

python merge_cpu.py --folder_path /path/to/csv/files

Note: The folder is resolved relative to the current working directory; it is not treated as an absolute path. To change how the directory is specified, adjust the following lines in the script:

import os

# Build the folder path from the current working directory and call the function
specified_folder = 'Example0'
folder_path = os.path.join(os.getcwd(), specified_folder)
merge_csv_files(folder_path)
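
If you want the script to accept either a relative or an absolute folder, one option is to resolve the path yourself before calling merge_csv_files. The following is a minimal sketch, not the repository's actual code: resolve_folder and parse_args are hypothetical helper names, and only the --folder_path flag comes from the usage shown above.

```python
import argparse
import os


def resolve_folder(folder_path, base_dir=None):
    """Return a normalized absolute path for folder_path (hypothetical helper).

    Relative paths are resolved against base_dir (default: the current
    working directory). Absolute paths are kept, apart from normalization.
    """
    base_dir = base_dir or os.getcwd()
    # os.path.join discards base_dir when folder_path is already absolute.
    return os.path.normpath(os.path.join(base_dir, folder_path))


def parse_args(argv=None):
    """Parse the --folder_path flag shown in the usage section."""
    parser = argparse.ArgumentParser(description="Merge the CSV files in a folder.")
    parser.add_argument(
        "--folder_path",
        required=True,
        help="Folder containing the CSV files (relative or absolute).",
    )
    return parser.parse_args(argv)


# Example with an explicit argument list; a real script would call parse_args()
# with no arguments so that it reads sys.argv.
args = parse_args(["--folder_path", "Example0"])
folder_path = resolve_folder(args.folder_path)
print(folder_path)  # this value would then be passed to merge_csv_files
```

Resolving the path in one place keeps the rest of the script independent of where it is launched from.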

Requirements

  • GPU Version:

    • Python 3.x
    • RAPIDS cuDF
    • Dask (for Dask cuDF support)
    • Linux environment
    • NVIDIA GPU with CUDA support
  • CPU Version:

    • Python 3.x

Note:

  • The GPU version requires a Linux environment with an NVIDIA GPU that supports CUDA.
  • RAPIDS does not run natively on Windows; the RAPIDS documentation describes a WSL2-based setup instead.
  • Current CUDA releases do not support macOS.
  • For detailed instructions on installing RAPIDS, please refer to the RAPIDS documentation.
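
The platform constraints above can be turned into a small pre-flight check that suggests which script to run. This is a sketch under stated assumptions: pick_merge_script is a hypothetical helper that is not part of the repository, and it only checks that the cudf package can be located, not that a CUDA-capable GPU is actually present.

```python
import importlib.util
import sys


def pick_merge_script(platform=None, cudf_available=None):
    """Return 'merge_gpu.py' or 'merge_cpu.py' (hypothetical helper).

    Both parameters default to values detected from the running interpreter
    and can be overridden, which makes the decision easy to test.
    """
    if platform is None:
        platform = sys.platform
    if cudf_available is None:
        # find_spec only locates the package; it does not import cuDF or
        # verify that a usable GPU and driver are installed.
        cudf_available = importlib.util.find_spec("cudf") is not None
    if platform.startswith("linux") and cudf_available:
        return "merge_gpu.py"
    return "merge_cpu.py"


print(pick_merge_script())
```

A stricter check could try importing cudf and running a trivial operation, at the cost of a slower startup.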

Example

For demonstration purposes, a folder named Example0 has been provided, containing three CSV files: file0.csv, file1.csv, and file2.csv. You can run the tool on this example by specifying the folder path.

python merge_gpu.py --folder_path Example0
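
To illustrate what merging a folder like Example0 can look like on the CPU, here is a minimal standard-library sketch. It is not the code in merge_cpu.py: the function name merge_csv_sketch, the output name merged.csv, and the sample rows are all assumptions made for illustration, and the sketch assumes every input file has the same header row.

```python
import csv
import glob
import os
import tempfile


def merge_csv_sketch(folder_path, output_name="merged.csv"):
    """Concatenate every *.csv in folder_path into output_name; return its path.

    Assumes all input files share one header row, which is written once.
    """
    output_path = os.path.join(folder_path, output_name)
    # Sort for a deterministic order and skip the result of a previous merge.
    inputs = sorted(
        p for p in glob.glob(os.path.join(folder_path, "*.csv"))
        if os.path.abspath(p) != os.path.abspath(output_path)
    )
    header_written = False
    with open(output_path, "w", newline="") as out_file:
        writer = csv.writer(out_file)
        for path in inputs:
            with open(path, newline="") as in_file:
                reader = csv.reader(in_file)
                header = next(reader, None)
                if header is None:
                    continue  # skip empty files
                if not header_written:
                    writer.writerow(header)
                    header_written = True
                # Rows are streamed one at a time, so no file is loaded whole.
                writer.writerows(reader)
    return output_path


# Demo with made-up data in a temporary folder named like the example files.
with tempfile.TemporaryDirectory() as folder:
    for i in range(3):
        with open(os.path.join(folder, f"file{i}.csv"), "w", newline="") as f:
            csv.writer(f).writerows([["id", "value"], [i, i * 10]])
    merged = merge_csv_sketch(folder)
    with open(merged, newline="") as f:
        rows = list(csv.reader(f))

print(rows)
```

Changing output_name is one way to customize the name of the merged file; the actual scripts may expose this differently.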

gpu-csv-merger's People

Contributors

tiiiiiida
