bigdata-file-viewer

A cross-platform (Windows, macOS, Linux) desktop application for viewing common big data binary formats such as Parquet, ORC, Avro, etc. It supports the local file system, HDFS, AWS S3, etc.

Note: if you only want to view local big data binary files, release v1.1.1 is recommended. It is lightweight and has no dependency on the AWS SDK, Azure SDK, etc. Quite honestly, you can download data files from the web portals of AWS, Azure, etc. before viewing them with this tool. I integrated the cloud storage SDKs into this tool mainly as a demo of how to use Java to read files from a specific storage system.

Feature List

  • Open and view Parquet, ORC and Avro files from a local directory, HDFS, AWS S3, etc.
  • Convert binary format data to a text format such as CSV
  • Support complex data types such as array, map, struct, etc.
  • Support multiple platforms: Windows, macOS and Linux
  • The code is extensible to support other data formats
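
To illustrate the CSV conversion feature, here is a minimal sketch of how one record could be flattened into a CSV line. The names CsvSketch, escape and toCsvLine are hypothetical and not taken from this project. Rendering nested values (arrays, maps, structs) with String.valueOf is an assumption made for the example, not necessarily how the tool formats them.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not part of bigdata-file-viewer.
class CsvSketch {

    // Quote a field when it contains a comma, a double quote or a line break.
    // Embedded double quotes are doubled, following the usual CSV convention.
    static String escape(String field) {
        if (field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    // Render one record as a CSV line; null becomes an empty field.
    // Assumption: complex values are rendered with String.valueOf.
    static String toCsvLine(List<?> values) {
        return values.stream()
                .map(v -> v == null ? "" : String.valueOf(v))
                .map(CsvSketch::escape)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // The nested list stands in for an array-typed column.
        System.out.println(toCsvLine(Arrays.asList(1, "a,b", Arrays.asList(1, 2))));
    }
}
```

Fields are quoted only when needed, so simple numeric and string columns stay readable while values containing commas still survive a round trip through a CSV parser.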

Usage

  • Download the runnable jar from the release page, or run from the source directory with mvn exec:java -Dexec.mainClass=org.eugene.App
  • Launch it with java -jar BigdataFileViewer-1.1-SNAPSHOT-jar-with-dependencies.jar
  • Open a binary format file via "File" -> "Open". Currently, it can open files with the parquet, orc and avro suffixes. If no suffix is specified, the tool will try to read the file as a Parquet file
  • Set the maximum number of rows per page via "View" -> Input maximum row number -> "Go"
  • Set visible properties via "View" -> "Add/Remove Properties"
  • Convert to a CSV file via "File" -> "Save as" -> "CSV"
  • Check schema information by unfolding the "Schema Information" panel
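
The suffix rule described in the "Open" step can be sketched as follows. This is only an illustration under stated assumptions: FormatGuess, Format and detect are hypothetical names, and both the case-insensitive matching and the UNSUPPORTED result for other suffixes are guesses rather than the tool's documented behavior. Only the fallback to Parquet for files without a suffix comes from the description above.

```java
import java.util.Locale;

// Hypothetical sketch for illustration only; not the project's actual code.
class FormatGuess {

    enum Format { PARQUET, ORC, AVRO, UNSUPPORTED }

    static Format detect(String fileName) {
        // Look only at the last path segment so dots in directory names are ignored.
        int lastSeparator = Math.max(fileName.lastIndexOf('/'), fileName.lastIndexOf('\\'));
        String name = fileName.substring(lastSeparator + 1);

        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            // No suffix: fall back to Parquet, as the usage notes describe.
            return Format.PARQUET;
        }

        // Assumption: suffix matching ignores case.
        switch (name.substring(dot + 1).toLowerCase(Locale.ROOT)) {
            case "parquet": return Format.PARQUET;
            case "orc":     return Format.ORC;
            case "avro":    return Format.AVRO;
            default:        return Format.UNSUPPORTED; // assumption: other suffixes are rejected
        }
    }

    public static void main(String[] args) {
        System.out.println(detect("users.avro"));
        System.out.println(detect("/data/part-00000"));
    }
}
```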

Click here for live demo

Build

  • To build an all-in-one runnable jar, use mvn clean compile assembly:single
  • Java 1.8 or higher is required
  • Make sure your Java installation includes JavaFX. For example, the OpenJDK 1.8 I installed on Ubuntu 18.04 did not include JavaFX, so I installed it following the guide here.
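
One way to check whether a given Java runtime can see JavaFX is to probe for its Application base class. The sketch below is an assumption-laden illustration: JavaFxCheck and isClassAvailable are hypothetical names, and the check only tells you whether the class is visible to the class loader used here, which may differ on modular JDKs depending on how JavaFX is supplied.

```java
// Hypothetical helper for illustration only; not part of bigdata-file-viewer.
class JavaFxCheck {

    // Returns true if the named class can be located by this class's loader.
    static boolean isClassAvailable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, JavaFxCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // javafx.application.Application is the base class of JavaFX applications.
        boolean hasJavaFx = isClassAvailable("javafx.application.Application");
        System.out.println(hasJavaFx
                ? "JavaFX found"
                : "JavaFX is missing from this Java runtime");
    }
}
```

Compile and run it with the same java binary you plan to use for the viewer, since different installed JDKs may give different results.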

Screenshots

Main page
