The Pyed Piper (pyp)

The original project page on Google Code is here: http://code.google.com/p/pyp/

Installation

git clone https://github.com/alexbyrnes/pyp.git
cd pyp
chmod u+x pyp
# Optionally: cp pyp /usr/local/bin

It's handy to put pyp into a directory on your PATH (for example /usr/local/bin) so you can type "pyp" instead of "./pyp".

####Background

Pyp, or The Pyed Piper, is an incredibly useful command line tool for:

  • High volume transformations of unstructured data
  • Operations that aren't available, or aren't easy, with standard Unix/Linux tools
  • Thinkers-in-python

####Usage

Filters

cat very_large_file.csv | pyp -L " len(p) > 5 " > only_long_lines.csv
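
For readers who think in Python, the filter above can be read roughly as the following plain-Python sketch. It assumes that `p` is the current line without its trailing newline; the function name `keep_long_lines` and the sample data are made up for illustration and are not part of pyp.

```python
def keep_long_lines(lines, min_length=5):
    """Yield lines whose length (without the trailing newline) exceeds min_length."""
    for line in lines:
        text = line.rstrip("\n")  # assumption: pyp hands over the line without its newline
        if len(text) > min_length:
            yield text


# Hypothetical sample input, standing in for the lines of very_large_file.csv
sample = ["short\n", "a,b,c,d\n", "\n", "1,2,3\n"]
print(list(keep_long_lines(sample)))
```

Because the function is a generator, it handles one line at a time and does not need to hold the whole file in memory.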

Regular Expressions

cat very_large_file.csv | pyp -L " p.re('[0-9a-fA-F]+') " > only_hex_digits.csv
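
As a rough illustration of the regular expression step, here is a plain-Python sketch that pulls the first run of hex digits out of each line with the standard `re` module. It is not pyp's implementation; the names `HEX_RUN`, `first_hex_run`, and `hex_runs` are hypothetical. The sketch uses the `+` quantifier so that a match always contains at least one digit (with `*`, `re.search` can match an empty string at the start of a line).

```python
import re

# One or more hex digits; "+" avoids zero-length matches
HEX_RUN = re.compile(r"[0-9a-fA-F]+")


def first_hex_run(line):
    """Return the first run of hex digits in line, or None if there is none."""
    match = HEX_RUN.search(line)
    return match.group(0) if match else None


def hex_runs(lines):
    """Yield the first hex run of every line that contains one."""
    for line in lines:
        found = first_hex_run(line)
        if found is not None:
            yield found


# Hypothetical sample lines
print(list(hex_runs(["x=1A2b;", "zzz", "77 rows"])))
```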

Compose multiple operations

cat very_large_file.csv | pyp -L " p.upper() | whitespace | p[:2] " > first_two_columns_uppercase.csv
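
The composed command above chains three steps, each feeding its result to the next. A minimal plain-Python sketch of the same idea follows, assuming that `whitespace` splits the line on runs of whitespace and that `p[:2]` then keeps the first two fields. The function name is hypothetical, and the way pyp formats the resulting list on output is not modelled here.

```python
def first_two_fields_upper(line):
    """Uppercase the line, split it on whitespace, and keep the first two fields."""
    upper = line.upper()    # step 1: p.upper()
    fields = upper.split()  # step 2: split on whitespace (assumed meaning of "whitespace")
    return fields[:2]       # step 3: p[:2]


# Hypothetical sample line
print(first_two_fields_upper("alpha beta gamma"))
```

Slicing with `[:2]` is safe on short lines: a line with a single field yields a one-element list, and an empty line yields an empty list.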

Many more examples are in the manual and in examples.sh.

####Running the Tests

python setup.py test

This will test pyp under multiple versions of Python. If you only need to test a single version of Python, you can do this instead:

python setup.py test -a "-epy27"

####Making the C version (requires Cython)

make test

This will build a binary named cyp and test it with a simple command.

####What's New

  • -L flag for large (> 50,000 line) files
  • --DEBUG to show debug output with line numbers and a stack trace
  • -D to output tab-delimited text. The large file flag includes this automatically. Add -S to specify the delimiter.
  • "cyp" compiled version
  • Optimizations -- p.file, p.dir, and p.ext moved to p.file(), p.dir(), p.ext()
