junix: Unix as if JSON mattered

Unix tools are built around the idea of reading and writing lines of text. Anxieties about the limitations of this approach, and ideas for working with more complex structures, go back decades.

What if the tools were built around JSON tokens and objects instead?

Quoting conventions for arguments

Ideally, command-line arguments would also be JSON tokens or objects, but the tools need to interoperate with the standard shell and other utilities, where arguments are normally typeless and interpreted as strings or options.

For explicit typing, any argument that begins with a colon has that character stripped, and the remainder of the argument is parsed as normal JSON. So:

  • :true, :false, :null
  • :"hello world"
  • :12345.67
  • :{ "true": false }

Any argument that does not begin with a colon is interpreted as a number if it is uniformly numeric (including scientific notation), and as a string otherwise. Strings that begin with a colon, or strings that look like numbers, must be written in the explicit : form (for example :"12345") to avoid misinterpretation.

By this standard, command line options like -rf and the options-terminator -- are strings. Perhaps this should be revisited.

Note that the argument "hello" (which would be typed as '"hello"' or "\"hello\"") is equivalent to :"\"hello\"", with nested quotation marks.

(A future shell should also be aware of these conventions for * expansion and tab completion.)
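The argument rules above can be sketched in Python. This is only an illustration: the name parse_arg is hypothetical, and reading "uniformly numeric" as "matches the JSON number grammar" is an assumption, since the text does not pin the grammar down.

```python
import json
import re

# Assumption: "uniformly numeric" means the JSON number grammar (RFC 8259).
_NUMBER = re.compile(r"-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?")

def parse_arg(arg):
    """Interpret one command-line argument under the colon convention."""
    if arg.startswith(":"):
        # Explicit typing: strip the colon, parse the remainder as JSON.
        return json.loads(arg[1:])
    if _NUMBER.fullmatch(arg):
        # Implicit typing: bare numbers become numbers.
        return json.loads(arg)
    # Everything else, including options like -rf and --, is a string.
    return arg
```

Under this sketch, parse_arg("-rf") stays the string "-rf", and parse_arg(':"12345"') yields the string "12345" rather than a number.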

jecho

jecho prints its arguments. By convention and for readability, each is printed on a separate line, although they could be separated by any whitespace in practice.

$ jecho "hello world" ":{ \"foo\": true }" 12345
"hello world"
{"foo":true}
12345

jarg

jarg stringifies its input objects using the command line argument : quoting conventions, so that output of one command can be passed as arguments to another. The idea is that

jecho $(jwhatever | jarg)

should be equivalent to jwhatever on its own.

$ jecho "hello world" ":{ \"foo\": true }" 12345 | jarg
:"hello\u0020world"
:{"foo":true}
:12345

Note that spaces within strings are escaped (as \u0020) to avoid creating separate shell arguments. The only whitespace in the output separates JSON objects.
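One possible way to produce that form, sketched in Python. The name to_arg is hypothetical, and the sketch relies on json.dumps escaping every whitespace character other than the plain space when ensure_ascii is left at its default.

```python
import json

def to_arg(value):
    """Render one JSON value as a single shell word in the : form."""
    # Compact separators: no whitespace between tokens.
    text = json.dumps(value, separators=(",", ":"))
    # With ensure_ascii (the default), tabs, newlines and non-ASCII
    # whitespace are already escaped, so any literal space left over
    # must be inside a string and can be written as \u0020.
    return ":" + text.replace(" ", "\\u0020")
```

Stripping the colon and parsing the rest as JSON recovers the original value, which is the round trip that jecho $(jwhatever | jarg) depends on.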

jcat

jcat concatenates JSON streams, which is basically like concatenating any other streams.

jarray

jarray wraps the JSON objects that it reads in an array.

$ jecho "hello world" ":{ \"foo\": true }" 12345 | jarray
["hello world",{"foo":true},12345]

junarray

junarray removes an array wrapper from its input. It is an error if an input object is not an array. If there are multiple arrays in the input, their contents are merged together.

$ jecho ":[1,2,3]" ":[4,5]" | junarray
1
2
3
4
5
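jcat, jarray, and junarray all start by reading a stream of whitespace-separated JSON values. A minimal Python sketch of that reading step, and of the two array tools on top of it, follows. The function names are hypothetical, and the whole input is held in memory for simplicity.

```python
import json

def read_values(text):
    """Yield each JSON value from a string of whitespace-separated values."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            return
        value, pos = decoder.raw_decode(text, pos)
        yield value

def jarray(text):
    """Wrap every value in the stream in a single array."""
    return list(read_values(text))

def junarray(text):
    """Merge the contents of every array in the stream; reject non-arrays."""
    merged = []
    for value in read_values(text):
        if not isinstance(value, list):
            raise ValueError("junarray: input value is not an array")
        merged.extend(value)
    return merged
```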

jkeys

jkeys extracts the keys of each hash in its input into an array. It is an error if an input object is not a hash.

$ jecho ":{\"foo\": true, \"bar\": false}" ":{\"yes\": \"no\"}" | jkeys
["foo","bar"]
["yes"]
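The per-object behaviour is small enough to sketch in Python. The function name is hypothetical. Key order follows the input because json.loads builds dicts in document order and Python dicts preserve insertion order.

```python
import json

def jkeys(value):
    """Return the keys of one hash as an array; reject anything else."""
    if not isinstance(value, dict):
        raise ValueError("jkeys: input value is not a hash")
    return list(value)

# Example: the two hashes from the session above.
first = jkeys(json.loads('{"foo": true, "bar": false}'))
second = jkeys(json.loads('{"yes": "no"}'))
```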

jls

It might be nice if the output of plain jls were just a stream of strings, but this creates needless incompatibility with recursive or long listings. So instead they are hash keys, and if you just want the names, you can use jkeys.

$ jls / | jkeys | junarray
"bin"
"dev"
"etc"
"usr"
"tmp"
$ jls -l /
{
	"bin": { "user": "root", "group": "wheel", "bytes": 1326, "modified": "2016-08-01T12:34:45Z", "type": "directory" },
	"dev": { "user": "root", "group": "wheel", "bytes": 4218, "modified": "2016-08-29T11:22:33Z", "type": "directory" },
	...
}
$ jls -r /
{
	"bin": {
		"entries": {
			"bash":{}, "cat":{}, "chmod":{}, "cp":{}, "date":{}, ...
		}
	},
	"usr": {
		"entries": {
			"bin": {
				"entries": { "awk":{}, "bc":{}, "c++":{}, ... }
			},
			"include": {
				"entries": { "stdio.h":{}, "stdlib.h":{}, "unistd.h":{}, ... }
			},
			...
		}
	},
	...
}

TBD: How tersely or verbosely should permissions be reported?

jgrep

Grep is hard. There needs to be support for specifying both

  • the unit of analysis, and
  • the pattern to be matched

and the unit of analysis may be

  • top-level objects
  • objects in an array, or
  • key-value pairs, by key or value

and the pattern is probably an arbitrary JS-like expression, not something compact like a regular expression, although the pattern match will probably involve regular expression matching.

If the unit of analysis is not top-level objects, should the wrappers be retained? Must the wrappers be retained if key-value pairs are the unit of analysis?

GeoJSON

To grep GeoJSON, you need to specify how to recognize a feature, and then an expression for which features you are interested in. So something like:

jgrep -F 'type == "Feature"' 'properties.MTFCC =~ /^S/'

to split on Features and to select the ones that are TIGER roads.
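As a rough Python illustration of those two steps (find the units, then filter them): the function names and the sample shape are assumptions, and the real expression language is still undecided.

```python
import re

def features(node):
    """Yield every object whose "type" is "Feature", at any depth."""
    if isinstance(node, dict):
        if node.get("type") == "Feature":
            yield node
        else:
            for child in node.values():
                yield from features(child)
    elif isinstance(node, list):
        for child in node:
            yield from features(child)

def tiger_roads(doc):
    """Keep the features whose properties.MTFCC starts with S."""
    selected = []
    for feature in features(doc):
        props = feature.get("properties") or {}  # properties may be null
        mtfcc = props.get("MTFCC")
        if isinstance(mtfcc, str) and re.search(r"^S", mtfcc):
            selected.append(feature)
    return selected
```

Whether the surrounding FeatureCollection wrapper should be re-emitted around the matches is the open question raised above; this sketch simply drops it.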

ls

If directory entries are hash keys, and you want to select, say, all the PDF files in a recursive ls, then you need first of all to be able to recognize which hash keys are filenames (as opposed to keys for the file size and date), and then to match on them.

Maybe the answer is to treat each key-value pair as a temporary pseudo-object with key and value components? This would also require magic parentkey and parentvalue references to talk about the key-value pair that this pair is a child of, as opposed to the entire hash that it is a child of.

jgrep -F 'key != undef && (parentkey == null || parentkey == "entries")' 'key =~ /\.pdf$/i'
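The pseudo-object idea can be tried out in Python. This is a sketch under assumptions: the names pairs and pdf_names are hypothetical, only parentkey is modelled (not parentvalue), and the listing shape is the one shown for jls -r.

```python
import re

def pairs(node, parentkey=None):
    """Yield (key, value, parentkey) for every key-value pair in nested hashes."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield key, value, parentkey
            # The pair just visited becomes the parent of its children.
            yield from pairs(value, key)

def pdf_names(listing):
    """Select filename keys (top level or under "entries") ending in .pdf."""
    pattern = re.compile(r"\.pdf$", re.IGNORECASE)
    return [
        key
        for key, _value, parentkey in pairs(listing)
        if (parentkey is None or parentkey == "entries") and pattern.search(key)
    ]
```

Keys such as "user" or "bytes" in a long listing have a filename as their parentkey, so the first condition filters them out before the pattern is tried.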

Grepping multiple files

Traditional grep output is just the matched text if there is one input, and the name of the file followed by the matched text if there are multiple files.

Should the top level be a hash mapping filenames to an array of the objects that were matched in each one? With an empty string as the key if searching the standard input?

awk

comm

cut

du

expr

file

find

head

join

last

less

paste

ps

sed

sort

tail

test

tr

uniq

vi

wc

who

xargs

Contributors

e-n-f
