progrium/bashstyle

Bash is the JavaScript of systems programming. Although in some cases it's better to use a systems language like C or Go, Bash is an ideal systems language for smaller POSIX-oriented or command line tasks. Here are three quick reasons why:

  • It's everywhere. Like JavaScript for the web, Bash is already there ready for systems programming.
  • It's neutral. Unlike Ruby, Python, JavaScript, or PHP, Bash offends equally across all communities. ;)
  • It's made to be glue. Write complex parts in C or Go (or whatever!), and glue them together with Bash.

This document is how I write Bash and how I'd like collaborators to write Bash with me in my open source projects. It's based on a lot of experience and time collecting best practices. Most of them come from these two articles, but here integrated, slightly modified, and focused on the most bang-for-the-buck items. Plus some new stuff!

Keep in mind this is not for general shell scripting; these are rules specifically for Bash, and they can take advantage of assumptions about Bash as the interpreter.

Big Rules

  • Always double quote variables, including subshells. No naked $ signs
  • All code goes in a function. Even if it's one function, main.
    • Unless a library script, you can do global script settings and call main. That's it.
    • Avoid global variables. Though when defining constants use readonly
  • Always have a main function for runnable scripts, called with main or main "$@"
    • If script is also usable as library, call it using [[ "$0" == "$BASH_SOURCE" ]] && main "$@"
  • Always use local when setting variables, unless there is reason to use declare
    • Exception being rare cases when you are intentionally setting a variable in an outer scope.
  • Variable names should be lowercase unless exported to environment.
  • Always use set -eo pipefail. Fail fast and be aware of exit codes.
    • Use || true on programs that you intentionally let exit non-zero.
  • Never use deprecated style. Most notably:
    • Define functions as myfunc() { }, not function myfunc { }
    • Always use [[ instead of [ or test
    • Never use backticks, use $( ... )
  • Prefer absolute paths (leverage $PWD), always qualify relative paths with ./.
  • Always use declare and name variable arguments at the top of functions that are more than two lines
    • Example: declare arg1="$1" arg2="$2"
    • The exception is when defining variadic functions. See below.
  • Use mktemp for temporary files, always cleanup with a trap.
  • Warnings and errors should go to STDERR, anything parsable should go to STDOUT.
  • Try to localize shopt usage and disable option when finished.

If you know what you're doing, you can bend or break some of these rules, but generally they will be right and be extremely helpful.
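Taken together, the big rules above suggest a boilerplate like the following (a minimal sketch; the greet function and its argument are hypothetical examples):

```shell
#!/usr/bin/env bash
set -eo pipefail

# All code lives in functions; arguments are named with declare.
greet() {
	declare name="$1"
	printf 'Hello, %s\n' "$name"
}

main() {
	declare target="${1:-world}"
	greet "$target"
}

# Run main only when executed directly, not when sourced as a library.
[[ "$0" == "$BASH_SOURCE" ]] && main "$@"
```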

Best Practices and Tips

  • Use Bash variable substitution if possible before awk/sed.
  • Generally use double quotes unless it makes more sense to use single quotes.
  • For simple conditionals, try using && and ||.
  • Don't be afraid of printf, it's more powerful than echo.
  • Put then, do, etc on same line, not newline.
  • Skip [[ ... ]] in your if-expression if you can test for exit code instead.
  • Use a .sh or .bash extension if the file is meant to be included/sourced. Never use an extension on an executable script.
  • Put complex one-liners of sed, perl, etc in a standalone function with a descriptive name.
  • Good idea to include [[ "$TRACE" ]] && set -x
  • Design for simplicity and obvious usage.
    • Avoid option flags and parsing, try optional environment variables instead.
    • Use subcommands for necessary different "modes".
  • In large systems or for any CLI commands, add a description to functions.
    • Use declare desc="description" at the top of functions, even above argument declaration.
    • This can be queried/extracted using reflection. For example:
    eval "$(type FUNCTION_NAME | grep 'declare desc=')" && echo "$desc"
    
  • Be conscious of the need for portability. Bash to run in a container can make more assumptions than Bash made to run on multiple platforms.
  • When expecting or exporting environment, consider namespacing variables when subshells may be involved.
  • Use hard tabs. Heredocs ignore leading tabs, allowing better indentation.
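As a sketch of the subcommand approach suggested above (the start/stop commands are hypothetical):

```shell
#!/usr/bin/env bash
set -eo pipefail

# Hypothetical "modes" as subcommands rather than option flags.
cmd_start() { printf 'starting\n'; }
cmd_stop()  { printf 'stopping\n'; }

usage() { printf 'usage: %s {start|stop}\n' "$0"; }

main() {
	declare cmd="${1:-help}"
	shift || true
	case "$cmd" in
		start) cmd_start "$@" ;;
		stop)  cmd_stop "$@" ;;
		*)     usage ;;
	esac
}

[[ "$0" == "$BASH_SOURCE" ]] && main "$@"
```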

Good References and Help

Examples

Regular function with named arguments

Defining functions with arguments

regular_func() {
	declare arg1="$1" arg2="$2" arg3="$3"

	# ...
}

Variadic functions

Defining functions with a final variadic argument

variadic_func() {
	local arg1="$1"; shift
	local arg2="$1"; shift
	local rest=("$@")   # use an array; a plain string would lose word boundaries

	# ...
}

Conditionals: Testing for exit code vs output

# Test for exit code (-q mutes output)
if grep -q 'foo' somefile; then
  ...
fi

# Test for output (-m1 limits to one result)
if [[ "$(grep -m1 'foo' somefile)" ]]; then
  ...
fi

More todo

Contributors

briceburg, clarete, dtruebin, eedrah, progrium, shazow, unixorn


Issues

Tabs are not evil!

I use tabs because of their great application in heredocs:

#!/usr/bin/env bash

main() {
	cat > "message.txt" <<-EOF
		Hello, ${USER}!

		This has no leading whitespace.
		It is also nested in a more readable way.
		All indentation here is with tabs.
	EOF
}

[[ "$0" == "${BASH_SOURCE}" ]] && main "$@"

[discuss] Suffix all bash scripts with .sh

Use .sh or .bash extension if file is meant to be included/sourced. Never on executable script.

I'm curious about the philosophy of NOT including .sh as a suffix for scripts.

The main reason I always suffix shell scripts with .sh is to make shellcheck linting simpler. A makefile rule with find . -type f -name '*.sh' | xargs shellcheck is the simplest way I know to ensure quality control for all scripts in a repo.

Using STDOUT, STDERR, and the other file descriptors

STDOUT should be easily parseable. Output from STDOUT should avoid using locale-dependent human language wherever possible. (The closer you can match the syntax used by an existing, older utility your output would be piped to, the better.)

STDERR should be used for any warnings or errors (since they won't go in a pipe and will normally go to the terminal). See this Stack Overflow question.

The other file descriptors should maybe have a convention, taking into account how they get used and what bugs there are around them (isn't there something about fd5 not being copied for subcommands or something)? For example, fd6 should be used as the holding descriptor when swapping two other file descriptors.

I know I used the convention in Plushu that a command's programmatic output was to be output on STDOUT (so it could be captured with command substitution), and the command's terminal output (the output from its calls' STDOUT) was output on fd3.
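A sketch of that convention (the build function and its output are hypothetical): programmatic output goes to STDOUT, terminal chatter goes to fd 3, and the caller decides where fd 3 points.

```shell
build_app() {
	printf 'building...\n' >&3   # terminal output, for humans
	printf 'id-1234\n'           # programmatic output, for capture
}

# Route fd 3 to stderr so the chatter reaches the terminal while the
# result is captured by command substitution:
app_id="$(build_app 3>&2)"
```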

Ownership and modes

(This is a best practice that sort of runs into the "protips" area, one that applies more in Plushu or Dokku than it does to most Bash scripts.)

Ownership and sudo

Whenever you're creating something in a script, you should make sure that the files will be owned by the correct user.

For example, if you have a script that creates a directory of files, and this script is meant to be run as root, after creating it, you should do something like this:

if [[ "$EUID" == 0 && -n "$SUDO_USER" ]]; then
  chown -R "$SUDO_USER:" "$created_dir"
fi

Take care to note the colon after the username in the chown command. This tells chown to change not only the owning user on the files, but also the owning group on those files to the specified user's group (the same ownership it would have had had that user created the files themselves).

Permission bits

If you want to create a file with certain modes unset, you can run the command that creates the file in a subshell, prefixed by a umask command which will unset permission bits for any file created in that subshell:

(umask 0226; printf '%s\n' \
  "$PLUSHU_USER ALL=(ALL)NOPASSWD:$(command -v nginx) -s reload" \
  >/etc/sudoers.d/plushu-reload-nginx)

Note that the umask is an inverted octal bitmask to restrict the permissions that files will be created with. If the script will normally create files with permission bits 0666 (-rw-rw-rw-), a umask of 0226 will create files with permissions of 0440 (-r--r-----).

This is specifically useful for creating a sudoers file (either the main /etc/sudoers or a file included from it), as sudo will refuse to run when a file in sudo's configuration does not have the proper permission bits. This can also be useful when working with files in a user's .ssh directory.

Only capitalize exported/environment variable names

This is one of the stylistic rules I'm using for Plushu, and it's working out pretty well: only use ALL_CAPS_SNAKE_CASE when you're going to be getting a variable from the environment, or when you're going to export it to the environment. Names for script-local variables that will only appear in subshells should be small_caps_snake_case.
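For example (the variable names and values here are made up):

```shell
# Exported to / read from the environment: ALL_CAPS
export PLUSHU_ROOT="/var/lib/plushu"

# Local to the script: lowercase snake_case
plugin_dir="$PLUSHU_ROOT/plugins"
```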

maintain in bpkg ?

Heyo - would you be into maintaining this in bpkg ? It'd be great if we could spread these same ideas about bash in bpkg as part of its core philosophy. I could always just reference : ]

`if expr` vs `if [[ expr ]]`?

In what cases does it make sense to use if expr?

I recall there is some edge case scenario where it's more straight forward to do that, but I can't remember when.

Shell linter

Probably makes sense to write a tool that analyzes shell scripts to tell you if you did it right.

set -u

I would add set -u into your advice for set -eo pipefail, i.e. set -euo pipefail

An enhancement to that is to fail early in a script with useful feedback.

: "${FOO:?You have not set the FOO variable}"
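A sketch of how these combine at the top of a script (FOO stands in for whatever variable your script actually requires; the demonstration runs in subshells so the current shell survives the failing case):

```shell
set -euo pipefail

# ${VAR:?message} aborts with useful feedback when VAR is unset or empty:
check_env() {
	: "${FOO:?You have not set the FOO variable}"
}

if (FOO=bar check_env) 2>/dev/null; then
	printf 'FOO present: check passes\n'
fi
if ! (unset FOO; check_env) 2>/dev/null; then
	printf 'FOO missing: check fails fast\n' >&2
fi
```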

.bash_profile tips

Hey @progrium , loved the idea to put this repo together.
I was wondering what would be good advice for exports and aliases in .bash_profile.

Is there any good convention to keep those organised like:

aliases() {
    alias lsa="ls -la"
    …
}

exports() {
    export PATH="/usr/local/bin"
}

cheers

posix shell instead of bash?

Why not encourage POSIX shell instead of Bash? It's still possible to stick with the /bin/sh shell and make things look as neat as Bash.

Notes on randomness, /tmp files, cleanup, and signal traps

  • When you need to save a stream, use a file created by mktemp. Note that mktemp will use some (crypto-strong? I don't even think so) randomness to give your file an unpredictable name that's not going to collide with anything like another concurrent running instance of your script.
    • This is the most stable and secure solution I could find: I'd be glad if someone had a better solution here, as this does kind of give me that "this is totally a hole" feeling.
  • When you're making temporary files, make sure they're deleted as soon as possible.
  • If you've set -eo pipefail, your script could exit at any point, so make sure you've got your cleanup run as part of a trap for the exit signal.
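A sketch of the mktemp-plus-trap pattern described above:

```shell
#!/usr/bin/env bash
set -eo pipefail

cleanup() {
	rm -f "$scratch"
}

main() {
	scratch="$(mktemp)"   # deliberately not local: cleanup reads it
	# Under set -eo pipefail the script can exit anywhere;
	# the EXIT trap still fires and removes the file.
	trap cleanup EXIT

	printf 'work in progress\n' > "$scratch"
	cat "$scratch"
}

main "$@"
```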

When, and how, to use xargs

Since this is apparently a somewhat obscure tool for a very necessary task (converting strings to argument lists while preserving quoting/grouping), I think it's called for to have a note on using xargs -x (and xargs -xa when you can't clobber STDIN) to deserialize arguments from strings/streams, involving how to interpolate multiple inputs to output your string as a stream (use a sequence of commands in a subshell).
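For example, a sketch of the round trip (xargs' default input mode honors the backslash escaping that printf %q emits for simple strings; -n1 just runs the command once per deserialized argument):

```shell
# Serialize two arguments, spaces and all, into one string:
args="$(printf '%q ' 'hello world' 'two  spaces')"

# xargs re-splits the string into the original two arguments:
printf '%s' "$args" | xargs -n1 printf '<%s>\n'
# <hello world>
# <two  spaces>
```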

Throwing `set -a` and `set +a` around a variable pair list

set -a makes it so that every assignment has an implicit export in front of it. set +a turns this off.

This makes it really easy to keep configuration in a file like this (which is compatible with Docker's --env option and EnvironmentFile in systemd):

SECRET_BASE=qweasd4321
SPECIAL_KEY=ghfdwqu98

and then export its variables (eg. setting up an environment for a script):

set -a; . config_file; set +a

Why to always use "if [[ ... ]]; then; fi" instead of a conditional statement

Statements joined with && don't end a script even when it's running with set -e. But if such a statement is the last one in your file, its non-zero exit code becomes your script's exit code (think of it as an implicit return), and if that script was called by something with set -e, the whole house of cards will come tumbling down.

In other words, while it's an easily-avoided ledge, it still needs a railing, because a single careless bump will send you crashing way the hell down.

I used to remember this pitfall, but then I went away for a few months, and when I came back I couldn't remember why I didn't do this, and I did it all over my code, and immediately got bitten by this. In general, if you're using set -e, only use tests to test things.
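A sketch of the pitfall and the fix, run in child shells so the examples don't take down the current one:

```shell
# Pitfall: when "cond && action" is the last statement and cond is
# false, the script's exit code is nonzero even though nothing failed.
pitfall() { bash -c 'set -e; [[ "$1" == yes ]] && echo confirmed' _ "$1"; }

# Fix: a plain if exits 0 when its condition is false.
fixed() { bash -c 'set -e; if [[ "$1" == yes ]]; then echo confirmed; fi' _ "$1"; }

pitfall no || printf 'pitfall: exit %s\n' "$?"   # exit 1
fixed no   && printf 'fixed: exit 0\n'
```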

Locale independence

Setting LC_ALL=C

See http://unix.stackexchange.com/questions/87745/what-does-lc-all-c-do

If you don't set LC_ALL=C for, say, sorting a list of numbers from a container/network source, the result is going to depend on the system locale. That's a problem if, say, that system's locale uses commas as the decimal separator rather than dots.

Even if you're managing the environment in a container, make sure that the default locale in that container's environment is LC_ALL=C, rather than something like en_US.
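For example, collation differs between locales; pinning LC_ALL=C per command keeps the output byte-ordered and stable:

```shell
# In the C locale, sort orders by byte value: all uppercase before lowercase.
printf 'b\nA\na\nB\n' | LC_ALL=C sort
# A
# B
# a
# b
```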

Serializing with "$(printf %q "$VARIABLE")"

This is one of those holes that can open up if you're not being conscious of how you're outputting your strings. If it could have spaces, quotes, or dollar signs (and you should always treat strings as potentially having these things), and you're putting it somewhere where it will be treated as part of a string representing an argument list (to be deserialized using xargs -x, see #17), you should output it with printf %q rather than just naively echoing it (which should also be done with printf, see #18).
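A sketch of the round trip through the shell parser (the payload string is made up; the point is that it survives intact and nothing in it executes):

```shell
value='a b; $(echo pwned) "quoted"'

# %q escapes the value so it can be safely re-parsed as shell input:
serialized="$(printf '%q' "$value")"

# Re-parse it; the original value comes back byte-for-byte:
eval "roundtrip=$serialized"
[[ "$roundtrip" == "$value" ]] && printf 'round-trip ok\n'
```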

Namespacing / environmental consciousness; and, when to use args and when to use vars

There should maybe be a note about how general to make the environment variables you choose: remember that an ENV variable like DIR can be the only usage of DIR throughout the entire call stack leading up to your command, and if you use too general a name for too specific a thing, you're going to force authors to wrap your command in something that sets the environment up for your command anyway. (Besides, for this specific example, you should be using PWD for the one "dir" you're going to work in - that's why it's called the working directory.)

Similarly, maybe there should be a note on when you should use environment variables (for long-running deep truths about the environment in which the command and all lower commands are being executed) and when you should use arguments (when it's a parameter for this specific operation). Maybe even some thoughts on when to use positional opts vs. long opts vs. short opts vs. subcommands?

Coverage

Please take a look at LCOV.SH, a full Bash implementation of coverage… no need for additional interpreters like Ruby or a binary executable on your machine. Check coverage of Bash with just Bash: https://github.com/javanile/lcov.sh

Tracing and Debugging

One nice thing to put in a script's boilerplate is an environment variable to read that will enable xtrace and verbose, something like:

if [[ -n "$PLUSHU_TRACE" ]]; then set -vx; fi

This makes it possible to debug a script without having to mess with its option parsing or shebang.

Fully-qualified paths

Even relative paths should be qualified with ./: ideally, all paths should be fully-qualified down to the root, so that they'll be equally valid when passed to another command with a different PWD.

Never use paths with an implicit containing directory (like rm *): if one of those filenames starts with a hyphen, it's entirely likely to be interpreted as an option rather than a filename (hope that rm * wasn't run in a directory with a file named -r)!
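For example, in a scratch directory:

```shell
tmpdir="$(mktemp -d)"
cd "$tmpdir"
touch -- '-r' 'victim'

# "rm *" would expand to "rm -r victim", treating -r as an option.
# Qualifying the glob makes every word a path, never an option:
rm ./*

ls -A   # prints nothing: both files were removed as files
cd - >/dev/null && rmdir "$tmpdir"
```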

Use [[ -n "$whatever" ]] rather than [[ "$whatever" ]]

Okay, so, in Bash, with double-bracket tests, this will never actually be misinterpreted.

However:

  • Using -n makes your test have an operator, just like every other test will, making it perfectly clear that this wasn't, say, somebody forgetting to add an operator/operand to the test.
  • In older shell constructs like the single-bracket test (which your tests could be converted to if somebody copies a snippet to port to zsh or something), [ "$whatever" ... is how you'd write a test whose operator comes from a variable (yikes!), and thus this opens up a potential attack channel (see #18).

Use `printf '%s\n'` rather than `echo` to echo strings

If the string you're echoing is -n, passing that string to echo will output nothing, potentially causing tons of stuff to break. This is why you should use printf '%s\n' for all line-echoes, except maybe when you're echoing fixed messages to STDERR with echo >&2.
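For example:

```shell
var='-n'
echo "$var"            # prints nothing: echo consumes -n as an option
printf '%s\n' "$var"   # prints -n
```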

Bash Unit Testing

There's more than just shunit2 out there for bash unit testing. I've used:

  • roundup
  • plus a bunch of additions such as test reporting via junit (internal, which I'd like to get out to the community), namely better integration with:

Mocking:

I've also been wanting to try out:
