burntsushi / ripgrep

ripgrep recursively searches directories for a regex pattern while respecting your gitignore

License: The Unlicense


ripgrep's Introduction

ripgrep (rg)

ripgrep is a line-oriented search tool that recursively searches the current directory for a regex pattern. By default, ripgrep will respect gitignore rules and automatically skip hidden files/directories and binary files. (To disable all automatic filtering by default, use rg -uuu.) ripgrep has first class support on Windows, macOS and Linux, with binary downloads available for every release. ripgrep is similar to other popular search tools like The Silver Searcher, ack and grep.

Dual-licensed under MIT or the UNLICENSE.

CHANGELOG

Please see the CHANGELOG for a release history.

Quick examples comparing tools

This example searches the entire Linux kernel source tree (after running make defconfig && make -j8) for [A-Z]+_SUSPEND, where all matches must be words. Timings were collected on a system with an Intel i9-12900K 5.2 GHz.

Please remember that a single benchmark is never enough! See my blog post on ripgrep for a very detailed comparison with more benchmarks and analysis.

Tool	Command	Line count	Time
ripgrep (Unicode)	`rg -n -w '[A-Z]+_SUSPEND'`	536	0.082s (1.00x)
hypergrep	`hgrep -n -w '[A-Z]+_SUSPEND'`	536	0.167s (2.04x)
git grep	`git grep -P -n -w '[A-Z]+_SUSPEND'`	536	0.273s (3.34x)
The Silver Searcher	`ag -w '[A-Z]+_SUSPEND'`	534	0.443s (5.43x)
ugrep	`ugrep -r --ignore-files --no-hidden -I -w '[A-Z]+_SUSPEND'`	536	0.639s (7.82x)
git grep	`LC_ALL=C git grep -E -n -w '[A-Z]+_SUSPEND'`	536	0.727s (8.91x)
git grep (Unicode)	`LC_ALL=en_US.UTF-8 git grep -E -n -w '[A-Z]+_SUSPEND'`	536	2.670s (32.70x)
ack	`ack -w '[A-Z]+_SUSPEND'`	2677	2.935s (35.94x)
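
The -w flag in the commands above requires every match to be a whole word. As a rough illustration of that idea (an approximation in Python, not ripgrep's implementation), the pattern can be wrapped in \b word-boundary assertions. The function name and the sample lines are made up for this sketch.

```python
import re

def word_matches(pattern, line):
    """Return the matches of `pattern` in `line` that sit on word boundaries."""
    # Approximation of whole-word matching: wrap the pattern in \b assertions.
    wrapped = re.compile(r"\b(?:" + pattern + r")\b")
    return [m.group(0) for m in wrapped.finditer(line)]

# Hypothetical sample lines, not taken from the Linux kernel corpus.
print(word_matches(r"[A-Z]+_SUSPEND", "case PM_SUSPEND:"))
print(word_matches(r"[A-Z]+_SUSPEND", "flag = PM_SUSPEND_ON;"))
```

The second line yields no match because PM_SUSPEND is followed by another word character, so it is only part of a longer word.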

Here's another benchmark on the same corpus as above that disregards gitignore files and searches with a whitelist instead. The flags passed to each command ensure that they are doing equivalent work:

Tool	Command	Line count	Time
ripgrep	`rg -uuu -tc -n -w '[A-Z]+_SUSPEND'`	447	0.063s (1.00x)
ugrep	`ugrep -r -n --include='*.c' --include='*.h' -w '[A-Z]+_SUSPEND'`	447	0.607s (9.62x)
GNU grep	`grep -E -r -n --include='*.c' --include='*.h' -w '[A-Z]+_SUSPEND'`	447	0.674s (10.69x)

Now we'll move to searching a single large file. Here is a straight-up comparison between ripgrep, ugrep and GNU grep on a file cached in memory (~13GB, OpenSubtitles.raw.en.gz, decompressed):

Tool	Command	Line count	Time
ripgrep (Unicode)	`rg -w 'Sherlock [A-Z]\w+'`	7882	1.042s (1.00x)
ugrep	`ugrep -w 'Sherlock [A-Z]\w+'`	7882	1.339s (1.28x)
GNU grep (Unicode)	`LC_ALL=en_US.UTF-8 egrep -w 'Sherlock [A-Z]\w+'`	7882	6.577s (6.31x)

In the above benchmark, passing the -n flag (for showing line numbers) increases the times to 1.664s for ripgrep and 9.484s for GNU grep. ugrep times are unaffected by the presence or absence of -n.

Beware of performance cliffs though:

Tool	Command	Line count	Time
ripgrep (Unicode)	`rg -w '[A-Z]\w+ Sherlock [A-Z]\w+'`	485	1.053s (1.00x)
GNU grep (Unicode)	`LC_ALL=en_US.UTF-8 grep -E -w '[A-Z]\w+ Sherlock [A-Z]\w+'`	485	6.234s (5.92x)
ugrep	`ugrep -w '[A-Z]\w+ Sherlock [A-Z]\w+'`	485	28.973s (27.51x)

And performance can drop precipitously across the board when searching big files for patterns without any opportunities for literal optimizations:

Tool	Command	Line count	Time
ripgrep	`rg '[A-Za-z]{30}'`	6749	15.569s (1.00x)
ugrep	`ugrep -E '[A-Za-z]{30}'`	6749	21.857s (1.40x)
GNU grep	`LC_ALL=C grep -E '[A-Za-z]{30}'`	6749	32.409s (2.08x)
GNU grep (Unicode)	`LC_ALL=en_US.UTF-8 grep -E '[A-Za-z]{30}'`	6795	8m30s (32.74x)

Finally, high match counts also tend to both tank performance and smooth out the differences between tools (because performance is dominated by how quickly one can handle a match and not the algorithm used to detect the match, generally speaking):

Tool	Command	Line count	Time
ripgrep	`rg the`	83499915	6.948s (1.00x)
ugrep	`ugrep the`	83499915	11.721s (1.69x)
GNU grep	`LC_ALL=C grep the`	83499915	15.217s (2.19x)

Why should I use ripgrep?

It can replace many use cases served by other search tools because it contains most of their features and is generally faster. (See the FAQ for more details on whether ripgrep can truly replace grep.)
Like other tools specialized to code search, ripgrep defaults to recursive search and does automatic filtering. Namely, ripgrep won't search files ignored by your .gitignore/.ignore/.rgignore files, it won't search hidden files and it won't search binary files. Automatic filtering can be disabled with rg -uuu.
ripgrep can search specific types of files. For example, rg -tpy foo limits your search to Python files and rg -Tjs foo excludes JavaScript files from your search. ripgrep can be taught about new file types with custom matching rules.
ripgrep supports many features found in grep, such as showing the context of search results, searching multiple patterns, highlighting matches with color and full Unicode support. Unlike GNU grep, ripgrep stays fast while supporting Unicode (which is always on).
ripgrep has optional support for switching its regex engine to use PCRE2. Among other things, this makes it possible to use look-around and backreferences in your patterns, which are not supported in ripgrep's default regex engine. PCRE2 support can be enabled with -P/--pcre2 (use PCRE2 always) or --auto-hybrid-regex (use PCRE2 only if needed). An alternative syntax is provided via the --engine (default|pcre2|auto-hybrid) option.
ripgrep has rudimentary support for replacements, which permit rewriting output based on what was matched.
ripgrep supports searching files in text encodings other than UTF-8, such as UTF-16, latin-1, GBK, EUC-JP, Shift_JIS and more. (Some support for automatically detecting UTF-16 is provided. Other text encodings must be specifically specified with the -E/--encoding flag.)
ripgrep supports searching files compressed in a common format (brotli, bzip2, gzip, lz4, lzma, xz, or zstandard) with the -z/--search-zip flag.
ripgrep supports arbitrary input preprocessing filters which could be PDF text extraction, less supported decompression, decrypting, automatic encoding detection and so on.
ripgrep can be configured via a configuration file.

In other words, use ripgrep if you like speed, filtering by default, fewer bugs and Unicode support.

Why shouldn't I use ripgrep?

Despite initially not wanting to add every feature under the sun to ripgrep, over time, ripgrep has grown support for most features found in other file searching tools. This includes searching for results spanning across multiple lines, and opt-in support for PCRE2, which provides look-around and backreference support.

At this point, the primary reasons not to use ripgrep probably consist of one or more of the following:

You need a portable and ubiquitous tool. While ripgrep works on Windows, macOS and Linux, it is not ubiquitous and it does not conform to any standard such as POSIX. The best tool for this job is good old grep.
There still exists some other feature (or bug) not listed in this README that you rely on that's in another tool that isn't in ripgrep.
There is a performance edge case where ripgrep doesn't do well where another tool does do well. (Please file a bug report!)
ripgrep isn't possible to install on your machine or isn't available for your platform. (Please file a bug report!)

Is it really faster than everything else?

Generally, yes. A large number of benchmarks with detailed analysis for each is available on my blog.

Summarizing, ripgrep is fast because:

It is built on top of Rust's regex engine. Rust's regex engine uses finite automata, SIMD and aggressive literal optimizations to make searching very fast. (PCRE2 support can be opted into with the -P/--pcre2 flag.)
Rust's regex library maintains performance with full Unicode support by building UTF-8 decoding directly into its deterministic finite automaton engine.
It supports searching with either memory maps or by searching incrementally with an intermediate buffer. The former is better for single files and the latter is better for large directories. ripgrep chooses the best searching strategy for you automatically.
It applies your ignore patterns in .gitignore files using a RegexSet. That means a single file path can be matched against multiple glob patterns simultaneously.
It uses a lock-free parallel recursive directory iterator, courtesy of crossbeam and ignore.
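
To give a feel for what a literal optimization buys, here is a deliberately simplified Python sketch. It assumes the caller already knows a literal substring that every match must contain (a real regex engine extracts such literals from the pattern itself). The function name and the sample data are hypothetical.

```python
import re

def search_with_literal_prefilter(lines, pattern, required_literal):
    """Return (line_number, line) pairs matching `pattern`.

    `required_literal` is assumed to be a substring that every match of
    `pattern` must contain, so lines without it can be skipped cheaply.
    """
    regex = re.compile(pattern)
    hits = []
    for number, line in enumerate(lines, start=1):
        # Fast path: a plain substring test rejects most lines
        # before the regex engine ever runs.
        if required_literal not in line:
            continue
        if regex.search(line):
            hits.append((number, line))
    return hits

# Hypothetical input lines.
sample = ["int PM_SUSPEND = 1;", "nothing to see here", "_SUSPEND alone"]
print(search_with_literal_prefilter(sample, r"[A-Z]+_SUSPEND", "_SUSPEND"))
```

A pattern such as [A-Za-z]{30} has no required literal at all, which is why no such shortcut is available for it.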

Feature comparison

Andy Lester, author of ack, has published an excellent table comparing the features of ack, ag, git-grep, GNU grep and ripgrep: https://beyondgrep.com/feature-comparison/

Note that ripgrep has grown a few significant new features recently that are not yet present in Andy's table. This includes, but is not limited to, configuration files, passthru, support for searching compressed files, multiline search and opt-in fancy regex support via PCRE2.

Playground

If you'd like to try ripgrep before installing, there's an unofficial playground and an interactive tutorial.

If you have any questions about these, please open an issue in the tutorial repo.

Installation

The binary name for ripgrep is rg.

Archives of precompiled binaries for ripgrep are available for Windows, macOS and Linux. Linux and Windows binaries are static executables. Users of platforms not explicitly mentioned below are advised to download one of these archives.

If you're a macOS Homebrew or a Linuxbrew user, then you can install ripgrep from homebrew-core:

$ brew install ripgrep

If you're a MacPorts user, then you can install ripgrep from the official ports:

$ sudo port install ripgrep

If you're a Windows Chocolatey user, then you can install ripgrep from the official repo:

$ choco install ripgrep

If you're a Windows Scoop user, then you can install ripgrep from the official bucket:

$ scoop install ripgrep

If you're a Windows Winget user, then you can install ripgrep from the winget-pkgs repository:

$ winget install BurntSushi.ripgrep.MSVC

If you're an Arch Linux user, then you can install ripgrep from the official repos:

$ sudo pacman -S ripgrep

If you're a Gentoo user, you can install ripgrep from the official repo:

$ sudo emerge sys-apps/ripgrep

If you're a Fedora user, you can install ripgrep from official repositories.

$ sudo dnf install ripgrep

If you're an openSUSE user, ripgrep is included in openSUSE Tumbleweed and openSUSE Leap since 15.1.

$ sudo zypper install ripgrep

If you're a RHEL/CentOS 7/8 user, you can install ripgrep from copr:

$ sudo yum install -y yum-utils
$ sudo yum-config-manager --add-repo=https://copr.fedorainfracloud.org/coprs/carlwgeorge/ripgrep/repo/epel-7/carlwgeorge-ripgrep-epel-7.repo
$ sudo yum install ripgrep

If you're a Nix user, you can install ripgrep from nixpkgs:

$ nix-env --install ripgrep

If you're a Guix user, you can install ripgrep from the official package collection:

$ guix install ripgrep

If you're a Debian user (or a user of a Debian derivative like Ubuntu), then ripgrep can be installed using a binary .deb file provided in each ripgrep release.

$ curl -LO https://github.com/BurntSushi/ripgrep/releases/download/13.0.0/ripgrep_13.0.0_amd64.deb
$ sudo dpkg -i ripgrep_13.0.0_amd64.deb

If you run Debian stable, ripgrep is officially maintained by Debian, although its version may be older than the deb package available in the previous step.

$ sudo apt-get install ripgrep

If you're an Ubuntu Cosmic (18.10) (or newer) user, ripgrep is available using the same packaging as Debian:

$ sudo apt-get install ripgrep

(N.B. Various snaps for ripgrep on Ubuntu are also available, but none of them seem to work right and generate a number of very strange bug reports that I don't know how to fix and don't have the time to fix. Therefore, it is no longer a recommended installation option.)

If you're an ALT user, you can install ripgrep from the official repo:

$ sudo apt-get install ripgrep

If you're a FreeBSD user, then you can install ripgrep from the official ports:

$ sudo pkg install ripgrep

If you're an OpenBSD user, then you can install ripgrep from the official ports:

$ doas pkg_add ripgrep

If you're a NetBSD user, then you can install ripgrep from pkgsrc:

$ sudo pkgin install ripgrep

If you're a Haiku x86_64 user, then you can install ripgrep from the official ports:

$ sudo pkgman install ripgrep

If you're a Haiku x86_gcc2 user, then you can install ripgrep from the same port as Haiku x86_64 using the x86 secondary architecture build:

$ sudo pkgman install ripgrep_x86

If you're a Void Linux user, then you can install ripgrep from the official repository:

$ sudo xbps-install -Syv ripgrep

If you're a Rust programmer, ripgrep can be installed with cargo.

Note that the minimum supported version of Rust for ripgrep is 1.72.0, although ripgrep may work with older versions.
Note that the binary may be bigger than expected because it contains debug symbols. This is intentional. To remove debug symbols and therefore reduce the file size, run strip on the binary.

$ cargo install ripgrep

Alternatively, one can use cargo binstall to install a ripgrep binary directly from GitHub:

$ cargo binstall ripgrep

Building

ripgrep is written in Rust, so you'll need to grab a Rust installation in order to compile it. ripgrep compiles with Rust 1.72.0 (stable) or newer. In general, ripgrep tracks the latest stable release of the Rust compiler.

To build ripgrep:

$ git clone https://github.com/BurntSushi/ripgrep
$ cd ripgrep
$ cargo build --release
$ ./target/release/rg --version
0.1.3

NOTE: In the past, ripgrep supported a simd-accel Cargo feature when using a Rust nightly compiler. This only benefited UTF-16 transcoding. Since it required unstable features, this build mode was prone to breakage. Because of that, support for it has been removed. If you want SIMD optimizations for UTF-16 transcoding, then you'll have to petition the encoding_rs project to use stable APIs.

Finally, optional PCRE2 support can be built with ripgrep by enabling the pcre2 feature:

$ cargo build --release --features 'pcre2'

Enabling the PCRE2 feature works with a stable Rust compiler and will attempt to automatically find and link with your system's PCRE2 library via pkg-config. If one doesn't exist, then ripgrep will build PCRE2 from source using your system's C compiler and then statically link it into the final executable. Static linking can be forced even when there is an available PCRE2 system library by either building ripgrep with the MUSL target or by setting PCRE2_SYS_STATIC=1.

ripgrep can be built with the MUSL target on Linux by first installing the MUSL library on your system (consult your friendly neighborhood package manager). Then you just need to add MUSL support to your Rust toolchain and rebuild ripgrep, which yields a fully static executable:

$ rustup target add x86_64-unknown-linux-musl
$ cargo build --release --target x86_64-unknown-linux-musl

Applying the --features flag from above works as expected. If you want to build a static executable with MUSL and with PCRE2, then you will need to have musl-gcc installed, which might be in a separate package from the actual MUSL library, depending on your Linux distribution.

Running tests

ripgrep is relatively well-tested, including both unit tests and integration tests. To run the full test suite, use:

$ cargo test --all

from the repository root.

Related tools

delta is a syntax highlighting pager that supports the rg --json output format. So all you need to do to make it work is rg --json pattern | delta. See delta's manual section on grep for more details.

Vulnerability reporting

For reporting a security vulnerability, please contact Andrew Gallant. The contact page has my email address and PGP public key if you wish to send an encrypted message.

Translations

The following is a list of known translations of ripgrep's documentation. These are unofficially maintained and may not be up to date.

ripgrep's Issues

switch default thread count to `cpus - 1` instead of `cpus`

It makes sense to not use one thread more than the number of CPUs you have. Namely, -j launches N workers for searching, while the main thread does directory traversal.

Ideally, cpus would be the number of physical cores (i.e., not counting hyper threading).
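
A minimal sketch of the proposed default, written in Python for illustration only. The function name is hypothetical, and os.cpu_count() reports logical CPUs, so this does not address the physical-core point above.

```python
import os

def default_search_threads(cpu_count=None):
    """Pick `cpus - 1` search workers, leaving one CPU for directory traversal."""
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    # Never go below one worker, even on a single-CPU machine.
    return max(1, cpus - 1)

print(default_search_threads(8))
```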

Can't build

(This might be my stupidity at rust).

I tried downloading from github on Mac OS X, and then running cargo install:

~/p/r/ripgrep ❯❯❯ cargo install --verbose                                            ⏎
       Fresh rustc-serialize v0.3.19
       Fresh regex-syntax v0.3.5
       Fresh winapi-build v0.1.1
       Fresh lazy_static v0.2.1
       Fresh log v0.3.6
   Compiling strsim v0.5.1
     Running `rustc /Users/caj/.cargo/registry/src/github.com-1ecc6299db9ec823/strsim-0.5.1/src/lib.rs --crate-name strsim --crate-type lib -C opt-level=3 -C panic=abort -g -C metadata=b42a694875d9a3b0 -C extra-filename=-b42a694875d9a3b0 --out-dir /Users/caj/progs/rust/ripgrep/target/release/deps --emit=dep-info,link -L dependency=/Users/caj/progs/rust/ripgrep/target/release/deps -L dependency=/Users/caj/progs/rust/ripgrep/target/release/deps --cap-lints allow`
   Compiling kernel32-sys v0.2.2
     Running `rustc /Users/caj/.cargo/registry/src/github.com-1ecc6299db9ec823/kernel32-sys-0.2.2/build.rs --crate-name build_script_build --crate-type bin -g --out-dir /Users/caj/progs/rust/ripgrep/target/release/build/kernel32-sys-d6afa5bd3d7cfaef --emit=dep-info,link -L dependency=/Users/caj/progs/rust/ripgrep/target/release/deps -L dependency=/Users/caj/progs/rust/ripgrep/target/release/deps --extern build=/Users/caj/progs/rust/ripgrep/target/release/deps/libbuild-493a7b0628804707.rlib --cap-lints allow`
   Compiling utf8-ranges v0.1.3
     Running `rustc /Users/caj/.cargo/registry/src/github.com-1ecc6299db9ec823/utf8-ranges-0.1.3/src/lib.rs --crate-name utf8_ranges --crate-type lib -C opt-level=3 -C panic=abort -g -C metadata=5c6a6dacba3be7ce -C extra-filename=-5c6a6dacba3be7ce --out-dir /Users/caj/progs/rust/ripgrep/target/release/deps --emit=dep-info,link -L dependency=/Users/caj/progs/rust/ripgrep/target/release/deps -L dependency=/Users/caj/progs/rust/ripgrep/target/release/deps --cap-lints allow`
       Fresh winapi v0.2.8
       Fresh fnv v1.0.5
   Compiling libc v0.2.16
     Running `rustc /Users/caj/.cargo/registry/src/github.com-1ecc6299db9ec823/libc-0.2.16/src/lib.rs --crate-name libc --crate-type lib -C opt-level=3 -C panic=abort -g --cfg feature=\"default\" --cfg feature=\"use_std\" -C metadata=1417726cb94dbc83 -C extra-filename=-1417726cb94dbc83 --out-dir /Users/caj/progs/rust/ripgrep/target/release/deps --emit=dep-info,link -L dependency=/Users/caj/progs/rust/ripgrep/target/release/deps -L dependency=/Users/caj/progs/rust/ripgrep/target/release/deps --cap-lints allow`
error: the crate `build` is compiled with the panic strategy `abort` which is incompatible with this crate's strategy of `unwind`
error: aborting due to previous error
Build failed, waiting for other jobs to finish...
error: failed to compile `ripgrep v0.1.16 (file:///Users/caj/progs/rust/ripgrep)`, intermediate artifacts can be found at `/Users/caj/progs/rust/ripgrep/target`

Caused by:
  Could not compile `kernel32-sys`.

Caused by:
  Process didn't exit successfully: `rustc /Users/caj/.cargo/registry/src/github.com-1ecc6299db9ec823/kernel32-sys-0.2.2/build.rs --crate-name build_script_build --crate-type bin -g --out-dir /Users/caj/progs/rust/ripgrep/target/release/build/kernel32-sys-d6afa5bd3d7cfaef --emit=dep-info,link -L dependency=/Users/caj/progs/rust/ripgrep/target/release/deps -L dependency=/Users/caj/progs/rust/ripgrep/target/release/deps --extern build=/Users/caj/progs/rust/ripgrep/target/release/deps/libbuild-493a7b0628804707.rlib --cap-lints allow` (exit code: 101)

File name search

I have noticed that the rg -g option is like ag's -G option. But rg seems to be missing ag's -g option.

I use ag -g heavily as find command as I believe it is faster than find and the .agignore applies there too.. so no need to provide a complex ! $ -regex -o ..$ command to find.

If I want to get a list of all *.sv files, I would do ag -g '\.sv'.

Can that option please be added to rg?

It would be even more awesome if the meanings of -g and -G synced up between the two.

don't use memory maps ever on mac

See this thread for justification: https://news.ycombinator.com/item?id=12567326

We already do custom memory map logic for Windows, so it should be no problem to do it for Mac too.

add context handling to memory map searching

The memory map searcher supports every option except for printing contexts. Specifically, if any of -A, -B or -C are provided, then memory map searching can't be used.

Incidentally, handling contexts in the memory map searcher should be much easier than in the streaming searcher.
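
To illustrate why context is simpler when the whole file is addressable at once, here is a small Python sketch, assuming the file contents are already available as one string. It is not ripgrep's searcher, and the function and parameter names are made up.

```python
import re

def search_with_context(text, pattern, before=0, after=0):
    """Return (line_number, line) pairs for matches plus surrounding context."""
    regex = re.compile(pattern)
    lines = text.splitlines()
    selected = set()
    for index, line in enumerate(lines):
        if regex.search(line):
            # With every line in memory, context is just an index range;
            # no look-behind buffer is needed as in a streaming search.
            start = max(0, index - before)
            end = min(len(lines), index + after + 1)
            selected.update(range(start, end))
    return [(i + 1, lines[i]) for i in sorted(selected)]

print(search_with_context("a\nb\nmatch\nc\nd\n", "match", before=1, after=1))
```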

deprecate .rgignore, switch to .ignore

As discussed between myself and @ggreer: https://news.ycombinator.com/item?id=12568822 Yay cooperation!

support global gitignore config

e.g., $HOME/.config/git/ignore or $XDG_CONFIG_HOME/git/ignore or whatever file is specified by the configuration variable core.excludesFile. Another source might be $GIT_DIR/info/exclude.
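
The lookup order could be sketched roughly as follows. This is a Python illustration under the assumption that an explicit core.excludesFile value wins, then $XDG_CONFIG_HOME, then $HOME; the function name is hypothetical, and reading the git configuration itself is left out.

```python
import posixpath

def global_gitignore_path(env, excludes_file=None):
    """Return the global ignore file path for the given environment mapping."""
    # An explicit core.excludesFile setting takes priority when present.
    if excludes_file:
        return excludes_file
    xdg = env.get("XDG_CONFIG_HOME")
    if xdg:
        return posixpath.join(xdg, "git", "ignore")
    home = env.get("HOME")
    if home:
        return posixpath.join(home, ".config", "git", "ignore")
    return None

print(global_gitignore_path({"HOME": "/home/user"}))
```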

.gitignore whitespace bug

If an ignored path in a .gitignore has whitespace afterwards

node_modules/   <--- whitespace

ripgrep will still search the node_modules/ folder. Removing the whitespace fixes this problem.
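
For reference, git ignores trailing spaces in a pattern unless they are escaped with a backslash. A simplified Python sketch of that trimming step might look like this; the function name is hypothetical, and corner cases such as an escaped backslash directly before a space are not handled.

```python
def trim_gitignore_pattern(line):
    """Drop the newline and any unescaped trailing spaces from a pattern line."""
    pattern = line.rstrip("\n")
    # Remove trailing spaces one at a time, stopping at a backslash-escaped space.
    while pattern.endswith(" ") and not pattern.endswith("\\ "):
        pattern = pattern[:-1]
    return pattern

print(repr(trim_gitignore_pattern("node_modules/   \n")))
```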

--vimgrep reports incorrect column number

The column reported is off-by-one. You can see this here:

https://github.com/BurntSushi/ripgrep/blob/master/tests/tests.rs#L577

The W in "Watsons" should be reported as column 16, because vim starts counting columns at 1.
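
The fix amounts to converting a 0-based match offset into a 1-based column. A tiny Python sketch follows; the sample line is an assumed stand-in for the text used in the linked test, and the function name is made up.

```python
import re

def vimgrep_column(line, pattern):
    """Return the 1-based column of the first match, or None if there is none."""
    match = re.search(pattern, line)
    if match is None:
        return None
    # match.start() is a 0-based offset; vim counts columns from 1.
    return match.start() + 1

# Assumed sample line for illustration.
print(vimgrep_column("For the Doctor Watsons of this world", "Watson"))
```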

add -l / --files-with-matches option?

I use the -l option to grep / ack / ag / etc quite a lot to list files containing matches for further processing in a shell pipeline. Does it seem reasonable as a thing to add to ripgrep?

add integration tests

It should be easy to write tests that:

Take a CLI invocation.
Take one or more files to search.
Asserts some property about the output.
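
One possible shape for such a harness, sketched in Python rather than as ripgrep's actual test code: write the given files into a temporary directory, run the command there, and hand back stdout for assertions. The names run_case and FAKE_SEARCH are hypothetical, and the stand-in command is a Python one-liner instead of a real rg invocation so that the sketch runs anywhere.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

def run_case(argv, files):
    """Run `argv` in a temp dir populated with `files` and return its stdout."""
    with tempfile.TemporaryDirectory() as tmp:
        for name, contents in files.items():
            path = Path(tmp, name)
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text(contents)
        result = subprocess.run(
            argv, cwd=tmp, capture_output=True, text=True, check=False
        )
        return result.stdout

# Stand-in for a search command (an assumption for illustration):
# prints the lines of sherlock.txt that contain "Sherlock".
FAKE_SEARCH = [
    sys.executable,
    "-c",
    "import sys\n"
    "for l in open('sherlock.txt'):\n"
    "    if 'Sherlock' in l: sys.stdout.write(l)",
]
```

A test case then reduces to one call plus an assertion on the returned text.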

git: explicitly added files are ignored

I'm not sure if this is in scope of ripgrep, but explicitly added files in git repositories are ignored.

The script

#!/bin/bash
mkdir testfolder
pushd testfolder
    git init
    echo "some text" > testfile
    echo "*" > .gitignore
    git add -f testfile .gitignore
    git commit -m "Initial commit"
    echo "ripgrep:"
    rg some
    echo "git grep:"
    git --no-pager grep some
popd

yields the following output:

Initialized empty Git repository in /full/path/testfolder/.git/
[master (root-commit) 350428e] Initial commit
 2 files changed, 2 insertions(+)
 create mode 100644 .gitignore
 create mode 100644 testfile
ripgrep:
No files were searched, which means ripgrep probably applied a filter you didn't expect. Try running again with --debug.
git grep:
testfile:some text

performance bug: ripgrep does poorly when given lots of directory arguments

ripgrep is doing some non-trivial work for every file searched in positional parameters, and there's really no reason it should.

`--files` output includes './'

Hey, I'm playing around with rg as a replacement for ag in my fuzzy file finder and I noticed the output of rg --files includes a './' in the output, ex:

./src/file.c
./src/file2.c

In comparison to ag -g '' or ack -g '' which looks like this:

src/file.c
src/file2.c

Would you consider adding an option for this or changing it to match ack/ag? I'm aware I can use sed, but it would be nice if it was built-in.

Thanks

grep's great --file option not implemented

 -f file, --file=file
    Read one or more newline separated patterns from file.  Empty pattern
    lines match every input line.  Newlines are not considered part of a
    pattern.  If file is empty, nothing is matched.

🌞 note that neither ag nor ack have this, so it would be another differentiator!
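
The semantics quoted from the man page can be sketched in a few lines of Python. This is an illustration only: it uses Python's regex syntax rather than grep's, and the function names are hypothetical.

```python
import re

def load_patterns(file_text):
    """Compile one pattern per line; an empty file yields no patterns."""
    # splitlines() drops the newlines, so they never become part of a pattern.
    return [re.compile(p) for p in file_text.splitlines()]

def line_matches(patterns, line):
    """A line matches if any pattern matches; with no patterns, nothing matches."""
    return any(p.search(line) for p in patterns)

patterns = load_patterns("foo\nbar\n")
print(line_matches(patterns, "a bar b"), line_matches(patterns, "baz"))
```

An empty pattern line compiles to the empty regex, which matches every input line, while an empty file produces an empty pattern list and therefore matches nothing.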

Option to consider only parent .rgignore files?

I have a ~/.gitignore and that is considered as an ignore file for everything in my $HOME. In my use case the contents of ~/.gitignore do not apply as ignore criteria for searches! I have ~/.gitignore set up just to prevent committing sensitive text files to git (and then push) by mistake. So that file contains something like

a*
!a_ok_to_commit

That would prevent search of all a* files everywhere in my $HOME.

So it would be great to allow parent .rgignore (only) file consideration.

In addition a .gitignore specific -u would also be needed.

In summary, there is very little overlap of the contents of .gitignore and search ignore files like .agignore and .rgignore.

For .gitignore:

Never use that for search ignores
Do not use parent .gitignore

For .rgignore:

Use that for search ignores
Use parent .rgignore files

From what I understand, right now -u and --no-ignore-parent apply to both of these files and you cannot set those options separately for the two files.

`rg -g foobar` ?

I'm used to this from ag as a more convenient find. Could that be supported? Thanks

Surprising application of .gitignore in rust-lang/rust repo

This may be working as designed but it was quite surprising. If I run rg LLVM_BINDINGS from the top level of the rust repo I get results. If I do the same from src/ I get no results. The reason is that rust's .gitignore file contains /llvm/.

Perhaps this is how .gitignore is defined to work, but it's not what I expected. I might expect that .gitignore would be applied relative to the directory in which it is defined.
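
The difference can be made concrete with a small Python sketch that handles only a pattern of the form /name/. The function name and the /repo paths are hypothetical; the point is merely which directory the pattern is resolved against.

```python
import posixpath

def anchored_dir_ignores(pattern, base_dir, file_path):
    """Check a pattern like '/llvm/' against `file_path`, relative to `base_dir`."""
    name = pattern.strip("/")
    relative = posixpath.relpath(file_path, base_dir)
    components = relative.split("/")
    # The leading slash anchors the pattern to the first path component.
    # The trailing slash means that component must be a directory, so the
    # file name itself (the last component) does not count.
    return len(components) > 1 and components[0] == name

target = "/repo/src/llvm/lib.c"
# Resolved against the directory that holds the .gitignore: not ignored.
print(anchored_dir_ignores("/llvm/", "/repo", target))
# Resolved against the directory the search started in: ignored.
print(anchored_dir_ignores("/llvm/", "/repo/src", target))
```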

^O characters in coloured output

Viewing with --color always piped to less -R, colours appear correctly but there are ^O characters (\x0F) inserted regularly, which appear in less as inverted ^O. When removed with sed 's/\x0f//g' output appears like screenshot.

Tested release binary version 0.1.16 on Debian Linux, with TERM=screen.linux.

trailing recursive globs shouldn't ignore the directory itself, only its contents

I have a .gitignore file that contains something like this:

vendor/**
!vendor/manifest

I'd expect vendor/manifest to be searched, but nothing else in the vendor directory, but instead it seems like vendor/manifest is ignored too. If I remove the vendor/** line, the search does what I expect it to.
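
The expected behaviour follows from the gitignore rule that the last matching pattern decides. Here is a rough Python sketch of that rule; it uses fnmatch as a stand-in for real gitignore globbing (which differs in details), and the function name and sample paths are made up.

```python
from fnmatch import fnmatchcase

def is_ignored(path, rules):
    """Apply ignore rules in order; the last rule that matches wins."""
    ignored = False
    for rule in rules:
        negated = rule.startswith("!")
        glob = rule[1:] if negated else rule
        if fnmatchcase(path, glob):
            # A '!' rule re-includes a path that an earlier rule ignored.
            ignored = not negated
    return ignored

rules = ["vendor/**", "!vendor/manifest"]
for candidate in ["vendor/manifest", "vendor/lib/a.go", "vendor"]:
    print(candidate, is_ignored(candidate, rules))
```

Under this reading the vendor directory itself is not ignored by vendor/**, only its contents, which is what allows vendor/manifest to be reached at all.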

Needs to have command line option opposite of --with-filename

There needs to be a way to disable the file name prefix even when multiple files are being searched.

bad "literal not allowed" error message

Example:

$ rg '\n'
Literal '
' not allowed.

It should probably look like this instead:

$ rg '\n'
Literal '\n' not allowed.
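
The desired message just needs the literal escaped before it is formatted. A Python sketch of the idea (the function name is hypothetical and this is not ripgrep's error type):

```python
def literal_not_allowed_message(literal):
    """Format the error with control characters shown as escape sequences."""
    # unicode_escape turns a real newline into the two characters '\' and 'n'.
    escaped = literal.encode("unicode_escape").decode("ascii")
    return f"Literal '{escaped}' not allowed."

print(literal_not_allowed_message("\n"))
```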

add support for other text encodings

Right now, ripgrep only supports reading UTF-8 encoded text (even if some of it is invalid). In my estimation, this is probably good enough for the vast majority of use cases.

However, it may be useful to search other encodings. I don't think I'd be willing to, say, modify the regex engine itself to support other encodings, but if it were easy to do transcoding on the fly, then I think it wouldn't add too much complexity. The encoding_rs project in particular appears to support this type of text decoding.

Some thoughts/concerns:

Transcoding would require using the incremental searcher as opposed to the memory map searcher. (Which is fine.)
Transcoding requires picking a source encoding, and doing this seems non-trivial. You might imagine the CLI growing a new flag that specifies a text encoding, but what happens if you want to search directories containing files with many different types of text encodings? Do we need a more elaborate system? I'm inclined towards not, since I think the juice probably isn't worth the squeeze.
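
As a rough picture of transcoding on the fly with an explicitly chosen source encoding, here is a Python sketch. It uses io.TextIOWrapper for incremental decoding in place of encoding_rs; the function name and sample data are hypothetical.

```python
import io
import re

def search_transcoded(raw_stream, pattern, encoding):
    """Decode `raw_stream` incrementally and return matching (number, line) pairs."""
    regex = re.compile(pattern)
    # The wrapper decodes buffered chunks as they are read, so the
    # whole input is never transcoded up front.
    text_stream = io.TextIOWrapper(raw_stream, encoding=encoding)
    hits = []
    for number, line in enumerate(text_stream, start=1):
        line = line.rstrip("\n")
        if regex.search(line):
            hits.append((number, line))
    return hits

raw = io.BytesIO("Sherlock Holmes\nDr. Watson\n".encode("utf-16"))
print(search_transcoded(raw, "Sherlock", "utf-16"))
```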

add option for structured output

Consider adding a flag to output structured data such as json or csv for use by other tools.

rg --color None --no-heading is nearly good enough, but the extent of the match is missing.
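
For illustration, one JSON object per match that includes the match extent might look like the output of the Python sketch below. The field names are a hypothetical schema invented for this example, not a format ripgrep defines.

```python
import json
import re

def matches_as_json(path, text, pattern):
    """Return one JSON string per match, including the match's start and end."""
    regex = re.compile(pattern)
    records = []
    for number, line in enumerate(text.splitlines(), start=1):
        for m in regex.finditer(line):
            record = {
                "path": path,
                "line_number": number,
                "start": m.start(),  # 0-based offset within the line
                "end": m.end(),      # exclusive end offset
                "line": line,
            }
            records.append(json.dumps(record))
    return records

for record in matches_as_json("a.txt", "foo bar\nbaz\n", "bar"):
    print(record)
```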

Hangs when searching recursively under Cygwin

This is rather mysterious but the first thing I did after installing the Windows binary (ripgrep-0.1.17-x86_64-pc-windows-msvc.zip) was to run rg fun from Cygwin shell inside a relatively big repository and it just hung without outputting anything. Trying to search inside a single file works fine, searching in a single leaf subdirectory does too, but searching under a subdirectory containing other subdirectories outputs a number of matches and then hangs on exit. To emphasize, this only happens in a Cygwin shell, the tool doesn't hang when run from the standard Windows DOS window.

Looking at it in process explorer I see that the thread is completely stuck (0% CPU use) and the stack is

ntoskrnl.exe!KiSwapContext+0x7a
ntoskrnl.exe!KiCommitThreadWait+0x1d2
ntoskrnl.exe!KeWaitForSingleObject+0x19f
ntoskrnl.exe!KiSuspendThread+0x54
ntoskrnl.exe!KiDeliverApc+0x21d
ntoskrnl.exe!KiCommitThreadWait+0x3dd
ntoskrnl.exe!KeWaitForSingleObject+0x19f
ntoskrnl.exe!NtReadFile+0x8ae
ntoskrnl.exe!KiSystemServiceCopyEnd+0x13
ntdll.dll!NtReadFile+0xa
KERNELBASE.dll!ReadFile+0x76
kernel32.dll!ReadFileImplementation+0x55
rg.exe+0x124f1f

which doesn't make much sense to me, but maybe it can be useful to you if you have the map file and can see what the 0x124f1f offset corresponds to.

I could try debugging this further later, but unfortunately I really don't have time for this now, I just wanted to quickly check a promising new tool...

Ignoring of subdirectories is inconsistent with git

I have an entry target/ in .gitignore to ignore all target directories in the project. ag handles it fine and ignores all matching directories. rg only ignores the matching top level directory. If I change the pattern to **/target/ then rg works as expected, but ag stops ignoring all target directories. In both cases git ignores both the top level directory and the subdirectories.

Here's a shell session to illustrate the issue:

/p/tmp> rg --version
0.1.16
/p/tmp> git init abc
Initialized empty Git repository in /private/tmp/abc/.git/
/p/tmp> cd abc/
/p/t/abc> mkdir ghi
/p/t/abc> mkdir -p def/ghi
/p/t/abc> echo ghi/ > .gitignore
/p/t/abc> echo xyz > ghi/toplevel.txt
/p/t/abc> echo xyz > def/ghi/subdir.txt
/p/t/abc> ag xyz
/p/t/abc> rg xyz
def/ghi/subdir.txt
1:xyz
/p/t/abc> git status
On branch master

Initial commit

Untracked files:
  (use "git add <file>..." to include in what will be committed)

        .gitignore

nothing added to commit but untracked files present (use "git add" to track)
/p/t/abc> echo '**/ghi/' > .gitignore
/p/t/abc> ag xyz
def/ghi/subdir.txt
1:xyz

ghi/toplevel.txt
1:xyz
/p/t/abc> rg xyz
No files were searched, which means ripgrep probably applied a filter you didn't expect. Try running again with --debug.
/p/t/abc> git status
On branch master

Initial commit

Untracked files:
  (use "git add <file>..." to include in what will be committed)

        .gitignore

nothing added to commit but untracked files present (use "git add" to track)

DEBUG:rg::ignore: ./ghi ignored by Pattern { from: "./.gitignore", original: "ghi/", pat: "ghi", whitelist: false, only_dir: true }
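For reference, here is a rough sketch of the matching rule git appears to apply in the session above: a pattern with no slash other than a trailing one matches a name at any depth, while a pattern containing a slash is anchored to the directory holding the .gitignore. The function name is_ignored is made up for illustration, only literal names are handled (no wildcards), and this is not ripgrep's actual code.

```rust
/// Returns true if `path` (relative to the directory holding the ignore file)
/// is excluded by `pattern`. Hypothetical helper: literal names only.
fn is_ignored(pattern: &str, path: &str, is_dir: bool) -> bool {
    // A trailing slash means "match directories only".
    let only_dir = pattern.ends_with('/');
    let pat = pattern.trim_end_matches('/');
    // Any remaining slash anchors the pattern to the ignore file's directory.
    let anchored = pat.contains('/');
    let pat = pat.trim_start_matches('/');

    let components: Vec<&str> = path
        .split('/')
        .filter(|c| !c.is_empty() && *c != ".")
        .collect();

    // A path is ignored if it, or any of its ancestor directories, matches.
    for end in 1..=components.len() {
        let is_last = end == components.len();
        let candidate_is_dir = !is_last || is_dir;
        if only_dir && !candidate_is_dir {
            continue;
        }
        let matched = if anchored {
            components[..end].join("/") == pat
        } else {
            components[end - 1] == pat
        };
        if matched {
            return true;
        }
    }
    false
}

fn main() {
    for path in ["ghi/toplevel.txt", "def/ghi/subdir.txt"] {
        println!("{} ignored by 'ghi/': {}", path, is_ignored("ghi/", path, false));
    }
}
```

Under this rule both ghi/toplevel.txt and def/ghi/subdir.txt are ignored by ghi/, which matches what git status shows.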

.gitignore entry ignored depending on path parameter

Depending on the path parameter supplied to ripgrep on the command line, .gitignore patterns may not be applied.

Setup:

$ mkdir -p test/foo/bar
$ touch test/foo/bar/baz
$ echo 'foo/bar' > test/.gitignore

When run with test as path parameter the bar directory is not ignored.

$ rg --debug --files "" test
DEBUG:grep::search: regex ast:
Empty
DEBUG:rg::ignore: test/.gitignore ignored because it is hidden
test/foo/bar/baz

When run inside test it is correctly ignored:

$ cd test
$ rg --debug --files ""
DEBUG:grep::search: regex ast:
Empty
DEBUG:rg::ignore: ./.gitignore ignored because it is hidden
DEBUG:rg::ignore: ./foo/bar ignored by Pattern { from: "./.gitignore", original: "foo/bar", pat: "foo/bar", whitelist: false, only_dir: false }
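A possible way to make the two invocations agree is to match each candidate path relative to the directory that contains the .gitignore, not relative to the path given on the command line. The sketch below illustrates the idea with std::path only. The helper names are hypothetical, it handles only anchored literal patterns such as foo/bar, and it is not ripgrep's implementation.

```rust
use std::path::Path;

/// Rewrites `candidate` so that it is relative to the directory holding
/// `ignore_file`. Returns None if the candidate lies outside that directory.
fn relative_to_ignore_dir<'a>(ignore_file: &Path, candidate: &'a Path) -> Option<&'a Path> {
    let dir = ignore_file.parent()?;
    candidate.strip_prefix(dir).ok()
}

/// Hypothetical matcher for anchored literal patterns like "foo/bar".
fn is_ignored_by(ignore_file: &Path, pattern: &str, candidate: &Path) -> bool {
    match relative_to_ignore_dir(ignore_file, candidate) {
        // Path::starts_with compares whole components, not raw bytes.
        Some(rel) => rel.starts_with(pattern),
        None => false,
    }
}

fn main() {
    // Invoked as `rg ... test`: paths carry the "test/" prefix.
    let from_parent = is_ignored_by(
        Path::new("test/.gitignore"),
        "foo/bar",
        Path::new("test/foo/bar/baz"),
    );
    // Invoked inside test/: paths carry the "./" prefix.
    let from_inside = is_ignored_by(
        Path::new("./.gitignore"),
        "foo/bar",
        Path::new("./foo/bar/baz"),
    );
    println!("from parent: {}, from inside: {}", from_parent, from_inside);
}
```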

.rgignore glob?

I am trying to figure out how to ignore a particular dir name (that dir name could occur multiple times in the search path) using .rgignore.

Let's say I have the .rgignore as /proj/SOMEPRJ/.rgignore, and I have these directories

/proj/SOMEPRJ/abc/def/XYZ/
/proj/SOMEPRJ/ghi//XYZ/

What do I put in /proj/SOMEPRJ/.rgignore to ignore both the XYZ/ dirs?

For ag, I can simply put XYZ in the .agignore and be done with it. But the same does not work here.

*XYZ also does not work.

For now I need to put in the full paths to XYZ/ (both) in the .rgignore.

What is the right way?

rename the `-r` option

I just got very confused for quite a while, as my fingers tend to naturally type -r when I want a recursive search. The current -r does something that is (to me) really confusing (I'm not really sure what it would be useful for, but I'm sure it was added for a reason). However, it is (in my opinion) worth considering leaving -r unused, as it's such a common option people give to grep (then at least they get an error message).

Using --help with aliased rg

I have aliased rg to rg --follow.

After that rg --help gives this error:

Invalid arguments.

Usage: rg [options] -e PATTERN ... [<path> ...]
       rg [options] <pattern> [<path> ...]
       rg [options] --files [<path> ...]
       rg [options] --type-list
       rg --help
       rg --version

So I need to remember to use \rg --help when I need to see help.

The --help arg probably needs to ignore all other args provided to rg.
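One way this could work is to scan the raw arguments for --help before handing them to the real argument parser, so that flags injected by an alias cannot make parsing fail first. This is a minimal sketch with a made-up function name (wants_help). It assumes that -- ends option processing, and it is independent of whatever argument parser rg actually uses.

```rust
use std::env;

/// Returns true if `--help` appears among the options, i.e. before any `--`.
fn wants_help(args: &[String]) -> bool {
    args.iter()
        .take_while(|a| a.as_str() != "--")
        .any(|a| a.as_str() == "--help")
}

fn main() {
    let args: Vec<String> = env::args().skip(1).collect();
    if wants_help(&args) {
        // Print the usage text and stop before any other validation runs.
        println!("(usage text would go here)");
        return;
    }
    println!("would continue with normal argument parsing: {:?}", args);
}
```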

benchmark time on very small corpora

An end user reports that rg isn't as fast as ag on very small repositories. While it seems trivial, if this is because of startup time, then it's worth investigating and fixing.

Pipe causes panic

Running rg --help | echo causes a panic. Not that this is a particularly useful thing to do anyway. ;) This only happens for --help, works fine when searching.

$ rg --version
0.1.16
$ RUST_BACKTRACE=1 rg --help | echo

thread 'main' panicked at 'failed printing to stdout: Broken pipe (os error 32)', ../src/libstd/io/stdio.rs:617
stack backtrace:
   1:        0x1034311eb - std::sys::backtrace::tracing::imp::write::h46e546df6e4e4fe6
   2:        0x1034333ba - std::panicking::default_hook::_$u7b$$u7b$closure$u7d$$u7d$::h077deeda8b799591
   3:        0x103432fea - std::panicking::default_hook::heb8b6fd640571a4f
   4:        0x103425a68 - std::panicking::rust_panic_with_hook::hd7b83626099d3416
   5:        0x103433996 - std::panicking::begin_panic::h941ea76fc945d925
   6:        0x1034268d8 - std::panicking::begin_panic_fmt::h30280d4dd3f149f5
   7:        0x10342bdb2 - std::io::stdio::_print::h91aef6f665f00d62
   8:        0x103392ff2 - docopt::dopt::Error::exit::had75b1255cfb9a0a
   9:        0x10334003f - rg::main::h6a22bacbbfd7cdf6
  10:        0x103432bad - std::panicking::try::call::hca715a47aa047c49
  11:        0x1034339eb - __rust_maybe_catch_panic
  12:        0x1034329d1 - std::rt::lang_start::h162055cb2e4b9fe7
[1]    69118 abort      RUST_BACKTRACE=1 rg --help
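The backtrace points at std's print machinery, which panics when a write to stdout fails. A common way around this is to write through the io::Write trait and treat a broken pipe as a normal, quiet exit. The sketch below shows that pattern in isolation. print_help and ClosedPipe are illustrative names, and the help text is a placeholder, not rg's real output.

```rust
use std::io::{self, Write};

/// Writes `text` to `out`, treating a closed pipe as success so the caller
/// can exit quietly instead of panicking.
fn print_help<W: Write>(mut out: W, text: &str) -> io::Result<()> {
    match out.write_all(text.as_bytes()).and_then(|_| out.flush()) {
        Err(e) if e.kind() == io::ErrorKind::BrokenPipe => Ok(()),
        other => other,
    }
}

/// Stand-in writer that behaves like a pipe whose reader has gone away.
struct ClosedPipe;

impl Write for ClosedPipe {
    fn write(&mut self, _buf: &[u8]) -> io::Result<usize> {
        Err(io::Error::new(io::ErrorKind::BrokenPipe, "Broken pipe"))
    }
    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

fn main() {
    // Simulated broken pipe: no panic, just Ok(()).
    println!("closed pipe handled: {}", print_help(ClosedPipe, "help text\n").is_ok());

    // Real stdout: any other I/O error is reported on stderr.
    let stdout = io::stdout();
    if let Err(e) = print_help(stdout.lock(), "(help text placeholder)\n") {
        eprintln!("error writing help: {}", e);
        std::process::exit(1);
    }
}
```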

add to package repositories

It'd be nice to at least get it into Ubuntu and homebrew. Sadly, I think either one of those will be quite difficult since Rust isn't packaged in either one of them.

The lowest hanging fruit is the Arch User Repository (AUR).

Fedora is getting Rust packaged soon, so that may be plausible.

Sadly, we may need to live with binaries distributed here for the time being.

add support for reading patterns from a file

It would be cool to support grep's -f/--file option, and it should be relatively easy for us to do so. The implementation strategy I have in mind is to just join all of the patterns/literals using a | and hand it off to the regex engine.

It might be faster to hand build an Aho-Corasick automaton (using the aho-corasick crate) if we know we have a bunch of literals. In fact, this is almost assuredly more memory efficient if the number of literals being searched is very large. Unfortunately, this is a harder thing to add, since it would require plumbing an Aho-Corasick automaton through all of the searching code so that either a Regex or an AcAutomaton could be used. Doable, but not straightforward.
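The join-with-| strategy could look roughly like the sketch below. Each pattern is wrapped in a non-capturing group so that alternations inside one pattern do not leak into its neighbours, and literals are escaped first. join_patterns and escape_literal are hypothetical names. The escaping is a hand-rolled stand-in using only std, and the resulting string would still have to be handed to the regex engine.

```rust
/// Escapes regex metacharacters so `lit` matches only itself.
/// Hand-rolled stand-in for a regex library's escape routine.
fn escape_literal(lit: &str) -> String {
    let mut out = String::with_capacity(lit.len());
    for c in lit.chars() {
        if r"\.+*?()|[]{}^$#&-~".contains(c) {
            out.push('\\');
        }
        out.push(c);
    }
    out
}

/// Joins one pattern per line into a single alternation.
/// If `fixed_strings` is true, every line is treated as a literal.
fn join_patterns(file_contents: &str, fixed_strings: bool) -> String {
    file_contents
        .lines()
        .map(|p| {
            let body = if fixed_strings { escape_literal(p) } else { p.to_string() };
            // Group each pattern so `a|b` on one line stays self-contained.
            format!("(?:{})", body)
        })
        .collect::<Vec<_>>()
        .join("|")
}

fn main() {
    println!("{}", join_patterns("foo\nba+r\n", false));
    println!("{}", join_patterns("a.b\nc|d\n", true));
}
```

Note that an empty line becomes an empty group, which matches every line. That mirrors how grep treats an empty pattern, but whether it is the desired behavior here is an open design choice.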

Can't add a new type, keep getting "Invalid arguments".

Trying to add a new type results in the "Invalid arguments." error followed by the usage message.

I have tried the example from README (rg --type-add 'foo:*.foo,*.foobar') and some other patterns.

My environment:

Fedora 24 (x86_64).
Rust (tried both):
- rustc 1.13.0-nightly (4f9812a59 2016-09-21)
- rustc 1.11.0 (9b21dcd6a 2016-08-15)
ripgrep: 0.1.16

--vimgrep is not an option

Option to ignore git submodules

git grep by default does not search for files within a git submodule, even though they appear like normal directories.
rg on the other hand will by default recurse into these directories.

Since .gitignore is supported, it would be convenient to support how git searches when there are submodules. (Either by default and turn off with one of the -u flags, or provide some other flag besides explicitly black listing the submodule folders.)

--glob doesn't work with directories

For whatever reason, it looks like I made Overrides completely ignore directories. This comment suggests a transcription error:

        // File types don't apply to directories.
        if is_dir {
            return Match::None;
        }

Color customization

We should definitely try to support customizing colors in the output similar to how ag does it. Currently they support 3 color customizations, as per this issue:

--colors-match
--colors-path
--colors-line-number

Implementing the part where we set colors based on an argument is not particularly hard (just a matter of translating user input to term::Attr enums). Parsing input in the bash terminal format like ag does was not difficult to implement. However, this limits Windows usage because we don't have an easy translation between those colors.

How should we go about supporting these color customizations on Linux, Mac and Windows?

Add support for -N as an alias for -AN -BN?

It might be handy to have -3 mean -A3 -B3. Could you add support for this?
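One way to get this without touching the option parser would be to rewrite the argument list up front, expanding a bare -N into -AN -BN. The sketch below does that. expand_context_shorthand is a made-up name, and it assumes that -- ends option processing so a pattern like -3 can still be searched for after it.

```rust
/// Expands a bare `-N` (N being one or more digits) into `-AN -BN`.
/// Arguments after `--` are passed through untouched.
fn expand_context_shorthand(args: &[String]) -> Vec<String> {
    let mut out = Vec::with_capacity(args.len());
    let mut options_ended = false;
    for arg in args {
        if arg.as_str() == "--" {
            options_ended = true;
        }
        let digits = arg.strip_prefix('-').unwrap_or("");
        let is_shorthand = !options_ended
            && !digits.is_empty()
            && digits.bytes().all(|b| b.is_ascii_digit());
        if is_shorthand {
            out.push(format!("-A{}", digits));
            out.push(format!("-B{}", digits));
        } else {
            out.push(arg.clone());
        }
    }
    out
}

fn main() {
    let args: Vec<String> = ["-3", "foo"].iter().map(|s| s.to_string()).collect();
    println!("{:?}", expand_context_shorthand(&args));
}
```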

stream results when feasible

Currently, all search results are written to an intermediate buffer in memory before being actually emitted to stdout. This is done to permit more efficient parallelism when searching. That is, only one thread can be writing to stdout at any point in time, but multiple threads can write to their own thread local memory buffer.

This does have undesirable end user consequences:

- It can result in high memory usage when the number of search results is high.
- When searching a single file, no output is seen until the search is complete.

We should be able to do quite a bit to fix these issues:

1. If a single file is given to search, then don't try any parallelism and make searching write to stdout directly.
2. If --threads 1 is given, then do (1) regardless of the number of inputs.
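The two rules above can be sketched as a small decision function plus a search routine that is generic over its writer, so the same code path can target either stdout or an in-memory buffer. OutputMode, choose_output_mode and search are illustrative names, and the substring search is a toy stand-in for the real searcher.

```rust
use std::io::{self, Write};

#[derive(Debug, PartialEq)]
enum OutputMode {
    /// Write matches straight to stdout as they are found.
    Direct,
    /// Collect matches in memory and emit them in one go.
    Buffered,
}

/// Streams directly when there is nothing to synchronize with.
fn choose_output_mode(num_inputs: usize, threads: usize) -> OutputMode {
    if num_inputs <= 1 || threads == 1 {
        OutputMode::Direct
    } else {
        OutputMode::Buffered
    }
}

/// Toy searcher: prints `line_number:line` for every line containing `needle`.
fn search<W: Write>(text: &str, needle: &str, out: &mut W) -> io::Result<()> {
    for (i, line) in text.lines().enumerate() {
        if line.contains(needle) {
            writeln!(out, "{}:{}", i + 1, line)?;
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let text = "alpha\nbeta\nalphabet\n";
    let stdout = io::stdout();
    match choose_output_mode(1, 4) {
        OutputMode::Direct => search(text, "alpha", &mut stdout.lock())?,
        OutputMode::Buffered => {
            let mut buf = Vec::new();
            search(text, "alpha", &mut buf)?;
            stdout.lock().write_all(&buf)?;
        }
    }
    Ok(())
}
```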

add support for mercurial

Mercurial is widely used enough that we should probably support it. Mercurial will actually be harder to do correctly than git, because an .hgignore file can support both regexes and globs. An .hgignore file can also specify subincludes, which include ignore patterns in a sub-directory (as opposed to git, which will read .gitignore files in sub-directories automatically).

Thankfully, ripgrep translates all globs to regexes, so we should be able to support Mercurial without too much trouble.

See: hg help hgignore and hg help patterns.
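To illustrate the glob-to-regex idea, here is a deliberately simplified translator. It assumes * and ? do not cross a / while ** does, it does not handle character classes or brace alternatives, and it anchors the whole pattern. glob_to_regex is a hypothetical name, and this is neither ripgrep's translation nor a full implementation of Mercurial's pattern syntax.

```rust
/// Translates a simple glob into an anchored regex string.
/// Supported: `*`, `**`, `?`. Everything else is matched literally.
fn glob_to_regex(glob: &str) -> String {
    let mut re = String::from("^");
    let mut chars = glob.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            '*' => {
                if chars.peek() == Some(&'*') {
                    // `**` may span directory separators.
                    chars.next();
                    re.push_str(".*");
                } else {
                    re.push_str("[^/]*");
                }
            }
            '?' => re.push_str("[^/]"),
            // Escape characters that are special in regex syntax.
            '\\' | '.' | '+' | '(' | ')' | '|' | '[' | ']' | '{' | '}' | '^' | '$' => {
                re.push('\\');
                re.push(c);
            }
            _ => re.push(c),
        }
    }
    re.push('$');
    re
}

fn main() {
    for glob in ["*.rs", "src/**", "a?c"] {
        println!("{} -> {}", glob, glob_to_regex(glob));
    }
}
```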

Inconsistent behavior on windows when path not specified

This is again possibly by design since ag does the same thing, but I don't understand why.

On Linux I can write rg foo and rg will search for "foo". On Windows, if I write rg foo, rg hangs indefinitely, but I can write rg foo . to search.

rg exits with 1 when run from within neovim

I'm trying to use rg with ag.vim. I have the following in my vim config:

let g:ag_prg="rg --no-heading --vimgrep"

However, it never returns results.

It turns out that when trying to run it from the vim command line it exits with 1 instead of giving results:

The same thing works fine with ag:

Running rg --no-heading --vimgrep from the terminal works fine.

Here's trying to run rg in vim with --debug:

Any ideas?

Add an option similar to -o, --only-matching

With grep you can print only the matched parts of the files. The option is described like this in the grep manpage:

-o, --only-matching
       Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.

A more powerful option would be something similar to --replace that doesn't print the non-matched part of the text.

Cannot build project: could not parse input as TOML

On a fresh checkout, on ubuntu xenial:

➜  ripgrep git:(master) ./compile              
failed to parse manifest at `/usr/local/src/ripgrep/Cargo.toml`

Caused by:
  could not parse input as TOML
Cargo.toml:42:9 expected a key but found an empty string
Cargo.toml:42:9-42:10 expected `.`, but found `'`

On a related note, since I don't know anything about rust, is there support for doing out-of-tree builds?

when parallelism is disabled, don't use an intermediate buffer

When --threads 1 is given to ripgrep, it should not write results to an intermediate buffer. In particular, if every search is serialized (as it is when --threads 1 is given), then no synchronization is needed to print to stdout, and therefore, no intermediate buffer is needed.

This should be somewhat straight-forward, since the searcher is generic over a Printer<W: Terminal>.

In #4, I ended up doing this, but only when there was a single file (or stdin) to be searched.

Fails grepping /proc/cpuinfo due to mmap

Not sure if this is expected behavior or not. Using 0.1.17.

rg -i mhz /proc/cpuinfo --debug
DEBUG:rg::args: will try to use memory maps
DEBUG:grep::search: regex ast:
Literal {
    chars: [
        'm',
        'h',
        'z'
    ],
    casei: true
}
DEBUG:grep::literals: literal prefixes detected: Literals { lits: [Complete(MHZ), Complete(mHZ), Complete(MhZ), Complete(mhZ), Complete(MHz), Complete(mHz), Complete(Mhz), Complete(mhz)], limit_size: 250, limit_class: 10 }
FAIL: 1

rg -i mhz /proc/cpuinfo --no-mmap
8:cpu MHz       : 1284.832
35:cpu MHz      : 1216.351
62:cpu MHz      : 1200.036
89:cpu MHz      : 1199.835
116:cpu MHz     : 1200.238
143:cpu MHz     : 1199.835
170:cpu MHz     : 1205.474
197:cpu MHz     : 1228.033
224:cpu MHz     : 1267.712
251:cpu MHz     : 1199.835
278:cpu MHz     : 1199.835
305:cpu MHz     : 1209.301

grep -i mhz /proc/cpuinfo
cpu MHz     : 1199.835
cpu MHz     : 1199.835
cpu MHz     : 1199.835
cpu MHz     : 1202.252
cpu MHz     : 1199.633
cpu MHz     : 1199.835
cpu MHz     : 1219.573
cpu MHz     : 1200.036
cpu MHz     : 1199.633
cpu MHz     : 1213.128
cpu MHz     : 1199.835
cpu MHz     : 1217.358
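Since --no-mmap works here, one possible approach is to treat the memory map as an optional fast path and fall back to ordinary reads whenever it fails. The sketch below shows only that control flow. read_with_fallback is a made-up name, and the fast path is passed in as a closure because the real mapping code is not shown in this report.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Tries `fast_path` first (e.g. a memory map) and falls back to a plain
/// read of the whole file if the fast path reports any error.
fn read_with_fallback<F>(path: &Path, fast_path: F) -> io::Result<Vec<u8>>
where
    F: FnOnce(&Path) -> io::Result<Vec<u8>>,
{
    match fast_path(path) {
        Ok(bytes) => Ok(bytes),
        Err(_) => fs::read(path),
    }
}

fn main() -> io::Result<()> {
    // Demo file in the temp directory; the fast path is forced to fail.
    let path = std::env::temp_dir().join("fallback_sketch_demo.txt");
    fs::write(&path, "cpu MHz : 1199.835\n")?;
    let bytes = read_with_fallback(&path, |_| {
        Err(io::Error::new(io::ErrorKind::Other, "fast path unavailable"))
    })?;
    print!("{}", String::from_utf8_lossy(&bytes));
    fs::remove_file(&path)?;
    Ok(())
}
```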

rg interprets too much as text

I ran rg foo on a directory with a variety of different files. For some binary files, including a bzipped tar file, it printed out a bunch of binary junk, including a bell, but luckily nothing that screwed up my terminal.

Here's the beginning of the file, via xxd; notice the lack of nulls:

0000000: 425a 6839 3141 5926 5359 8271 61b0 02b7  BZh91AY&SY.qa...
0000010: baff ffff ffff ffff ffff ffff ffff ffff  ................
0000020: ffff ffff ffff ffff ffff ffff ffff ffff  ................
0000030: ffff ffe8 c57f 36d3 addf 5dc7 b9f5 eebe  ......6...].....
0000040: ef8e 58de f6fa faef bbde f5f7 d96d edf5  ..X..........m..
0000050: 7bcf 41be df55 c46e cefa b9bd dbca 0ae6  {.A..U.n........
0000060: dbed 6f7d e5db d77d becb 6bcf 61a5 e7bd  ..o}...}..k.a...
0000070: df7d bbee fad7 2f7a eedf 79d7 aa37 9efb  .}..../z..y..7..
0000080: d79d 377b b7b0 643e 839d 776e cbbb efbc  ..7{..d>..wn....
0000090: fabe b7db 43dd f6bd e3d5 ade5 b6e4 fb61  ....C..........a
00000a0: bef7 bef8 a6ba df6d df7d eedb ef7c fbef  .......m.}...|..
00000b0: 9b6b dbd7 5bbb 6e7b 9b69 57b6 ebaf 7d8d  .k..[.n{.iW...}.
00000c0: e3a3 bedf 5cf3 7af9 33db deac 0747 5eef  ....\.z.3....G^.
00000d0: 7de7 c8f4 5f5a fb2f 7be3 df13 4fb7 dcfb  }..._Z./{...O...
00000e0: ed35 f6e2 df73 5f7d deef 3ddd cbaf bede  .5...s_}..=.....
00000f0: fbe6 f7dd bbaa f2fa f7de 5bef 6edf 7dee  ..........[.n.}.
0000100: f55b deef 5f47 6ef6 dbcf b7dd a17d 877b  .[.._Gn......}.{
0000110: 7d75 baee f573 3ded ef9f 77d6 f37d 1f7d  }u...s=...w..}.}
0000120: ee7d eb3b 59e8 ecdb eee3 b61e ecfb dbd0  .}.;Y...........
0000130: 5eda 3ebb b3ae 5f7b 74de c6be 5e59 ed5a  ^.>..._{t...^Y.Z
0000140: f97d 7bd9 b19f 5afa ebbe fbb4 bc67 addb  .}{...Z......g..
0000150: 537b b45e fafa fbed 3adb 5df5 df7d bb6a  S{.^....:.]..}.j
0000160: 77bd d6f7 3b77 df6e efba defb 9cfb 6525  w...;w.n......e%
0000170: 7db3 6f77 4697 ade3 3ddb edef 6abe e77b  }.owF...=...j..{
0000180: 7df7 df3a db3c f4d7 63ea ebee fbdd 5d76  }..:.<..c.....]v
0000190: becf 37b3 b7d9 edad 7b8a a3be ddab b77d  ..7.....{......}
00001a0: 9dc0 6f7b deaf 5d8c ef77 af74 dbbe fbef  ..o{..]..w.t....
00001b0: 7d95 d75d 5ade e674 ddb9 34f7 77cf 7416  }..]Z..t..4.w.t.
00001c0: 9f2e 3d37 d2df 2d3b be97 1d72 faf9 d9eb  ..=7..-;...r....
00001d0: 9edd 4ef7 7d7d 1bef b8cf 6bde fbc7 5df5  ..N.}}....k...].
00001e0: eefb eefb d7af b7dd ea6e b9d7 6d35 7acf  .........n..m5z.

GNU grep just reports:

Binary file es-raid-tools.tbz matches
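Because this file contains no NUL bytes in its first few hundred bytes, a NUL check alone would not catch it. One possible extra heuristic is to also treat a buffer as binary when it is not valid UTF-8. The sketch below is only an illustration of that idea with a made-up function name. It would misclassify legitimate text in other encodings such as Latin-1, so it is a trade-off and not a drop-in fix.

```rust
/// Heuristic: a buffer looks binary if it contains a NUL byte or an
/// invalid UTF-8 sequence.
fn looks_binary(buf: &[u8]) -> bool {
    if buf.contains(&0) {
        return true;
    }
    match std::str::from_utf8(buf) {
        Ok(_) => false,
        // error_len() is None when the buffer merely ends in the middle of a
        // multi-byte character, which is not evidence of binary data.
        Err(e) => e.error_len().is_some(),
    }
}

fn main() {
    // First 16 bytes of the bzip2 file from the xxd dump above.
    let bz = [
        0x42, 0x5a, 0x68, 0x39, 0x31, 0x41, 0x59, 0x26,
        0x53, 0x59, 0x82, 0x71, 0x61, 0xb0, 0x02, 0xb7,
    ];
    println!("bzip2 header looks binary: {}", looks_binary(&bz));
    println!("plain text looks binary: {}", looks_binary(b"cpu MHz : 1199.835\n"));
}
```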

add flag to specify one or more additional ignore files

Hello,

This might be a feature borrowed from ag.

It is very convenient to have a global ~/.rgignore that applies everywhere. It would contain stuff like:

*~
*.lib
*.cdb
*.dm
*.tag
*.oa
*.png
*.db
*.state
*.SVM
*.dat
*.sdc
*.il
*.tr
*.Cat
*.cfg
*.info
*.stateScripts
*#*#
TAGS
GTAGS
GRTAGS
GPATH

Then, even if I am working in /proj/SOMEPRJ/ and I have /proj/SOMEPRJ/.rgignore, both the SOMEPRJ-specific .rgignore and ~/.rgignore will be respected.

That way I do not copy the common stuff in the .rgignore of all the projects. Note that ~ (or /home/$USER) and /proj/SOMEPRJ/ do not share the same parent dir.
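To make the request concrete, here is a small sketch of how the two files could be combined: parse each file into a list of patterns and append the project list after the global one, so that project entries are considered later. parse_ignore and merged_patterns are hypothetical names. Treating lines that start with # as comments and letting later patterns take precedence are assumptions borrowed from gitignore conventions.

```rust
/// Parses ignore-file contents into patterns, skipping blank lines and
/// lines starting with `#` (assumed to be comments, as in gitignore).
fn parse_ignore(contents: &str) -> Vec<String> {
    contents
        .lines()
        .map(str::trim_end)
        .filter(|l| !l.is_empty() && !l.starts_with('#'))
        .map(String::from)
        .collect()
}

/// Global patterns first, project patterns last.
fn merged_patterns(global: &str, project: &str) -> Vec<String> {
    let mut all = parse_ignore(global);
    all.extend(parse_ignore(project));
    all
}

fn main() {
    let global = "*~\n*.png\n*#*#\nTAGS\n";
    let project = "# project specific\nXYZ/\n";
    for pattern in merged_patterns(global, project) {
        println!("{}", pattern);
    }
}
```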