findutils's People

Contributors

antiagainst, arcterus, bippityboppity, cakebaker, cnd, dahc, dependabot-preview[bot], dependabot-support, dependabot[bot], helloshiv, ilius, int3, jackpot51, jayvdb, jellehelsen, mcharsley, refi64, rofrol, sylvestre, tavianator, tertsdiepraam

findutils's Issues

warning: only a 'panic!' in 'if'-then statement

10 occurrences found with
cargo +nightly clippy -- -W clippy::pedantic

warning: only a `panic!` in `if`-then statement
  --> src/find/matchers/lname.rs:76:13
   |
76 | /             if e.kind() != ErrorKind::AlreadyExists {
77 | |                 panic!("Failed to create sym link: {:?}", e);
78 | |             }
   | |_____________^ help: try instead: `assert!(!(e.kind() != ErrorKind::AlreadyExists), "Failed to create sym link: {:?}", e);`
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#manual_assert
   = note: `-W clippy::manual-assert` implied by `-W clippy::pedantic`
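
The double negation in clippy's suggestion simplifies to a plain equality check. A minimal sketch, assuming the check is pulled into a small helper; check_symlink_error is a hypothetical name, not the code in lname.rs:

```rust
use std::io::{Error, ErrorKind};

/// Hypothetical helper: tolerate "already exists", panic on anything else.
fn check_symlink_error(e: &Error) {
    // Same behaviour as:
    //   if e.kind() != ErrorKind::AlreadyExists { panic!(...) }
    assert!(
        e.kind() == ErrorKind::AlreadyExists,
        "Failed to create sym link: {:?}",
        e
    );
}
```

Writing the condition positively also makes the intent (only "already exists" is acceptable) easier to read than `!(a != b)`.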

Implement `-follow`

Even though this option is deprecated, we should implement it for compatibility:


       -follow
              Deprecated; use the  -L  option  instead.   Dereference  symbolic  links.   Implies
              -noleaf.   The -follow option affects only those tests which appear after it on the
              command line.  Unless the -H or -L option has been specified, the position  of  the
              -follow  option  changes the behaviour of the -newer predicate; any files listed as
              the argument of -newer will be dereferenced if they are symbolic links.   The  same
              consideration applies to -newerXY, -anewer and -cnewer.  Similarly, the -type pred-
              icate will always match against the type of the file that a symbolic link points to
              rather  than  the  link itself.  Using -follow causes the -lname and -ilname predi-
              cates always to return false.

-name / should match ///

Repeated slashes for the root directory should match a single slash:

$ ./target/debug/find /// -maxdepth 0 -name /
$ find /// -maxdepth 0 -name /
find: warning: Unix filenames usually don't contain slashes (though pathnames do).  That means that '-name ‘/’' will probably evaluate to false all the time on this system.  You might find the '-wholename' test more useful, or perhaps '-samefile'.  Alternatively, if you are using GNU grep, you could use 'find ... -print0 | grep -FzZ ‘/’'.
///
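
A rough sketch of how the name used for -name matching could be derived so that a path made only of slashes yields "/". It assumes trailing slashes are ignored the way basename does; name_for_matching is a hypothetical helper, not an existing function in the crate:

```rust
/// Hypothetical helper: the string that -name should be compared against.
fn name_for_matching(path: &str) -> &str {
    let trimmed = path.trim_end_matches('/');
    if trimmed.is_empty() {
        // The path was empty or consisted only of slashes ("/", "///", ...).
        return if path.is_empty() { "" } else { "/" };
    }
    // Otherwise take everything after the last remaining slash.
    match trimmed.rfind('/') {
        Some(idx) => &trimmed[idx + 1..],
        None => trimmed,
    }
}
```

With this, "///" and "/" both map to "/", while "./src/find" still maps to "find".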

Fix clippy warning `empty String is being created manually`

warning: empty String is being created manually
   --> src/find/matchers/printf.rs:470:36
    |
470 |                 .unwrap_or_else(|| "".to_owned())
    |                                    ^^^^^^^^^^^^^ help: consider using: `String::new()`
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#manual_string_new
    = note: `-W clippy::manual-string-new` implied by `-W clippy::pedantic`

Implement `-I / --replace`

  -I R                         same as --replace=R
  -i, --replace[=R]            replace R in INITIAL-ARGS with names read
                                 from standard input, split at newlines;
                                 if R is unspecified, assume {}

Can't handle paths longer than PATH_MAX

POSIX says

The find utility shall be able to descend to arbitrary depths in a file hierarchy and shall not fail due to path length limitations (unless a path operand specified by the application exceeds {PATH_MAX} requirements).

but

$ name="0123456789ABCDEF"
$ name="${name}${name}${name}${name}"
$ name="${name}${name}${name}${name}"
$ name="${name:0:255}"
$ (mkdir deep && cd deep && for i in {1..17}; do mkdir $name && cd $name; done)
$ ./target/debug/find deep -mindepth 17
Error: deep: other os error
$ find deep -mindepth 17
deep/0123456789ABCDEF0123456789ABCDEF...

Also, despite printing an error message, uutils' find exits with status 0.

Fix clippy warning `variables can be used directly in the 'format!' string`

cargo +nightly clippy --allow-dirty --fix -- -W clippy::pedantic
finds 34 occurrences

For example:


warning: variables can be used directly in the `format!` string
  --> src/find/matchers/delete.rs:49:17
   |
49 |                 writeln!(&mut stderr(), "Failed to delete {}: {}", path_str, e).unwrap();
   |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#uninlined_format_args
   = note: `-W clippy::uninlined-format-args` implied by `-W clippy::pedantic`
help: change this to
   |
49 -                 writeln!(&mut stderr(), "Failed to delete {}: {}", path_str, e).unwrap();
49 +                 writeln!(&mut stderr(), "Failed to delete {path_str}: {e}").unwrap();

CodeHealth: abstract away stderr calls

At the moment when something goes wrong in a non-fatal way, we just write directly to stderr. We should extend find::Dependencies to abstract out access to stderr much like we did with stdout. This will allow

  • the code to be better used as a library (if we ever want to do that)
  • us to test that errors are reported
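
A minimal sketch of the idea, not the real find::Dependencies trait (whose exact shape is not shown here). Deps, RealDeps, FakeDeps and report_delete_failure are all hypothetical names used for illustration:

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for find::Dependencies; the real trait may differ.
trait Deps {
    fn stderr(&mut self) -> &mut dyn Write;
}

/// Production implementation: forwards to the process's stderr.
struct RealDeps {
    err: io::Stderr,
}

impl Deps for RealDeps {
    fn stderr(&mut self) -> &mut dyn Write {
        &mut self.err
    }
}

/// Test implementation: captures everything written to "stderr".
#[derive(Default)]
struct FakeDeps {
    err: Vec<u8>,
}

impl Deps for FakeDeps {
    fn stderr(&mut self) -> &mut dyn Write {
        &mut self.err
    }
}

/// Report a non-fatal error through the injected dependency.
fn report_delete_failure(deps: &mut dyn Deps, path: &str, e: &io::Error) {
    // Ignore write failures: there is nowhere else to report them.
    let _ = writeln!(deps.stderr(), "Failed to delete {path}: {e}");
}
```

A test can then pass a FakeDeps and assert on the captured bytes instead of scraping the real stderr.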

CodeHealth: tidy up Errors

We currently just return errors created from arbitrary strings. At some point we should switch to a proper find::Error enum.
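
One possible shape for such an enum, as a sketch only. The name FindError and its variants are illustrative assumptions, not a proposal for the final set of cases:

```rust
use std::fmt;
use std::io;

/// Hypothetical shape for a find::Error enum; variant names are illustrative.
#[derive(Debug)]
enum FindError {
    /// A flag that the parser does not know about.
    UnrecognizedFlag(String),
    /// A flag that needs a value but reached the end of the arguments.
    MissingArgument(String),
    /// An underlying I/O failure, tagged with the path involved.
    Io { path: String, source: io::Error },
}

impl fmt::Display for FindError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            FindError::UnrecognizedFlag(flag) => write!(f, "Unrecognized flag: '{flag}'"),
            FindError::MissingArgument(flag) => write!(f, "missing argument to {flag}"),
            FindError::Io { path, source } => write!(f, "{path}: {source}"),
        }
    }
}

impl std::error::Error for FindError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            FindError::Io { source, .. } => Some(source),
            _ => None,
        }
    }
}
```

Callers can then match on the variant (for example to choose an exit status) instead of comparing strings.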

Panic when piped into head

$ find | head
.
./.git
./.git/branches
./.git/hooks
./.git/hooks/applypatch-msg.sample
./.git/hooks/commit-msg.sample
./.git/hooks/post-update.sample
./.git/hooks/pre-applypatch.sample
./.git/hooks/pre-commit.sample
./.git/hooks/prepare-commit-msg.sample
$ RUST_BACKTRACE=1 ./target/debug/find | head
.
./.git
./.git/branches
./.git/hooks
./.git/hooks/applypatch-msg.sample
./.git/hooks/commit-msg.sample
./.git/hooks/post-update.sample
./.git/hooks/pre-applypatch.sample
./.git/hooks/pre-commit.sample
./.git/hooks/prepare-commit-msg.sample
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error { repr: Os { code: 32, message: "Broken pipe" } }', src/libcore/result.rs:859
stack backtrace:
   0: std::sys::imp::backtrace::tracing::imp::unwind_backtrace
   1: std::sys_common::backtrace::_print
   2: std::panicking::default_hook::{{closure}}
   3: std::panicking::default_hook
   4: std::panicking::rust_panic_with_hook
   5: std::panicking::begin_panic
   6: std::panicking::begin_panic_fmt
   7: rust_begin_unwind
   8: core::panicking::panic_fmt
   9: core::result::unwrap_failed
             at /build/rust/src/rustc-1.17.0-src/src/libcore/macros.rs:29
  10: <core::result::Result<T, E>>::unwrap
             at /build/rust/src/rustc-1.17.0-src/src/libcore/result.rs:737
  11: <findutils::find::matchers::printer::Printer as findutils::find::matchers::Matcher>::matches
             at ./src/find/matchers/printer.rs:26
  12: <findutils::find::matchers::logical_matchers::AndMatcher as findutils::find::matchers::Matcher>::matches::{{closure}}
             at ./src/find/matchers/logical_matchers.rs:39
  13: <core::slice::Iter<'a, T> as core::iter::iterator::Iterator>::all::{{closure}}
             at /build/rust/src/rustc-1.17.0-src/src/libcore/slice.rs:1027
  14: <core::slice::Iter<'a, T>>::search_while
             at /build/rust/src/rustc-1.17.0-src/src/libcore/slice.rs:1118
  15: <core::slice::Iter<'a, T> as core::iter::iterator::Iterator>::all
             at /build/rust/src/rustc-1.17.0-src/src/libcore/slice.rs:1026
  16: <findutils::find::matchers::logical_matchers::AndMatcher as findutils::find::matchers::Matcher>::matches
             at ./src/find/matchers/logical_matchers.rs:39
  17: findutils::find::process_dir
             at ./src/find/mod.rs:122
  18: findutils::find::do_find
             at ./src/find/mod.rs:139
  19: findutils::find::find_main
             at ./src/find/mod.rs:198
  20: find::main
             at ./src/find/main.rs:14
  21: std::panicking::try::do_call
  22: __rust_maybe_catch_panic
  23: std::rt::lang_start
  24: main
  25: __libc_start_main
  26: _start
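
The backtrace points at an unwrap() on the write in the printer. A sketch of one way to avoid the panic, assuming the caller is able to stop the traversal when the reader has gone away; print_path and PrintOutcome are hypothetical names, not the existing Printer API:

```rust
use std::io::{self, ErrorKind, Write};

/// Outcome of trying to print one path.
#[derive(Debug, PartialEq)]
enum PrintOutcome {
    Printed,
    /// The reader went away (e.g. `head` exited); stop producing output.
    ReaderGone,
}

/// Hypothetical replacement for an `unwrap()` on the write in a printer.
fn print_path(out: &mut dyn Write, path: &str) -> io::Result<PrintOutcome> {
    match writeln!(out, "{path}") {
        Ok(()) => Ok(PrintOutcome::Printed),
        // A closed pipe is an expected way for the output to end.
        Err(e) if e.kind() == ErrorKind::BrokenPipe => Ok(PrintOutcome::ReaderGone),
        // Anything else is still a real error for the caller to report.
        Err(e) => Err(e),
    }
}
```

Whether find should then exit silently or with a non-zero status on ReaderGone is a separate decision that this sketch leaves to the caller.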

** should be a valid glob

$ find -name '**.rs'
./src/find/matchers/perm.rs
...
$ ./target/debug/find -name '**.rs'
Error: Pattern syntax error near position 2: recursive wildcards must form a single path component

But in a -name pattern ** is not a recursive glob; it is just two wildcards next to each other, like the regex .*.*\.rs
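
Since consecutive wildcards match the same strings as a single one, one possible workaround is to collapse them before handing the pattern to the glob library. This is a sketch under that assumption; collapse_stars is a hypothetical helper and it does not treat bracket expressions specially:

```rust
/// Collapse runs of unescaped `*` into a single `*`.
/// Sketch only: bracket expressions such as `[**]` are not treated specially.
fn collapse_stars(pattern: &str) -> String {
    let mut out = String::with_capacity(pattern.len());
    let mut chars = pattern.chars();
    let mut last_was_star = false;
    while let Some(c) = chars.next() {
        match c {
            '\\' => {
                // Copy the escape and the character it protects verbatim.
                out.push(c);
                if let Some(escaped) = chars.next() {
                    out.push(escaped);
                }
                last_was_star = false;
            }
            '*' => {
                // Only the first star of a run is kept.
                if !last_was_star {
                    out.push('*');
                }
                last_was_star = true;
            }
            _ => {
                out.push(c);
                last_was_star = false;
            }
        }
    }
    out
}
```

An escaped star (`\*`) is a literal character, so it neither starts nor extends a run of wildcards.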

Implement `-newerXY`

The upstream manpage states:

-newerXY reference
          Compares the timestamp of the current file with reference.   The
          reference  argument  is  normally the name of a file (and one of
          its timestamps is used for the comparison) but it may also be  a
          string  describing  an  absolute time.  X and Y are placeholders
          for other letters, and these letters select which time belonging
          to how reference is used for the comparison.

          a   The access time of the file reference
          B   The birth time of the file reference
          c   The inode status change time of reference
          m   The modification time of the file reference
          t   reference is interpreted directly as a time

$ find . ! -newermt 'jan 01, 2000' -exec touch -d@1706517278 {} +
Error: Unrecognized flag: '-newermt'

-exec ... {} + matched too aggressively

$ find -exec echo '{}' foo + \;
. foo +
...
$ ./target/debug/find -exec echo '{}' foo + \;
Error: -exec [args...] + isn't supported yet. Only -exec [args...] ;

In fact, POSIX says

The end of the primary expression shall be punctuated by a <semicolon> or by a <plus-sign>. Only a <plus-sign> that immediately follows an argument containing only the two characters "{}" shall punctuate the end of the primary expression. Other uses of the <plus-sign> shall not be treated as special.
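
The rule quoted above can be sketched as a small scanner over the arguments that follow -exec. find_exec_end and ExecEnd are hypothetical names for illustration, not the existing parser:

```rust
/// How an -exec clause was terminated.
#[derive(Debug, PartialEq)]
enum ExecEnd {
    /// `;` : run the command once per file.
    Semicolon,
    /// `{} +` : batch files into as few invocations as possible.
    Plus,
}

/// Hypothetical helper: scan the arguments after `-exec` and return the index
/// of the terminator together with its kind, or None if there is none.
fn find_exec_end(args: &[&str]) -> Option<(usize, ExecEnd)> {
    for (i, arg) in args.iter().enumerate() {
        if *arg == ";" {
            return Some((i, ExecEnd::Semicolon));
        }
        // A `+` only ends the clause when the previous argument is exactly `{}`.
        if *arg == "+" && i > 0 && args[i - 1] == "{}" {
            return Some((i, ExecEnd::Plus));
        }
    }
    None
}
```

For the failing command above, the arguments echo, {}, foo, +, ; end at the semicolon, and the + is passed through to echo as an ordinary argument.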

Double negation is the same as single negation

$ find \! \! -name find
./src/find
./target/debug/find
$ ./target/debug/find \! \! -name find
.
./.git
./.git/branches
./.git/hooks
./.git/hooks/applypatch-msg.sample
...

When the parser detects a -not/! it sets invert_next_matcher = true, but maybe it should toggle the flag instead, something like invert_next_matcher = !invert_next_matcher?
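
A sketch of the toggling behaviour in isolation. parse_negations is a hypothetical helper; the real parser keeps this state inside its main loop:

```rust
/// Hypothetical helper: consume leading `!` / `-not` tokens and report whether
/// the matcher that follows should be inverted, plus how many tokens were used.
fn parse_negations(args: &[&str]) -> (bool, usize) {
    let mut invert_next_matcher = false;
    let mut consumed = 0;
    for arg in args {
        if *arg == "!" || *arg == "-not" {
            // Toggle rather than set, so that `! !` cancels out.
            invert_next_matcher = !invert_next_matcher;
            consumed += 1;
        } else {
            break;
        }
    }
    (invert_next_matcher, consumed)
}
```

With toggling, an even number of negations leaves the following matcher untouched and an odd number inverts it.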

consider making onig optional

There are a number of pure-Rust glob and regex crates; it might be better not to have to deal with a C dependency if we can avoid it.

From what I can see, the regex crate may lack a way to select a specific flavour of regex. I'm not sure whether anyone has already found a way to restrict the engine so that it does not support extensions beyond posix/emacs.

CodeHealth: fix windows support

This should be fairly trivial with the code at time of writing: we just need to remove the hard-coded expectations about slashes being the path separator (mainly in the tests).

Some of the currently unimplemented features (user, group and symlinks) might have to involve some if cfg!(windows) magic.

On the symlink front, I'd advise taking inspiration from https://github.com/BurntSushi/walkdir/blob/master/src/tests.rs on how to test interaction with symlinks on both unix and windows machines.

use uutils-args or clap for argument management

findutils does far too much argument handling by hand.
It should be delegated to clap (or uutils-args).
For example:

while i < args.len()
    && (args[i] == "-" || !args[i].starts_with('-'))
    && args[i] != "!"
    && args[i] != "("

findutils/src/find/mod.rs

Lines 167 to 175 in ac576f5

fn print_help() {
    println!(
        r"Usage: find [path...] [expression]
If no path is supplied then the current working directory is used by default.
Early alpha implementation. Currently the only expressions supported are
-print
-name case-sensitive_filename_pattern

findutils/src/find/mod.rs

Lines 205 to 207 in ac576f5

fn print_version() {
    println!("find (Rust) {}", env!("CARGO_PKG_VERSION"));
}

let mut invert_next_matcher = false;
while i < args.len() {
    let possible_submatcher = match args[i] {
        "-print" => Some(printer::Printer::new_box()),
        "-true" => Some(logical_matchers::TrueMatcher::new_box()),
        "-false" => Some(logical_matchers::FalseMatcher::new_box()),
        "-name" => {
            if i >= args.len() - 1 {

FeatureComplete: implement exec[dir] +

We (will soon) have support for -exec and -execdir clauses that end with a ';' (i.e. run this command for every file/directory). We need to also add support for clauses that end with a '+' (i.e. batch up the files/dirs and then run the command for as many as possible at once).

Unfortunately this isn't easy in an os-independent way because the standard library doesn't expose any way of telling when a command-line is going to be too long. I raised rust-lang/rust#40384 but it's not getting much traction.

So we need to go for a lowest common denominator approach. Choose a hard-coded limit (I'd suggest a bit less than 8kB, to allow for any inaccuracies in the next bit), come up with an efficient way of estimating the command-line length (doing this accurately is going to involve reimplementing too much of std::process::Command) and trigger the command when the estimated total goes over the limit.

To do this we need to

  • add a MultiExecMatcher class to find::matchers::exec (implementing the finished_dir and finished methods to make the exec calls for any remaining files that haven't been executed yet)
  • tweak find::process_dir to call the finished and finished_dir methods as appropriate
  • tweak find::matchers::build_matcher_tree to stop returning an error when -exec[dir] finds a + and create a MultiExecMatcher instead.
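
The batching part of this plan can be sketched independently of the matcher plumbing. ArgBatcher is a hypothetical helper, and its size estimate (argument bytes plus one per argument) is a deliberately crude assumption rather than an accurate model of what the OS counts:

```rust
/// Hypothetical batching helper for `-exec ... {} +`.
/// The size estimate is crude: bytes of each argument plus one for the NUL.
struct ArgBatcher {
    limit: usize,
    base_size: usize,
    current_size: usize,
    pending: Vec<String>,
}

impl ArgBatcher {
    /// `command` is the fixed part of the command line (program and leading args).
    fn new(command: &[&str], limit: usize) -> Self {
        let base_size: usize = command.iter().map(|a| a.len() + 1).sum();
        ArgBatcher {
            limit,
            base_size,
            current_size: base_size,
            pending: Vec::new(),
        }
    }

    /// Queue a path. If adding it would exceed the limit, return the batch
    /// collected so far (to be executed now) and start a new one.
    fn push(&mut self, path: &str) -> Option<Vec<String>> {
        let cost = path.len() + 1;
        let flushed = if !self.pending.is_empty() && self.current_size + cost > self.limit {
            self.finish()
        } else {
            None
        };
        self.pending.push(path.to_owned());
        self.current_size += cost;
        flushed
    }

    /// Return whatever is still queued; call when a directory or the whole
    /// traversal finishes.
    fn finish(&mut self) -> Option<Vec<String>> {
        if self.pending.is_empty() {
            return None;
        }
        self.current_size = self.base_size;
        Some(std::mem::take(&mut self.pending))
    }
}
```

A MultiExecMatcher could own one of these, run the command whenever push returns a batch, and call finish from its finished/finished_dir hooks. The limit would be the hard-coded value discussed above (a bit less than 8kB).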

Fix clippy warning `unnecessary `!=` operation`

warning: unnecessary `!=` operation
  --> src/testing/commandline/main.rs:81:26
   |
81 |           destination_dir: if args[1] != "-" {
   |  __________________________^
82 | |             Some(args[1].clone())
83 | |         } else {
84 | |             None
85 | |         },
   | |_________^
   |
   = help: change to `==` and swap the blocks of the `if`/`else`
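
What the suggested change looks like, sketched as a standalone function. destination_dir here is a hypothetical helper mirroring the struct field expression, and it takes a &str rather than indexing into args:

```rust
/// Hypothetical helper mirroring the `destination_dir` expression:
/// "-" means "no destination directory".
fn destination_dir(arg: &str) -> Option<String> {
    // `==` with the branches swapped, as the lint suggests.
    if arg == "-" {
        None
    } else {
        Some(arg.to_owned())
    }
}
```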

xargs doesn't limit command line lengths properly

$ find ~ -print0 | ./target/debug/xargs -0 echo >/dev/null
Error: Command could not be run: Argument list too long (os error 7)

From a quick look, there's at least an issue here:

findutils/src/xargs/mod.rs

Lines 141 to 146 in 36e3229

#[cfg(unix)]
fn count_osstr_chars_for_exec(s: &OsStr) -> usize {
    use std::os::unix::ffi::OsStrExt;
    // Include +1 for the null terminator.
    s.as_bytes().len() + 1
}

This needs to count not only the length of the string, but also the size of the pointer that ends up in argv/envp.
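
A sketch of the adjusted accounting, assuming one pointer-sized slot per argument is the only missing term. Other overheads, such as the terminating NULL entry of the array, are not modelled here:

```rust
use std::ffi::OsStr;
use std::mem::size_of;
use std::os::raw::c_char;

/// Sketch of a per-argument cost that also counts the argv/envp slot.
#[cfg(unix)]
fn count_osstr_chars_for_exec(s: &OsStr) -> usize {
    use std::os::unix::ffi::OsStrExt;
    // Bytes of the string, +1 for the null terminator,
    // plus one pointer for the entry in the argv (or envp) array.
    s.as_bytes().len() + 1 + size_of::<*const c_char>()
}
```

On a 64-bit target this adds 8 bytes per argument, which matters most when there are many short file names.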

It might be worth using the https://crates.io/crates/argmax crate to do this accounting for us. It could also help for #6.

-help found too aggressively

$ ./target/debug/find -exec echo '{}' -help \;
Usage: find [path...] [expression]
...

But the -help should be an argument to -exec.
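
One way to express this is to look for -help only at the top level and skip over the arguments of -exec/-execdir clauses. A rough sketch: wants_help is a hypothetical helper, and it only knows about -exec and -execdir, not other flags that take arguments:

```rust
/// Hypothetical check: is `-help` present as a top-level argument, i.e. not
/// inside the argument list of an -exec/-execdir clause?
fn wants_help(args: &[&str]) -> bool {
    let mut i = 0;
    while i < args.len() {
        match args[i] {
            "-help" => return true,
            "-exec" | "-execdir" => {
                // Skip everything up to and including the terminating
                // `;` or `{} +`.
                i += 1;
                while i < args.len() {
                    let ends = args[i] == ";" || (args[i] == "+" && args[i - 1] == "{}");
                    i += 1;
                    if ends {
                        break;
                    }
                }
            }
            _ => i += 1,
        }
    }
    false
}
```

With this, the command above passes -help through to echo, while a -help after the closing \; is still recognised.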
