diskus's People

Contributors: amilajack, arunsathiya, crestwave, fawick, fierthraix, fuerbringer, heimskr, polyzen, sergeykasmy, sharkdp, wngr


diskus's Issues

Provide a way to show size of sub-elements of a given folder

By far the most common use-case for du -hs for me (and also what dust caters to) is running du -hs * to find the largest directories in a given directory (usually the current one). It doesn't seem like diskus can currently operate in that mode. It would be awesome if that kind of quick-and-dirty "tell me what's large" mode could be supported somehow!
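For reference, the workflow described above, using plain du (diskus itself has no per-entry mode; the demo tree is purely illustrative):

```shell
# Per-entry sizes of the current directory, largest last.
demo=$(mktemp -d)
mkdir -p "$demo/small" "$demo/large"
head -c 1024  /dev/zero > "$demo/small/f"
head -c 65536 /dev/zero > "$demo/large/f"
cd "$demo"
# -s: summarize per argument, -b: apparent size in bytes (GNU du)
sizes=$(du -sb -- * | sort -n)
echo "$sizes"
```

The requested diskus mode would replace the du -sb invocation here.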

Undocumented stdout behaviour

Printed output on the screen does not match the output sent to stdout. For example:

$ diskus
23.24 MB (23,240,704 bytes)
$ diskus | cat
23240704

I understand the purpose of this: when piping to an external program, it is often useful to have the raw number without any formatting. That's all well and good, but it is unexpected, since it does not follow the conventions used by other CLI tools, and the behaviour is not mentioned anywhere in the documentation. There should also be a simple way to send the entire formatted output to stdout. Typically, unless the output is dynamic (like a text progress bar), what one sees on the screen should be what goes to stdout; changing the output format should be handled by command options.
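The behaviour at issue can be sketched in shell: the format switches on whether stdout is a terminal. This mirrors what diskus appears to do; the numbers and the numfmt formatting are illustrative, not diskus's actual code path.

```shell
# Emit a human-readable size when stdout is a terminal, raw bytes otherwise.
report_size() {
  bytes=$1
  if [ -t 1 ]; then
    # interactive: formatted output (requires GNU numfmt)
    numfmt --to=si --suffix=B "$bytes"
  else
    # piped: raw number, easy for other programs to parse
    echo "$bytes"
  fi
}
report_size 23240704
```

Running `report_size 23240704 | cat` takes the raw-number branch, which is exactly the surprising split described above.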

license needs clarification

Please clarify licensing. I have not found anything indicating how you mean this to be licensed. Is it MIT+Apache, or is it a standard "recipient chooses" dual-licensing scheme where the recipient gets to decide which license to accept?

diskus slower than du

Maybe I was doing it wrong.
The measured directory is my clippy build directory.

% /usr/bin/du -sch
4.5G    .
4.5G    total
% diskus
4.73 GB (4,727,521,280 bytes)
% hyperfine diskus '/usr/bin/du -sch'
Benchmark #1: diskus
  Time (mean ± σ):     115.8 ms ±  28.6 ms    [User: 2.601 s, System: 0.592 s]
  Range (min … max):    69.1 ms … 156.9 ms    19 runs

Benchmark #2: /usr/bin/du -sch
  Time (mean ± σ):      22.8 ms ±   2.8 ms    [User: 5.5 ms, System: 17.4 ms]
  Range (min … max):    14.2 ms …  26.9 ms    163 runs

Summary
  '/usr/bin/du -sch' ran
    5.07 ± 1.40 times faster than 'diskus'

Meta

  • diskus: b2e4cf9 but with cargo update

Some improvement ideas

Can you please allow passing a path to dup?
The motivation behind this request is that it's pretty inconvenient to have to change my working directory just to find out how much space it uses on disk. Even better, allow passing an arbitrary number of paths and process them simultaneously. Also, please consider JSON output, as mentioned in a comment in this lobste.rs thread.

Difference in size reported by `du -sh` and diskus

In a directory with VM images,

jagan@giant ~/s/diskus> cd ~/.vm
jagan@giant ~/.vm> du -sh
32G     .
jagan@giant ~/.vm> dup 
68.72 GB (68719480832 bytes)

Is this due to sparse files? The huge difference in the numbers can be misleading.

PS: Still using old binary. I was not able to compile it again after the name change. Getting this from rustc,

error[E0658]: use of unstable library feature 'libc': use `libc` from crates.io (see issue #27783)                                                                        
  --> /home/jagan/.cargo/registry/src/github.com-1ecc6299db9ec823/term_size-0.3.1/src/lib.rs:16:1                                                                         
   |                                                                                                                                                                      
16 | extern crate libc;                                                                                                                                                   
   | ^^^^^^^^^^^^^^^^^^                                                                                                                                                   
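The sparse-file hypothesis from the question above is easy to check: a file can have a huge apparent size (which diskus counts) while occupying almost no blocks on disk (which plain du -sh counts). A minimal demonstration, assuming GNU coreutils:

```shell
# Create a 1 GiB sparse file: large apparent size, near-zero disk usage.
f=$(mktemp)
truncate -s 1G "$f"
apparent=$(du --apparent-size -B1 "$f" | cut -f1)   # what diskus reports
on_disk=$(du -B1 "$f" | cut -f1)                    # what du -sh reports
echo "apparent=$apparent on_disk=$on_disk"
rm -f "$f"
```

VM images are frequently sparse, which would explain a roughly 2x gap like the 32G vs. 68.72 GB above.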

Project name?

I think your project is very cool, but am concerned about the name.

Aside from dup being a built-in word in many Forth-like languages, I would have thought it was mnemonically associated with something that involves two of something.

Feel free to disregard (I'm just some rando from the internet), but I think this would hinder people from using/contributing to your project.

Add an option to follow symlinks

du supports following symlinks via the -L option; there is no such option in diskus. I want to use diskus to calculate the size of packages on NixOS, which makes heavy use of symlinks.

Include in Homebrew

This is a great tool that I'd love to see included in Homebrew. However, it's currently too new and not yet notable enough.

When it is 30 days old and has one of:

  1. ≥ 30 watchers,
  2. ≥ 30 forks, or
  3. ≥ 75 stars, (most likely to qualify)

it can be submitted for inclusion at homebrew/homebrew-core.

Add plain output option

It might make sense to add something like a --plain option (other name suggestions?) which would provide an easily parseable output (probably just the number of bytes) for other programs that want to run diskus.

Wrong total

du -sh /

du: cannot access '/run/user/1000/gvfs': Permission denied
du: cannot access '/proc/6026/task/6026/fd/4': No such file or directory
du: cannot access '/proc/6026/task/6026/fdinfo/4': No such file or directory
du: cannot access '/proc/6026/fd/3': No such file or directory
du: cannot access '/proc/6026/fdinfo/3': No such file or directory
66G	/

cd / && dup

I/O error: ./run/user/1000/gvfs: Permission denied (os error 13)
Could not get metadata: './proc/6009/fd/6'
Could not get metadata: './proc/6009/fd/9'
Could not get metadata: './proc/6009/fd/7'
[... roughly 200 similar "Could not get metadata" lines for file descriptors under ./proc elided ...]
140.81 TB (140805022081340 bytes)

The result given by du -sh is accurate.

Provide path in output

Great tool!

Would love to see the path in the output, à la:

# diskus /tmp
57.87 MB (57,872,384 bytes) /tmp

Putting diskus in place greatly reduced the runtime of our previous du -sh script, but I need the path name in the output. So I'm doing the following for now; it seems, though, that we lose the human-readable sizes when manipulating the stdout:

for i in ` find /share/users -mindepth 1 -maxdepth 1 | grep -v "^\.*$"`; do diskus -j 3 $i | awk -v dir=$i '{print $NF "\t\t\t" dir}' & done >> /tmp/diskus-$DATE.log

Output:

20966879414272    /share/users/mholmes
16338228916224    /share/users/sholmes
12737846097408    /share/users/jwatson

DANGER for anyone that finds this at random: The above launches a diskus process for everything found 1 layer in, if you have over 100 dirs and say only 96 cores, this will oversubscribe and you'll see a loadavg over 300.
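A sketch of a bounded variant of that loop: let xargs cap the number of concurrent scans instead of backgrounding one process per directory (GNU xargs assumed; du -sb stands in for diskus so the example is self-contained):

```shell
# Demo tree standing in for /share/users/<user> directories.
root=$(mktemp -d)
mkdir -p "$root/u1" "$root/u2" "$root/u3"
# -P 2: at most two scans run at a time, regardless of directory count.
out=$(find "$root" -mindepth 1 -maxdepth 1 -type d -print0 \
  | xargs -0 -P 2 -I{} sh -c 'printf "%s\t%s\n" "$(du -sb "{}" | cut -f1)" "{}"')
echo "$out"
```

With -P set near the core count, the load average stays bounded even with hundreds of directories.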

Stack overflow

With a sufficiently large directory, the threads can overflow their stacks because walk() is recursive. I encountered a similar issue when rewriting the du implementation in uutils/coreutils, so I decided to test diskus and found the same problem.
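The usual fix is to replace recursion with an explicit work stack, so depth is bounded by heap, not thread stack. A rough sketch of the idea (bash arrays, GNU stat; the real fix would of course be inside diskus's Rust walk(), and dotfiles are skipped here for brevity):

```shell
# Sum apparent file sizes of a tree iteratively, with no recursion.
walk_total() {
  local -a stack=("$1")
  local total=0 d f n
  while [ "${#stack[@]}" -gt 0 ]; do
    n=$(( ${#stack[@]} - 1 ))
    d="${stack[n]}"
    unset "stack[n]"              # pop one directory
    for f in "$d"/*; do
      [ -e "$f" ] || continue     # empty directory: glob did not match
      if [ -d "$f" ] && [ ! -L "$f" ]; then
        stack+=("$f")             # push: visit later instead of recursing now
      else
        total=$(( total + $(stat -c %s "$f") ))
      fi
    done
  done
  echo "$total"
}

# Tiny demonstration tree: 5 + 2 = 7 bytes.
dir=$(mktemp -d); mkdir -p "$dir/a/b"
printf '12345' > "$dir/a/b/f"; printf 'xy' > "$dir/g"
walk_total "$dir"
```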

Calculate size for directory names passed as arguments

Hey, it would be great if I could pass directories as arguments and have dup calculate the size only for those directories. It's definitely easier than running them separately and adding up the results myself. What do you think?

Consider adding support for Windows directory size "philosophy"

Hey, first time trying diskus, and while I can confirm that it's really fast here as well, I get a different result in total bytes on a local directory. 😕

I've noticed the Windows caveat section, but I don't think it applies in my case, because there is nothing unusual in this path: no junctions or hardlinks whatsoever.

To be sure, I've tested the same path with some other tools: the Python-based duu [1], another one implemented in Rust found here on GitHub (dua [2]), as well as the Sysinternals Disk Usage (du [3]) tool for Windows for reference.

diskus
PS E:\> diskus Down
101.90 GB (101,896,640,783 bytes)
PS E:\> cd Down
PS E:\Down> diskus .
101.90 GB (101,896,640,783 bytes)
PS E:\Down>

Here's the comparison:

duu
PS E:\Down> duu --quiet

summary
=======
files         : 3'919
directories   : 125
bytes         : 101'895'657'743
kilobytes     : 99'507'478.26
megabytes     : 97'175.27
gigabytes     : 94.90
PS E:\Down>
dua
PS E:\Down> dua --format bytes '.'
101895657743 b . entries
PS E:\Down> '{0:N0}' -f 101895657743
101'895'657'743
PS E:\Down>
du
PS E:\Down> du E:\Down
Files:        3919
Directories:  125
Size:         101'895'657'743 bytes
Size on disk: 101'904'261'120 bytes

PS E:\Down>

And last but not least, the total value in bytes as displayed in Windows Explorer: 94.8 GB (101'895'657'743 bytes)

OS information:

PS E:\Down> [Environment]::OSVersion.VersionString
Microsoft Windows NT 10.0.19043.0
PS E:\Down> Get-WindowsVersion | select Version, OS* | fl

Version  : 2009
OS Build : 19043.1526

PS E:\Down>

Footnotes

  1. https://github.com/jftuga/duu

  2. https://github.com/Byron/dua-cli

  3. https://docs.microsoft.com/en-us/sysinternals/downloads/du

Usage with xargs

I'm trying to do this:

xargs -0 -P $parallel -I {} sh -c "{ date '+START {} %Y-%m-%d_%H:%M:%S' >> $logf; \
	bytes=$(diskus -v {}); \
        echo "{},$bytes" >> $csvf; \
	date '+FINISH {} %Y-%m-%d_%H:%M:%S' >> $logf; }"

But:

diskus: could not retrieve metadata for path '{}'

I'm 96% sure my xargs is correct since I'm just replacing du with diskus in the same construct. The {} comes out as expected elsewhere. Any ideas?
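One guess (an assumption, not confirmed here): because the command string is double-quoted, the parent shell expands $(diskus -v {}) once, up front, with a literal {} as the path, so xargs never gets to substitute inside the command substitution. Single quotes defer it to the per-item shell (du -sb stands in for diskus so the example is self-contained):

```shell
# Double quotes: $(...) runs in the parent shell with a literal '{}'.
# Single quotes: the inner shell runs after xargs has replaced '{}'.
d=$(mktemp -d)
printf 'hello' > "$d/f"
out=$(printf '%s\0' "$d" \
  | xargs -0 -I {} sh -c 'bytes=$(du -sb "{}" | cut -f1); echo "{},$bytes"')
echo "$out"
```

That would explain why {} "comes out as expected elsewhere": the plain {} occurrences are substituted fine, only the one inside $(...) is evaluated too early.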

Error on installation

Hello! Thanks for writing this, looks super neat! :)

I had a problem installing this just now; perhaps I am missing something obvious, but I wanted to file this in case it is helpful.

Details below, let me know if you would like more information :)

$ cargo install du-dup
    Updating registry `https://github.com/rust-lang/crates.io-index`
  Installing du-dup v0.2.0
   Compiling crossbeam-utils v0.5.0
   Compiling semver-parser v0.7.0
   Compiling strsim v0.7.0
   Compiling nodrop v0.1.12
   Compiling rand_core v0.3.0
   Compiling cfg-if v0.1.6
   Compiling scopeguard v0.3.3
   Compiling memoffset v0.2.1
error: the struct `#[repr(align(u16))]` attribute is experimental (see issue #33626)
  --> /Users/ntie0001/.cargo/registry/src/github.com-1ecc6299db9ec823/crossbeam-utils-0.5.0/src/cache_padded.rs:19:1
   |
19 | #[repr(align(64))]
   | ^^^^^^^^^^^^^^^^^^

error: non-string literals in attributes, or string literals in top-level positions, are experimental (see issue #34981)
  --> /Users/ntie0001/.cargo/registry/src/github.com-1ecc6299db9ec823/crossbeam-utils-0.5.0/src/cache_padded.rs:19:1
   |
19 | #[repr(align(64))]
   | ^^^^^^^^^^^^^^^^^^

error: aborting due to 2 previous errors

error: Could not compile `crossbeam-utils`.
warning: build failed, waiting for other jobs to finish...
error: failed to compile `du-dup v0.2.0`, intermediate artifacts can be found at `/var/folders/mw/gj7418356js6s29x7wn8crfmljy4wh/T/cargo-install.rwrMN5HWFtwH`

Caused by:
  build failed

Discard warning messages

Hi,

First let me illustrate the problem:

[Screenshot from 2019-05-22 14-53-35: diskus output interleaved with many "could not read contents of directory" warnings]

Can we just discard those "could not read contents of directory" warnings?
They seem to be useless, because running diskus as root produces the same output.

Support resident size as well as apparent size or clarify description

Currently, diskus's description is misleading as you explained here. It is actually not an alternative to du -sh, but to du -sb. I suggest that support be added for resident size (i.e., disk usage) in addition to apparent size. If you don't want to add support for this, please at least change the description and the README file to clarify that only apparent size is computed and not resident size.
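The distinction in question, shown with du itself (GNU coreutils; diskus's view corresponds to --apparent-size / -b): a 5-byte file has apparent size 5 but typically occupies a full filesystem block.

```shell
f=$(mktemp)
printf 'hello' > "$f"
apparent=$(du --apparent-size -B1 "$f" | cut -f1)   # du -sb / diskus view
on_disk=$(du -B1 "$f" | cut -f1)                    # du -sh view (512-byte blocks, scaled)
echo "apparent=$apparent on_disk=$on_disk"
rm -f "$f"
```

Adding a resident-size mode would amount to summing block counts rather than file lengths during the walk.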

Missing manpage

There's currently no man page for the diskus binary, we should add one!

Support Windows as target OS

I was trying to build diskus on Windows 7 (Rust 1.37.0, stable-x86_64-pc-windows-msvc toolchain). This failed, as the trait std::os::unix::fs::MetadataExt is used in src/walk.rs.

Here is the full error message of the failing build from my machine.
$ cargo build
   Compiling diskus v0.5.0 (C:\Users\fabian\rust\diskus)
error[E0433]: failed to resolve: could not find `unix` in `os`
 --> src\walk.rs:3:14
  |
3 | use std::os::unix::fs::MetadataExt;
  |              ^^^^ could not find `unix` in `os`

warning: unused import: `std::os::unix::fs::MetadataExt`
 --> src\walk.rs:3:5
  |
3 | use std::os::unix::fs::MetadataExt;
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: #[warn(unused_imports)] on by default

error[E0599]: no method named `nlink` found for type `std::fs::Metadata` in the current scope
  --> src\walk.rs:27:63
   |
27 |             let unique_id = if metadata.is_file() && metadata.nlink() > 1 {
   |                                                               ^^^^^

error[E0599]: no method named `dev` found for type `std::fs::Metadata` in the current scope
  --> src\walk.rs:28:40
   |
28 |                 Some(UniqueID(metadata.dev(), metadata.ino()))
   |                                        ^^^

error[E0599]: no method named `ino` found for type `std::fs::Metadata` in the current scope
  --> src\walk.rs:28:56
   |
28 |                 Some(UniqueID(metadata.dev(), metadata.ino()))
   |                                                        ^^^

error: aborting due to 4 previous errors

Some errors have detailed explanations: E0433, E0599.
For more information about an error, try `rustc --explain E0433`.
error: Could not compile `diskus`.

To learn more, run the command again with --verbose.  

I dug into the source and came up with a patch that allowed me to compile and run the tool. Basically, I refactored the generation of UniqueID into a function and made an explicit distinction between unix and windows when implementing that function.

Here is the diff of the patch.
$ git diff  src
diff --git a/src/walk.rs b/src/walk.rs
index 0125713..580a6e0 100644
--- a/src/walk.rs
+++ b/src/walk.rs
@@ -1,9 +1,11 @@
 use std::collections::HashSet;
 use std::fs;
-use std::os::unix::fs::MetadataExt;
 use std::path::PathBuf;
 use std::thread;

+#[cfg(unix)]
+use std::os::unix::fs::MetadataExt;
+
 use crossbeam_channel as channel;

 use rayon;
@@ -18,17 +20,27 @@ enum Message {
     CouldNotReadDir(PathBuf),
 }

+#[cfg(unix)]
+fn generate_unique_id(metadata: &std::fs::Metadata) -> Option<UniqueID> {
+    // If the entry has more than one hard link, generate
+    // a unique ID consisting of device and inode in order
+    // not to count this entry twice.
+    if metadata.is_file() && metadata.nlink() > 1 {
+        Some(UniqueID(metadata.dev(), metadata.ino()))
+    } else {
+        None
+    }
+}
+
+#[cfg(windows)]
+fn generate_unique_id(_metadata: &std::fs::Metadata) -> Option<UniqueID> {
+    None
+}
+
 fn walk(tx: channel::Sender<Message>, entries: &[PathBuf]) {
     entries.into_par_iter().for_each_with(tx, |tx_ref, entry| {
         if let Ok(metadata) = entry.symlink_metadata() {
-            // If the entry has more than one hard link, generate
-            // a unique ID consisting of device and inode in order
-            // not to count this entry twice.
-            let unique_id = if metadata.is_file() && metadata.nlink() > 1 {
-                Some(UniqueID(metadata.dev(), metadata.ino()))
-            } else {
-                None
-            };
+            let unique_id = generate_unique_id(&metadata);

             let size = metadata.len();

While it serves my immediate use case, the Windows implementation is incorrect/incomplete, as there is no handling of NTFS hard links and junctions. A quick search on crates.io revealed https://crates.io/crates/junction, which would probably be a suitable candidate for a proper implementation.

The issue I'd like to raise is whether diskus should have Windows support at all. If so, is this kind of #[cfg(...)]-based branching acceptable for diskus? I found a discussion in the Rust forum that contains links to similar branching in the libc crate source.

Release on crates.io

Installing dup currently requires cloning the repo and then building. Could you maybe publish it on crates.io so we can just do cargo install dup? dup is taken, but you could go with du-dup (like dust uses du-dust)?

Fails to compile on Ubuntu 20.04 Focal

b7c.diskus.8cbe22893c05f668-cgu.0.rcgu.o: error adding symbols: file in wrong format
collect2: error: ld returned 1 exit status

error: could not compile diskus (bin "diskus") due to previous error

[Feature] Support hidden, ignore-files

It's not uncommon that I want to run diskus on some directory, but want it to ignore certain types of files:

--hidden

It would be nice to be able to ignore hidden files when computing the total. The flag --[no-]hidden should be available to control this. The default behavior of diskus should not change.

  • Hidden directories
    • .git directories
  • Hidden files
    • No specific use-case

--ignore-file (and related things from RipGrep)

Being able to specify an --ignore-file or --[no-]ignore-vcs flag in order to not descend into e.g. a build/ directory.

Ripgrep has several related options (for .gitignore, .rgignore, and a general --ignore-file) that might be useful to co-opt.

The default behavior of diskus should not change.

Just Make a Pull Request

My employer does not permit FOSS contributions, otherwise I would take a stab at doing this myself.

Please upgrade the version of libc in Cargo.lock

I'm trying to make diskus package for Arch RISC-V, but the old version (currently 0.2.62) of libc in Cargo.lock failed to compile on riscv64 architecture. This can be fixed by upgrading libc in Cargo.lock, so we can make PKGBUILD work without patching libc version before building.

There is no --exclude=/path

This is useful if you don't want certain directories counted. du has an --exclude=PATTERN option which excludes matching paths from the total.
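For reference, the du behaviour being requested (GNU du; --exclude takes a shell pattern matched against each file name):

```shell
# A directory with one subtree we want excluded from the total.
root=$(mktemp -d)
mkdir -p "$root/keep" "$root/skip"
head -c 4096 /dev/zero > "$root/keep/a"
head -c 8192 /dev/zero > "$root/skip/b"
with=$(du -sb "$root" | cut -f1)
without=$(du -sb --exclude=skip "$root" | cut -f1)   # skip/ not counted
echo "$with vs $without"
```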
