marcelmay / hfsa
Hadoop FSImage Analyzer (HFSA)
License: Apache License 2.0
Testing only:
Tool incorrectly reports symlinks.
Occurs when very large files cause the bucket size to auto-extend (see #5).
Support Hadoop 2.x and 3.x fsimage versions.
Hadoop 3 changed the fsimage layout when it introduced the new HDFS erasure coding feature.
E.g. you might see only an NPE, but not that an OutOfMemoryError occurred while loading the image data.
This regression was introduced when fixing #40.
The current guava version 27.0-jre results in a shade warning, which is fixed in 27.0.1-jre:
[WARNING] failureaccess-1.0.jar, guava-27.0-jre.jar define 2 overlapping classes:
[WARNING] - com.google.common.util.concurrent.internal.InternalFutureFailureAccess
[WARNING] - com.google.common.util.concurrent.internal.InternalFutures
See guava issue 3302 for details.
Check if a directory contains any children.
Using an fsimage file, show all files and directories owned by a specified list of users.
Auto-resizing allocates exactly one bucket fewer than required.
Refactor inode handling (storing, parsing, extracting) into a new class to enable alternative implementations in the future (e.g. inodes as a single huge byte array or a (direct memory) ByteBuffer).
Files directly under root receive the full path including file name in FsVisitor.onFile(...).
Micro-tune file creation from ~9s to ~7s per 100 dirs / 26k files.
Update info.picocli:picocli from 3.7.0 to 3.9.6
Report the top small files
Simplifies test assertions.
Increase from default 8KiB to 64KiB, for FSImage files on non-SSD HDs.
Support CLI option -v for INFO or -vv for DEBUG output.
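As a rough sketch of how repeated `-v` flags could map to log levels, the following counts the flags and picks a level. The class and method names are hypothetical, and the `WARN` default for no flag is an assumption; the project itself uses picocli for option parsing, which this plain-Java sketch does not reproduce.

```java
final class Verbosity {
    // Counts verbosity: "-v" contributes 1, "-vv" contributes 2, and so on.
    static int count(String[] args) {
        int n = 0;
        for (String arg : args) {
            if (arg.matches("-v+")) {
                n += arg.length() - 1; // number of 'v' characters
            }
        }
        return n;
    }

    // Maps the verbosity count to a log level name.
    static String levelFor(int verbosity) {
        if (verbosity >= 2) {
            return "DEBUG";
        }
        if (verbosity == 1) {
            return "INFO";
        }
        return "WARN"; // assumed default when no -v is given
    }

    public static void main(String[] args) {
        System.out.println(levelFor(count(args)));
    }
}
```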
I cannot find the change time and access time of a file. Is there such a stat?
If not, can the project be upgraded to ?
Thank you.
Check if the value is null instead of first testing if the key exists:
```
if(map.contains(key)) {
... map.get(key);
}
```
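A minimal sketch of the single-lookup pattern, assuming a `java.util.Map` that never stores null values (so a null result from `get` reliably means the key is absent). The class and method names are hypothetical and not taken from hfsa; `containsKey` is the `java.util.Map` spelling of the check.

```java
import java.util.HashMap;
import java.util.Map;

final class SingleLookupExample {
    // Double lookup: containsKey() followed by get() looks the key up twice.
    static long sizeWithTwoLookups(Map<String, Long> sizes, String key) {
        if (sizes.containsKey(key)) {
            return sizes.get(key);
        }
        return 0L;
    }

    // Single lookup: call get() once and test the result for null.
    // Equivalent to the above only if the map never holds null values.
    static long sizeWithOneLookup(Map<String, Long> sizes, String key) {
        Long value = sizes.get(key);
        return value != null ? value : 0L;
    }

    public static void main(String[] args) {
        Map<String, Long> sizes = new HashMap<>();
        sizes.put("/user/alice", 42L);
        System.out.println(sizeWithOneLookup(sizes, "/user/alice"));
        System.out.println(sizeWithOneLookup(sizes, "/user/bob"));
    }
}
```

Both methods return the same result for maps without null values; the second simply avoids the redundant lookup.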
Update overall dependencies:
Test scope:
Print inode content details, e.g. for verification or debugging.
Support printing out the hfsa version via option -V
For consistency with java.io.File and unix, support POSIX-like double slashes in paths when querying FSImages.
Path examples:
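One way to treat repeated slashes the way `java.io.File` does on Unix is to collapse them before the lookup. The sketch below is illustrative only: `PathNormalizer` is a hypothetical helper, and whether hfsa normalizes paths exactly like this is an assumption.

```java
final class PathNormalizer {
    // Collapses runs of '/' into a single '/' and drops a trailing '/',
    // except when the whole path is the root "/".
    static String normalize(String path) {
        String collapsed = path.replaceAll("/{2,}", "/");
        if (collapsed.length() > 1 && collapsed.endsWith("/")) {
            collapsed = collapsed.substring(0, collapsed.length() - 1);
        }
        return collapsed;
    }

    public static void main(String[] args) {
        for (String p : new String[] {"/", "//", "/user//foo", "/user/foo/"}) {
            System.out.println(p + " -> " + normalize(p));
        }
    }
}
```

With this helper, `//` resolves to `/`, and `/user//foo` and `/user/foo/` both resolve to `/user/foo`.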
| Dependency | Version |
|---|---|
| org.apache.hadoop:hadoop-common | 2.7.3 -> 2.7.5 |
| org.apache.hadoop:hadoop-hdfs | 2.7.3 -> 2.7.5 |
| commons-io:commons-io | 2.5 -> 2.6 |
For reuse, e.g. in https://github.com/marcelmay/hadoop-hdfs-fsimage-exporter
Print a warning message when the max JVM heap is less than 2x the fsimage size, and recommend a JVM heap size.
The root INode always has the same INode ID and can easily be cached to prevent repeated reconstruction from the INode bytes cache.
Memory cost for 1 million inodes is about 8 MiB (~ number of inodes * sizeof(long)).
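A minimal sketch of such a check, using `Runtime.maxMemory()` and the file length. The class name, the message wording, and the choice to recommend exactly 2x the fsimage size are assumptions for illustration, not the tool's actual behavior.

```java
import java.io.File;

final class HeapCheck {
    // True when the max heap is below twice the fsimage size.
    static boolean isHeapTooSmall(long maxHeapBytes, long fsImageBytes) {
        return maxHeapBytes < 2 * fsImageBytes;
    }

    // Assumed recommendation: twice the fsimage size, in whole MiB (rounded up).
    static long recommendedHeapMiB(long fsImageBytes) {
        long mib = 1024L * 1024L;
        return (2 * fsImageBytes + mib - 1) / mib;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: HeapCheck <fsimage-file>");
            return;
        }
        long imageBytes = new File(args[0]).length();
        long maxHeap = Runtime.getRuntime().maxMemory();
        if (isHeapTooSmall(maxHeap, imageBytes)) {
            System.err.printf(
                "WARNING: max heap (%d bytes) is less than 2x fsimage size (%d bytes); "
                    + "consider -Xmx%dm%n",
                maxHeap, imageBytes, recommendedHeapMiB(imageBytes));
        }
    }
}
```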
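The estimate can be checked with a few lines of arithmetic. The sketch assumes one `long` (8 bytes) per inode and ignores any array or object overhead; `InodeIndexCost` is a hypothetical name. For 1 million inodes this gives 8,000,000 bytes, or roughly 7.63 MiB.

```java
final class InodeIndexCost {
    // One long per inode; array header and other overhead are ignored.
    static long bytesFor(long inodeCount) {
        return inodeCount * Long.BYTES;
    }

    static double toMiB(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long bytes = bytesFor(1_000_000L);
        System.out.printf("%d bytes = %.2f MiB%n", bytes, toMiB(bytes));
    }
}
```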
Add helper for handling block storage policy.
When calling computeBucketUpperBorders on a newly created instance:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
at de.m3y.hadoop.hdfs.hfsa.util.SizeBucket$Bucket2nModel.computeBucketUpperBorders(SizeBucket.java:78)
at de.m3y.hadoop.hdfs.hfsa.util.SizeBucket.computeBucketUpperBorders(SizeBucket.java:124)
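For illustration, a power-of-two bucket model can be written so that a newly created, empty instance yields an empty border array instead of throwing, and so that auto-resizing always allocates enough buckets. This is a hypothetical sketch, not the actual `SizeBucket` code; the class name, the bucket layout, and the method names are assumptions.

```java
import java.util.Arrays;

final class PowerOfTwoBuckets {
    private long[] counts = new long[0];

    // Bucket 0 holds size 0; bucket i (i >= 1) holds sizes in [2^(i-1), 2^i).
    // Assumes 0 <= size < 2^62.
    static int bucketIndex(long size) {
        return size == 0 ? 0 : 64 - Long.numberOfLeadingZeros(size);
    }

    void add(long size) {
        int idx = bucketIndex(size);
        if (idx >= counts.length) {
            // Grow to idx + 1 entries so that index idx is valid;
            // growing to only idx entries would be one bucket short.
            counts = Arrays.copyOf(counts, idx + 1);
        }
        counts[idx]++;
    }

    long count(int bucket) {
        return bucket < counts.length ? counts[bucket] : 0L;
    }

    // Exclusive upper border of each allocated bucket;
    // an empty array when nothing has been added yet.
    long[] upperBorders() {
        long[] borders = new long[counts.length];
        for (int i = 0; i < borders.length; i++) {
            borders[i] = 1L << i;
        }
        return borders;
    }

    public static void main(String[] args) {
        PowerOfTwoBuckets buckets = new PowerOfTwoBuckets();
        System.out.println(Arrays.toString(buckets.upperBorders()));
        buckets.add(5);
        System.out.println(Arrays.toString(buckets.upperBorders()));
    }
}
```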
Update com.google.guava:guava:21.0 to latest 24.0-jre
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException
at java.lang.System.arraycopy(Native Method)
at de.m3y.hadoop.hdfs.hfsa.tool.FormatUtil.boxAndPadWithZeros(FormatUtil.java:77)
at de.m3y.hadoop.hdfs.hfsa.tool.HdfsFSImageTool.doSummary(HdfsFSImageTool.java:134)
at de.m3y.hadoop.hdfs.hfsa.tool.HdfsFSImageTool.doPerform(HdfsFSImageTool.java:97)
at de.m3y.hadoop.hdfs.hfsa.tool.HdfsFSImageTool.main(HdfsFSImageTool.java:314)
Currently, the hfsa lib fails to process compressed fsimages (see marcelmay/hadoop-hdfs-fsimage-exporter/issues/34).
FSImageUtil.wrapInputStreamForCompression
Provide a CLI toggle for the fsimage generator for enabling/disabling compression.
Completely skip sections of no interest
Fine tune for smaller footprint, as Hadoop Maven dependencies (transitively) pull in a lot.