
Comments (7)

steph-nb commented on July 29, 2024

Hi Steffen,

I just realized that it is mainly empty files causing this.
So maybe you could either avoid sending them to siegfried at all, or handle siegfried's "[ERROR] empty source" differently?

Thanks and BR

from filedriller.

steffenfritz commented on July 29, 2024

Hi,

sorry for not replying for so long, I had some busy weeks.

I guess you cannot, but I'll ask anyway: can you provide some sample files?

steph-nb commented on July 29, 2024

Hi,

as stated in my last comment, it is mainly empty files causing this.
So just create an empty file, e.g. named empty,
and run friller over the file directly or over the folder containing it.
You will get this error from siegfried: empty source

Here is my corresponding dev log:

Starting: C:\Users\User\go\bin\dlv.exe dap --listen=127.0.0.1:61357 from d:\Go_projects\siegfried-main\cmd\sf
DAP server listening at: 127.0.0.1:61357
Type 'dlv help' for list of commands.
[FILE] D:\Disks\00_Extractions\test\empty
[ERROR] empty source

siegfried : 1.9.6
scandate : 2023-02-02T16:03:40+01:00
signature : default.sig
created : 2022-11-06T17:44:52+01:00
identifiers :

  • name : 'pronom'
    details : 'DROID_SignatureFile_V109.xml; container-signature-20221102.xml'
  • name : 'tika'
    details : 'tika-mimetypes.xml'
  • name : 'freedesktop.org'
    details : 'freedesktop.org.xml'

filename : 'D:\Disks\00_Extractions\test\empty'
filesize : 0
modified : 2023-02-02T15:59:17+01:00
errors : 'empty source'
matches :

  • ns : 'pronom'
    id : 'UNKNOWN'
    format :
    version :
    mime :
    basis :
    warning : 'no match'
  • ns : 'tika'
    id : 'UNKNOWN'
    format :
    mime : 'UNKNOWN'
    basis :
    warning : 'no match'
  • ns : 'freedesktop.org'
    id : 'UNKNOWN'
    format :
    mime : 'UNKNOWN'
    basis :
    warning : 'no match'

filename : 'D:\Disks\00_Extractions\test\not_empty.txt'
filesize : 9
modified : 2023-02-02T15:59:36+01:00
errors :
matches :

  • ns : 'pronom'
    id : 'x-fmt/111'
    format : 'Plain Text File'
    version :
    mime : 'text/plain'
    basis : 'extension match txt; text match ASCII'
    warning :
  • ns : 'tika'
    id : 'text/plain'
    format :
    mime : 'text/plain'
    basis : 'extension match txt; text match ASCII'
    warning : 'match on filename and text only; byte/xml signatures for this format did not match'
  • ns : 'freedesktop.org'
    id : 'text/plain'
    format : 'plain text document'
    mime : 'text/plain'
    basis : 'extension match txt; text match ASCII'
    warning : 'match on filename and text only; byte/xml signatures for this format did not match'
    Process 15660 has exited with status 0
    Detaching

I think you should catch that siegfried error and not log it as an error in the error.log
(ERROR: 2023/02/02 16:10:34 process.go:83: "D:\Disks\00_Extractions\test\empty",,,,,,,,)

What do you think?

steph-nb commented on July 29, 2024

In the CSV output the file is still handled quite well:

Filename           : D:\Disks\00_Extractions\test\empty
SizeInByte         :
Registry           :
RegistryIdentifier :
Name               :
Version            :
MIME               :
ByteMatch          :
IdentificationNote :
SHA1               : da39a3ee5e6b4b0d3255bfef95601890afd80709
UUID               : 7ccec6c3-d8b0-42c5-a256-ab75336bbc97
AccessTime         : 2023-02-02 15:59:17.9574559 +0100 CET
ModTime            : 2023-02-02 15:59:17.9574559 +0100 CET
ChangeTime         : 2023-02-02 16:03:32.8874471 +0100 CET
BirthTime          : 2023-02-02 15:59:17.9574559 +0100 CET
inNSRL             : TRUE
Entropy            :

But just as a suggestion: what about setting SizeInByte to 0?
This would make sure that no file (in contrast to directories) has a SizeInByte which is NULL.
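
As a rough sketch of that idea in Go: the function sizeField below is hypothetical (not friller's code) and simply shows a size column that stays empty for directories but is always a number for files, including "0" for empty ones.

```go
package main

import (
	"fmt"
	"strconv"
)

// sizeField returns the text for the SizeInByte column.
// Directories get an empty field; files always get a number, "0" if empty.
// The function name and signature are assumptions made for this sketch.
func sizeField(isDir bool, size int64) string {
	if isDir {
		return ""
	}
	return strconv.FormatInt(size, 10)
}

func main() {
	fmt.Printf("empty file: %q\n", sizeField(false, 0))
	fmt.Printf("9-byte file: %q\n", sizeField(false, 9))
	fmt.Printf("directory: %q\n", sizeField(true, 4096))
}
```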

steffenfritz commented on July 29, 2024

Let's not mix things up, I am still trying to understand the first issue :)

  1. In the screenshot, the files highlighted in yellow are not empty, but some metadata is missing.
  2. friller handles empty files correctly in the output csv
  3. friller writes an entry into the error log for each empty file

To 1: This looks like a problem with container formats
To 2: Can you confirm?
To 3: This is annoying, but not a bug. However, this could (and should) be changed.

Writing size 0 into SizeInByte is also something that should be implemented.

steph-nb commented on July 29, 2024

OK, agree :-)

  1. The yellow-highlighted issue with container data: it is an issue with opening the CSV in Excel, maybe caused by the fact that in Switzerland the standard CSV separator is ";" and not ",". So when double-clicking the friller-output.csv, Excel opens it as a text file with all data in one column.
    What happened is that I used the Excel functionality to split these data into several columns (in the German Excel: Daten -> 'Text in Spalten', i.e. Data -> 'Text to Columns'), and this is where the mistake happens. Obviously Excel cannot handle the data correctly (there is in fact even a warning, which I just clicked away, that data already existing there are being overwritten...). So it is not a friller issue. Sorry for that!!
    (And as a workaround with Excel I can delete all existing data and import the data again via Daten -> 'Aus Text', i.e. Data -> 'From Text'. Strangely, when fetching the CSV data like this, they get imported correctly...)

  2. Yes, I confirm, the output.csv is correct.

  3. But it handles all empty files as error cases in the log files - which in my eyes is misleading

Implementing SizeInByte=0 would be nice

Many thanks and sorry for not having analyzed thoroughly enough before...

steffenfritz commented on July 29, 2024
  • No empty files are reported to the error log
  • "0" is written into the size field
