
Comments (3)

jmurty commented on August 28, 2024

Hi, I have done some testing and the problem is triggered by double-quote characters (") in filenames: Git commands quote and escape these names in their output, and that output is then fed into follow-on commands within Transcrypt.

At multiple places in the Transcrypt script we use a command like this to find encrypted files (tracked files to which the crypt filter is applied):

git -c core.quotePath=false ls-files | git -c core.quotePath=false check-attr --stdin filter | awk 'BEGIN { FS = ":" }; /crypt$/{ print $1 }'

These commands try to avoid problems with special/unusual characters in filenames by disabling Git's core.quotePath config setting, but unfortunately per the documentation:

Double-quotes, backslash and control characters are always escaped regardless of the setting of this variable.

Running the above command on a repo containing files with quote characters produces output like this:

"ismdou - Know the identifiers of the table \"FOO\".py.secret"
sensitive_file

It is because the ismdou… file gets quoted and quote-escaped in this way that follow-up commands fail.

Specifically for your reported issue, the pre-commit hook fails because it passes the quoted file path to the command git show :"${secret_file}" to determine whether the file is properly encrypted. The outer quotes and backslashes are treated as literal parts of the path, so the command is effectively the following, which doesn't work: git show :'"ismdou - Know the identifiers of the table \"FOO\".py.secret"'
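To make the failure mode concrete, here is a minimal sketch that needs no repository. The show_arg function is a hypothetical stand-in for git show that simply prints the single argument it receives; the variable names are also made up for illustration.

```bash
# The path exactly as Git printed it: the outer quotes and the backslashes
# are ordinary characters inside the variable, not shell syntax.
quoted_path='"ismdou - Know the identifiers of the table \"FOO\".py.secret"'

# The name of the file as it actually exists in the index.
real_path='ismdou - Know the identifiers of the table "FOO".py.secret'

# Hypothetical stand-in for `git show`: print the one argument it was given.
show_arg() { printf '%s\n' "$1"; }

arg_from_quoted=$(show_arg ":${quoted_path}")
arg_from_real=$(show_arg ":${real_path}")

printf 'from quoted output: %s\n' "$arg_from_quoted"
printf 'from real name:     %s\n' "$arg_from_real"
```

The two arguments differ, so a lookup using the first one asks Git for a path that does not exist.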

Although the pre-commit hook is the culprit in this case, the same problem manifests for the --list command (shows over-quoted output) and the --show-raw command (will not work if you name the file, and also fails if you use the wildcard --show-raw=* option):

./transcrypt --show-raw=*
==> "ismdou - Know the identifiers of the table \"FOO\".py.secret" <==
fatal: path '"ismdou - Know the identifiers of the table \"FOO\".py.secret"' does not exist in 'HEAD'

So I can tell what the problem is, but how to fix it is less clear. We need to do one of:

  • come up with an alternative to the piped commands I listed at the top, so we can identify and act on encrypted files from Git's file listing. I think this is the fix we should pursue, but it won't be easy (see below)
  • somehow remove the outer quotes and un-escape inner quotes for over-quoted filenames before we feed them to git show commands. I doubt there is any sensible way to do this in bash.

I have experimented with using the -z option instead of -c core.quotePath=false to identify encrypted files via the ls-files and check-attr Git commands. The -z option uses NUL to delimit filenames, instead of newlines, but importantly avoids quoting file names at all (emphasis mine):

Without the -z option, pathnames with "unusual" characters are quoted as explained for the configuration variable core.quotePath (see git-config[1]). Using -z the filename is output verbatim and the line is terminated by a NUL byte.

The trick will be figuring out how to string together the commands to work with NUL-delimited outputs, especially given that the command sequence relies on adding – then removing – the suffix : filter: crypt to the same line as the original filename of encrypted files.

So far I have the following, which avoids the unwanted filename quoting by using -z with the first ls-files command, but then unfortunately adds it back in with the subsequent check-attr command:

git ls-files -z | awk 'BEGIN { RS = "\0" }; { print $0 }' | git check-attr --stdin filter

I suspect a proper fix will require replacing the single-line piped commands with a function that iterates over unquoted filenames (thanks to the -z option), runs the check-attr command on each one individually to identify encrypted files, then adds the unquoted filename of each encrypted file to an output string.
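As a rough sketch of what such a function might look like, here is a variation on that idea. Rather than running check-attr once per file, it assumes check-attr is also given -z, in which case the Git documentation describes its output as NUL-delimited records of the form path NUL attribute NUL value NUL. The function name filter_crypt_paths is hypothetical, and this is not necessarily the approach taken in the eventual fix.

```bash
#!/usr/bin/env bash
# Sketch only. Reads NUL-delimited <path> <attribute> <value> triples on
# stdin and prints each path whose "filter" attribute is "crypt".
# Paths are printed verbatim, one per line, so names containing double
# quotes survive intact (names containing newlines would still need a
# NUL-delimited output instead).
filter_crypt_paths() {
  local path attr value
  while IFS= read -r -d '' path &&
        IFS= read -r -d '' attr &&
        IFS= read -r -d '' value; do
    if [[ $attr == filter && $value == crypt ]]; then
      printf '%s\n' "$path"
    fi
  done
}

# Intended use inside a repository (assumes check-attr accepts -z with --stdin):
#   git ls-files -z | git check-attr --stdin -z filter | filter_crypt_paths
```

Because every field is read up to the next NUL byte, no quoting or un-escaping step is needed at any point in the pipeline.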

from transcrypt.

jmurty commented on August 28, 2024

I've started working towards a fix for this in PR #174

In my manual testing the PR works for most situations, but tests are failing for a few use-cases that have been broken by my changes.


jmurty commented on August 28, 2024

The potential fix on branch 173-handle-quotes-in-filenames is now passing all unit tests and works for my manual testing of file names containing double-quotes. Can you try it and see if it works for you @andreineculau?

