
credential-digger's Introduction


Credential Digger

Credential Digger is a GitHub scanning tool that identifies hardcoded credentials (passwords, API keys, secret keys, tokens, personal information, etc.) and filters out false positives through machine learning models.


Why

In data protection, one of the most critical threats is represented by hardcoded (or plaintext) credentials in open-source projects. Several tools are already available to detect leaks in open-source platforms, but the diversity of credentials (depending on multiple factors such as the programming language, code development conventions, or developers' personal habits) is a bottleneck for the effectiveness of these tools. Their lack of precision leads to a very high number of pieces of code incorrectly detected as leaked secrets. Data wrongly detected as a leak is called false positive data, and it makes up the vast majority of the data detected by currently available tools.

The goal of Credential Digger is to reduce the amount of false positive data in the output of the scanning phase by leveraging machine learning models.

Architecture

The tool supports several scan flavors: public and private repositories on GitHub and GitLab, pull requests, wiki pages, GitHub organizations, local git repositories, and local files and folders. Please refer to the Wiki for the complete documentation.

For the complete description of the approach of Credential Digger (versions <4.4), you can read this publication.

@InProceedings {lrnto-icissp21,
    author = {S. Lounici and M. Rosa and C. M. Negri and S. Trabelsi and M. Önen},
    booktitle = {Proc. of the 8th The International Conference on Information Systems Security and Privacy  (ICISSP)},
    title = {Optimizing Leak Detection in Open-Source Platforms with Machine Learning Techniques},
    month = {February},
    day = {11-13},
    year = {2021}
}

Requirements

Credential Digger supports Python >= 3.8 and < 3.13, and works only on Linux and macOS systems. If you don't meet these requirements, consider running a Docker container (which also includes a user interface).
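The requirements above can be checked programmatically before attempting an installation. The following is a minimal sketch using only the standard library; `meets_requirements` is a hypothetical helper written for this example and is not part of credentialdigger.

```python
import platform
import sys

# platform.system() reports "Darwin" on macOS
SUPPORTED_SYSTEMS = {"Linux", "Darwin"}


def meets_requirements(version=None, system=None):
    """Return True when the Python version and OS match the stated requirements."""
    # Default to the running interpreter and operating system
    version = tuple(version or sys.version_info[:2])
    system = system or platform.system()
    return (3, 8) <= version < (3, 13) and system in SUPPORTED_SYSTEMS
```

Calling `meets_requirements()` without arguments inspects the current environment; when it returns False, the Docker container is the suggested fallback.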

Download and Installation

First, you need to install some dependencies (namely, build-essential and python3-dev). There is no need to explicitly install hyperscan anymore.

sudo apt install -y build-essential python3-dev

Then, you can install the Credential Digger module using pip.

pip install credentialdigger

For ARM machines (e.g., new MacBooks), installation is possible by following this guide.

How to run

Add rules

One of the core components of Credential Digger is the regular expression scanner. You can choose the regular expression rules you want (just follow the template here). We provide a list of patterns in the rules.yml file, which are included in the UI. The scanner supports rules of 4 different categories: password, token, crypto_key, and other.

Before the very first scan, you need to add the rules that will be used by the scanner. This step is only needed once.

credentialdigger add_rules --sqlite /path/to/data.db /path/to/rules.yaml
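Because the scanner accepts only the four categories listed above, a set of rules can be sanity-checked before it is loaded. The sketch below assumes each rule is a dictionary with `regex` and `category` keys; the template is not reproduced in this document, so that shape is an assumption, and `check_rules` is a hypothetical helper rather than part of credentialdigger.

```python
ALLOWED_CATEGORIES = {"password", "token", "crypto_key", "other"}


def check_rules(rules):
    """Return (index, problem) pairs for rules that look invalid."""
    problems = []
    for index, rule in enumerate(rules):
        # Assumed rule shape: {"regex": "...", "category": "..."}
        category = rule.get("category")
        if category not in ALLOWED_CATEGORIES:
            problems.append((index, f"unknown category: {category!r}"))
        if not rule.get("regex"):
            problems.append((index, "missing or empty regex"))
    return problems
```

An empty list means every rule names a supported category and carries a non-empty pattern; the patterns themselves are not validated here.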

Scan a repository

After adding the rules, you can scan a repository:

credentialdigger scan https://github.com/user/repo --sqlite /path/to/data.db

Machine learning models are not mandatory, but highly recommended in order to reduce the manual effort of reviewing the result of a scan:

credentialdigger scan https://github.com/user/repo --sqlite /path/to/data.db --models PathModel PasswordModel

Like the models, the similarity feature is not mandatory, but it is highly recommended in order to reduce the manual effort of assessing the discoveries after a scan:

credentialdigger scan https://github.com/user/repo --sqlite /path/to/data.db --similarity --models PathModel PasswordModel
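When scans are launched from a script, the command lines above can be assembled from their parts. This is a minimal sketch that uses only the flags shown in this section (`--sqlite`, `--models`, `--similarity`); `build_scan_command` is a hypothetical helper, not part of the tool.

```python
import shlex


def build_scan_command(repo_url, sqlite_path, models=(), similarity=False):
    """Assemble the argument list for `credentialdigger scan`."""
    command = ["credentialdigger", "scan", repo_url, "--sqlite", sqlite_path]
    if similarity:
        command.append("--similarity")
    if models:
        # --models takes one or more model names
        command += ["--models", *models]
    return command


command = build_scan_command(
    "https://github.com/user/repo",
    "/path/to/data.db",
    models=["PathModel", "PasswordModel"],
    similarity=True,
)
# Print the command as a shell-quoted string
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where credentialdigger is installed and the rules have already been added.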

Docker container

To have a ready-to-use instance of Credential Digger, with a user interface, you can use a docker container. This option requires the installation of Docker and Docker Compose.

Credential Digger is published on Docker Hub. You can pull the latest release:

sudo docker pull saposs/credentialdigger

Alternatively, you can build and run the containers with docker compose:

git clone https://github.com/SAP/credential-digger.git
cd credential-digger
cp .env.sample .env
docker compose up --build

The UI is available at http://localhost:5000/

It is preferable to have at least 8 GB of RAM free when using docker containers.

Advanced Installation

Credential Digger is modular, and offers a wide choice of components and adaptations.

Build from source

After installing the dependencies listed above, you can install Credential Digger as follows.

Configure a virtual environment for Python 3 (optional) and clone the main branch of the project:

virtualenv -p python3 ./venv
source ./venv/bin/activate

git clone https://github.com/SAP/credential-digger.git
cd credential-digger

Install the tool from source:

pip install .

Then, you can add the rules and scan a repository as described above.

External postgres database

You can also run a ready-to-use instance of Credential Digger with the UI that uses a dockerized postgres database instead of a local sqlite one:

git clone https://github.com/SAP/credential-digger.git
cd credential-digger
cp .env.sample .env
vim .env  # set credentials for postgres
docker compose -f docker-compose.postgres.yml up --build

WARNING: Unlike the sqlite version, this setup requires configuring the .env file with the credentials for postgres (by modifying POSTGRES_USER, POSTGRES_PASSWORD and POSTGRES_DB).

More advanced users may also wish to use an external postgres database instead of the dockerized one we provide in our docker-compose.postgres.yml.
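A quick check that the three variables are actually set can save a failed container start. The sketch below parses simple `KEY=value` lines only (no variable expansion or multi-line values); `missing_postgres_keys` is a hypothetical helper, and the parsing rules are a simplifying assumption about the .env format.

```python
REQUIRED_KEYS = ("POSTGRES_USER", "POSTGRES_PASSWORD", "POSTGRES_DB")


def missing_postgres_keys(env_text):
    """Return the required keys that are absent or empty in .env-style text."""
    values = {}
    for line in env_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

For example, `missing_postgres_keys(open(".env").read())` returns an empty list when all three credentials have a value.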

How to update the project

If you are already running Credential Digger and you want to update it to a newer version, you can refer to the wiki for the needed steps.

Python library usage

After installing credentialdigger from pip (or from source), you can instantiate the client and scan a repository.

Instantiate the client that matches the chosen database:

# Using a Sqlite database
from credentialdigger import SqliteClient
c = SqliteClient(path='/path/to/data.db')

# Using a postgres database
from credentialdigger import PgClient
c = PgClient(dbname='my_db_name',
             dbuser='my_user',
             dbpassword='my_password',
             dbhost='localhost_or_ip',
             dbport=5432)

Add rules

Add rules before launching your first scan.

c.add_rules_from_file('/path/to/rules.yml')

Scan a repository

new_discoveries = c.scan(repo_url='https://github.com/user/repo',
                         models=['PathModel', 'PasswordModel'],
                         debug=True)

WARNING: Make sure you add the rules before your first scan.
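The ordering constraint in the warning can be captured in a small wrapper for the very first scan. This sketch calls only the methods shown in this section (`add_rules_from_file` and `scan`); `first_scan` itself is a hypothetical helper, and it should not be reused for later scans because adding the rules is needed only once.

```python
def first_scan(client, rules_path, repo_url, models=("PathModel", "PasswordModel")):
    """Add the rules, then scan. `client` is a SqliteClient or a PgClient."""
    # Rules must be in the database before the scanner runs
    client.add_rules_from_file(rules_path)
    return client.scan(repo_url=repo_url, models=list(models), debug=True)
```

With the sqlite client instantiated as `c`, a call would look like `first_scan(c, '/path/to/rules.yml', 'https://github.com/user/repo')`.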

Please refer to the Wiki for further information on the arguments.

CLI - Command Line Interface

Credential Digger also offers a simple CLI to scan a repository. The CLI supports both sqlite and postgres databases. In the case of postgres, you need either to export the credentials needed to connect to the database as environment variables or to set up a .env file. In the case of sqlite, the path of the db must be passed as an argument.

Refer to the Wiki for all the supported commands and their usage.

Microsoft Visual Studio Plugin

The VS Code extension for Credential Digger is a free IDE extension that lets you detect secrets and credentials in your code before they get leaked. Like a spell checker, the extension scans your files using Credential Digger and highlights the secrets as you write code, so you can fix them before the code is even committed.

The VS Code extension can be downloaded from the Microsoft VS Code Marketplace.


pre-commit hook

Credential Digger can be used with the pre-commit framework to scan staged files before each commit.

Please refer to the Wiki page of the pre-commit hook for further information on its installation and execution.

CI/CD Pipeline Integration on Piper

Credential Digger is integrated with the CI/CD pipeline Piper in order to automate secret scans for your GitHub projects and repositories. To activate the Credential Digger step, please refer to the Credential Digger step documentation for Piper.

How Piper works with Jenkins

  • Once the step for credentialdigger is reached, its docker image is downloaded from the internal SAP registry. (A public instance will be available soon.)
  • Jenkins runs this container and runs a scan using credentialdigger, based on the step configuration. The step supports a full scan of a repo, a scan of a snapshot, and a scan of a pull request. It also supports orchestrators.
  • The result of the scan (an Excel file) is stored in the Jenkins workspace as an output artifact.
  • Jenkins destroys the container after the scan.

There is no need to deploy or install a Credential Digger instance.

Wiki

For further information, please refer to the Wiki.

Contributing

We invite your participation in the project through issues and pull requests. Please refer to the Contributing guidelines for how to contribute.

How to obtain support

As a first step, we suggest reading the wiki. If you don't find the answers you need, you can open an issue or contact the maintainers.


credential-digger's Issues

export leaks

Add a button to export the discoveries of a repo (only the leaks) to a CSV/Excel file.

Windows support

The module does not support the Windows OS. It would be great to add such a feature to make the project OS independent.

State: Work in progress... ⌛

ignore forks scanning a user

When scanning all the repositories of a user, add a parameter to the method (e.g., forks=False|True) to decide whether or not to scan the forks of that user (i.e., the repositories that the user forked and that appear in their repo list).

Rationale: usually, the forks are outdated and, if there are any commits by the user, these commits are generally few and never contain credentials.
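The requested behaviour could be sketched as a filter applied to the user's repository list before scanning. The example assumes each repository is a dictionary with a boolean `fork` field, as in GitHub REST API repository objects; `select_repos` is a hypothetical name and not an existing credentialdigger function.

```python
def select_repos(repos, forks=False):
    """Keep every repository when forks=True, otherwise drop the forked ones."""
    return [repo for repo in repos if forks or not repo.get("fork", False)]
```

With `forks=False` as the default, only repositories the user created would be scanned unless the caller opts in.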

error: "model path_model is missing meta.json file"

I built a small test example with the following Dockerfile, based on the one in /ui:

FROM python:3.7

RUN pip install Flask python-dotenv
RUN apt-get update && apt-get install -y libhyperscan5 libpq-dev

# Don't verify ssl for github enterprise
RUN git config --global http.sslverify false

# Install Credential Digger
RUN pip install credentialdigger

COPY rules.yml /creddig/rules.yml

WORKDIR /creddig

RUN python -m credentialdigger add_rules --sqlite creddig.db rules.yml

ENV path_model=https://github.com/SAP/credential-digger/releases/download/PM-v1.0.1/path_model-1.0.1.tar.gz
ENV snippet_model=https://github.com/SAP/credential-digger/releases/download/SM-v1.0.0/snippet_model-1.0.0.tar.gz

RUN python -m credentialdigger download path_model && \
    python -m credentialdigger download snippet_model

then:

$ docker build -t creddigger .
$ docker run --rm -it --entrypoint /bin/bash creddigger
# python -m credentialdigger scan https://github.com/some/repo --sqlite ./creddig.db --models PathModel SnippetModel
INFO:dotenv.main:Python-dotenv could not find configuration file .env.
INFO:credentialdigger.cli.cli:Database in use: Sqlite
INFO:credentialdigger.client:Detected 240 discoveries.
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.7/site-packages/credentialdigger/__main__.py", line 4, in <module>
    cli.main()
  File "/usr/local/lib/python3.7/site-packages/credentialdigger/cli/cli.py", line 114, in main
    args.func(client, args)
  File "/usr/local/lib/python3.7/site-packages/credentialdigger/cli/scan.py", line 108, in run
    git_token=args.git_token)
  File "/usr/local/lib/python3.7/site-packages/credentialdigger/client.py", line 676, in scan
    mm = ModelManager(model)
  File "/usr/local/lib/python3.7/site-packages/credentialdigger/models/model_manager.py", line 33, in __init__
    self.model = this_model(**kwargs)
  File "/usr/local/lib/python3.7/site-packages/credentialdigger/models/path_model/path_model.py", line 21, in __init__
    super().__init__(super().find_model_file(model, binary))
  File "/usr/local/lib/python3.7/site-packages/credentialdigger/models/base_model.py", line 95, in find_model_file
    model_meta = self.get_model_meta(model_path)
  File "/usr/local/lib/python3.7/site-packages/credentialdigger/models/base_model.py", line 52, in get_model_meta
    'Contact maintainers' % model_path.name)
FileNotFoundError: It seems that model path_model is  missing meta.json file.Contact maintainers

Am I doing something wrong here?

Tool upgrade documentation

Hello,

Can we please get detailed documentation on how to upgrade an existing Credential Digger installation to newly released versions?

Facing Error while trying to install

ERROR: Command errored out with exit status 1:
 command: /usr/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-uybuy2vv/fasttext_5ea4b956925846ce8a043b201688f9b6/setup.py'"'"'; __file__='"'"'/tmp/pip-install-uybuy2vv/fasttext_5ea4b956925846ce8a043b201688f9b6/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-7uzhe2bu/install-record.txt --single-version-externally-managed --compile --install-headers /usr/include/python3.6m/fasttext
     cwd: /tmp/pip-install-uybuy2vv/fasttext_5ea4b956925846ce8a043b201688f9b6/
Complete output (173 lines):
running install
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.6
creating build/lib.linux-x86_64-3.6/fasttext
copying python/fasttext_module/fasttext/FastText.py -> build/lib.linux-x86_64-3.6/fasttext
copying python/fasttext_module/fasttext/__init__.py -> build/lib.linux-x86_64-3.6/fasttext
creating build/lib.linux-x86_64-3.6/fasttext/util
copying python/fasttext_module/fasttext/util/__init__.py -> build/lib.linux-x86_64-3.6/fasttext/util
copying python/fasttext_module/fasttext/util/util.py -> build/lib.linux-x86_64-3.6/fasttext/util
creating build/lib.linux-x86_64-3.6/fasttext/tests
copying python/fasttext_module/fasttext/tests/test_script.py -> build/lib.linux-x86_64-3.6/fasttext/tests
copying python/fasttext_module/fasttext/tests/test_configurations.py -> build/lib.linux-x86_64-3.6/fasttext/tests
copying python/fasttext_module/fasttext/tests/__init__.py -> build/lib.linux-x86_64-3.6/fasttext/tests
running build_ext
creating tmp
gcc -pthread -Wno-unused-result -Wsign-compare -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/python3.6m -c /tmp/tmpzzi8n_9e.cpp -o tmp/tmpzzi8n_9e.o -std=c++14
gcc: error: unrecognized command line option ‘-std=c++14’
gcc -pthread -Wno-unused-result -Wsign-compare -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/python3.6m -c /tmp/tmp04af03se.cpp -o tmp/tmp04af03se.o -std=c++11
gcc -pthread -Wno-unused-result -Wsign-compare -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/python3.6m -c /tmp/tmpc0tax4t_.cpp -o tmp/tmpc0tax4t_.o -fvisibility=hidden
building 'fasttext_pybind' extension
creating build/temp.linux-x86_64-3.6
creating build/temp.linux-x86_64-3.6/python
creating build/temp.linux-x86_64-3.6/python/fasttext_module
creating build/temp.linux-x86_64-3.6/python/fasttext_module/fasttext
creating build/temp.linux-x86_64-3.6/python/fasttext_module/fasttext/pybind
creating build/temp.linux-x86_64-3.6/src
gcc -pthread -Wno-unused-result -Wsign-compare -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/lib/python3.6/site-packages/pybind11/include -I/usr/lib/python3.6/site-packages/pybind11/include -Isrc -I/usr/include/python3.6m -c python/fasttext_module/fasttext/pybind/fasttext_pybind.cc -o build/temp.linux-x86_64-3.6/python/fasttext_module/fasttext/pybind/fasttext_pybind.o -DVERSION_INFO="0.9.2" -std=c++11 -fvisibility=hidden
In file included from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/pytypes.h:12:0,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/cast.h:13,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/attr.h:13,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/pybind11.h:45,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/numpy.h:12,
                 from python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:13:
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h: In instantiation of ‘struct pybind11::overload_cast<int>’:
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:203:65:   required from here
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:805:5: error: static assertion failed: pybind11::overload_cast<...> requires compiling in C++14 mode
     static_assert(detail::deferred_t<std::false_type, Args...>::value,
     ^
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc: In function ‘void pybind11_init_fasttext_pybind(pybind11::module_&)’:
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:203:65: error: no matching function for call to ‘pybind11::overload_cast<int>::overload_cast(<unresolved overloaded function type>, const std::integral_constant<bool, true>&)’
               &fasttext::Meter::precisionRecallCurve, py::const_))
                                                                 ^
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:203:65: note: candidates are:
In file included from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/pytypes.h:12:0,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/cast.h:13,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/attr.h:13,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/pybind11.h:45,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/numpy.h:12,
                 from python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:13:
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note: constexpr pybind11::overload_cast<int>::overload_cast()
 template <typename... Args> struct overload_cast {
                                    ^
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note:   candidate expects 0 arguments, 2 provided
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note: constexpr pybind11::overload_cast<int>::overload_cast(const pybind11::overload_cast<int>&)
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note:   candidate expects 1 argument, 2 provided
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note: constexpr pybind11::overload_cast<int>::overload_cast(pybind11::overload_cast<int>&&)
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note:   candidate expects 1 argument, 2 provided
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h: In instantiation of ‘struct pybind11::overload_cast<>’:
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:207:65:   required from here
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:805:5: error: static assertion failed: pybind11::overload_cast<...> requires compiling in C++14 mode
     static_assert(detail::deferred_t<std::false_type, Args...>::value,
     ^
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:207:65: error: no matching function for call to ‘pybind11::overload_cast<>::overload_cast(<unresolved overloaded function type>, const std::integral_constant<bool, true>&)’
               &fasttext::Meter::precisionRecallCurve, py::const_))
                                                                 ^
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:207:65: note: candidates are:
In file included from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/pytypes.h:12:0,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/cast.h:13,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/attr.h:13,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/pybind11.h:45,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/numpy.h:12,
                 from python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:13:
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note: constexpr pybind11::overload_cast<>::overload_cast()
 template <typename... Args> struct overload_cast {
                                    ^
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note:   candidate expects 0 arguments, 2 provided
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note: constexpr pybind11::overload_cast<>::overload_cast(const pybind11::overload_cast<>&)
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note:   candidate expects 1 argument, 2 provided
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note: constexpr pybind11::overload_cast<>::overload_cast(pybind11::overload_cast<>&&)
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note:   candidate expects 1 argument, 2 provided
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h: In instantiation of ‘struct pybind11::overload_cast<int, double>’:
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:211:62:   required from here
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:805:5: error: static assertion failed: pybind11::overload_cast<...> requires compiling in C++14 mode
     static_assert(detail::deferred_t<std::false_type, Args...>::value,
     ^
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:211:62: error: no matching function for call to ‘pybind11::overload_cast<int, double>::overload_cast(<unresolved overloaded function type>, const std::integral_constant<bool, true>&)’
               &fasttext::Meter::precisionAtRecall, py::const_))
                                                              ^
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:211:62: note: candidates are:
In file included from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/pytypes.h:12:0,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/cast.h:13,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/attr.h:13,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/pybind11.h:45,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/numpy.h:12,
                 from python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:13:
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note: constexpr pybind11::overload_cast<int, double>::overload_cast()
 template <typename... Args> struct overload_cast {
                                    ^
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note:   candidate expects 0 arguments, 2 provided
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note: constexpr pybind11::overload_cast<int, double>::overload_cast(const pybind11::overload_cast<int, double>&)
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note:   candidate expects 1 argument, 2 provided
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note: constexpr pybind11::overload_cast<int, double>::overload_cast(pybind11::overload_cast<int, double>&&)
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note:   candidate expects 1 argument, 2 provided
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h: In instantiation of ‘struct pybind11::overload_cast<double>’:
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:215:62:   required from here
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:805:5: error: static assertion failed: pybind11::overload_cast<...> requires compiling in C++14 mode
     static_assert(detail::deferred_t<std::false_type, Args...>::value,
     ^
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:215:62: error: no matching function for call to ‘pybind11::overload_cast<double>::overload_cast(<unresolved overloaded function type>, const std::integral_constant<bool, true>&)’
               &fasttext::Meter::precisionAtRecall, py::const_))
                                                              ^
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:215:62: note: candidates are:
In file included from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/pytypes.h:12:0,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/cast.h:13,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/attr.h:13,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/pybind11.h:45,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/numpy.h:12,
                 from python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:13:
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note: constexpr pybind11::overload_cast<double>::overload_cast()
 template <typename... Args> struct overload_cast {
                                    ^
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note:   candidate expects 0 arguments, 2 provided
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note: constexpr pybind11::overload_cast<double>::overload_cast(const pybind11::overload_cast<double>&)
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note:   candidate expects 1 argument, 2 provided
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note: constexpr pybind11::overload_cast<double>::overload_cast(pybind11::overload_cast<double>&&)
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note:   candidate expects 1 argument, 2 provided
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:219:62: error: no matching function for call to ‘pybind11::overload_cast<int, double>::overload_cast(<unresolved overloaded function type>, const std::integral_constant<bool, true>&)’
               &fasttext::Meter::recallAtPrecision, py::const_))
                                                              ^
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:219:62: note: candidates are:
In file included from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/pytypes.h:12:0,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/cast.h:13,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/attr.h:13,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/pybind11.h:45,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/numpy.h:12,
                 from python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:13:
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note: constexpr pybind11::overload_cast<int, double>::overload_cast()
 template <typename... Args> struct overload_cast {
                                    ^
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note:   candidate expects 0 arguments, 2 provided
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note: constexpr pybind11::overload_cast<int, double>::overload_cast(const pybind11::overload_cast<int, double>&)
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note:   candidate expects 1 argument, 2 provided
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note: constexpr pybind11::overload_cast<int, double>::overload_cast(pybind11::overload_cast<int, double>&&)
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note:   candidate expects 1 argument, 2 provided
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:223:62: error: no matching function for call to ‘pybind11::overload_cast<double>::overload_cast(<unresolved overloaded function type>, const std::integral_constant<bool, true>&)’
               &fasttext::Meter::recallAtPrecision, py::const_));
                                                              ^
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:223:62: note: candidates are:
In file included from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/pytypes.h:12:0,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/cast.h:13,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/attr.h:13,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/pybind11.h:45,
                 from /usr/lib/python3.6/site-packages/pybind11/include/pybind11/numpy.h:12,
                 from python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:13:
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note: constexpr pybind11::overload_cast<double>::overload_cast()
 template <typename... Args> struct overload_cast {
                                    ^
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note:   candidate expects 0 arguments, 2 provided
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note: constexpr pybind11::overload_cast<double>::overload_cast(const pybind11::overload_cast<double>&)
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note:   candidate expects 1 argument, 2 provided
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note: constexpr pybind11::overload_cast<double>::overload_cast(pybind11::overload_cast<double>&&)
/usr/lib/python3.6/site-packages/pybind11/include/pybind11/detail/common.h:804:36: note:   candidate expects 1 argument, 2 provided
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc: In lambda function:
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:345:53: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
             for (int32_t i = 0; i < vocab_freq.size(); i++) {
                                                     ^
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc: In lambda function:
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:359:54: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
             for (int32_t i = 0; i < labels_freq.size(); i++) {
                                                      ^
error: command 'gcc' failed with exit status 1
----------------------------------------

ERROR: Command errored out with exit status 1: /usr/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-uybuy2vv/fasttext_5ea4b956925846ce8a043b201688f9b6/setup.py'"'"'; __file__='"'"'/tmp/pip-install-uybuy2vv/fasttext_5ea4b956925846ce8a043b201688f9b6/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-7uzhe2bu/install-record.txt --single-version-externally-managed --compile --install-headers /usr/include/python3.6m/fasttext Check the logs for full command output.

Scan local files

Question: how can I use credential-digger to scan a local file that has already been downloaded? I am basically looking to use it like this:

from credential-digger import scan

results = scan("/path/to/file or directory",*args)

I was looking through the code and I wasn't sure where a good entry point was. Thanks!

Wiki : arguments have mismatching names

Referring to the wiki page that discusses the approach to follow in order to scan a repo, I have noticed that names of the arguments mismatch those of the constructor.
Wiki page : How to scan a repository

  1. Instantiate the client
    from credentialdigger.cli import Client
    c = Client(host='xxx.xxx.xxx.xxx', port=NUM, dbname='mydbname', user='myusername', password='mypassword')

The constructor :

  def __init__(self, dbname, dbuser, dbpassword,
                 dbhost='localhost', dbport=5432)

The code on the wiki page should be:

  from credentialdigger.cli import Client
  c = Client(dbhost='xxx.xxx.xxx.xxx', dbport=NUM, dbname='mydbname', dbuser='myusername', dbpassword='mypassword')

button "flag as fp" not working for scan_local results

The button to flag a discovery as a FP doesn't work with this setting:

  • discoveries of the scan of a folder
    • exec bash into the container
    • execute scan_local of a folder (in case you want to scan a git repo as a folder, remove the .git first)
  • UI running to review the results
  • Manually flag one single discovery as a FP

The button works when flagging all the discoveries of a file as FPs (both in the discovery view and in the file view)

Secure UI connections

Add HTTPS support for the UI

Furthermore, the UI is currently accessible to anyone who can reach its IP address. We could secure it with a simple, optional login page (e.g., a ui_password set in .env that must be entered).

Fix support for python3.9

Verify dependencies and add tests when using Linux + Python 3.9
(Currently tested and working on macOS)

Models break the scan

A change introduced in the develop branch also introduced a bug: using the models during a scan breaks the scan itself.

python setup.py build_ext --pg-config /path/to/pg_config build ...


$ pip install -r requirements.txt
Requirement already satisfied: requests in /data/data/com.termux/files/usr/lib/python3.9/site-packages (from -r requirements.txt (line 13)) (2.25.0)
Collecting fasttext
Using cached fasttext-0.9.2.tar.gz (68 kB)
Requirement already satisfied: pybind11>=2.2 in /data/data/com.termux/files/usr/lib/python3.9/site-packages (from fasttext->-r requirements.txt (line 1)) (2.6.1)
Requirement already satisfied: setuptools>=0.7.0 in /data/data/com.termux/files/usr/lib/python3.9/site-packages (from fasttext->-r requirements.txt (line 1)) (49.2.1)
Collecting GitPython
Using cached GitPython-3.1.11-py3-none-any.whl (159 kB)
Collecting hyperscan
Using cached hyperscan-0.1.5.tar.gz (11 kB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing wheel metadata ... done
Collecting leven
Downloading leven-1.0.4.tar.gz (20 kB)
Collecting nltk
Downloading nltk-3.5.zip (1.4 MB)
|████████████████████████████████| 1.4 MB 561 kB/s
Collecting numpy
Downloading numpy-1.19.4.zip (7.3 MB)
|████████████████████████████████| 7.3 MB 88 kB/s
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing wheel metadata ... done
Collecting pandas
Downloading pandas-1.1.4.tar.gz (5.2 MB)
|████████████████████████████████| 5.2 MB 660 kB/s
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing wheel metadata ... done
Collecting plac
Downloading plac-1.2.0-py2.py3-none-any.whl (21 kB)
Collecting psycopg2-binary
Downloading psycopg2-binary-2.8.6.tar.gz (384 kB)
|████████████████████████████████| 384 kB 280 kB/s

(Having this problem below)

ERROR: Command errored out with exit status 1:
 command: /data/data/com.termux/files/usr/bin/python3 -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/data/data/com.termux/files/usr/tmp/pip-install-xv8dea7t/psycopg2-binary_ccc8aff7d25d4791b0f2983a7f36c73c/setup.py'"'"'; __file__='"'"'/data/data/com.termux/files/usr/tmp/pip-install-xv8dea7t/psycopg2-binary_ccc8aff7d25d4791b0f2983a7f36c73c/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /data/data/com.termux/files/usr/tmp/pip-pip-egg-info-nwpgb9_8
     cwd: /data/data/com.termux/files/usr/tmp/pip-install-xv8dea7t/psycopg2-binary_ccc8aff7d25d4791b0f2983a7f36c73c/
Complete output (23 lines):
running egg_info
creating /data/data/com.termux/files/usr/tmp/pip-pip-egg-info-nwpgb9_8/psycopg2_binary.egg-info
writing /data/data/com.termux/files/usr/tmp/pip-pip-egg-info-nwpgb9_8/psycopg2_binary.egg-info/PKG-INFO
writing dependency_links to /data/data/com.termux/files/usr/tmp/pip-pip-egg-info-nwpgb9_8/psycopg2_binary.egg-info/dependency_links.txt
writing top-level names to /data/data/com.termux/files/usr/tmp/pip-pip-egg-info-nwpgb9_8/psycopg2_binary.egg-info/top_level.txt
writing manifest file '/data/data/com.termux/files/usr/tmp/pip-pip-egg-info-nwpgb9_8/psycopg2_binary.egg-info/SOURCES.txt'

Error: pg_config executable not found.

pg_config is required to build psycopg2 from source.  Please add the directory
containing pg_config to the $PATH or specify the full executable path with the
option:

    python setup.py build_ext --pg-config /path/to/pg_config build ...

or with the pg_config option in 'setup.cfg'.

If you prefer to avoid building psycopg2 from source, please install the PyPI
'psycopg2-binary' package instead.

For further information please check the 'doc/src/install.rst' file (also at
<https://www.psycopg.org/docs/install.html>).

----------------------------------------

ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

When I did $ python setup.py build_ext --pg-config $HOME build this pops up
usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
or: setup.py --help [cmd1 cmd2 ...]
or: setup.py --help-commands
or: setup.py cmd --help

error: option --pg-config not recognized

Then I did $ python setup.py install
Then this happened
Traceback (most recent call last):
File "/data/data/com.termux/files/usr/lib/python3.9/site-packages/setuptools/sandbox.py", line 154, in save_modules
yield saved
File "/data/data/com.termux/files/usr/lib/python3.9/site-packages/setuptools/sandbox.py", line 195, in setup_context
yield
File "/data/data/com.termux/files/usr/lib/python3.9/site-packages/setuptools/sandbox.py", line 250, in run_setup
_execfile(setup_script, ns)
File "/data/data/com.termux/files/usr/lib/python3.9/site-packages/setuptools/sandbox.py", line 45, in _execfile
exec(code, globals, locals)
File "/data/data/com.termux/files/usr/tmp/easy_install-w64k7zfv/srsly-2.3.2/setup.py", line 7, in <module>

ModuleNotFoundError: No module named 'Cython'

During handling of the above exception, another exception occurred:
So how do I resolve this problem? There is no package like libhyperscan-dev in Termux. Please help me.

Access to model definitions and training/validation data?

Would it be possible to get access to model definitions and training/validation data for the models used in SAP/credential-digger?

I'm interested in seeing how these models were trained, and in possibly contributing to their future development.

Currently it seems that only trained models are available for download.

Choose between GitPython and PyGithub and support only one of them

At the moment, we are using two git libraries: PyGithub and GitPython.
I suggest we choose just one and update the code of the git scanner accordingly.

Why PyGithub

  • GitPython is not actively maintained anymore
  • It is possible to authenticate with a token (it is needed in many enterprise GitHub or closed gitlab servers)
  • it is possible to set the endpoint (thus, it is possible to execute the scan_user function not only on github.com)

Why GitPython

  • It is possible to call git directly, and this is pivotal here

Considering the points above, I'm inclined towards PyGithub, but only if the diff can be calculated as it is now (i.e., consider only newly added and modified files, ignore submodules, consider only "green" lines in a commit, ignore spaces)

MacOS out-of-the-box integration

The project has to be adapted to smoothly work on MacOS, too.

This means:

  • Install dependencies without manually modifying the versions of the packages
  • scanners must be working with the right hyperscan APIs out-of-the-box (i.e., strings for Linux, bytes for MacOS)
  • deploy a package on pypi that can work directly on MacOS without compulsory building from source
  • Deprecate the current wiki page (maybe leave another one where we present the reason why MacOS integration is not trivial and we have to do all this work)

scan users from non-github.com servers

Currently, the scan_user function only supports users from GitHub.com

We can leverage the use of github APIs to list the repositories of users from other git servers (e.g., enterprise GitHub, gitlab, etc.).

NB: this requires setting the API endpoint and implementing the use of a token (in case the servers require authentication for any operation)

support private repositories

Add support for tokens so that users can scan their private repositories

def scan(repo_url, ..., git_token=None):
    ...

Use the token to clone the repository, if passed as argument
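
One possible way to use the token when cloning is to embed it in the HTTPS clone URL before handing the URL to git. The sketch below is only an illustration: the helper name `add_token_to_url` and the `oauth2` user name are assumptions (servers differ in the user name they expect), and nothing here is taken from the Credential Digger code base.

```python
from urllib.parse import urlsplit, urlunsplit


def add_token_to_url(repo_url, git_token=None):
    """Return repo_url with git_token embedded as HTTPS credentials.

    Hypothetical helper: without a token, or for non-HTTPS URLs,
    the URL is returned unchanged.
    """
    if not git_token:
        return repo_url
    parts = urlsplit(repo_url)
    if parts.scheme != 'https':
        return repo_url
    # 'oauth2' is an assumed user name; some servers expect a different one
    netloc = f'oauth2:{git_token}@{parts.hostname}'
    if parts.port:
        netloc += f':{parts.port}'
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

The resulting URL could then be passed to the existing clone step. Note that a URL built this way contains the secret, so it should not be logged or stored.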

Pre-commit Hooks (Client side)

To let developers scan new code before committing it, Credential Digger can provide a pre-commit hook that transparently runs a fast scan on the changes about to be committed. If those changes contain hardcoded credentials, a warning message is displayed and the commit is put on hold until the developer decides either to commit anyway or to modify the code and resubmit.

GitHub officially documents the notion of a pre-commit hook here

Some open source projects have already proposed solutions here and here
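
As a rough illustration of the idea, a hook could inspect only the lines added in the staged diff and refuse the commit when one of them looks like a credential. The sketch below is not Credential Digger's scanner: the single regular expression stands in for the real rules, and all function names are hypothetical.

```python
import re
import subprocess
import sys

# Hypothetical stand-in rule: a key-like name assigned a quoted literal
PATTERN = re.compile(
    r'(?i)(password|secret|token|api_key)\s*[:=]\s*["\'][^"\']+["\']')


def added_lines(diff_text):
    """Yield lines added in a unified diff, skipping '+++' file headers."""
    for line in diff_text.splitlines():
        if line.startswith('+') and not line.startswith('+++'):
            yield line[1:]


def find_suspects(diff_text):
    """Return the added lines that match the stand-in rule."""
    return [line for line in added_lines(diff_text) if PATTERN.search(line)]


def main():
    # Only the staged changes are inspected, not the whole repository
    diff = subprocess.run(['git', 'diff', '--cached', '--unified=0'],
                          capture_output=True, text=True, check=True).stdout
    suspects = find_suspects(diff)
    for line in suspects:
        print(f'Possible hardcoded credential: {line.strip()}',
              file=sys.stderr)
    # A non-zero exit status makes git abort the commit
    return 1 if suspects else 0

# In the hook file itself (.git/hooks/pre-commit), finish with:
#     sys.exit(main())
```

A developer who decides to commit anyway can bypass the hook with `git commit --no-verify`.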

Similarity model

Auto-flag similar discoveries to reduce the human effort needed to review them.

This feature is initially UI-related only; later we will consider integrating it into the update_discovery function.

Docker upgrade

Upgrade the docker image used for the UI (use python3.9 instead of python3.7, now that it is supported)

"Show on GitHub" button not working with .git in url

A repo can be scanned both with and without .git at the end of its url (e.g., both https://github.com/SAP/credential-digger and https://github.com/SAP/credential-digger.git can be passed as url). Yet, the show on GitHub button (in the discoveries view) only works if .git is not in the url.

How to fix: clean the repo url when generating the link for the show on GitHub button
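
A minimal sketch of such a cleanup, assuming a hypothetical helper named `clean_repo_url` (the actual UI code may do this elsewhere and differently):

```python
def clean_repo_url(url):
    """Strip a trailing slash and a trailing '.git' from a repository URL."""
    url = url.rstrip('/')
    if url.endswith('.git'):
        url = url[:-len('.git')]
    return url
```

Both forms of the URL then map to the same link target, so the button can build its link from the cleaned value.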

Improve user interface

The ui needs several improvements.

  • Remove the "fake path" placeholder and use the real path when uploading a file with rules [#9 ]
  • Fix the button to show/hide the false positives
  • Fix the scan function (it always uses all the categories) [#35 and #39 ]
    • Even if the checkbox is not checked, all the rules are selected [#35]
  • Fix the button to re-scan a repository (choose the category instead of picking rules, like in normal scans) [#44]
  • Add flag to force repo rescan (see #36 ) [#44]

Fix empty repo scan exception

The scan of an empty repository raises an exception

Traceback (most recent call last):
  File "/home/marco/venv/lib/python3.6/site-packages/credentialdigger/client.py", line 752, in scan_user
    debug=debug)
  File "/home/marco/venv/lib/python3.6/site-packages/credentialdigger/client.py", line 591, in scan
    since_commit=from_commit)
  File "/home/marco/venv/lib/python3.6/site-packages/credentialdigger/scanners/git_scanner.py", line 71, in scan
    latest_commit = repo.rev_parse('HEAD').hexsha
  File "/home/marco/venv/lib/python3.6/site-packages/git/repo/fun.py", line 336, in rev_parse
    obj = name_to_object(repo, rev)
  File "/home/marco/venv/lib/python3.6/site-packages/git/repo/fun.py", line 147, in name_to_object
    raise BadName(name)
gitdb.exc.BadName: Ref 'HEAD' did not resolve to an object
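
One way to avoid this exception is to check whether HEAD points to a commit before resolving it. The sketch below assumes `repo` behaves like a GitPython `Repo` object (whose `head` reference offers `is_valid()`). The helper name is hypothetical and this is not the project's actual fix.

```python
def latest_commit_sha(repo):
    """Return the hexsha of HEAD, or None for a repository with no commits.

    `repo` is expected to behave like a GitPython `Repo` object.
    """
    if not repo.head.is_valid():
        # Empty repository: there is nothing to scan
        return None
    return repo.head.commit.hexsha
```

The scanner could then skip the repository (or return an empty list of discoveries) when the helper returns `None`.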

Search-bar does not work

Searching for repos does not work.

Error : Uncaught TypeError: search is not a function

State: Fixing it...

Server certification verification failed error while cloning public repo.

Hi - Thanks for building this tool.

I'm running this locally by following steps mentioned here - https://github.com/SAP/credential-digger#quick-launch

While cloning a public repo for a scan, I get the following error: server certificate verification failed.

credential_digger_sqlite | INFO:werkzeug:172.18.0.1 - - [24/Jan/2021 00:52:29] "POST /scan_repo HTTP/1.1" 500 -
credential_digger_sqlite | Traceback (most recent call last):
credential_digger_sqlite | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 2464, in __call__
credential_digger_sqlite | return self.wsgi_app(environ, start_response)
credential_digger_sqlite | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 2450, in wsgi_app
credential_digger_sqlite | response = self.handle_exception(e)
credential_digger_sqlite | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1867, in handle_exception
credential_digger_sqlite | reraise(exc_type, exc_value, tb)
credential_digger_sqlite | File "/usr/local/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
credential_digger_sqlite | raise value
credential_digger_sqlite | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 2447, in wsgi_app
credential_digger_sqlite | response = self.full_dispatch_request()
credential_digger_sqlite | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1952, in full_dispatch_request
credential_digger_sqlite | rv = self.handle_user_exception(e)
credential_digger_sqlite | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1821, in handle_user_exception
credential_digger_sqlite | reraise(exc_type, exc_value, tb)
credential_digger_sqlite | File "/usr/local/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
credential_digger_sqlite | raise value
credential_digger_sqlite | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1950, in full_dispatch_request
credential_digger_sqlite | rv = self.dispatch_request()
credential_digger_sqlite | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1936, in dispatch_request
credential_digger_sqlite | return self.view_functions[rule.endpoint](**req.view_args)
credential_digger_sqlite | File "/credential-digger-ui/server.py", line 137, in scan_repo
credential_digger_sqlite | c.scan(repolink, models=models, category=rulesToUse, force=forceScan)
credential_digger_sqlite | File "/usr/local/lib/python3.7/site-packages/credentialdigger/client.py", line 592, in scan
credential_digger_sqlite | since_commit=from_commit)
credential_digger_sqlite | File "/usr/local/lib/python3.7/site-packages/credentialdigger/scanners/git_scanner.py", line 70, in scan
credential_digger_sqlite | project_path = self.clone_git_repo(git_url)
credential_digger_sqlite | File "/usr/local/lib/python3.7/site-packages/credentialdigger/scanners/base_scanner.py", line 19, in clone_git_repo
credential_digger_sqlite | GitRepo.clone_from(git_url, project_path)
credential_digger_sqlite | File "/usr/local/lib/python3.7/site-packages/git/repo/base.py", line 1032, in clone_from
credential_digger_sqlite | return cls._clone(git, url, to_path, GitCmdObjectDB, progress, multi_options, **kwargs)
credential_digger_sqlite | File "/usr/local/lib/python3.7/site-packages/git/repo/base.py", line 973, in _clone
credential_digger_sqlite | finalize_process(proc, stderr=stderr)
credential_digger_sqlite | File "/usr/local/lib/python3.7/site-packages/git/util.py", line 329, in finalize_process
credential_digger_sqlite | proc.wait(**kwargs)
credential_digger_sqlite | File "/usr/local/lib/python3.7/site-packages/git/cmd.py", line 408, in wait
credential_digger_sqlite | raise GitCommandError(self.args, status, errstr)
credential_digger_sqlite | git.exc.GitCommandError: Cmd('git') failed due to: exit code(128)
credential_digger_sqlite | cmdline: git clone -v https://github.com/23ranjan/redis-rack.git /tmp/tmps664dj83
credential_digger_sqlite | stderr: 'Cloning into '/tmp/tmps664dj83'...
credential_digger_sqlite | fatal: unable to access 'https://github.com/23ranjan/redis-rack.git/': server certificate verification failed. CAfile: none CRLfile: none

Change link to the paper

The link to the paper forwards to SAP Jam, which is not accessible to everybody.

Change that link to redirect interested readers where they can access the paper.

Repo rescan

A repo can be re-scanned with the new scan button.
Yet, only the new commits are taken into consideration (i.e., if there were commits after last scan, then these commits are scanned).
Since we can select the rules to be used for the new scan, I would suggest also adding a "Rescan repo" checkbox to choose whether to force a rescan of the whole repository instead of scanning only the new commits.

This flag will trigger the force=True argument of the scan function

repo rescan

When we force the re-scan of a repository, we should delete the results of previous scans (i.e., delete the entries in the discoveries table and in the repo one) for that repository. Otherwise, forcing the re-scan will produce a ton of duplicates.
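
The cleanup could look roughly like the following sketch. It uses an in-memory SQLite database so that it is self-contained. The table and column names (`repos.url`, `discoveries.repo_url`) are a simplified, assumed schema, and `delete_repo_results` is a hypothetical helper rather than an existing client method.

```python
import sqlite3


def delete_repo_results(conn, repo_url):
    """Delete a repository and its discoveries in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute('DELETE FROM discoveries WHERE repo_url = ?',
                     (repo_url,))
        conn.execute('DELETE FROM repos WHERE url = ?', (repo_url,))


# Demo with a hypothetical, simplified schema
conn = sqlite3.connect(':memory:')
conn.executescript('''
    CREATE TABLE repos (url TEXT PRIMARY KEY);
    CREATE TABLE discoveries (
        id INTEGER PRIMARY KEY, repo_url TEXT, snippet TEXT);
    INSERT INTO repos VALUES
        ('https://example.com/a'), ('https://example.com/b');
    INSERT INTO discoveries (repo_url, snippet) VALUES
        ('https://example.com/a', 'pwd=1'),
        ('https://example.com/a', 'pwd=2'),
        ('https://example.com/b', 'pwd=3');
''')
delete_repo_results(conn, 'https://example.com/a')
remaining = conn.execute('SELECT COUNT(*) FROM discoveries').fetchone()[0]
```

After the call, only the discoveries of the other repository remain, so a forced re-scan starts from a clean state for the deleted one.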

Client.delete_rule behaviour

The discoveries table is defined with a foreign key (to rule.id) and an ON DELETE NO ACTION clause.

If the client deletes a rule that is still referenced by a discovery, the deletion may raise an exception.

Proposal:

  • SQL - table discoveries: use the category of the rules instead of the rule_id
  • CLIENT - git_scanner: keep the rule category instead of the id here
  • this change is not backward compatible. We will need to update the version of credentialdigger
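
The current behaviour can be reproduced in isolation with the sketch below. It swaps in SQLite for the project's database so that it runs without a server, and it uses an assumed, simplified schema. `make_demo_db` and `try_delete_rule` are hypothetical names.

```python
import sqlite3


def make_demo_db():
    """Create a tiny database with one rule in use and one unused rule."""
    conn = sqlite3.connect(':memory:')
    # SQLite enforces foreign keys only when this pragma is enabled
    conn.execute('PRAGMA foreign_keys = ON')
    conn.executescript('''
        CREATE TABLE rules (id INTEGER PRIMARY KEY, category TEXT);
        CREATE TABLE discoveries (
            id INTEGER PRIMARY KEY,
            rule_id INTEGER REFERENCES rules(id) ON DELETE NO ACTION
        );
        INSERT INTO rules VALUES (1, 'password'), (2, 'token');
        INSERT INTO discoveries (rule_id) VALUES (1);
    ''')
    return conn


def try_delete_rule(conn, rule_id):
    """Return True if the rule was deleted, False if it is still in use."""
    try:
        with conn:  # rolls back the failed DELETE
            conn.execute('DELETE FROM rules WHERE id = ?', (rule_id,))
        return True
    except sqlite3.IntegrityError:
        # Raised because a discovery still references this rule
        return False
```

Storing the rule category in the discoveries table, as proposed above, removes the reference and therefore this failure mode, at the cost of a schema change.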
