pfff's Issues

cleanup README

The changelog entry at https://github.com/returntocorp/pfff/blob/develop/changes.txt#039-q4-2019-real-python-parser-a-generic-ast-a-generic-sgrepscheck says that several tools were split off:

* 0.39 (Q4 2019) (real Python parser, a generic AST, a generic sgrep/scheck)
10 years of Pfff! Started in November 2009 (while at Facebook).

** big split! move sgrep/spatch, codemap, codegraph, the lang_xxx bytecode
related, mini, and scheck in separate repositories
either under github.com/returntocorp (pfff, sgrep, check_generic)
or under github.com/aryx

It would be good to reflect those changes in README.md to avoid confusion - currently it says:
https://github.com/returntocorp/pfff/blob/develop/README.md#pfff

pfff is also made of few tools:
 - `pfff`, which allows to test the different parsers on a single file
 - `scheck`, a bug finder
 - `stags`, an Emacs tag generator
 - `sgrep`, a syntactical grep
 - `spatch`, a syntactical patch
 - `codequery`, an interactive tool a la SQL to query information
   about the structure of a codebase using Prolog as the query engine
 - `pfff_db`, which does some global analysis on a set of source files and
   store the data in a marshalled form in a file somewhere (e.g. `/tmp/db.json`)

il_generic error when constructing Record

I've been testing the taint mode of semgrep and came across a parsing issue that seems to be caused by pfff.

Example input:

function a() {
  b("c", {d: "e"})
}

Example output:

$ pfff -il_generic demo.js
(S
   (Block
      ((),
       [(ExprStmt (
           (Call (
              (Id (("b", ()),
                 { id_resolved = ref (None); id_type = ref (None);
                   id_const_literal = ref (None) }
                 )),
              ((),
               [(Arg (L (String ("c", ()))));
                 (Arg
                    (Record
                       ((),
                        [(FieldStmt
                            (DefStmt
                               ({ name = (EId ("d", ())); attrs = [];
                                  info =
                                  { id_resolved = ref (None);
                                    id_type = ref (None);
                                    id_const_literal = ref (None) };
                                  tparams = [] },
                                (FieldDefColon
                                   { vinit = (Some (L (String ("e", ()))));
                                     vtype = None }))))
                          ],
                        ())))
                 ],
               ())
              )),
           ()))
         ],
       ())))
==>
demo.js:2:2: the ident 'b' is not resolved
Fatal error: exception Parse_info.Ast_builder_error("TODO Construct: (E
   (Record
      ((),
       [(FieldStmt
           (DefStmt
              ({ name = (EId (\"d\", ())); attrs = [];
                 info =
                 { id_resolved = ref (None); id_type = ref (None);
                   id_const_literal = ref (None) };
                 tparams = [] },
               (FieldDefColon
                  { vinit = (Some (L (String (\"e\", ())))); vtype = None }))))
         ],
       ())))", _)

fails to build on Mac/arm64

While trying to use sudo port install semgrep:

:info:build [ERROR] The compilation of pfff.0.40.4 failed at "make".
:info:build #=== ERROR while compiling pfff.0.40.4 ========================================#
:info:build # context              2.1.2 | macos/arm64 | ocaml.4.12.1 | pinned(git+file:///opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_devel_semgrep/semgrep/work/semgrep-0.14.0/pfff#HEAD#c554bc0c)
:info:build # path                 ~/.opam/default/.opam-switch/build/pfff.0.40.4
:info:build # command              /usr/bin/make
:info:build # exit-code            2
:info:build # env-file             ~/.opam/log/pfff-69080-0c5aa6.env
:info:build # output-file          ~/.opam/log/pfff-69080-0c5aa6.out
:info:build ### output ###
:info:build # [...]
:info:build # value_type
:info:build # virtual_method
:info:build # virtual_method_type
:info:build # virtual_value
:info:build # with_constraint
:info:build # with_type_binder
:info:build # ocamlc.opt -g -thread -w +a-4-6-7-29-41-44-45-48-52-67 -warn-error +a -bin-annot -absname   -I ../../commons -I ../../external/ppx_deriving -I ../../commons_core -I ../../globals -I ../../h_program-lang   -c parser_ml.ml
:info:build # File "/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_devel_semgrep/semgrep/work/.home/.opam/default/.opam-switch/build/pfff.0.40.4/lang_ml/parsing/parser_ml.ml", line 1:
:info:build # Error: I/O error: parser_ml.ml: No such file or directory
:info:build # make[2]: *** [parser_ml.cmo] Error 2
:info:build # make[1]: *** [rec] Error 1
:info:build # make: *** [all] Error 2
:info:build <><> Error report <><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
:info:build ┌─ The following actions failed
:info:build │ λ build pfff 0.40.4
:info:build └─ 
:info:build ┌─ The following changes have been performed
:info:build │ ∗ install base-bytes          base
:info:build │ ∗ install conf-perl           2
:info:build │ ∗ install cppo                1.6.9
:info:build │ ∗ install grain_dypgen        0.2
:info:build │ ∗ install json-wheel          1.0.6+safe-string
:info:build │ ∗ install menhir              20220210
:info:build │ ∗ install menhirLib           20220210
:info:build │ ∗ install menhirSdk           20220210
:info:build │ ∗ install ocaml-compiler-libs v0.12.4
:info:build │ ∗ install ocamlgraph          2.0.0
:info:build │ ∗ install ocamlnet            4.1.9-2
:info:build │ ∗ install ppx_derivers        1.2.1
:info:build │ ∗ install ppx_deriving        5.2.1
:info:build │ ∗ install ppxlib              0.27.0
:info:build │ ∗ install result              1.5
:info:build │ ∗ install sexplib0            v0.15.1
:info:build │ ∗ install uucp                14.0.0
:info:build │ ∗ install uutf                1.0.3
:info:build └─ 
:info:build The former state can be restored with:
:info:build     /opt/local/bin/opam switch import
:info:build "/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_devel_semgrep/semgrep/work/.home/.opam/default/.opam-switch/backup/state-20220826204252.export"
:info:build Command failed: . /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_devel_semgrep/semgrep/work/opam.env && opam install -j 10 -y ./pfff
:info:build Exit code: 31
:error:build Failed to build semgrep: command execution failed
:debug:build Error code: NONE
:debug:build Backtrace: command execution failed
:debug:build     while executing
:debug:build "$procedure $targetname"
:error:build See /opt/local/var/macports/logs/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_devel_semgrep/semgrep/main.log for details.

Add using/with to AST_generic.ml

Several languages have syntax to declare a variable and automatically dispose of it at the end of the block.

Python's with:

with opened(filename, "w") as f:
    with stdout_redirected(f):
        print "Hello world"

C# has using:

using (var reader = new StringReader(manyLines))
{
    string? item;
    do {
        item = reader.ReadLine();
        Console.WriteLine(item);
    } while(item != null);
}

using takes an expression or a variable declaration. A variable declaration can consist of multiple declarations in one:

using (Font font3 = new Font("Arial", 10.0f), font4 = new Font("Arial", 10.0f))
{
   // Use font3 and font4.
}

I would propose the following:

stmt =
| WithUsingResource of tok * stmt * stmt 

Where:

  • tok is the using or with.
  • First stmt is the resource to be disposed at the end. So this is typically an ExprStmt or a DefStmt.
  • Second stmt is a Block with the code.

Should we use something more specific than stmt for the resource acquisition part? Is the name WithUsingResource OK?

This replaces OSWS_With. Should we remove that immediately, or keep it around for backward compatibility?
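To make the proposal concrete, here is a minimal sketch of how the new constructor could be used. The types below are deliberately simplified, hypothetical stand-ins and not the real definitions in AST_generic.ml; in particular, representing a multi-declaration `using` as a Block of DefStmt is only an assumption made for this illustration.

```ocaml
(* Hypothetical, heavily simplified stand-ins for the AST_generic types. *)
type tok = string

type expr =
  | Id of string
  | L of string                       (* string literal *)
  | Call of expr * expr list

type stmt =
  | ExprStmt of expr
  | DefStmt of string * expr          (* variable name, initializer *)
  | Block of stmt list
  | WithUsingResource of tok * stmt * stmt
      (* keyword token, resource acquisition, body *)

(* Python: with opened(filename, "w") as f: ... *)
let python_with =
  WithUsingResource
    ( "with",
      DefStmt ("f", Call (Id "opened", [ Id "filename"; L "w" ])),
      Block [ ExprStmt (Call (Id "print", [ L "Hello world" ])) ] )

(* C#: using (var reader = new StringReader(manyLines)) { ... } *)
let csharp_using =
  WithUsingResource
    ( "using",
      DefStmt ("reader", Call (Id "StringReader", [ Id "manyLines" ])),
      Block [] )

(* C#: using (Font font3 = ..., font4 = ...) { ... }
   Assumption: several declarations are grouped in a Block. *)
let csharp_multi =
  WithUsingResource
    ( "using",
      Block
        [ DefStmt ("font3", Call (Id "Font", [ L "Arial" ]));
          DefStmt ("font4", Call (Id "Font", [ L "Arial" ])) ],
      Block [] )

(* Names bound by the resource part of a statement, if any. *)
let resource_names = function
  | WithUsingResource (_, DefStmt (name, _), _) -> [ name ]
  | WithUsingResource (_, Block defs, _) ->
      List.filter_map (function DefStmt (n, _) -> Some n | _ -> None) defs
  | _ -> []

let () =
  List.iter
    (fun s -> print_endline (String.concat ", " (resource_names s)))
    [ python_with; csharp_using; csharp_multi ]
```

The catch-all case in resource_names shows one cost of using a plain stmt for the resource: consumers have to handle shapes that can never occur, which is an argument for a more specific type.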

Migrate off camlp4

The Camlp4 release for OCaml 4.08 is the last release ever; see the official announcement. It is recommended to port the code to ppx (extension points) or to an external parser or lexer library (such as menhir, sedlex, ocamllex, angstrom, etc.).

http://opam.ocaml.org/packages/pfff/ shows there is a dependency on camlp4, thus this issue.

Because of this, pfff can't be installed with the OCaml 4.09 and 4.10 releases. Please consider getting rid of that dependency.

Ask Github to delist this repository as a fork?

Given that this fork is a superset of the commits on the repository it is forked from, it might be worth asking Github to delist it as a fork.
I found it a bit confusing at first (though I figured out based on the README.md of the repository this repository is forked from that this is the correct repository).
Github has a note about this in their documentation https://docs.github.com/en/github/setting-up-and-managing-your-github-profile/why-are-my-contributions-not-showing-up-on-my-profile#commit-was-made-in-a-fork
This is just a suggestion :)
This project looks awesome by the way – I'm itching to try it out.

`make test` fails due to assumptions about filesystem layout

make test seems to assume pfff is installed in /usr/local/share/pfff, so it fails when pfff is run from a checkout in some other location:

Error: all:22:typing_tests:8:test inferred variable definitions go

Sys_error("/usr/local/share/pfff//tests/GENERIC/typing/PropVarDef.go: No such file or directory")
----------------------------------------------------------------------
======================================================================
Error: all:22:typing_tests:7:test basic function call go

Sys_error("/usr/local/share/pfff//tests/GENERIC/typing/FuncParam.go: No such file or directory")
----------------------------------------------------------------------
======================================================================
Error: all:22:typing_tests:6:test basic variable definitions go

Sys_error("/usr/local/share/pfff//tests/GENERIC/typing/StaticVarDef.go: No such file or directory")
----------------------------------------------------------------------
======================================================================
Error: all:22:typing_tests:5:go_pattern_files

Common2.CmdError(_, "CMD = ls -1 /usr/local/share/pfff//tests/go/semgrep/*.sgrep, RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:22:typing_tests:4:java_pattern_files

Common2.CmdError(_, "CMD = ls -1 /usr/local/share/pfff//tests/java/semgrep/*.sgrep, RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:22:typing_tests:3:test class field types

Sys_error("/usr/local/share/pfff//tests/GENERIC/typing/ClassFields.java: No such file or directory")
----------------------------------------------------------------------
======================================================================
Error: all:22:typing_tests:2:test basic params java

Sys_error("/usr/local/share/pfff//tests/GENERIC/typing/BasicParam.java: No such file or directory")
----------------------------------------------------------------------
======================================================================
Error: all:22:typing_tests:1:test multiple variable definitions java

Sys_error("/usr/local/share/pfff//tests/GENERIC/typing/EqVarCmp.java: No such file or directory")
----------------------------------------------------------------------
======================================================================
Error: all:22:typing_tests:0:test basic variable definitions java

Sys_error("/usr/local/share/pfff//tests/GENERIC/typing/VarDef.java: No such file or directory")
----------------------------------------------------------------------
======================================================================
Error: all:21:naming generic:0:regression files

Common2.CmdError(_, "CMD = ls -1 /usr/local/share/pfff//tests/python/naming/*.py, RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:20:parsing_go:0:regression files

Common2.CmdError(_, "CMD = ls -1 /usr/local/share/pfff//tests/go/parsing/*.go, RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:19:parsing_cpp:2:C regression files

Common2.CmdError(_, "CMD = ls -1 /usr/local/share/pfff//tests/c/parsing/*.c, RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:19:parsing_cpp:1:rejecting bad code

Common2.CmdError(_, "CMD = ls -1 /usr/local/share/pfff//tests/cpp/parsing_errors/*.cpp, RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:19:parsing_cpp:0:regression files

Common2.CmdError(_, "CMD = ls -1 /usr/local/share/pfff//tests/cpp/parsing/*.h, RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:17:parsing_ruby:0:regression files

Common2.CmdError(_, "CMD = ls -1 /usr/local/share/pfff//tests/ruby/parsing/*.rb, RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:16:parsing_python:0:regression files

Common2.CmdError(_, "CMD = ls -1 /usr/local/share/pfff//tests/python/parsing/*.py, RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:15:parsing_json:0:regression files

Common2.CmdError(_, "CMD = ls -1 /usr/local/share/pfff//tests/json/parsing/*.json, RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:14:analyze_js:2:AST js building regression files

Common2.CmdError(_, "CMD = ls -1 /usr/local/share/pfff//tests/js/parsing/*.js, RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:13:parsing_js:1:regression files typescript

Common2.CmdError(_, "CMD = ls -1 /usr/local/share/pfff//tests/typescript/parsing/*.ts, RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:13:parsing_js:0:regression files

Common2.CmdError(_, "CMD = ls -1 /usr/local/share/pfff//tests/js/parsing/jsx/*.js, RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:12:analyze_java:0:regression files

Common2.CmdError(_, "CMD = ls -1 /usr/local/share/pfff//tests/java/parsing/*.java, RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:11:parsing_java:0:regression files

Common2.CmdError(_, "CMD = ls -1 /usr/local/share/pfff//tests/java/parsing/*.java, RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:10:analyze_ml:1:coverage_ml:0:basename to readable

Unix.Unix_error(Unix.ENOENT, "stat", "/usr/local/share/pfff/lang_ml")
----------------------------------------------------------------------
======================================================================
Error: all:10:analyze_ml:0:building light database

Unix.Unix_error(Unix.ENOENT, "stat", "/usr/local/share/pfff/tests/ml/db")
----------------------------------------------------------------------
======================================================================
Error: all:9:parsing_ml:0:regression files

Common2.CmdError(_, "CMD = ls -1 /usr/local/share/pfff//tests/ml/parsing/*.ml, RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:7:prolog:19:generator

Common.CmdError(_, "CMD = swipl -s /var/folders/zl/6kw19vqs1k1brhhjz3p0_2c40000gn/T/prolog_php_db-29295-6da5a5.pl -f /usr/local/share/pfff/h_program-lang/prolog_code.pl -t halt --quiet -g \"async(X), writeln(X), fail ,fail\", RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:7:prolog:18:hack

Common.CmdError(_, "CMD = swipl -s /var/folders/zl/6kw19vqs1k1brhhjz3p0_2c40000gn/T/prolog_php_db-29295-8b6d13.pl -f /usr/local/share/pfff/h_program-lang/prolog_code.pl -t halt --quiet -g \"hh(X,_), writeln(X) ,fail\", RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:7:prolog:17:xhp

Common.CmdError(_, "CMD = swipl -s /var/folders/zl/6kw19vqs1k1brhhjz3p0_2c40000gn/T/prolog_php_db-29295-dc0032.pl -f /usr/local/share/pfff/h_program-lang/prolog_code.pl -t halt --quiet -g \"field(':x:frag', (_, X)), writeln(X) ,fail\", RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:7:prolog:16:types

Common.CmdError(_, "CMD = swipl -s /var/folders/zl/6kw19vqs1k1brhhjz3p0_2c40000gn/T/prolog_php_db-29295-5d8812.pl -f /usr/local/share/pfff/h_program-lang/prolog_code.pl -t halt --quiet -g \"return('foo', X), writeln(X) ,fail\", RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:7:prolog:15:class constant use

Common.CmdError(_, "CMD = swipl -s /var/folders/zl/6kw19vqs1k1brhhjz3p0_2c40000gn/T/prolog_php_db-29295-e06a5c.pl -f /usr/local/share/pfff/h_program-lang/prolog_code.pl -t halt --quiet -g \"use('foo', X , constant, read), writeln(X) ,fail\", RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:7:prolog:14:fields use

Common.CmdError(_, "CMD = swipl -s /var/folders/zl/6kw19vqs1k1brhhjz3p0_2c40000gn/T/prolog_php_db-29295-28dff7.pl -f /usr/local/share/pfff/h_program-lang/prolog_code.pl -t halt --quiet -g \"use(X, 'bar', field, write), writeln(X) ,fail\", RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:7:prolog:13:arrays used as records

Common.CmdError(_, "CMD = swipl -s /var/folders/zl/6kw19vqs1k1brhhjz3p0_2c40000gn/T/prolog_php_db-29295-55136b.pl -f /usr/local/share/pfff/h_program-lang/prolog_code.pl -t halt --quiet -g \"use(X, 'bar', array, read), writeln(X) ,fail\", RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:7:prolog:12:exceptions

Common.CmdError(_, "CMD = swipl -s /var/folders/zl/6kw19vqs1k1brhhjz3p0_2c40000gn/T/prolog_php_db-29295-7e69b0.pl -f /usr/local/share/pfff/h_program-lang/prolog_code.pl -t halt --quiet -g \"throw('foo', X), writeln(X) ,fail\", RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:7:prolog:11:advanced callgraph analysis for methods

Common.CmdError(_, "CMD = swipl -s /var/folders/zl/6kw19vqs1k1brhhjz3p0_2c40000gn/T/prolog_php_db-29295-faf6af.pl -f /usr/local/share/pfff/h_program-lang/prolog_code.pl -t halt --quiet -g \"docall2('bar', (X,Y), method), writeln((X,Y)), fail ,fail\", RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:7:prolog:10:callgraph for higher order functions

Common.CmdError(_, "CMD = swipl -s /var/folders/zl/6kw19vqs1k1brhhjz3p0_2c40000gn/T/prolog_php_db-29295-439220.pl -f /usr/local/share/pfff/h_program-lang/prolog_code.pl -t halt --quiet -g \"docall('bar', X, special), writeln(X), fail ,fail\", RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:7:prolog:9:callgraph for static methods

Common.CmdError(_, "CMD = swipl -s /var/folders/zl/6kw19vqs1k1brhhjz3p0_2c40000gn/T/prolog_php_db-29295-3bf67c.pl -f /usr/local/share/pfff/h_program-lang/prolog_code.pl -t halt --quiet -g \"docall('bar', X, method), writeln(X), fail ,fail\", RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:7:prolog:8:handling new PHP syntax (new X)->

Common.CmdError(_, "CMD = swipl -s /var/folders/zl/6kw19vqs1k1brhhjz3p0_2c40000gn/T/prolog_php_db-29295-6b03f8.pl -f /usr/local/share/pfff/h_program-lang/prolog_code.pl -t halt --quiet -g \"docall('bar1', X, class), writeln(X), fail ,fail\", RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:7:prolog:7:basic (imprecise) callgraph for methods

Common.CmdError(_, "CMD = swipl -s /var/folders/zl/6kw19vqs1k1brhhjz3p0_2c40000gn/T/prolog_php_db-29295-858882.pl -f /usr/local/share/pfff/h_program-lang/prolog_code.pl -t halt --quiet -g \"docall('bar', X, method), writeln(X), fail ,fail\", RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:7:prolog:6:basic callgraph for functions

Common.CmdError(_, "CMD = swipl -s /var/folders/zl/6kw19vqs1k1brhhjz3p0_2c40000gn/T/prolog_php_db-29295-ad5323.pl -f /usr/local/share/pfff/h_program-lang/prolog_code.pl -t halt --quiet -g \"docall(X, 'foo', function), writeln(X), fail ,fail\", RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:7:prolog:5:overrides

Common.CmdError(_, "CMD = swipl -s /var/folders/zl/6kw19vqs1k1brhhjz3p0_2c40000gn/T/prolog_php_db-29295-06af07.pl -f /usr/local/share/pfff/h_program-lang/prolog_code.pl -t halt --quiet -g \"overrides(Class, Method), writeln(Method), fail ,fail\", RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:7:prolog:4:traits

Common.CmdError(_, "CMD = swipl -s /var/folders/zl/6kw19vqs1k1brhhjz3p0_2c40000gn/T/prolog_php_db-29295-be745e.pl -f /usr/local/share/pfff/h_program-lang/prolog_code.pl -t halt --quiet -g \"method('A', (_Class, X)), writeln(X) ,fail\", RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:7:prolog:3:inheritance and traits

Common.CmdError(_, "CMD = swipl -s /var/folders/zl/6kw19vqs1k1brhhjz3p0_2c40000gn/T/prolog_php_db-29295-bd99f0.pl -f /usr/local/share/pfff/h_program-lang/prolog_code.pl -t halt --quiet -g \"children(X, 'I'), writeln(X) ,fail\", RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:7:prolog:2:inheritance

Common.CmdError(_, "CMD = swipl -s /var/folders/zl/6kw19vqs1k1brhhjz3p0_2c40000gn/T/prolog_php_db-29295-3fd6d4.pl -f /usr/local/share/pfff/h_program-lang/prolog_code.pl -t halt --quiet -g \"children(X, 'A'), writeln(X) ,fail\", RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:7:prolog:1:types

Common.CmdError(_, "CMD = swipl -s /var/folders/zl/6kw19vqs1k1brhhjz3p0_2c40000gn/T/prolog_php_db-29295-b8b165.pl -f /usr/local/share/pfff/h_program-lang/prolog_code.pl -t halt --quiet -g \"type(('A','x'), X), writeln(X) ,fail\", RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:7:prolog:0:kinds

Common.CmdError(_, "CMD = swipl -s /var/folders/zl/6kw19vqs1k1brhhjz3p0_2c40000gn/T/prolog_php_db-29295-702222.pl -f /usr/local/share/pfff/h_program-lang/prolog_code.pl -t halt --quiet -g \"kind('foo', X), writeln(X) ,fail\", RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:4:foundation_php:0:ast_simple regression files

Common2.CmdError(_, "CMD = ls -1 /usr/local/share/pfff//tests/php/semantic/*.php, RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:3:pretty print php:3:regression files

Common2.CmdError(_, "CMD = ls -1 /usr/local/share/pfff//tests/php/pretty/*.php, RESULT = ")
----------------------------------------------------------------------
======================================================================
Error: all:2:parsing_php:9:regression files

Common2.CmdError(_, "CMD = ls -1 /usr/local/share/pfff//tests/php/parsing/*.php, RESULT = ")
----------------------------------------------------------------------
======================================================================
Failure: all:8:coverage_php:2:coverage and json input output

fullpath: file (or directory) /usr/local/share/pfff/tests/php/coverage/good_trace.json does not exist
----------------------------------------------------------------------
======================================================================
Failure: all:1:graph_code:0:graph:3:class analysis

cant find filename_without_project_path: /tmp  /var/folders/zl/6kw19vqs1k1brhhjz3p0_2c40000gn/T/test-29295-c06de4.php
----------------------------------------------------------------------
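One possible direction, shown here only as a sketch: resolve the data root from an override and fall back to the install prefix. The environment variable name PFFF_HOME and the helper names resolve_root and tests_path are assumptions made for this illustration, not necessarily what pfff does today.

```ocaml
(* Default install prefix seen in the failing paths above. *)
let default_root = "/usr/local/share/pfff"

(* Pick the data root: the override if given and non-empty, else the default. *)
let resolve_root (override : string option) : string =
  match override with
  | Some dir when dir <> "" -> dir
  | _ -> default_root

(* Path of a test file or directory relative to the root. *)
let tests_path ~root rel = Filename.concat root rel

let () =
  (* PFFF_HOME is an assumed variable name. *)
  let root = resolve_root (Sys.getenv_opt "PFFF_HOME") in
  print_endline (tests_path ~root "tests/GENERIC/typing/VarDef.java")
```

With something like this, running the tests from a source checkout would only require pointing the override at the checkout directory.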

Slow builds introduced by menhir 20211230 or 20220210

We were running into out-of-memory errors when we tried building pfff with menhir 20220210. This report compares the fast build times we were used to with the new, much longer ones.

Note that the atd package on which we depend for semgrep doesn't build with menhir 20211230 (some type error), so I didn't check the performance for that version but I suspect this is where it started based on the menhir changelog:

The code back-end has been rewritten from the ground up by Émile Trotignon and François Pottier, and now produces efficient and well-typed OCaml code. The infamous Obj.magic is not used any more.

The times shown are the build times for the whole pfff project after deleting the mentioned build folders (under _build/). The command I used is:

for x in lang_*/parsing; do echo $x; rm -rf _build/default/$x ; /bin/time -f "$x: %U s" -o "$(dirname $x).time" make; done

Results

The longest build time is now for lang_js/parsing: it went from 23.67 s to 154.29 s. I didn't run into OOM errors during this benchmarking but I did previously when rebuilding the whole project both in CI and locally (starting my build with about 11 GB available on my machine).

Before (menhir 20211128):

lang_cpp/parsing: 25.15 s
lang_csharp/parsing: 2.67 s
lang_css/parsing: 1.74 s
lang_c/parsing: 3.43 s
lang_erlang/parsing: 2.47 s
lang_FUZZY/parsing: 1.99 s
lang_GENERIC/parsing: 1.64 s
lang_go/parsing: 9.37 s
lang_haskell/parsing: 2.31 s
lang_html/parsing: 3.51 s
lang_java/parsing: 13.27 s
lang_json/parsing: 2.18 s
lang_js/parsing: 23.67 s
lang_lisp/parsing: 2.42 s
lang_ml/parsing: 14.92 s
lang_nw/parsing: 2.28 s
lang_php/parsing: 26.39 s
lang_python/parsing: 10.37 s
lang_regexp/parsing: 3.12 s
lang_ruby/parsing: 14.77 s
lang_rust/parsing: 2.54 s
lang_scala/parsing: 5.94 s
lang_skip/parsing: 2.85 s
lang_sql/parsing: 1.73 s
lang_web/parsing: 1.70 s

After (menhir 20220210):

lang_cpp/parsing: 92.68 s
lang_csharp/parsing: 2.74 s
lang_css/parsing: 1.81 s
lang_c/parsing: 36.69 s
lang_erlang/parsing: 2.54 s
lang_FUZZY/parsing: 2.20 s
lang_GENERIC/parsing: 1.84 s
lang_go/parsing: 20.57 s
lang_haskell/parsing: 2.70 s
lang_html/parsing: 3.52 s
lang_java/parsing: 53.64 s
lang_json/parsing: 2.41 s
lang_js/parsing: 154.29 s
lang_lisp/parsing: 2.55 s
lang_ml/parsing: 32.30 s
lang_nw/parsing: 2.46 s
lang_php/parsing: 67.24 s
lang_python/parsing: 20.36 s
lang_regexp/parsing: 3.46 s
lang_ruby/parsing: 15.20 s
lang_rust/parsing: 2.74 s
lang_scala/parsing: 6.05 s
lang_skip/parsing: 3.16 s
lang_sql/parsing: 1.97 s
lang_web/parsing: 1.87 s
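To compare the two listings above in relative rather than absolute terms, a small script can compute the slowdown factor per directory. This is a sketch that copies only the five entries with the largest absolute increases from the listings; the names timings, slowdown and ranked are illustrative.

```ocaml
(* (directory, seconds with menhir 20211128, seconds with menhir 20220210),
   copied from the two listings above. *)
let timings =
  [ ("lang_js/parsing", 23.67, 154.29);
    ("lang_c/parsing", 3.43, 36.69);
    ("lang_cpp/parsing", 25.15, 92.68);
    ("lang_java/parsing", 13.27, 53.64);
    ("lang_php/parsing", 26.39, 67.24) ]

(* Ratio of the new build time to the old one. *)
let slowdown (_, before, after) = after /. before

(* Entries sorted by slowdown factor, largest first. *)
let ranked =
  List.sort (fun a b -> compare (slowdown b) (slowdown a)) timings

let () =
  List.iter
    (fun ((dir, _, _) as t) ->
      Printf.printf "%-20s %5.1fx\n" dir (slowdown t))
    ranked
```

Sorting by ratio instead of by seconds can highlight grammars whose absolute build time is still small but which regressed disproportionately.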

Having a dockerfile or clearer install instructions

Would it be possible to have a Dockerfile or clearer install instructions?
I tried to install pfff in a Docker container by following the (rather vague) install instructions, but the build fails with an error I don't understand.

Here is my Dockerfile:

FROM ubuntu:18.04
RUN apt update && apt install -y ocaml opam menhir swi-prolog default-jre libncurses5-dev libncursesw5-dev binutils-gold
COPY . /pfff
RUN opam init -a && eval $(opam env)
RUN cd /pfff && ./configure && make depend && make && make opt
ENTRYPOINT [ "/usr/local/bin/codegraph" ]

I get this error at make depend:

File "parser_ml.mly", line 100, characters 9-10:
Error: unexpected character(s).
Makefile:58: recipe for target 'parser_ml.ml' failed
make[1]: Leaving directory '/pfff/lang_ml/parsing'
make[1]: *** [parser_ml.ml] Error 1
make: *** [depend] Error 2

Error: Unbound module Graph

I'm on Ubuntu 18.04. I ran ./configure and make depend without any issues, but when I run make I get this error:

Error: Unbound module Graph
../../Makefile.common:160: recipe for target 'graphe.cmo' failed
make[2]: *** [graphe.cmo] Error 2
make[2]: Leaving directory '/home/gabriele/Documents/workspace/git/pfff/commons_wrappers/graph'
Makefile:232: recipe for target 'rec' failed
make[1]: *** [rec] Error 1
make[1]: Leaving directory '/home/gabriele/Documents/workspace/git/pfff'
Makefile:223: recipe for target 'all' failed
make: *** [all] Error 2

Fix scripts/setup-debian

This is on Ubuntu 20.04, but presumably it also happens in the Docker build (ocaml/opam2:debian-stable). Maybe we should add the Docker build to CI if it's not there already.

Update: the Docker build is fine. This may be an issue specific to Ubuntu 20.04.

~/pfff $ ./scripts/setup-debian 
Found opam 2.0.5.
Hit:1 http://dl.google.com/linux/chrome/deb stable InRelease
Hit:2 https://download.docker.com/linux/ubuntu focal InRelease                 
Hit:3 http://us.archive.ubuntu.com/ubuntu focal InRelease                      
Hit:4 http://us.archive.ubuntu.com/ubuntu focal-updates InRelease              
Ign:5 http://ppa.launchpad.net/avsm/ppa/ubuntu focal InRelease                 
Hit:6 http://us.archive.ubuntu.com/ubuntu focal-backports InRelease         
Hit:7 http://security.ubuntu.com/ubuntu focal-security InRelease         
Hit:8 http://ppa.launchpad.net/deadsnakes/ppa/ubuntu focal InRelease     
Hit:9 http://ppa.launchpad.net/git-core/ppa/ubuntu focal InRelease
Err:10 http://ppa.launchpad.net/avsm/ppa/ubuntu focal Release
  404  Not Found [IP: 91.189.95.83 80]
Reading package lists... Done
E: The repository 'http://ppa.launchpad.net/avsm/ppa/ubuntu focal Release' does not have a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.
