
jalangi2's People

Contributors

christofferqa, dependabot[bot], esbena, franktip, jacksongl, ksen007, madhunimmo, marijaselakovic, michaelpradel, milahu, msridhar, ric-light, rohanpadhye

jalangi2's Issues

eval of the empty string does not produce `undefined`

Example evals; Jalangi behaves incorrectly on the second one (it prints an empty line instead of undefined):

$ cat test.js                            
console.log(eval());
console.log(eval(""));
console.log(eval(undefined));
$ node test.js                           
undefined
undefined
undefined
$ node src/js/commands/jalangi.js test.js 
undefined

undefined

Missing assignment for function declarations in 34427b7

As of 34427b7,

function f() {}

Is instrumented to:

function f() {
                jalangiLabel0:
...
}
J$.N(41, 'f', J$.T(33, f, 12, false), false, false, false);

Rather than:

function f() {
                jalangiLabel0:
...
}
f = J$.N(41, 'f', J$.T(33, f, 12, false), true, false, false);

The missing assignment seems to be a bug, as it is no longer possible to redefine f with the return value of J$.N(...).

[BranchOnly]: J$cntr is not defined

If I try to instrument a program with the following:

sandbox.Config.requiresInstrumentation = function (id, funId, sid, funName) {
    if (funName === 'J$.T') {
        return true;
    }

    if (funName === 'J$.S' || funName === 'J$.Fe' || funName === 'J$.Fr' ||
        funName === 'J$.M' || funName === 'J$.F' ||
        funName === 'J$.C1' || funName === 'J$.C2' || funName === 'J$.C') {

        return Instrument.indexOf(funId) !== - 1;
    }

    return false;
};

where Instrument is a stable array of identifiers, I get

ReferenceError: J$cntr is not defined

update documentation to indicate that we break code that does stack inspection

Look at this fun code from here:

exports.getFileName = function getFileName (calling_file) {
  var origPST = Error.prepareStackTrace
    , origSTL = Error.stackTraceLimit
    , dummy = {}
    , fileName

  Error.stackTraceLimit = 10

  Error.prepareStackTrace = function (e, st) {
    for (var i=0, l=st.length; i<l; i++) {
      fileName = st[i].getFileName()
      if (fileName !== __filename) {
        if (calling_file) {
            if (fileName !== calling_file) {
              return
            }
        } else {
          return
        }
      }
    }
  }

  // run the 'prepareStackTrace' function above
  Error.captureStackTrace(dummy)
  dummy.stack

  // cleanup
  Error.prepareStackTrace = origPST
  Error.stackTraceLimit = origSTL

  return fileName
}

It's inspecting the stack trace from an exception to figure out the filename for the caller module. Since Jalangi2 doesn't preserve call stacks, this code breaks.

This is probably a fundamental issue, but posting in case there happens to be some kind of solution.

simplified handling of source locations for multiple scripts

I think the handling of source locations with multiple scripts can be simplified for analyses with some changes. Right now, if I understand correctly, the current script ID is stored in J$.sid. If the analysis wants to detect when the current script ID has changed, it needs to implement its own logic in the scriptEnter, scriptExit, functionEnter, and functionExit callbacks. Furthermore, the analysis is responsible for maintaining a mapping from script ID to the corresponding file name (the originalFileName parameter to scriptEnter).

Instead, I propose that this logic all be implemented once-and-for-all in analysis.js and exposed to the analysis via a new updateCurrentScript(name) callback. updateCurrentScript would be invoked any time the current script is changing, e.g., at script enter, or at function exit if the caller was in a different script. analysis.js would maintain the mapping between script IDs and names, so the analysis would never need to see script IDs. We'd of course need some special handling for eval'd scripts. I think baking this logic into analysis.js is the right way to go to simplify writing analyses.
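The bookkeeping proposed above could look roughly like the following sketch. `ScriptTracker` and its method names are hypothetical and not part of Jalangi2; only `updateCurrentScript(name)` is the callback proposed in this issue. Eval'd scripts are not handled here.

```javascript
// Hypothetical helper: none of these names exist in Jalangi2 today.
function ScriptTracker(updateCurrentScript) {
    var namesBySid = {};   // script ID -> original file name
    var sidStack = [];     // script IDs of the active script/function frames
    var currentSid;

    // Notify the analysis only when the current script really changes.
    function setCurrent(sid) {
        if (sid !== currentSid) {
            currentSid = sid;
            if (sid !== undefined) {
                updateCurrentScript(namesBySid[sid]);
            }
        }
    }

    // Script enter: remember the file name and make the script current.
    this.scriptEnter = function (sid, originalFileName) {
        namesBySid[sid] = originalFileName;
        sidStack.push(sid);
        setCurrent(sid);
    };
    // Function enter: sid is the script in which the callee was defined.
    this.functionEnter = function (sid) {
        sidStack.push(sid);
        setCurrent(sid);
    };
    // Script or function exit: fall back to the caller's script.
    this.exit = function () {
        sidStack.pop();
        setCurrent(sidStack[sidStack.length - 1]);
    };
}

var seen = [];
var tracker = new ScriptTracker(function (name) { seen.push(name); });
tracker.scriptEnter(1, 'a.js');
tracker.scriptEnter(2, 'b.js');
tracker.exit();            // b.js finished loading, back in a.js
tracker.functionEnter(2);  // a.js calls a function defined in b.js
tracker.exit();            // back in a.js
console.log(seen.join(' '));
```

With this shape the analysis only ever sees file names; the script IDs stay inside the tracker.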

Content-type checking in proxy.py is too restrictive

The following code in https://github.com/Samsung/jalangi2/blob/master/scripts/proxy.py#L43
misses several files that should be instrumented:

  1. Content-Type is case-insensitive (i.e., it can also be content-type or CoNtEnT-tYpE):
    http://stackoverflow.com/questions/5258977/are-http-headers-case-sensitive

def response(context, flow):
    flow.response.decode()
    if 'Content-Type' in flow.response.headers:
        if flow.response.headers['Content-Type'][0].find('javascript') != -1:
            flow.response.content = processFile(flow.response.content, "js")
        if flow.response.headers['Content-Type'][0].find('html') != -1:
            flow.response.content = processFile(flow.response.content, "html")

You could try this:

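def content_type(headers):
    # Header names are case-insensitive, so compare them in lower case.
    for key in headers.keys():
        if key.lower() == "content-type":
            return headers[key].lower()
    # Return an empty string so the `in` checks below do not fail on None.
    return ""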

def response(context, flow):
    flow.response.decode()
    if 'javascript' in content_type(flow.response.headers):
        flow.response.content = processFile(flow.response.content, "js")
    elif 'html' in content_type(flow.response.headers):
        flow.response.content = processFile(flow.response.content, "html")

  2. You might also want to look at the extension of the path (i.e., the filename flow.request.path.split('/')[-1] or the extension flow.request.path.split('.')[-1], with appropriate string sanitization).

isComputed & binary delete

analysis.getField and analysis.putField have a parameter named isComputed. It would be nice if analysis.binary also had it, for use in delete o[p] expressions.

[Bug]: INSTR_LITERAL and INSTR_INIT are not orthogonal

Disabling instrumentation of certain initializers may prevent instrumentation of certain literals.

For example:

var Box2D = {};
(function () {
    function f () {
        return 2+2;
    }
})();

If INSTR_LITERAL and INSTR_INIT are both enabled, both function literals will be instrumented with J$.T...

However, if INSTR_INIT is disabled then the inner function literal is not instrumented.

Infinite Loop with INSTR_WRITE Disabled

The newest version of Jalangi enters an infinite loop on 3d-raytrace.js with instrumentation disabled for writes:

sandbox.Config.INSTR_WRITE = function (name, ast) {
return false;
};

JIT-friendly mode

For experiments regarding performance overhead, it would be good to have a mode in which esnstrument.js does not create any code constructs known to stop V8 from applying its JIT. As of now, the relevant constructs are:

  1. try/finally blocks enclosing method bodies
  2. referencing and escaping the arguments array

The JIT-friendly mode need not be suitable for writing analyses; it would only be used to measure the overhead of these constructs across an array of runtimes.

endExpression & sequence-expressions

J$.X1 is missing inside sequence-expressions, as the non-last
expressions are "end of expression" by themselves.

I suggest pushing J$.X1 onto all members of a sequence-expression.

Consider the two examples:

('a', 'b')

Is instrumented to:

J$.X1((J$.T(9, 'a', 21, false), J$.T(17, 'b', 21, false)));

But it should be:

(J$.X1(J$.T(9, 'a', 21, false)), J$.X1(J$.T(17, 'b', 21, false)));

And the second example:

for('a', 'b';;);

Is instrumented to:

for (J$.X1((J$.T(25, 'a', 21, false), J$.T(33, 'b', 21, false)));;);

But it should be:

for ((J$.X1(J$.T(25, 'a', 21, false)), J$.X1(J$.T(33, 'b', 21, false)));;);

(NB: AST-wise, the node.type of the sequence expression is: "SequenceExpression")

better source locations for formal parameters

Consider the following JS script:

function foo(f,g) {}

When instrumented (with --inlineIID option), we get:

J$.iids = {"9":[1,1,1,21],"17":[1,1,1,21],"25":[1,1,1,21],"33":[1,1,1,21],"41":[1,1,2,1],"49":[1,1,1,21],"57":[1,1,2,1],"65":[1,1,1,21],"73":[1,1,1,21],"81":[1,1,2,1],"89":[1,1,2,1],"nBranches":2,"originalCodeFileName":"/tmp/params.js","instrumentedCodeFileName":"/tmp/params_jalangi_.js"};
jalangiLabel1:
    while (true) {
        try {
            J$.Se(41, '/tmp/params_jalangi_.js', '/tmp/params.js');
            function foo(f, g) {
                jalangiLabel0:
                    while (true) {
                        try {
                            J$.Fe(9, arguments.callee, this, arguments);
                            arguments = J$.N(17, 'arguments', arguments, true, false, false);
                            f = J$.N(25, 'f', f, true, false, false);
                            g = J$.N(33, 'g', g, true, false, false);
...

Note that the source locations for the J$.N callbacks for parameters f and g are [1,1,1,21], i.e., the entirety of function foo. Instead, it would be nice if these callbacks got source locations that only included the corresponding formal parameter (e.g., [1,12,1,13] for parameter f, IID 25).

Comparing function name

Hello, I am trying to check the parameters that are passed to post functions.
To do that, I wrote an analysis that checks whether the called function is post, but it fails with 'ReferenceError: post is not defined'. Below is the relevant snippet of the analysis; the error is caused by 'var POST_FUNCTION = post'. I found this way of referring to a function by name in the Jalangi online demo. I understand that it is no longer supported, but is there another way to check a function's name in either invokeFunPre or invokeFun by inspecting the parameter f? If not, can this be done with any function provided by Jalangi?

ANALYSIS:

var POST_FUNCTION = post;

this.invokeFunPre = function (iid, f, base, args, isConstructor) {
    console.log('function call intercepted before invoking');
    if (f === POST_FUNCTION && args) {
        console.log('function is POST');
        // pass in config, always the third parameter in http.post
        checkParams(args[2]);
    }
};

I am also including the code that I am testing below.
CODE:

function post() {
} 

var sampleRecords = function(params) {
  var config = {
    params: params,
    httpErrorHandlers: {
      '4xx': function(error) {
        deferred.reject(error);
      }
    }
  }

post('/machinelearning/service/datasource/sample-records', null, config);
}

var params = {
  dbPassword : "password_test",
  dbUsername : "username_test"
};

sampleRecords(request);
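
One possible way to avoid the ReferenceError is to compare the name of the callee instead of its identity, so that the analysis does not need a reference to a variable of the analyzed program. The sketch below is an assumption, not an official Jalangi recipe: it keeps the invokeFunPre signature from the snippet above, uses the standard `name` property of function objects, and treats `checkParams` as a stand-in for the author's helper.

```javascript
// Hypothetical analysis fragment; checkParams stands in for the author's helper.
var checked = [];
function checkParams(config) { checked.push(config); }

var analysis = {
    invokeFunPre: function (iid, f, base, args, isConstructor) {
        // Compare by the function's own name rather than by identity, so no
        // variable of the analyzed program has to be visible to the analysis.
        if (typeof f === 'function' && f.name === 'post' && args) {
            checkParams(args[2]); // the third argument carries the config object
        }
    }
};

// Stand-alone usage: invoke the callback directly, as the runtime would.
function post() {}
function get() {}
analysis.invokeFunPre(1, post, undefined, ['/url', null, { params: {} }], false);
analysis.invokeFunPre(2, get, undefined, ['/url', null, {}], false);
console.log(checked.length);
```

Name comparison is coarse: unrelated functions that happen to be called post also match, and functions without a name never match.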

Semantic preservation bugs when using instrumented lodash

When the lodash library is run instrumented, its semantics seem to change slightly:
the library's internal test suite fails when it is run with jalangi.js.

Setup:

$ git clone git@github.com:lodash/lodash.git
$ cd lodash
$ npm i
...
$ cd ..

Run lodash test suite uninstrumented:

$ node lodash/test/test.js         
test.js invoked with arguments: ["node","/home/esbena/tmp/lodash/test/test.js"]
----------------------------------------
    PASS: 4574  FAIL: 0  TOTAL: 4574
    Finished in 8994 milliseconds.
----------------------------------------

Run lodash test suite instrumented:

$ node jalangi2-official/src/js/commands/jalangi.js lodash/test/test.js
test.js invoked with arguments: ["node","/home/esbena/tmp/lodash/test/test.js"]
----------------------------------------
keys methods
----------------------------------------
 FAIL - `_.keys` skips non-enumerable properties (test in IE < 9)
    FAIL | EQ | skips non-enumerable properties on `RegExp.prototype` | Expected: true, Actual: false
    FAIL | EQ | skips non-enumerable properties inherited from `RegExp.prototype` | Expected: true, Actual: false
 FAIL - `_.keysIn` skips non-enumerable properties (test in IE < 9)
    FAIL | EQ | skips non-enumerable properties on `RegExp.prototype` | Expected: true, Actual: false
    FAIL | EQ | skips non-enumerable properties inherited from `RegExp.prototype` | Expected: true, Actual: false
    FAIL | EQ | skips non-enumerable properties on `RegExp.prototype` | Expected: true, Actual: false
    FAIL | EQ | skips non-enumerable properties inherited from `RegExp.prototype` | Expected: true, Actual: false
----------------------------------------
lodash.partialRight
----------------------------------------
 FAIL - should work as a deep `_.defaults`
    FAIL | OK | Died on test #1     at invokeFun (/home/esbena/tmp/jalangi2-official/src/js/runtime/analysis.js:211:22): deep is not defined
----------------------------------------
    PASS: 4569  FAIL: 5  TOTAL: 4574
    Finished in 69820 milliseconds.
----------------------------------------

forIn variable update not wrapped in X1

The source:

for(var p in {}){}

Is instrumented to:

            J$.N(41, 'p', p, false, false, false);
            for (J$._tm_p in J$.H(17, J$.T(9, {}, 11, false))) {
                var p = J$.W(25, 'p', J$._tm_p, p, false, true, true);
                {
                    {
                    }
                }
            }

I think that:

var p = J$.W(25, 'p', J$._tm_p, p, false, true, true);

Should be wrapped in X1:

var p = J$.X1(xxx, J$.W(25, 'p', J$._tm_p, p, false, true, true));

In the same way as the source:

var x = 42;

Is instrumented to:

            var x = J$.X1(25, J$.W(17, 'x', J$.T(9, 42, 22, false), x, false, true, true));

Semantic bug: different exception thrown

A browserified browserify behaves differently when run with Jalangi:

Original behaviour:

TypeError: Object #<Object> has no method 'readFileSync'

Jalangi behaviour:

TypeError: Function.prototype.apply was called on undefined, which is a undefined and not a function

To reproduce:
In jalangi2 directory:

$ npm install -g browserify
$ mkdir bug
$ cd bug
$ mkdir node_modules
$ npm install browserify
$ echo 'require("browserify");' >> main.js
$ node main.js 
$ browserify main.js -o bundle.js
$ node bundle.js 

/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:845
var defaultPrelude = fs.readFileSync(defaultPreludePath, 'utf8');
                        ^
TypeError: Object #<Object> has no method 'readFileSync'
    at Object.<anonymous> (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:845:25)
    at Object.JSONStream (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:955:4)
    at s (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:1:316)
    at /Users/e.andreasen/workspace/jalangi2/bug/bundle.js:1:367
    at Object.<anonymous> (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:8:13)
    at Object../lib/builtins.js (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:791:4)
    at s (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:1:316)
    at /Users/e.andreasen/workspace/jalangi2/bug/bundle.js:1:367
    at Object.browserify (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:2:1)
    at s (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:1:316)
$ cd ..
$ node src/js/commands/jalangi.js --inlineIID --inlineSource --analysis src/js/sample_analyses/ChainedAnalyses.js --analysis src/js/sample_analyses/dlint/Utils.js --analysis src/js/sample_analyses/dlint/CheckNaN.js bug/bundle.js 

/Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:440
            throw tmp;
                  ^
TypeError: Function.prototype.apply was called on undefined, which is a undefined and not a function
    at callFun (/Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:145:51)
    at invokeFun (/Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:166:22)
    at /Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:189:41
    at Object.<anonymous> (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:2278:195)
    at callFun (/Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:145:51)
    at invokeFun (/Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:166:22)
    at /Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:189:41
    at Object.J$.X1.J$.F.J$.T.J$.T.J$.T.JSONStream (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:2516:73)
    at callFun (/Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:145:51)
    at invokeFun (/Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:166:22)

Wrong event order for getField on Firefox and Chrome

For the expression a.b(c), the node.js engine evaluates the sub-expressions in the following order:

  • a
  • c
  • a.b

However, in Chrome and Firefox, those sub-expressions are evaluated in a different order:

  • a
  • a.b
  • c

Problem: for the expression a.b(c), the instrumented code always emits the getField event for a.b after the read event for c.

Consider the following example code:

var a = {
    get b() {
        console.log('getting b');
        return function (){};
    }
};

var d = {
    get e() {
        console.log('getting e');
        return function (){};
    }
};

console.log('--------------');
a.b(d.e);
console.log('--------------');
(1, a.b)(d.e)

Running the example program above in Chrome and Firefox produces the following result:

--------------
getting b
getting e
--------------
getting b
getting e

Running the instrumented example code on Chrome and Firefox produces a different result:

--------------
getting e
getting b
--------------
getting b
getting e

Therefore, the instrumentation process changes the semantics of the example code on Firefox and Chrome.
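
The reordering can be reproduced without Jalangi. The sketch below assumes that the cause is a method lookup deferred into a closure that runs only after the arguments have been evaluated; `lazyMethodCall` and `eagerMethodCall` are hypothetical helpers for illustration and are not Jalangi functions.

```javascript
var order = [];
var a = { get b() { order.push('b'); return function () {}; } };
var d = { get e() { order.push('e'); return function () {}; } };

// Looks the method up only when the returned closure is invoked, i.e. after
// the argument expressions have already been evaluated.
function lazyMethodCall(base, offset) {
    return function () {
        var f = base[offset];
        return f.apply(base, arguments);
    };
}

// Looks the method up immediately, before the arguments are evaluated.
function eagerMethodCall(base, offset) {
    var f = base[offset];
    return function () {
        return f.apply(base, arguments);
    };
}

lazyMethodCall(a, 'b')(d.e);
var lazyOrder = order.join(',');
order = [];
eagerMethodCall(a, 'b')(d.e);
var eagerOrder = order.join(',');
console.log(lazyOrder, eagerOrder);
```

The eager variant triggers the getter for b before the getter for e, which matches the uninstrumented order reported for Chrome and Firefox.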

jalangi.analyze does not always perform an analysis

Description

It seems that require('jalangi2').analyze(...) can fail to perform an analysis by silently not loading the application of interest.

When the failure occurs, it occurs on every instrument-and-analyze attempt for an application except for the first. The situation only occurs if the instrumented files are named the same as in the previous attempt.

Abstract example

$ node instrument-and-analyze.js test1.js instrumentationDirectory
<<instrumentationDirectory/test1.js outputs>>

$ node instrument-and-analyze.js test1.js instrumentationDirectory
<<instrumentationDirectory/test1.js outputs>>

$ node instrument-and-analyze.js test2.js instrumentationDirectory
<<instrumentationDirectory/test2.js outputs>>

$ node instrument-and-analyze.js test2.js instrumentationDirectory
<<instrumentationDirectory/test2.js NOT LOADED>>

Concrete example

See https://github.com/esbena/jalangi-api-caching-bug

Additional information

  • Reproducible with node 0.10 and 0.12 on Ubuntu.
  • src/js/commands/{instrument,jalangi,direct}.js do not encounter the problem.

A bug in function level sampling

This problem is related to nested function scopes. Suppose this is the original code:

function f1() {
    function f2() {
        console.log('orig_fun');
    }
    f2();
}

f1();

With function-level sampling enabled, it is instrumented into the following
(the sampling of f2() itself is omitted for simplicity):

function f1() {
    if(true) { // suppose J$.S() always returns true here
        function f2() {
            console.log('instru_fun');
        }
        f2();
    } else {
        function f2() { // this f2 takes over the scope
            console.log('orig_fun');
        }
        f2();
    }
}

f1();

The instrumented code always prints orig_fun, because the second declaration of f2 takes over the scope of f1.
As a result, the instrumented version of f2 is never called, no matter what J$.S() returns.
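
A minimal sketch of one way to make the branch choice independent of how function declarations inside blocks are hoisted (that behaviour varies between engines and between strict and sloppy mode). It is an illustration only, not the fix adopted in Jalangi; the `useInstrumented` parameter stands in for the J$.S() sampling decision.

```javascript
var log = [];

function f1(useInstrumented) {
    // A single variable is assigned in exactly one branch, so the selected
    // version does not depend on hoisting of declarations in blocks.
    var f2;
    if (useInstrumented) {   // stands in for the J$.S() sampling decision
        f2 = function () { log.push('instru_fun'); };
    } else {
        f2 = function () { log.push('orig_fun'); };
    }
    f2();
}

f1(true);
f1(false);
console.log(log.join(','));
```

Function expressions are not hoisted, so a real transformation would have to place these assignments at the top of each branch to keep calls that precede the declaration working.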

Crash: Circular JSON on 3d-raytrace.js

The newest version of Jalangi crashes on 3d-raytrace.js with

sandbox.Config.ENABLE_SAMPLING = true;

and exception:

/Users/m.madsen/Arov/lib/jalangi2/src/js/instrument/esnstrument.js:1177
return JSON.parse(JSON.stringify(src), JSONParseHandler);
^
TypeError: Converting circular structure to JSON
at Object.stringify (native)
at clone (/Users/m.madsen/Arov/lib/jalangi2/src/js/instrument/esnstrument.js:1177:32)
at Object.visitorCloneBodyPre.FunctionDeclaration (/Users/m.madsen/Arov/lib/jalangi2/src/js/instrument/esnstrument.js:1251:29)
at transformAst (/Users/m.madsen/Arov/lib/jalangi2/src/js/instrument/astUtil.js:148:36)
at transformAst (/Users/m.madsen/Arov/lib/jalangi2/src/js/instrument/astUtil.js:141:39)
at Object.transformAst (/Users/m.madsen/Arov/lib/jalangi2/src/js/instrument/astUtil.js:141:39)
at transformString (/Users/m.madsen/Arov/lib/jalangi2/src/js/instrument/esnstrument.js:1759:30)
at Object.instrumentCode (/Users/m.madsen/Arov/lib/jalangi2/src/js/instrument/esnstrument.js:1819:26)
at Object.Module._extensions..js (/Users/m.madsen/Arov/lib/jalangi2/src/js/commands/jalangi.js:77:30)
at Module.load (module.js:355:32)

[BranchOnly]: Support Object Literals

In BranchOnly, analysis.literal is restricted to function objects only.
However, to track rudimentary dataflow, I think we also want to be able to track object literals.
It would be good if instrumentation of these could be supported as well.

Record replay in jalangi 2 ?

I am not clear on whether record/replay is impossible with Jalangi2 because of architectural changes or choices, or whether the record/replay engine simply has not been ported to Jalangi2 (and hence would be possible if someone did the work).

The reason I ask is that I am interested in the record/replay part of Jalangi, and I want to know which of the two forks to go forward with. If it is still possible with Jalangi2, I would be happy to put in the effort to port the record/replay engine.

overhead regression for the annex app

I tried measuring the overhead of Jalangi 2 instrumentation on annex and comparing the overhead with analysis2.js from Jalangi 1. Here are the running times for the fourth computer move on Chrome 40, where we are running with no client analysis.

analysis2.js from Jalangi 1: ~1.8s
Jalangi 2: ~2.1s

This is roughly a 17% slowdown (2.1s versus 1.8s), which is fairly significant. If I have time, I'll try to dig further into the root cause.

Reusing iids in src/js/runtime/analysis.js:A

The implementation of A in src/js/runtime/analysis.js is a bit odd.
It uses the same iids for the calls to G, B and P:

    function A(iid, base, offset, op, isComputed) {
        var oprnd1 = G(iid, base, offset, isComputed, true, false);
        return function (oprnd2) {
            var val = B(iid, op, oprnd1, oprnd2, true, false);
            return P(iid, base, offset, val, isComputed, true);
        };
    }

It should use different iids in the style of M:

    function M(iid, base, offset, isConstructor, isComputed) {
        return function () {
            var f = G(iid + 2, base, offset, isComputed, false, true);
            return (lastComputedValue = invokeFun(iid, base, f, arguments, isConstructor, true));
        };
    }

instrument.js hangs sometimes in node v5

Possibly related: #58

instrument.js hangs in node v5

To reproduce:

$ node --version
v5.0.0
$ mkdir test
$ cd test
$ echo '{}' > package.json
$ npm i jalangi2
$ npm i minimist
$ node node_modules/jalangi2/src/js/commands/instrument.js --outputDir out node_modules/minimist
^C # it hangs..
$ rm -rf node_modules/minimist/test
$ node node_modules/jalangi2/src/js/commands/instrument.js --outputDir out node_modules/minimist
instrumenting out/minimist2/index_orig_.js
instrumenting out/minimist2/example/parse_orig_.js
done!

It seems that the call to ncp on line 454 of instrument.js never calls the callback.
That can be shown with the following change:

ncp(inputDir, copyDir, {transform: transform}, function(){console.log('ncp done')});

Bug: Boxed primitives are unboxed as eval-results.

Source:

var x = new Boolean(true)
var y = eval(new Boolean(true))
console.log(x);
console.log(y);

Uninstrumented/instrumented difference:

$ node test.js
[Boolean: true]
[Boolean: true]
$ node src/js/commands/jalangi.js test.js 
[Boolean: true]
true
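
Plain eval returns a non-string argument unchanged, so an instrumenting wrapper would need to do the same. The sketch below is hypothetical: `evalWithInstrumentation` and `instrumentCode` are illustrative names rather than Jalangi functions, and it uses an indirect eval for simplicity, which evaluates in the global scope unlike a direct eval.

```javascript
// Hypothetical wrapper; instrumentCode stands in for the real instrumenter.
function evalWithInstrumentation(code, instrumentCode) {
    if (typeof code !== 'string') {
        return code; // non-string arguments are returned unchanged, as plain eval does
    }
    return (0, eval)(instrumentCode(code)); // indirect eval, global scope
}

var boxed = new Boolean(true);
var identity = function (src) { return src; };
console.log(evalWithInstrumentation(boxed, identity) === boxed);
console.log(evalWithInstrumentation('1 + 2', identity));
console.log(evalWithInstrumentation('', identity));
```

The boxed Boolean comes back as the same object, and the empty string evaluates to undefined.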

Exception check is wrong for `throw undefined`

The two checks in /src/js/runtime/analysis.js that check if an exception has been thrown are wrong.

The checks are implemented as

if (exceptionVal !== undefined) {...}

But this is wrong in the presence of throw undefined.

Example demonstrating the problem:

Consider the empty analysis (../empty-analysis.js):

(function (sandbox) {
    function MyAnalysis () {
    }
    sandbox.analysis = new MyAnalysis();
})(J$);

Consider the program:
../throw-undefined.js

throw undefined;

Consider the shell session, which shows no errors:

$ node src/js/commands/esnstrument_cli.js ../throw-undefined.js && node src/js/commands/direct.js --analysis ../empty-analysis.js ../throw-undefined_jalangi_.js
$

But it should, because running ../throw-undefined.js without Jalangi produces an error:

$ node ../throw-undefined.js

../throw-undefined.js:1
(function (exports, require, module, __filename, __dirname) { throw undefined;
                                                                    ^
undefined
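
A minimal sketch of the usual remedy: record the fact that something was thrown in a separate boolean instead of comparing the thrown value with undefined. `runAndObserve` is a hypothetical helper for illustration; it does not reproduce the actual code in analysis.js.

```javascript
function runAndObserve(body) {
    var exceptionThrown = false; // separate flag: the thrown value may itself be undefined
    var exceptionVal;
    try {
        body();
    } catch (e) {
        exceptionThrown = true;
        exceptionVal = e;
    }
    return { thrown: exceptionThrown, value: exceptionVal };
}

var r1 = runAndObserve(function () { throw undefined; });
var r2 = runAndObserve(function () {});
console.log(r1.thrown, r2.thrown);
```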

Add parameter to endExecution() to indicate a sudden exit

If a node program calls process.exit(), the underlying process exits immediately, without cleaning up the call stack, etc. In direct.js, we add an exit listener that at least ensures that endExecution() is invoked even if process.exit is called. However, this does not take care of all issues for the analysis client, which may, for example, be written expecting every functionEnter callback to have a corresponding functionExit callback. I can't think of anything we can really do here. We could monkey-patch process.exit to throw an exception, but that exception can be caught by the application code. I am logging this issue in case someone has an idea for how to hide the process.exit ugliness from analyses.

Confusing handling of global variables

The isGlobal and isScriptLocal parameters to the read and write callbacks have confusing and, in my opinion, inconsistent values when handling global variables.

Consider a simple analysis that just prints out the name of the variable and both flags on read and write (code below). Run it on the following piece of code:

var x;
x=1;
x;
z = 1;
z;
console=console;
console;

What I would expect: The isGlobal and isScriptLocal flags should be the same in reads and writes to the same variable. Moreover, z would either have isGlobal set and isScriptLocal unset, or vice versa, and console would have isGlobal set and isScriptLocal unset.

What I get:

Writing to x (script-local)
Reading from x (script-local)
Writing to z (local)
Reading from z (global and script-local?!)
Reading from console (global and script-local?!)
Writing to console (local)
Reading from console (global and script-local?!)

PS: Here is the analysis used:

// JALANGI DO NOT INSTRUMENT
(function (sandbox) {
    function MyAnalysis () {
        function writeFlags(isGlobal, isScriptLocal) {
            if (isGlobal && isScriptLocal) { return "global and script-local?!" }
            else if (isGlobal) { return "global" }
            else if (isScriptLocal) { return "script-local" }
            else { return "local" }
        }

        this.read = function(iid, name, val, isGlobal, isScriptLocal){
            console.log("Reading from " + name + " (" + writeFlags(isGlobal, isScriptLocal) + ")")
        };

        this.write = function(iid, name, val, lhs, isGlobal, isScriptLocal) {
            console.log("Writing to " + name + " (" + writeFlags(isGlobal, isScriptLocal) + ")")
        };
    }
    sandbox.analysis = new MyAnalysis();
})(J$);

Disabling INSTR_TRY_CATCH_ARGUMENTS throws IllegalStateException

Steps to reproduce:

  1. Clean Jalangi2 install.
  2. Edit Config.js to enable INSTR_TRY_CATCH_ARGUMENTS and always return false.
  3. Run node jalangi2/src/js/commands/instrument.js --outputDir foo jalangi2/tests/octane/box2d.js

/Users/m.madsen/tmp/jalangi2/src/js/commands/instrument.js:322
throw e;
^
Error: IllegalStateException
at getFnIdFromAst (/Users/m.madsen/tmp/jalangi2/src/js/instrument/esnstrument.js:531:19)
at wrapLiteral (/Users/m.madsen/tmp/jalangi2/src/js/instrument/esnstrument.js:554:37)
at Object.visitorRRPost.FunctionExpression (/Users/m.madsen/tmp/jalangi2/src/js/instrument/esnstrument.js:1386:24)
at transformAst (/Users/m.madsen/tmp/jalangi2/src/js/instrument/astUtil.js:148:36)
at transformAst (/Users/m.madsen/tmp/jalangi2/src/js/instrument/astUtil.js:141:39)
at transformAst (/Users/m.madsen/tmp/jalangi2/src/js/instrument/astUtil.js:141:39)
at transformAst (/Users/m.madsen/tmp/jalangi2/src/js/instrument/astUtil.js:141:39)
at transformAst (/Users/m.madsen/tmp/jalangi2/src/js/instrument/astUtil.js:141:39)
at transformAst (/Users/m.madsen/tmp/jalangi2/src/js/instrument/astUtil.js:141:39)
at transformAst (/Users/m.madsen/tmp/jalangi2/src/js/instrument/astUtil.js:141:39)

Bug: wrong base-object for with-statement function calls

Function calls done inside with-statements can receive the wrong base object.

It seems that calls inside a with-statement are always treated as plain function calls, whereas they should be treated as method calls when the callee is found on the "with-object".

Source:

function f() {console.log(this + '');}
var x = {m: function() {console.log(this + '');}}

with (x) {
  f();
  m();
}

Output uninstrumented/instrumented:

$ node test.js
[object global]
[object Object]
$ node src/js/commands/jalangi.js test.js 
[object global]
[object global]

Acorn parsing of node shell scripts

The acorn.parse usage in esnstrument.js does not support node shell scripts with #!-magic:

#!/usr/bin/env node
...

Trying to parse the file gives an error:

SyntaxError: Unexpected character '#' (1:0)
    at raise (.../jalangi2/node_modules/acorn/acorn.js:319:15)

Either allowHashBang: true should be set as an option to acorn.parse, or the first line needs to be changed into a comment manually with something like code = '//' + code;.

NB: Doing this blindly all the time is dangerous as it might suppress real syntax errors.
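
The manual workaround can be made conditional so that it is not applied blindly. In the sketch below, `neutralizeHashBang` is a hypothetical helper name; it only rewrites sources that actually start with "#!".

```javascript
function neutralizeHashBang(code) {
    // Only touch sources that really start with "#!"; everything else is
    // passed through so genuine syntax errors are still reported.
    if (code.slice(0, 2) === '#!') {
        return '//' + code; // line count is unchanged; columns on line 1 shift by 2
    }
    return code;
}

console.log(neutralizeHashBang('#!/usr/bin/env node\nvar x = 1;'));
console.log(neutralizeHashBang('var y = 2;'));
```

Because the first line becomes a comment rather than being removed, line numbers in source locations stay aligned with the original file.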

Wrong floating point semantics for postfix operations.

The instrumentation of numeric postfix operations does not preserve floating-point semantics.

Example

Source:

var a = 0.15;
console.log(a);
console.log(a++);
console.log(a);

Uninstrumented & Instrumented runs:

$ node test.js
0.15
0.15
1.15
$ node src/js/commands/jalangi.js test.js
0.15
0.1499999999999999
1.15

Explanation

Ideally, the value of a postfix expression is the initial value. However, the adjustIncDec function in esnstrument.js instead computes it by subtracting 1 from (or adding 1 to) the modified value, which is not exact for floating-point numbers, as seen above.

Instrumented:

...
J$.X1(65, J$.B(26, '-', a = J$.W(49, 'a', J$.B(18, '+', J$.U(10, '+', J$.R(41, 'a', a, 0)), J$.T(33, 1, 22, false), 0), a, 0), J$.T(57, 1, 22, false), 0));
...
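
The rounding effect, and one way to avoid it, can be shown in isolation. This is a sketch under assumptions: `postfixIncrement` is a hypothetical helper that remembers the old value before writing the new one, and it leaves out the J$ callbacks that the real instrumentation would still have to invoke.

```javascript
var a = 0.15;
var drifted = (a + 1) - 1;        // how the postfix value is recomputed above
console.log(drifted === a);

// Hypothetical alternative: remember the old value before writing the new one.
function postfixIncrement(read, write) {
    var oldValue = +read();       // convert to a number, as a++ does
    write(oldValue + 1);
    return oldValue;
}

var b = 0.15;
var result = postfixIncrement(function () { return b; }, function (v) { b = v; });
console.log(result, b);
```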

rewrite instrument.js to use cpr

We should rewrite instrument.js to use cpr instead of ncp. We are already stuck on some old ncp version due to API changes, and it also is preventing instrument.js from working on io.js (due to AvianFlu/ncp#79). Additionally, I'd like to add an option so that rather than writing the _orig_.js files, we simply point back to the original files, or put the original files in some parallel directory structure (but this will take some thought).

[Feature Request]: Disable Instrumentation On-Demand

In Jalangi it is currently possible to turn instrumentation on and off with ENABLE_SAMPLING and runInstrumentedFunctionBody.

What I would like to ask for is a mechanism, for example as part of preFunInvoke, where you can opt to enter the original un-instrumented function and thus escape entirely to a world where nothing is instrumented. You would of course never be able to go back.

A potential issue is function objects stored in the heap, and how they would be replaced. Here it might still be sufficient to rely on the ENABLE_SAMPLING trick.

If this is beyond the scope of Jalangi, some pointers on how to achieve this would be welcome.

Missing J$.T for primitive literals in typeof expressions?

It seems that typeof "foo" and similar primitive literal arguments to typeof do not get instrumented.
Is this intentional?

Source:

typeof "foo";
typeof true;
typeof 1;

+"bar";
typeof +"baz";
typeof {};

Annotated instrumentation:

...
            J$.X1(9, J$.U(10, 'typeof', 'foo')); // << no J$.T
            J$.X1(17, J$.U(18, 'typeof', true)); // << no J$.T
            J$.X1(25, J$.U(26, 'typeof', 1)); // << no J$.T

            J$.X1(41, J$.U(34, '+', J$.T(33, 'bar', 21, false))); // << J$.T for non-typeof unary
            J$.X1(57, J$.U(50, 'typeof', J$.U(42, '+', J$.T(49, 'baz', 21, false)))); // << J$.something for expression argument to typeof
            J$.X1(73, J$.U(58, 'typeof', J$.T(65, {}, 11, false))); // << J$.T for object argument to typeof
...

Missing ReferenceError on undeclared variable references?

Jalangi squelches the ReferenceError that would occur on a reference to an undeclared variable through its use of J$.I.

This breaks feature-detection with try-catch blocks as seen below:

try {
    document; // <-- supposed to throw in non-browser environments
} catch (e) {
    standalone = true;
} 

Suggestion: the jalangi-instrumentation-pattern with J$.I(...) for undeclared variables should be disabled by default.
Thoughts?


Small example that showcases the discrepancy between a jalangi-runtime and a node-runtime:

test.js:

DOES_NOT_EXIST;

test_jalangi_.js:

...
J$.X1(17, J$.I(typeof DOES_NOT_EXIST === 'undefined' ? DOES_NOT_EXIST = J$.R(9, 'DOES_NOT_EXIST', undefined, true, true) : DOES_NOT_EXIST = J$.R(9, 'DOES_NOT_EXIST', DOES_NOT_EXIST, true, true)));
...
$ node src/js/commands/jalangi.js test.js
$ node test.js
...
(function (exports, require, module, __filename, __dirname) { DOES_NOT_EXIST;
                                                              ^
ReferenceError: DOES_NOT_EXIST is not defined
...
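The discrepancy comes from the typeof guard in the instrumented code. A small sketch that runs without Jalangi shows the difference. It mimics only the shape of the guard, without the J$ calls and without the assignment, and the identifier DOES_NOT_EXIST_SKETCH is a made-up name that is assumed to be undeclared.

```javascript
// Plain read of an undeclared identifier: throws a ReferenceError.
function plainRead() {
    try {
        DOES_NOT_EXIST_SKETCH;
        return 'no error';
    } catch (e) {
        return e.name;
    }
}

// Read guarded by typeof, mimicking the shape of the instrumented code.
// typeof never throws for undeclared identifiers, so the catch is skipped.
function guardedRead() {
    try {
        typeof DOES_NOT_EXIST_SKETCH === 'undefined' ? undefined : DOES_NOT_EXIST_SKETCH;
        return 'no error';
    } catch (e) {
        return e.name;
    }
}

console.log(plainRead());
console.log(guardedRead());
```

Any try-catch based feature detection behaves like plainRead in an uninstrumented run and like guardedRead under this instrumentation pattern.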

Missing declaration for function expression names

Jalangi does not have a declare-call for the name of a function expression.

I think this source:

(function g(){})();

Should be instrumented to something like:

J$.X1(113, J$.F(105, J$.T(97, function g() {
                jalangiLabel1:
                    while (true) {
                        try {

// NEW
                            g = J$.X1(2324, J$.N(137, 'g', J$.T(129, g, 12, false), true, false, false));
// END OF NEW

                            J$.Fe(81, arguments.callee, this, arguments);
                            arguments = J$.N(89, 'arguments', arguments, true, false, false);
                        } catch (J$e) {
                            J$.Ex(161, J$e);
                        } finally {
                            if (J$.Fr(169))
                                continue jalangiLabel1;
                            else
                                return J$.Ra();
                        }
                    }
            }, 12, false), false)());
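For background, the name of a function expression is bound only inside the function itself, which is why a declare-call for it would have to sit inside the body. The sketch below illustrates that scoping in plain JavaScript, using the made-up name namedExpr. It also shows that the binding is read-only, so an assignment such as the `g = ...` in the proposal above might not take effect as written.

```javascript
var probe = (function namedExpr() {
    var before = typeof namedExpr; // the name is visible inside the body
    try {
        // The binding is immutable: this assignment is silently ignored in
        // sloppy mode and rejected with a TypeError in strict mode.
        namedExpr = 42;
    } catch (e) {
        // strict mode: assignment rejected, binding unchanged
    }
    return { before: before, after: typeof namedExpr };
})();

// Outside the function expression, the name is not bound at all.
console.log(probe.before, probe.after, typeof namedExpr);
```

A declare-call that only reports the binding, without assigning to it, would sidestep the read-only binding.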

Use Javascript based task runners?

Given that Jalangi is a JavaScript tool, it might be a good idea to migrate all the Python-based tasks (scripts folder) to JavaScript-based task runners (grunt / gulp). This would reduce the number of additional installations required.

Documentation: isGlobal and eval interact in unexpected ways

The behavior of the isGlobal flag in the read and write callbacks interacts with evals in an unexpected way. Consider the following example:

var x = "magic";
console.log(x);
eval("console.log(x)");

Then reading x the first time will invoke the read callback with isGlobal=false and isScriptLocal=true (which is entirely expected, from the documentation), while the read in the eval will invoke the callback with isGlobal=true and isScriptLocal=true.

According to the documentation, isGlobal is "True if the variable is not declared using var", but in fact, isGlobal is "True if the variable is not declared using var in the current script".

Minimal analysis to reproduce:

(function (sandbox) {
    function MyAnalysis(global) {
        this.read = function (iid, name, val, isGlobal, isScriptLocal) {
            console.log("Reading " + name + ", isGlobal=" + isGlobal);
        };

        this.scriptEnter = function (iid, instrumentedFileName, originalFileName) {
            console.log("Entering script");
        };

        this.scriptExit = function (iid, wrappedExceptionVal) {
            console.log("Exiting script");
        };
    }
    sandbox.analysis = new MyAnalysis(this);
})(J$);

This yields:

Entering script
Reading console, isGlobal=true
Reading x, isGlobal=false
magic
Entering script
Reading console, isGlobal=true
Reading x, isGlobal=true
magic
Exiting script
Reading console, isGlobal=true
Reading x, isGlobal=false
magic
Exiting script
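Until the documentation or the flag changes, an analysis could track declarations itself. The following is a rough sketch under stated assumptions: the factory name createAnalysis is hypothetical, the `declare` callback signature is assumed from the J$.N declare-calls, and name shadowing is ignored entirely. The callbacks are driven by hand here, so the sketch runs without Jalangi.

```javascript
// Hypothetical analysis factory; in a real run the returned object would be
// assigned to J$.analysis. The `declare` callback signature is an assumption.
function createAnalysis(log) {
    var declaredNames = Object.create(null);
    return {
        declare: function (iid, name, val, isArgument, argumentIndex, isCatchParam) {
            // Remember every declared name, across scripts and evals.
            declaredNames[name] = true;
        },
        read: function (iid, name, val, isGlobal, isScriptLocal) {
            // Report whether a declaration was ever seen, regardless of which
            // script (or eval) the read occurs in. Shadowing is not modelled.
            var seenDeclaration = name in declaredNames;
            log('Reading ' + name + ', isGlobal=' + isGlobal +
                ', declaredAnywhere=' + seenDeclaration);
        }
    };
}

// Simulated callback sequence for the example above.
var lines = [];
var analysis = createAnalysis(function (line) { lines.push(line); });
analysis.declare(1, 'x', undefined, false, -1, false);
analysis.read(2, 'x', 'magic', false, true); // read in the outer script
analysis.read(3, 'x', 'magic', true, true);  // read inside the eval
console.log(lines.join('\n'));
```

With this bookkeeping, the read inside the eval is still reported with isGlobal=true, but the analysis can tell that x was declared in an enclosing script.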
