samsung / jalangi2

Dynamic analysis framework for JavaScript

License: Apache License 2.0

JavaScript 86.04% Python 0.35% Java 0.01% HTML 7.66% CSS 5.78% Shell 0.04% Less 0.13%

jalangi2's People

esbena jackofmosttrades hothost87 alpha360x silverge franktip cwz920716 arunkt1 marijaselakovic superlyb shabesoglu harroldfinch marinabilles scylla neostoic longjohncoder mwbutcher osmanager rohanpadhye christofferqa sunnyjiang linearregression drill89 michaelpradel yorci arranf smt-js zunygun vieyahn tesla3327 cpx0rpc ric-light msridhar nagyistge thornmaker joseroubert08 jacksongl 0xleir zcl0203 ecu-pase-lab saya327 ndcroos floledermann pombredanne cnlittledimple njuhxc zhangbowei manjiribirajdar changxiaoning realdennis haiyang-sun jawline professorcoal gabrielnicolasavellaneda ufwt marcio-diaz abay123 senad87 andrewhead cnxtech eddings mem2019 livelifelively rmlatdibris robertdurst wsdou zfq005 angomablaise mikeslover88 gmbale nodechaser tao2years zer0yu bitcalc jkla972 madhunimmo kaist-plrg groupbwt-admin mrcodechef ucr-riple litschiw hilbigan itisbean metabob-axel4-c2nd crackercat nashid ljh9248 stjordanis pjsrcool soufianos01 captainbarber99 nitrogenousfish gabssnake msettro frankie-b ej11 huddy1985 xmhu123 jessicam9797 amandascm

jalangi2's Issues

Feature Request: distinguish between ++ and +=1 in binary callback

In binaryPre and binary, there seems to be no way to distinguish between the binary plus in a.b++ and the one in a.b+=1. It may be useful to add another argument (e.g., isAuto) to the binaryPre and binary callbacks.

eval of the empty string does not produce `undefined`

Example evals; Jalangi behaves incorrectly on the second one, printing an empty line instead of undefined:

$ cat test.js                            
console.log(eval());
console.log(eval(""));
console.log(eval(undefined));
$ node test.js                           
undefined
undefined
undefined
$ node src/js/commands/jalangi.js test.js 
undefined

undefined

Missing assignment for function declarations in 34427b7

As of 34427b7,

function f() {}

Is instrumented to:

function f() {
                jalangiLabel0:
...
}
J$.N(41, 'f', J$.T(33, f, 12, false), false, false, false);

Rather than:

function f() {
                jalangiLabel0:
...
}
f = J$.N(41, 'f', J$.T(33, f, 12, false), true, false, false);

The missing assignment seems to be a bug, as it is no longer possible to redefine f with the return value of J$.N(...).

[BranchOnly]: J$cntr is not defined

If I try to instrument a program with the following:

sandbox.Config.requiresInstrumentation = function (id, funId, sid, funName) {
    if (funName === 'J$.T') {
        return true;
    }

    if (funName === 'J$.S' || funName === 'J$.Fe' || funName === 'J$.Fr' ||
        funName === 'J$.M' || funName === 'J$.F' ||
        funName === 'J$.C1' || funName === 'J$.C2' || funName === 'J$.C') {

        return Instrument.indexOf(funId) !== - 1;
    }

    return false;
};

where Instrument is a stable array of identifiers, I get

ReferenceError: J$cntr is not defined

update documentation to indicate that we break code that does stack inspection

Look at this fun code from here:

exports.getFileName = function getFileName (calling_file) {
  var origPST = Error.prepareStackTrace
    , origSTL = Error.stackTraceLimit
    , dummy = {}
    , fileName

  Error.stackTraceLimit = 10

  Error.prepareStackTrace = function (e, st) {
    for (var i=0, l=st.length; i<l; i++) {
      fileName = st[i].getFileName()
      if (fileName !== __filename) {
        if (calling_file) {
            if (fileName !== calling_file) {
              return
            }
        } else {
          return
        }
      }
    }
  }

  // run the 'prepareStackTrace' function above
  Error.captureStackTrace(dummy)
  dummy.stack

  // cleanup
  Error.prepareStackTrace = origPST
  Error.stackTraceLimit = origSTL

  return fileName
}

It's inspecting the stack trace from an exception to figure out the filename for the caller module. Since Jalangi2 doesn't preserve call stacks, this code breaks.

This is probably a fundamental issue, but posting in case there happens to be some kind of solution.

simplified handling of source locations for multiple scripts

I think the handling of source locations with multiple scripts can be simplified for analyses with some changes. Right now, if I understand correctly, the current script ID is stored in J$.sid. If the analysis wants to detect when the current script ID has changed, it needs to implement its own logic in the scriptEnter, scriptExit, functionEnter, and functionExit callbacks. Furthermore, the analysis is responsible for maintaining a mapping from script ID to the corresponding file name (the originalFileName parameter to scriptEnter).

Instead, I propose that this logic all be implemented once-and-for-all in analysis.js and exposed to the analysis via a new updateCurrentScript(name) callback. updateCurrentScript would be invoked any time the current script is changing, e.g., at script enter, or at function exit if the caller was in a different script. analysis.js would maintain the mapping between script IDs and names, so the analysis would never need to see script IDs. We'd of course need some special handling for eval'd scripts. I think baking this logic into analysis.js is the right way to go to simplify writing analyses.
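
A minimal sketch of how this bookkeeping could look. The ScriptTracker helper, its method names, and the stack representation are hypothetical and are not existing Jalangi2 code; eval'd scripts are not handled here.

```javascript
// Hypothetical helper: tracks the current script and notifies the analysis
// through an updateCurrentScript(name) callback only when the script changes.
function ScriptTracker(analysis) {
    this.analysis = analysis;
    this.names = {};          // script ID -> original file name
    this.stack = [];          // script IDs of the active script/function frames
    this.current = undefined; // script ID last reported to the analysis
}

// Report a change of script, if there is one.
ScriptTracker.prototype.notify = function (sid) {
    if (sid !== this.current) {
        this.current = sid;
        if (sid !== undefined && this.analysis.updateCurrentScript) {
            this.analysis.updateCurrentScript(this.names[sid]);
        }
    }
};

// On script enter: remember the file name and switch to that script.
ScriptTracker.prototype.scriptEnter = function (sid, originalFileName) {
    this.names[sid] = originalFileName;
    this.stack.push(sid);
    this.notify(sid);
};

// On function enter: sid is the script that defines the function.
ScriptTracker.prototype.functionEnter = function (sid) {
    this.stack.push(sid);
    this.notify(sid);
};

// On script or function exit: switch back to the caller's script.
ScriptTracker.prototype.exit = function () {
    this.stack.pop();
    this.notify(this.stack[this.stack.length - 1]);
};
```

With this shape, an analysis would only implement updateCurrentScript(name) and would never see a script ID.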

[BranchOnly]: Support InvokeFunPre and InvokeFun

Add support for InvokeFunPre and InvokeFun and add these to analysisCallbackTemplate.

Content-type checking in proxy.py is too restrictive

The following code in: https://github.com/Samsung/jalangi2/blob/master/scripts/proxy.py#L43
misses several files for instrumentation:

Content-Type is case insensitive (i.e., it can also be content-type or CoNtEnT-tYpE):
http://stackoverflow.com/questions/5258977/are-http-headers-case-sensitive

def response(context, flow):
    flow.response.decode()
    if 'Content-Type' in flow.response.headers:
        if flow.response.headers['Content-Type'][0].find('javascript') != -1:
            flow.response.content = processFile(flow.response.content, "js")
        if flow.response.headers['Content-Type'][0].find('html') != -1:
            flow.response.content = processFile(flow.response.content, "html")

You could try this:

def content_type(headers):
    for key in headers.keys():
        if key.lower() == "content-type":
            return headers[key].lower()
    return None

def response(context, flow):
    flow.response.decode()
    # content_type() returns None when the header is absent; skip such responses
    ctype = content_type(flow.response.headers)
    if ctype is None:
        return
    if 'javascript' in ctype:
        flow.response.content = processFile(flow.response.content, "js")
    elif 'html' in ctype:
        flow.response.content = processFile(flow.response.content, "html")

You might also want to look at the extension of the path (i.e. filename: flow.request.path.split('/')[-1] or ext: flow.request.path.split('.')[-1], with appropriate string sanitization).

isComputed & binary delete

analysis.getField and analysis.putField have a parameter named isComputed. It would be nice if analysis.binary also had it, for use in delete o[p] expressions.

[Bug]: INSTR_LITERAL and INSTR_INIT are not orthogonal

Disabling instrumentation of certain initializers may prevent instrumentation of certain literals.

For example:

var Box2D = {};
(function () {
    function f () {
        return 2+2;
    }
})();

If INSTR_LITERAL and INSTR_INIT are both enabled, both function literals are instrumented with J$.T...

However, if INSTR_INIT is disabled then the inner function literal is not instrumented.

Infinite Loop with INSTR_WRITE Disabled

The newest version of Jalangi enters an infinite loop on 3d-raytrace.js with instrumentation disabled for writes:

sandbox.Config.INSTR_WRITE = function (name, ast) {
    return false;
};

JIT-friendly mode

For experiments regarding performance overhead, it would be good to have a mode in which esnstrument.js does not create any code constructs known to stop V8 from applying its JIT. As of now, the relevant constructs are:

try/finally blocks enclosing method bodies
referencing and escaping the arguments array

The JIT-friendly mode need not be suitable for writing analyses; it would just be used to be able to measure the overhead of the constructs across an array of runtimes.

endExpression & sequence-expressions

J$.X1 is missing inside sequence-expressions, as the non-last
expressions are "end of expression" by themselves.

I suggest pushing J$.X1 onto all members of a sequence-expression.

Consider the two examples:

('a', 'b')

Is instrumented to:

J$.X1((J$.T(9, 'a', 21, false), J$.T(17, 'b', 21, false)));

But it should be:

(J$.X1(J$.T(9, 'a', 21, false)), J$.X1(J$.T(17, 'b', 21, false)));

for('a', 'b';;);

Is instrumented to:

for (J$.X1((J$.T(25, 'a', 21, false), J$.T(33, 'b', 21, false)));;);

But it should be:

for ((J$.X1(J$.T(25, 'a', 21, false)), J$.X1(J$.T(33, 'b', 21, false)));;);

(NB: AST-wise, the node.type of the sequence expression is: "SequenceExpression")

better source locations for formal parameters

Consider the following JS script:

function foo(f,g) {}

When instrumented (with --inlineIID option), we get:

J$.iids = {"9":[1,1,1,21],"17":[1,1,1,21],"25":[1,1,1,21],"33":[1,1,1,21],"41":[1,1,2,1],"49":[1,1,1,21],"57":[1,1,2,1],"65":[1,1,1,21],"73":[1,1,1,21],"81":[1,1,2,1],"89":[1,1,2,1],"nBranches":2,"originalCodeFileName":"/tmp/params.js","instrumentedCodeFileName":"/tmp/params_jalangi_.js"};
jalangiLabel1:
    while (true) {
        try {
            J$.Se(41, '/tmp/params_jalangi_.js', '/tmp/params.js');
            function foo(f, g) {
                jalangiLabel0:
                    while (true) {
                        try {
                            J$.Fe(9, arguments.callee, this, arguments);
                            arguments = J$.N(17, 'arguments', arguments, true, false, false);
                            f = J$.N(25, 'f', f, true, false, false);
                            g = J$.N(33, 'g', g, true, false, false);
...

Note that the source locations for the J$.N callbacks for parameters f and g are [1,1,1,21], i.e., the entirety of function foo. Instead, it would be nice if these callbacks got source locations that only included the corresponding formal parameter (e.g., [1,12,1,13] for parameter f, IID 25).

https://github.com/Samsung/jalangi2/pull/28 (passing internal iid of a function to literal)

I believe that getters and setters are not getting the correct internalIids.

Comparing function name

Hello, I am writing an analysis that checks the parameters passed to post functions.
To do that, the analysis checks whether the called function is post, but it fails with 'ReferenceError: post is not defined'. Below is the snippet of the analysis; the error is caused by 'var POST_FUNCTION = post'. Referring to the function by name like this was taken from the Jalangi online demo. I understand that the demo is no longer supported. However, is there any other way to check a function's name in either invokeFunPre or invokeFun by comparing the parameter f? If not, can it be done with any functions provided by Jalangi?

ANALYSIS:

var POST_FUNCTION = post;

this.invokeFunPre = function (iid, f, base, args, isConstructor) {
    console.log('function call intercepted before invoking');
    if (f === POST_FUNCTION && args) {
        console.log('function is POST');
        // pass in config, always third parameter in http.post
        checkParams(args[2]);
    }
};

I am also including the code that I am testing below.
CODE:

function post() {
} 

var sampleRecords = function(params) {
  var config = {
    params: params,
    httpErrorHandlers: {
      '4xx': function(error) {
        deferred.reject(error);
      }
    }
  }

post('/machinelearning/service/datasource/sample-records', null, config);
}

var params = {
  dbPassword : "password_test",
  dbUsername : "username_test"
};

sampleRecords(request);
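
One possible workaround, sketched below and not confirmed by the Jalangi maintainers: compare the standard name property of the function object passed as f, instead of referring to the identifier post from the analysis scope, where it is not defined. The isCallTo helper and the postCalls array are hypothetical names introduced for illustration, and the approach only works for named functions.

```javascript
// Hypothetical helper: true if f is a function whose declared name matches.
// Relies only on the standard name property of function objects.
function isCallTo(f, expectedName) {
    return typeof f === 'function' && f.name === expectedName;
}

// Same callback shape as in the question, using the name comparison.
function MyAnalysis() {
    this.postCalls = [];
    this.invokeFunPre = function (iid, f, base, args, isConstructor) {
        if (isCallTo(f, 'post') && args) {
            // args[2] is the config object in the example program
            this.postCalls.push(args[2]);
        }
    };
}
```

Note that two unrelated functions with the same name are indistinguishable under this check, which an identity comparison (f === POST_FUNCTION) would avoid if a reference to the function were available.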

Semantic preservation bugs when using instrumented lodash

When the lodash library is run instrumented, its semantics seem to change slightly:
the internal test suite of the library fails when run with jalangi.js.

Setup:

$ git clone git@github.com:lodash/lodash.git
$ cd lodash
$ npm i
...
$ cd ..

Run lodash test suite uninstrumented:

$ node lodash/test/test.js         
test.js invoked with arguments: ["node","/home/esbena/tmp/lodash/test/test.js"]
----------------------------------------
    PASS: 4574  FAIL: 0  TOTAL: 4574
    Finished in 8994 milliseconds.
----------------------------------------

Run lodash test suite instrumented:

$ node jalangi2-official/src/js/commands/jalangi.js lodash/test/test.js
test.js invoked with arguments: ["node","/home/esbena/tmp/lodash/test/test.js"]
----------------------------------------
keys methods
----------------------------------------
 FAIL - `_.keys` skips non-enumerable properties (test in IE < 9)
    FAIL | EQ | skips non-enumerable properties on `RegExp.prototype` | Expected: true, Actual: false
    FAIL | EQ | skips non-enumerable properties inherited from `RegExp.prototype` | Expected: true, Actual: false
 FAIL - `_.keysIn` skips non-enumerable properties (test in IE < 9)
    FAIL | EQ | skips non-enumerable properties on `RegExp.prototype` | Expected: true, Actual: false
    FAIL | EQ | skips non-enumerable properties inherited from `RegExp.prototype` | Expected: true, Actual: false
    FAIL | EQ | skips non-enumerable properties on `RegExp.prototype` | Expected: true, Actual: false
    FAIL | EQ | skips non-enumerable properties inherited from `RegExp.prototype` | Expected: true, Actual: false
----------------------------------------
lodash.partialRight
----------------------------------------
 FAIL - should work as a deep `_.defaults`
    FAIL | OK | Died on test #1     at invokeFun (/home/esbena/tmp/jalangi2-official/src/js/runtime/analysis.js:211:22): deep is not defined
----------------------------------------
    PASS: 4569  FAIL: 5  TOTAL: 4574
    Finished in 69820 milliseconds.
----------------------------------------

[BranchOnly] Pass Ast fragment to Config.requiresInstrumentation

See above

forIn variable update not wrapped in X1

The source:

for(var p in {}){}

Is instrumented to:

            J$.N(41, 'p', p, false, false, false);
            for (J$._tm_p in J$.H(17, J$.T(9, {}, 11, false))) {
                var p = J$.W(25, 'p', J$._tm_p, p, false, true, true);
                {
                    {
                    }
                }
            }

I think that:

var p = J$.W(25, 'p', J$._tm_p, p, false, true, true);

Should be wrapped in X1:

var p = X1(xxx, J$.W(25, 'p', J$._tm_p, p, false, true, true));

In the same way as the source:

var x = 42;

Is instrumented to:

            var x = J$.X1(25, J$.W(17, 'x', J$.T(9, 42, 22, false), x, false, true, true));

Semantic bug: different exception thrown

A browserified browserify behaves differently when run with Jalangi:

Original behaviour:

TypeError: Object #<Object> has no method 'readFileSync'

Jalangi behaviour:

TypeError: Function.prototype.apply was called on undefined, which is a undefined and not a function

To reproduce:
In jalangi2 directory:

$ npm install -g browserify
$ mkdir bug
$ cd bug
$ mkdir node_modules
$ npm install browserify
$ echo 'require("browserify");' >> main.js
$ node main.js 
$ browserify main.js -o bundle.js
$ node bundle.js 

/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:845
var defaultPrelude = fs.readFileSync(defaultPreludePath, 'utf8');
                        ^
TypeError: Object #<Object> has no method 'readFileSync'
    at Object.<anonymous> (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:845:25)
    at Object.JSONStream (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:955:4)
    at s (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:1:316)
    at /Users/e.andreasen/workspace/jalangi2/bug/bundle.js:1:367
    at Object.<anonymous> (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:8:13)
    at Object../lib/builtins.js (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:791:4)
    at s (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:1:316)
    at /Users/e.andreasen/workspace/jalangi2/bug/bundle.js:1:367
    at Object.browserify (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:2:1)
    at s (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:1:316)
$ cd ..
$ node src/js/commands/jalangi.js --inlineIID --inlineSource --analysis src/js/sample_analyses/ChainedAnalyses.js --analysis src/js/sample_analyses/dlint/Utils.js --analysis src/js/sample_analyses/dlint/CheckNaN.js bug/bundle.js 

/Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:440
            throw tmp;
                  ^
TypeError: Function.prototype.apply was called on undefined, which is a undefined and not a function
    at callFun (/Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:145:51)
    at invokeFun (/Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:166:22)
    at /Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:189:41
    at Object.<anonymous> (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:2278:195)
    at callFun (/Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:145:51)
    at invokeFun (/Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:166:22)
    at /Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:189:41
    at Object.J$.X1.J$.F.J$.T.J$.T.J$.T.JSONStream (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:2516:73)
    at callFun (/Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:145:51)
    at invokeFun (/Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:166:22)

Support Conditional Instrumentation of Init (J$.N)

Add support for a flag in Config to disable instrumentation of initializers, e.g. to avoid generation of instrumented code such as:

J$.N(161, 'PI2nx', PI2nx, false, false, false);

Wrong event order for getField on Firefox and Chrome

For the expression a.b(c), the node.js engine evaluates the sub-expressions in the following order:

a
c
a.b

However, in Chrome and Firefox, those sub-expressions are evaluated in a different order:

a
a.b
c

Problem: for the expression a.b(c), the instrumented code always emits the getField event for a.b after the read event for c.

Consider the following example code:

var a = {
    get b() {
        console.log('getting b');
        return function (){};
    }
};

var d = {
    get e() {
        console.log('getting e');
        return function (){};
    }
};

console.log('--------------');
a.b(d.e);
console.log('--------------');
(1, a.b)(d.e)

Running the example program above in Chrome and Firefox produces the following result:

--------------
getting b
getting e
--------------
getting b
getting e

Running the instrumented example code on Chrome and Firefox produces a different result:

--------------
getting e
getting b
--------------
getting b
getting e

Therefore, the instrumentation process changes the semantics of the example code on Firefox and Chrome.

jalangi.analyze does not always perform an analysis

Description

It seems that require('jalangi2').analyze(...) can fail to perform an analysis by silently not loading the application of interest.

When the failure occurs, it occurs on every instrument-and-analyze attempt for an application except for the first. The situation only occurs if the instrumented files are named the same as in the previous attempt.

Abstract example

$ node instrument-and-analyze.js test1.js instumentationDirectory
<<instumentationDirectory/test1.js outputs>>

$ node instrument-and-analyze.js test1.js instumentationDirectory
<<instumentationDirectory/test1.js outputs>>

$ node instrument-and-analyze.js test2.js instumentationDirectory
<<instumentationDirectory/test2.js outputs>>

$ node instrument-and-analyze.js test2.js instumentationDirectory
<<instumentationDirectory/test2.js NOT LOADED>>

Concrete example

See https://github.com/esbena/jalangi-api-caching-bug

Additional information

Reproducible with node 0.10 and 0.12 on Ubuntu.
src/js/commands/{instrument,jalangi,direct}.js do not encounter the problem.

A bug in function level sampling

This is a problem related to nested function scope. Suppose this is the original code:

function f1() {
    function f2() {
        console.log('orig_fun');
    }
    f2();
}

f1();

With function level sampling enabled, it is instrumented into the following
(the sampling of f2() is omitted for simplicity):

function f1() {
    if(true) { // suppose J$.S() always returns true here
        function f2() {
            console.log('instru_fun');
        }
        f2();
    } else {
        function f2() { // this f2 takes over the scope
            console.log('orig_fun');
        }
        f2();
    }
}

f1();

The entire instrumented code will always print: orig_fun
As a result, instrumented version of f2 is never called no matter what J$.S() returns.

Crash: Circular JSON on 3d-raytrace.js

The newest version of Jalangi crashes on 3d-raytrace.js with

sandbox.Config.ENABLE_SAMPLING = true;

and exception:

/Users/m.madsen/Arov/lib/jalangi2/src/js/instrument/esnstrument.js:1177
return JSON.parse(JSON.stringify(src), JSONParseHandler);
^
TypeError: Converting circular structure to JSON
at Object.stringify (native)
at clone (/Users/m.madsen/Arov/lib/jalangi2/src/js/instrument/esnstrument.js:1177:32)
at Object.visitorCloneBodyPre.FunctionDeclaration (/Users/m.madsen/Arov/lib/jalangi2/src/js/instrument/esnstrument.js:1251:29)
at transformAst (/Users/m.madsen/Arov/lib/jalangi2/src/js/instrument/astUtil.js:148:36)
at transformAst (/Users/m.madsen/Arov/lib/jalangi2/src/js/instrument/astUtil.js:141:39)
at Object.transformAst (/Users/m.madsen/Arov/lib/jalangi2/src/js/instrument/astUtil.js:141:39)
at transformString (/Users/m.madsen/Arov/lib/jalangi2/src/js/instrument/esnstrument.js:1759:30)
at Object.instrumentCode (/Users/m.madsen/Arov/lib/jalangi2/src/js/instrument/esnstrument.js:1819:26)
at Object.Module._extensions..js (/Users/m.madsen/Arov/lib/jalangi2/src/js/commands/jalangi.js:77:30)
at Module.load (module.js:355:32)

[BranchOnly]: Support Object Literals

In BranchOnly, analysis.literal is restricted to function objects only.
However, to track rudimentary dataflow, I think we also want to be able to track object literals.
It would be helpful if instrumentation of these could be supported as well.

cannot use argparse 1.0.2

npm install fails if I try to upgrade argparse from 0.1.6 to 1.0.2 in package.json.

Record replay in jalangi 2 ?

It is not clear to me whether record/replay is impossible with jalangi2 because of architectural changes or choices, or whether the record/replay engine simply has not been ported to jalangi2 (and would therefore be possible if someone did the work).

The reason I ask is that I am interested in the record/replay part of jalangi, and I want to know which of the two forks to go forward with. If it is still possible with jalangi2, I would be happy to put in the effort to port the record/replay engine.

overhead regression for the annex app

I tried measuring the overhead of Jalangi 2 instrumentation on annex and comparing it with analysis2.js from Jalangi 1. Here are the running times for the fourth computer move on Chrome 40, with no client analysis:

analysis2.js from Jalangi 1: ~1.8s
Jalangi 2: ~2.1s

This is a fairly significant slowdown (roughly 17%). If I have time, I'll try to dig further into the root cause.

Reusing iids in src/js/runtime/analysis.js:A

The implementation of A in src/js/runtime/analysis.js is a bit odd.
It uses the same iids for the calls to G, B and P:

    function A(iid, base, offset, op, isComputed) {
        var oprnd1 = G(iid, base, offset, isComputed, true, false);
        return function (oprnd2) {
            var val = B(iid, op, oprnd1, oprnd2, true, false);
            return P(iid, base, offset, val, isComputed, true);
        };
    }

It should use different iids in the style of M:

    function M(iid, base, offset, isConstructor, isComputed) {
        return function () {
            var f = G(iid + 2, base, offset, isComputed, false, true);
            return (lastComputedValue = invokeFun(iid, base, f, arguments, isConstructor, true));
        };
    }

instrument.js hangs sometimes in node v5

Possibly related: #58

instrument.js hangs in node v5

To reproduce:

$ node --version
v5.0.0
$ mkdir test
$ cd test
$ echo '{}' > package.json
$ npm i jalangi2
$ npm i minimist
$ node node_modules/jalangi2/src/js/commands/instrument.js --outputDir out node_modules/minimist
^C # it hangs..
$ rm -rf node_modules/minimist/test
$ node node_modules/jalangi2/src/js/commands/instrument.js --outputDir out node_modules/minimist
instrumenting out/minimist2/index_orig_.js
instrumenting out/minimist2/example/parse_orig_.js
done!

It seems that the call to ncp on line 454 of instrument.js never calls the callback.
That can be shown with the following change:

ncp(inputDir, copyDir, {transform: transform}, function(){console.log('ncp done')});

Bug: Boxed primitives are unboxed as eval-results.

Source:

var x = new Boolean(true)
var y = eval(new Boolean(true))
console.log(x);
console.log(y);

Uninstrumented/instrumented difference:

$ node test.js
[Boolean: true]
[Boolean: true]
$ node src/js/commands/jalangi.js test.js 
[Boolean: true]
true

[Feature Request]: runInstrumentedFunctionBody should be provided with the iid of the function

The runInstrumentedFunctionBody is invoked by J$.S. It would be nice if J$.S would also have access to the function iid or the function object, or possibly both.

endExpression() should pass the IID of the expression

Exception check is wrong for `throw undefined`

The two checks in /src/js/runtime/analysis.js that check if an exception has been thrown are wrong.

The checks are implemented as

if (exceptionVal !== undefined) {...}

But this is wrong in the presence of throw undefined.

Example demonstrating the problem:

Consider the empty analysis (../empty-analysis.js):

(function (sandbox) {
    function MyAnalysis () {
    }
    sandbox.analysis = new MyAnalysis();
})(J$);

Consider the program:
../throw-undefined.js

throw undefined;

Consider the shell session, which shows no errors:

$ node src/js/commands/esnstrument_cli.js ../throw-undefined.js && node src/js/commands/direct.js --analysis ../empty-analysis.js ../throw-undefined_jalangi_.js
$

But it should show one, because running ../throw-undefined.js without Jalangi produces an error:

$ node ../throw-undefined.js

../throw-undefined.js:1
(function (exports, require, module, __filename, __dirname) { throw undefined;
                                                                    ^
undefined
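
A minimal sketch of one way to avoid this problem: record that an exception occurred in a separate boolean flag instead of comparing the caught value with undefined. The runAndRecord helper is hypothetical and is not the code in analysis.js.

```javascript
// Hypothetical helper: runs body and reports whether it threw, using a
// dedicated flag so that `throw undefined` is still detected.
function runAndRecord(body) {
    var result = { thrown: false, exceptionVal: undefined, returnVal: undefined };
    try {
        result.returnVal = body();
    } catch (e) {
        result.thrown = true;     // set even when e is undefined
        result.exceptionVal = e;
    }
    return result;
}
```

A caller would then test result.thrown rather than result.exceptionVal !== undefined before rethrowing.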

Add parameter to endExecution() to indicate a sudden exit

If a node program calls process.exit(), the underlying process exits immediately, without cleaning up the call stack, etc. In direct.js, we add an exit listener that at least ensures that endExecution() is invoked even if process.exit is called. But, this does not take care of all issues for the analysis client, as, e.g., it may be written expecting all functionEnter callbacks to have a corresponding functionExit callback. I can't think of anything we can really do here. We could monkey-patch process.exit to just throw some exception, but that can be caught by the application code. Just logging an issue in case there's an idea for how to hide the process.exit ugliness from analyses.

Confusing handling of global variables

The isGlobal and isScriptLocal parameters to the read and write callbacks have confusing and, in my opinion, inconsistent values when handling global variables.

Consider a simple analysis that just prints out the name of the variable and both flags on read and write (code below). Run it on the following piece of code:

var x;
x=1;
x;
z = 1;
z;
console=console;
console;

What I would expect: The isGlobal and isScriptLocal flags should be the same in reads and writes to the same variable. Moreover, z would either have isGlobal set and isScriptLocal unset, or vice versa, and console would have isGlobal set and isScriptLocal unset.

What I get:

Writing to x (script-local)
Reading from x (script-local)
Writing to z (local)
Reading from z (global and script-local?!)
Reading from console (global and script-local?!)
Writing to console (local)
Reading from console (global and script-local?!)

PS: Here is the analysis used:

// JALANGI DO NOT INSTRUMENT
(function (sandbox) {
    function MyAnalysis () {
        function writeFlags(isGlobal, isScriptLocal) {
            if (isGlobal && isScriptLocal) { return "global and script-local?!" }
            else if (isGlobal) { return "global" }
            else if (isScriptLocal) { return "script-local" }
            else { return "local" }
        }

        this.read = function(iid, name, val, isGlobal, isScriptLocal){
            console.log("Reading from " + name + " (" + writeFlags(isGlobal, isScriptLocal) + ")")
        };

        this.write = function(iid, name, val, lhs, isGlobal, isScriptLocal) {
            console.log("Writing to " + name + " (" + writeFlags(isGlobal, isScriptLocal) + ")")
        };
    }
    sandbox.analysis = new MyAnalysis();
})(J$);

Disabling INSTR_TRY_CATCH_ARGUMENTS throws IllegalStateException

Steps to reproduce:

Clean Jalangi2 install.
Edit Config.js to enable INSTR_TRY_CATCH_ARGUMENTS and always return false.
Run node jalangi2/src/js/commands/instrument.js --outputDir foo jalangi2/tests/octane/box2d.js

/Users/m.madsen/tmp/jalangi2/src/js/commands/instrument.js:322
throw e;
^
Error: IllegalStateException
at getFnIdFromAst (/Users/m.madsen/tmp/jalangi2/src/js/instrument/esnstrument.js:531:19)
at wrapLiteral (/Users/m.madsen/tmp/jalangi2/src/js/instrument/esnstrument.js:554:37)
at Object.visitorRRPost.FunctionExpression (/Users/m.madsen/tmp/jalangi2/src/js/instrument/esnstrument.js:1386:24)
at transformAst (/Users/m.madsen/tmp/jalangi2/src/js/instrument/astUtil.js:148:36)
at transformAst (/Users/m.madsen/tmp/jalangi2/src/js/instrument/astUtil.js:141:39)
at transformAst (/Users/m.madsen/tmp/jalangi2/src/js/instrument/astUtil.js:141:39)
at transformAst (/Users/m.madsen/tmp/jalangi2/src/js/instrument/astUtil.js:141:39)
at transformAst (/Users/m.madsen/tmp/jalangi2/src/js/instrument/astUtil.js:141:39)
at transformAst (/Users/m.madsen/tmp/jalangi2/src/js/instrument/astUtil.js:141:39)
at transformAst (/Users/m.madsen/tmp/jalangi2/src/js/instrument/astUtil.js:141:39)

Bug: wrong base-object for with-statement function calls

Function calls done inside with-statements can receive the wrong base object.

It seems that such calls are always treated as plain function calls, whereas they should be treated as method calls when the callee is found on the "with-object".

Source:

function f() {console.log(this + '');}
var x = {m: function() {console.log(this + '');}}

with (x) {
  f();
  m();
}

Output uninstrumented/instrumented:

$ node test.js
[object global]
[object Object]
$ node src/js/commands/jalangi.js test.js 
[object global]
[object global]

Acorn parsing of node shell scripts

The acorn.parse usage in esnstrument.js does not support node shell scripts with #!-magic:

#!/usr/bin/env node
...

Trying to parse the file gives an error:

SyntaxError: Unexpected character '#' (1:0)
    at raise (.../jalangi2/node_modules/acorn/acorn.js:319:15)

Either allowHashBang: true should be set as an option to acorn.parse, or the first line needs to be changed into a comment manually with something like code = '//' + code;.

NB: Doing this blindly all the time is dangerous as it might suppress real syntax errors.
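
A sketch of the second option that rewrites the first line only when it actually starts with #!, which addresses the concern about suppressing real syntax errors. The commentOutHashBang name is hypothetical. Replacing the two characters #! with // keeps the code length and all line and column positions unchanged.

```javascript
// Hypothetical helper: turn a leading "#!" line into a "//" comment.
// Code that does not start with "#!" is returned untouched.
function commentOutHashBang(code) {
    if (code.slice(0, 2) === '#!') {
        return '//' + code.slice(2);
    }
    return code;
}
```

The result could then be passed to acorn.parse in place of the raw file contents.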

Wrong floating point semantics for postfix operations.

The instrumentation of numeric postfix operations does not preserve floating point semantics.

Example

Source:

var a = 0.15;
console.log(a);
console.log(a++);
console.log(a);

Uninstrumented & Instrumented runs:

$ node test.js
0.15
0.15
1.15
$ node src/js/commands/jalangi.js test.js
0.15
0.1499999999999999
1.15

Explanation

The value of a postfix expression should be the operand's initial value, but the adjustIncDec function in esnstrument.js instead recovers it by subtracting (or adding) 1 from the already-modified value. Floating-point addition and subtraction are not exact inverses, so the recovered value can differ from the original, as seen above.

Instrumented:

...
J$.X1(65, J$.B(26, '-', a = J$.W(49, 'a', J$.B(18, '+', J$.U(10, '+', J$.R(41, 'a', a, 0)), J$.T(33, 1, 22, false), 0), a, 0), J$.T(57, 1, 22, false), 0));
...
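
The effect can be reproduced without Jalangi. The sketch below first mimics the add-then-subtract pattern, then shows one possible alternative that keeps the old value instead of recomputing it. The function postfixIncrement is a hypothetical illustration, not the fix adopted by the project.

```javascript
// The pattern used by the instrumentation: compute a + 1, then subtract 1
// again to obtain the value of the postfix expression.
var a = 0.15;
var updated = a + 1;
var recovered = updated - 1;
var roundTripIsExact = recovered === a; // false for 0.15

// Hypothetical alternative: remember the (numeric) old value and return it
// directly, so no second floating-point operation is needed.
function postfixIncrement(value) {
    var old = +value;
    return { result: old, updated: old + 1 };
}

var step = postfixIncrement(0.15);
```

Here step.result is exactly 0.15, matching the uninstrumented run.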

rewrite instrument.js to use cpr

We should rewrite instrument.js to use cpr instead of ncp. We are already stuck on an old ncp version due to API changes, and that old version also prevents instrument.js from working on io.js (due to AvianFlu/ncp#79). Additionally, I'd like to add an option so that rather than writing the _orig_.js files, we simply point back to the original files, or put the original files in some parallel directory structure (but this will take some thought).

[Feature Request]: Disable Instrumentation On-Demand

In Jalangi it is currently possible to turn instrumentation on/off with ENABLE_SAMPLING and runInstrumentedFunctionBody.

What I would like to ask for is a mechanism, for example as part of preFunInvoke, where you can opt to enter the original un-instrumented function and thus escape entirely to a world where nothing is instrumented. You would of course never be able to go back.

A potential issue is function objects stored in the heap, and how they would be replaced. Here it might still be sufficient to rely on the ENABLE_SAMPLING trick.

If this is beyond the scope of Jalangi, some pointers on how to achieve this would be welcome.

instrumentation of html files containing onclick="return false;" throws SyntaxError exception

Running src/js/commands/instrument.js on an HTML file containing the following tag throws a SyntaxError exception:
<a href="//slashdot.org/my/login" onclick="show_login_box(); return false;">
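
One plausible cause, which the report does not confirm, is that the handler text is parsed as a top-level script, where return is illegal, rather than as a function body. The sketch below only demonstrates that difference using Node's built-in vm module and the Function constructor; it does not use Jalangi's own parser, and the helper name parses is made up.

```javascript
var vm = require('vm');

// Hypothetical helper: report whether compiling succeeds, treating only
// SyntaxError as "does not parse". Nothing is executed, only compiled.
function parses(compile) {
    try {
        compile();
        return true;
    } catch (e) {
        if (e && e.name === 'SyntaxError') {
            return false;
        }
        throw e;
    }
}

var handler = 'show_login_box(); return false;';

// Compiled as a script: a top-level `return` is a syntax error.
var parsesAsScript = parses(function () { new vm.Script(handler); });

// Compiled as a function body: `return` is fine.
var parsesAsFunctionBody = parses(function () { new Function(handler); });
```

If this is indeed the cause, inline handlers would need to be parsed (or wrapped) as function bodies before instrumentation.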

Missing J$.T for primitive literals in typeof expressions?

It seems that typeof "foo" and similar primitive literal arguments to typeof do not get instrumented.
Is this intentional?

Source:

typeof "foo";
typeof true;
typeof 1;

+"bar";
typeof +"baz";
typeof {};

Annotated instrumentation:

...
            J$.X1(9, J$.U(10, 'typeof', 'foo')); // << no J$.T
            J$.X1(17, J$.U(18, 'typeof', true)); // << no J$.T
            J$.X1(25, J$.U(26, 'typeof', 1)); // << no J$.T

            J$.X1(41, J$.U(34, '+', J$.T(33, 'bar', 21, false))); // << J$.T for non-typeof unary
            J$.X1(57, J$.U(50, 'typeof', J$.U(42, '+', J$.T(49, 'baz', 21, false)))); // << J$.something for expression argument to typeof
            J$.X1(73, J$.U(58, 'typeof', J$.T(65, {}, 11, false))); // << J$.T for object argument to typeof
...

store more meaningful names for scripts instrumented via proxy server

Right now, the only name we get for a script instrumented via the proxy server is the hash, e.g., 008d09db116979dcfd1bead4060785af.js. We should support something better, e.g., storing the original URL in the source map information.

Missing ReferenceError on undeclared variable references?

Through its use of J$.I, Jalangi squelches the ReferenceError that would otherwise be thrown by a reference to an undeclared variable.

This breaks feature-detection with try-catch blocks as seen below:

try {
    document; // <-- supposed to throw in non-browser environments
} catch (e) {
    standalone = true;
}

Suggestion: the jalangi-instrumentation-pattern with J$.I(...) for undeclared variables should be disabled by default.
Thoughts?

Small example that showcases the discrepancy between a jalangi-runtime and a node-runtime:

test.js:

DOES_NOT_EXIST;

test_jalangi_.js:

...
J$.X1(17, J$.I(typeof DOES_NOT_EXIST === 'undefined' ? DOES_NOT_EXIST = J$.R(9, 'DOES_NOT_EXIST', undefined, true, true) : DOES_NOT_EXIST = J$.R(9, 'DOES_NOT_EXIST', DOES_NOT_EXIST, true, true)));
...

$ node src/js/commands/jalangi.js test.js
$ node test.js
...
(function (exports, require, module, __filename, __dirname) { DOES_NOT_EXIST;
                                                              ^
ReferenceError: DOES_NOT_EXIST is not defined
...
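
The masking effect of the typeof guard can be shown in isolation. This is a minimal sketch that imitates only the shape of the instrumented expression; it contains no Jalangi calls, and the names UNDECLARED_DEMO_VAR and throwsReferenceError are made up. The Function constructor is used so that the bodies run in sloppy mode, as the instrumented code in the report does.

```javascript
// Hypothetical helper: does calling fn throw a ReferenceError?
function throwsReferenceError(fn) {
    try {
        fn();
        return false;
    } catch (e) {
        return e.name === 'ReferenceError';
    }
}

// A plain read of an undeclared variable.
var plainRead = new Function('UNDECLARED_DEMO_VAR;');

// The same read behind a typeof guard that assigns when the name is missing.
// typeof never throws for undeclared names, and the sloppy-mode assignment
// silently creates a global, so no ReferenceError can occur.
var guardedRead = new Function(
    "return typeof UNDECLARED_DEMO_VAR === 'undefined' " +
    '? UNDECLARED_DEMO_VAR = undefined : UNDECLARED_DEMO_VAR;'
);

var plainThrows = throwsReferenceError(plainRead);     // true
var guardedThrows = throwsReferenceError(guardedRead); // false
```

Note that the guarded read also leaves a new global behind, so a later plain read of the same name would no longer throw either.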

Missing declaration for function expression names

Jalangi does not have a declare-call for the name of a function expression.

I think this source:

(function g(){})();

should be instrumented to something like:

J$.X1(113, J$.F(105, J$.T(97, function g() {
                jalangiLabel1:
                    while (true) {
                        try {

// NEW
                            g = J$.X1(2324, J$.N(137, 'g', J$.T(129, g, 12, false), true, false, false));
// END OF NEW

                            J$.Fe(81, arguments.callee, this, arguments);
                            arguments = J$.N(89, 'arguments', arguments, true, false, false);
                        } catch (J$e) {
                            J$.Ex(161, J$e);
                        } finally {
                            if (J$.Fr(169))
                                continue jalangiLabel1;
                            else
                                return J$.Ra();
                        }
                    }
            }, 12, false), false)());

Use Javascript based task runners?

Given that Jalangi is a JavaScript tool, it might be a good idea to migrate all the Python-based tasks (scripts folder) to JavaScript-based task runners (grunt/gulp). This would reduce the number of additional installations required.

Documentation: isGlobal and eval interact in unexpected ways

The behavior of the isGlobal flag in the read and write callbacks interacts with evals in an unexpected way. Consider the following example:

var x = "magic";
console.log(x);
eval("console.log(x)");

Then reading x the first time will invoke the read callback with isGlobal=false and isScriptLocal=true (which is entirely expected, from the documentation), while the read in the eval will invoke the callback with isGlobal=true and isScriptLocal=true.

According to the documentation, isGlobal is "True if the variable is not declared using var", but in fact, isGlobal is "True if the variable is not declared using var in the current script".

Minimal analysis to reproduce:

(function (sandbox) {
    function MyAnalysis(global) {
        this.read = function (iid, name, val, isGlobal, isScriptLocal) {
            console.log("Reading " + name + ", isGlobal=" + isGlobal);
        };

        this.scriptEnter = function (iid, instrumentedFileName, originalFileName) {
            console.log("Entering script");
        };

        this.scriptExit = function (iid, wrappedExceptionVal) {
            console.log("Exiting script");
        };
    }
    sandbox.analysis = new MyAnalysis(this);
})(J$);

This yields:

Entering script
Reading console, isGlobal=true
Reading x, isGlobal=false
magic
Entering script
Reading console, isGlobal=true
Reading x, isGlobal=true
magic
Exiting script
Reading console, isGlobal=true
Reading x, isGlobal=false
magic
Exiting script
