samsung / jalangi2
Dynamic analysis framework for JavaScript
License: Apache License 2.0
In binaryPre and binary, there seems to be no way to distinguish between the binary plus in a.b++ and the one in a.b+=1. It may be useful to add another argument (e.g., isAuto) to the binaryPre and binary callbacks.
Example evals, Jalangi behaves badly on the second one:
$ cat test.js
console.log(eval());
console.log(eval(""));
console.log(eval(undefined));
$ node test.js
undefined
undefined
undefined
$ node src/js/commands/jalangi.js test.js
undefined
undefined
As of 34427b7,
function f() {}
Is instrumented to:
function f() {
jalangiLabel0:
...
}
J$.N(41, 'f', J$.T(33, f, 12, false), false, false, false);
Rather than:
function f() {
jalangiLabel0:
...
}
f = J$.N(41, 'f', J$.T(33, f, 12, false), true, false, false);
The missing assignment seems to be a bug, as it is no longer possible to redefine f with the return value of J$.N(...).
If I try to instrument a program with the following:
sandbox.Config.requiresInstrumentation = function (id, funId, sid, funName) {
if (funName === 'J$.T') {
return true;
}
if (funName === 'J$.S' || funName === 'J$.Fe' || funName === 'J$.Fr' ||
funName === 'J$.M' || funName === 'J$.F' ||
funName === 'J$.C1' || funName === 'J$.C2' || funName === 'J$.C') {
return Instrument.indexOf(funId) !== - 1;
}
return false;
};
where Instrument is a stable array of identifiers, I get
ReferenceError: J$cntr is not defined
Look at this fun code from here:
exports.getFileName = function getFileName (calling_file) {
var origPST = Error.prepareStackTrace
, origSTL = Error.stackTraceLimit
, dummy = {}
, fileName
Error.stackTraceLimit = 10
Error.prepareStackTrace = function (e, st) {
for (var i=0, l=st.length; i<l; i++) {
fileName = st[i].getFileName()
if (fileName !== __filename) {
if (calling_file) {
if (fileName !== calling_file) {
return
}
} else {
return
}
}
}
}
// run the 'prepareStackTrace' function above
Error.captureStackTrace(dummy)
dummy.stack
// cleanup
Error.prepareStackTrace = origPST
Error.stackTraceLimit = origSTL
return fileName
}
It's inspecting the stack trace from an exception to figure out the filename for the caller module. Since Jalangi2 doesn't preserve call stacks, this code breaks.
This is probably a fundamental issue, but posting in case there happens to be some kind of solution.
I think the handling of source locations with multiple scripts can be simplified for analyses with some changes. Right now, if I understand correctly, the current script ID is stored in J$.sid. If the analysis wants to detect when the current script ID has changed, it needs to implement its own logic in the scriptEnter, scriptExit, functionEnter, and functionExit callbacks. Furthermore, the analysis is responsible for maintaining a mapping from script ID to the corresponding file name (the originalFileName parameter to scriptEnter).
Instead, I propose that this logic all be implemented once-and-for-all in analysis.js and exposed to the analysis via a new updateCurrentScript(name) callback. updateCurrentScript would be invoked any time the current script is changing, e.g., at script enter, or at function exit if the caller was in a different script. analysis.js would maintain the mapping between script IDs and names, so the analysis would never need to see script IDs. We'd of course need some special handling for eval'd scripts. I think baking this logic into analysis.js is the right way to go to simplify writing analyses.
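The bookkeeping proposed above could look roughly like the following. This is a minimal sketch, not Jalangi2 code: makeScriptTracker and its enter/exit methods are hypothetical names, and the sketch assumes the runtime can supply the script name on every script or function entry.

```javascript
// Hypothetical helper: keeps a stack of active script names and reports
// changes through the proposed updateCurrentScript(name) callback.
function makeScriptTracker(updateCurrentScript) {
    var stack = [];      // script name for each active script/function frame
    var current = null;  // last name reported to the analysis

    function report(name) {
        if (name !== current) {
            current = name;
            updateCurrentScript(name);
        }
    }

    return {
        // Call on script enter and on function enter.
        enter: function (scriptName) {
            stack.push(scriptName);
            report(scriptName);
        },
        // Call on script exit and on function exit.
        exit: function () {
            stack.pop();
            report(stack.length > 0 ? stack[stack.length - 1] : null);
        }
    };
}

// Usage: a function defined in b.js is called from a.js.
var seen = [];
var tracker = makeScriptTracker(function (name) { seen.push(name); });
tracker.enter('a.js'); // script a.js starts
tracker.enter('a.js'); // function in a.js: no change reported
tracker.enter('b.js'); // call into b.js
tracker.exit();        // back in a.js
tracker.exit();        // still a.js: no change reported
tracker.exit();        // nothing is running any more
console.log(seen);
```

With this shape the analysis only ever sees names, and the callback fires only when the name actually changes.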
Add support for InvokeFunPre and InvokeFun and add these to analysisCallbackTemplate.
The following code in https://github.com/Samsung/jalangi2/blob/master/scripts/proxy.py#L43 misses several files for instrumentation:
Content-Type is case insensitive (i.e., it can also be content-type or CoNtEnT-tYpE):
def response(context, flow):
    flow.response.decode()
    if 'Content-Type' in flow.response.headers:
        if flow.response.headers['Content-Type'][0].find('javascript') != -1:
            flow.response.content = processFile(flow.response.content, "js")
        if flow.response.headers['Content-Type'][0].find('html') != -1:
            flow.response.content = processFile(flow.response.content, "html")
You could try this:
def content_type(headers):
    # Header names are case insensitive, so compare them in lower case.
    for key in headers.keys():
        if key.lower() == "content-type":
            return headers[key].lower()
    # No Content-Type header: return an empty string so the `in` checks below stay valid.
    return ""

def response(context, flow):
    flow.response.decode()
    ctype = content_type(flow.response.headers)
    if 'javascript' in ctype:
        flow.response.content = processFile(flow.response.content, "js")
    elif 'html' in ctype:
        flow.response.content = processFile(flow.response.content, "html")
flow.request.path.split('/')[-1] or ext flow.request.path.split('.')[-1] -- with appropriate string sanitization)
analysis.getField and analysis.putField have a parameter named isComputed; it would be nice if analysis.binary also had that, for use in delete o[p] expressions.
Disabling instrumentation of certain initializers may prevent instrumentation of certain literals.
For example:
var Box2D = {};
(function () {
function f () {
return 2+2;
}
})();
If INSTR_LITERAL and INSTR_INIT are both enabled, both function literals will be instrumented with J$.T...
However, if INSTR_INIT is disabled, the inner function literal is not instrumented.
The newest version of Jalangi enters an infinite loop on 3d-raytrace.js with instrumentation disabled for writes:
sandbox.Config.INSTR_WRITE = function (name, ast) {
return false;
};
For experiments regarding performance overhead, it would be good to have a mode in which esnstrument.js does not create any code constructs known to stop V8 from applying its JIT. As of now, the relevant constructs are:
arguments array
J$.X1 is missing inside sequence-expressions, as the non-last expressions are "end of expression" by themselves.
I suggest pushing J$.X1 onto all members of a sequence-expression.
Consider the two examples:
('a', 'b')
Is instrumented to:
J$.X1((J$.T(9, 'a', 21, false), J$.T(17, 'b', 21, false)));
But it should be:
(J$.X1(J$.T(9, 'a', 21, false)), J$.X1(J$.T(17, 'b', 21, false)));
for('a', 'b';;);
Is instrumented to:
for (J$.X1((J$.T(25, 'a', 21, false), J$.T(33, 'b', 21, false)));;);
But it should be:
for ((J$.X1(J$.T(25, 'a', 21, false)), J$.X1(J$.T(33, 'b', 21, false)));;);
(NB: AST-wise, the node.type of the sequence expression is "SequenceExpression".)
Consider the following JS script:
function foo(f,g) {}
When instrumented (with --inlineIID option), we get:
J$.iids = {"9":[1,1,1,21],"17":[1,1,1,21],"25":[1,1,1,21],"33":[1,1,1,21],"41":[1,1,2,1],"49":[1,1,1,21],"57":[1,1,2,1],"65":[1,1,1,21],"73":[1,1,1,21],"81":[1,1,2,1],"89":[1,1,2,1],"nBranches":2,"originalCodeFileName":"/tmp/params.js","instrumentedCodeFileName":"/tmp/params_jalangi_.js"};
jalangiLabel1:
while (true) {
try {
J$.Se(41, '/tmp/params_jalangi_.js', '/tmp/params.js');
function foo(f, g) {
jalangiLabel0:
while (true) {
try {
J$.Fe(9, arguments.callee, this, arguments);
arguments = J$.N(17, 'arguments', arguments, true, false, false);
f = J$.N(25, 'f', f, true, false, false);
g = J$.N(33, 'g', g, true, false, false);
...
Note that the source locations for the J$.N callbacks for parameters f and g are [1,1,1,21], i.e., the entirety of function foo. Instead, it would be nice if these callbacks got source locations that only included the corresponding formal parameter (e.g., [1,12,1,13] for parameter f, IID 25).
I believe that getters and setters are not getting the correct internalIids.
Hello, I am trying to check the parameters that are passed to post functions.
To do that, I wrote an analysis that checks whether a function call is a call to post, but it fails with 'ReferenceError: post is not defined'. Below is the snippet of the analysis; the error is caused by 'var POST_FUNCTION = post'. I found this way of referring to a function by name in the Jalangi online demo. I understand that the demo is not supported anymore; however, is there another way to check a function's name in either invokeFunPre or invokeFun by comparing the parameter f? If not, is there a function provided by Jalangi that can be used?
ANALYSIS:
var POST_FUNCTION = post;
this.invokeFunPre = function (iid, f, base, args, isConstructor) {
console.log('function call intercepted before invoking');
if (f === POST_FUNCTION && args) {
console.log('function is POST');
// pass in config, always third parameter in http.post
checkParams(args[2]);
}
};
I am also including the code that I am testing below.
CODE:
function post() {
}
var sampleRecords = function(params) {
var config = {
params: params,
httpErrorHandlers: {
'4xx': function(error) {
deferred.reject(error);
}
}
}
post('/machinelearning/service/datasource/sample-records', null, config);
}
var params = {
dbPassword : "password_test",
dbUsername : "username_test"
};
sampleRecords(request);
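One way to recognize the call without referring to the post variable from inside the analysis is to compare the name of the function object passed as f. The sketch below assumes the invokeFunPre signature used in the snippet above. The sandbox object stands in for J$ so that the example runs on its own, and checkParams is a placeholder for the poster's own check. Note that f.name is empty for anonymous functions and may be shared by unrelated functions, so this is a heuristic.

```javascript
// Stand-in for J$ so the sketch runs without Jalangi.
var sandbox = {};
var checked = [];

// Placeholder for the poster's own parameter check.
function checkParams(config) {
    checked.push(config);
}

(function (sandbox) {
    function MyAnalysis() {
        this.invokeFunPre = function (iid, f, base, args, isConstructor) {
            // Compare by name instead of by reference to a variable that
            // does not exist in the analysis file.
            if (typeof f === 'function' && f.name === 'post' && args) {
                checkParams(args[2]); // config is the third argument
            }
        };
    }
    sandbox.analysis = new MyAnalysis();
})(sandbox);

// Simulate the callbacks that would fire for two calls.
function post() {}
var config = { params: { dbUsername: 'username_test' } };
sandbox.analysis.invokeFunPre(1, post, undefined, ['/some/url', null, config], false);
sandbox.analysis.invokeFunPre(2, function other() {}, undefined, [], false);
console.log(checked.length);
```

Only the first simulated call is passed on to checkParams.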
When the lodash library is run instrumented, its semantics seem to change a little:
The internal test suite of the library fails when running with jalangi.js:
Setup:
$ git clone [email protected]:lodash/lodash.git
$ cd lodash
$ npm i
...
$ cd ..
Run lodash test suite uninstrumented:
$ node lodash/test/test.js
test.js invoked with arguments: ["node","/home/esbena/tmp/lodash/test/test.js"]
----------------------------------------
PASS: 4574 FAIL: 0 TOTAL: 4574
Finished in 8994 milliseconds.
----------------------------------------
Run lodash test suite instrumented:
$ node jalangi2-official/src/js/commands/jalangi.js lodash/test/test.js
test.js invoked with arguments: ["node","/home/esbena/tmp/lodash/test/test.js"]
----------------------------------------
keys methods
----------------------------------------
FAIL - `_.keys` skips non-enumerable properties (test in IE < 9)
FAIL | EQ | skips non-enumerable properties on `RegExp.prototype` | Expected: true, Actual: false
FAIL | EQ | skips non-enumerable properties inherited from `RegExp.prototype` | Expected: true, Actual: false
FAIL - `_.keysIn` skips non-enumerable properties (test in IE < 9)
FAIL | EQ | skips non-enumerable properties on `RegExp.prototype` | Expected: true, Actual: false
FAIL | EQ | skips non-enumerable properties inherited from `RegExp.prototype` | Expected: true, Actual: false
FAIL | EQ | skips non-enumerable properties on `RegExp.prototype` | Expected: true, Actual: false
FAIL | EQ | skips non-enumerable properties inherited from `RegExp.prototype` | Expected: true, Actual: false
----------------------------------------
lodash.partialRight
----------------------------------------
FAIL - should work as a deep `_.defaults`
FAIL | OK | Died on test #1 at invokeFun (/home/esbena/tmp/jalangi2-official/src/js/runtime/analysis.js:211:22): deep is not defined
----------------------------------------
PASS: 4569 FAIL: 5 TOTAL: 4574
Finished in 69820 milliseconds.
----------------------------------------
See above
The source:
for(var p in {}){}
Is instrumented to:
J$.N(41, 'p', p, false, false, false);
for (J$._tm_p in J$.H(17, J$.T(9, {}, 11, false))) {
var p = J$.W(25, 'p', J$._tm_p, p, false, true, true);
{
{
}
}
}
I think that:
var p = J$.W(25, 'p', J$._tm_p, p, false, true, true);
Should be wrapped in X1:
var p = X1(xxx, J$.W(25, 'p', J$._tm_p, p, false, true, true));
In the same way as the source:
var x = 42;
Is instrumented to:
var x = J$.X1(25, J$.W(17, 'x', J$.T(9, 42, 22, false), x, false, true, true));
A browserified browserify behaves differently when run with Jalangi:
Original behaviour:
TypeError: Object #<Object> has no method 'readFileSync'
Jalangi behaviour:
TypeError: Function.prototype.apply was called on undefined, which is a undefined and not a function
To reproduce:
In jalangi2 directory:
$ npm install -g browserify
$ mkdir bug
$ cd bug
$ mkdir node_modules
$ npm install browserify
$ echo 'require("browserify");' >> main.js
$ node main.js
$ browserify main.js -o bundle.js
$ node bundle.js
/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:845
var defaultPrelude = fs.readFileSync(defaultPreludePath, 'utf8');
^
TypeError: Object #<Object> has no method 'readFileSync'
at Object.<anonymous> (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:845:25)
at Object.JSONStream (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:955:4)
at s (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:1:316)
at /Users/e.andreasen/workspace/jalangi2/bug/bundle.js:1:367
at Object.<anonymous> (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:8:13)
at Object../lib/builtins.js (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:791:4)
at s (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:1:316)
at /Users/e.andreasen/workspace/jalangi2/bug/bundle.js:1:367
at Object.browserify (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:2:1)
at s (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:1:316)
$ cd ..
$ node src/js/commands/jalangi.js --inlineIID --inlineSource --analysis src/js/sample_analyses/ChainedAnalyses.js --analysis src/js/sample_analyses/dlint/Utils.js --analysis src/js/sample_analyses/dlint/CheckNaN.js bug/bundle.js
/Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:440
throw tmp;
^
TypeError: Function.prototype.apply was called on undefined, which is a undefined and not a function
at callFun (/Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:145:51)
at invokeFun (/Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:166:22)
at /Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:189:41
at Object.<anonymous> (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:2278:195)
at callFun (/Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:145:51)
at invokeFun (/Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:166:22)
at /Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:189:41
at Object.J$.X1.J$.F.J$.T.J$.T.J$.T.JSONStream (/Users/e.andreasen/workspace/jalangi2/bug/bundle.js:2516:73)
at callFun (/Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:145:51)
at invokeFun (/Users/e.andreasen/workspace/jalangi2/src/js/runtime/analysis.js:166:22)
Add support for a flag in Config to disable instrumentation of initializers, e.g. to avoid generation of instrumented code such as:
J$.N(161, 'PI2nx', PI2nx, false, false, false);
For the expression a.b(c), the node.js engine will evaluate the sub-expressions in the following order:
a
c
a.b
However, in Chrome and Firefox, those sub-expressions are evaluated in a different order:
a
a.b
c
Problem: for the expression a.b(c), the instrumented code always emits the getField a.b event after the read c event.
Consider the following example code:
var a = {
get b() {
console.log('getting b');
return function (){};
}
};
var d = {
get e() {
console.log('getting e');
return function (){};
}
};
console.log('--------------');
a.b(d.e);
console.log('--------------');
(1, a.b)(d.e)
Running the example program above in Chrome and Firefox produces the following result:
--------------
getting b
getting e
--------------
getting b
getting e
Running the instrumented example code on Chrome and Firefox produces a different result:
--------------
getting e
getting b
--------------
getting b
getting e
Therefore, the instrumentation process changes the semantics of the example code on Firefox and Chrome.
It seems that require('jalangi2').analyze(...) can fail to perform an analysis by silently not loading the application of interest.
When the failure occurs, it occurs on every instrument-and-analyze attempt for an application except for the first. The situation only occurs if the instrumented files are named the same as in the previous attempt.
$ node instrument-and-analyze.js test1.js instumentationDirectory
<<instumentationDirectory/test1.js outputs>>
$ node instrument-and-analyze.js test1.js instumentationDirectory
<<instumentationDirectory/test1.js outputs>>
$ node instrument-and-analyze.js test2.js instumentationDirectory
<<instumentationDirectory/test2.js outputs>>
$ node instrument-and-analyze.js test2.js instumentationDirectory
<<instumentationDirectory/test2.js NOT LOADED>>
See https://github.com/esbena/jalangi-api-caching-bug
A problem related to nested function scope. Suppose this is the original code:
function f1() {
function f2() {
console.log('orig_fun');
}
f2();
}
f1();
With function-level sampling enabled, it is instrumented into the following (the sampling of f2() is omitted for simplicity):
function f1() {
if(true) { // suppose J$.S() always returns true here
function f2() {
console.log('instru_fun');
}
f2();
} else {
function f2() { // this f2 takes over the scope
console.log('orig_fun');
}
f2();
}
}
f1();
The entire instrumented code will always print: orig_fun
As a result, the instrumented version of f2 is never called, no matter what J$.S() returns.
The newest version of Jalangi crashes on 3d-raytrace.js with
sandbox.Config.ENABLE_SAMPLING = true;
and exception:
/Users/m.madsen/Arov/lib/jalangi2/src/js/instrument/esnstrument.js:1177
return JSON.parse(JSON.stringify(src), JSONParseHandler);
^
TypeError: Converting circular structure to JSON
at Object.stringify (native)
at clone (/Users/m.madsen/Arov/lib/jalangi2/src/js/instrument/esnstrument.js:1177:32)
at Object.visitorCloneBodyPre.FunctionDeclaration (/Users/m.madsen/Arov/lib/jalangi2/src/js/instrument/esnstrument.js:1251:29)
at transformAst (/Users/m.madsen/Arov/lib/jalangi2/src/js/instrument/astUtil.js:148:36)
at transformAst (/Users/m.madsen/Arov/lib/jalangi2/src/js/instrument/astUtil.js:141:39)
at Object.transformAst (/Users/m.madsen/Arov/lib/jalangi2/src/js/instrument/astUtil.js:141:39)
at transformString (/Users/m.madsen/Arov/lib/jalangi2/src/js/instrument/esnstrument.js:1759:30)
at Object.instrumentCode (/Users/m.madsen/Arov/lib/jalangi2/src/js/instrument/esnstrument.js:1819:26)
at Object.Module._extensions..js (/Users/m.madsen/Arov/lib/jalangi2/src/js/commands/jalangi.js:77:30)
at Module.load (module.js:355:32)
In BranchOnly, analysis.literal is restricted to function objects only.
However, to track rudimentary dataflow, I think we also want to be able to track object literals.
It would be helpful if instrumentation of these could be supported as well.
npm install fails if I try to upgrade argparse from 0.1.6 to 1.0.2 in package.json.
It is not clear to me whether record-replay is impossible with jalangi2 because of architectural changes or choices, or whether the record-replay engine has simply not been ported to jalangi2 (and hence would be possible if someone did the work).
The reason I ask is that I am interested in the record-replay part of Jalangi, and I want to know which of the two forks to go forward with. If it is still possible with jalangi2, I would be happy to put in the effort to port the record-replay engine.
I tried measuring the overhead of Jalangi 2 instrumentation on annex and comparing the overhead with analysis2.js from Jalangi 1. Here are the running times for the fourth computer move on Chrome 40, where we are running with no client analysis.
analysis2.js from Jalangi 1: ~1.8s
Jalangi 2: ~2.1s
This is a fairly significant slowdown. If I have time I'll try to dig further as to the root cause.
The implementation of A in src/js/runtime/analysis.js is a bit odd.
It uses the same iids for the calls to G, B and P:
function A(iid, base, offset, op, isComputed) {
var oprnd1 = G(iid, base, offset, isComputed, true, false);
return function (oprnd2) {
var val = B(iid, op, oprnd1, oprnd2, true, false);
return P(iid, base, offset, val, isComputed, true);
};
}
It should use different iids in the style of M:
function M(iid, base, offset, isConstructor, isComputed) {
return function () {
var f = G(iid + 2, base, offset, isComputed, false, true);
return (lastComputedValue = invokeFun(iid, base, f, arguments, isConstructor, true));
};
}
Possibly related: #58
instrument.js hangs in node v5
To reproduce:
$ node --version
v5.0.0
$ mkdir test
$ cd test
$ echo '{}' > package.json
$ npm i jalangi2
$ npm i minimist
$ node node_modules/jalangi2/src/js/commands/instrument.js --outputDir out node_modules/minimist
^C # it hangs..
$ rm -rf node_modules/minimist/test
$ node node_modules/jalangi2/src/js/commands/instrument.js --outputDir out node_modules/minimist
instrumenting out/minimist2/index_orig_.js
instrumenting out/minimist2/example/parse_orig_.js
done!
It seems that the call to ncp on line 454 of instrument.js never calls the callback.
That can be shown with the following change:
ncp(inputDir, copyDir, {transform: transform}, function(){console.log('ncp done')});
Source:
var x = new Boolean(true)
var y = eval(new Boolean(true))
console.log(x);
console.log(y);
Uninstrumented/instrumented difference:
$ node test.js
[Boolean: true]
[Boolean: true]
$ node src/js/commands/jalangi.js test.js
[Boolean: true]
true
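The difference follows from a rule of the language: eval returns any non-string argument unchanged, without evaluating it. A wrapper around eval can preserve that by checking the argument type before instrumenting. The following is a sketch only; evalPreservingNonStrings and the instrument parameter are hypothetical names, and it uses an indirect eval (global scope) purely to stay self-contained.

```javascript
// Hypothetical wrapper: only strings are instrumented and evaluated.
function evalPreservingNonStrings(code, instrument) {
    if (typeof code !== 'string') {
        return code; // eval(x) returns x unchanged when x is not a string
    }
    return (0, eval)(instrument(code)); // indirect eval, global scope
}

// Identity "instrumenter" used only for this demonstration.
function identity(src) { return src; }

var boxed = new Boolean(true);
console.log(evalPreservingNonStrings(boxed, identity) === boxed); // true
console.log(evalPreservingNonStrings(undefined, identity));       // undefined
console.log(evalPreservingNonStrings('1 + 2', identity));         // 3
```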
The runInstrumentedFunctionBody is invoked by J$.S. It would be nice if J$.S would also have access to the function iid or the function object, or possibly both.
The two checks in /src/js/runtime/analysis.js that check if an exception has been thrown are wrong.
The checks are implemented as
if (exceptionVal !== undefined) {...}
But this is wrong in the presence of throw undefined.
Example code to showcase wrongness:
Consider the empty analysis (../empty-analysis.js):
(function (sandbox) {
function MyAnalysis () {
}
sandbox.analysis = new MyAnalysis();
})(J$);
Consider the program:
../throw-undefined.js
throw undefined;
Consider the shell session, which shows no errors:
$ node src/js/commands/esnstrument_cli.js ../throw-undefined.js && node src/js/commands/direct.js --analysis ../empty-analysis.js ../throw-undefined_jalangi_.js
$
But it should, because running ../throw-undefined.js without Jalangi produces an error:
$ node ../throw-undefined.js
../throw-undefined.js:1
(function (exports, require, module, __filename, __dirname) { throw undefined;
^
undefined
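A common way to avoid this kind of bug is to record the fact that an exception occurred separately from the thrown value. The sketch below is not the analysis.js code; runGuarded and its fields are hypothetical and only illustrate the pattern.

```javascript
// Hypothetical illustration: track "an exception happened" with a flag,
// because the thrown value itself may legitimately be undefined.
function runGuarded(body) {
    var hasException = false;
    var exceptionVal;
    try {
        body();
    } catch (e) {
        hasException = true;
        exceptionVal = e;
    }
    return {
        naiveCheck: exceptionVal !== undefined, // misses `throw undefined`
        flagCheck: hasException                 // independent of the thrown value
    };
}

var result = runGuarded(function () { throw undefined; });
console.log(result.naiveCheck); // false
console.log(result.flagCheck);  // true
```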
If a node program calls process.exit(), the underlying process exits immediately, without cleaning up the call stack, etc. In direct.js, we add an exit listener that at least ensures that endExecution() is invoked even if process.exit is called. But this does not take care of all issues for the analysis client, as, e.g., it may be written expecting all functionEnter callbacks to have a corresponding functionExit callback. I can't think of anything we can really do here. We could monkey-patch process.exit to just throw some exception, but that can be caught by the application code. Just logging an issue in case there's an idea for how to hide the process.exit ugliness from analyses.
The isGlobal and isScriptLocal parameters to the read and write callbacks have confusing and, in my opinion, inconsistent values when handling global variables.
Consider a simple analysis that just prints out the name of the variable and both flags on read and write (code below). Run it on the following piece of code:
var x;
x=1;
x;
z = 1;
z;
console=console;
console;
What I would expect: The isGlobal and isScriptLocal variables should be the same in reads and writes to the same variable. Moreover, z would either have isGlobal set and isScriptLocal unset, or vice versa, and console would have isGlobal set and isScriptLocal unset.
What I get:
Writing to x (script-local)
Reading from x (script-local)
Writing to z (local)
Reading from z (global and script-local?!)
Reading from console (global and script-local?!)
Writing to console (local)
Reading from console (global and script-local?!)
PS: Here is the analysis used:
// JALANGI DO NOT INSTRUMENT
(function (sandbox) {
function MyAnalysis () {
function writeFlags(isGlobal, isScriptLocal) {
if (isGlobal && isScriptLocal) { return "global and script-local?!" }
else if (isGlobal) { return "global" }
else if (isScriptLocal) { return "script-local" }
else { return "local" }
}
this.read = function(iid, name, val, isGlobal, isScriptLocal){
console.log("Reading from " + name + " (" + writeFlags(isGlobal, isScriptLocal) + ")")
};
this.write = function(iid, name, val, lhs, isGlobal, isScriptLocal) {
console.log("Writing to " + name + " (" + writeFlags(isGlobal, isScriptLocal) + ")")
};
}
sandbox.analysis = new MyAnalysis();
})(J$);
Steps to reproduce:
/Users/m.madsen/tmp/jalangi2/src/js/commands/instrument.js:322
throw e;
^
Error: IllegalStateException
at getFnIdFromAst (/Users/m.madsen/tmp/jalangi2/src/js/instrument/esnstrument.js:531:19)
at wrapLiteral (/Users/m.madsen/tmp/jalangi2/src/js/instrument/esnstrument.js:554:37)
at Object.visitorRRPost.FunctionExpression (/Users/m.madsen/tmp/jalangi2/src/js/instrument/esnstrument.js:1386:24)
at transformAst (/Users/m.madsen/tmp/jalangi2/src/js/instrument/astUtil.js:148:36)
at transformAst (/Users/m.madsen/tmp/jalangi2/src/js/instrument/astUtil.js:141:39)
at transformAst (/Users/m.madsen/tmp/jalangi2/src/js/instrument/astUtil.js:141:39)
at transformAst (/Users/m.madsen/tmp/jalangi2/src/js/instrument/astUtil.js:141:39)
at transformAst (/Users/m.madsen/tmp/jalangi2/src/js/instrument/astUtil.js:141:39)
at transformAst (/Users/m.madsen/tmp/jalangi2/src/js/instrument/astUtil.js:141:39)
at transformAst (/Users/m.madsen/tmp/jalangi2/src/js/instrument/astUtil.js:141:39)
Function calls done inside with-statements can receive the wrong base object.
It seems that function calls always are treated as function calls while they should be treated as method calls when the callee is found on the "with-object".
Source:
function f() {console.log(this + '');}
var x = {m: function() {console.log(this + '');}}
with (x) {
f();
m();
}
Output uninstrumented/instrumented:
$ node test.js
[object global]
[object Object]
$ node src/js/commands/jalangi.js test.js
[object global]
[object global]
The acorn.parse usage in esnstrument.js does not support node shell scripts with #!-magic:
#!/usr/bin/env node
...
Trying to parse the file gives an error:
SyntaxError: Unexpected character '#' (1:0)
at raise (.../jalangi2/node_modules/acorn/acorn.js:319:15)
Either allowHashBang: true should be set as an option to acorn.parse, or the first line needs to be changed into a comment manually with something like: code = '//' + code;
NB: Doing this blindly all the time is dangerous as it might suppress real syntax errors.
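A guarded variant of the manual workaround only touches the source when it really starts with #!. This is a sketch with a hypothetical helper name (commentOutHashBang); it leaves the number of lines unchanged, so line numbers in reported source locations stay valid.

```javascript
// Hypothetical helper: turn a leading "#!" line into a "//" comment.
// Sources that do not start with "#!" are returned untouched, so a stray
// '#' elsewhere still produces a syntax error in the parser.
function commentOutHashBang(code) {
    if (code.slice(0, 2) === '#!') {
        return '//' + code;
    }
    return code;
}

var script = '#!/usr/bin/env node\nconsole.log("hi");\n';
console.log(commentOutHashBang(script).split('\n')[0]); // //#!/usr/bin/env node
console.log(commentOutHashBang('var x = 1;'));          // var x = 1;
```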
The instrumentation of numeric postfix operations does not preserve floating point semantics.
Source:
var a = 0.15;
console.log(a);
console.log(a++);
console.log(a);
Uninstrumented & Instrumented runs:
$ node test.js
0.15
0.15
1.15
$ node src/js/commands/jalangi.js test.js
0.15
0.1499999999999999
1.15
Ideally, the value of a postfix expression is the initial value, but the adjustIncDec function in esnstrument.js subtracts/adds 1 to the modified value instead. This has an unfortunate effect on floating point numbers, as seen above.
Instrumented:
...
J$.X1(65, J$.B(26, '-', a = J$.W(49, 'a', J$.B(18, '+', J$.U(10, '+', J$.R(41, 'a', a, 0)), J$.T(33, 1, 22, false), 0), a, 0), J$.T(57, 1, 22, false), 0));
...
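The rounding problem and one possible fix can be reproduced without Jalangi. The sketch below is an assumption about how the instrumentation could be reshaped, not the current adjustIncDec code: postfixViaAdjust mimics computing the new value and then subtracting 1, while postfixViaSavedOld keeps the converted old value and returns it directly. Both helper names are hypothetical.

```javascript
// Mimics the current scheme: result = (old + 1) - 1.
function postfixViaAdjust(box) {
    box.value = +box.value + 1;
    return box.value - 1;
}

// Alternative: remember the numeric old value and return it unchanged.
function postfixViaSavedOld(box) {
    var old = +box.value; // ToNumber, as a++ does
    box.value = old + 1;
    return old;
}

var a = { value: 0.15 };
var b = { value: 0.15 };
console.log(postfixViaAdjust(a));   // 0.1499999999999999
console.log(postfixViaSavedOld(b)); // 0.15
console.log(a.value === b.value);   // true: the stored value is the same
```

Only the value of the postfix expression differs between the two schemes; the variable itself ends up with the same number.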
We should rewrite instrument.js to use cpr instead of ncp. We are already stuck on some old ncp version due to API changes, and it also is preventing instrument.js from working on io.js (due to AvianFlu/ncp#79). Additionally, I'd like to add an option so that rather than writing the _orig_.js files, we simply point back to the original files, or put the original files in some parallel directory structure (but this will take some thought).
In Jalangi it is currently possible to turn instrumentation on/off with ENABLE_SAMPLING and runInstrumentedFunctionBody.
What I would like to ask for is a mechanism, for example as part of preFunInvoke, where you can opt to enter the original un-instrumented function and thus escape entirely to a world where nothing is instrumented. You would of course never be able to go back.
A potential issue is function objects stored in the heap, and how they would be replaced. Here it might still be sufficient to rely on the ENABLE_SAMPLING trick.
If this is beyond the scope of Jalangi, some pointers on how to achieve this would be welcome.
Running src/js/commands/instrument.js on an HTML file containing the following tag
< a href="//slashdot.org/my/login" onclick="show_login_box(); return false;">
throws a SyntaxError exception.
It seems that typeof "foo" and similar primitive literal arguments to typeof do not get instrumented.
Is this intentional?
Source:
typeof "foo";
typeof true;
typeof 1;
+"bar";
typeof +"baz";
typeof {};
Annotated instrumentation:
...
J$.X1(9, J$.U(10, 'typeof', 'foo')); // << no J$.T
J$.X1(17, J$.U(18, 'typeof', true)); // << no J$.T
J$.X1(25, J$.U(26, 'typeof', 1)); // << no J$.T
J$.X1(41, J$.U(34, '+', J$.T(33, 'bar', 21, false))); // << J$.T for non-typeof unary
J$.X1(57, J$.U(50, 'typeof', J$.U(42, '+', J$.T(49, 'baz', 21, false)))); // << J$.something for expression argument to typeof
J$.X1(73, J$.U(58, 'typeof', J$.T(65, {}, 11, false))); // << J$.T for object argument to typeof
...
Right now, the only name we get for a script instrumented via the proxy server is the hash, e.g., 008d09db116979dcfd1bead4060785af.js. We should support something better, e.g., storing the original URL in the source map information.
Jalangi squelches the ReferenceError that would occur on a reference to an undeclared variable through its use of J$.I.
This breaks feature detection with try-catch blocks, as seen below:
try {
document; // <-- supposed to throw in non-browser environments
} catch (e) {
standalone = true;
}
Suggestion: the Jalangi instrumentation pattern with J$.I(...) for undeclared variables should be disabled by default.
Thoughts?
test.js:
DOES_NOT_EXIST;
test_jalangi_.js
...
J$.X1(17, J$.I(typeof DOES_NOT_EXIST === 'undefined' ? DOES_NOT_EXIST = J$.R(9, 'DOES_NOT_EXIST', undefined, true, true) : DOES_NOT_EXIST = J$.R(9, 'DOES_NOT_EXIST', DOES_NOT_EXIST, true, true)));
...
$ node src/js/commands/jalangi.js test.js
$ node test.js
...
(function (exports, require, module, __filename, __dirname) { DOES_NOT_EXIST;
^
ReferenceError: DOES_NOT_EXIST is not defined
...
Jalangi does not have a declare-call for the name of a function expression.
I think this source:
(function g(){})();
Should be instrumented to something like:
J$.X1(113, J$.F(105, J$.T(97, function g() {
jalangiLabel1:
while (true) {
try {
// NEW
g = J$.X1(2324, J$.N(137, 'g', J$.T(129, g, 12, false), true, false, false));
// END OF NEW
J$.Fe(81, arguments.callee, this, arguments);
arguments = J$.N(89, 'arguments', arguments, true, false, false);
} catch (J$e) {
J$.Ex(161, J$e);
} finally {
if (J$.Fr(169))
continue jalangiLabel1;
else
return J$.Ra();
}
}
}, 12, false), false)());
Given that Jalangi is a JavaScript tool, it might be a good idea to migrate all the Python-based tasks (scripts folder) to JavaScript-based task runners (grunt / gulp). This would reduce the dependency on additional installations.
The behavior of the isGlobal flag in the read and write callbacks interacts with evals in an unexpected way. Consider the following example:
var x = "magic";
console.log(x);
eval("console.log(x)");
Then reading x the first time will invoke the read callback with isGlobal=false and isScriptLocal=true (which is entirely expected, from the documentation), while the read in the eval will invoke the callback with isGlobal=true and isScriptLocal=true.
According to the documentation, isGlobal is "True if the variable is not declared using var", but in fact, isGlobal is "True if the variable is not declared using var in the current script".
Minimal analysis to reproduce:
(function (sandbox) {
function MyAnalysis (global) {
this.read = function(iid, name, val, isGlobal, isScriptLocal){
console.log("Reading " + name + ", isGlobal=" + isGlobal)
};
this.scriptEnter = function(iid, instrumentedFileName, originalFileName){
console.log("Entering script")
};
this.scriptExit = function(iid, wrappedExceptionVal){
console.log("Exiting script")
};
}
sandbox.analysis = new MyAnalysis(this);
})(J$);
This yields:
Entering script
Reading console, isGlobal=true
Reading x, isGlobal=false
magic
Entering script
Reading console, isGlobal=true
Reading x, isGlobal=true
magic
Exiting script
Reading console, isGlobal=true
Reading x, isGlobal=false
magic
Exiting script