ts-archive / teraslice_hdfs_reader
Teraslice reader to process JSON data stored in text files in HDFS. Deprecated; use hdfs-assets instead.
License: MIT License
Supporting the 'connection' parameter should be a standard feature of all modules that read or store data.
If the source data isn't formatted correctly, you receive an error similar to the following. The error is caught, but the message doesn't make clear what caused it. We should be able to handle this more gracefully.
failed to process { worker_id: '10.0.1.8__8',
slice:
{ slice_id: '7acc3f0f-5872-4e69-af11-6e022ce0cfe1',
request:
{ path: '/test/kstaken-2017.04/kteramac.local.4',
offset: 12000000,
length: 500000,
total: 13843536 },
slicer_id: 0,
slicer_order: 25 },
error: 'SyntaxError: Unexpected token {\n at Object.parse (native)\n at /Users/kstaken/projects/opensource/teraslice_hdfs_reader/index.js:245:36\n at Array.map (native)\n at json_lines (/Users/kstaken/projects/opensource/teraslice_hdfs_reader/index.js:245:17)\n at /Users/kstaken/projects/opensource/teraslice_hdfs_reader/index.js:111:24\n at tryCatcher (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/util.js:16:23)\n at Promise._settlePromiseFromHandler (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:510:31)\n at Promise._settlePromise (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:567:18)\n at Promise._settlePromise0 (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:612:10)\n at Promise._settlePromises (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:691:18)\n at Promise._fulfill (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:636:18)\n at Promise._settlePromise (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:580:21)\n at Promise._settlePromise0 (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:612:10)\n at Promise._settlePromises (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:691:18)\n at Promise._fulfill (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:636:18)\n at Promise._settlePromise (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:580:21)\n at Promise._settlePromise0 (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:612:10)\n at Promise._settlePromises 
(/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:691:18)\n at Promise._fulfill (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:636:18)\n at Promise._resolveCallback (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:431:57)\n at Promise._settlePromiseFromHandler (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:522:17)\n at Promise._settlePromise (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:567:18)\n at Promise._settlePromise0 (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:612:10)\n at Promise._settlePromises (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:691:18)\n at Promise._fulfill (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:636:18)\n at Promise._resolveCallback (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:431:57)\n at Promise._settlePromiseFromHandler (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:522:17)\n at Promise._settlePromise (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:567:18)' } and has slice state is marked as error
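One way the parse failure logged above could be reported more clearly is to parse each line on its own and record which line failed, rather than letting a single bad line fail the whole slice with a bare SyntaxError. The following is a minimal sketch, not the reader's actual code: the function name parseJsonLines, its return shape, and the example path are assumptions made for illustration.

```javascript
'use strict';

// Hypothetical helper (not part of the module): parse newline-delimited JSON
// and collect a descriptive error for every line that fails to parse.
function parseJsonLines(text, request) {
    const records = [];
    const errors = [];
    text.split('\n').forEach((line, index) => {
        if (line.trim() === '') return; // skip blank lines
        try {
            records.push(JSON.parse(line));
        } catch (err) {
            errors.push({
                path: request.path,       // file the slice came from
                offset: request.offset,   // starting byte offset of the slice
                line: index + 1,          // 1-based line number within the slice
                message: err.message,     // parser message, varies by Node version
                sample: line.slice(0, 80) // start of the offending line
            });
        }
    });
    return { records, errors };
}

// Example: the second line holds two objects joined together, which is invalid JSON.
const request = { path: '/test/example/file.1', offset: 0, length: 500000 };
const result = parseJsonLines('{"a":1}\n{"b":2}{"c":3}\n{"d":4}\n', request);
console.log(result.records.length); // 2
console.log(result.errors[0].line); // 2
```

Returning the errors alongside the parsed records leaves the caller free to decide whether a bad line should fail the slice or only be logged with its path, offset and line number.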
Configuration
json_lines but other formats may make sense later.
Slicer
size boundary then the file should be split based on size.
path to a file, starting offset and ending offset.
Reader
If starting offset is 0 then the first line is kept.
If starting offset is NOT 0 then the first line is dropped in all cases.
teraslice-hdfs-reader
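The Slicer and Reader rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the module's implementation: the function names sliceFile and linesForSlice and the example path are hypothetical, and the request shape (path, offset, length, total) is taken from the logged slice rather than from the text, which speaks of a starting and ending offset. How the last, possibly partial, line of a slice gets completed is not described above and is left out here.

```javascript
'use strict';

// Hypothetical slicer: split one file into byte ranges of at most `size` bytes.
// `total` is the file length in bytes.
function sliceFile(path, total, size) {
    const slices = [];
    for (let offset = 0; offset < total; offset += size) {
        slices.push({
            path,
            offset,
            length: Math.min(size, total - offset), // last slice may be shorter
            total
        });
    }
    return slices;
}

// Hypothetical reader step: apply the first-line rule to the text of one slice.
function linesForSlice(text, offset) {
    const lines = text.split('\n');
    // A slice that does not start at offset 0 drops its first line in all cases.
    if (offset !== 0) lines.shift();
    return lines.filter((line) => line.length > 0);
}

// Sizes taken from the logged slice: total 13843536 bytes, 500000 bytes per slice.
const slices = sliceFile('/test/example/file.1', 13843536, 500000);
console.log(slices.length);     // 28
console.log(slices[24].offset); // 12000000
console.log(slices[27].length); // 343536

// A slice starting mid-file drops its leading partial line.
console.log(linesForSlice('":1}\n{"b":2}\n{"c":3}\n', 500000));
// [ '{"b":2}', '{"c":3}' ]
```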
It appears that something isn't quite joining correctly. I believe this may be happening at the end of the file, but I'm still digging.
failed to process { worker_id: '10.0.1.8__8',
slice:
{ slice_id: '7acc3f0f-5872-4e69-af11-6e022ce0cfe1',
request:
{ path: '/test/kstaken-2017.04/kteramac.local.4',
offset: 12000000,
length: 500000,
total: 13843536 },
slicer_id: 0,
slicer_order: 25 },
error: 'SyntaxError: Unexpected token {\n at Object.parse (native)\n at /Users/kstaken/projects/opensource/teraslice_hdfs_reader/index.js:245:36\n at Array.map (native)\n at json_lines (/Users/kstaken/projects/opensource/teraslice_hdfs_reader/index.js:245:17)\n at /Users/kstaken/projects/opensource/teraslice_hdfs_reader/index.js:111:24\n at tryCatcher (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/util.js:16:23)\n at Promise._settlePromiseFromHandler (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:510:31)\n at Promise._settlePromise (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:567:18)\n at Promise._settlePromise0 (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:612:10)\n at Promise._settlePromises (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:691:18)\n at Promise._fulfill (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:636:18)\n at Promise._settlePromise (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:580:21)\n at Promise._settlePromise0 (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:612:10)\n at Promise._settlePromises (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:691:18)\n at Promise._fulfill (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:636:18)\n at Promise._settlePromise (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:580:21)\n at Promise._settlePromise0 (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:612:10)\n at Promise._settlePromises 
(/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:691:18)\n at Promise._fulfill (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:636:18)\n at Promise._resolveCallback (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:431:57)\n at Promise._settlePromiseFromHandler (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:522:17)\n at Promise._settlePromise (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:567:18)\n at Promise._settlePromise0 (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:612:10)\n at Promise._settlePromises (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:691:18)\n at Promise._fulfill (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:636:18)\n at Promise._resolveCallback (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:431:57)\n at Promise._settlePromiseFromHandler (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:522:17)\n at Promise._settlePromise (/Users/kstaken/projects/opensource/terafoundation/node_modules/bluebird/js/release/promise.js:567:18)' } and has slice state is marked as error