
canvas-data-cli's Introduction

Canvas Data CLI

A small CLI tool for syncing data from the Canvas Data API.

NOTE: this is currently in beta, please report any bugs or issues you find!

Installing

Prerequisites

This tool should work on Linux, OS X, and Windows. It runs on the Node.js runtime, which you will need to install before you can use it.

  1. Install Node.js - any version newer than 0.12.0 should work; your best bet is to follow the instructions here

Install via npm

npm install -g canvas-data-cli

OR Install from github

git clone https://github.com/instructure/canvas-data-cli.git && cd canvas-data-cli && make installLocal

Configuring

The Canvas Data CLI requires a configuration file with a few fields set. It uses a small JavaScript file as its configuration. To generate a stub of this configuration, run canvasDataCli sampleConfig, which will create a config.js.sample file. Rename it to something like config.js.

Edit the file to point to where you want to save the downloaded files, as well as to the file used to track which data exports you have already downloaded. By default, the sample config pulls your API key and secret from the CD_API_KEY and CD_API_SECRET environment variables, which is more secure; however, you can also hard-code the credentials in the config file.
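As an illustration only (the field names below are assumptions based on the description above, not the authoritative stub; run canvasDataCli sampleConfig for the real one), a config might look like:

```javascript
// Illustrative sketch only -- field names are assumptions, not the real stub.
module.exports = {
  saveLocation: './dataFiles',        // where downloaded dumps are stored
  unpackLocation: './unpackedFiles',  // where `unpack` writes its output
  apiUrl: 'https://api.inshosteddata.com/api',
  key: process.env.CD_API_KEY,        // pulled from the environment (more secure)
  secret: process.env.CD_API_SECRET   // or hard-code credentials, at your own risk
};
```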

Configuring an HTTP Proxy

canvas-data-cli supports HTTP proxies, with or without basic authentication. To use one, there are three extra options you can add to your config file: httpsProxy, proxyUsername, and proxyPassword.

Config Option    Value
httpsProxy       the host:port of the HTTPS proxy, e.g. https_proxy_stuff.com:433
proxyUsername    the basic auth username for the HTTPS proxy
proxyPassword    the basic auth password for the HTTPS proxy
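As a sketch (the option names come from the table above; the host and credentials are placeholders), the proxy options would sit alongside the rest of your config:

```javascript
// Sketch: proxy options in config.js. Host and credentials are placeholders.
module.exports = {
  // ...your existing options...
  httpsProxy: 'https_proxy_stuff.com:433', // host:port of the HTTPS proxy
  proxyUsername: 'proxyuser',              // omit both if the proxy needs no auth
  proxyPassword: 'proxypass'
};
```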

Usage

Syncing

If you simply want to download all the data from Canvas Data, the sync command can be used to keep an up-to-date copy locally.

canvasDataCli sync -c path/to/config.js

This will start the sync process. The sync uses the sync API endpoint to get a list of all the files. If a file does not already exist locally, it is downloaded; otherwise it is skipped. After downloading all files, it deletes any unexpected files in the directory to remove old data.

On subsequent executions, it only downloads the files it doesn't yet have.

The process is also resumable: if you run into issues for whatever reason, restarting it should download only the files that previously failed. As a safety measure, each file is downloaded under a temporary name and renamed once the download finishes. This may leave .gz.tmp files around, but they should get deleted automatically once you have a successful run.

If you run this daily, all of your data from Canvas Data stays up to date.
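One way to run the sync daily (a sketch; the binary path, config path, and log location are placeholders for your own) is a cron entry:

```shell
# Sketch of a daily sync at 02:00 via cron; adjust paths for your system.
# Add with `crontab -e`:
0 2 * * * /usr/local/bin/canvasDataCli sync -c /path/to/config.js >> /var/log/canvas-sync.log 2>&1
```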

Fetch

Fetches the most up-to-date data for a single table from the API. This ignores any previously downloaded files and redownloads all the files associated with that table.

canvasDataCli fetch -c path/to/config.js -t user_dim

This will start the fetch process and download what is needed to get the most recent data for that table (in this case, the user_dim).

On subsequent executions, this will redownload all the data for that table, ignoring any previous day's data.

Unpack

NOTE: This only works after properly running a sync command

This command will unpack the gzipped files, concatenate any partitioned files, and add a header row to the output file.

canvasDataCli unpack -c path/to/config.js -f user_dim,account_dim

This command will unpack the user_dim and account_dim tables to a directory. Currently, you have to explicitly list the tables you want to unpack, since this has the potential to create very large files.

API

This subcommand lets users make API calls directly; its main use case is debugging and development.

canvasDataCli api -c config.js -r /account/self/dump

Historical Requests

Periodically, requests data is regrouped into collections that span more than a single day. In that case, the date the files were generated differs from the time the included requests were made. To make it easier to identify which files contain the requests made during a particular time range, there is the historical-requests subcommand.

canvasDataCli historical-requests -c config.js

Its output takes the form:

{
  "dumpId": "...",
  "ranges": {
    "20180315_20180330": [
      {
        "url": "...",
        "filename": "..."
      },
      {
        "url": "...",
        "filename": "..."
      }
    ],
    "20180331_20180414": [
      {
        "url": "...",
        "filename": "..."
      }
    ]
  }
}
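The output above can be consumed programmatically. A minimal sketch (the sample object below mirrors the shape shown, with placeholder ids, urls, and filenames) that summarizes each range:

```javascript
// Sketch: summarize historical-requests output by date range.
// The object below mirrors the sample shape above; all values are placeholders.
const output = {
  dumpId: "dump-123",
  ranges: {
    "20180315_20180330": [
      { url: "u1", filename: "f1" },
      { url: "u2", filename: "f2" }
    ],
    "20180331_20180414": [{ url: "u3", filename: "f3" }]
  }
};

// List each range with its file count, oldest first.
const summary = Object.entries(output.ranges)
  .sort(([a], [b]) => a.localeCompare(b))
  .map(([range, files]) => `${range}: ${files.length} file(s)`);

console.log(summary.join("\n"));
```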

Developing

Process:

  1. Write some code
  2. Write tests
  3. Open a pull request

Running tests

In Docker

If you use Docker, you can run the tests inside a container:

./build.sh

Native

npm install .
npm test

canvas-data-cli's People

Contributors

dlecocq, howderek, kblibr


canvas-data-cli's Issues

Out of memory unpacking Requests

I'm running this on my laptop (16GB RAM) and keep being told, when I unpack requests, that JavaScript is running out of memory. I note that my requests.txt file went from 825K to ~45MB as a result of this unpacking, so I'm not sure whether it got fully unpacked. But…

$ canvasDataCLI unpack -c config.js -f requests
outputting requests to /Users/sethbattis/canvas-data-cli/unpackedFiles/requests.txt

<--- Last few GCs --->

  440187 ms: Mark-sweep 1390.3 (1434.8) -> 1390.2 (1434.8) MB, 5311.6 / 0 ms [allocation failure] [GC in old space requested].
  444771 ms: Mark-sweep 1390.2 (1434.8) -> 1390.2 (1434.8) MB, 4582.4 / 0 ms (+ 1.2 ms in 1 steps since start of marking, biggest step 1.2 ms) [allocation failure] [GC in old space requested].
  449126 ms: Mark-sweep 1390.2 (1434.8) -> 1390.2 (1434.8) MB, 4354.9 / 0 ms [last resort gc].
  453310 ms: Mark-sweep 1390.2 (1434.8) -> 1390.2 (1434.8) MB, 4184.3 / 0 ms [last resort gc].


<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 0x14d6bb8c9e31 <JS Object>
    2: onwrite(aka onwrite) [_stream_writable.js:~329] [pc=0x1aab841dec64] (this=0x14d6bb804189 <undefined>,stream=0x33b49e8f05b9 <a WriteStream with map 0x3949478aa079>,er=0x14d6bb804189 <undefined>)
    3: /* anonymous */(aka /* anonymous */) [_stream_writable.js:~88] [pc=0x1aab84113e97] (this=0x14d6bb804189 <undefined>,er=0x14d6bb804189 <undefined>)
    4: arguments adaptor frame: 0->1
    5...

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
 1: node::Abort() [/usr/local/bin/node]
 2: node::FatalException(v8::Isolate*, v8::Local<v8::Value>, v8::Local<v8::Message>) [/usr/local/bin/node]
 3: v8::Utils::ReportApiFailure(char const*, char const*) [/usr/local/bin/node]
 4: v8::internal::V8::FatalProcessOutOfMemory(char const*, bool) [/usr/local/bin/node]
 5: v8::internal::Factory::NewFixedArray(int, v8::internal::PretenureFlag) [/usr/local/bin/node]
 6: v8::internal::TypeFeedbackVector::New(v8::internal::Isolate*, v8::internal::Handle<v8::internal::TypeFeedbackMetadata>) [/usr/local/bin/node]
 7: v8::internal::CompilationInfo::EnsureFeedbackVector() [/usr/local/bin/node]
 8: v8::internal::FullCodeGenerator::MakeCode(v8::internal::CompilationInfo*) [/usr/local/bin/node]
 9: v8::internal::GenerateBaselineCode(v8::internal::CompilationInfo*) [/usr/local/bin/node]
10: v8::internal::CompileBaselineCode(v8::internal::CompilationInfo*) [/usr/local/bin/node]
11: v8::internal::GetUnoptimizedCodeCommon(v8::internal::CompilationInfo*) [/usr/local/bin/node]
12: v8::internal::Compiler::GetLazyCode(v8::internal::Handle<v8::internal::JSFunction>) [/usr/local/bin/node]
13: v8::internal::Runtime_CompileLazy(int, v8::internal::Object**, v8::internal::Isolate*) [/usr/local/bin/node]
14: 0x1aab83d0961b
15: 0x1aab83d31ed9
Abort trap: 6

Sync option delete requests files

Hi, I am trying to get a handle on this tool and ran into what I think is a bug. When I run canvasDataCli, it downloads the data but then deletes it at the end during the cleanup process.

Has anyone experienced this issue?

Thanks

Two Canvas data exports in the same night

The sync command doesn't know how to handle two separate data dumps performed by Instructure on the same night when the second dump is just for the requests table and doesn't include the other tables. The sync will clear out all the other local data files and sync only the new requests table data. You then have to use the fetch command to grab the other updated tables manually.

premature close on unpack

Description

Able to fetch compressed table data without issue. When attempting to unpack, the file is partially unpacked and then the stream closes prematurely. The resulting file is valid but is missing the last 10 percent of the data.

canvasDataCli unpack -c $canvas_path/config.js -f learning_outcome_dim -l debug

Additional Information

  • canvas cli tool version 0.6.2
  • Node Version 8.16.1
  • Platform: Ubuntu 19.04
  • Logs:
will unpack learning_outcome_dim
outputting learning_outcome_dim to /media/acura/DEV/canvas/unpackedfiles/learning_outcome_dim.txt
an error occured
Error: premature close
    at MultiStream.onclose (/usr/lib/node_modules/canvas-data-cli/node_modules/end-of-stream/index.js:47:89)
    at MultiStream.emit (events.js:203:15)
    at MultiStream.destroy (/usr/lib/node_modules/canvas-data-cli/node_modules/multistream/index.js:65:8)
    at MultiStream._gotNextStream (/usr/lib/node_modules/canvas-data-cli/node_modules/multistream/index.js:94:10)
    at MultiStream._next (/usr/lib/node_modules/canvas-data-cli/node_modules/multistream/index.js:85:10)
    at Readable.onEnd (/usr/lib/node_modules/canvas-data-cli/node_modules/multistream/index.js:120:10)
    at Object.onceWrapper (events.js:286:20)
    at Readable.emit (events.js:198:13)
    at endReadableNT (/usr/lib/node_modules/canvas-data-cli/node_modules/readable-stream/lib/_stream_readable.js:1010:12)
    at process._tickCallback (internal/process/next_tick.js:63:19)
Error: premature close
    at MultiStream.onclose (/usr/lib/node_modules/canvas-data-cli/node_modules/end-of-stream/index.js:47:89)
    at MultiStream.emit (events.js:203:15)
    at MultiStream.destroy (/usr/lib/node_modules/canvas-data-cli/node_modules/multistream/index.js:65:8)
    at MultiStream._gotNextStream (/usr/lib/node_modules/canvas-data-cli/node_modules/multistream/index.js:94:10)
    at MultiStream._next (/usr/lib/node_modules/canvas-data-cli/node_modules/multistream/index.js:85:10)
    at Readable.onEnd (/usr/lib/node_modules/canvas-data-cli/node_modules/multistream/index.js:120:10)
    at Object.onceWrapper (events.js:286:20)
    at Readable.emit (events.js:198:13)
    at endReadableNT (/usr/lib/node_modules/canvas-data-cli/node_modules/readable-stream/lib/_stream_readable.js:1010:12)
    at process._tickCallback (internal/process/next_tick.js:63:19)

Not Working in Ubuntu

When I try to install the Canvas-Data-Cli it tells me this:

npm WARN [email protected] requires a peer of chai@>= 1.6.1 < 2 but none is installed. You must install peer dependencies yourself.

Then if I try to run it from the command line I get this:

canvasDataCli
module.js:538
throw err;
^

Error: Cannot find module './lib/Api'
at Function.Module._resolveFilename (module.js:536:15)
at Function.Module._load (module.js:466:25)
at Module.require (module.js:579:17)
at require (internal/module.js:11:18)
at Object. (/usr/local/lib/node_modules/canvas-data-cli/index.js:2:8)
at Module._compile (module.js:635:30)
at Object.Module._extensions..js (module.js:646:10)
at Module.load (module.js:554:32)
at tryModuleLoad (module.js:497:12)
at Function.Module._load (module.js:489:3)

Can anyone tell me what's going on? I have it working fine on Windows 10.

fileUrl is not defined

During download, I'm getting an error:

ReferenceError: fileUrl is not defined
at FileDownloader._downloadRetry (C:\Users\JasonMiles\AppData\Roaming\npm\node_modules\canvas-data-cli\lib\FileDownloader.js:32:94)
at null._onTimeout (C:\Users\JasonMiles\AppData\Roaming\npm\node_modules\canvas-data-cli\lib\FileDownloader.js:46:26)
at Timer.listOnTimeout (timers.js:92:15)

Missing Foreign Key Data

Description

I have set up the CLI data extraction successfully. This includes setting up the CLI extract (see Github site) which pulls down our data (as dimension and fact tables) into a set of flat text files. I have successfully set up a SQL Server database along with all foreign keys and relationships between those dimension and fact tables. Finally, I set up an SSIS process which reads those downloaded flat files and loads the tables in the SQL Server database.

This works wonderfully - EXCEPT for a small detail (which really isn't small) - there are some fact tables that are getting pulled down that contain foreign key data that does not exist in our dimension tables. This is not good. At this point, we are pretty new, and do a "truncation and load" of all of our tables. I know this will need to change down the road as we get larger. But for now, I simply get all of the data for each load. I have the sequence of how to load the dimension tables and they all load fine. However, when the fact tables (details) are being loaded, some of the loads fail because it is trying to put data into the fact tables that doesn't correspond to data in the dimension tables.

One example: the wiki_page_fact table brings down rows whose usr_id (user id) does not exist in the usr_dim table. Null values are OK, but rows that reference nonexistent users are a problem.

To get this data to load, I've disabled my FK relationship enforcement. Overall, I have 20 tables that I had to disable the FK enforcement to get loaded. Most (but not all) of them are for user ids that are not in the usr_dim table.

Is anyone else seeing this? If so, any solutions other than disabling the FK relationships??

Any help/direction is appreciated!!

FYI...the tables I've disabled FK enforcement for are as follows (just showing the script):

-- Disabled Fact Table FKs

ALTER TABLE [cvs].[assign_overr_usr_roll_fact]
NOCHECK CONSTRAINT [FK_usr_assign_overr_usr_roll_fact];
GO

ALTER TABLE [cvs].[conference_partic_fact]
NOCHECK CONSTRAINT FK_usr_conference_partic_fact;
GO

ALTER TABLE cvs.course_ui_nav_item_fact
NOCHECK CONSTRAINT FK_course_ui_canvas_nav_course_ui_nav_item_fact
GO

ALTER TABLE cvs.discussion_entry_fact
NOCHECK CONSTRAINT FK_topic_editor_discussion_entry_fact
go

ALTER TABLE cvs.discussion_topic_fact
NOCHECK CONSTRAINT FK_usr_discussion_topic_fact
GO

ALTER TABLE cvs.discussion_topic_fact
NOCHECK CONSTRAINT FK_editor_discussion_topic_fact
go
ALTER TABLE cvs.enrollment_fact
NOCHECK CONSTRAINT FK_usr_enrollment_fact
GO

ALTER TABLE cvs.file_fact
NOCHECK CONSTRAINT FK_uploader_file_fact
GO

ALTER TABLE cvs.file_fact
NOCHECK CONSTRAINT FK_folder_file_fact
GO
ALTER TABLE cvs.module_compl_req_fact
NOCHECK CONSTRAINT FK_discussion_topic_editor_module_compl_req_fact
go

ALTER TABLE cvs.module_compl_req_fact
NOCHECK CONSTRAINT FK_usr_module_compl_req_fact
go
ALTER TABLE cvs.module_item_fact
NOCHECK CONSTRAINT FK_discussion_topic_editor_module_item_fact
GO

ALTER TABLE cvs.module_item_fact
NOCHECK CONSTRAINT FK_usr_module_item_fact
GO
ALTER TABLE cvs.pseudonym_fact
NOCHECK CONSTRAINT FK_usr_pseudonym_fact
GO

ALTER TABLE cvs.quiz_quest_answer_fact
NOCHECK CONSTRAINT FK_assessment_quest_quiz_quest_answer_fact
GO

ALTER TABLE cvs.quiz_quest_answer_fact
NOCHECK CONSTRAINT FK_quiz_quest_group_quiz_quest_answer_fact
GO
ALTER TABLE cvs.quiz_quest_fact
NOCHECK CONSTRAINT FK_assessment_quest_quiz_quest_fact
GO

ALTER TABLE cvs.quiz_quest_fact
NOCHECK CONSTRAINT FK_quiz_quest_group_quiz_quest_fact
GO
ALTER TABLE cvs.submis_file_fact
NOCHECK CONSTRAINT FK_submis_file_submis_file_fact
GO
ALTER TABLE cvs.wiki_page_fact
NOCHECK CONSTRAINT FK_usr_wiki_page_fact
GO

Additional Information

  • Node Version: 10.15.3
  • Platform: Windows
  • Logs: files are not generating. Any help on this is appreciated. Using config.js for options.

401 Unauthorized Errors on All actions

Description

We've started receiving 401 Unauthorized Errors on all api-bound requests from the CLI. We have been using the CLI for a couple of years to pull down data sets that get built out for internal use. Yesterday I noticed that the build was throwing errors. I started the process manually to find out what was going on and discovered the 401 errors. I think this is a recent change (maybe the last week or so?). Other (non-CanvasDataCli) access works as expected. I reset my KEY/SECRET pair which didn't help. I upgraded to the latest version of the CanvasDataCli, which also didn't help.

Additional Information

  • Node Version : v8.10.0
  • Client Version: 0.6.6
  • Platform: Ubuntu Linux LTS
  • Logs:
$ canvasDataCli -l debug -c config.js list
an error occured
{ [Error] errorCode: 401, resp: 'Unauthorized' }

Unhandled Exceptions Connecting to Server

Description

Here is the error when running this from the Windows command line:

R:\canvasData>canvasDataCli sync -c ./config.js                                                 
fetching current list of files from API...                                                      
an error occured                                                                                
{ Error: read ECONNRESET                                                                            
    at TLSWrap.onStreamRead (internal/stream_base_commons.js:111:27) errno: 'ECONNRESET', code: 
'ECONNRESET', syscall: 'read' }                                                                 
Error: read ECONNRESET                                                                              
      at TLSWrap.onStreamRead (internal/stream_base_commons.js:111:27)      

Here is the error when running this from the Linux subsystem for Windows (version info is below):

/mnt/r/canvasData# canvasDataCli sync --level debug -c ./config.js 
fs.js:1657
binding.lstat(baseLong);
            ^
Error: ENOTCONN: socket is not connected, lstat '/mnt/r'
    at Object.realpathSync (fs.js:1657:15)                                                 
    at toRealPath (module.js:164:13)                                                       
    at Function.Module._findPath (module.js:213:22)                                        
    at Function.Module._resolveFilename (module.js:545:25)                                 
    at Function.Module._load (module.js:474:25)                                            
    at Module.require (module.js:596:17)                                                   
    at require (internal/module.js:11:18)                                                  
    at Object.run (/usr/lib/node_modules/canvas-data-cli/lib/cli.js:108:16)                
    at Object.<anonymous> (/usr/lib/node_modules/canvas-data-cli/bin/canvasDataCli:4:5)
    at Module._compile (module.js:652:30)

I had Node.js v 10 installed in the Linux subsystem and received a different error:

/mnt/r/canvasData# canvasDataCli sync -c ./config.js               
fetching current list of files from API...
an error occured
{ Error: Client network socket disconnected before secure TLS connection was established
     at TLSSocket.onConnectEnd (_tls_wrap.js:1086:19)
     at Object.onceWrapper (events.js:273:13)
     at TLSSocket.emit (events.js:187:15)
     at endReadableNT (_stream_readable.js:1092:12)
     at process._tickCallback (internal/process/next_tick.js:63:19)
  code: 'ECONNRESET',
  path: null,
  host: 'api.inshosteddata.com',
  port: 443,
  localAddress: undefined }
Error: Client network socket disconnected before secure TLS connection was established     
     at TLSSocket.onConnectEnd (_tls_wrap.js:1086:19)
     at Object.onceWrapper (events.js:273:13)
     at TLSSocket.emit (events.js:187:15)
     at endReadableNT (_stream_readable.js:1092:12)
     at process._tickCallback (internal/process/next_tick.js:63:19) 

As an end user, I need the software to handle exceptions and provide informative error messages that help diagnose the cause of the error and identify potential solutions. The desired behavior is for the software developed by Instructure to work as intended, so we can access our data programmatically. It would also be great if some of the devs from Instructure responded to the issues submitted to this repository.

Additional Information

  • Node Version 10.10.0 & 8.11.4
  • Platform: Windows 10 (Education license) & Linux Subsystem for Windows (Ubuntu 16.04.5)
  • Logs: Adding the debug flag does not appear to have any effect on the logging of the errors

Will this CLI still be functional after the move to Canvas Data 2?

Canvas is closing Canvas Data 1 on December 31 2023, and will be moving to Canvas Data 2:

https://community.canvaslms.com/t5/Admin-Guide/What-is-Canvas-Data-2/ta-p/560956?mkt_tok=NDQ5LUJWSi01NDMAAAGP_pbE5fpndX95_0HKtJ_QkFaorc1dXLy4rprHVZvvwQT7Cb-5R2DtNWJAf44xaSTzRH2xWRCwBocKX6CchMzg4-4z6WNnu-2IDUTtY8cAhOBIQwU

We are dependent on this CLI as an essential component of our workflow and do not have a drop-in replacement should this tool be deprecated. While the commit history shows no changes or updates for a long time, I see no notice of incompatibility or deprecation warning.

Will we expect this CLI to break with the deprecation of Canvas Data v1?

httpsProxy for filedownloader.js

Description

Looking at the .js libs in use, it appears that while Api.js implements the httpsProxy setting from config.js, FileDownloader.js does not.

Specifically, it appears this request builder:
var r = request({ method: 'GET', url: downloadLink.url });

Might need a proxy opt:
var r = request({ method: 'GET', url: downloadLink.url, proxy: config.httpsProxy });

Additional Information

  • Node Version v12.18.3
  • Platform: Windows
  • Logs: N/A

Unpack filters don't work as documented

Description

I run the below as per the readme:
canvasDataCli unpack -c path/to/config.js -f user_dim,account_dim

I get:
no files matched filter, nothing will be unpacked
unpack command completed successfully

If I run the following for each table/file individually, it works as expected:
canvasDataCli unpack -c config.js -f user_dim
canvasDataCli unpack -c config.js -f account_dim

There seems to be an issue parsing the -f parameter.

Additional Information

  • Node Version: v10.19.0
  • Platform: Ubuntu 20.0.4.1
  • Logs: (If you can please run the CLI with: -l debug and provide us the debug logs.)

Issue with unpack, unexpected end of file.

Description

Ran canvasDataCli unpack with most tables (skipped requests). After finishing (although I'm not sure it did finish), it comes back with:

events.js:174
throw er; // Unhandled 'error' event
Error: unexpected end of file
at Zlib.zlibOnError [as onerror] (zlib.js:162:17)
Emitted 'error' event at:
at Zlib.zlibOnError [as onerror] (zlib.js:165:8)

Additional Information

  • Node Version 10.16.0
  • Platform: Windows 10 build 1903

NPM is giving access issue


Description

I was using canvasDataCli on my local machine without any issues. Now I am moving to a Linux server, where I want to download the files and use an ETL tool to load them into an Oracle database. My Linux admin is not giving me access to install Node.js and run npm, so I asked them to follow the instructions to install Node and the canvas-data-cli npm package (ref: https://community.canvaslms.com/docs/DOC-6600-how-to-use-the-canvas-data-cli-tool). They're getting the following error:

`[USER@SERVER]$ npm install -g canvas-data-cli
npm WARN deprecated [email protected]: Legacy versions of mkdirp are no longer supported. Please update to mkdirp 1.x. (Note that the API surface has changed to use Promises in 1.x.)
npm WARN deprecated [email protected]: request has been deprecated, see request/request#3142
npm WARN checkPermissions Missing write access to /usr/lib/node_modules
npm ERR! code EACCES
npm ERR! syscall access
npm ERR! path /usr/lib/node_modules
npm ERR! errno -13
npm ERR! Error: EACCES: permission denied, access '/usr/lib/node_modules'
npm ERR! [Error: EACCES: permission denied, access '/usr/lib/node_modules'] {
npm ERR! stack: "Error: EACCES: permission denied, access '/usr/lib/node_modules'",
npm ERR! errno: -13,
npm ERR! code: 'EACCES',
npm ERR! syscall: 'access',
npm ERR! path: '/usr/lib/node_modules'
npm ERR! }
npm ERR!
npm ERR! The operation was rejected by your operating system.
npm ERR! It is likely you do not have the permissions to access this file as the current user
npm ERR!
npm ERR! If you believe this might be a permissions issue, please double-check the
npm ERR! permissions of the file and its containing directories, or try running
npm ERR! the command again as root/Administrator.

npm ERR! A complete log of this run can be found in:
npm ERR! /home/d_smalisetty/.npm/_logs/2020-03-24T20_03_48_113Z-debug.log
[USER@SERVER]$ npm -v
6.13.4
[USER@SERVER]$ node -v
v12.16.1
[USER@SERVER]`

Additional Information

  • Node Version 12.6.1
  • Platform:
    [USER@SERVER]$ uname Linux [USER@SERVER]$ uname -r 3.10.0-1062.12.1.el7.x86_64
  • Logs: (If you can please run the CLI with: -l debug and provide us the debug logs.)

Difficulty getting the sample config

Please help me understand why I am getting the following when I try to generate the sample configuration. Thanks.

Isabella-Houchards-iMac:~ isabellahouchard$ npm install -g canvas-data-cli
/usr/local/bin/canvasDataCli -> /usr/local/lib/node_modules/canvas-data-cli/bin/canvasDataCli
/usr/local/lib
└── [email protected]

Isabella-Houchards-iMac:~ isabellahouchard$ canvasDataCli sampleConfig
module.js:471
throw err;
^

Error: Cannot find module './lib/Api'
at Function.Module._resolveFilename (module.js:469:15)
at Function.Module._load (module.js:417:25)
at Module.require (module.js:497:17)
at require (internal/module.js:20:19)
at Object. (/usr/local/lib/node_modules/canvas-data-cli/index.js:2:8)
at Module._compile (module.js:570:32)
at Object.Module._extensions..js (module.js:579:10)
at Module.load (module.js:487:32)
at tryModuleLoad (module.js:446:12)
at Function.Module._load (module.js:438:3)
Isabella-Houchards-iMac:~ isabellahouchard$

Connection reset breaks sync

Description

What I did

Closed laptop when leaving the house this morning while running a first time sync.

What I expected to happen

  1. Once computer was reopened it would reestablish the connection and continue downloading files.
  2. If number 1 above didn't work, I could terminate the process in the terminal and re-run the synchronize command

What actually happens

Now I keep repeatedly getting a TLS error when trying to run the sync command (copied below). Additionally, if I log in to the front end and navigate to the Canvas Data Portal page, I get an error message/icon saying that the connection was reset, with no other information.

Additional Information

$ node --version
v10.8.0
$ canvasDataCli --version
0.6.1
$ canvasDataCli sync -c ./config.js -l debug
fetching current list of files from API...
an error occured
{ Error: read ECONNRESET
    at TLSWrap.onread (net.js:660:25) errno: 'ECONNRESET', code: 'ECONNRESET', syscall: 'read' }
Error: read ECONNRESET
    at TLSWrap.onread (net.js:660:25)

Installing Canvas Data CLI

Hi Addison,

My name is Tom Lamy and I work for the University of New Hampshire. I'm getting my feet wet with Canvas Data and found your tool for syncing it. I'd really like to try this, but got an error when I ran canvasDataCli sampleConfig. I installed:

  • nodejs.x86_64 0:0.10.40-1nodesource.el7.centos
  • libstdc++-devel-4.8.3-9.el7.x86_64
  • gcc-c++-4.8.3-9.el7.x86_64
  • canvas-data-cli

Generating the stub (canvasDataCli sampleConfig) I get this error:

357 # canvasDataCli sampleConfig

/usr/lib/node_modules/canvas-data-cli/lib/logger.js:80
throw _iteratorError;
^
ReferenceError: Symbol is not defined
at Object. (/usr/lib/node_modules/canvas-data-cli/lib/logger.js:65:31)
at Module._compile (module.js:456:26)
at Object.Module._extensions..js (module.js:474:10)
at Module.load (module.js:356:32)
at Function.Module._load (module.js:312:12)
at Module.require (module.js:364:17)
at require (module.js:380:17)
at Object. (/usr/lib/node_modules/canvas-data-cli/lib/cli.js:6:14)
at Module._compile (module.js:456:26)
at Object.Module._extensions..js (module.js:474:10)

What symbol is not defined, or what am I failing to define? Anything you can tell me would be appreciated.

Thanks -
[email protected]

incorrect header check - wiki_page_fact

I completed the sync without errors, then tried to unpack all files. There seems to be a problem with the headers of the wiki_page_fact table. All the output files had header rows, but no data. I reran the unpack command for just a few tables and it worked fine for them.

outputting wiki_page_fact to path\unpackedFiles\wiki_page_fact.txt
events.js:141
throw er; // Unhandled 'error' event
^

Error: incorrect header check
at Zlib._handle.onerror (zlib.js:363:17)

historical_requests url format

Description

Downloaded the latest version and ran the historical-requests command:

canvasDataCli historical-requests -c config.js

Date ranges were not part of the output.

In https://github.com/instructure/canvas-data-cli/blob/master/src/HistoricalRequests.js

getRangeForFile parses the url with url.split('/')[7]. However, when I look at our URLs, they are not in that format. They are:

https://bucket.amazonaws.com/ACCOUNT/requests/RANGE/etc/etc

Therefore the code should be url.split('/')[5]

I made this change in my local copy and it resolved the issue.
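Rather than another fixed index, one could locate the segment that follows "requests" in the path, which works for either URL layout. A minimal sketch (the example URL below is a placeholder in the reported format):

```javascript
// Sketch: find the range segment by position relative to "requests",
// instead of a hard-coded url.split("/") index. URL is a placeholder.
const url = "https://bucket.amazonaws.com/ACCOUNT/requests/20180315_20180330/part-00000.gz";

// Split the path into non-empty segments and take the one after "requests".
const parts = new URL(url).pathname.split("/").filter(Boolean);
const range = parts[parts.indexOf("requests") + 1];

console.log(range); // prints 20180315_20180330
```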

Additional Information

  • Node Version v18.2.0
  • Platform: Alpine 3.16.0
  • Logs: (If you can please run the CLI with: -l debug and provide us the debug logs.)

Default config path

I would love a default config path, so I wouldn't need to use the -c switch for every invocation.

unpack error

Hi Addison,

I was able to successfully sync using canvas-data-cli, but I am receiving an error when using unpack. It creates the text file with headers, but the data is not included. I have received two different errors, but the result is the same.

canvasDataCli -c config.js unpack -f account_dim
outputting account_dim to filepath/account_dim.txt
events.js:141
throw er; // Unhandled 'error' event
^

Error: unknown compression method
at Zlib._handle.onerror (zlib.js:363:17)

canvasDataCli -c config.js unpack -f enrollment_dim
outputting enrollment_dim to filepath/enrollment_dim.txt
events.js:141
throw er; // Unhandled 'error' event
^

Error: incorrect header check
at Zlib._handle.onerror (zlib.js:363:17)

Uses deprecated `new Buffer()` which gives warning

Description

When running anything that makes requests to the API, a warning is emitted on newer versions of Node.js:

(node:41042) [DEP0005] DeprecationWarning: Buffer() is deprecated due to security and usability issues. Please use the Buffer.alloc(), Buffer.allocUnsafe(), or Buffer.from() methods instead.
(Use `node --trace-deprecation ...` to show where the warning was created)

This is because the code uses the new Buffer(string) constructor, which is deprecated and has emitted this warning since Node 10: https://nodejs.org/api/buffer.html#new-bufferstring-encoding
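The fix is mechanical, per the Node documentation linked above:

```javascript
// Deprecated (emits DEP0005 on Node >= 10):
// const buf = new Buffer(someString);

// Replacement: Buffer.from() produces the same contents without the warning.
const buf = Buffer.from('key:secret', 'utf8');
console.log(buf.toString('base64'));
```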

Missing files in new schema

The grading_period_fact, grading_period_dim, and grading_period_group_dim tables are missing when the sync runs, although the new score tables are included. These tables are part of schema version 1.16.0 but don't seem to be downloading. I'm not sure if this is a problem with the script or the API. Only 79 tables are listed when downloading. Do I need to re-generate my API key?

[Suggestion] Logging / Events for a better integration

To build great things on Canvas Data, updating the data needs to be automated. I appreciate your script, but it lacks events that would let another application (possibly written in a different language) track its state.
There are always alternatives, but it would be useful if the script could record its progress and state in a database (PostgreSQL or MySQL); one could then simply keep the script running at all times with something like supervisor.

DB table:
ID | table_name | start_at | finished_at | status (finished / failed / in_progress)
1 | account_dim | 30-03-2021 15:00 | NULL | in_progress

It could also be complemented with a general update table, which is updated only once all of the tables have been updated.

DB table:
ID | start_at | finished_at | status (finished / in_progress / failed)

Finally, this could be complemented with a command-line parameter for the maximum number of download retries per table:

canvasDataCli sync -c path/to/config.js --max_attempts=3
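A sketch of the proposed retry cap (the --max_attempts flag is hypothetical, and the wrapper below is not the CLI's actual downloader):

```javascript
// Hypothetical: run a download function up to maxAttempts times,
// surfacing the last error once the cap is reached.
async function downloadWithRetries(downloadFn, maxAttempts) {
  let lastErr;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await downloadFn();
    } catch (err) {
      lastErr = err;
    }
  }
  throw new Error(`max number of retries (${maxAttempts}) reached: ${lastErr.message}`);
}
```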

The header content contains invalid characters

When I tried to sync using the command line utility, I received a "The header content contains invalid characters" error.

The machine is a Windows 2012 R2 server (virtual machine)
Do you need any additional information?

Command line error output:

starting from sequence 0
_http_outgoing.js:348
throw new TypeError('The header content contains invalid characters');
^

TypeError: The header content contains invalid characters
at ClientRequest.OutgoingMessage.setHeader (_http_outgoing.js:348:11)
at new ClientRequest (_http_client.js:85:14)
at Object.exports.request (http.js:31:10)
at Object.exports.request (https.js:197:15)
at Request.start (C:\Users\lis10\AppData\Roaming\npm\node_modules\canvas-data-cli\node_modules\request\request.js:747:30)
at Request.end (C:\Users\lis10\AppData\Roaming\npm\node_modules\canvas-data-cli\node_modules\request\request.js:1381:10)
at end (C:\Users\lis10\AppData\Roaming\npm\node_modules\canvas-data-cli\node_modules\request\request.js:575:14)
at Immediate._onImmediate (C:\Users\lis10\AppData\Roaming\npm\node_modules\canvas-data-cli\node_modules\request\request.js:589:7)
at processImmediate [as _immediateCallback]
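One possible cause (an assumption, not confirmed from this report) is a stray control or non-Latin-1 character, such as a trailing newline, pasted into the API key or secret; Node rejects such values when they are written into a request header. Node's internal check is roughly the pattern below, so you can pre-screen your credentials:

```javascript
// Header values may only contain tab, printable ASCII, and \x80-\xff;
// anything else triggers "The header content contains invalid characters".
function hasInvalidHeaderChars(value) {
  return /[^\t\x20-\x7e\x80-\xff]/.test(value);
}

console.log(hasInvalidHeaderChars('my-api-key'));   // → false
console.log(hasInvalidHeaderChars('my-api-key\n')); // → true
```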

How to execute a NodeJs script in azure-scm-console, having canvas-data-cli command

Description

I can run the canvasDataCli sync command directly from console successfully as:

canvasDataCli sync -c ./config.js

Here is the code file, test.js:

var exec = require('child_process').exec;

// Build the command with a template literal.
var configPath = './config.js';
var command = `canvasDataCli sync -c ${configPath}`;

function exec_command_promise(cmd) {
  return new Promise(function (resolve, reject) {
    var child = exec(cmd);

    // event : stdout
    child.stdout.on('data', data => {
      console.log(data);
    });

    // event : stderr
    child.stderr.on('data', data => {
      console.log(data);
      reject(data);
    });

    // event : close
    child.on('close', code => {
      console.log(code);
      resolve(code);
    });
  });
}

exec_command_promise(command)
  .then(code => console.log('sync exited with code', code))
  .catch(err => console.error('sync failed:', err));

Additional Information

Node version: 10.14.1
Platform: Windows

Initial Sync fails with request timeout

Got everything installed under Windows 10. I set the environment variables, but they were not read:

config at D:\Users\xxxx\Documents\Canvas\Data\config.js is invalid
missing key, secret fields in config

config settings [as distributed]
key: process.env.CD_API_KEY, // don't hardcode creds, keep them in environment variables ideally!
secret: process.env.CD_API_SECRET

so I hard-coded the keys and launched with:
D:\temp>canvasDataCli sync -c "D:\Users\xxxx\Documents\Canvas\Data\config1.js"

Started off okay...
starting from sequence 0
will process 100 dumps
downloading 113 artifacts

but after a long while (see attached log: syncLog.txt) I got this error:
an error occured
[Error: max number of retries reached for requests-00000-ca7e907b.gz, aborting]
Error: max number of retries reached for requests-00000-ca7e907b.gz, aborting
at C:\Users\xxxx\AppData\Roaming\npm\node_modules\canvas-data-cli\lib\FileDownloader.js:52:18
at null.onTimeout (C:\Users\xxxx\AppData\Roaming\npm\node_modules\canvas-data-cli\node_modules\re\lib\re.js:90:43)
at Timer.listOnTimeout (timers.js:92:15)

Max Number of Retries Reached

Description

We use canvas-data-cli to download our files daily at 3:30 pm. This has been our regular process for over a year now.
Starting yesterday, our process is failing with the following error:

Error: max number of retries reached for submission_dim-00027-3052c8c0.gz, aborting
    at /usr/lib/node_modules/canvas-data-cli/lib/FileDownloader.js:52:18
    at Timeout._onTimeout (/usr/lib/node_modules/canvas-data-cli/node_modules/re/lib/re.js:90:43)
    at listOnTimeout (internal/timers.js:554:17)
    at processTimers (internal/timers.js:497:7)

Our process is a combination of the list, grab, fetch, and unpack commands, with some directory creation and removal steps.
Instead of running the process as a whole, I tried to run only the grab command to download the .gz files of the latest dump ID. The error is consistent.

I do not know if something needs to be updated in the /usr/lib/node_modules directory or somewhere else. Please help me resolve this.

Additional Information

  • Node version: v14.11.0
  • npm version: 6.14.8
  • canvasDataCli version: 0.6.6
  • Platform: RHEL 7 (kernel 3.10.0-1127.19.1.el7.x86_64)
  • Logs: attaching log file.
