grapler's People

Contributors

jaikrishnats, kcratie, saumitraaditya, smahesul, vahid-dan

grapler's Issues

Trying to untar a folder

GrapleGetExperimentResults lists every entry inside the Results directory, then tries to untar and delete each one.

It is probably meant to process only files, but it sometimes tries to untar the Sims directory and fails. Log below.

@smahesul @kcratie This is the issue I mentioned two weeks ago.

GrapleCheckExperimentCompletion(graplerURL, expId1)
$curr_status
[1] "completed"

setwd(expRootDir)
GrapleGetExperimentResults(graplerURL, expId1)
--2016-03-29 17:13:54-- http://10.244.37.116:5000/KF085HXATWIE8A2LGJE4ZHWVZBU1YH7X4FHLYUU9/Results/output.tar.gz
Connecting to 10.244.37.116:5000... connected.
HTTP request sent, awaiting response... 200 OK
Length: 573120 (560K) [application/x-tar]
Saving to: ‘/home/jaikrishna/work/SimRoot1/Exp1/results.tar.gz’

 0K .......... .......... .......... .......... ..........  8% 10.7M 0s
50K .......... .......... .......... .......... .......... 17% 7.56M 0s

100K .......... .......... .......... .......... .......... 26% 10.1M 0s
150K .......... .......... .......... .......... .......... 35% 10.7M 0s
200K .......... .......... .......... .......... .......... 44% 10.4M 0s
250K .......... .......... .......... .......... .......... 53% 10.7M 0s
300K .......... .......... .......... .......... .......... 62% 10.4M 0s
350K .......... .......... .......... .......... .......... 71% 10.4M 0s
400K .......... .......... .......... .......... .......... 80% 10.8M 0s
450K .......... .......... .......... .......... .......... 89% 10.4M 0s
500K .......... .......... .......... .......... .......... 98% 10.3M 0s
550K ......... 100% 11.7M=0.05s

2016-03-29 17:13:54 (10.2 MB/s) - ‘/home/jaikrishna/work/SimRoot1/Exp1/results.tar.gz’ saved [573120/573120]

/bin/tar: Sims: Cannot read: Is a directory
/bin/tar: At beginning of tape, quitting now
/bin/tar: Error is not recoverable: exiting now
[1] "/home/jaikrishna/work/SimRoot1/Exp1/results.tar.gz"
Warning messages:
1: In untar(x) : ‘/bin/tar -xf 'Sims'’ returned error code 2
2: In file.remove(x) :
cannot remove file 'Sims', reason 'Directory not empty'
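The guard this issue asks for can be sketched as follows. GRAPLEr itself is written in R, so this is only an illustration of the logic in Python, not the package's code: the function name `extract_archives` is hypothetical, and it assumes that every regular file in the directory that is a valid tar archive should be extracted and then deleted, while directories such as Sims are left alone.

```python
import os
import tarfile


def extract_archives(results_dir):
    """Extract and delete tar archives in results_dir, skipping everything else."""
    extracted = []
    for name in sorted(os.listdir(results_dir)):
        path = os.path.join(results_dir, name)
        # Skip directories (such as Sims) and files that are not tar archives.
        if not os.path.isfile(path) or not tarfile.is_tarfile(path):
            continue
        with tarfile.open(path) as archive:
            archive.extractall(results_dir)
        # Remove the archive only after it was extracted without an exception.
        os.remove(path)
        extracted.append(name)
    return extracted
```

In R, the equivalent check would presumably filter the listing to regular files before calling untar and file.remove.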

Incorrect Email Notification

The last 3 jobs are held and progress is at 96%, but I received an email saying the experiment had completed processing.
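One way to avoid this kind of premature notification is to send the email only when every job reports a completed status. The sketch below is not taken from EMS: the function names are hypothetical, and it assumes EMS can read each job's HTCondor JobStatus code, where 4 means Completed and 5 means Held.

```python
COMPLETED = 4  # HTCondor JobStatus code for a completed job
HELD = 5       # HTCondor JobStatus code for a held job


def experiment_progress(job_statuses):
    """Return the fraction of jobs that have completed (0.0 for no jobs)."""
    if not job_statuses:
        return 0.0
    done = sum(1 for status in job_statuses if status == COMPLETED)
    return done / len(job_statuses)


def should_send_completion_email(job_statuses):
    """True only when there is at least one job and all jobs completed."""
    return bool(job_statuses) and all(s == COMPLETED for s in job_statuses)
```

With 72 completed jobs and 3 held jobs, progress is 96% and no completion email would be sent.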

EMS Compression Error

Although the jobs were done, EMS was unable to finalize the results, and experiment completion got stuck at 98%. Running sudo systemctl restart ems.service recovered the results, but the compressed output files seem to be corrupted.

On the Submit Node:

● ems.service - Experiment management service to support GWS
   Loaded: loaded (/etc/systemd/system/ems.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2017-07-04 12:09:39 EDT; 6 days ago
 Main PID: 281976 (python)
    Tasks: 8
   Memory: 33.2G
      CPU: 6d 8h 32min 30.252s
   CGroup: /system.slice/ems.service
           ├─281976 /usr/bin/python ems.py
           ├─488301 perl /usr/bin/parallel -P 2 tar jxf ::: Results62.tar.bz2 Results178.tar.bz2 Results181.tar.bz2 Results205.tar.bz2 Results7.tar.bz2 Results20.tar.bz2 Results172.tar.bz2 Results50.tar.b
           ├─488603 tar jxf Results77.tar.bz2
           ├─488604 bzip2 -d
           ├─488607 tar jxf Results98.tar.bz2
           └─488608 bzip2 -d

Jul 10 12:53:50 graple-Submit python[281976]:         perhaps it is corrupted?  *Possible* reason follows.
Jul 10 12:53:50 graple-Submit python[281976]: bzip2: Inappropriate ioctl for device
Jul 10 12:53:50 graple-Submit python[281976]:         Input file = (stdin), output file = (stdout)
Jul 10 12:53:50 graple-Submit python[281976]: It is possible that the compressed file(s) have become corrupted.
Jul 10 12:53:50 graple-Submit python[281976]: You can use the -tvv option to test integrity of such files.
Jul 10 12:53:50 graple-Submit python[281976]: You can use the `bzip2recover' program to attempt to recover
Jul 10 12:53:50 graple-Submit python[281976]: data from undamaged sections of corrupted files.
Jul 10 12:53:50 graple-Submit python[281976]: tar: Unexpected EOF in archive
Jul 10 12:53:50 graple-Submit python[281976]: tar: Unexpected EOF in archive
Jul 10 12:53:50 graple-Submit python[281976]: tar: Error is not recoverable: exiting now

On RStudio Console after Attempting to Download the Results:

gzip: results.tar.gz: unexpected end of file
gzip: results.tar.gz: uncompress failed
Sims/Sim78_1/Results/output.nc: Truncated tar archive
tar: Error exit delayed from previous errors.
Warning message:
In untar("results.tar.gz") :
  ‘/usr/bin/gzip -dc 'results.tar.gz' | /usr/bin/tar -xf '-'’ returned error code 1
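A truncated archive like the one above could be caught before it is offered for download by reading it end to end. This is a minimal sketch, not part of EMS: `archive_is_intact` is a hypothetical helper that uses Python's standard `tarfile` module to read every member and reports failure on any decompression or tar error.

```python
import tarfile
import zlib


def archive_is_intact(path, chunk_size=1 << 20):
    """Return True if every member of the tar archive can be read in full."""
    try:
        # "r:*" lets tarfile detect gzip, bzip2, or no compression.
        with tarfile.open(path, "r:*") as archive:
            for member in archive:
                if not member.isfile():
                    continue
                stream = archive.extractfile(member)
                # Read and discard the data; truncation raises an exception.
                while stream.read(chunk_size):
                    pass
    except (tarfile.TarError, EOFError, OSError, zlib.error):
        return False
    return True
```

A check of this kind only detects the damage; it does not explain why the archive was cut short in the first place.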

Server Monitoring

The servers need monitoring notifiers, such as a service status notifier and a disk usage notifier.
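As a starting point, the two checks could look something like the sketch below. The function names and the 90% threshold are assumptions made for illustration, and how the alert is delivered (email or otherwise) is left open. The service check relies on `systemctl is-active --quiet`, which exits with status 0 when the unit is active.

```python
import shutil
import subprocess


def disk_usage_alert(path, threshold=0.9):
    """Return a warning string if the filesystem holding path is too full."""
    usage = shutil.disk_usage(path)
    fraction = usage.used / usage.total if usage.total else 0.0
    if fraction >= threshold:
        return f"{path} is {fraction:.0%} full"
    return None


def service_is_active(unit):
    """Return True if systemd reports the unit (e.g. ems.service) as active."""
    try:
        result = subprocess.run(
            ["systemctl", "is-active", "--quiet", unit], check=False
        )
    except FileNotFoundError:
        # systemctl is not installed on this machine.
        return False
    return result.returncode == 0
```

Both functions could be called periodically, for example from a cron job or a systemd timer.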

Byte Order Mark in job_desc.json Files

The job_desc.json file may contain a byte order mark (BOM), especially if the user creates it with certain Windows editors. This can cause an "Invalid job_desc file" error in RStudio.
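On the server side, one possible way to tolerate such files is to decode them with Python's `utf-8-sig` codec, which drops a leading UTF-8 BOM if one is present and otherwise behaves like plain UTF-8. This is only a sketch: `load_job_desc` is a hypothetical helper, not a function known to exist in GRAPLEr or EMS.

```python
import json


def load_job_desc(path):
    """Parse a job_desc.json file, ignoring a leading UTF-8 BOM if present."""
    # "utf-8-sig" strips the BOM bytes EF BB BF before the JSON parser runs.
    with open(path, encoding="utf-8-sig") as handle:
        return json.load(handle)
```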

EDDIE: Error on creating Graple object

I am running through the Project EDDIE examples and getting the following error:

## Error: lexical error: invalid char in json text.
##                                        <html>  <head><title>302 Found<
##                      (right here) ------^

after I run:

MyExp <- GrapleCheckService(MyExp)

I have a reproducible example up at: https://github.com/jsta/EDDIE
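The error text suggests that the JSON parser was handed an HTML "302 Found" page rather than a JSON reply from the service. A client can report this more clearly by checking the status and content type before parsing. The sketch below only illustrates that check in Python. It is not how GRAPLEr (an R package) handles responses, and `parse_service_reply` and its parameters are hypothetical.

```python
import json


def parse_service_reply(status, content_type, body):
    """Parse a JSON reply, failing with a readable message for anything else."""
    if status != 200 or "json" not in content_type.lower():
        preview = body.strip()[:60]
        raise ValueError(
            f"expected JSON from the service, got HTTP {status} "
            f"({content_type}): {preview!r}"
        )
    return json.loads(body)
```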

Unable to Start Condor Automatically after Boot

● condor.service - LSB: Manage condor daemons
   Loaded: loaded (/etc/init.d/condor; bad; vendor preset: enabled)
   Active: failed (Result: exit-code) since Tue 2017-07-11 10:27:45 PDT; 10min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 1192 ExecStart=/etc/init.d/condor start (code=exited, status=1/FAILURE)

Jul 11 10:27:43 comet-W2 systemd[1]: Starting LSB: Manage condor daemons...
Jul 11 10:27:45 comet-W2 condor[1192]: mkdir: cannot create directory ‘FATAL: Unable to locate LOG in /etc/condor/condor_config’: No such file or directory
Jul 11 10:27:45 comet-W2 condor[1192]: chown: cannot access 'FATAL: Unable to locate LOG in /etc/condor/condor_config': No such file or directory
Jul 11 10:27:45 comet-W2 condor[1192]: FATAL: Required directory FATAL: Unable to locate LOG in /etc/condor/condor_config does not exist, or is not a directory.
Jul 11 10:27:45 comet-W2 systemd[1]: condor.service: Control process exited, code=exited status=1
Jul 11 10:27:45 comet-W2 systemd[1]: Failed to start LSB: Manage condor daemons.
Jul 11 10:27:45 comet-W2 systemd[1]: condor.service: Unit entered failed state.
Jul 11 10:27:45 comet-W2 systemd[1]: condor.service: Failed with result 'exit-code'.
