Comments (9)

sherrillmix commented on August 31, 2024

I notice that the final accessionTaxa.sqlite is still being written to the local directory. Does that directory have ~70 GB of space for the final database?

from taxonomizr.

Aron-github commented on August 31, 2024

There should be more than 1 TB available on the disk I'm writing to...

sherrillmix commented on August 31, 2024

Hmm, that should be enough space, but you do appear to be running out of space somewhere. Maybe run getwd() to make sure your working directory is in the right place and system('df -h .') to make sure you have enough space there?

Also, the link to issue #5 displays the correct URL (github.com/sherrillmix/taxonomizr/issues/5) but actually links to an incorrect page, in case anyone else comes across this.

Aron-github commented on August 31, 2024

Unfortunately, getwd() and system('df -h') return exactly what I already knew, i.e. that I'm working in the desired folder, which has 1.4T of free space. Any other suggestions?

Also, sorry for mistakenly referring to issue #5 via URL; it is now corrected in the first post.

sherrillmix commented on August 31, 2024

Hmm, that's pretty odd. I guess if it's not the final directory then it's something related to the temp directory. That can be tricky with SQLite. You said you started R with something like TMPDIR=/path/to/tmp R (note the variable assignment is on the same command line as R), correct? Maybe double-check that it stuck with system('echo $TMPDIR') and system('df -h "$TMPDIR"')?
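To illustrate why the assignment has to sit on the same command line, here is a minimal shell sketch. It assumes a POSIX-like shell; the directory created by mktemp and the names demo_tmp and child_value are placeholders for illustration only, not anything taxonomizr uses. It shows that the VAR=value prefix is seen by the one command it precedes and leaves the parent shell untouched:

```shell
# Hypothetical scratch directory standing in for /path/to/tmp
demo_tmp="$(mktemp -d)"

# The assignment prefix applies only to the single command that follows it,
# so the child process sees the new TMPDIR...
child_value="$(TMPDIR="$demo_tmp" sh -c 'printf "%s" "$TMPDIR"')"
echo "child sees:  $child_value"

# ...while the parent shell keeps whatever it had before (possibly nothing)
echo "parent sees: ${TMPDIR:-<unset>}"

# Same check as system('df -h "$TMPDIR"'): free space on that partition
df -h "$demo_tmp"

# Clean up the placeholder directory
rmdir "$demo_tmp"
```

By the same logic, a process that was already running before the variable was set (for example, an R session launched another way) would not be expected to pick it up from this form.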

Aron-github commented on August 31, 2024

In both cases I get back what I wrote above, BUT I'm running the code within RStudio. That's why, as written in the first post, I set TMPDIR both in my .Renviron and manually in the script via unixtools::set.tempdir...

sherrillmix commented on August 31, 2024

unixtools::set.tempdir doesn't appear to set the TMPDIR variable, so it most likely won't help SQLite use the correct directory. Setting it in .Renviron should set TMPDIR correctly, but looking back at the previous issue #5, it actually ended with a report from the user that doing so did not cover all their uses, with no followup. So it seems setting TMPDIR may not always solve the issue. I guess either there are some systems where SQLite doesn't listen to TMPDIR or RStudio is doing something funny. I don't have RStudio to debug, so I guess my advice would still be:

The package uses SQLite for the database, and sqlite3 can be a bit finicky with /tmp problems, so I would guess that's the cause. I think it does not listen to R's internal environment variables but would listen to a TMPDIR set prior to starting R. If you start R with TMPDIR=/path/to/tmp R and rerun things, is the database created correctly? If you wanted to keep everything internal to RStudio then I suppose you could do a system call within RStudio with something like:
system("TMPDIR=/path/to/tmp R -e \"taxonomizr::prepareDatabase('accessionTaxa.sql')\"")

sherrillmix commented on August 31, 2024

This Stack Overflow answer suggests that the variable needs to be set in Renviron.site and not .Renviron. I haven't really mucked around with the startup files, but apparently Renviron.site is more global than .Renviron, as per this Stack Overflow answer and ?Startup, so that seems reasonable.

Another option might be prepareDatabase(extraSqlCommand="PRAGMA temp_store_directory = '/MY/TMP/DIR'") although I haven't had a chance to test that recently.

sherrillmix commented on August 31, 2024

prepareDatabase(extraSqlCommand="PRAGMA temp_store_directory = '/MY/TMP/DIR'") seems to work for me. SQLite deletes the temporary file as soon as it creates it, so the temp file isn't visible to ls, but I can see the space being used in the correct partition while it prepares the indexing. Maybe give that a shot and let me know if it fixes things. If that works, I'm debating just setting the function to call that PRAGMA automatically at the location defined in the tmpDir argument to prepareDatabase so people can stop worrying about this.
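For anyone puzzled by a temp file that uses space but never shows up in ls, here is a small shell sketch of the general Unix behaviour being described. It is an analogy only, not SQLite's actual code, and the names demo_dir, scratch and visible are made up for illustration. A file that is unlinked while still open stays writable and keeps occupying disk space until the descriptor is closed:

```shell
# Hypothetical scratch directory, for illustration only
demo_dir="$(mktemp -d)"

# Open file descriptor 3 on a new file, then delete the file's name
exec 3>"$demo_dir/scratch"
rm "$demo_dir/scratch"

# Writing still succeeds: 1 MiB goes into the now-nameless file
dd if=/dev/zero bs=1024 count=1024 >&3 2>/dev/null

# ls sees nothing in the directory even though the space is in use
visible="$(ls -A "$demo_dir" | wc -l | tr -d ' ')"
echo "entries visible to ls: $visible"

# Closing the descriptor releases the space; then remove the directory
exec 3>&-
rmdir "$demo_dir"
```

This is why checking free space on the partition (e.g. with df) is a more reliable way to watch temp usage than listing the directory.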
