Comments (5)

alexvpickering commented on August 27, 2024

EDIT: now compatible with getGEOSuppFiles (fixing same error)

I think the issue is with getDirListing. When it processes HTML content, the page can contain links other than the series_matrix.txt.gz file; these seem to be related to determining the file size. In ronammar's case, the implicated link is '/geo/series/GSE94nnn/GSE94802/'. As a result, a download of https://ftp.ncbi.nlm.nih.gov/geo/series/GSE94nnn/GSE94802/matrix//geo/series/GSE94nnn/GSE94802/ is attempted and fails because that URL doesn't exist. To fix, make sure that none of the returned links end with a forward slash (this approach also fixes the same download errors for getGEOSuppFiles). The following fixed it for me (added line ending with # !!!):

getDirListing <- function(url) {
    message(url)
    # Takes a URL and returns a character vector of filenames
    a <- RCurl::getURL(url)
    if( grepl("<HTML", a, ignore.case=TRUE) ){ # process HTML content
        doc <- XML::htmlParse(a)
        links <- XML::xpathSApply(doc, "//a/@href")
        links <- links[!grepl('/$', links)]   # !!!
        XML::free(doc)
        b <- as.matrix(links)
        message('OK')
    } else { # standard processing of txt content
        tmpcon <- textConnection(a, "r")
        b <- read.table(tmpcon)
        close(tmpcon)
    }
    b <- as.character(b[,ncol(b)])
    return(b)
}
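
The effect of the added line can be checked offline. The sketch below is only an illustration: the `links` vector is a hypothetical stand-in for the hrefs that a directory listing might return (a parent-directory link plus the matrix file), not real output from the server.

```r
# Hypothetical hrefs resembling what an HTML directory listing might yield
links <- c(
  "/geo/series/GSE94nnn/GSE94802/",   # directory link, ends with "/"
  "GSE94802_series_matrix.txt.gz"     # the file we actually want
)

# Same filter as the line marked # !!! above:
# drop every entry that ends with a forward slash
files <- links[!grepl("/$", links)]
print(files)
```

Only the entry without a trailing slash remains, so no download of the directory link is attempted.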

seandavi commented on August 27, 2024

ronammar commented on August 27, 2024

Thanks @seandavi. I'm not well versed in tryCatch() in R and am finding this challenging: even when a call to getGEO is successful, I still get the warning "not all columns named in 'colClasses' exist". So essentially, I can never get getGEO to run completely free of errors and warnings. The warnings are also inconsistent in their consequences: "not all columns named in 'colClasses' exist" still yields a usable object (this has been an issue for a while, see https://support.bioconductor.org/p/82800/), whereas the warning "status was '404 Not Found'" yields nothing. Perhaps a 404 should be an error, and the colClasses warning should be removed somehow?
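
One possible way to treat the two warnings differently on the caller's side is a calling handler, sketched below in base R. This is not GEOquery's behavior or API: `fake_fetch()` and `quiet_fetch()` are hypothetical names, and `fake_fetch()` only emits warnings with wording similar to those quoted above so that the pattern can be tried without a network call.

```r
# Hypothetical stand-in for a call such as getGEO(); it only raises
# warnings resembling the ones discussed above.
fake_fetch <- function(not_found = FALSE) {
  warning("not all columns named in 'colClasses' exist")
  if (not_found) warning("cannot open URL: HTTP status was '404 Not Found'")
  "result"
}

# Evaluate `expr`, muffling the colClasses warning and promoting a
# 404 warning to an error. Other warnings propagate unchanged.
quiet_fetch <- function(expr) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      msg <- conditionMessage(w)
      if (grepl("404 Not Found", msg, fixed = TRUE)) {
        stop(msg, call. = FALSE)          # 404 becomes an error
      }
      if (grepl("colClasses", msg, fixed = TRUE)) {
        invokeRestart("muffleWarning")    # silence the benign warning
      }
    }
  )
}

ok <- quiet_fetch(fake_fetch())
failed <- tryCatch(
  quiet_fetch(fake_fetch(not_found = TRUE)),
  error = function(e) "error"
)
```

With a real call, the expression passed to `quiet_fetch()` would be the getGEO call itself; whether the actual warning messages match these patterns is an assumption that would need checking.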

kalugny commented on August 27, 2024

@alexvpickering
I also had this issue.
It seems that when accessing the matrix directory, you can be routed to a different server (due to load balancing), and some of them serve a Parent Directory link, which getDirListing also tries to download.

kalugny commented on August 27, 2024

@seandavi can you integrate @alexvpickering's fix please?
Thank you both very much!
