Comments (24)

LaKing commented on July 4, 2024

In my case, the file command is missing in the container - which is probably the same in the dockerized instance.

    # CouchDB has a tendency to output Windows carriage returns in its output -
    # this breaks our attempts to sed things at the end of lines!
    if [ "$(file "$file_name" | grep -c CRLF)" = "1" ]; then
        echo "... INFO: File contains Windows carriage returns - converting..."
        filesize=$(du -P -k "${file_name}" | awk '{print $1}')
        checkdiskspace "${file_name}" "$filesize"
        tr -d '\r' < "${file_name}" > "${file_name}.tmp"
        if [ $? = 0 ]; then
            mv "${file_name}.tmp" "${file_name}"
            if [ $? = 0 ]; then
                echo "... INFO: Completed successfully."
            else
                echo "... ERROR: Failed to overwrite ${file_name} with ${file_name}.tmp"
                exit 1
            fi
        else
            echo "... ERROR: Failed to convert file."
            exit 1
        fi
    fi
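
For containers that lack file, the same check can probably be done with grep alone. This is only a minimal sketch, not the script's actual fix: the helper names has_crlf and strip_cr are hypothetical, and it assumes a POSIX shell with grep, tr and mv available.

```shell
# Hypothetical helper: exit status 0 if the file contains at least one
# carriage return, non-zero otherwise. Does not rely on the file command.
has_crlf() {
    cr=$(printf '\r')          # a literal carriage return, POSIX-portable
    grep -q "$cr" "$1"
}

# Hypothetical helper: strip carriage returns via a temp file, mirroring
# the tr/mv steps in the snippet above.
strip_cr() {
    tr -d '\r' < "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Possible usage inside the script (file_name is the script's own variable):
#   if has_crlf "$file_name"; then strip_cr "$file_name" || exit 1; fi
```

Unlike the file-based test, this scans the whole dump, which may be slower on very large backups.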

from couchdb-dump.

LaKing commented on July 4, 2024

I can confirm this bug.

An extra } is placed at end of each doc.

thecalle commented on July 4, 2024

The backup file is about 1/10 the size of the dumped database. The database was compacted before the backup.

dalgibbard commented on July 4, 2024

Are you able to share your backup file? Definitely shouldn't happen though- marked as a Bug.

It is perfectly normal for the backup file to be substantially smaller than the DB: the backup doesn't include the b-tree layout, deleted documents, or historical revisions, and the script can also strip about 20% of the document header from the raw JSON, which isn't needed for import.

It may be useful if you could provide debug script output too by running it with:
bash -x ./couchdb-backup.sh $args

Be sure to truncate/remove/mask any sensitive information. I can provide an email/skype/other account for sharing direct to myself if you prefer.

thecalle commented on July 4, 2024

The problem is due to the DB contents. In my test case there are documents with attachments (images, JavaScript, HTML, CSS), because this DB is a web application served through a redirect from nginx (because the link is too long). The backup fails before the restore, which is why I think the size of the backup file is wrong.

dalgibbard commented on July 4, 2024

So, should we close this? If your backup failed, you'll need to re-do your backup. If your backup failed for a specific reason relating to the script (rather than an nginx timeout etc) then let us know so we can look into it.

thecalle commented on July 4, 2024

Backup fails due to DB content. Contact me thecalle [petit escargot] gmail.c0m

dalgibbard commented on July 4, 2024

@thecalle - this has been open for a while, and I can't remember where we got up to... do you remember?

KingScooty commented on July 4, 2024

I'm getting the same issue actually when trying to restore a backup. Putting the backed up json through a linter shows extra closing braces } after each entry.

Deleting each and every extra } seems to create a valid json object that can be restored. Is the script doing anything weird when saving the dump?
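
For anyone wanting to repeat the lint step from the command line, something like the following might do. It is a sketch under assumptions: is_valid_json is a hypothetical helper name, python3 is assumed to be on PATH (json.tool is part of the Python standard library), and the file name in the usage comment is made up.

```shell
# Hypothetical helper: exit status 0 if the file parses as JSON,
# non-zero otherwise. The pretty-printed output is discarded.
is_valid_json() {
    python3 -m json.tool "$1" > /dev/null 2>&1
}

# Possible usage against a backup file (name is illustrative only):
#   is_valid_json mydb.json || echo "backup is not valid JSON"
```

Dropping the redirection of stderr makes the parser report the position of the first error, which should point at the first stray brace.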

dalgibbard commented on July 4, 2024

I'd need to be able to recreate to debug. Do you have a step by step? Do you have nginx in front or anything?

KingScooty commented on July 4, 2024

So I'm using a dockerised image of couchdb (https://hub.docker.com/r/tutum/couchdb/).
I have a database filled with raw tweet JSON from the Twitter API: essentially a load of tweets matching a hashtag.

Backing up this database works fine, but when restoring it into my local couchdb instance I noticed the error, the same error as above.

So I decided to put the JSON through a linter to check the quality of what was saved, and it reports a series of errors. An extra } seems to be added for every single tweet entry.

I'm not using nginx or any attachments; it's just raw Twitter API JSON.

dalgibbard commented on July 4, 2024

Would it be possible to provide me with a small subset of the data? Ideally just the .couch file if possible? (Should be able to import it...)

KingScooty commented on July 4, 2024

Yeah, sure. Here's a whole lot of it!
http://pastebin.com/jVhFz7jS (link expires in 24hours)

That's how it looks when I use the recover tool. If you lint that, you'll see what I'm talking about with the extra }'s.

dalgibbard commented on July 4, 2024

Do you have a means to create an actual DB file? I'd ideally need to test against a database which contains the relevant data (though only a tiny subset).

AFAIK, this issue should already be matched and edited by:
sed ${sed_edit_in_place} 's/}},$/},/g' ${file_name}

With an existing backup, it's hard to tell whether that substitution has already run (meaning there were extra, extra curly braces) or never matched at all.

As an alternative, edit the script to comment out the seds, and see what the raw output looks like?

KingScooty commented on July 4, 2024

That's a very good point. I'll see what I can do to get hold of the .couch files. They don't seem to be in the expected location of /var/lib on my node, so I'm going to have to do some rooting around.

KingScooty commented on July 4, 2024

Here's a .couch file: http://wildfla.me/2B163v0z2O3f

dalgibbard commented on July 4, 2024

Nice one, thanks. I'll try and look into this over the weekend or next week depending on availability :)

JigSawFr commented on July 4, 2024

Same problem as @LaKing, and some commas are missing at the end.

unpete commented on July 4, 2024

Couchdb-backup corrupts JSON.
The archive contains the raw response from couchdb and the same data after couchdb-backup processing:
json.zip

dalgibbard commented on July 4, 2024

Initially assuming that this problem relates to the CRLF trimming not happening on machines which don't have file present. Looking to fix that in 1.1.5
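
If that assumption holds, the effect can be reproduced in isolation. The sketch below feeds a made-up dump line through the substitution quoted earlier in the thread; the sample document and the fix_braces wrapper are illustrative assumptions, not taken from the script.

```shell
# Made-up sample line in the shape the dump might have after header stripping.
line_lf='{"_id":"a","x":1}},'
cr=$(printf '\r')
line_crlf="${line_lf}${cr}"     # same line with a Windows line ending

# Wrapper (hypothetical name) around the substitution quoted in the thread.
fix_braces() {
    sed 's/}},$/},/g'
}

# LF-only input: the trailing "}}," is reduced to "},".
printf '%s\n' "$line_lf" | fix_braces

# CRLF input: "\r" sits between "," and end of line, so "$" never matches
# and the extra brace survives.
printf '%s\n' "$line_crlf" | fix_braces

# After stripping carriage returns the substitution matches again.
printf '%s\n' "$line_crlf" | tr -d '\r' | fix_braces
```

This would explain why the extra } only shows up when the CRLF conversion step is skipped.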

dalgibbard commented on July 4, 2024

OK, have merged the file missing change. Does that change things for anyone?

dalgibbard commented on July 4, 2024

Or better yet; is it still broken for anyone? 😆

dalgibbard commented on July 4, 2024

Based on the current radio silence and implemented fix for docker, shall we close this @danielebailo ?

danielebailo commented on July 4, 2024
