Comments (13)

notzeist commented on August 27, 2024

I actually ran into two errors.

The first was that the check for tr fails: BSD tr has no --help option, so it returns an error when called with it. I bypassed that fairly easily by changing the check to "which tr".
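
For anyone hitting the same tr check, here is a minimal sketch of an existence test that does not rely on --help or on which. It assumes a POSIX shell; have_cmd is a hypothetical helper name, not something from couchdb-backup.sh.

```shell
#!/bin/sh
# have_cmd is a hypothetical helper, not part of couchdb-backup.sh.
# It returns 0 if the named command can be found by the shell, 1 otherwise.
# "command -v" is specified by POSIX, so it behaves the same on BSD and GNU systems.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd tr; then
    echo "tr found"
else
    echo "tr not found" >&2
fi
```

This only confirms that tr exists; it does not check which flavour of tr is installed.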

The one I got stuck on, though, is that the script fails with "line 756: syntax error: unexpected end of file". I thought it was missing a fi right before the last fi, but when I added one it spat out a sed error, so I probably missed something.

from couchdb-dump.

notzeist commented on August 27, 2024

I actually ended up finding another issue but I also solved it.

Basically it was complaining about sed:

sed: 1: "1i{"new_edits":false,"d ...": command i expects \ followed by text

Then I saw that for FreeBSD the script explicitly uses GNU sed because of some difference in sed itself. From what I could find when googling, there do seem to be some differences, primarily in how -i works.

So I basically changed the check from FreeBSD to FreeBSD or Darwin, and then it works as expected. That of course assumes you have access to GNU sed, which you can get from Homebrew with brew install gnu-sed.

Pretty small diff:

diff --git a/couchdb-backup.sh b/couchdb-backup.sh
index a424700..4fc07ca 100755
--- a/couchdb-backup.sh
+++ b/couchdb-backup.sh
@@ -184,7 +184,7 @@ file_name_orig=$file_name
 os_type=`uname -s`
 
 # Pick sed or gsed
-if [ "$os_type" = "FreeBSD" ]; then
+if [ "$os_type" = "FreeBSD" ] || [ "$os_type" = "Darwin" ]; then
     sed_cmd="gsed";
 else
     sed_cmd="sed";
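
An alternative to keying on the OS name would be to probe sed's behaviour directly. The sketch below is only an illustration under that assumption: pick_sed is a hypothetical function, and it tests for the one-line form of the i command, which is the construct the error message above complains about.

```shell
#!/bin/sh
# pick_sed is a hypothetical helper, not part of couchdb-backup.sh.
# It prints the first candidate that accepts the one-line "1i text" form
# (a GNU extension; BSD sed expects "i\" followed by a newline and the text).
pick_sed() {
    expected=$(printf 'HEADER\nbody')
    for candidate in sed gsed; do
        command -v "$candidate" >/dev/null 2>&1 || continue
        actual=$(printf 'body\n' | "$candidate" '1i HEADER' 2>/dev/null)
        if [ "$actual" = "$expected" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

if sed_cmd=$(pick_sed); then
    echo "using $sed_cmd"
else
    echo "no suitable sed found; on macOS try: brew install gnu-sed" >&2
fi
```

The trade-off is one extra sed invocation at startup in exchange for not maintaining a list of OS names.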

notzeist commented on August 27, 2024

I have this issue as well. I tried installing GNU split through brew install coreutils and replacing split in the script with GNU gsplit, but that seems to produce a weird split, which resulted in invalid JSON.

So the only way I was able to do it was to set $lines to some ridiculously huge number so it did not do the splitting prior to importing.

dalgibbard commented on August 27, 2024

Strange, I thought this was fixed a while back...
What options does your version of split support then?

notzeist commented on August 27, 2024

On El Capitan and Sierra this is what I get:

~  split -d
split: illegal option -- d
usage: split [-a sufflen] [-b byte_count] [-l line_count] [-p pattern]
[file [prefix]]

Looking at man split on Sierra it seems to match this https://developer.apple.com/legacy/library/documentation/Darwin/Reference/ManPages/man1/split.1.html

dalgibbard commented on August 27, 2024

Yeah had a look through this in a bit more detail this morning, and it's a pain.
For the current code we split with:

  • -d -- use numeric suffixes (the default is alphabetic, [a-z])
    This lets us iterate very simply through the numbers 0+ in order.

We could likely do the same with non-numeric splits, I guess, but we'd need to work it out. One possibility is switching to these a-z splits and just lazily iterating through them for a fixed count; otherwise we'd have to work out how to map numeric values from the loop onto the equivalent letters from a split, which probably isn't very efficient either. The lazy loops also waste quite a lot of computation working out the permutations, though we may be able to do this once, assign the result to a variable at the beginning, and avoid recomputing it later.

In terms of file count limits: currently, 10 digits across 6 positions gives 151,200 permutations (the maximum number of split files), which with the default split of 5000 lines is 756 million DB lines. Switching to letters, we can have either 26 letters across 3 positions = 15,600 or 26 letters across 4 positions = 358,800. These numbers represent the maximum number of files a particular database can be split into. With the default split of 5000 lines, this equates to 78 million entries for 3 positions, or 1.794 billion for 4. Quite a swing!
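
The figures above match a count of arrangements in which no character repeats (n × (n-1) × ...). As a rough check, the sketch below reproduces them with shell arithmetic. For comparison it also prints n^k, the count when a character may repeat within a suffix, as it does in split suffixes such as aa. The helper names perms and power are hypothetical and not part of couchdb-backup.sh.

```shell
#!/bin/sh
# perms and power are hypothetical helpers for checking the arithmetic only.

# Ordered arrangements of k symbols out of n, no repeats: n * (n-1) * ... * (n-k+1)
perms() {
    n=$1; k=$2; result=1; i=0
    while [ "$i" -lt "$k" ]; do
        result=$((result * (n - i)))
        i=$((i + 1))
    done
    echo "$result"
}

# Strings of length k over n symbols, repeats allowed: n^k
power() {
    n=$1; k=$2; result=1; i=0
    while [ "$i" -lt "$k" ]; do
        result=$((result * n))
        i=$((i + 1))
    done
    echo "$result"
}

lines_per_file=5000   # default split size mentioned in the thread

for spec in "10 6" "26 3" "26 4"; do
    set -- $spec
    p=$(perms "$1" "$2")
    q=$(power "$1" "$2")
    echo "$1 symbols x $2 positions: no repeats $p files ($((p * lines_per_file)) lines); with repeats $q files ($((q * lines_per_file)) lines)"
done
```

If suffixes with repeated characters are counted, the ceilings are somewhat higher than the no-repeat figures, so the estimates in the comment above are on the conservative side.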

For now, I'm going to suggest the reduced computation of 3 positions, switched to alphabetic characters; if people hit the maximum (we can probably check for that in code), we can suggest that they simply increase the line split value instead.
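
A minimal sketch of the alphabetic-suffix idea, assuming only the -l and -a options that appear in the BSD usage output earlier in the thread. split_and_list is a hypothetical function, and it leans on sorted glob expansion rather than mapping loop counters to letters; the actual fix in the fork may work differently.

```shell
#!/bin/sh
# split_and_list is a hypothetical helper, not the code from the fork.
# Usage: split_and_list INPUT LINES_PER_FILE PREFIX
split_and_list() {
    input=$1; lines=$2; prefix=$3
    # -l (line count) and -a (suffix length) are accepted by BSD and GNU split;
    # -d (numeric suffixes) is not available on BSD split.
    split -l "$lines" -a 3 "$input" "$prefix" || return 1
    # Glob expansion is sorted, so PREFIXaaa, PREFIXaab, ... come back
    # in the same order split wrote them.
    for part in "$prefix"*; do
        [ -e "$part" ] || continue
        echo "$part"
    done
}
```

A caller could then loop over the printed paths and import each piece in turn, for example `split_and_list dump.json 5000 dump.split | while read -r f; do ...; done`, where dump.json and dump.split are placeholder names.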

dalgibbard commented on August 27, 2024

@denniszelada / @notzeist - I've added a potential fix for this in my fork here: https://github.com/dalgibbard/couchdb-dump

Could I trouble you to test it and report back? Then if you say it's good, we can merge it in 😄

dalgibbard commented on August 27, 2024

Okey dokey, will sort it out :)

dalgibbard commented on August 27, 2024

Many thanks for testing and spotting those two bits; they should be resolved now in the same fork as above - can I trouble you to check again?

dalgibbard commented on August 27, 2024

OK, makes sense; I've changed that as well, and also dropped the dependency on which for sed testing. I updated the text to explicitly state 'gnu-sed' (matching the brew package name, to hopefully lead people in the right direction when they need to install it).

One last test before commit? 😄

notzeist commented on August 27, 2024

Works for me now.

dalgibbard commented on August 27, 2024

Nice, thanks @notzeist !

denniszelada commented on August 27, 2024

Thanks for fixing this @dalgibbard
