Comments (11)

emilburzo commented on August 29, 2024

Yep, rev 2511be9 works.

from osmium-tool.

joto commented on August 29, 2024

The code where this fails was only introduced recently. This looks like a bug in libosmium. Problem is that the place where this bug occurs is really dependent on the input data and will likely not happen with small test files.

What are you using as input? Can you give me that file somehow?

joto commented on August 29, 2024

@emilburzo Could you try something for me: Find the function should_gc() in libosmium include/osmium/storage/item_stash.hpp line 171 and change it to always return false. Then recompile osmium-tool and try your command again. This will need more memory now, so you might not have enough. If it runs through after this change, we'll know for sure that the error is in the area I think it is.

emilburzo commented on August 29, 2024

@joto I changed the should_gc() function:

$ git diff
diff --git a/include/osmium/storage/item_stash.hpp b/include/osmium/storage/item_stash.hpp
index 81d9802..495b7b3 100644
--- a/include/osmium/storage/item_stash.hpp
+++ b/include/osmium/storage/item_stash.hpp
@@ -168,16 +168,7 @@ namespace osmium {
         // buffer grow (*3). The checks (*1) and (*2) make sure there is
         // minimum and maximum for the number of removed objects.
         bool should_gc() const noexcept {
-            if (m_count_removed < 10 * 1000) { // *1
-                return false;
-            }
-            if (m_count_removed >  5 * 1000 * 1000) { // *2
-                return true;
-            }
-            if (m_count_removed * 5 < m_count_items) { // *3
-                return false;
-            }
-            return m_buffer.capacity() - m_buffer.committed() < 10 * 1024; // *4
+               return false;
         }
 
     public:

(is that correct?)

and recompiled osmium-tool, but it still crashes unfortunately:

[ 0:00] Started osmium export
[ 0:00]   osmium version 1.6.1 (v1.6.1-10-g4d44ac9)
[ 0:00]   libosmium version 2.12.2
[ 0:00] Command line options and default settings:
[ 0:00]   input options:
[ 0:00]     file name: /home/ubuntu/filtered-tags.osm
[ 0:00]     file format: 
[ 0:00]   output options:
[ 0:00]     file name: /home/ubuntu/filtered-tags.geojsonseq
[ 0:00]     file format: geojsonseq (without RS)
[ 0:00]     overwrite: no
[ 0:00]     fsync: no
[ 0:00]   attributes:
[ 0:00]     type:      (omitted)
[ 0:00]     id:        (omitted)
[ 0:00]     version:   (omitted)
[ 0:00]     changeset: (omitted)
[ 0:00]     timestamp: (omitted)
[ 0:00]     uid:       (omitted)
[ 0:00]     user:      (omitted)
[ 0:00]     way_nodes: (omitted)
[ 0:00]   linear tags:
[ 0:00]   area tags:
[ 0:00]   other options:
[ 0:00]     index type: sparse_file_array
[ 0:00]     add unique IDs: type and id
[ 0:00]     keep untagged features: no
[ 0:00] First pass through input file (reading relations)...
[31:12] First pass done.
[31:12] Second pass through input file...
osm2geojson.sh: line 6: 18424 Segmentation fault      (core dumped) osmium export -vi sparse_file_array -u type_id -r $1 -o $2

It did progress a lot further than before, though (~16 GB geojsonseq file instead of ~4 GB).

My workflow is:

  • download osm pbf planet dump
  • run osmium tags-filter pbf to osm (with a long list of tags)
  • run osmium export osm to geojsonseq

I don't have anywhere to host that huge file, but I can send you the list of tags (where?) if it helps.
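
For readers following along, the four checks removed in the diff above can be read as a standalone predicate. What follows is only a minimal sketch: the function name should_gc_sketch and its parameters are hypothetical and not part of libosmium, and only the thresholds are copied from the diff. It also assumes the committed size never exceeds the capacity.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical standalone version of the checks shown in the diff.
// The name and parameters are illustrative, not libosmium's API;
// only the thresholds are taken from the thread.
// Assumes buffer_committed <= buffer_capacity.
constexpr bool should_gc_sketch(std::size_t count_removed,
                                std::size_t count_items,
                                std::size_t buffer_capacity,
                                std::size_t buffer_committed) noexcept {
    constexpr std::size_t min_removed = 10UL * 1000UL;          // *1
    constexpr std::size_t max_removed = 5UL * 1000UL * 1000UL;  // *2
    constexpr std::size_t min_free_bytes = 10UL * 1024UL;       // *4

    // *1: too few removed items, never collect.
    if (count_removed < min_removed) {
        return false;
    }
    // *2: very many removed items, always collect.
    if (count_removed > max_removed) {
        return true;
    }
    // *3: removed items are less than a fifth of all items, do not collect.
    if (count_removed * 5 < count_items) {
        return false;
    }
    // *4: collect only if fewer than 10 KiB of the buffer are still free.
    return buffer_capacity - buffer_committed < min_free_bytes;
}

// A few sample inputs, one per branch.
inline void check_examples() {
    assert(!should_gc_sketch(9999, 10, 4096, 4096));                   // *1
    assert(should_gc_sketch(5000001, 1000000000, 1048576, 0));         // *2
    assert(!should_gc_sketch(20000, 200000, 4096, 4096));              // *3
    assert(should_gc_sketch(20000, 50000, 1048576, 1048576 - 1024));   // *4
}
```

Replacing the body with a plain return false, as in the diff, switches all four branches off, so removed items are never reclaimed. That matches the expectation that the run would need more memory.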

joto commented on August 29, 2024

I think I found the problem. Can you recompile with newest libosmium master and try again?

joto commented on August 29, 2024

(And just btw: Why are you writing the file into the osm format and not using pbf? That would make steps 2 and 3 much faster.)

emilburzo commented on August 29, 2024

> I think I found the problem. Can you recompile with newest libosmium master and try again?

It went a lot further this time: the geojsonseq output file is ~30 GB (the complete one is ~38 GB).

[ 0:00] First pass through input file (reading relations)...
[31:54] First pass done.
[31:54] Second pass through input file...
osm2geojson.sh: line 6: 27181 Segmentation fault      (core dumped) osmium export -vi sparse_file_array -u type_id -r $1 -o $2

> (And just btw: Why are you writing the file into the osm format and not using pbf? That would make steps 2 and 3 much faster.)

Just an assumption I never actually tested (I assumed plaintext would need less processing/be faster).

Thanks for the tip!

joto commented on August 29, 2024

Okay, that doesn't look good. Can you tell me the exact input data you used, the libosmium and osmium-tool versions involved, and the exact commands you ran, so that I can try to reproduce the problem?

Oh, and how much memory do you have?

emilburzo commented on August 29, 2024

Exact steps for a vanilla Ubuntu 16.04 install:

sudo apt update
sudo apt install -y build-essential cmake zlib1g-dev libexpat1-dev libbz2-dev libboost-program-options-dev libboost-dev
cd ~/
wget https://ftp5.gwdg.de/pub/misc/openstreetmap/planet.openstreetmap.org/pbf/planet-latest.osm.pbf
git clone https://github.com/osmcode/libosmium.git
git clone https://github.com/osmcode/osmium-tool.git
cd ~/osmium-tool
git checkout export
make -j4
wget -O tags http://hq.emilburzo.com/public/tags
~/osmium-tool/build/src/osmium tags-filter -v ~/planet-latest.osm.pbf $(cat tags | tr '\n' ' ') -o ~/filtered-planet.osm
~/osmium-tool/build/src/osmium export -vi sparse_file_array -u type_id -r ~/filtered-planet.osm -o ~/filtered-planet.geojsonseq

> libosmium and osmium-tool software versions

I'm using the latest master branch of libosmium and the export branch of osmium-tool.

From osmium's output:

  • osmium version 1.6.1 (v1.6.1-10-g4d44ac9)
  • libosmium version 2.12.2

> Oh, and how much memory do you have?

30.5 GB (AWS r4.xlarge)

joto commented on August 29, 2024

I think I have found the problem. It is in libosmium. Can you try with current master?

(And btw: Instead of $(cat tags | tr '\n' ' ') you should be able to just use -e tags.)

emilburzo commented on August 29, 2024

Spot on! It worked:

[90:35] Second pass done.
[90:35] Wrote 54921999 features.
[90:35] Encountered 1555 errors.
[90:35] Peak memory used: 17416 MBytes
[90:35] Done.

Thanks for your help (and the very useful tips!).
