Comments (2)

vruusmann commented on August 12, 2024

> I've converted an h2o DRF model (9 MB) but the output PMML is 200 MB

MOJO is a compressed data format, whereas PMML is a plain-text data format. For a meaningful size comparison, you should compare either 1) MOJO with compressed PMML, or 2) uncompressed MOJO with PMML.

PMML easily deflates by 95% when compressed using the ZIP algorithm. So, you'd be looking at a 9 MB MOJO vs. a 10-12 MB PMML here.
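To get a feel for how well repetitive XML compresses, here is a minimal Python sketch. It builds a synthetic, PMML-like document (the `synthetic_pmml` and `deflate_ratio` helpers and the element layout are made up for illustration; the output is not schema-valid PMML) and measures the DEFLATE ratio with the standard `zlib` module. The exact ratio for a real H2O-derived PMML file will differ, so measure your own file rather than relying on this number.

```python
import zlib


def synthetic_pmml(n_nodes: int) -> bytes:
    """Build a repetitive, PMML-like XML document (illustrative only)."""
    node = (
        '<Node id="{i}" score="0.5" recordCount="100">'
        '<SimplePredicate field="x{f}" operator="lessThan" value="{v}"/>'
        "</Node>\n"
    )
    body = "".join(node.format(i=i, f=i % 20, v=i * 0.25) for i in range(n_nodes))
    return ("<TreeModel>\n" + body + "</TreeModel>\n").encode("utf-8")


def deflate_ratio(data: bytes) -> float:
    """Return compressed size divided by original size, using DEFLATE."""
    return len(zlib.compress(data, 6)) / len(data)


doc = synthetic_pmml(10_000)
ratio = deflate_ratio(doc)
print(f"original: {len(doc)} bytes, deflated to {ratio:.1%} of original")
```

For a real file, read its bytes from disk and pass them to `deflate_ratio` to compare against the MOJO size.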

> .. occupying almost 1 GB on RAM.

What kind of PMML library do you use?

The JPMML-Model/JPMML-Evaluator stack uses a memory representation (e.g. heavily interned elements and attributes) that is smaller than the on-disk representation.

A 200 MB PMML document should comfortably fit into 50-100 MB of RAM.
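As a rough analogy for why interning shrinks the in-memory footprint, the Python sketch below builds the same attribute value many times and then interns it with `sys.intern`. This is only an illustration of the general technique; it does not show how JPMML-Model (a Java library) implements interning, and the value `"lessThan"` is just a sample string.

```python
import sys

# Build the same attribute value many times at runtime, so that each copy
# typically starts out as a separate string object.
raw = ["".join(["less", "Than"]) for _ in range(1000)]

# Interning maps every equal string to one shared object.
interned = [sys.intern(s) for s in raw]

distinct_raw = len({id(s) for s in raw})
distinct_interned = len({id(s) for s in interned})
print(f"distinct objects before: {distinct_raw}, after interning: {distinct_interned}")
```

A parser that keeps a separate copy of every repeated element name and attribute value pays for each copy, which is one way an in-memory model can end up larger than the file it was read from.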

> There seems to be no compact option to reduce size.

Compaction won't help you if your PMML engine uses an inefficient memory representation.

H2O.ai appears to be using binary splits identical to XGBoost/LightGBM/Scikit-Learn, so you may wish to develop a compacting visitor based on them.

from jpmml-h2o.

vruusmann commented on August 12, 2024

Closing as "nothing for me to do here".

I don't have any customers depending on JPMML-H2O, so I'm more inclined to terminate this project, rather than waste more time on it.
