Comments (3)
Noted.
This fix would make a metadata-level connection between these two DataFrame
objects:
https://github.com/jpmml/jpmml-evaluator-python/blob/0.10.1/jpmml_evaluator/__init__.py#L164
https://github.com/jpmml/jpmml-evaluator-python/blob/0.10.1/jpmml_evaluator/__init__.py#L175
Most ML models perform a 1-to-1 mapping (i.e. one row of input produces one row of output). But it wouldn't hurt to add a quick check of this assumption, and fall back to a default numbering of output rows if it is violated.
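A minimal sketch of such a check, assuming both the input and the output are pandas DataFrame objects. The helper name align_output_index is hypothetical and not part of jpmml_evaluator; it only illustrates the "reuse the input index when the row counts match, otherwise renumber" idea.

```python
import pandas as pd

def align_output_index(input_df: pd.DataFrame, output_df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper (not part of jpmml_evaluator)."""
    if len(output_df) == len(input_df):
        # 1-to-1 mapping holds: carry the input row labels over to the output.
        result = output_df.copy()
        result.index = input_df.index
        return result
    # Assumption violated: fall back to the default 0..n-1 numbering.
    return output_df.reset_index(drop=True)

inputs = pd.DataFrame({"x": [1.0, 2.0, 3.0]}, index=["a", "b", "c"])
outputs = pd.DataFrame({"y": [10, 20, 30]})
aligned = align_output_index(inputs, outputs)
print(list(aligned.index))
```

With a row-count mismatch the same call simply returns the output renumbered from zero, so downstream code never sees a misleading index.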
from jpmml-evaluator-python.
Thanks. Shouldn't there be a meta-level connection between these two objects? i.e.
evaluateAll(df) shouldBeSameAs df.apply(lambda row: evaluate(row), axis=1) #probably with result_type = "expand"
This connection would break if df can be split into one or more groups and the predictions for each group are based on that group's data as a whole. But this (and other cases like it) seems like an entirely different use case that probably deserves its own function.
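To make the expected equivalence concrete, here is a small sketch with stand-in functions. Both evaluate and evaluate_all below are toy placeholders (a hard-coded linear formula), not the actual jpmml_evaluator methods; the point is only that, for a row-independent model, the batch result and the row-wise df.apply result carry the same values under the same index.

```python
import pandas as pd

def evaluate(row):
    # Toy stand-in for a single-row evaluation; accepts a dict or a Series.
    return {"y": 2.0 * row["x1"] + row["x2"]}

def evaluate_all(df):
    # Toy stand-in for a batch evaluation that keeps the input index.
    records = [evaluate(record) for record in df.to_dict(orient="records")]
    return pd.DataFrame(records, index=df.index)

df = pd.DataFrame({"x1": [1.0, 2.0, 3.0], "x2": [0.5, 0.25, 0.0]}, index=[10, 20, 30])

batch = evaluate_all(df)
rowwise = df.apply(lambda row: evaluate(row), axis=1, result_type="expand")

print(batch["y"].tolist() == rowwise["y"].tolist())
print(list(batch.index) == list(rowwise.index))
```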
evaluateAll(df) shouldBeSameAs df.apply(lambda row: evaluate(row), axis=1)
The evaluateAll(df) method call involves exactly one Python-Java-Python roundtrip, because all data is sent over in one go (as a dict data structure). If you replace it with elementary evaluate(X) method calls, then there will be many Python-Java-Python roundtrips (one for each data frame row).
It is my guesstimate that the computational cost of a Python-Java-Python roundtrip could be comparable to the cost of the actual model evaluation work. So, by limiting the number of roundtrips, I'm expecting to get more "useful" work done per unit of time.
For reference, here's the mechanics of a Python-Java-Python roundtrip:
- In Python, dump arguments in pickle data format to a pickle blob (byte array).
- Send the pickle blob from Python to Java.
- In Java, load arguments from pickle blob.
- In Java, perform model evaluation.
- In Java, dump results in pickle data format to a pickle blob.
- Send the pickle blob from Java to Python.
- In Python, load results from pickle blob.
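The steps above can be imitated in pure Python to see how the roundtrip count differs between the two calling styles. This is only a simulation under stated assumptions: FakeJavaBackend is a hypothetical stand-in for the Java side (no real Java process or transport is involved), and the "model" is a toy y = 2 * x formula.

```python
import pickle

class FakeJavaBackend:
    """Hypothetical stand-in for the Java side; counts roundtrips."""

    def __init__(self):
        self.roundtrips = 0

    def call(self, request_blob: bytes) -> bytes:
        self.roundtrips += 1
        columns = pickle.loads(request_blob)                # "Java": load arguments
        results = {"y": [2 * x for x in columns["x"]]}      # "Java": evaluate toy model
        return pickle.dumps(results)                        # "Java": dump results

def evaluate_all(backend, columns):
    request_blob = pickle.dumps(columns)          # Python: dump arguments to a blob
    response_blob = backend.call(request_blob)    # send blob over, receive blob back
    return pickle.loads(response_blob)            # Python: load results

def evaluate_rowwise(backend, columns):
    # One roundtrip per row, mimicking repeated single-row calls.
    ys = []
    for x in columns["x"]:
        ys.extend(evaluate_all(backend, {"x": [x]})["y"])
    return {"y": ys}

data = {"x": [1, 2, 3, 4]}
batch_backend = FakeJavaBackend()
rowwise_backend = FakeJavaBackend()
batch_result = evaluate_all(batch_backend, data)
rowwise_result = evaluate_rowwise(rowwise_backend, data)
print(batch_backend.roundtrips, rowwise_backend.roundtrips)  # prints: 1 4
```

Both styles return the same predictions; only the number of dump/send/load cycles changes, which is the overhead the batch call is meant to avoid.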