Comments (3)
Comment by skrawcz
Thursday Dec 15, 2022 at 07:01 GMT
Status on getting the Hamilton hello world example to run on Snowpark:
create or replace function hamilton_hw()
-- table does not seem to work
-- returns Table ( spend float,
-- signups float,
-- avg_3wk_spend float,
-- spend_per_signup float,
-- spend_zero_mean_unit_variance float)
returns Object
language python
runtime_version = '3.8'
handler = 'main_py'
imports = ('@~/hamilton.zip', '@~/my_functions.py')
packages = ('pandas', 'typing_inspect', 'numpy', 'snowflake-snowpark-python')
as
$$
import pandas as pd
from hamilton import driver
import my_functions

def main_py() -> dict:
    initial_columns = {  # load from actuals or wherever -- this is our initial data we use as input.
        # Note: these values don't have to be all series, they could be a scalar.
        "signups": pd.Series([1, 10, 50, 100, 200, 400]),
        "spend": pd.Series([10, 10, 20, 40, 40, 50]),
    }
    dr = driver.Driver(initial_columns, my_functions)
    output_columns = [
        "spend",
        "signups",
        "avg_3wk_spend",
        "spend_per_signup",
        "spend_zero_mean_unit_variance",
    ]
    # let's create the dataframe!
    df = dr.execute(output_columns)
    return df.to_dict()
$$;
select hamilton_hw();
{
  "avg_3wk_spend": {
    "0": NaN,
    "1": NaN,
    "2": 13.333333333333334,
    "3": 23.333333333333332,
    "4": 33.333333333333336,
    "5": 43.333333333333336
  },
  "signups": {
    "0": 1,
    "1": 10,
    "2": 50,
    "3": 100,
    "4": 200,
    "5": 400
  },
  "spend": {
    "0": 10,
    "1": 10,
    "2": 20,
    "3": 40,
    "4": 40,
    "5": 50
  },
  "spend_per_signup": {
    "0": 10,
    "1": 1,
    "2": 0.4,
    "3": 0.4,
    "4": 0.2,
    "5": 0.125
  },
  "spend_zero_mean_unit_variance": {
    "0": -1.0644053746097524,
    "1": -1.0644053746097524,
    "2": -0.4838206248226147,
    "3": 0.6773488747516607,
    "4": 0.6773488747516607,
    "5": 1.2579336245387984
  }
}
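For anyone who wants to sanity-check these numbers without Snowflake or Hamilton, here is a rough standard-library sketch. The contents of my_functions.py are not shown in this thread, so the formulas below are assumptions inferred from the column names (a trailing 3-row mean, a plain ratio, and a z-score using the sample standard deviation). They do agree with the values printed above.

```python
import math
import statistics

# Same inputs as the UDF above.
signups = [1, 10, 50, 100, 200, 400]
spend = [10, 10, 20, 40, 40, 50]


def rolling_mean(values, window):
    """Trailing mean over `window` items; NaN until the window is full."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(math.nan)
        else:
            out.append(sum(values[i + 1 - window : i + 1]) / window)
    return out


# Assumed definitions (inferred from the column names, not taken from my_functions.py).
avg_3wk_spend = rolling_mean(spend, 3)
spend_per_signup = [s / n for s, n in zip(spend, signups)]

spend_mean = statistics.mean(spend)
spend_std = statistics.stdev(spend)  # sample standard deviation (divides by n - 1)
spend_zero_mean_unit_variance = [(s - spend_mean) / spend_std for s in spend]
```

The first two entries of avg_3wk_spend are NaN because the 3-row window is not yet full, which is why NaN shows up in the returned object.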
Comment by skrawcz
Thursday Dec 15, 2022 at 07:06 GMT
Putting the driver logic in a module also works:
create or replace function hamilton_hw()
returns Object
language python
runtime_version = '3.8'
handler = 'my_script.main_py'
imports = ('@~/hamilton.zip', '@~/my_functions.py', '@~/my_script.py')
packages = ('pandas', 'typing_inspect', 'numpy', 'snowflake-snowpark-python')
;
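The comment above does not show my_script.py itself. A minimal sketch of what it might contain, assuming it simply hosts the same main_py function from the first comment (the file contents here are hypothetical). The imports sit inside the function only so this sketch can be loaded on a machine without hamilton installed; module-level imports, as in the inline version, should work just as well.

```python
# my_script.py (hypothetical contents; the original comment does not show this file)


def main_py() -> dict:
    # Imports are local to the handler in this sketch only; see the note above.
    import pandas as pd
    from hamilton import driver
    import my_functions  # uploaded to the stage alongside this file

    initial_columns = {
        "signups": pd.Series([1, 10, 50, 100, 200, 400]),
        "spend": pd.Series([10, 10, 20, 40, 40, 50]),
    }
    dr = driver.Driver(initial_columns, my_functions)
    output_columns = [
        "spend",
        "signups",
        "avg_3wk_spend",
        "spend_per_signup",
        "spend_zero_mean_unit_variance",
    ]
    df = dr.execute(output_columns)
    return df.to_dict()
```

With handler = 'my_script.main_py', the function body in the SQL statement is no longer needed, which is why the definition above has no `as $$ ... $$` section.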
Comment by skrawcz
Thursday Dec 15, 2022 at 22:45 GMT
As a UDTF:
create or replace function hamilton_udtf_hw()
returns Table ( index float,
spend float,
signups float,
avg_3wk_spend float,
spend_per_signup float,
spend_zero_mean_unit_variance float)
-- returns Object
language python
runtime_version = '3.8'
handler = 'Runner'
imports = ('@~/hamilton.zip', '@~/my_functions.py')
packages = ('pandas', 'typing_inspect', 'numpy', 'snowflake-snowpark-python')
as
$$
from typing import Iterator, Tuple

import pandas as pd
from hamilton import driver
import my_functions

class Runner(object):
    def __init__(self):
        pass

    # Yields one tuple per row; each tuple has six values, matching the six columns in `returns Table`.
    def process(self) -> Iterator[Tuple[float, ...]]:
        initial_columns = {  # load from actuals or wherever -- this is our initial data we use as input.
            # Note: these values don't have to be all series, they could be a scalar.
            "signups": pd.Series([1, 10, 50, 100, 200, 400]),
            "spend": pd.Series([10, 10, 20, 40, 40, 50]),
        }
        dr = driver.Driver(initial_columns, my_functions)
        output_columns = [
            "spend",
            "signups",
            "avg_3wk_spend",
            "spend_per_signup",
            "spend_zero_mean_unit_variance",
        ]
        # let's create the dataframe!
        df = dr.execute(output_columns)
        # return df.to_dict
        for index, row in df.iterrows():
            yield (index, row.spend, row.signups, row.avg_3wk_spend, row.spend_per_signup, row.spend_zero_mean_unit_variance)
$$;
select * from table(hamilton_udtf_hw());
Results in a nice table.
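The difference between the two variants is the shape of the result: the UDF returns a single column-oriented object, while the UDTF yields one tuple per row. A small standard-library sketch of that pivot, using a truncated copy of the object from the first comment. The helper name to_rows is made up for illustration and is not part of Hamilton or Snowpark.

```python
import json

# Truncated copy of the object returned by `select hamilton_hw();`.
raw = """
{
  "avg_3wk_spend": {"0": NaN, "1": NaN, "2": 13.333333333333334},
  "signups": {"0": 1, "1": 10, "2": 50},
  "spend": {"0": 10, "1": 10, "2": 20}
}
"""


def to_rows(columns, order):
    """Turn {column: {index: value}} into a list of (index, *values) tuples."""
    # Index keys arrive as strings, so sort them numerically.
    indexes = sorted(columns[order[0]], key=int)
    return [(int(i), *(columns[name][i] for name in order)) for i in indexes]


# Python's json module accepts the bare NaN token, although strict JSON does not.
columns = json.loads(raw)
rows = to_rows(columns, ["spend", "signups", "avg_3wk_spend"])

for row in rows:
    print(row)
```

Each tuple has the same layout as what the UDTF's process method yields: the index first, then the columns in the declared order.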