
Comments (3)

qidewenwhen commented on September 25, 2024

Hi @lorenzwalthert, thanks for reaching out!

Not sure this behaviour applies to ExecutionVariables only or to all PipelineVariables.

For this, yes, the behavior applies to all PipelineVariables. This is because PipelineVariables are placeholders at compile time and are only resolved at pipeline execution time. Thus, the SDK cannot do the following while a pipeline is being defined:

do arbitrary transformations involving PipelineVariables (e.g. taking a substring, performing arithmetic with float or int parameters, evaluating an if condition involving a PipelineVariable, etc.)

Currently the SDK only provides the Join and JsonGet functions for operating on PipelineVariables at execution time. We may not add more such functions in the near future.
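
To illustrate the idea, here is a minimal, self-contained sketch in plain Python. It does not use the SageMaker SDK: the `Placeholder` and `DeferredJoin` classes, the `resolve` helper, and the bucket name are all hypothetical stand-ins that only mimic a value that is unknown when the pipeline is defined and filled in at execution time.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Placeholder:
    """Hypothetical stand-in for a pipeline variable: a name, not a value."""
    name: str


@dataclass
class DeferredJoin:
    """Hypothetical stand-in for a Join-style function: it only records the operation."""
    on: str
    values: List[Union[str, Placeholder]]


def resolve(expr, runtime_values: Dict[str, str]) -> str:
    """Substitute real values, roughly as an execution engine would at run time."""
    if isinstance(expr, Placeholder):
        return runtime_values[expr.name]
    if isinstance(expr, DeferredJoin):
        return expr.on.join(resolve(v, runtime_values) for v in expr.values)
    return expr  # plain strings pass through unchanged


execution_id = Placeholder("PipelineExecutionId")

# At definition time there is no string to slice, only a reference.
try:
    execution_id[0:2]
    sliced_at_definition_time = True
except TypeError:
    sliced_at_definition_time = False

# A deferred join can be recorded now and evaluated once the value is known.
output_path = DeferredJoin(on="/", values=["s3://my-bucket", execution_id, "output"])
resolved = resolve(output_path, {"PipelineExecutionId": "abc123"})
```

The slice fails because the placeholder carries no string yet, while the join succeeds because it is only evaluated after the value has been supplied.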

Hence, for other operations, leveraging a LambdaStep can be one solution.
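
As a rough sketch of that route, the handler below shows the kind of logic a Lambda function could run. It assumes the LambdaStep is configured with `inputs={"execution_id": ExecutionVariables.PIPELINE_EXECUTION_ID}` so that the resolved value arrives in the `event` payload. The key names `execution_id` and `prefix` are made up for illustration, and the step wiring (including declaring `prefix` as a step output) is not shown.

```python
def lambda_handler(event, context):
    """Return the first two characters of the already-resolved execution ID."""
    # By the time the Lambda function is invoked, the placeholder has been
    # replaced with a real string, so ordinary Python slicing works.
    execution_id = event["execution_id"]  # hypothetical input key
    return {"prefix": execution_id[0:2]}  # hypothetical output key
```

Calling `lambda_handler({"execution_id": "abc123"}, None)` locally is a quick way to check the logic before deploying it.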

Besides LambdaStep, since you're using training and processing steps, could you try our recently launched @step decorator and see whether it resolves this issue?
https://docs.aws.amazon.com/sagemaker/latest/dg/pipelines-step-decorator.html

In your case, the code can be similar to the following.

Note: custom_func runs at pipeline execution time, by which point ExecutionVariables.PIPELINE_EXECUTION_ID (passed in as exe_var) has already been resolved to a plain string, so any Python string operation can be applied to it.

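    from sagemaker.workflow.execution_variables import ExecutionVariables
    from sagemaker.workflow.function_step import step
    from sagemaker.workflow.pipeline import Pipeline

    @step(
        name="...",
        keep_alive_period_in_seconds=600,
        ...
    )
    def custom_func(exe_var):
        # Add your ML logic here; it runs in a training job at pipeline execution time.
        # By then exe_var is a plain string, so ordinary slicing works.
        return exe_var[0:2]

    # Calling the decorated function does not run it here; it produces a step
    # output that the pipeline resolves at execution time.
    custom_func_output = custom_func(
        exe_var=ExecutionVariables.PIPELINE_EXECUTION_ID,
    )

    # pipeline_name, sagemaker_session, and role are assumed to be defined elsewhere.
    pipeline = Pipeline(
        name=pipeline_name,
        steps=[custom_func_output],
        sagemaker_session=sagemaker_session,
    )

    pipeline.create(role)

    execution = pipeline.start()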

from sagemaker-python-sdk.

qidewenwhen commented on September 25, 2024

Closing this issue as we have not received a response in the last 3 weeks. Feel free to reopen if you have further questions. Thanks!


lorenzwalthert commented on September 25, 2024

Thanks @qidewenwhen for your answer (and sorry for the late reply). I see that using a pipeline with the lightweight training job decorator (instead of the more verbose classical syntax) is an option. But I don't really want these things to show up in my training jobs; I'd prefer a processing job. I don't think that's possible (yet)? Anyway, my pipeline is pretty long, so I wondered if I can combine the decorator approach with the existing classical pipeline syntax... The drawback of the <5 min startup time for a training job remains, however. I believe that's where the Lambda step could potentially shine (although I have not investigated it yet, and I'd require it to work in local mode too to be compatible with my development loop).
