
component-library's Introduction


CLAIMED - It's time to concentrate on your code only

For more information, please visit the project's website

Weekly community call (co-located with Elyra at the moment)
https://ibm.webex.com/meet/romeo.kienzler1
Every Friday, 8:30 AM PST, 11:30 AM EST, 5:30 PM CET, 10 PM IST


component-library's Issues

Meta issue for newcomers to provide new components

New components are always welcome. Feel free to create a feature request describing what you are planning to deliver. If you feel confident, you can also just contribute new components directly via a PR.

Wiki FAQ: Lambda Functions Update and Formatting

I wanted to push an update for the FAQ in the Wiki section. I just cleaned up and added a little bit to the section about Lambda notation. Below is the updated replacement.

# Syntax error: lambda notation in Python 2.7 vs 3.x

<style type="text/css">
.container {
  display: inline-grid;
  align-items: center;
  justify-content: center;
}

.cap-map {
    text-align: center;
    font-size: 24px;
    text-decoration: none;
    vertical-align: middle;
}

.header-cell {
  text-align: center;
  font-weight: bold;
  vertical-align: middle;
}

.center-cell {
    text-align: center;
    vertical-align: middle;
}

.function-cell {
    text-align: left;
    font-family: monospace;
    white-space: pre;
    vertical-align: middle;
}
</style>

<div class="container">
    <table>
        <caption class="cap-map">Mapping RDDs with Single Values</caption>
        <thead>
            <tr>
                <th class="header-cell">Python Version</th>
                <th class="header-cell">Solution #</th>
                <th class="header-cell">Lambda Function</th>
            </tr>
        </thead>
        <tbody>
            <tr>
                <td class="center-cell">2.7</td>
                <td class="center-cell">1</td>
                <td class="function-cell">result_array_ts = result_rdd.map(lambda (ts, voltage): ts).collect()</td>
            </tr>
            <tr>
                <td class="center-cell">3.x</td>
                <td class="center-cell">1</td>
                <td class="function-cell">result_array_ts = result_rdd.map(lambda ts_voltage: ts_voltage[0]).collect()</td>
            </tr>
            <tr>
                <td class="center-cell">3.x</td>
                <td class="center-cell">2</td>
                <td class="function-cell">result_array_voltage = result_rdd.map(lambda ts_voltage: ts_voltage[1]).collect()</td>
            </tr>        
        </tbody>
    </table>
    <br />
    <table>
        <caption class="cap-map">Mapping RDDs with Multiple Values</caption>
        <thead>
            <tr>
                <th class="header-cell">Python Version</th>
                <th class="header-cell">Lambda Function</th>
            </tr>
        </thead>
        <tbody>
            <tr>
                <td class="center-cell">2.7</td>
                <td class="function-cell">rdd.map(lambda (x, y): pow(x * y, 2)).sum()</td>
            </tr>
            <tr>
                <td class="center-cell">3.x</td>
                <td class="function-cell">rdd.map(lambda xy: pow(xy[0] * xy[1], 2)).sum()</td>
            </tr>
        </tbody>
    </table>
</div>

The less-fancy versions (HTML and Markdown) contain the same two tables:

Mapping RDDs with Single Values

| Python Version | Solution # | Lambda Function |
| --- | --- | --- |
| 2.7 | 1 | result_array_ts = result_rdd.map(lambda (ts, voltage): ts).collect() |
| 3.x | 1 | result_array_ts = result_rdd.map(lambda ts_voltage: ts_voltage[0]).collect() |
| 3.x | 2 | result_array_voltage = result_rdd.map(lambda ts_voltage: ts_voltage[1]).collect() |

Mapping RDDs with Multiple Values

| Python Version | Lambda Function |
| --- | --- |
| 2.7 | rdd.map(lambda (x, y): pow(x * y, 2)).sum() |
| 3.x | rdd.map(lambda xy: pow(xy[0] * xy[1], 2)).sum() |
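
For readers on Python 3.x, here is a minimal runnable illustration of the table entries above (the SparkSession named spark and the sample (ts, voltage) pairs are assumptions, purely for demonstration):

# Minimal illustration: assumes an existing SparkSession called `spark`;
# the (ts, voltage) pairs below are made up for this example.
result_rdd = spark.sparkContext.parallelize([(1, 230.1), (2, 231.4), (3, 229.8)])

# Python 2.7 only - tuple unpacking in lambdas is a SyntaxError on 3.x:
#   result_rdd.map(lambda (ts, voltage): ts).collect()

# Python 3.x - take a single argument and index into the tuple:
result_array_ts = result_rdd.map(lambda ts_voltage: ts_voltage[0]).collect()
result_array_voltage = result_rdd.map(lambda ts_voltage: ts_voltage[1]).collect()
print(result_array_ts)       # [1, 2, 3]
print(result_array_voltage)  # [230.1, 231.4, 229.8]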

Clean up main branch

Describe the issue
Remove unnecessary files and folders and move them to a separate branch

Typos in transformation functions

I'm currently working through IBM's Coursera notebooks, and there appear to be some errors in the .ipynb files for certain transformations. Specifically:

"claimed/component-library/transform/spark-csv-to-parquet.ipynb": the destination path and parquet filename are stored in a variable "output_data_parquet" (third code cell). In code cell 5, data_dir + data_parquet fails to run because data_parquet is not defined. I think this should be output_data_parquet, as it appears in the eighth code cell.

"claimed/component-library/transform/spark-sql.ipynb": In cell 4, where the environment variables are defined, "data_dir" is defined twice. The first occurrence appears to be correct based on the comment. The second occurrence appears to be incorrect, as the comment suggests it should be a SQL query. As a result, in cell 7 the variable "sql" is not defined. I think the second occurrence of data_dir should really be a line along the lines of sql = os.environ.get('sql_query', 'select * from df') - see the sketch below.

Watson Studio offers 50 execution hours per month

@romeokienzler I am unable to select the Free environment on Watson Studio for notebooks anymore. I get the error message "Watson Studio no longer offers free environments. Select another environment." when I try to open my previous notebooks created in the free Python instance.

Java port number issue

The installation is successful, but when trying to execute the code I get the following error:
Java gateway process exited before sending the driver its port number
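
Not part of the original report, but a quick sanity check that often explains this PySpark error (it usually means Java is missing or JAVA_HOME does not point at a usable JDK):

import os
import shutil

# Hypothetical diagnostic, not from the issue: verify that a JDK is visible to PySpark.
print("JAVA_HOME =", os.environ.get("JAVA_HOME"))
print("java on PATH:", shutil.which("java"))
# If either is missing, install a supported JDK and set JAVA_HOME before creating the
# SparkSession.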

Good idea but...

The idea of "write once, re-use anywhere" is good.

I think it would be good to remove the dependency on Jupyter and only allow:

python base code
CLI

to ensure more portability.

I have a similar idea:
pip install utilmy
A collection of one-liners to do tasks

         df = pd_read_file( ... )   # from anywhere: S3, local, Hadoop, ... any format

A question and a request for guidance on the course: Advanced Data Science with IBM Specialization on Coursera

I have enrolled in the Advanced Data Science with IBM specialization and have partially completed Weeks 1 through 4. https://www.coursera.org/specializations/advanced-data-science-ibm

Please note, I am new to coding, with only minimal knowledge of Python based on my course on www.cognitiveclass.ai.

In Weeks 2 and 3, when I tried to set up the environment and complete the programming assignments on Spark, I got stuck. Unfortunately, the guidelines video hosted here: is also not available / working.

I also posted this query on the Coursera discussion forum last week and am awaiting advice.
Ref: https://www.coursera.org/learn/ds/discussions/weeks/2/threads/Y40fZ4hDEemB1RILjckZRA

However, I am worried that I might lose time in this process.

In fact, I need advice on how and where to set up the environment and continue.

Could you please guide and advise me so I can successfully complete this course.

ValueError: x and y must have same first dimension, but have shapes (118,) and (134,)

I tried to create a run chart using two lists, but was held back because the sample data returned by the second query is not consistent:

result = spark.sql("select temperature from washing where temperature is not null")
print(result.count())
result_array = result.rdd.map(lambda row : row.temperature).sample(False,0.1).collect()
len(result_array)

This printed the expected values:
1342
134

result_ts = spark.sql("select ts from washing where temperature is not null")
print(result_ts)
result_array_ts = result_ts.rdd.map(lambda row : row.ts).sample(False,0.1).collect()
len(result_array_ts)

I was able to verify result_ts as 1342 rows, but the sample is not giving exactly 10% of the data: result_array_ts returned 118 records.

plt.plot(result_array_ts,result_array)
plt.xlabel("time")
plt.ylabel("temperature")
plt.show()

This failed to create the run chart with the error mentioned in the title:
"ValueError: x and y must have same first dimension, but have shapes (118,) and (134,)"

Preserve metadata on geospatial images when tiling

Is your feature request related to a problem? Please describe.
When using the image tiling operator on GeoTIFF or Cloud Optimized GeoTIFF files, the tiled images no longer contain any geospatial metadata.

Describe the solution you'd like
Copy all metadata over to the tiles and adjust the relevant information (e.g. lon/lat coordinates must be re-computed); a sketch of how this could look is given below.
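
A minimal sketch of what the fix could look like with rasterio (rasterio and the function shape are assumptions for illustration; the actual tiling operator may use a different library):

import rasterio
from rasterio.windows import Window

def write_tile(src_path, dst_path, col_off, row_off, width, height):
    # Read one tile from a GeoTIFF and write it out with adjusted geospatial metadata.
    with rasterio.open(src_path) as src:
        window = Window(col_off, row_off, width, height)
        data = src.read(window=window)
        meta = src.meta.copy()                        # carries CRS, dtype, nodata, ...
        meta.update(
            width=width,
            height=height,
            transform=src.window_transform(window),   # re-compute the geo-transform for the tile
        )
        with rasterio.open(dst_path, "w", **meta) as dst:
            dst.write(data)

# Hypothetical usage: write_tile("input.tif", "tile_0_0.tif", 0, 0, 512, 512)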

Simplify creating multidimensional array

You can get rid of the second map if you use square brackets (producing a list) instead of parentheses (a tuple) in the first map:

data = column1.zip(column2).zip(column3).zip(column4).map(lambda a_b_c_d : [a_b_c_d[0][0][0],a_b_c_d[0][0][1],a_b_c_d[0][1],a_b_c_d[1]] )
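
A minimal local demonstration (the SparkContext sc and the column values are assumptions, made up for this example):

column1 = sc.parallelize([1, 2])
column2 = sc.parallelize([3, 4])
column3 = sc.parallelize([5, 6])
column4 = sc.parallelize([7, 8])

# zip produces nested tuples ((( a, b), c), d); the single map flattens them into a list
data = column1.zip(column2).zip(column3).zip(column4).map(
    lambda a_b_c_d: [a_b_c_d[0][0][0], a_b_c_d[0][0][1], a_b_c_d[0][1], a_b_c_d[1]]
)
print(data.collect())   # [[1, 3, 5, 7], [2, 4, 6, 8]]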

AssignmentML4.ipynb not compatible with Python 3.9, which is the only one available on Watson Studio

@romeokienzler The Week 4 assignment of the Advanced Machine Learning and Signal Processing course on Coursera needs some updates. Many students, myself included, are stuck on it and can't complete the course because of it. It seems that the provided systemml version is not compatible with Python 3.9 but only works up to Python 3.8, which for some reason can't be selected in Watson Studio, as it always says "Notebook environments with Python 3.8 are restricted to notebooks that are already using them. Choose an environment with Python 3.9 instead."
The available environment versions are shown in a screenshot in the original issue (the notebook doesn't work with the option Default Python 3.8 + Watson NLP (beta)).
The error occurs in cell 6 when calling MLContext(spark). This is the error: TypeError: 'JavaPackage' object is not callable

Thank you.

Improve test coverage

Describe the issue
Components are added at a higher pace than the corresponding automated tests. We are in the process of only accepting new components with corresponding tests, but there is still a gap to be filled.

To Reproduce
Steps to reproduce the behavior:

  1. assess test coverage after an automated build
  2. see the amount of test coverage

Expected behavior
100% test coverage at the statement coverage level

Can't register because of credit card authorization issue

First of all, you do need credit card information to register an account.
Also, the credit card verification doesn't work and I can't register the account - I tried 3 different credit cards.
I sent a message to support, but they keep sending the reply to my IBM account, which I can't log in to, so I can't read the reply.
So, please let me know how I can register without credit card information, or when you are going to resolve the credit card verification issue.

Applied AI with Deep Learning course - locked questionnaires

Hi, the questionnaires for weeks two and three are locked. Please, I need someone to unlock them so I can continue with the course.

I have no pending tasks from the first week, and I have an automatic payment set up for the deep learning course.

Regards!

Remove deprecated Docker code

Describe the issue
The new C3 doesn't need inline Docker code.

To Reproduce
Open a notebook and search for 'docker' - it should not find anything

Improve functionality of CLAIMED CLI

The CLAIMED CLI allows all CLAIMED components to be used from the command line in the form:

claimed <component_name:version> <parameters, ...>

claimed claimed-util-cos:0.3 access_key_id=xxx secret_access_key=xxx endpoint=https://s3.us-east.cloud-object-storage.appdomain.cloud bucket_name=era5-cropscape-zarr path=/ recursive=True operation=ls

This is implemented as an exemplar for the claimed-util-cos component but not yet generic.

Making this generic (so the CLI can work with any CLAIMED component and expose it automatically) will allow any CLAIMED component to be used from the terminal and in shell scripts; a purely hypothetical sketch of such a dispatcher is given below.
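
One way a generic dispatcher could work (a hypothetical sketch only, not the current CLAIMED CLI implementation): parse the key=value parameters and expose them as environment variables before launching the component, since the component notebooks read their parameters from the environment via os.environ.get.

import os
import subprocess
import sys

def resolve_entrypoint(component):
    # Placeholder resolution, purely illustrative: the real CLI would have to locate or
    # pull the component script / container image for the given name and version.
    name = component.split(":")[0]
    return f"{name}.py"

def main():
    # e.g. claimed claimed-util-cos:0.3 bucket_name=... operation=ls
    component, *params = sys.argv[1:]
    env = os.environ.copy()
    for p in params:
        key, _, value = p.partition("=")
        env[key] = value
    # Run the component with the parameters exposed as environment variables.
    subprocess.run([sys.executable, resolve_entrypoint(component)], env=env, check=True)

if __name__ == "__main__":
    main()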

Downloading the file problem

Hi,
I have a problem downloading the file into my local Jupyter notebooks using PySpark. Can you please help me?
Thanks,

Push all components to latest spec of the component compiler

Describe the issue
The components and the component compiler are both evolving over time. This means not all components (e.g., notebooks) stick to the latest standard - therefore, the CLAIMED component compiler fails in the automated build for some of the components.

To Reproduce
Steps to reproduce the behavior:

  1. commit to main
  2. wait for the build to be triggered
  3. observe the errors of the component compiler

Expected behavior
All components are built successfully

outdated pyspark installs

Notebook: https://github.com/IBM/claimed/blob/master/coursera_ds/assignment1.2.ipynb (and later assignments in the coursera_ds directory) uses !pip install pyspark==2.4.5 under the assumption that the user set up an environment with Python 3.6 (https://github.com/IBM/claimed/wiki/Watson-Studio-Setup). Python 3.6 is not available... the only Python environment I'm able to create is 3.9, which is incompatible with PySpark 2.4.5. It's not clear whether I'm okay commenting out the version pin and using the current version (3.3). Given that I don't know PySpark (hence why I'm taking the Coursera class), I don't know if things further down the road will break by switching to 3.3. At any rate, the notebook should not force the installation of an outdated PySpark version if that version is not compatible with the Python environments that are available; a possible guard is sketched below.
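
A defensive pattern the notebooks could use instead of hard-pinning pyspark==2.4.5 (just a sketch of an alternative, not what the course notebooks currently do): choose the pin based on the running interpreter, since PySpark 2.4.x does not support Python 3.8 and later.

import subprocess
import sys

# Keep the old pin only where it can actually run; otherwise fall back to PySpark 3.x.
requirement = "pyspark==2.4.5" if sys.version_info < (3, 8) else "pyspark>=3.0"
subprocess.check_call([sys.executable, "-m", "pip", "install", requirement])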

Got "IndexError: list index out of range" when following spark_setup_anaconda.ipynb

At first I intended to install Spark on my local computer, so I ran the code provided in spark_setup_anaconda.ipynb to install Spark. However, I got "IndexError: list index out of range":

`---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
in
8 os.environ["SPARK_HOME"] = "/home/dsxuser/work/spark-2.3.4-bin-hadoop2.7"
9 import findspark
---> 10 findspark.init()
11 from pyspark import SparkContext, SparkConf
12 from pyspark.sql import SQLContext, SparkSession

~\Anaconda3\lib\site-packages\findspark.py in init(spark_home, python_path, edit_rc, edit_profile)
133 # add pyspark to sys.path
134 spark_python = os.path.join(spark_home, 'python')
--> 135 py4j = glob(os.path.join(spark_python, 'lib', 'py4j-*.zip'))[0]
136 sys.path[:0] = [spark_python, py4j]
137

IndexError: list index out of range`
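
A quick diagnostic that is not part of the original notebook: the IndexError is raised when findspark cannot find python/lib/py4j-*.zip under SPARK_HOME, which typically means SPARK_HOME points at a directory that does not exist on the local machine (e.g. the hard-coded /home/dsxuser/work path on a local Anaconda install).

import glob
import os

spark_home = os.environ.get("SPARK_HOME", "")
print("SPARK_HOME =", spark_home)
print("exists     =", os.path.isdir(spark_home))
print("py4j zips  =", glob.glob(os.path.join(spark_home, "python", "lib", "py4j-*.zip")))
# If the last list is empty, point SPARK_HOME at the unpacked
# spark-2.3.4-bin-hadoop2.7 directory, or install via `pip install pyspark` and drop
# findspark altogether.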

Create component for image tiling

Is your feature request related to a problem? Please describe.
The image tiling problem is part of nearly every computer vision pipeline. This component should read arbitrary image formats and create the desired tiles (also using sliding and tumbling windows, with a configurable stride size); a minimal sketch of the tiling logic is given below.
In the case of geospatial image formats like GeoTIFF/COG, the tiles' metadata (e.g. coordinates) should be kept / adjusted in the tiles.
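
A minimal sketch of the core tiling logic with numpy (numpy and the function shape are assumptions for illustration, not a specification for the component):

import numpy as np

def tile(image, tile_size, stride):
    # stride == tile_size gives tumbling windows, stride < tile_size gives sliding windows.
    h, w = image.shape[:2]
    for row in range(0, h - tile_size + 1, stride):
        for col in range(0, w - tile_size + 1, stride):
            yield row, col, image[row:row + tile_size, col:col + tile_size]

# Example: 512x512 tumbling tiles over a dummy 1024x2048 image
for row, col, t in tile(np.zeros((1024, 2048), dtype=np.uint8), 512, 512):
    print(row, col, t.shape)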
