googlecloudplatform / training-data-analyst
Labs and demos for courses for GCP Training (http://cloud.google.com/training).
License: Apache License 2.0
google-cloud-storage 1.13.2 has requirement google-cloud-core<0.30dev,>=0.29.0, but you'll have google-cloud-core 0.25.0 which is incompatible.
google-gax 0.15.16 has requirement future<0.17dev,>=0.16.0, but you'll have future 0.17.1 which is incompatible.
apache-beam 2.5.0 has requirement httplib2<0.10,>=0.8, but you'll have httplib2 0.12.0 which is incompatible.
google-cloud-logging 1.9.1 has requirement google-cloud-core<0.30dev,>=0.29.0, but you'll have google-cloud-core 0.25.0 which is incompatible.
google-cloud-spanner 1.7.1 has requirement google-cloud-core<0.30dev,>=0.29.0, but you'll have google-cloud-core 0.25.0 which is incompatible.
Installing collected packages: six
Found existing installation: six 1.10.0
Uninstalling six-1.10.0:
Successfully uninstalled six-1.10.0
Successfully installed six-1.10.0
If I am not wrong, this happens while installing the packages below:
$ cat install_packages.sh
#!/bin/bash
apt-get install python-pip
pip install google-cloud-dataflow oauth2client==3.0.0
pip install --force six==1.10 # downgrade as 1.11 breaks apitools
pip install -U pip
For example:
update 08_image/mnistmodel/trainer/task.py to use model.train_and_evaluate() instead of using learn_runner
Hi,
When will the content for 6. Production ML models be available?
Thanks
I got this error while running through this codelab
Hi Team,
For the feature engineering function section, can you please provide an example with a more complex calculation?
Example: we want to generate a new feature as the division of two columns only if a third column has the value "Y"; otherwise the value for that row is set to -1.
In pandas this is easy with the .apply() function, but how should it be done in a TensorFlow pipeline?
I tried using tf.where and tf.cond, but they don't seem to work in pipelines for me.
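One possible way to express the conditional feature described above is an element-wise tf.where over whole column tensors. The sketch below is only an illustration under stated assumptions: the function name conditional_ratio and its argument names are hypothetical, the two numeric columns are assumed to be float tensors of the same shape, and the flag column is assumed to be a string tensor.

```python
import tensorflow as tf


def conditional_ratio(numerator, denominator, flag):
    """Return numerator / denominator where flag == 'Y', else -1 (hypothetical helper)."""
    is_yes = tf.equal(flag, 'Y')
    # tf.where evaluates both branches for every row, so the denominator is
    # replaced by 1 on rows that are not selected to avoid dividing by values
    # that are never used.
    safe_denominator = tf.where(is_yes, denominator, tf.ones_like(denominator))
    ratio = numerator / safe_denominator
    # Rows whose flag is not 'Y' get the constant -1.
    return tf.where(is_yes, ratio, -tf.ones_like(ratio))
```

tf.cond chooses between two whole computations based on a single scalar predicate, so it is usually not the right tool for a per-row decision; tf.where selects row by row.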
Dear Sir,
I am running the streaming job from https://github.com/GoogleCloudPlatform/training-data-analyst/blob/master/courses/streaming/process/sandiego/src/main/java/com/google/cloud/training/dataanalyst/sandiego/CurrentConditions.java.
I would like to create BigQuery tables dynamically, one per sensorId, with the sensorId as the table name, as below. I am hard-coding the project name rather than taking it from the options.
String sensorId = info.getSensorKey();
String currConditionsTable = "brss-711:demos." + sensorId ;
The table name is defined in the main function; since I want to create a new table per sensorId, I need to call the above statements from the "ToBQRow" function.
Even though I made "currConditionsTable" a global variable, it does not work and I get a NullPointerException, since the variable contains null.
Please help me to resolve the issue.
Regards,
Kiran.
I'm following the codelab instructions to publish a message to a Pub/Sub topic as shown below, but an error is returned:
response: { message: 'publisher is not defined', internalCode: undefined } }
at next (/home/google2145703_student/training-data-analyst/courses/developingapps/nodejs/pubsub-languageapi-spanner/start/node_modules/express/lib/router/index.js:275:10
Code:
// Handler for feedback POSTed from the client app
router.post('/feedback/:quiz', (req, res, next) => {
  const feedback = req.body;
  // TODO: Publish the message into Cloud Pub/Sub
  publisher.publishFeedback(feedback).then(() => {
    // TODO: Move the statement that returns a message to
    // the client app here
    res.json('Feedback received');
    // END TODO
  // TODO: Add a catch
  }).catch(err => {
    // TODO: There was an error, invoke the next middleware
    next(err);
    // END TODO
  });
  // END TODO
});
When I execute the script in this location:
training-data-analyst/datalab/cloudshell/create_vm.sh
I get the following error:
ERROR: (gcloud.compute.instances.create) Could not fetch resource:
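Here is a fix:
# Stop TensorBoard instances
# (assumes TensorBoard comes from the Datalab ML package)
from google.datalab.ml import TensorBoard
pids_df = TensorBoard.list()
if not pids_df.empty:
    for pid in pids_df['pid']:
        TensorBoard().stop(pid)
        print('Stopped TensorBoard with pid {}'.format(pid))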
In Cloud ML models, use the supplied --job-dir as the output directory. This avoids the need for code like this:
output_dir = os.path.join(
    output_dir,
    json.loads(
        os.environ.get('TF_CONFIG', '{}')
    ).get('task', {}).get('trial', '')
)
(this is already done for --job-dir)
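A minimal sketch of what reading --job-dir directly could look like in a trainer's task.py, assuming an argparse-based entry point. The function name parse_output_dir is hypothetical, and whether a given trainer already defines other flags is not known from this issue, so unknown flags are simply ignored here.

```python
import argparse


def parse_output_dir(argv=None):
    """Return the directory passed as --job-dir (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        '--job-dir',
        required=True,
        help='Location to write checkpoints and exported models')
    # parse_known_args tolerates the other flags a trainer may receive.
    args, _unknown = parser.parse_known_args(argv)
    # argparse exposes --job-dir as the attribute job_dir.
    return args.job_dir
```

With this, the trainer would use the returned value as its output directory without inspecting TF_CONFIG itself.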
While running python transform.py over SSH, I get the error below:
Traceback (most recent call last):
  File "transform.py", line 11, in <module>
    import urllib.request, urllib.error, urllib.parse
ImportError: No module named request
Please help.
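This ImportError is what Python 2 reports for urllib.request, which only exists in Python 3, so the script is probably being run with a Python 2 interpreter. Running it with python3 is likely the simplest fix. If the script has to work under both versions, a guarded import is one option. The sketch below is an assumption about how that could look, not the actual contents of transform.py; host_of is a hypothetical helper used only to show the imported name in use.

```python
try:
    # Python 3 locations
    from urllib.request import urlopen
    from urllib.error import URLError
    from urllib.parse import urlparse
except ImportError:
    # Python 2 fallbacks
    from urllib2 import urlopen, URLError
    from urlparse import urlparse


def host_of(url):
    """Return the network location part of a URL (hypothetical helper)."""
    return urlparse(url).netloc
```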
This line, in both the lab and the solution, should read:
arguments['hidden_units'] = [int(v) for v in arguments['hidden_units'].split(' ')]
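To show the conversion in context, here is a small self-contained sketch. It assumes the flag arrives as a space-separated string through argparse; the function name parse_hidden_units and the default value are hypothetical.

```python
import argparse


def parse_hidden_units(argv=None):
    """Parse --hidden_units such as "128 32 4" into a list of ints."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--hidden_units', default='128 32 4')
    arguments = vars(parser.parse_args(argv))
    # The command line delivers a single string; the model needs integers.
    arguments['hidden_units'] = [
        int(v) for v in arguments['hidden_units'].split(' ')]
    return arguments['hidden_units']
```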
I'm not sure if this is related to new changes released. This code used to work.
python 2.
All cells cleared and restarted. Everything runs until this cell:
preprocess(50*100, 'DataflowRunner')
#change first arg to None to preprocess full dataset
Result is this stack trace:
Launching Dataflow job preprocess-taxifeatures-180901-165026 ... hang on
/usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/coders/typecoders.py:135: UserWarning: Using fallback coder for typehint: Any.
warnings.warn('Using fallback coder for typehint: %r.' % typehint)
CalledProcessError Traceback (most recent call last)
<ipython-input-10-b4775e416971> in <module>()
----> 1 preprocess(50*100, 'DataflowRunner')
2 #change first arg to None to preprocess full dataset
<ipython-input-8-8419c1762ff8> in preprocess(EVERY_N, RUNNER)
53 )
54
---> 55 p.run()
/usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/pipeline.pyc in run(self, test_runner_api)
174 finally:
175 shutil.rmtree(tmpdir)
--> 176 return self.runner.run(self)
177
178 def __enter__(self):
/usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.pyc in run(self, pipeline)
250 # Create the job
251 result = DataflowPipelineResult(
--> 252 self.dataflow_client.create_job(self.job), self)
253
254 self._metrics = DataflowMetrics(self.dataflow_client, result, self.job)
/usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/utils/retry.pyc in wrapper(*args, **kwargs)
166 while True:
167 try:
--> 168 return fun(*args, **kwargs)
169 except Exception as exn: # pylint: disable=broad-except
170 if not retry_filter(exn):
/usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/runners/dataflow/internal/apiclient.pyc in create_job(self, job)
423 def create_job(self, job):
424 """Creates job description. May stage and/or submit for remote execution."""
--> 425 self.create_job_description(job)
426
427 # Stage and submit the job when necessary
/usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/runners/dataflow/internal/apiclient.pyc in create_job_description(self, job)
446 """Creates a job described by the workflow proto."""
447 resources = dependency.stage_job_resources(
--> 448 job.options, file_copy=self._gcs_file_copy)
449 job.proto.environment = Environment(
450 packages=resources, options=job.options,
/usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/runners/dataflow/internal/dependency.pyc in stage_job_resources(options, file_copy, build_setup_args, temp_dir, populate_requirements_cache)
377 else:
378 sdk_remote_location = setup_options.sdk_location
--> 379 _stage_beam_sdk_tarball(sdk_remote_location, staged_path, temp_dir)
380 resources.append(names.DATAFLOW_SDK_TARBALL_FILE)
381 else:
/usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/runners/dataflow/internal/dependency.pyc in _stage_beam_sdk_tarball(sdk_remote_location, staged_path, temp_dir)
462 elif sdk_remote_location == 'pypi':
463 logging.info('Staging the SDK tarball from PyPI to %s', staged_path)
--> 464 _dependency_file_copy(_download_pypi_sdk_package(temp_dir), staged_path)
465 else:
466 raise RuntimeError(
/usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/runners/dataflow/internal/dependency.pyc in _download_pypi_sdk_package(temp_dir)
525 '--no-binary', ':all:', '--no-deps']
526 logging.info('Executing command: %s', cmd_args)
--> 527 processes.check_call(cmd_args)
528 zip_expected = os.path.join(
529 temp_dir, '%s-%s.zip' % (package_name, version))
/usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/utils/processes.pyc in check_call(*args, **kwargs)
42 if force_shell:
43 kwargs['shell'] = True
---> 44 return subprocess.check_call(*args, **kwargs)
45
46
/usr/local/envs/py2env/lib/python2.7/subprocess.pyc in check_call(*popenargs, **kwargs)
188 if cmd is None:
189 cmd = popenargs[0]
--> 190 raise CalledProcessError(retcode, cmd)
191 return 0
192
CalledProcessError: Command '['/usr/local/envs/py2env/bin/python', '-m', 'pip', 'install', '--download', '/tmp/tmp6JRn77', 'google-cloud-dataflow==2.0.0', '--no-binary', ':all:', '--no-deps']' returned non-zero exit status 2
/usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/io/gcp/gcsio.py:113: DeprecationWarning: object() takes no parameters
super(GcsIO, cls).__new__(cls, storage_client))
It seems that code in some of the projects does not support Python 3.
For example, server.py in the devenv project starts as follows:
from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
This import will not work in Python 3, as BaseHTTPRequestHandler and HTTPServer have been moved to the http.server module.
Also, the response must be written to the output stream as bytes.
A fix for the import that works on both Python 2 and 3 is
try:
    from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
except ImportError:
    from http.server import BaseHTTPRequestHandler, HTTPServer
and for the response output stream it is
self.wfile.write(b'Hello GCP dev!')
In courses/machine_learning/deepdive/02_generalization/create_datasets.ipynb there are two imports. One is at the top:
import google.datalab.bigquery as bq
and the second is in the last cell:
import datalab.bigquery as bq
The problem is that the first one uses standard SQL by default and the second one uses legacy SQL by default. If someone runs through the whole notebook and then tries to rerun earlier queries, they fail with errors about enabling standard SQL.
Change scripts and notebooks in this repo from Python 2 to Python 3. You can do this course-by-course & submit pull-requests.
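Two changes come up often in this kind of conversion: print statements become print() calls, and lambdas or functions that unpack a tuple in their parameter list need an explicit unpacking step. The sketch below illustrates both; the function describe and its sample fields are hypothetical and not taken from any file in the repo.

```python
from __future__ import print_function  # makes print() behave the same on Python 2


def describe(record):
    """Format a (dt, name, lat, lon) tuple.

    Python 2 allowed `lambda (dt, name, lat, lon): ...`; Python 3 does not,
    so the tuple is unpacked inside the function body instead.
    """
    dt, name, lat, lon = record
    return '{} {} {:.2f} {:.2f}'.format(dt, name, lat, lon)


# Python 2 style was: print "Training for {} steps".format(1000)
print('Training for {} steps'.format(1000))
```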
I used training-data-analyst/courses/machine_learning/feateng/feateng.ipynb with the Kaggle NYC taxi fare dataset on Colab. The issue I encountered is how to predict with the trained model. Since I am not using GCP directly, I read the test data and call predict, but the predictions were all empty. Can you please let me know how to perform prediction?
CSV_COLUMNS = 'key,fare_amount,pickup_longitude,pickup_latitude,dropoff_longitude,dropoff_latitude,passenger_count,dayofweek,hourofday'.split(',')
LABEL_COLUMN = 'fare_amount'
def pandas_test_input_fn(df):
    return tf.estimator.inputs.pandas_input_fn(
        x=df,
        y=None,
        batch_size=512,
        num_epochs=1,
        shuffle=False,
        queue_capacity=1000
    )
df_valid2 = pd.read_csv('mydata/valid.csv', header = None, names = CSV_COLUMNS)
predictions = estimator.predict(input_fn = pandas_test_input_fn(df_valid2))
It complains "ValueError: Feature euclidean is not in features dictionary.", since that feature comes from add_engineered. But I am confused about how to apply add_engineered when I feed the data through pandas.
It seems I need to define tf.estimator.ModeKeys.EVAL, but how?
Thanks
Hello - I'm getting an error when running the code in "Run Beam pipeline on Cloud Dataflow" section of the "feateng" notebook.
Command:
preprocess(50*100, 'DataflowRunner')
Stacktrace:
Launching Dataflow job preprocess-taxifeatures-181109-182408 ... hang on
ContextualVersionConflict Traceback (most recent call last)
<ipython-input-14-b4775e416971> in <module>()
----> 1 preprocess(50*100, 'DataflowRunner')
2 #change first arg to None to preprocess full dataset
<ipython-input-8-0ab357cc98ce> in preprocess(EVERY_N, RUNNER)
50 p | 'read_{}'.format(phase) >> beam.io.Read(beam.io.BigQuerySource(query=query))
51 | 'tocsv_{}'.format(phase) >> beam.Map(to_csv)
---> 52 | 'write_{}'.format(phase) >> beam.io.Write(beam.io.WriteToText(outfile))
53 )
54 print("Done")
/usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/pipeline.pyc in __exit__(self, exc_type, exc_val, exc_tb)
421 def __exit__(self, exc_type, exc_val, exc_tb):
422 if not exc_type:
--> 423 self.run().wait_until_finish()
424
425 def visit(self, visitor):
/usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/pipeline.pyc in run(self, test_runner_api)
401 if test_runner_api and self._verify_runner_api_compatible():
402 return Pipeline.from_runner_api(
--> 403 self.to_runner_api(), self.runner, self._options).run(False)
404
405 if self._options.view_as(TypeOptions).runtime_type_check:
/usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/pipeline.pyc in run(self, test_runner_api)
414 finally:
415 shutil.rmtree(tmpdir)
--> 416 return self.runner.run_pipeline(self)
417
418 def __enter__(self):
/usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.pyc in run_pipeline(self, pipeline)
387 # raise an exception.
388 result = DataflowPipelineResult(
--> 389 self.dataflow_client.create_job(self.job), self)
390
391 # TODO(BEAM-4274): Circular import runners-metrics. Requires refactoring.
/usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/utils/retry.pyc in wrapper(*args, **kwargs)
182 while True:
183 try:
--> 184 return fun(*args, **kwargs)
185 except Exception as exn: # pylint: disable=broad-except
186 if not retry_filter(exn):
/usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/runners/dataflow/internal/apiclient.pyc in create_job(self, job)
488 def create_job(self, job):
489 """Creates job description. May stage and/or submit for remote execution."""
--> 490 self.create_job_description(job)
491
492 # Stage and submit the job when necessary
/usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/runners/dataflow/internal/apiclient.pyc in create_job_description(self, job)
517
518 # Stage other resources for the SDK harness
--> 519 resources = self._stage_resources(job.options)
520
521 job.proto.environment = Environment(
/usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/runners/dataflow/internal/apiclient.pyc in _stage_resources(self, options)
450 options,
451 temp_dir=tempfile.mkdtemp(),
--> 452 staging_location=google_cloud_options.staging_location)
453 return resources
454
/usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/runners/portability/stager.pyc in stage_job_resources(self, options, build_setup_args, temp_dir, populate_requirements_cache, staging_location)
221 resources.extend(
222 self._stage_beam_sdk(sdk_remote_location, staging_location,
--> 223 temp_dir))
224 elif setup_options.sdk_location == 'container':
225 # Use the SDK that's built into the container, rather than re-staging
/usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/runners/portability/stager.pyc in _stage_beam_sdk(self, sdk_remote_location, staging_location, temp_dir)
464 """
465 if sdk_remote_location == 'pypi':
--> 466 sdk_local_file = Stager._download_pypi_sdk_package(temp_dir)
467 sdk_sources_staged_name = Stager.\
468 _desired_sdk_filename_in_staging_location(sdk_local_file)
/usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/runners/portability/stager.pyc in _download_pypi_sdk_package(temp_dir, fetch_binary, language_version_tag, language_implementation_tag, abi_tag, platform_tag)
513 package_name = Stager.get_sdk_package_name()
514 try:
--> 515 version = pkg_resources.get_distribution(package_name).version
516 except pkg_resources.DistributionNotFound:
517 raise RuntimeError('Please set --sdk_location command-line option '
/usr/local/envs/py2env/lib/python2.7/site-packages/pkg_resources/__init__.pyc in get_distribution(dist)
469 dist = Requirement.parse(dist)
470 if isinstance(dist, Requirement):
--> 471 dist = get_provider(dist)
472 if not isinstance(dist, Distribution):
473 raise TypeError("Expected string, Requirement, or Distribution", dist)
/usr/local/envs/py2env/lib/python2.7/site-packages/pkg_resources/__init__.pyc in get_provider(moduleOrReq)
345 """Return an IResourceProvider for the named module or requirement"""
346 if isinstance(moduleOrReq, Requirement):
--> 347 return working_set.find(moduleOrReq) or require(str(moduleOrReq))[0]
348 try:
349 module = sys.modules[moduleOrReq]
/usr/local/envs/py2env/lib/python2.7/site-packages/pkg_resources/__init__.pyc in require(self, *requirements)
889 included, even if they were already activated in this working set.
890 """
--> 891 needed = self.resolve(parse_requirements(requirements))
892
893 for dist in needed:
/usr/local/envs/py2env/lib/python2.7/site-packages/pkg_resources/__init__.pyc in resolve(self, requirements, env, installer, replace_conflicting, extras)
780 # Oops, the "best" so far conflicts with a dependency
781 dependent_req = required_by[req]
--> 782 raise VersionConflict(dist, req).with_context(dependent_req)
783
784 # push the new requirements onto the stack
ContextualVersionConflict: (pytz 2016.7 (/usr/local/envs/py2env/lib/python2.7/site-packages), Requirement.parse('pytz<=2018.4,>=2018.3'), set(['apache-beam']))
Avoid the use of Discovery API and directly hit the ML end-point (since it is documented and won't change)
The resulting code is simpler and easier to understand
Hi,
Running the first block of the feateng.ipynb notebook results in cffi.error.VerificationError, so the rest doesn't work either. The same notebook worked fine yesterday.
Hey @lakshmanok
the course is nice, but there is very little chance to actively learn in most labs, as usually one just clicks through pre-written code. In this case, this example is supposed to be a TODO according to the lab sheet on Qwiklabs...
Originally posted by @lokeshsoni in #330
In the "Prod ML Systems Lab 2 : Serving ML Predictions in batch and real-time" lab, it says:
Step 2
Back in your Cloud Shell, modify the script run_dataflow.sh to get Project Id (using --project) from command line arguments, and then run as follows:
cd ~/training-data-analyst/courses/machine_learning/deepdive/06_structured/labs/serving
./run_dataflow.sh
However, I can already see this set here: https://github.com/GoogleCloudPlatform/training-data-analyst/blob/master/courses/machine_learning/deepdive/06_structured/labs/serving/run_dataflow.sh#L11
I then get this Java error running the script:
[WARNING]
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:293)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Failed to construct instance from factory method DataflowRunner#fromOptions(interface org.apache.beam.sdk.options.PipelineOptions)
at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod(InstanceBuilder.java:233)
at org.apache.beam.sdk.util.InstanceBuilder.build(InstanceBuilder.java:162)
at org.apache.beam.sdk.PipelineRunner.fromOptions(PipelineRunner.java:55)
at org.apache.beam.sdk.Pipeline.create(Pipeline.java:150)
at com.google.cloud.training.mlongcp.AddPrediction.main(AddPrediction.java:69)
... 6 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod(InstanceBuilder.java:222)
... 10 more
Caused by: java.lang.NoSuchMethodError: com.google.api.client.googleapis.services.json.AbstractGoogleJsonClient$Builder.setBatchPath(Ljava/lang/String;)Lcom/google/api/client/googleapis/services/AbstractGoogleClient$Builder;
at com.google.api.services.cloudresourcemanager.CloudResourceManager$Builder.setBatchPath(CloudResourceManager.java:5929)
at com.google.api.services.cloudresourcemanager.CloudResourceManager$Builder.<init>(CloudResourceManager.java:5908)
at org.apache.beam.sdk.extensions.gcp.options.GcpOptions$GcpTempLocationFactory.newCloudResourceManagerClient(GcpOptions.java:370)
at org.apache.beam.sdk.extensions.gcp.options.GcpOptions$GcpTempLocationFactory.create(GcpOptions.java:240)
at org.apache.beam.sdk.extensions.gcp.options.GcpOptions$GcpTempLocationFactory.create(GcpOptions.java:228)
at org.apache.beam.sdk.options.ProxyInvocationHandler.returnDefaultHelper(ProxyInvocationHandler.java:592)
at org.apache.beam.sdk.options.ProxyInvocationHandler.getDefault(ProxyInvocationHandler.java:533)
at org.apache.beam.sdk.options.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:155)
at com.sun.proxy.$Proxy37.getGcpTempLocation(Unknown Source)
at org.apache.beam.runners.dataflow.DataflowRunner.fromOptions(DataflowRunner.java:240)
... 15 more
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 8.330 s
[INFO] Finished at: 2018-10-14T14:31:16+01:00
[INFO] Final Memory: 26M/62M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.4.0:java (default-cli) on project pipeline: An exception occured while executing the Java class. null: InvocationTargetException: Failed to construct instance from factory method DataflowRunner#fromOptions(interface org.apache.beam.sdk.options.PipelineOptions): com.google.api.client.googleapis.services.json.AbstractGoogleJsonClient$Builder.setBatchPath(Ljava/lang/String;)Lcom/google/api/client/googleapis/services/AbstractGoogleClient$Builder; -> [Help 1]
When running b_hyperparam.ipynb from
datalab/notebooks/training-data-analyst/courses/machine_learning/deepdive/05_artandscience/labs
or
datalab/notebooks/training-data-analyst/courses/machine_learning/deepdive/05_artandscience/
I keep on getting:
Removing gs://qwiklabs-gcp-0c3e9ec8e4427080/house_trained/packages/0a26bfb7f0d02a513fe9c410c169dd06a3a4a5c1f6fdd14ca96cc66ac17fdf17/trainer-0.0.0.tar.gz#1545415738922023...
/ [1 objects]
Operation completed over 1 objects.
ERROR: (gcloud.ml-engine.jobs.submit.training) User [[email protected]] does not have permission to access project [qwiklabs-gcp-0c3e9ec8e4427080s] (or it may not exist): Permission denied on resource project qwiklabs-gcp-0c3e9ec8e4427080s.
I've been doing the Machine Learning with TensorFlow on Google Cloud Platform course on Coursera. While working through the labs, I have noticed that the strategies for configuring the gcloud SDK are not very robust. Perhaps that is because they are intended to be run on GCP in Datalab, but I like doing them on my own computer or VMs: Datalab itself has been showing a non-responsive UI, which may be caused by poor network latency or by my persistent use of Firefox.
Anyhow, there doesn't seem to be a place in the documentation with an advised way of automating the setup of a GCP config, and I have broken quite a few gcloud configurations by running scripts like the one here. It changes the project ID, bucket and region in my currently active config. These configurations are proving quite tedious to keep an eye on.
I know Terraform and other DevOps tools offer partial solutions, but this really feels like something that should be native. Does anyone have suggestions on how we could improve the scripts that set up GCP environment variables, so that they stop overwriting existing configs and instead use a temporary one that belongs exclusively to the script?
Perhaps it is possible to set all of these variables with the Python API, and avoid changing any of the configs that are used for bash calls.
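One approach that might work here: gcloud reads properties from environment variables of the form CLOUDSDK_SECTION_PROPERTY, such as CLOUDSDK_CORE_PROJECT and CLOUDSDK_COMPUTE_REGION, and these take precedence over the active configuration without modifying it. The sketch below builds such an environment for a child process only. The helper names gcloud_env and run_gcloud are hypothetical, and passing the bucket as a plain BUCKET variable is an assumption, since the bucket is not a gcloud property.

```python
import os
import subprocess


def gcloud_env(project, region, bucket=None, base_env=None):
    """Return a copy of the environment with per-script gcloud overrides."""
    env = dict(os.environ if base_env is None else base_env)
    # These override the active configuration for this process only;
    # nothing is written to the user's gcloud config files.
    env['CLOUDSDK_CORE_PROJECT'] = project
    env['CLOUDSDK_COMPUTE_REGION'] = region
    if bucket is not None:
        # Assumption: lab scripts read the bucket from a BUCKET variable.
        env['BUCKET'] = bucket
    return env


def run_gcloud(args, env):
    """Run gcloud with the given arguments in the isolated environment."""
    return subprocess.check_call(['gcloud'] + list(args), env=env)
```

A script could then call run_gcloud(['config', 'list'], gcloud_env('my-project', 'us-central1')) and leave the interactive shell's configuration untouched. Setting CLOUDSDK_ACTIVE_CONFIG_NAME in the same way selects a named configuration for the child process.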
Hi,
I am currently learning GCP, and I've been following some of the examples in the codelabs. To be more specific, I've been studying the San Diego traffic example. I don't quite understand the role of the file LaneInfo.java. It seems that this file defines the input variables as strings, and that CurrentConditions.java and AverageSpeeds.java then use those definitions? As part of my learning, I am trying to replicate the same process using the Chicago Traffic dataset, but I keep running into issues when running AverageSpeeds.java and LaneInfo.java. Any insight would be helpful. I am still very new to GCP and to Java/Apache Beam in general.
The coursera video is referencing a folder that doesn't exist
I am using the sklearn_crfsuite estimator:
crf = sklearn_crfsuite.CRF(
    algorithm='lbfgs',
    c1=0.1,
    c2=0.1,
    max_iterations=2,
    all_possible_transitions=True
)
I'm saving the model as described below:
model = 'model.joblib'
joblib.dump(crf, model)
and when I try to deploy the model it reports this error:
ERROR: (gcloud.alpha.ml-engine.versions.create) Bad model detected with error: "Failed to load model: Could not load the model: /tmp/model/0001/model.joblib. No module named sklearn_crfsuite.estimator. (Error code: 0)"
deploy model:
gcloud alpha ml-engine versions create v1 --model teste --origin $ORI --python-version 2.7 --runtime-version 1.8 --framework scikit-learn
In the Jupyter notebook courses/machine_learning/deepdive/10_recommend/composer_gcf_trigger/composertriggered.ipynb there is a typo/inconsistency in the name of the Airflow variable gcp_completion_bucket.
In the section Complete the DAG file it is called gcs_completion_bucket (note the s in the third position).
However, in the section Setting Airflow variables it is called gcp_completion_bucket. (I guess this name is correct since it conforms with the names of the other variables.)
The same applies to the Jupyter notebook in the labs folder, courses/machine_learning/deepdive/10_recommend/labs/composer_gcf_trigger/composertriggered.ipynb
Reminder that this might be enough to correct the lab problem.
#360
2019-02-19 11:20:15.948958: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at lookup_table_op.cc:674 : Failed precondition: Table not initialized.
2019-02-19 11:20:15.948958: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at lookup_table_op.cc:674 : Failed precondition: Table not initialized.
2019-02-19 11:20:15.958287: I tensorflow/core/kernels/lookup_util.cc:376] Table trying to initialize from file ./temp_output/vocab.txt is already initialized.
Traceback (most recent call last):
File "/home/user12/Documents/answer_evaluation_12_2_19/env3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/home/user12/Documents/answer_evaluation_12_2_19/env3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/user12/Documents/answer_evaluation_12_2_19/env3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.FailedPreconditionError: Table not initialized.
[[{{node hash_table_Lookup}} = LookupTableFindV2[Tin=DT_STRING, Tout=DT_INT64, _device="/device:CPU:0"](string_to_index/hash_table, SparseToDense, string_to_index/hash_table/Const)]]
[[{{node IteratorGetNext}} = IteratorGetNextoutput_shapes=[[?,?], [?,?], [?,1]], output_types=[DT_INT64, DT_INT64, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
During handling of the above exception, another exception occurred:
I am trying to run the model natively in my local environment. tf.contrib.lookup.index_table_from_file throws a "Table not initialized" error.
debug_demo.ipynb downloads but does not run. Where are the Google Cloud Datalab credentials etc. for this lab? Is this just a dry lab?
flake8 testing of https://github.com/GoogleCloudPlatform/training-data-analyst on Python 3.7.0
$ flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics
./blogs/nexrad2/visualize/plot_pngs.py:41:40: E999 SyntaxError: invalid syntax
print "Plotting {} into {} upto {} km".format(args.nexrad, args.png, args.range)
^
./blogs/landsat/ndvi.py:32:33: E999 SyntaxError: invalid syntax
print 'Getting {0} to {1} '.format(self.gsfile, self.dest)
^
./blogs/landsat/setup.py:81:31: E999 SyntaxError: invalid syntax
print 'Running command: %s' % command_list
^
./blogs/landsat/dfndvi.py:33:42: E999 SyntaxError: invalid syntax
print "WARNING! format error on {", line, "}"
^
./blogs/lightning/ltgpred/create_dataset.py:330:43: E999 SyntaxError: invalid syntax
print 'Launching local job ... hang on'
^
./blogs/timeseries/simplernn/trainer/model.py:48:22: E999 SyntaxError: invalid syntax
print 'readcsv={}'.format(value_column)
^
./blogs/tf_dataflow_serving/run_pipeline.py:57:12: E999 SyntaxError: invalid syntax
print ''
^
./blogs/tf_dataflow_serving/simulate_stream.py:52:93: E999 SyntaxError: invalid syntax
print 'Topic does not exist. Please run a stream pipeline first to create the topic.'
^
./blogs/goes16/maria/create_image.py:84:39: E999 SyntaxError: invalid syntax
| 'to_jpg' >> beam.Map(lambda (dt,name,lat,lon):
^
./courses/machine_learning/feateng/taxifare/trainer/model.py:201:21: F821 undefined name 'SCALE_COLUMNS'
for name in SCALE_COLUMNS:
^
./courses/machine_learning/feateng/taxifare_tft/trainer/model.py:184:17: F821 undefined name 'tflearn'
'rmse': tflearn.MetricSpec(metric_fn=metrics.streaming_root_mean_squared_error),
^
./courses/machine_learning/feateng/taxifare_tft/trainer/model.py:184:46: F821 undefined name 'metrics'
'rmse': tflearn.MetricSpec(metric_fn=metrics.streaming_root_mean_squared_error),
^
./courses/machine_learning/feateng/taxifare_tft/trainer/model.py:185:37: F821 undefined name 'tflearn'
'training/hptuning/metric': tflearn.MetricSpec(metric_fn=metrics.streaming_root_mean_squared_error),
^
./courses/machine_learning/feateng/taxifare_tft/trainer/model.py:185:66: F821 undefined name 'metrics'
'training/hptuning/metric': tflearn.MetricSpec(metric_fn=metrics.streaming_root_mean_squared_error),
^
./courses/machine_learning/deepdive/08_image/mnistmodel/trainer/task.py:117:34: E999 SyntaxError: invalid syntax
print "Training for {} steps".format(hparams['train_steps'])
^
./courses/machine_learning/deepdive/08_image/labs/flowersmodel/model.py:103:44: E999 SyntaxError: invalid syntax
image = #TODO: decode contents into JPEG
^
./courses/machine_learning/deepdive/08_image/labs/mnistmodel/trainer/task.py:117:34: E999 SyntaxError: invalid syntax
print "Training for {} steps".format(hparams['train_steps'])
^
./courses/machine_learning/deepdive/06_structured/labs/serving/application/main.py:32:20: E999 SyntaxError: invalid syntax
credentials = # TODO
^
./courses/machine_learning/deepdive/10_recommend/labs/hybrid_recommendations/hybrid_recommendations_module/trainer/model.py:266:24: F821 undefined name 'NON_FACTOR_COLUMNS'
for colname in NON_FACTOR_COLUMNS[1:-1]
^
./courses/machine_learning/deepdive/10_recommend/hybrid_recommendations/hybrid_recommendations_module/trainer/model.py:266:24: F821 undefined name 'NON_FACTOR_COLUMNS'
for colname in NON_FACTOR_COLUMNS[1:-1]
^
./courses/machine_learning/deepdive/09_sequence/kubeflow-app/vendor/kubeflow/core/jupyterhub_spawner.py:66:1: F821 undefined name 'c'
c.JupyterHub.ip = '0.0.0.0'
^
./courses/machine_learning/deepdive/09_sequence/kubeflow-app/vendor/kubeflow/core/jupyterhub_spawner.py:67:1: F821 undefined name 'c'
c.JupyterHub.hub_ip = '0.0.0.0'
^
./courses/machine_learning/deepdive/09_sequence/kubeflow-app/vendor/kubeflow/core/jupyterhub_spawner.py:70:1: F821 undefined name 'c'
c.JupyterHub.cleanup_servers = False
^
./courses/machine_learning/deepdive/09_sequence/kubeflow-app/vendor/kubeflow/core/jupyterhub_spawner.py:76:1: F821 undefined name 'c'
c.JupyterHub.spawner_class = KubeFormSpawner
^
./courses/machine_learning/deepdive/09_sequence/kubeflow-app/vendor/kubeflow/core/jupyterhub_spawner.py:77:1: F821 undefined name 'c'
c.KubeSpawner.singleuser_image_spec = 'gcr.io/kubeflow/tensorflow-notebook'
^
./courses/machine_learning/deepdive/09_sequence/kubeflow-app/vendor/kubeflow/core/jupyterhub_spawner.py:78:1: F821 undefined name 'c'
c.KubeSpawner.cmd = 'start-singleuser.sh'
^
./courses/machine_learning/deepdive/09_sequence/kubeflow-app/vendor/kubeflow/core/jupyterhub_spawner.py:79:1: F821 undefined name 'c'
c.KubeSpawner.args = ['--allow-root']
^
./courses/machine_learning/deepdive/09_sequence/kubeflow-app/vendor/kubeflow/core/jupyterhub_spawner.py:81:1: F821 undefined name 'c'
c.KubeSpawner.start_timeout = 60 * 10
^
./courses/machine_learning/deepdive/09_sequence/kubeflow-app/vendor/kubeflow/core/jupyterhub_spawner.py:90:1: F821 undefined name 'c'
c.KubeSpawner.user_storage_pvc_ensure = True
^
./courses/machine_learning/deepdive/09_sequence/kubeflow-app/vendor/kubeflow/core/jupyterhub_spawner.py:92:1: F821 undefined name 'c'
c.KubeSpawner.user_storage_capacity = '10Gi'
^
./courses/machine_learning/deepdive/09_sequence/kubeflow-app/vendor/kubeflow/core/jupyterhub_spawner.py:93:1: F821 undefined name 'c'
c.KubeSpawner.pvc_name_template = 'claim-{username}{servername}'
^
./courses/machine_learning/deepdive/09_sequence/kubeflow-app/vendor/kubeflow/core/jupyterhub_spawner.py:94:1: F821 undefined name 'c'
c.KubeSpawner.volumes = [
^
./courses/machine_learning/deepdive/09_sequence/kubeflow-app/vendor/kubeflow/core/jupyterhub_spawner.py:102:1: F821 undefined name 'c'
c.KubeSpawner.volume_mounts = [
^
./courses/machine_learning/deepdive/09_sequence/labs/txtclsmodel/trainer/model.py:89:36: E999 SyntaxError: invalid syntax
x = # TODO (hint: use tokenizer)
^
./courses/data_analysis/deepdive/pubsub-prework-solution/python/action_publisher.py:31:11: F821 undefined name 'topic_name'
topic_name, message_future.exception()))
^
./courses/data_analysis/deepdive/composer-exercises/hello_world_solution.py:36:14: F821 undefined name 'xrange'
for i in xrange(number_of_templated_tasks):
^
./courses/developingapps/demos/gs2ds/gs2ds.py:33:18: F821 undefined name 'unicode'
'firstName': unicode(firstName),
^
./courses/developingapps/demos/gs2ds/gs2ds.py:34:17: F821 undefined name 'unicode'
'lastName': unicode(lastName),
^
./courses/developingapps/demos/gs2ds/gs2ds.py:37:14: F821 undefined name 'unicode'
'party': unicode(party),
^
./courses/developingapps/demos/gs2ds/gs2ds.py:38:18: F821 undefined name 'unicode'
'homeState': unicode(homeState),
^
./courses/developingapps/python/cloudstorage/end/quiz/__init__.py:26:24: F821 undefined name 'api'
app.register_blueprint(api.routes.api_blueprint, url_prefix='/api')
^
./courses/developingapps/python/cloudstorage/end/quiz/__init__.py:27:24: F821 undefined name 'webapp'
app.register_blueprint(webapp.routes.webapp_blueprint, url_prefix='') ^
./courses/developingapps/python/cloudstorage/end/quiz/webapp/questions.py:58:28: F821 undefined name 'unicode'
data['imageUrl'] = unicode(upload_file(image_file, True))
^
./courses/developingapps/python/cloudstorage/start/quiz/__init__.py:26:24: F821 undefined name 'api'
app.register_blueprint(api.routes.api_blueprint, url_prefix='/api')
^
./courses/developingapps/python/cloudstorage/start/quiz/__init__.py:27:24: F821 undefined name 'webapp'
app.register_blueprint(webapp.routes.webapp_blueprint, url_prefix='') ^
./courses/developingapps/python/kubernetesengine/end/frontend/quiz/__init__.py:26:24: F821 undefined name 'api'
app.register_blueprint(api.routes.api_blueprint, url_prefix='/api')
^
./courses/developingapps/python/kubernetesengine/end/frontend/quiz/__init__.py:27:24: F821 undefined name 'webapp'
app.register_blueprint(webapp.routes.webapp_blueprint, url_prefix='') ^
./courses/developingapps/python/kubernetesengine/end/frontend/quiz/webapp/questions.py:39:28: F821 undefined name 'unicode'
data['imageUrl'] = unicode(upload_file(image_file, True))
^
./courses/developingapps/python/kubernetesengine/end/backend/start/frontend/quiz/__init__.py:26:24: F821 undefined name 'api'
app.register_blueprint(api.routes.api_blueprint, url_prefix='/api')
^
./courses/developingapps/python/kubernetesengine/end/backend/start/frontend/quiz/__init__.py:27:24: F821 undefined name 'webapp'
app.register_blueprint(webapp.routes.webapp_blueprint, url_prefix='') ^
./courses/developingapps/python/kubernetesengine/end/backend/start/frontend/quiz/webapp/questions.py:39:28: F821 undefined name 'unicode'
data['imageUrl'] = unicode(upload_file(image_file, True))
^
./courses/developingapps/python/kubernetesengine/start/frontend/quiz/__init__.py:26:24: F821 undefined name 'api'
app.register_blueprint(api.routes.api_blueprint, url_prefix='/api')
^
./courses/developingapps/python/kubernetesengine/start/frontend/quiz/__init__.py:27:24: F821 undefined name 'webapp'
app.register_blueprint(webapp.routes.webapp_blueprint, url_prefix='') ^
./courses/developingapps/python/kubernetesengine/start/frontend/quiz/webapp/questions.py:39:28: F821 undefined name 'unicode'
data['imageUrl'] = unicode(upload_file(image_file, True))
^
./courses/developingapps/python/kubernetesengine/bonus/frontend/quiz/__init__.py:26:24: F821 undefined name 'api'
app.register_blueprint(api.routes.api_blueprint, url_prefix='/api')
^
./courses/developingapps/python/kubernetesengine/bonus/frontend/quiz/__init__.py:27:24: F821 undefined name 'webapp'
app.register_blueprint(webapp.routes.webapp_blueprint, url_prefix='') ^
./courses/developingapps/python/kubernetesengine/bonus/frontend/quiz/api/api.py:71:27: E999 SyntaxError: invalid syntax
print 'answer sent'
^
./courses/developingapps/python/kubernetesengine/bonus/frontend/quiz/webapp/questions.py:39:28: F821 undefined name 'unicode'
data['imageUrl'] = unicode(upload_file(image_file, True))
^
./courses/developingapps/python/pubsub-languageapi-spanner/end/quiz/__init__.py:26:24: F821 undefined name 'api'
app.register_blueprint(api.routes.api_blueprint, url_prefix='/api')
^
./courses/developingapps/python/pubsub-languageapi-spanner/end/quiz/__init__.py:27:24: F821 undefined name 'webapp'
app.register_blueprint(webapp.routes.webapp_blueprint, url_prefix='') ^
./courses/developingapps/python/pubsub-languageapi-spanner/end/quiz/webapp/questions.py:39:28: F821 undefined name 'unicode'
data['imageUrl'] = unicode(upload_file(image_file, True))
^
./courses/developingapps/python/pubsub-languageapi-spanner/start/quiz/__init__.py:26:24: F821 undefined name 'api'
app.register_blueprint(api.routes.api_blueprint, url_prefix='/api')
^
./courses/developingapps/python/pubsub-languageapi-spanner/start/quiz/__init__.py:27:24: F821 undefined name 'webapp'
app.register_blueprint(webapp.routes.webapp_blueprint, url_prefix='') ^
./courses/developingapps/python/pubsub-languageapi-spanner/start/quiz/webapp/questions.py:39:28: F821 undefined name 'unicode'
data['imageUrl'] = unicode(upload_file(image_file, True))
^
./courses/developingapps/python/pubsub-languageapi-spanner/start/quiz/gcp/pubsub.py:69:16: E999 IndentationError: expected an indented block
"""pull_feedback
^
./courses/developingapps/python/pubsub-languageapi-spanner/start/quiz/gcp/spanner.py:87:0: E999 SyntaxError: unexpected EOF while parsing
^
./courses/developingapps/python/pubsub-languageapi-spanner/start/quiz/gcp/languageapi.py:60:14: E999 SyntaxError: unexpected EOF while parsing
# END TODO ^
./courses/developingapps/python/pubsub-languageapi-spanner/bonus/quiz/__init__.py:26:24: F821 undefined name 'api'
app.register_blueprint(api.routes.api_blueprint, url_prefix='/api')
^
./courses/developingapps/python/pubsub-languageapi-spanner/bonus/quiz/__init__.py:27:24: F821 undefined name 'webapp'
app.register_blueprint(webapp.routes.webapp_blueprint, url_prefix='') ^
./courses/developingapps/python/pubsub-languageapi-spanner/bonus/quiz/webapp/questions.py:39:28: F821 undefined name 'unicode'
data['imageUrl'] = unicode(upload_file(image_file, True))
^
./courses/developingapps/python/datastore/end/quiz/__init__.py:26:24: F821 undefined name 'api'
app.register_blueprint(api.routes.api_blueprint, url_prefix='/api')
^
./courses/developingapps/python/datastore/end/quiz/__init__.py:27:24: F821 undefined name 'webapp'
app.register_blueprint(webapp.routes.webapp_blueprint, url_prefix='') ^
./courses/developingapps/python/datastore/start/quiz/__init__.py:26:24: F821 undefined name 'api'
app.register_blueprint(api.routes.api_blueprint, url_prefix='/api')
^
./courses/developingapps/python/datastore/start/quiz/__init__.py:27:24: F821 undefined name 'webapp'
app.register_blueprint(webapp.routes.webapp_blueprint, url_prefix='') ^
./courses/developingapps/python/datastore/bonus/quiz/__init__.py:26:24: F821 undefined name 'api'
app.register_blueprint(api.routes.api_blueprint, url_prefix='/api')
^
./courses/developingapps/python/datastore/bonus/quiz/__init__.py:27:24: F821 undefined name 'webapp'
app.register_blueprint(webapp.routes.webapp_blueprint, url_prefix='') ^
./courses/developingapps/python/firebase/end/quiz/__init__.py:26:24: F821 undefined name 'api'
app.register_blueprint(api.routes.api_blueprint, url_prefix='/api')
^
./courses/developingapps/python/firebase/end/quiz/__init__.py:27:24: F821 undefined name 'webapp'
app.register_blueprint(webapp.routes.webapp_blueprint, url_prefix='') ^
./courses/developingapps/python/firebase/end/quiz/webapp/questions.py:58:28: F821 undefined name 'unicode'
data['imageUrl'] = unicode(upload_file(image_file, True))
^
./courses/developingapps/python/firebase/start/quiz/__init__.py:26:24: F821 undefined name 'api'
app.register_blueprint(api.routes.api_blueprint, url_prefix='/api')
^
./courses/developingapps/python/firebase/start/quiz/__init__.py:27:24: F821 undefined name 'webapp'
app.register_blueprint(webapp.routes.webapp_blueprint, url_prefix='') ^
./courses/developingapps/python/firebase/start/quiz/webapp/questions.py:58:28: F821 undefined name 'unicode'
data['imageUrl'] = unicode(upload_file(image_file, True))
^
./courses/developingapps/python/appengine/end/frontend/quiz/__init__.py:26:24: F821 undefined name 'api'
app.register_blueprint(api.routes.api_blueprint, url_prefix='/api')
^
./courses/developingapps/python/appengine/end/frontend/quiz/__init__.py:27:24: F821 undefined name 'webapp'
app.register_blueprint(webapp.routes.webapp_blueprint, url_prefix='') ^
./courses/developingapps/python/appengine/end/frontend/quiz/webapp/questions.py:39:28: F821 undefined name 'unicode'
data['imageUrl'] = unicode(upload_file(image_file, True))
^
./courses/developingapps/python/appengine/start/frontend/quiz/__init__.py:26:24: F821 undefined name 'api'
app.register_blueprint(api.routes.api_blueprint, url_prefix='/api')
^
./courses/developingapps/python/appengine/start/frontend/quiz/__init__.py:27:24: F821 undefined name 'webapp'
app.register_blueprint(webapp.routes.webapp_blueprint, url_prefix='') ^
./courses/developingapps/python/appengine/start/frontend/quiz/webapp/questions.py:39:28: F821 undefined name 'unicode'
data['imageUrl'] = unicode(upload_file(image_file, True))
^
./bootcamps/imagereco/fashionmodel/trainer/model.py:52:12: F821 undefined name 'p2'
outlen = p2.shape[1]*p2.shape[2]*p2.shape[3] #outlen should be 980
^
./bootcamps/imagereco/fashionmodel/trainer/model.py:52:24: F821 undefined name 'p2'
outlen = p2.shape[1]*p2.shape[2]*p2.shape[3] #outlen should be 980
^
./bootcamps/imagereco/fashionmodel/trainer/model.py:52:36: F821 undefined name 'p2'
outlen = p2.shape[1]*p2.shape[2]*p2.shape[3] #outlen should be 980
^
./bootcamps/imagereco/fashionmodel/trainer/model.py:53:23: F821 undefined name 'p2'
p2flat = tf.reshape(p2, [-1, outlen]) # flattened
^
./bootcamps/imagereco/fashionmodel/trainer/model.py:85:10: F821 undefined name 'ylogits'
return ylogits, NCLASSES
^
./bootcamps/imagereco/fashionmodel/trainer/task.py:117:34: E999 SyntaxError: invalid syntax
print "Training for {} steps".format(hparams['train_steps'])
^
19 E999 SyntaxError: invalid syntax
75 F821 undefined name 'p2'
94
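The print-statement E999 entries and the xrange/unicode F821 entries above are Python 2 idioms that flake8 rejects when it runs under Python 3. As a minimal sketch (the hparams value and the sample name below are made up for illustration, not taken from the repository), the Python 3 spellings would look roughly like this:

```python
# Hypothetical stand-in for the hparams dict used in trainer/task.py.
hparams = {'train_steps': 1000}

# Python 2: print "Training for {} steps".format(...)
# Python 3: print is a function, so the argument goes in parentheses.
message = "Training for {} steps".format(hparams['train_steps'])
print(message)

# Python 2: for i in xrange(n)
# Python 3: range is already lazy, so it replaces xrange directly.
squares = [i * i for i in range(3)]

# Python 2: unicode(value)
# Python 3: str is the text type, so str(value) is the usual replacement.
first_name = str("Ada")
```

The TODO placeholders in the labs directories (for example, image = #TODO) are a different case: they are intentionally incomplete, so they would still fail to parse.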
I have an issue with the first cell of the following notebook:
https://github.com/GoogleCloudPlatform/training-data-analyst/blob/master/courses/machine_learning/feateng/feateng.ipynb
%%bash
conda update -y -n base -c defaults conda
source activate py2env
pip uninstall -y google-cloud-dataflow
conda install -y pytz
pip install apache-beam[gcp]
It seems the issue is the installation of apache-beam:
...
Requirement already satisfied: cachetools>=2.0.0 in /usr/local/envs/py2env/lib/python2.7/site-packages (from google-auth<2.0dev,>=0.4.0->google-api-core[grpc]<2.0.0dev,>=1.4.1->google-cloud-pubsub==0.39.0; extra == "gcp"->apache-beam[gcp]) (2.1.0)
Installing collected packages: dill, pyarrow, typing, pyvcf, fastavro, httplib2, docopt, hdfs, grpc-google-iam-v1, google-api-core, google-cloud-pubsub, monotonic, fasteners, google-apitools, google-cloud-bigquery, apache-beam
Found existing installation: dill 0.2.6
Skipping google-cloud-dataflow as it is not installed.
google-cloud-monitoring 0.28.0 has requirement google-api-core<0.2.0dev,>=0.1.1, but you'll have google-api-core 1.7.0 which is incompatible.
googledatastore 7.0.1 has requirement httplib2<0.10,>=0.9.1, but you'll have httplib2 0.11.3 which is incompatible.
Cannot uninstall 'dill'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.
Then, when I check, I don't see the package installed:
conda list
The right env is activated:
py2env * /usr/local/envs/py2env
So when I try to import the package, it doesn't work (restarting the kernel doesn't help, since the package was never installed):
ImportError Traceback (most recent call last)
<ipython-input-4-830e0319c5fc> in <module>()
1 import tensorflow as tf
----> 2 import apache_beam as beam
3 import shutil
4 print(tf.__version__)
ImportError: No module named apache_beam
It seems that
!conda uninstall dill=0.2.6 -y
removes dill, after which the installation of apache-beam works. My 1:30 min session is over, so I will start again to check whether this was a temporary glitch and whether the workaround above holds.
Since re.match() checks for a match only at the beginning of the string, regardless of mode, I wonder whether we can get rid of r'^' in front of re.escape(term), as the result will be the same and, IMHO, more readable.
Alternatively, I am also considering simply using line.startswith(term), which doesn't need the re module at all and is, again IMHO, more Pythonic and faster.
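To make the comparison concrete, here is a small sketch of the three variants side by side. The term and the sample lines are invented for illustration; only re.match, re.escape, and str.startswith are taken from the discussion above.

```python
import re

# Hypothetical search term that contains a regex metacharacter (".").
term = "a.b"
lines = [
    "a.b starts here",             # literal match at the start
    "x a.b in the middle",         # literal match, but not at the start
    "axb is not a literal match",  # would match only if "." were unescaped
]

# Variant 1: explicit '^' anchor in front of the escaped term.
with_anchor = [bool(re.match(r'^' + re.escape(term), line)) for line in lines]

# Variant 2: no anchor; re.match already starts at position 0.
without_anchor = [bool(re.match(re.escape(term), line)) for line in lines]

# Variant 3: plain string method, no re module involved.
with_startswith = [line.startswith(term) for line in lines]

# Even with re.MULTILINE, re.match does not match at the start of a later line.
multiline_hit = re.match(re.escape(term), "x\na.b", re.MULTILINE)

print(with_anchor, without_anchor, with_startswith, multiline_hit)
```

All three variants agree on these inputs, which is what makes startswith a candidate replacement when the term is always a literal string.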
In model code "simple_rnn" the state is not passed on to the next batch (if I understand it correctly):
outputs, _ = rnn.static_rnn(lstm_cell, x, dtype = tf.float32)
However, if we had a very long time series (large SEQ_LEN, e.g. 10000) that we wanted to split into smaller chunks, how could we pass the state on to the next batch (i.e. rather than starting from a zero state each time)?
# initialize somewhere
state = tf.zeros([BATCH_SIZE, LSTM_SIZE], dtype=tf.float32)
# in model code
outputs, state = rnn.static_rnn(lstm_cell, x, initial_state=state, dtype=tf.float32)
# the outputs are passed on and used to produce the "predictions_dict"
QUESTION: where and how do we initialize the state, and how is the state passed on when creating a custom estimator?
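To illustrate what carrying the state across chunks means, here is a toy sketch in plain Python. It is not TensorFlow and not the Estimator API: run_rnn, the weights, and the chunk size are all hypothetical, and the cell is a scalar tanh recurrence rather than an LSTM. It only shows that feeding each chunk's final state into the next chunk reproduces the full-sequence result, whereas resetting to zero does not.

```python
import math

def run_rnn(inputs, state, w_x=0.5, w_h=0.9):
    """Run a toy scalar RNN over `inputs`, starting from `state`.

    Returns the per-step outputs and the final state.
    """
    outputs = []
    for x in inputs:
        state = math.tanh(w_x * x + w_h * state)
        outputs.append(state)
    return outputs, state

# A long series that we want to process in smaller chunks.
series = [math.sin(0.1 * t) for t in range(100)]
CHUNK = 20  # hypothetical chunk length

# Reference: process the whole series in one go from a zero state.
full_outputs, full_state = run_rnn(series, state=0.0)

# Chunked with state carried over: final state of one chunk seeds the next.
state = 0.0
chunked_outputs = []
for start in range(0, len(series), CHUNK):
    outputs, state = run_rnn(series[start:start + CHUNK], state)
    chunked_outputs.extend(outputs)

# Chunked with the state reset to zero for every chunk (the current behaviour).
reset_outputs = []
for start in range(0, len(series), CHUNK):
    outputs, _ = run_rnn(series[start:start + CHUNK], 0.0)
    reset_outputs.extend(outputs)

print(chunked_outputs == full_outputs, reset_outputs == full_outputs)
```

How to hold and feed such a state inside a custom estimator is exactly the open question above; the sketch does not answer that part.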
The BigQuery link in the notebook points to the welcome page of the (legacy) BigQuery Web UI. Because the project is always a fresh (Qwiklabs) project, the project and dataset list is empty.
Finding the dataset is a chore, and you really need to scan it to solve the TODO (get the custom dimension index). So please replace the link with a deep link such as:
https://console.cloud.google.com/bigquery?dataset=GA360_test&p=cloud-training-demos&d=GA360_test&t=ga_sessions_sample
[edit] Ah, I can just make a small PR. Will do so when I have time.
Hi Sir,
The records are now stored in order of item name rather than timestamp, as shown below.
As you can see, the timestamp column is not in order. Every time a new record is created, it is placed according to its item name.
Please find the output below.
device | item | type | state | timestamp |
---|---|---|---|---|
fueb_38B1DB168ABB | dimmer | Dimmer | 41 | 2018-07-24 12:27:31 UTC |
fueb_38B1DB168ABB | dimmer | Dimmer | 63 | 2018-07-24 12:24:50 UTC |
fueb_38B1DB168ABB | dimmer | Dimmer | 80 | 2018-07-24 12:27:04 UTC |
fueb_38B1DB168ABB | light | Switch | ON | 2018-07-24 12:24:43 UTC |
fueb_38B1DB168ABB | light | Switch | ON | 2018-07-24 12:26:03 UTC |
fueb_38B1DB168ABB | light | Switch | OFF | 2018-07-24 12:22:39 UTC |
fueb_38B1DB168ABB | light | Switch | OFF | 2018-07-24 12:25:47 UTC |
fueb_38B1DB168ABB | color | Color | 109100100 | 2018-07-24 12:27:56 UTC |
fueb_38B1DB168ABB | color | Color | 201100100 | 2018-07-24 12:24:57 UTC |
Please find attached the Dataflow program that I am using to push IoT data to the BQ table.
PubSubReader.java.zip
How can I resolve this issue so that the BQ records are ordered by timestamp?
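One thing worth checking: BigQuery does not guarantee any row order unless the query itself asks for one with ORDER BY, so the order seen in a table preview may not reflect insertion order at all. The sketch below mimics ordering by the timestamp column in plain Python, using the rows from the table above; parse_ts is a hypothetical helper, and nothing here touches BigQuery or the attached Dataflow program.

```python
from datetime import datetime

# Rows copied from the table above: (device, item, type, state, timestamp).
rows = [
    ("fueb_38B1DB168ABB", "dimmer", "Dimmer", "41", "2018-07-24 12:27:31 UTC"),
    ("fueb_38B1DB168ABB", "dimmer", "Dimmer", "63", "2018-07-24 12:24:50 UTC"),
    ("fueb_38B1DB168ABB", "dimmer", "Dimmer", "80", "2018-07-24 12:27:04 UTC"),
    ("fueb_38B1DB168ABB", "light", "Switch", "ON", "2018-07-24 12:24:43 UTC"),
    ("fueb_38B1DB168ABB", "light", "Switch", "ON", "2018-07-24 12:26:03 UTC"),
    ("fueb_38B1DB168ABB", "light", "Switch", "OFF", "2018-07-24 12:22:39 UTC"),
    ("fueb_38B1DB168ABB", "light", "Switch", "OFF", "2018-07-24 12:25:47 UTC"),
    ("fueb_38B1DB168ABB", "color", "Color", "109100100", "2018-07-24 12:27:56 UTC"),
    ("fueb_38B1DB168ABB", "color", "Color", "201100100", "2018-07-24 12:24:57 UTC"),
]

def parse_ts(text):
    """Parse the 'YYYY-MM-DD HH:MM:SS UTC' strings shown in the table."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S UTC")

# Equivalent in spirit to adding ORDER BY timestamp to the query.
ordered = sorted(rows, key=lambda row: parse_ts(row[4]))

for device, item, kind, state, ts in ordered:
    print(ts, item, state)
```

If the rows come out in this order when the query includes ORDER BY timestamp, the pipeline itself may not need any change.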
Hi,
I was running this code on GCP, and it failed with an error when it reached this line:
train_and_evaluate('babyweight_trained')
I checked every single line of code in the training section and it seems like the error is from the line I mentioned.
InvalidArgumentError Traceback (most recent call last)
in ()
26
27 shutil.rmtree('babyweight_trained', ignore_errors=True) # start fresh each time
---> 28 train_and_evaluate('babyweight_trained')
in train_and_evaluate(output_dir)
23 steps=None,
24 exporters=exporter)
---> 25 tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
26
27 shutil.rmtree('babyweight_trained', ignore_errors=True) # start fresh each time
/usr/local/envs/py2env/lib/python2.7/site-packages/tensorflow/python/estimator/training.pyc in train_and_evaluate(estimator, train_spec, eval_spec)
437 '(with task id 0). Given task id {}'.format(config.task_id))
438
--> 439 executor.run()
440
441
/usr/local/envs/py2env/lib/python2.7/site-packages/tensorflow/python/estimator/training.pyc in run(self)
516 config.task_type != run_config_lib.TaskType.EVALUATOR):
517 logging.info('Running training and evaluation locally (non-distributed).')
--> 518 self.run_local()
519 return
520
/usr/local/envs/py2env/lib/python2.7/site-packages/tensorflow/python/estimator/training.pyc in run_local(self)
648 input_fn=self._train_spec.input_fn,
649 max_steps=self._train_spec.max_steps,
--> 650 hooks=train_hooks)
651
652 # Final export signal: For any eval result with global_step >= train
/usr/local/envs/py2env/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.pyc in train(self, input_fn, hooks, steps, max_steps, saving_listeners)
361
362 saving_listeners = _check_listeners_type(saving_listeners)
--> 363 loss = self._train_model(input_fn, hooks, saving_listeners)
364 logging.info('Loss for final step: %s.', loss)
365 return self
/usr/local/envs/py2env/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.pyc in _train_model(self, input_fn, hooks, saving_listeners)
841 return self._train_model_distributed(input_fn, hooks, saving_listeners)
842 else:
--> 843 return self._train_model_default(input_fn, hooks, saving_listeners)
844
845 def _train_model_default(self, input_fn, hooks, saving_listeners):
/usr/local/envs/py2env/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.pyc in _train_model_default(self, input_fn, hooks, saving_listeners)
857 return self._train_with_estimator_spec(estimator_spec, worker_hooks,
858 hooks, global_step_tensor,
--> 859 saving_listeners)
860
861 def _train_model_distributed(self, input_fn, hooks, saving_listeners):
/usr/local/envs/py2env/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.pyc in _train_with_estimator_spec(self, estimator_spec, worker_hooks, hooks, global_step_tensor, saving_listeners)
1057 loss = None
1058 while not mon_sess.should_stop():
-> 1059 _, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
1060 return loss
1061
/usr/local/envs/py2env/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.pyc in __exit__(self, exception_type, exception_value, traceback)
677 if exception_type in [errors.OutOfRangeError, StopIteration]:
678 exception_type = None
--> 679 self._close_internal(exception_type)
680 # __exit__ should return True to suppress an exception.
681 return exception_type is None
/usr/local/envs/py2env/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.pyc in _close_internal(self, exception_type)
714 if self._sess is None:
715 raise RuntimeError('Session is already closed.')
--> 716 self._sess.close()
717 finally:
718 self._sess = None
/usr/local/envs/py2env/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.pyc in close(self)
962 if self._sess:
963 try:
--> 964 self._sess.close()
965 except _PREEMPTION_ERRORS:
966 pass
/usr/local/envs/py2env/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.pyc in close(self)
1106 self._coord.join(
1107 stop_grace_period_secs=self._stop_grace_period_secs,
-> 1108 ignore_live_threads=True)
1109 finally:
1110 try:
/usr/local/envs/py2env/lib/python2.7/site-packages/tensorflow/python/training/coordinator.pyc in join(self, threads, stop_grace_period_secs, ignore_live_threads)
387 self._registered_threads = set()
388 if self._exc_info_to_raise:
--> 389 six.reraise(*self._exc_info_to_raise)
390 elif stragglers:
391 if ignore_live_threads:
/usr/local/envs/py2env/lib/python2.7/site-packages/tensorflow/python/training/queue_runner_impl.pyc in _run(self, sess, enqueue_op, coord)
250 break
251 try:
--> 252 enqueue_callable()
253 except self._queue_closed_exception_types: # pylint: disable=catching-non-exception
254 # This exception indicates that a queue was closed.
/usr/local/envs/py2env/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in _single_operation_run()
1242
1243 def _single_operation_run():
-> 1244 self._call_tf_sessionrun(None, {}, [], target_list, None)
1245
1246 return _single_operation_run
/usr/local/envs/py2env/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list, run_metadata)
1407 return tf_session.TF_SessionRun_wrapper(
1408 self._session, options, feed_dict, fetch_list, target_list,
-> 1409 run_metadata)
1410 else:
1411 with errors.raise_exception_on_not_ok_status() as status:
InvalidArgumentError: assertion failed: [string_input_producer requires a non-null input tensor]
[[Node: input_producer/Assert/Assert = Assert[T=[DT_STRING], summarize=3, _device="/job:localhost/replica:0/task:0/device:CPU:0"](input_producer/Greater, input_producer/Assert/Assert/data_0)]]
When I ran "d_traineval.ipynb" I was able to export the model. I then served it with a local Docker image of TensorFlow Serving 1.8 (CPU), and a REST POST call returns the following:
{
"error": "Serving signature name: \"serving_default\" not found in signature def"
}
My request JSON:
{
"instances": [
{"pickuplon" : -73.987625,
"pickuplat" : 40.750617,
"dropofflat" : 40.78518,
"dropofflon" : -73.971163,
"passengers" : 2
}
]
}
Can you please help me understand what the error is?
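The error suggests that the exported model has no signature called serving_default, which is the name TensorFlow Serving falls back to when the request does not name one. The REST predict request accepts an optional signature_name field, so one thing to try is naming the signature explicitly. The sketch below only builds the request body; build_predict_request is a hypothetical helper, and the signature name "predict" is an assumption that should be replaced by whatever saved_model_cli show --dir <export_dir> --all reports for the exported model.

```python
import json

def build_predict_request(instances, signature_name=None):
    """Build the JSON body for a TensorFlow Serving REST predict call.

    `signature_name` is optional; when omitted, the server uses its default
    signature, which is what fails in the report above.
    """
    body = {"instances": instances}
    if signature_name is not None:
        body["signature_name"] = signature_name
    return json.dumps(body)

# The instance from the request above.
instance = {
    "pickuplon": -73.987625,
    "pickuplat": 40.750617,
    "dropofflat": 40.78518,
    "dropofflon": -73.971163,
    "passengers": 2,
}

# "predict" is an assumed signature name, not confirmed for this model.
payload = build_predict_request([instance], signature_name="predict")
print(payload)
```

The resulting string can be sent as the POST body in place of the original request JSON.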
Update tutorials and course content that currently use Datalab's BigQuery module to use the official Python client library. The google-cloud-* client libraries are now Google's recommended way of interacting with GCP.
RuntimeError Traceback (most recent call last)
in ()
89
90 if __name__=="__main__":
---> 91 preprocessing()
in preprocessing(argv)
85 # print(lines)
86 messages | beam.io.Write(beam.io.WriteToText("gs://anadarko/output.txt"))
---> 87 result = p.run()
88 result.wait_until_finish()
89
/root/anaconda2/lib/python2.7/site-packages/apache_beam/pipeline.pyc in run(self, test_runner_api)
401 if test_runner_api and self._verify_runner_api_compatible():
402 return Pipeline.from_runner_api(
--> 403 self.to_runner_api(), self.runner, self._options).run(False)
404
405 if self._options.view_as(TypeOptions).runtime_type_check:
/root/anaconda2/lib/python2.7/site-packages/apache_beam/pipeline.pyc in run(self, test_runner_api)
414 finally:
415 shutil.rmtree(tmpdir)
--> 416 return self.runner.run_pipeline(self)
417
418 def __enter__(self):
/root/anaconda2/lib/python2.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.pyc in run_pipeline(self, pipeline)
387 # raise an exception.
388 result = DataflowPipelineResult(
--> 389 self.dataflow_client.create_job(self.job), self)
390
391 # TODO(BEAM-4274): Circular import runners-metrics. Requires refactoring.
/root/anaconda2/lib/python2.7/site-packages/apache_beam/utils/retry.pyc in wrapper(*args, **kwargs)
182 while True:
183 try:
--> 184 return fun(*args, **kwargs)
185 except Exception as exn: # pylint: disable=broad-except
186 if not retry_filter(exn):
/root/anaconda2/lib/python2.7/site-packages/apache_beam/runners/dataflow/internal/apiclient.pyc in create_job(self, job)
488 def create_job(self, job):
489 """Creates job description. May stage and/or submit for remote execution."""
--> 490 self.create_job_description(job)
491
492 # Stage and submit the job when necessary
/root/anaconda2/lib/python2.7/site-packages/apache_beam/runners/dataflow/internal/apiclient.pyc in create_job_description(self, job)
517
518 # Stage other resources for the SDK harness
--> 519 resources = self._stage_resources(job.options)
520
521 job.proto.environment = Environment(
/root/anaconda2/lib/python2.7/site-packages/apache_beam/runners/dataflow/internal/apiclient.pyc in _stage_resources(self, options)
450 options,
451 temp_dir=tempfile.mkdtemp(),
--> 452 staging_location=google_cloud_options.staging_location)
453 return resources
454
/root/anaconda2/lib/python2.7/site-packages/apache_beam/runners/portability/stager.pyc in stage_job_resources(self, options, build_setup_args, temp_dir, populate_requirements_cache, staging_location)
221 resources.extend(
222 self._stage_beam_sdk(sdk_remote_location, staging_location,
--> 223 temp_dir))
224 elif setup_options.sdk_location == 'container':
225 # Use the SDK that's built into the container, rather than re-staging
/root/anaconda2/lib/python2.7/site-packages/apache_beam/runners/portability/stager.pyc in _stage_beam_sdk(self, sdk_remote_location, staging_location, temp_dir)
464 """
465 if sdk_remote_location == 'pypi':
--> 466 sdk_local_file = Stager._download_pypi_sdk_package(temp_dir)
467 sdk_sources_staged_name = Stager.\
468 _desired_sdk_filename_in_staging_location(sdk_local_file)
/root/anaconda2/lib/python2.7/site-packages/apache_beam/runners/portability/stager.pyc in _download_pypi_sdk_package(temp_dir, fetch_binary, language_version_tag, language_implementation_tag, abi_tag, platform_tag)
552 processes.check_call(cmd_args)
553 except subprocess.CalledProcessError as e:
--> 554 raise RuntimeError(repr(e))
555
556 for sdk_file in expected_files:
RuntimeError: CalledProcessError()
I am running the notebook at https://github.com/GoogleCloudPlatform/training-data-analyst/blob/master/courses/machine_learning/feateng/tftransform.ipynb
with the updates discussed in the previous issue, but I'm getting the following error:
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 642, in do_work
work_executor.execute()
File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", line 130, in execute
test_shuffle_sink=self._test_shuffle_sink)
File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", line 104, in create_operation
is_streaming=False)
File "apache_beam/runners/worker/operations.py", line 636, in apache_beam.runners.worker.operations.create_operation
op = create_pgbk_op(name_context, spec, counter_factory, state_sampler)
File "apache_beam/runners/worker/operations.py", line 482, in apache_beam.runners.worker.operations.create_pgbk_op
return PGBKCVOperation(step_name, spec, counter_factory, state_sampler)
File "apache_beam/runners/worker/operations.py", line 538, in apache_beam.runners.worker.operations.PGBKCVOperation.__init__
fn, args, kwargs = pickler.loads(self.spec.combine_fn)[:3]
File "/usr/local/lib/python2.7/dist-packages/apache_beam/internal/pickler.py", line 246, in loads
return dill.loads(s)
File "/usr/local/lib/python2.7/dist-packages/dill/_dill.py", line 316, in loads
return load(file, ignore)
File "/usr/local/lib/python2.7/dist-packages/dill/_dill.py", line 304, in load
obj = pik.load()
File "/usr/lib/python2.7/pickle.py", line 864, in load
dispatch[key](self)
File "/usr/lib/python2.7/pickle.py", line 1139, in load_reduce
value = func(*args)
TypeError: __init__() takes exactly 2 arguments (3 given)
apache-airflow==1.9.0
apache-beam==2.8.0
tensorflow==1.9.0
tensorflow-metadata==0.9.0
tensorflow-transform==0.9.0
I also noticed the SDK changed from 2.7 (10/22) to 2.8 (10/26)
In machine_learning/deepdive, simplify the model functions by rewriting them with Keras instead of tf.layers.
@alexhanna @lakshmanok Notice the /tmp/test.json works for the locally trained model but not the cloud trained model. Once the additional features are added the cloud version can do a predict.
Hi there,
Just for info: in "Big Data & ML Fundamentals Lab 4: Recommendations ML with Dataproc v1.3", when running a PySpark job on Dataproc, the code runs but the log shows "Caught exception" warnings. Maybe something needs to be updated in the config. In the end the job runs and succeeds. Here is the full log from the run:
19/02/13 12:25:54 INFO org.spark_project.jetty.util.log: Logging initialized @3300ms
19/02/13 12:25:54 INFO org.spark_project.jetty.server.Server: jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
19/02/13 12:25:54 INFO org.spark_project.jetty.server.Server: Started @3435ms
19/02/13 12:25:54 INFO org.spark_project.jetty.server.AbstractConnector: Started ServerConnector@5a39e97c{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
19/02/13 12:25:54 WARN org.apache.spark.scheduler.FairSchedulableBuilder: Fair Scheduler configuration file not found so jobs will be scheduled in FIFO order. To use fair scheduling, configure pools in fairscheduler.xml or set spark.scheduler.allocation.file to a file that contains the configuration.
19/02/13 12:25:56 INFO org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at cluster-d34e-m/10.128.0.2:8032
19/02/13 12:25:56 INFO org.apache.hadoop.yarn.client.AHSProxy: Connecting to Application History server at cluster-d34e-m/10.128.0.2:10200
19/02/13 12:25:59 WARN org.apache.hadoop.hdfs.DataStreamer: Caught exception
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1252)
at java.lang.Thread.join(Thread.java:1326)
at org.apache.hadoop.hdfs.DataStreamer.closeResponder(DataStreamer.java:980)
at org.apache.hadoop.hdfs.DataStreamer.endBlock(DataStreamer.java:630)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:807)
19/02/13 12:26:00 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl: Submitted application application_1550060283142_0001
19/02/13 12:26:09 WARN org.apache.spark.SparkContext: Spark is not running in local mode, therefore the checkpoint directory must not be on the local filesystem. Directory 'checkpoint/' appears to be on the local filesystem.
read ...
19/02/13 12:26:23 WARN org.apache.hadoop.hdfs.DataStreamer: Caught exception
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1252)
at java.lang.Thread.join(Thread.java:1326)
at org.apache.hadoop.hdfs.DataStreamer.closeResponder(DataStreamer.java:980)
at org.apache.hadoop.hdfs.DataStreamer.endBlock(DataStreamer.java:630)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:807)
trained ...
predicted for user=0
predicted for user=1
predicted for user=2
Thanks
Cheers
Fabien
In the decode_csv() function, it has
features = dict(zip(CSV_COLUMNS, columns))
and CSV_COLUMNS includes the 'key' column, but it does not do:
features.pop('key')
Although 'key' is listed as one of the columns in the CSV, shouldn't this column be dropped before the dictionary is used as the feature set?
Also, this routine prints this message when doing training. Is this a problem?
INFO:tensorflow:'serving_default' : Regression input must be a single string Tensor; got {'passengers': <tf.Tensor 'Placeholder_4:0' shape=(?,) dtype=float32>, 'pickuplon': <tf.Tensor 'Placeholder:0' shape=(?,) dtype=float32>, 'dropofflon': <tf.Tensor 'Placeholder_3:0' shape=(?,) dtype=float32>, 'pickuplat': <tf.Tensor 'Placeholder_1:0' shape=(?,) dtype=float32>, 'dropofflat': <tf.Tensor 'Placeholder_2:0' shape=(?,) dtype=float32>}
INFO:tensorflow:'regression' : Regression input must be a single string Tensor; got {'passengers': <tf.Tensor 'Placeholder_4:0' shape=(?,) dtype=float32>, 'pickuplon': <tf.Tensor 'Placeholder:0' shape=(?,) dtype=float32>, 'dropofflon': <tf.Tensor 'Placeholder_3:0' shape=(?,) dtype=float32>, 'pickuplat': <tf.Tensor 'Placeholder_1:0' shape=(?,) dtype=float32>, 'dropofflat': <tf.Tensor 'Placeholder_2:0' shape=(?,) dtype=float32>}
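To make the question about dropping 'key' concrete, here is a minimal plain-Python sketch of the pattern. It is not the notebook's decode_csv(): the column order, the 'fare_amount' label name, the sample row, and the helper name split_features_and_label are all assumptions for illustration, and the real function works on tensors rather than Python lists.

```python
# Hypothetical column layout; only the five feature names below appear in the
# log message above. 'fare_amount' and the ordering are assumptions.
CSV_COLUMNS = ['fare_amount', 'pickuplon', 'pickuplat',
               'dropofflon', 'dropofflat', 'passengers', 'key']
LABEL_COLUMN = 'fare_amount'


def split_features_and_label(columns):
    """Pair column names with parsed values, then drop non-feature entries."""
    features = dict(zip(CSV_COLUMNS, columns))
    # 'key' identifies the row; it carries no signal, so remove it.
    features.pop('key')
    # The label is returned separately from the feature dictionary.
    label = features.pop(LABEL_COLUMN)
    return features, label


# Made-up row, in the assumed column order.
row = [12.5, -73.99, 40.75, -73.98, 40.76, 1.0, 'row-0001']
features, label = split_features_and_label(row)
print(sorted(features))
print(label)
```

Whether leaving 'key' in the dictionary actually causes a problem depends on how the model's feature columns are defined, so treat this only as an illustration of the pop() step being asked about.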
In the Jupyter notebook at https://github.com/GoogleCloudPlatform/training-data-analyst/blob/master/blogs/sklearn/babyweight_skl.ipynb, there's a small typo.
In the 3rd code cell under the section "Packaging up as a Python package", the "install_requires" argument contains 'cloudml-hypertune,'. The comma should go outside the string, not inside it.
The file generated by that cell's %writefile magic (babyweight/setup.py) is actually correct, so my guess is that the notebook cell was changed after the file was written.
Thanks Lak for the great tutorials! On my team, we are definitely following them carefully and closely.
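A small sketch of why the comma placement matters: inside the quotes, the comma becomes part of the requirement string itself. The two lists and the helper find_stray_commas are hypothetical and only mirror the typo described above; they are not copied from the notebook's setup.py.

```python
# As described in the notebook cell: the comma sits inside the quotes.
requires_in_notebook = ['cloudml-hypertune,']
# As described in the generated setup.py: the comma is outside the string.
requires_in_file = ['cloudml-hypertune']


def find_stray_commas(requirements):
    """Return requirement strings that end with a comma inside the quotes."""
    return [req for req in requirements if req.strip().endswith(',')]


print(find_stray_commas(requires_in_notebook))
print(find_stray_commas(requires_in_file))
```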