hydrospheredata / hydro-serving
MLOps Platform
Home Page: http://docs.hydrosphere.io
License: Apache License 2.0
For a data.json file with 7.1 MB of contents, the manager fails to serve the model.
Stacktrace:
[ERROR] i.h.s.m.c.ModelServiceController i.h.s.c.ServingDataDirectives$$anonfun$completeExecutionResult$2.apply.25 Serving failed
akka.http.scaladsl.model.EntityStreamSizeException: EntityStreamSizeException: actual entity size (Some(38515045)) exceeded content length limit (8388608 bytes)! You can configure this by setting `akka.http.[server|client].parsing.max-content-length` or calling `HttpEntity.withSizeLimit` before materializing the dataBytes stream.
at akka.http.scaladsl.model.HttpEntity$Limitable$$anon$1.preStart(HttpEntity.scala:609) ~[akka-http-core_2.11-10.0.9.jar:?]
at akka.stream.impl.fusing.GraphInterpreter.init(GraphInterpreter.scala:290) ~[akka-stream_2.11-2.5.3.jar:?]
at akka.stream.impl.fusing.GraphInterpreterShell.init(ActorGraphInterpreter.scala:540) ~[akka-stream_2.11-2.5.3.jar:?]
at akka.stream.impl.fusing.ActorGraphInterpreter.tryInit(ActorGraphInterpreter.scala:659) ~[akka-stream_2.11-2.5.3.jar:?]
at akka.stream.impl.fusing.ActorGraphInterpreter.preStart(ActorGraphInterpreter.scala:707) ~[akka-stream_2.11-2.5.3.jar:?]
at akka.actor.Actor$class.aroundPreStart(Actor.scala:521) ~[akka-actor_2.11-2.5.3.jar:?]
at akka.stream.impl.fusing.ActorGraphInterpreter.aroundPreStart(ActorGraphInterpreter.scala:650) ~[akka-stream_2.11-2.5.3.jar:?]
at akka.actor.ActorCell.create(ActorCell.scala:591) ~[akka-actor_2.11-2.5.3.jar:?]
at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462) ~[akka-actor_2.11-2.5.3.jar:?]
at akka.actor.ActorCell.systemInvoke(ActorCell.scala:484) ~[akka-actor_2.11-2.5.3.jar:?]
at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:282) ~[akka-actor_2.11-2.5.3.jar:?]
at akka.dispatch.Mailbox.run(Mailbox.scala:223) ~[akka-actor_2.11-2.5.3.jar:?]
at akka.dispatch.Mailbox.exec(Mailbox.scala:234) ~[akka-actor_2.11-2.5.3.jar:?]
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [akka-actor_2.11-2.5.3.jar:?]
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [akka-actor_2.11-2.5.3.jar:?]
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [akka-actor_2.11-2.5.3.jar:?]
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [akka-actor_2.11-2.5.3.jar:?]
Note that the log says the manager received 38515045 bytes, while the data.json size is 7132595 bytes.
Update
This issue shows up when the model runtime (in this case a VGG 300 TF model) returns a huge response.
Set the akka.http.[server|client].parsing.max-content-length parameter to ~100 MB.
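In application.conf terms, the override suggested by the error message could look like this (100m is an assumed ceiling; size it to your largest expected response):

```hocon
# Raise the entity size limit for both the server and the client side,
# since the oversized payload here is the runtime's *response*.
akka.http {
  server.parsing.max-content-length = 100m
  client.parsing.max-content-length = 100m
}
```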
[INFO ] c.s.d.c.DefaultDockerClient c.s.d.c.DefaultDockerClient.createContainer.635 Creating container with ContainerConfig: ContainerConfig{hostname=null, domainname=null, user=null, attachStdin=null, attachStdout=null, attachStderr=null, portSpecs=null, exposedPorts=null, tty=null, openStdin=null, stdinOnce=null, env=null, cmd=null, image=kek:1, volumes={/model={}}, workingDir=null, entrypoint=null, networkDisabled=null, onBuild=null, labels={MODEL_VERSION_ID=1, SERVICE_ID=0, MODEL_NAME=kek, MODEL_TYPE=spark:2.1, MODEL_VERSION=1, HS_SERVICE_MARKER=HS_SERVICE_MARKER, DEPLOYMENT_TYPE=MODEL}, macAddress=null, hostConfig=null, stopSignal=null, healthcheck=null, networkingConfig=null}
[ERROR] i.h.s.m.ManagerHttpApi i.h.s.m.ManagerHttpApi$$anonfun$1.applyOrElse.74 Request error: POST unix://localhost:80/containers/create?name=s0modelkek: 409, body: {"message":"Conflict. The container name \"/s0modelkek\" is already in use by container \"0905c30d604b1a869ba26404758bedbc121bb8bee2c1823edabe0c591b9270d3\". You have to remove (or rename) that container to be able to reuse that name."}
com.spotify.docker.client.exceptions.DockerRequestException: Request error: POST unix://localhost:80/containers/create?name=s0modelkek: 409, body: {"message":"Conflict. The container name \"/s0modelkek\" is already in use by container \"0905c30d604b1a869ba26404758bedbc121bb8bee2c1823edabe0c591b9270d3\". You have to remove (or rename) that container to be able to reuse that name."}
at com.spotify.docker.client.DefaultDockerClient.propagate(DefaultDockerClient.java:2503) ~[docker-client-8.8.0.jar:8.8.0]
at com.spotify.docker.client.DefaultDockerClient.request(DefaultDockerClient.java:2453) ~[docker-client-8.8.0.jar:8.8.0]
at com.spotify.docker.client.DefaultDockerClient.createContainer(DefaultDockerClient.java:638) ~[docker-client-8.8.0.jar:8.8.0]
at io.hydrosphere.serving.manager.service.clouddriver.LocalCloudDriverService.io$hydrosphere$serving$manager$service$clouddriver$LocalCloudDriverService$$startModel(LocalCloudDriverService.scala:47) ~[classes/:?]
at io.hydrosphere.serving.manager.service.clouddriver.LocalCloudDriverService$$anonfun$deployService$1$$anonfun$3.apply(LocalCloudDriverService.scala:74) ~[classes/:?]
at io.hydrosphere.serving.manager.service.clouddriver.LocalCloudDriverService$$anonfun$deployService$1$$anonfun$3.apply(LocalCloudDriverService.scala:74) ~[classes/:?]
at scala.Option.map(Option.scala:146) ~[scala-library-2.11.11.jar:?]
at io.hydrosphere.serving.manager.service.clouddriver.LocalCloudDriverService$$anonfun$deployService$1.apply(LocalCloudDriverService.scala:74) ~[classes/:?]
at io.hydrosphere.serving.manager.service.clouddriver.LocalCloudDriverService$$anonfun$deployService$1.apply(LocalCloudDriverService.scala:72) ~[classes/:?]
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) ~[scala-library-2.11.11.jar:?]
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run$$$capture(Future.scala:24) ~[scala-library-2.11.11.jar:?]
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala) ~[scala-library-2.11.11.jar:?]
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) [akka-actor_2.11-2.5.8.jar:?]
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:43) [akka-actor_2.11-2.5.8.jar:?]
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [akka-actor_2.11-2.5.8.jar:?]
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [akka-actor_2.11-2.5.8.jar:?]
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [akka-actor_2.11-2.5.8.jar:?]
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [akka-actor_2.11-2.5.8.jar:?]
Caused by: javax.ws.rs.ClientErrorException: HTTP 409 Conflict
at org.glassfish.jersey.client.JerseyInvocation.createExceptionForFamily(JerseyInvocation.java:1044) ~[jersey-client-2.22.2.jar:?]
at org.glassfish.jersey.client.JerseyInvocation.convertToException(JerseyInvocation.java:1027) ~[jersey-client-2.22.2.jar:?]
at org.glassfish.jersey.client.JerseyInvocation.translate(JerseyInvocation.java:816) ~[jersey-client-2.22.2.jar:?]
at org.glassfish.jersey.client.JerseyInvocation.access$700(JerseyInvocation.java:92) ~[jersey-client-2.22.2.jar:?]
at org.glassfish.jersey.client.JerseyInvocation$5.completed(JerseyInvocation.java:773) ~[jersey-client-2.22.2.jar:?]
at org.glassfish.jersey.client.ClientRuntime.processResponse(ClientRuntime.java:198) ~[jersey-client-2.22.2.jar:?]
at org.glassfish.jersey.client.ClientRuntime.access$300(ClientRuntime.java:79) ~[jersey-client-2.22.2.jar:?]
at org.glassfish.jersey.client.ClientRuntime$2.run(ClientRuntime.java:180) ~[jersey-client-2.22.2.jar:?]
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271) ~[jersey-common-2.22.2.jar:?]
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267) ~[jersey-common-2.22.2.jar:?]
at org.glassfish.jersey.internal.Errors.process(Errors.java:315) ~[jersey-common-2.22.2.jar:?]
at org.glassfish.jersey.internal.Errors.process(Errors.java:297) ~[jersey-common-2.22.2.jar:?]
at org.glassfish.jersey.internal.Errors.process(Errors.java:267) ~[jersey-common-2.22.2.jar:?]
at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:340) ~[jersey-common-2.22.2.jar:?]
at org.glassfish.jersey.client.ClientRuntime$3.run(ClientRuntime.java:210) ~[jersey-client-2.22.2.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call$$$capture(Executors.java:511) ~[?:1.8.0_131]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java) ~[?:1.8.0_131]
at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266) ~[?:1.8.0_131]
at java.util.concurrent.FutureTask.run(FutureTask.java) ~[?:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_131]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_131]
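The logs suggest the manager derives container names deterministically (SERVICE_ID=0 plus MODEL_NAME=kek yields s0modelkek, and SERVICE_ID=1 plus word2vec yields s1modelword2vec), so a stale container left over from a previous run collides on redeploy. A minimal sketch of that inferred naming scheme and a pre-flight check — the helper names are hypothetical, not the manager's actual code:

```python
def container_name(service_id: int, model_name: str) -> str:
    # Deterministic name as inferred from the manager logs:
    # SERVICE_ID=0, MODEL_NAME=kek -> "s0modelkek"
    return f"s{service_id}model{model_name}"


def needs_cleanup(existing_names, service_id, model_name) -> bool:
    # True when a container already holds the name, i.e. the
    # createContainer call would fail with HTTP 409 Conflict.
    return container_name(service_id, model_name) in existing_names
```

As the error body itself says, the practical workaround is to remove or rename the stale container (e.g. `docker rm s0modelkek`) before redeploying.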
Step 1 in "How to launch demo" in the readme is not reproducible, since build.sh doesn't exist in the repo. We need to update this or add a link to RUN_DUMMY_DEMO.MD.
Using the 0.0.18 release, vectors come back blank. tfidf and cv_tf should both return vectors.
[
{
"filtered": [
"foo"
],
"UU_EVAL_RESP_FTEXT": "foo",
"tfiddf": [],
"cv_tf": [],
"tokens": [
"foo"
]
}
]
Occasional build errors occur when the model source is an S3 bucket.
Sometimes the build service can't find the model directory and the build fails.
We need to keep the bucket and the local cache in sync.
After a model is uploaded to the S3 source, the manager needs to copy it to the local source on every upload request. Since we removed the SQS watcher service, the model state in S3 and in the local cache is sometimes inconsistent.
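One way to reconcile the two copies without the SQS watcher is a periodic diff between the bucket listing and the local cache. This is a hypothetical sketch (the function name and the fetch/evict policy are assumptions, not the manager's actual code):

```python
def diff_cache(s3_keys, local_keys):
    # Compute what to fetch from S3 and what to evict locally so the
    # cache mirrors the bucket. Inputs are plain sets of object keys.
    s3_keys, local_keys = set(s3_keys), set(local_keys)
    to_fetch = s3_keys - local_keys   # models missing from the cache
    to_evict = local_keys - s3_keys   # models deleted from the bucket
    return to_fetch, to_evict
```

A real sync would also have to compare ETags or timestamps, since a re-upload keeps the same key while changing the content.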
[ERROR] a.a.OneForOneStrategy a.e.s.Slf4jLogger$$anonfun$receive$1$$anonfun$applyOrElse$1.apply$mcV$sp.72 key not found: Records
java.util.NoSuchElementException: key not found: Records
at scala.collection.MapLike$class.default(MapLike.scala:228) ~[scala-library-2.11.11.jar:?]
at scala.collection.AbstractMap.default(Map.scala:59) ~[scala-library-2.11.11.jar:?]
at scala.collection.MapLike$class.apply(MapLike.scala:141) ~[scala-library-2.11.11.jar:?]
at scala.collection.AbstractMap.apply(Map.scala:59) ~[scala-library-2.11.11.jar:?]
at io.hydrosphere.serving.manager.actor.modelsource.S3SourceWatcher$SQSMessage$.fromJson(S3SourceWatcher.scala:81) ~[manager.jar:0.0.1]
at io.hydrosphere.serving.manager.actor.modelsource.S3SourceWatcher$$anonfun$2.apply(S3SourceWatcher.scala:25) ~[manager.jar:0.0.1]
at io.hydrosphere.serving.manager.actor.modelsource.S3SourceWatcher$$anonfun$2.apply(S3SourceWatcher.scala:25) ~[manager.jar:0.0.1]
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) ~[scala-library-2.11.11.jar:?]
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) ~[scala-library-2.11.11.jar:?]
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) ~[scala-library-2.11.11.jar:?]
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) ~[scala-library-2.11.11.jar:?]
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) ~[scala-library-2.11.11.jar:?]
at scala.collection.AbstractTraversable.map(Traversable.scala:104) ~[scala-library-2.11.11.jar:?]
at io.hydrosphere.serving.manager.actor.modelsource.S3SourceWatcher.onWatcherTick(S3SourceWatcher.scala:25) ~[manager.jar:0.0.1]
at io.hydrosphere.serving.manager.actor.modelsource.SourceWatcher$$anonfun$watcherTick$1.applyOrElse(SourceWatcher.scala:38) ~[manager.jar:0.0.1]
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170) ~[scala-library-2.11.11.jar:?]
at akka.actor.Actor$class.aroundReceive(Actor.scala:513) ~[akka-actor_2.11-2.5.3.jar:?]
at io.hydrosphere.serving.manager.actor.modelsource.S3SourceWatcher.aroundReceive(S3SourceWatcher.scala:17) ~[manager.jar:0.0.1]
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:527) [akka-actor_2.11-2.5.3.jar:?]
at akka.actor.ActorCell.invoke(ActorCell.scala:496) [akka-actor_2.11-2.5.3.jar:?]
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257) [akka-actor_2.11-2.5.3.jar:?]
at akka.dispatch.Mailbox.run(Mailbox.scala:224) [akka-actor_2.11-2.5.3.jar:?]
at akka.dispatch.Mailbox.exec(Mailbox.scala:234) [akka-actor_2.11-2.5.3.jar:?]
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [akka-actor_2.11-2.5.3.jar:?]
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [akka-actor_2.11-2.5.3.jar:?]
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [akka-actor_2.11-2.5.3.jar:?]
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [akka-actor_2.11-2.5.3.jar:?]
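The crash comes from an unconditional lookup of the Records key in S3SourceWatcher, while SQS also delivers messages without it (notably the s3:TestEvent S3 sends when a bucket notification is first configured). A defensive parse, sketched in Python under the assumption that the payload is the standard S3 event notification JSON:

```python
import json


def extract_s3_keys(message_body: str):
    # Return the object keys from an S3 event notification, or an empty
    # list for messages without a "Records" field (e.g. s3:TestEvent),
    # instead of raising NoSuchElementException-style errors.
    payload = json.loads(message_body)
    records = payload.get("Records", [])  # .get instead of a raising lookup
    return [r["s3"]["object"]["key"] for r in records]
```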
manager | APP_OPTS=-Dapplication.grpcPort=9091 -Dapplication.port=9090 -DopenTracing.zipkin.enabled=false -DopenTracing.zipkin.port=9411 -DopenTracing.zipkin.host=zipkin -Dmanager.advertisedHost=manager -Dmanager.advertisedPort=9091 -Ddatabase.jdbcUrl=jdbc:postgresql://postgres:5432/docker -Ddatabase.username=docker -Ddatabase.password=docker -DcloudDriver.docker.networkName=demo_hydronet -DdockerRepository.type=local -Dapplication.shadowingOn=false -Dsidecar.adminPort=8082 -Dsidecar.ingressPort=8080 -Dsidecar.egressPort=8081 -Dsidecar.host=sidecar
postgres | LOG: database system is ready to accept connections
postgres | LOG: autovacuum launcher started
sidecar | [2018-07-26 08:52:31.494][1][info][main] source/server/server.cc:178] initializing epoch 0 (hot restart version=9.200.16384.227.options=capacity=16384, num_slots=8209 hash=228984379728933363)
manager | Archive: /hydro-serving/app/lib/jffi-1.2.9-native.jar
sidecar | [2018-07-26 08:52:31.532][1][warning][upstream] source/common/config/grpc_mux_impl.cc:205] gRPC config stream closed: 1,
sidecar | [2018-07-26 08:52:31.532][1][warning][upstream] source/common/config/grpc_mux_impl.cc:36] Unable to establish new stream
sidecar | [2018-07-26 08:52:31.532][1][info][config] source/server/configuration_impl.cc:52] loading 0 listener(s)
sidecar | [2018-07-26 08:52:31.532][1][info][config] source/server/configuration_impl.cc:92] loading tracing configuration
sidecar | [2018-07-26 08:52:31.532][1][info][config] source/server/configuration_impl.cc:119] loading stats sink configuration
sidecar | [2018-07-26 08:52:31.533][1][info][main] source/server/server.cc:353] starting main dispatch loop
sidecar | [2018-07-26 08:52:31.536][1][info][upstream] source/common/upstream/cluster_manager_impl.cc:127] cm init: initializing cds
manager | creating: jni/x86_64-Linux/
manager | inflating: jni/x86_64-Linux/libjffi-1.2.so
managerui | rm: can't remove '/etc/nginx/conf.d/default.conf': No such file or directory
managerui exited with code 1
Team,
Does Hydro Serving support Cloudera CDH?
Thanks,
Vijay
{
"name": "hydrosphere/serving-grpc-runtime-spark-2_1",
"version": "0.0.1",
"modelTypes": [
"spark:2.1"
],
"tags": [
"string"
],
"configParams": {}
}
{
"modelId": 14
}
{
"id": 0,
"name": "testapp",
"executionGraph": {
"stages": [
{
"services": [
{
"serviceDescription": {
"runtimeId": 2,
"modelVersionId": 1,
"environmentId": 0
},
"weight": 100
}
],
"signatureName": "string"
}
]
}
}
Result:
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null,Step 1/6 : FROM busybox:1.28.0,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null,
,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null, ---> 5b0d59026729
,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null,Step 2/6 : LABEL MODEL_TYPE=spark:2.1,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null,
,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null, ---> Running in b5f914e59880
,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null,Removing intermediate container b5f914e59880
,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null, ---> df1966d1a6e8
,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null,Step 3/6 : LABEL MODEL_NAME=word2vec,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null,
,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null, ---> Running in 2172c7a57456
,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null,Removing intermediate container 2172c7a57456
,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null, ---> 5d32ba9e6aff
,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null,Step 4/6 : LABEL MODEL_VERSION=None,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null,
,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null, ---> Running in 877af9e80f15
,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null,Removing intermediate container 877af9e80f15
,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null, ---> b007c4cec8f3
,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null,Step 5/6 : VOLUME /model,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null,
,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null, ---> Running in e430f440c583
,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null,Removing intermediate container e430f440c583
,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null, ---> ede9188f3e2a
,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null,Step 6/6 : ADD model /model,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null,
,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null, ---> 8739a8be79c8
,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null,null,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null,Successfully built 8739a8be79c8
,null,null,None)
[INFO ] i.h.s.m.s.ModelManagementServiceImpl i.h.s.m.s.ModelManagementServiceImpl$$anon$1.handle.257 ProgressMessage(null,null,Successfully tagged word2vec:1
,null,null,None)
[INFO ] c.s.d.c.DefaultDockerClient c.s.d.c.DefaultDockerClient.createContainer.635 Creating container with ContainerConfig: ContainerConfig{hostname=null, domainname=null, user=null, attachStdin=null, attachStdout=null, attachStderr=null, portSpecs=null, exposedPorts=null, tty=null, openStdin=null, stdinOnce=null, env=null, cmd=null, image=word2vec:1, volumes={/model={}}, workingDir=null, entrypoint=null, networkDisabled=null, onBuild=null, labels={MODEL_VERSION_ID=1, SERVICE_ID=1, MODEL_NAME=word2vec, MODEL_TYPE=spark:2.1, MODEL_VERSION=1, HS_SERVICE_MARKER=HS_SERVICE_MARKER, DEPLOYMENT_TYPE=MODEL}, macAddress=null, hostConfig=null, stopSignal=null, healthcheck=null, networkingConfig=null}
[INFO ] c.s.d.c.DefaultDockerClient c.s.d.c.DefaultDockerClient.createContainer.635 Creating container with ContainerConfig: ContainerConfig{hostname=null, domainname=null, user=null, attachStdin=null, attachStdout=null, attachStderr=null, portSpecs=null, exposedPorts=[9091], tty=null, openStdin=null, stdinOnce=null, env=[SIDECAR_PORT=8081, SERVICE_ID=1, MODEL_DIR=/model, SIDECAR_HOST=192.168.90.61, APP_PORT=9091], cmd=null, image=hydrosphere/serving-grpc-runtime-spark-2_1:0.0.1, volumes={}, workingDir=null, entrypoint=null, networkDisabled=null, onBuild=null, labels={SERVICE_ID=1, RUNTIME_ID=2, SERVICE_NAME=r2m1e0, HS_SERVICE_MARKER=HS_SERVICE_MARKER, DEPLOYMENT_TYPE=APP}, macAddress=null, hostConfig=HostConfig{binds=null, blkioWeight=null, blkioWeightDevice=null, blkioDeviceReadBps=null, blkioDeviceWriteBps=null, blkioDeviceReadIOps=null, blkioDeviceWriteIOps=null, containerIdFile=null, lxcConf=null, privileged=null, portBindings={9091=[PortBinding{hostIp=0.0.0.0, hostPort=}]}, links=null, publishAllPorts=null, dns=null, dnsOptions=null, dnsSearch=null, extraHosts=null, volumesFrom=[s1modelword2vec], capAdd=null, capDrop=null, networkMode=null, securityOpt=null, devices=null, memory=null, memorySwap=null, memorySwappiness=null, memoryReservation=null, nanoCpus=null, cpuPeriod=null, cpuShares=null, cpusetCpus=null, cpusetMems=null, cpuQuota=null, cgroupParent=null, restartPolicy=null, logConfig=null, ipcMode=null, ulimits=null, pidMode=null, shmSize=null, oomKillDisable=null, oomScoreAdj=null, autoRemove=null, pidsLimit=null, tmpfs=null, readonlyRootfs=null, storageOpt=null}, stopSignal=null, healthcheck=null, networkingConfig=null}
[INFO ] c.s.d.c.DefaultDockerClient c.s.d.c.DefaultDockerClient.startContainer.657 Starting container with Id: 853ad5441d894760c17d2aedea93a011c97da1c93d4a5033afd643bd2df2272b
[ERROR] i.h.s.m.ManagerHttpApi i.h.s.m.ManagerHttpApi$$anonfun$1.applyOrElse.74 empty.head
java.lang.UnsupportedOperationException: empty.head
at scala.collection.immutable.Vector.head(Vector.scala:193) ~[scala-library-2.11.11.jar:?]
at io.hydrosphere.serving.manager.service.ApplicationManagementServiceImpl$$anonfun$inferAppContract$1.apply(ApplicationManagementService.scala:306) ~[classes/:?]
at io.hydrosphere.serving.manager.service.ApplicationManagementServiceImpl$$anonfun$inferAppContract$1.apply(ApplicationManagementService.scala:305) ~[classes/:?]
at scala.util.Success$$anonfun$map$1.apply(Try.scala:237) ~[scala-library-2.11.11.jar:?]
at scala.util.Try$.apply(Try.scala:192) ~[scala-library-2.11.11.jar:?]
at scala.util.Success.map(Try.scala:237) ~[scala-library-2.11.11.jar:?]
at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:237) ~[scala-library-2.11.11.jar:?]
at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:237) ~[scala-library-2.11.11.jar:?]
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36) [scala-library-2.11.11.jar:?]
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55) [akka-actor_2.11-2.5.8.jar:?]
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91) [akka-actor_2.11-2.5.8.jar:?]
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91) [akka-actor_2.11-2.5.8.jar:?]
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91) [akka-actor_2.11-2.5.8.jar:?]
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72) [scala-library-2.11.11.jar:?]
at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90) [akka-actor_2.11-2.5.8.jar:?]
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) [akka-actor_2.11-2.5.8.jar:?]
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:43) [akka-actor_2.11-2.5.8.jar:?]
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [akka-actor_2.11-2.5.8.jar:?]
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [akka-actor_2.11-2.5.8.jar:?]
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [akka-actor_2.11-2.5.8.jar:?]
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [akka-actor_2.11-2.5.8.jar:?]
When I try to recreate the same application:
[ERROR] i.h.s.m.ManagerHttpApi i.h.s.m.ManagerHttpApi$$anonfun$1.applyOrElse.74 ERROR: duplicate key value violates unique constraint "service_service_name_key"
Detail: Key (service_name)=(r2m1e0) already exists.
org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "service_service_name_key"
Detail: Key (service_name)=(r2m1e0) already exists.
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2477) ~[postgresql-42.1.4.jar:42.1.4]
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2190) ~[postgresql-42.1.4.jar:42.1.4]
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:300) ~[postgresql-42.1.4.jar:42.1.4]
at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:428) ~[postgresql-42.1.4.jar:42.1.4]
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:354) ~[postgresql-42.1.4.jar:42.1.4]
at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:169) ~[postgresql-42.1.4.jar:42.1.4]
at org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:136) ~[postgresql-42.1.4.jar:42.1.4]
at com.zaxxer.hikari.pool.ProxyPreparedStatement.executeUpdate(ProxyPreparedStatement.java:61) ~[HikariCP-2.6.3.jar:?]
at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeUpdate(HikariProxyPreparedStatement.java) ~[HikariCP-2.6.3.jar:?]
at slick.jdbc.JdbcActionComponent$InsertActionComposerImpl$SingleInsertAction$$anonfun$run$8.apply(JdbcActionComponent.scala:509) ~[slick_2.11-3.2.1.jar:?]
at slick.jdbc.JdbcActionComponent$InsertActionComposerImpl$SingleInsertAction$$anonfun$run$8.apply(JdbcActionComponent.scala:506) ~[slick_2.11-3.2.1.jar:?]
at slick.jdbc.JdbcBackend$SessionDef$class.withPreparedInsertStatement(JdbcBackend.scala:378) ~[slick_2.11-3.2.1.jar:?]
at slick.jdbc.JdbcBackend$BaseSession.withPreparedInsertStatement(JdbcBackend.scala:433) ~[slick_2.11-3.2.1.jar:?]
at slick.jdbc.JdbcActionComponent$ReturningInsertActionComposerImpl.preparedInsert(JdbcActionComponent.scala:638) ~[slick_2.11-3.2.1.jar:?]
at slick.jdbc.JdbcActionComponent$InsertActionComposerImpl$SingleInsertAction.run(JdbcActionComponent.scala:506) ~[slick_2.11-3.2.1.jar:?]
at slick.jdbc.JdbcActionComponent$SimpleJdbcProfileAction.run(JdbcActionComponent.scala:29) ~[slick_2.11-3.2.1.jar:?]
at slick.jdbc.JdbcActionComponent$SimpleJdbcProfileAction.run(JdbcActionComponent.scala:26) ~[slick_2.11-3.2.1.jar:?]
at slick.basic.BasicBackend$DatabaseDef$$anon$2.liftedTree1$1(BasicBackend.scala:242) ~[slick_2.11-3.2.1.jar:?]
at slick.basic.BasicBackend$DatabaseDef$$anon$2.run(BasicBackend.scala:242) ~[slick_2.11-3.2.1.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_131]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_131]
Return all applications for a specific ModelVersion
Return all signatures for a specific ModelVersion
Return all datatypes
There is an error with the body object of the PUT method. The key value should be renamed from "KafkaStream" to "KafkaStreaming".
When we do a GET request to /api/v1/model, the lastModelVersion field must return the correct modelVersion.
The UI is calling a REST endpoint that is no longer around. It is trying to call ui/v1/model/withInfo. It looks like the UI controller files were removed after build .18. Any ideas on how to get this fixed?
Could you please update readme.md? It provides inconsistent information about how to launch ML Lambda. Your docker-compose only contains postgres+manager, whereas the standalone commands start a few other services too. docker-compose should contain everything, so new users can get up and running quickly.
Also, you should probably use a Docker network instead of exposing everything via the host network (why do you put a HOST_IP environment variable in the docker-compose.yml file?).
One last thing: you could push everything to Docker Hub, so new users are not forced to build everything on their machines.
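A minimal sketch of what a self-contained docker-compose.yml could look like — the service set (postgres, manager, sidecar) and the postgres credentials are taken from the logs above; the sidecar image name and everything else are placeholders, not the project's actual file:

```yaml
# Hypothetical minimal layout; only hydrosphere/serving-manager and the
# postgres credentials appear in the logs, the rest is assumed.
version: "2"
networks:
  hydronet: {}
services:
  postgres:
    image: postgres:9.6
    environment:
      POSTGRES_DB: docker
      POSTGRES_USER: docker
      POSTGRES_PASSWORD: docker
    networks: [hydronet]
  manager:
    image: hydrosphere/serving-manager:latest
    depends_on: [postgres]
    networks: [hydronet]
  sidecar:
    image: hydrosphere/serving-sidecar:latest  # assumed image name
    networks: [hydronet]
```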
After a model update, the API returns that nextVersion is available for all built models.
The code keeps hanging on this line:
package io.hydrosphere.serving.manager.service.clouddriver
val containerApp = map.getOrElse(DEPLOYMENT_TYPE_APP, throw new RuntimeException(s"Can't find APP for service $serviceId in $seq"))
val containerModel = map.get(DEPLOYMENT_TYPE_MODEL)
Here are the logs:
Detected a modification of naivebayes model ...
2018-02-02T02:04:08.904425412Z [ERROR] a.a.OneForOneStrategy a.e.s.Slf4jLogger$$anonfun$receive$1$$anonfun$applyOrElse$1.apply$mcV$sp.69 Can't find APP for service -20 in ArrayBuffer(Container{id=943746c801f22a7327046483fe2e902cd7a140e81d2c30b0b60237668662f6f5, names=[/manager], image=hydrosphere/serving-manager:latest, imageId=sha256:5b64769a1afaec201315228f129c68f9d748d17dcb8892336c1265b6d4995f5a, command=/hydro-serving/app/start.sh, created=1517537030, state=running, status=Up 15 seconds, ports=[PortMapping{privatePort=8080, publicPort=8080, type=tcp, ip=0.0.0.0}, PortMapping{privatePort=8081, publicPort=8081, type=tcp, ip=0.0.0.0}, PortMapping{privatePort=8082, publicPort=8082, type=tcp, ip=0.0.0.0}, PortMapping{privatePort=9090, publicPort=9090, type=tcp, ip=0.0.0.0}, PortMapping{privatePort=9091, publicPort=9091, type=tcp, ip=0.0.0.0}], labels={APP=dev, HS_SERVICE_MARKER=HS_SERVICE_MARKER, MODEL=dev, MODEL_NAME=manager, MODEL_VERSION=latest, RUNTIME_TYPE_NAME=hysroserving-java, RUNTIME_TYPE_VERSION=latest, com.docker.compose.config-hash=4f0db1769543bf239bc78c5d86bec95419c1333fcf20860c1c3d919a4b84c3a3, com.docker.compose.container-number=1, com.docker.compose.oneoff=False, com.docker.compose.project=automation, com.docker.compose.service=manager, com.docker.compose.version=1.18.0, hydroServingServiceId=-20}, sizeRw=null, sizeRootFs=null, networkSettings=NetworkSettings{ipAddress=null, ipPrefixLen=null, gateway=null, bridge=null, portMapping=null, ports={}, macAddress=null, networks={automation_extnet=AttachedNetwork{aliases=null, networkId=691539ea57f31850a906f68150bcfeb7b6df61185ed71fb22fe20a8e1979927d, endpointId=172441d34b4990d656433199d99d1fa8b2ddec1d2cbf1ab05d391939486c975f, gateway=172.18.0.1, ipAddress=172.18.0.3, ipPrefixLen=16, ipv6Gateway=, globalIPv6Address=, globalIPv6PrefixLen=0, macAddress=02:42:ac:12:00:03}, automation_hydronet=AttachedNetwork{aliases=null, networkId=0946e51564a0333f2444f003060769f38bb5fc2cc541e1199a82f3aeb4aefe9a, 
endpointId=edec4e1ab6ef73e279cd22dd9c785554c10ee8a5cd87f0bace72f2d899f30b81, gateway=172.16.0.1, ipAddress=172.16.0.5, ipPrefixLen=24, ipv6Gateway=, globalIPv6Address=, globalIPv6PrefixLen=0, macAddress=02:42:ac:10:00:05}}, endpointId=null, sandboxId=null, sandboxKey=null, hairpinMode=null, linkLocalIPv6Address=null, linkLocalIPv6PrefixLen=null, globalIPv6Address=null, globalIPv6PrefixLen=null, ipv6Gateway=null}, mounts=[ContainerMount{type=bind, name=null, source=/opt/hydro-serving/integrations/automation/hydro-serving-runtime/models, destination=/models, driver=null, mode=rw, rw=true, propagation=rprivate}, ContainerMount{type=bind, name=null, source=/var/run/docker.sock, destination=/var/run/docker.sock, driver=null, mode=rw, rw=true, propagation=rprivate}]})
2018-02-02T02:04:08.904474888Z akka.actor.ActorInitializationException: akka://manager/user/$e: exception during creation
2018-02-02T02:04:08.904478772Z at akka.actor.ActorInitializationException$.apply(Actor.scala:193) ~[akka-actor_2.11-2.5.8.jar:?]
2018-02-02T02:04:08.904481451Z at akka.actor.ActorCell.create(ActorCell.scala:608) ~[akka-actor_2.11-2.5.8.jar:?]
2018-02-02T02:04:08.904483738Z at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462) [akka-actor_2.11-2.5.8.jar:?]
2018-02-02T02:04:08.904485976Z at akka.actor.ActorCell.systemInvoke(ActorCell.scala:484) [akka-actor_2.11-2.5.8.jar:?]
2018-02-02T02:04:08.904488321Z at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:282) [akka-actor_2.11-2.5.8.jar:?]
2018-02-02T02:04:08.904490581Z at akka.dispatch.Mailbox.run(Mailbox.scala:223) [akka-actor_2.11-2.5.8.jar:?]
2018-02-02T02:04:08.904492783Z at akka.dispatch.Mailbox.exec(Mailbox.scala:234) [akka-actor_2.11-2.5.8.jar:?]
2018-02-02T02:04:08.904495002Z at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [akka-actor_2.11-2.5.8.jar:?]
2018-02-02T02:04:08.904497252Z at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [akka-actor_2.11-2.5.8.jar:?]
2018-02-02T02:04:08.904499506Z at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [akka-actor_2.11-2.5.8.jar:?]
2018-02-02T02:04:08.904501736Z at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [akka-actor_2.11-2.5.8.jar:?]
I get this error over and over when trying to start the gateway service.
2018-01-22T05:45:32.966950578Z [ERROR] io.hydrosphere.serving.gateway.actor.PipelineSynchronizeActor Unsupported Content-Type, supported: application/json WARNING arguments left: 1
The current version of the manager has no unified error handling.
It would be better to always return JSON (right now responses are a mix of formats) and to handle errors gracefully.
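As a sketch of what a unified shape could look like (the `ErrorEnvelope` class and its field layout are hypothetical, not the manager's actual API):

```java
// Hypothetical sketch of a uniform JSON error body, so every endpoint
// returns the same shape instead of mixing plain text and JSON.
class ErrorEnvelope {

    // Escape backslashes and quotes so the message embeds safely in JSON.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Map any failure to a JSON body with a uniform shape.
    static String toJson(int code, String message) {
        return "{\"error\":{\"code\":" + code
             + ",\"message\":\"" + escape(message) + "\"}}";
    }
}
```

With an envelope like this, clients can always parse the body as JSON and branch on `error.code`, regardless of which endpoint failed.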
If existing containers are already using reserved names, the manager can't create an application. In addition, the manager creates DB entries for the incorrect containers, which breaks the whole application API.
There is a hanging build in the DB with a running state, but the Future that was actually building it is no longer present. As a result, you can't build a model with that name and version.
We need to handle such cases somehow. Possible solutions:
Consider a pipeline with features[-1] -> indexedFeatures[5] -> prediction
The model version number is stuck at 2.
[INFO ] i.h.s.m.s.m.b.InfoProgressHandler$ i.h.s.m.s.m.b.InfoProgressHandler$.handle.37 ProgressMessage(null,null, ---> 5608e9099050
,null,null,None)
[INFO ] i.h.s.m.s.m.b.InfoProgressHandler$ i.h.s.m.s.m.b.InfoProgressHandler$.handle.37 ProgressMessage(null,null,null,null,null,None)
[INFO ] i.h.s.m.s.m.b.InfoProgressHandler$ i.h.s.m.s.m.b.InfoProgressHandler$.handle.37 ProgressMessage(null,null,Successfully built 5608e9099050
,null,null,None)
[INFO ] i.h.s.m.s.m.b.InfoProgressHandler$ i.h.s.m.s.m.b.InfoProgressHandler$.handle.37 ProgressMessage(null,null,Successfully tagged ks-test:2
,null,null,None)
Third upload
[INFO ] i.h.s.m.s.m.b.InfoProgressHandler$ i.h.s.m.s.m.b.InfoProgressHandler$.handle.37 ProgressMessage(null,null,null,null,null,None)
[INFO ] i.h.s.m.s.m.b.InfoProgressHandler$ i.h.s.m.s.m.b.InfoProgressHandler$.handle.37 ProgressMessage(null,null,Successfully built fcfe8947eb60
,null,null,None)
[INFO ] i.h.s.m.s.m.b.InfoProgressHandler$ i.h.s.m.s.m.b.InfoProgressHandler$.handle.37 ProgressMessage(null,null,Successfully tagged ks-test:2
,null,null,None)
hs upload --host localhost --port 9090
default
hydrosphere/serving-sidecar:latest
[2018-06-08 08:40:55.761][1][info][main] source/server/drain_manager_impl.cc:63] shutting down parent after drain
[2018-06-08 08:42:18.424][1][warning][upstream] source/common/config/grpc_mux_impl.cc:205] gRPC config stream closed: 13,
[2018-06-08 08:42:18.424][1][warning][config] bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_mux_subscription_lib/common/config/grpc_mux_subscription_impl.h:66] gRPC update for type.googleapis.com/envoy.api.v2.RouteConfiguration failed
[2018-06-08 08:42:18.424][1][warning][config] bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_mux_subscription_lib/common/config/grpc_mux_subscription_impl.h:66] gRPC update for type.googleapis.com/envoy.api.v2.Listener failed
[2018-06-08 08:42:18.424][1][warning][config] bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_mux_subscription_lib/common/config/grpc_mux_subscription_impl.h:66] gRPC update for type.googleapis.com/envoy.api.v2.ClusterLoadAssignment failed
[2018-06-08 08:42:18.424][1][warning][config] bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_mux_subscription_lib/common/config/grpc_mux_subscription_impl.h:66] gRPC update for type.googleapis.com/envoy.api.v2.ClusterLoadAssignment failed
[2018-06-08 08:42:18.424][1][warning][config] bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_mux_subscription_lib/common/config/grpc_mux_subscription_impl.h:66] gRPC update for type.googleapis.com/envoy.api.v2.ClusterLoadAssignment failed
[2018-06-08 08:42:18.424][1][warning][config] bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_mux_subscription_lib/common/config/grpc_mux_subscription_impl.h:66] gRPC update for type.googleapis.com/envoy.api.v2.ClusterLoadAssignment failed
[2018-06-08 08:42:18.424][1][warning][config] bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_mux_subscription_lib/common/config/grpc_mux_subscription_impl.h:66] gRPC update for type.googleapis.com/envoy.api.v2.ClusterLoadAssignment failed
[2018-06-08 08:42:18.424][1][warning][config] bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_mux_subscription_lib/common/config/grpc_mux_subscription_impl.h:66] gRPC update for type.googleapis.com/envoy.api.v2.Cluster failed
[2018-06-08 08:48:12.934][1][info][upstream] source/common/upstream/cluster_manager_impl.cc:388] add/update cluster r3m4e0 starting warming
[2018-06-08 08:48:12.936][1][info][upstream] source/common/upstream/cluster_manager_impl.cc:395] warming cluster r3m4e0 complete
[2018-06-08 08:50:51.626][1][info][upstream] source/common/upstream/cluster_manager_impl.cc:437] removing cluster r3m4e0
[2018-06-08 08:51:08.379][1][info][upstream] source/common/upstream/cluster_manager_impl.cc:388] add/update cluster r3m5e0 starting warming
[2018-06-08 08:51:08.382][1][info][upstream] source/common/upstream/cluster_manager_impl.cc:395] warming cluster r3m5e0 complete
[2018-06-08 08:56:18.631][1][info][upstream] source/common/upstream/cluster_manager_impl.cc:437] removing cluster r3m5e0
[2018-06-08 08:56:31.216][1][info][upstream] source/common/upstream/cluster_manager_impl.cc:388] add/update cluster r3m6e0 starting warming
[2018-06-08 08:56:31.218][1][info][upstream] source/common/upstream/cluster_manager_impl.cc:395] warming cluster r3m6e0 complete
[2018-06-08 08:57:12.003][1][info][upstream] source/common/upstream/cluster_manager_impl.cc:437] removing cluster r3m6e0
docker run -e MANAGER_HOST=$HOST_IP -e HOST_PRIVATE_IP=$HOST_IP \
-e MANAGER_PORT=9091 \
-e SERVICE_ID=-20 \
-e SERVICE_NAME="manager" \
-p 8080:8080 -p 8081:8081 -p 8082:8082 \
hydrosphere/serving-sidecar:latest
{
"name": "hydrosphere/serving-runtime-python",
"version": "3.6-latest",
"modelTypes": [
"string"
],
"tags": [
"string"
],
"configParams": {}
}
INFO:PythonRuntimeService:Received inference request: model_spec {
name: "KblK"
signature_name: "ks_test"
}
inputs {
key: "distribution"
value {
dtype: DT_STRING
string_val: "normal"
}
}
inputs {
key: "distributionSample"
value {
dtype: DT_DOUBLE
tensor_shape {
dim {
size: -1
}
}
double_val: 1.0
}
}
inputs {
key: "sample"
value {
dtype: DT_DOUBLE
tensor_shape {
dim {
size: -1
}
}
double_val: 1.0
}
}
INFO:PythonRuntimeService:Answer: outputs {
key: "ksStatistics"
value {
dtype: DT_DOUBLE
double_val: 0.895
}
}
outputs {
key: "pValue"
value {
dtype: DT_DOUBLE
double_val: 0.17860426969764479
}
}
outputs {
key: "rejectionLevel"
value {
dtype: DT_DOUBLE
double_val: 1.3633957605919127
}
}
I was not able to reach the UI due to an unexposed port in serving-manager.
I fixed it locally by adding:
ports:
- "9090:9090"
Building big models may take a while. While a model is building, the ability to update the corresponding application should be disabled.
For maximum CPU utilization, TF can be compiled for the SSE4.2 and AVX instruction sets.
But in this case we must check whether the given EC2 instance supports such a runtime.
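One way to do that check, as a minimal sketch: on Linux, CPU capabilities appear on the "flags" line of /proc/cpuinfo (SSE4.2 shows up as "sse4_2", AVX as "avx"). The `CpuFeatures` helper below is hypothetical; it only parses a flags line that the caller has already read:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical pre-flight check: given the "flags" line from
// /proc/cpuinfo, verify the host CPU supports the instruction sets
// an optimized TF build was compiled for.
class CpuFeatures {
    static boolean supports(String flagsLine, String... required) {
        Set<String> flags = new HashSet<>(Arrays.asList(flagsLine.trim().split("\\s+")));
        return flags.containsAll(Arrays.asList(required));
    }
}
```

If the check fails, the manager could fall back to a generic (non-optimized) runtime image instead of crashing with an illegal-instruction error at inference time.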
In some cases, files aren't copied to the runtime with their directory structure preserved.
In some cases, Spark then detects checksum mismatches in those files.
Can't serve an application if the signature name contains a slash (/),
e.g. tensorflow/serving/predict
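A possible workaround (not the project's confirmed fix) is to percent-encode the signature name before placing it into the serve path, so the router does not split it on '/'. A sketch using the standard `java.net.URLEncoder`:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical workaround: percent-encode the signature name so
// "tensorflow/serving/predict" survives as one path segment
// instead of being split on '/' by the HTTP router.
class SignaturePath {
    static String encode(String signature) {
        // URLEncoder leaves letters, digits, '.', '-', '*', '_' intact
        // and turns '/' into %2F.
        return URLEncoder.encode(signature, StandardCharsets.UTF_8);
    }
}
```

The serving side would then need to decode the segment before looking up the signature.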
Steps to reproduce:
com.spotify.docker.client.exceptions.DockerRequestException:
Request error: POST unix://localhost:80/containers/create?name=binarizer_0-0-1: 409,
body: {"message":"Conflict. The container name \"/binarizer_0-0-1\" is already in use by container \"d60aff912cdce7106249b7fc1d8b5707c398492bd3027d255bfae372bcab851e\".
You have to remove (or rename) that container to be able to reuse that name."}
If the runtime wasn't set correctly when creating an application, it may cause problems with the corresponding Docker container. Invoking that application will likely fail; moreover, the user will not know what happened, as there is no proper warning message. After the invocation, the runtime container will be dropped (removed).
We need to check whether a model with the same version is already building.
It seems that the manager writes the model info to the DB before the container actually starts.
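One way to sketch such a check (the `BuildGuard` class is illustrative, not the manager's actual code): keep an in-process set of name:version pairs that are currently building, and reject a second build of the same pair atomically.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical guard: remember which name:version pairs are currently
// building, so a second upload of the same version is rejected instead
// of racing the first build and leaving a stale DB row behind.
class BuildGuard {
    private static final ConcurrentMap<String, Boolean> building = new ConcurrentHashMap<>();

    // Returns true only for the caller that acquired the build slot.
    static boolean tryStart(String name, String version) {
        return building.putIfAbsent(name + ":" + version, Boolean.TRUE) == null;
    }

    static void finish(String name, String version) {
        building.remove(name + ":" + version);
    }
}
```

The DB row would then be written only after `tryStart` succeeds and the container has actually started, which also addresses the "running state with no Future" case above.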
Failed deploy:
[ERROR] i.h.s.m.ManagerApi i.h.s.m.ManagerApi$$anonfun$1.applyOrElse.86 Request error: POST unix://localhost:80/containers/create?name=ssd_bulat_0.0.1: 409, body: {"message":"Conflict. The container name \"/ssd_bulat_0.0.1\" is already in use by container \"0ece243521fe3312233ce11c0509d547325c0db245a99875da6575bcdf7d985d\". You have to remove (or rename) that container to be able to reuse that name."}
com.spotify.docker.client.exceptions.DockerRequestException: Request error: POST unix://localhost:80/containers/create?name=ssd_bulat_0.0.1: 409, body: {"message":"Conflict. The container name \"/ssd_bulat_0.0.1\" is already in use by container \"0ece243521fe3312233ce11c0509d547325c0db245a99875da6575bcdf7d985d\". You have to remove (or rename) that container to be able to reuse that name."}
at com.spotify.docker.client.DefaultDockerClient.propagate(DefaultDockerClient.java:2503) ~[docker-client-8.8.0.jar:8.8.0]
Failed retry:
[ERROR] i.h.s.m.ManagerApi i.h.s.m.ManagerApi$$anonfun$1.applyOrElse.86 ERROR: duplicate key value violates unique constraint "model_service_service_name_key"
Detail: Key (service_name)=(ssd_bulat_0.0.1) already exists.
org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "model_service_service_name_key"
Detail: Key (service_name)=(ssd_bulat_0.0.1) already exists.
Right now we can get the modelRuntimes for a given model using the /api/v1/modelRuntime/{modelId}/last
endpoint, but it requires a 'maximum' query parameter, and since we don't know how many modelRuntimes exist for the model, we can't set 'maximum' correctly.
We need to either add some kind of pagination to this endpoint, or create a new endpoint that returns all modelRuntimes for a given model.
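A paginated response could look like the following minimal sketch (the `Page` shape with `items`/`total`/`offset` is a hypothetical design, not the existing API): by returning the total count alongside each page, the client never has to guess 'maximum'.

```java
import java.util.List;

// Hypothetical paginated response for the modelRuntime listing:
// a page of items plus the total count and the requested offset.
class Page {
    final List<Integer> items;  // illustrative item type
    final int total;
    final int offset;

    Page(List<Integer> items, int total, int offset) {
        this.items = items;
        this.total = total;
        this.offset = offset;
    }

    // Slice out one page, clamping the bounds so out-of-range
    // offsets yield an empty page instead of an exception.
    static Page of(List<Integer> all, int offset, int limit) {
        int end = Math.min(offset + limit, all.size());
        int start = Math.min(Math.max(offset, 0), end);
        return new Page(all.subList(start, end), all.size(), offset);
    }
}
```

A client can then walk pages with `offset += limit` until `offset >= total`, or fetch everything in one call by passing `limit = total` after the first page.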
Hello, I get an issue when executing "docker-compose up" on the manager project. Can you have a look? Thanks very much.
The logs are as follows:
source/common/upstream/cluster_manager_impl.cc:388] add/update cluster manager-http starting warming
manager | [2018-07-27 08:14:10.570][ERROR] a.a.OneForOneStrategy a.e.s.Slf4jLogger$$anonfun$receive$1.$anonfun$applyOrElse$1.69 null
manager | java.lang.NullPointerException: null
manager | at com.google.protobuf.Utf8.encodedLength(Utf8.java:251) ~[protobuf-java-3.5.1.jar:?]
manager | at com.google.protobuf.CodedOutputStream.computeStringSizeNoTag(CodedOutputStream.java:866) ~[protobuf-java-3.5.1.jar:?]
manager | at com.google.protobuf.CodedOutputStream.computeStringSize(CodedOutputStream.java:626) ~[protobuf-java-3.5.1.jar:?]
manager | at envoy.api.v2.core.SocketAddress.__computeSerializedValue(SocketAddress.scala:42) ~[envoy-data-plane-api_2.12-v1.6.0_1.jar:v1.6.0_1]
manager | at envoy.api.v2.core.SocketAddress.serializedSize(SocketAddress.scala:52) ~[envoy-data-plane-api_2.12-v1.6.0_1.jar:v1.6.0_1]
manager | at envoy.api.v2.core.Address.__computeSerializedValue(Address.scala:20) ~[envoy-data-plane-api_2.12-v1.6.0_1.jar:v1.6.0_1]
manager | at envoy.api.v2.core.Address.serializedSize(Address.scala:27) ~[envoy-data-plane-api_2.12-v1.6.0_1.jar:v1.6.0_1]
manager | at envoy.api.v2.endpoint.Endpoint.__computeSerializedValue(Endpoint.scala:18) ~[envoy-data-plane-api_2.12-v1.6.0_1.jar:v1.6.0_1]
manager | at envoy.api.v2.endpoint.Endpoint.serializedSize(Endpoint.scala:24) ~[envoy-data-plane-api_2.12-v1.6.0_1.jar:v1.6.0_1]
manager | at envoy.api.v2.endpoint.LbEndpoint.__computeSerializedValue(LbEndpoint.scala:49) ~[envoy-data-plane-api_2.12-v1.6.0_1.jar:v1.6.0_1]
manager | at envoy.api.v2.endpoint.LbEndpoint.serializedSize(LbEndpoint.scala:58) ~[envoy-data-plane-api_2.12-v1.6.0_1.jar:v1.6.0_1]
manager | at envoy.api.v2.endpoint.LocalityLbEndpoints.$anonfun$__computeSerializedValue$1(LocalityLbEndpoints.scala:57) ~[envoy-data-plane-api_2.12-v1.6.0_1.jar:v1.6.0_1]
manager | at envoy.api.v2.endpoint.LocalityLbEndpoints.$anonfun$__computeSerializedValue$1$adapted(LocalityLbEndpoints.scala:57) ~[envoy-data-plane-api_2.12-v1.6.0_1.jar:v1.6.0_1]
manager | at scala.collection.Iterator.foreach(Iterator.scala:944) ~[scala-library.jar:?]
manager | at scala.collection.Iterator.foreach$(Iterator.scala:944) ~[scala-library.jar:?]
manager | at scala.collection.AbstractIterator.foreach(Iterator.scala:1432) ~[scala-library.jar:?]
manager | at scala.collection.IterableLike.foreach(IterableLike.scala:71) ~[scala-library.jar:?]
manager | at scala.collection.IterableLike.foreach$(IterableLike.scala:70) ~[scala-library.jar:?]
manager | at scala.collection.AbstractIterable.foreach(Iterable.scala:54) ~[scala-library.jar:?]
manager | at envoy.api.v2.endpoint.LocalityLbEndpoints.__computeSerializedValue(LocalityLbEndpoints.scala:57) ~[envoy-data-plane-api_2.12-v1.6.0_1.jar:v1.6.0_1]
manager | at envoy.api.v2.endpoint.LocalityLbEndpoints.serializedSize(LocalityLbEndpoints.scala:65) ~[envoy-data-plane-api_2.12-v1.6.0_1.jar:v1.6.0_1]
manager | at envoy.api.v2.ClusterLoadAssignment.$anonfun$__computeSerializedValue$1(ClusterLoadAssignment.scala:38) ~[envoy-data-plane-api_2.12-v1.6.0_1.jar:v1.6.0_1]
manager | at envoy.api.v2.ClusterLoadAssignment.$anonfun$__computeSerializedValue$1$adapted(ClusterLoadAssignment.scala:38) ~[envoy-data-plane-api_2.12-v1.6.0_1.jar:v1.6.0_1]
manager | at scala.collection.immutable.List.foreach(List.scala:389) ~[scala-library.jar:?]
manager | at envoy.api.v2.ClusterLoadAssignment.__computeSerializedValue(ClusterLoadAssignment.scala:38) ~[envoy-data-plane-api_2.12-v1.6.0_1.jar:v1.6.0_1]
manager | at envoy.api.v2.ClusterLoadAssignment.serializedSize(ClusterLoadAssignment.scala:45) ~[envoy-data-plane-api_2.12-v1.6.0_1.jar:v1.6.0_1]
manager | at scalapb.GeneratedMessage.toByteString(GeneratedMessageCompanion.scala:140) ~[scalapb-runtime_2.12-0.7.4.jar:0.7.4]
manager | at scalapb.GeneratedMessage.toByteString$(GeneratedMessageCompanion.scala:139) ~[scalapb-runtime_2.12-0.7.4.jar:0.7.4]
manager | at envoy.api.v2.ClusterLoadAssignment.toByteString(ClusterLoadAssignment.scala:28) ~[envoy-data-plane-api_2.12-v1.6.0_1.jar:v1.6.0_1]
manager | at scalapb.AnyCompanionMethods.pack(AnyMethods.scala:35) ~[scalapb-runtime_2.12-0.7.4.jar:0.7.4]
manager | at scalapb.AnyCompanionMethods.pack$(AnyMethods.scala:30) ~[scalapb-runtime_2.12-0.7.4.jar:0.7.4]
manager | at com.google.protobuf.any.Any$.pack(Any.scala:183) ~[scalapb-runtime_2.12-0.7.4.jar:0.7.4]
manager | at scalapb.AnyCompanionMethods.pack(AnyMethods.scala:28) ~[scalapb-runtime_2.12-0.7.4.jar:0.7.4]
manager | at scalapb.AnyCompanionMethods.pack$(AnyMethods.scala:27) ~[scalapb-runtime_2.12-0.7.4.jar:0.7.4]
manager | at com.google.protobuf.any.Any$.pack(Any.scala:183) ~[scalapb-runtime_2.12-0.7.4.jar:0.7.4]
manager | at io.hydrosphere.serving.manager.service.envoy.xds.AbstractDSActor.$anonfun$sendToObserver$1(AbstractDSActor.scala:63) ~[manager.jar:0.0.24-SNAPSHOT]
manager | at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:234) ~[scala-library.jar:?]
manager | at scala.collection.immutable.List.foreach(List.scala:389) ~[scala-library.jar:?]
manager | at scala.collection.TraversableLike.map(TraversableLike.scala:234) ~[scala-library.jar:?]
manager | at scala.collection.TraversableLike.map$(TraversableLike.scala:227) ~[scala-library.jar:?]
manager | at scala.collection.immutable.List.map(List.scala:295) ~[scala-library.jar:?]
manager | at io.hydrosphere.serving.manager.service.envoy.xds.AbstractDSActor.io$hydrosphere$serving$manager$service$envoy$xds$AbstractDSActor$$sendToObserver(AbstractDSActor.scala:63) ~[manager.jar:0.0.24-SNAPSHOT]
manager | at io.hydrosphere.serving.manager.service.envoy.xds.AbstractDSActor$$anonfun$receive$1.applyOrElse(AbstractDSActor.scala:90) ~[manager.jar:0.0.24-SNAPSHOT]
Does it support Cassandra?
@AbhayAg I've removed template text from your message
The backend doesn't delete "kafkaStreaming" after the application update method is called.
I am using the latest code.
manager | [ERROR] i.h.s.m.ManagerHttpApi i.h.s.m.ManagerHttpApi$$anonfun$1.applyOrElse.76 null
manager | java.lang.RuntimeException: null
manager | at io.hydrosphere.serving.manager.service.prometheus.PrometheusMetricsServiceImpl.fetchServices(PrometheusMetricsService.scala:37) ~[manager.jar:latest]
manager | at io.hydrosphere.serving.manager.controller.prometheus.PrometheusMetricsController$$anonfun$getServices$1$$anonfun$apply$1$$anonfun$apply$2.apply(PrometheusMetricsController.scala:30) ~[manager.jar:latest]
manager | at io.hydrosphere.serving.manager.controller.prometheus.PrometheusMetricsController$$anonfun$getServices$1$$anonfun$apply$1$$anonfun$apply$2.apply(PrometheusMetricsController.scala:30) ~[manager.jar:latest]
sidecar | [2018-02-09 02:03:07.572][26][debug][main] source/server/connection_handler_impl.cc:129] [C10] new connection
sidecar | [2018-02-09 02:03:07.572][26][debug][http] source/common/http/conn_manager_impl.cc:181] [C10] new stream
sidecar | [2018-02-09 02:03:07.572][26][debug][http] source/common/http/conn_manager_impl.cc:439] [C10][S4601685897479926094] request headers complete (end_stream=true):
sidecar | [2018-02-09 02:03:07.572][26][debug][http] source/common/http/conn_manager_impl.cc:444] [C10][S4601685897479926094] ':authority':'hd-nd05.campus.utah.edu:8080'
sidecar | [2018-02-09 02:03:07.572][26][debug][http] source/common/http/conn_manager_impl.cc:444] [C10][S4601685897479926094] 'user-agent':'curl/7.52.1'
sidecar | [2018-02-09 02:03:07.572][26][debug][http] source/common/http/conn_manager_impl.cc:444] [C10][S4601685897479926094] 'accept':'/'
sidecar | [2018-02-09 02:03:07.572][26][debug][http] source/common/http/conn_manager_impl.cc:444] [C10][S4601685897479926094] ':path':'/v1/prometheus/services'
sidecar | [2018-02-09 02:03:07.572][26][debug][http] source/common/http/conn_manager_impl.cc:444] [C10][S4601685897479926094] ':method':'GET'
sidecar | [2018-02-09 02:03:07.572][26][debug][router] source/common/router/router.cc:239] [C10][S4601685897479926094] cluster 'manager-http' match for URL '/v1/prometheus/services'
sidecar | [2018-02-09 02:03:07.572][26][debug][router] source/common/router/router.cc:284] [C10][S4601685897479926094] ':authority':'hd-nd05.campus.utah.edu:8080'
sidecar | [2018-02-09 02:03:07.572][26][debug][router] source/common/router/router.cc:284] [C10][S4601685897479926094] 'user-agent':'curl/7.52.1'
sidecar | [2018-02-09 02:03:07.572][26][debug][router] source/common/router/router.cc:284] [C10][S4601685897479926094] 'accept':'/'
sidecar | [2018-02-09 02:03:07.572][26][debug][router] source/common/router/router.cc:284] [C10][S4601685897479926094] ':path':'/v1/prometheus/services'
sidecar | [2018-02-09 02:03:07.572][26][debug][router] source/common/router/router.cc:284] [C10][S4601685897479926094] ':method':'GET'
sidecar | [2018-02-09 02:03:07.572][26][debug][router] source/common/router/router.cc:284] [C10][S4601685897479926094] 'x-forwarded-proto':'http'
sidecar | [2018-02-09 02:03:07.572][26][debug][router] source/common/router/router.cc:284] [C10][S4601685897479926094] 'x-request-id':'35b54b31-9920-9176-9df7-30d0d1c3fc1e'
sidecar | [2018-02-09 02:03:07.572][26][debug][router] source/common/router/router.cc:284] [C10][S4601685897479926094] 'x-b3-traceid':'8e19a689e87492a5'
sidecar | [2018-02-09 02:03:07.572][26][debug][router] source/common/router/router.cc:284] [C10][S4601685897479926094] 'x-b3-spanid':'8e19a689e87492a5'
sidecar | [2018-02-09 02:03:07.572][26][debug][router] source/common/router/router.cc:284] [C10][S4601685897479926094] 'x-b3-sampled':'1'
sidecar | [2018-02-09 02:03:07.572][26][debug][router] source/common/router/router.cc:284] [C10][S4601685897479926094] 'x-ot-span-context':'8e19a689e87492a5;8e19a689e87492a5;0000000000000000'
sidecar | [2018-02-09 02:03:07.572][26][debug][router] source/common/router/router.cc:284] [C10][S4601685897479926094] 'x-envoy-expected-rq-timeout-ms':'15000'
sidecar | [2018-02-09 02:03:07.572][26][debug][router] source/common/router/router.cc:284] [C10][S4601685897479926094] ':scheme':'http'
sidecar | [2018-02-09 02:03:07.572][26][debug][pool] source/common/http/http1/conn_pool.cc:74] [C3] using existing connection
sidecar | [2018-02-09 02:03:07.572][26][debug][router] source/common/router/router.cc:902] [C10][S4601685897479926094] pool ready
sidecar | [2018-02-09 02:03:07.575][26][debug][router] source/common/router/router.cc:553] [C10][S4601685897479926094] upstream headers complete: end_stream=false
sidecar | [2018-02-09 02:03:07.575][26][debug][http] source/common/http/conn_manager_impl.cc:859] [C10][S4601685897479926094] encoding headers via codec (end_stream=false):
sidecar | [2018-02-09 02:03:07.575][26][debug][http] source/common/http/conn_manager_impl.cc:864] [C10][S4601685897479926094] 'server':'envoy'
sidecar | [2018-02-09 02:03:07.575][26][debug][http] source/common/http/conn_manager_impl.cc:864] [C10][S4601685897479926094] 'date':'Fri, 09 Feb 2018 02:03:07 GMT'
sidecar | [2018-02-09 02:03:07.575][26][debug][http] source/common/http/conn_manager_impl.cc:864] [C10][S4601685897479926094] 'content-type':'text/plain; charset=UTF-8'
sidecar | [2018-02-09 02:03:07.575][26][debug][http] source/common/http/conn_manager_impl.cc:864] [C10][S4601685897479926094] 'content-length':'13'
sidecar | [2018-02-09 02:03:07.575][26][debug][http] source/common/http/conn_manager_impl.cc:864] [C10][S4601685897479926094] ':status':'500'
sidecar | [2018-02-09 02:03:07.575][26][debug][http] source/common/http/conn_manager_impl.cc:864] [C10][S4601685897479926094] 'x-envoy-upstream-service-time':'2'