Comments (18)
Fail to describe collection
Could you offer the full client-side message and also the server-side log? We can't see any details in the error message you offered from Milvus.
Hi @xiaofan-luan, below please find the client-side stack trace. I'll see if I can get a server-side log.
java.lang.Exception: Fail to describe collection
---------------------------------------------------------------------------
Py4JJavaError Traceback (most recent call last)
File <command-3658777111610979>, line 88
78 client.upsert(COLLECTION_NAME, {"doc_id": row[0], "vector": row[1]})
80 # -- Read data from Milvus using the Spark connector --
81 df = spark.read.format("milvus") \
82 .option("milvus.host", "milvus-standalone.alpha.k8s.dev.303net.net") \
83 .option("milvus.port", "50051") \
84 .option("milvus.collection.name", COLLECTION_NAME) \
85 .option("milvus.collection.vectorField", "vector") \
86 .option("milvus.collection.vectorDim", str(NUM_DIMENSIONS)) \
87 .option("milvus.collection.primaryKeyField", "doc_id") \
---> 88 .load() \
89 .select("doc_id", "vector")
91 df.show(n=20, truncate=False)
File /databricks/spark/python/pyspark/instrumentation_utils.py:48, in _wrap_function.<locals>.wrapper(*args, **kwargs)
46 start = time.perf_counter()
47 try:
---> 48 res = func(*args, **kwargs)
49 logger.log_success(
50 module_name, class_name, function_name, time.perf_counter() - start, signature
51 )
52 return res
File /databricks/spark/python/pyspark/sql/readwriter.py:314, in DataFrameReader.load(self, path, format, schema, **options)
312 return self._df(self._jreader.load(self._spark._sc._jvm.PythonUtils.toSeq(path)))
313 else:
--> 314 return self._df(self._jreader.load())
File /databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py:1355, in JavaMember.__call__(self, *args)
1349 command = proto.CALL_COMMAND_NAME +\
1350 self.command_header +\
1351 args_command +\
1352 proto.END_COMMAND_PART
1354 answer = self.gateway_client.send_command(command)
-> 1355 return_value = get_return_value(
1356 answer, self.gateway_client, self.target_id, self.name)
1358 for temp_arg in temp_args:
1359 if hasattr(temp_arg, "_detach"):
File /databricks/spark/python/pyspark/errors/exceptions/captured.py:188, in capture_sql_exception.<locals>.deco(*a, **kw)
186 def deco(*a: Any, **kw: Any) -> Any:
187 try:
--> 188 return f(*a, **kw)
189 except Py4JJavaError as e:
190 converted = convert_exception(e.java_exception)
File /databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/protocol.py:326, in get_return_value(answer, gateway_client, target_id, name)
324 value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
325 if answer[1] == REFERENCE_TYPE:
--> 326 raise Py4JJavaError(
327 "An error occurred while calling {0}{1}{2}.\n".
328 format(target_id, ".", name), value)
329 else:
330 raise Py4JError(
331 "An error occurred while calling {0}{1}{2}. Trace:\n{3}\n".
332 format(target_id, ".", name, value))
Py4JJavaError: An error occurred while calling o943.load.
: java.lang.Exception: Fail to describe collection
at zilliztech.spark.milvus.MilvusCollection.initCollectionMeta(MilvusCollection.scala:57)
at zilliztech.spark.milvus.MilvusCollection.milvusCollectionMeta$lzycompute(MilvusCollection.scala:22)
at zilliztech.spark.milvus.MilvusCollection.milvusCollectionMeta(MilvusCollection.scala:22)
at zilliztech.spark.milvus.MilvusCollection.collectionSchema$lzycompute(MilvusCollection.scala:24)
at zilliztech.spark.milvus.MilvusCollection.collectionSchema(MilvusCollection.scala:24)
at zilliztech.spark.milvus.MilvusCollection.schema(MilvusCollection.scala:35)
at zilliztech.spark.milvus.Milvus.inferSchema(Milvus.scala:127)
at org.apache.spark.sql.execution.datasources.v2.DataSourceV2Utils$.getTableFromProvider(DataSourceV2Utils.scala:91)
at org.apache.spark.sql.execution.datasources.v2.DataSourceV2Utils$.loadV2Source(DataSourceV2Utils.scala:138)
at org.apache.spark.sql.DataFrameReader.$anonfun$load$1(DataFrameReader.scala:333)
at scala.Option.flatMap(Option.scala:271)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:331)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:226)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:397)
at py4j.Gateway.invoke(Gateway.java:306)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:199)
at py4j.ClientServerConnection.run(ClientServerConnection.java:119)
at java.lang.Thread.run(Thread.java:750)
Caused by: io.grpc.StatusRuntimeException: INTERNAL: http2 exception
at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:271)
at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:252)
at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:165)
at io.milvus.grpc.MilvusServiceGrpc$MilvusServiceBlockingStub.describeCollection(MilvusServiceGrpc.java:4421)
at io.milvus.client.AbstractMilvusGrpcClient.describeCollection(AbstractMilvusGrpcClient.java:691)
at io.milvus.client.MilvusServiceClient.lambda$describeCollection$9(MilvusServiceClient.java:343)
at io.milvus.client.MilvusServiceClient.retry(MilvusServiceClient.java:259)
at io.milvus.client.MilvusServiceClient.describeCollection(MilvusServiceClient.java:343)
at zilliztech.spark.milvus.MilvusCollection.initCollectionMeta(MilvusCollection.scala:54)
... 24 more
Caused by: io.netty.handler.codec.http2.Http2Exception: First received frame was not SETTINGS. Hex dump for first 5 bytes: 485454502f
at io.netty.handler.codec.http2.Http2Exception.connectionError(Http2Exception.java:109)
at io.netty.handler.codec.http2.Http2ConnectionHandler$PrefaceDecoder.verifyFirstFrameIsSettings(Http2ConnectionHandler.java:353)
at io.netty.handler.codec.http2.Http2ConnectionHandler$PrefaceDecoder.decode(Http2ConnectionHandler.java:247)
at io.netty.handler.codec.http2.Http2ConnectionHandler.decode(Http2ConnectionHandler.java:453)
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:529)
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:468)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
... 1 more
@xiaofan-luan this is something about HTTP vs HTTP2.
These seem similar:
grpc/grpc-java#2905
https://stackoverflow.com/questions/73331402/grpc-client-http2exception-first-received-frame-was-not-settings-hex-dump-fo
Any idea? Might this be something to do with dependency libraries? protobuf, grpc... ?
The server-side logs are clean, no errors.
serverside-logs-last-50-lines.txt
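As a side note, the hex dump in the Http2Exception decodes to plain ASCII, which shows what the server actually sent back. A quick sketch (plain Python, nothing Milvus-specific):

```python
# The gRPC client expects the first HTTP/2 frame from the server to be
# SETTINGS; instead it received these five bytes.
first_bytes = bytes.fromhex("485454502f")
print(first_bytes.decode("ascii"))  # -> HTTP/
# "HTTP/" is the start of an HTTP/1.x status line (e.g. "HTTP/1.1 400 ..."),
# i.e. the endpoint answered with plain HTTP/1.x rather than HTTP/2.
```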
I think this is not a bug. It might be because you enabled TLS on the server side but connected without TLS, or something similar.
/assign @dgoldenberg-ias
please share some info about how you config tls or deploy milvus?
/unassign
Why is "milvus.port" set to "50051"?
Is 50051 mapped to port 19530 of Milvus?
Hi, responding to your comments, @yanliang567, @yhmo:
- We did not configure TLS or do anything explicit related to it.
- Regarding the ports, here's the explanation from our devops: "50051 is the default port for the grpc client; we're forwarding it to 19530. We restrict the 19530 port from the client side."
  Note that 50051 generally works from the client side when using pymilvus, e.g. the MilvusClient.
- How the standalone Milvus is currently configured: I'm attaching the values yaml file with all the settings.
Generally, I think there may be a classpath issue whereby grpc is getting http1 dependency jars loaded ahead of http2 dependencies. Can you provide a list of http2 dependencies you think should be there? or http1 jars we could check for on the classpath?
Of note, the code is running as a python notebook in Databricks, with the spark_milvus_1_0_0_SNAPSHOT.jar installed into the Libraries of the Spark cluster.
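If it helps, here is a rough way we could check for the suspected classpath issue from the notebook. The helper below is hypothetical (not part of any Milvus API); it just scans a Java classpath string for netty/grpc codec jars, and the commented-out lines show how the driver JVM's classpath can be read via py4j in a Databricks/PySpark session:

```python
def find_suspect_jars(classpath: str) -> list:
    """Return classpath entries that mention netty or grpc HTTP codecs."""
    keywords = ("netty-codec-http", "grpc-netty", "grpc-core")
    return [
        entry
        for entry in classpath.split(":")
        if any(k in entry for k in keywords)
    ]

# In the Databricks notebook, the driver JVM's classpath can be read via py4j:
#   cp = spark.sparkContext._jvm.java.lang.System.getProperty("java.class.path")
#   print("\n".join(find_suspect_jars(cp)))
# Duplicate or version-mismatched netty-codec-http2 / grpc-netty entries
# would support the classpath theory.
```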
Caused by: io.netty.handler.codec.http2.Http2Exception: First received frame was not SETTINGS. Hex dump for first 5 bytes: 485454502f
I don't quite understand that, since this seems to be very related to the network.
All the dependencies are listed in the Maven POM: https://github.com/milvus-io/milvus-sdk-java
Trying the latest 2.3.5 might help, since we upgraded the grpc and netty versions.
I think you mean upgrading to 2.4.0, right? We're actually on 2.3.12, just checked.
-- Side note --
I've re-run my test on Databricks 12.2 LTS (which includes Apache Spark 3.3.2, Scala 2.12); 3.3.2 is the version your Spark connector is built for. Spark is all the way up to 3.5.1 now, and Databricks supports Spark up to 3.5.0.
I get the same error with Spark 3.3.2. Just thinking that you may want to build the connector for various Spark versions, or maybe upgrade the code to the latest Spark. The Spark APIs have changed. For instance, if I do:
<!-- <spark.version>3.3.2</spark.version> -->
<spark.version>3.4.1</spark.version>
I get a build error:
[ERROR] /Users/dgoldenberg/tech/milvus/spark-milvus/src/main/java/org/apache/spark/sql/execution/datasources/parquet/MVectorizedParquetRecordReader.java:[268,43] incompatible types: org.apache.spark.sql.execution.vectorized.WritableColumnVector cannot be converted to org.apache.spark.sql.execution.vectorized.ConstantColumnVector
No, I mean the SDK version.
Use Milvus-SDK-java 2.3.5.
Not sure how this is applicable. We're developing using Python. We're using PyMilvus 2.4.0.
How/where would the Milvus Java SDK come into the picture?
It seems that the Spark connector is using the Java connector.
Could you try to ping your Milvus cluster from the Spark cluster and check whether the network is available?
"seems that spark connector is using java connector."
Milvus's Spark connector is written in Java; there is no separate Spark connector that uses a Java connector. There is one JAR, the Milvus Spark connector, and it's in Java, hence the Java stack trace.
"could you try to ping your milvus cluster from spark cluster and check if network is avalaible?"
We don't need to do that. Look at the code: we first establish a connection to Milvus using the pymilvus MilvusClient, then create a collection and feed several documents into it. All of that succeeds, which means the network is fine, the port is correct, and Milvus Standalone is working.
Let's establish that and get past these points, please.
What's not working is the Spark connector. My guess is that there is some classpath issue where HTTP1 dependency libraries are being loaded instead of HTTP2.
Agreed on the analysis. Is there any suggestion on how to verify the problem?
And are we sure that pymilvus and Spark are running in the same network environment?
Might grpc/grpc-java#2905 be the issue?
Hi @xiaofan-luan, I don't know if that is "the issue". It seems to me that at least for the Milvus Spark connector, HTTP2 is expected. Is that true? This doesn't seem to be documented but seems likely. Then it'd follow that we need to enable TLS.
There is no mention of it in https://milvus.io/docs/integrate_with_spark.md.
I'm thinking we'd then want to follow https://milvus.io/docs/tls.md.
If this is what's happening, I think it'd definitely be worth mentioning in the Spark-Milvus Connector User Guide.
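One way to check what is actually listening on that port, independent of any client library, is to send the HTTP/2 client connection preface over a raw socket and look at the reply. This is a sketch; the `probe` helper and its classification strings are hypothetical, and the host/port in the example come from the stack trace above:

```python
import socket

# 24-byte HTTP/2 client connection preface (RFC 7540, section 3.5).
H2_PREFACE = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"

def classify_first_bytes(first: bytes) -> str:
    """Guess the kind of endpoint from the first bytes of its reply."""
    if first.startswith(b"HTTP/"):
        # Matches the 485454502f hex dump from the exception: a plain
        # HTTP/1.x endpoint typically answers the preface with an error line.
        return "plaintext HTTP/1.x"
    if not first:
        return "no reply (endpoint may require TLS)"
    return "something else (possibly HTTP/2)"

def probe(host: str, port: int) -> str:
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(H2_PREFACE)
        return classify_first_bytes(sock.recv(5))

# e.g. probe("milvus-standalone.alpha.k8s.dev.303net.net", 50051)
```

If this reports "plaintext HTTP/1.x", the port is being answered by something speaking HTTP/1.x (for instance a proxy in the forwarding path), which would reproduce exactly the "First received frame was not SETTINGS" failure.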