Comments (6)
Great! I'll mark the issue as closed.
from kafka-connect-bigquery.
@ahmedjami how are you starting the connector? And how are you adding it to the Docker image?
Hi @C0urante,
I'm building the connector image with Docker as described below:
FROM confluentinc/cp-kafka-connect:5.4.3
RUN mkdir /usr/kafka-connect-bq/
COPY kcbq-connector-1.6.5.jar /usr/kafka-connect-bq/
Then I'm starting all the components (zookeeper, cp-kafka, kafka-connect and schema-registry) with docker-compose:
version: '2.1'
services:
  zoo1:
    image: zookeeper:3.4.9
    hostname: zoo1
    ports:
      - "2181:2181"
    environment:
      ZOO_MY_ID: 1
      ZOO_PORT: 2181
      ZOO_SERVERS: server.1=zoo1:2888:3888
    volumes:
      - ./helenus-dev/zoo1/data:/data
      - ./helenus-dev/zoo1/datalog:/datalog
  kafka2:
    image: confluentinc/cp-kafka:5.3.1
    hostname: kafka2
    ports:
      - "9092:9092"
      - "29092:29092"
    environment:
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka2:9092,PLAINTEXT_HOST://localhost:29092
      KAFKA_ZOOKEEPER_CONNECT: "zoo1:2181"
      KAFKA_BROKER_ID: 1
      KAFKA_LOG4J_LOGGERS: "kafka.controller=INFO,kafka.producer.async.DefaultEventHandler=INFO,state.change.logger=INFO"
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_DELETE_TOPIC_ENABLE: "true"
    volumes:
      - ./helenus-dev/kafka2/data:/var/lib/kafka/data
    depends_on:
      - zoo1
  schemaregistry:
    image: confluentinc/cp-schema-registry
    hostname: schemaregistry
    ports:
      - "8081:8081"
    environment:
      - "SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=zoo1:2181"
      - "SCHEMA_REGISTRY_KAFKASTORE_SECURITY_PROTOCOL=PLAINTEXT"
      - "SCHEMA_REGISTRY_LISTENERS=http://schemaregistry:8081, http://localhost:8081"
      - "SCHEMA_REGISTRY_HOST_NAME=schemaregistry"
      - "SCHEMA_REGISTRY_DEBUG=true"
    volumes:
      - ./helenus-dev/schemaregistry/data:/var/lib/
    depends_on:
      - kafka2
  connect:
    build:
      context: .
      dockerfile: Dockerfile_kafkaconnect
    depends_on:
      - zoo1
      - kafka2
      - schemaregistry
    ports:
      - "8083:8083"
    environment:
      CONNECT_BOOTSTRAP_SERVERS: "kafka2:9092"
      CONNECT_REST_PORT: 8083
      CONNECT_GROUP_ID: compose-connect-group
      CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
      CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
      CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
      CONNECT_KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: 'http://schemaregistry:8081'
      CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: 'http://schemaregistry:8081'
      CONNECT_INTERNAL_KEY_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      CONNECT_INTERNAL_VALUE_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      CONNECT_REST_ADVERTISED_HOST_NAME: "kafka-connect"
      CONNECT_LOG4J_ROOT_LOGLEVEL: "INFO"
      CONNECT_LOG4J_LOGGERS: "org.apache.kafka.connect.runtime.rest=WARN,org.reflections=ERROR"
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_PLUGIN_PATH: '/usr/kafka-connect-bq'
Once started, Kafka Connect runs fine until it reaches the stage where it discovers and loads the connector, then stops with the following error:
INFO Loading plugin from: /usr/kafka-connect-bq/kcbq-connector-1.6.5.jar (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader)
connect_1 | [2020-11-19 21:49:29,361] ERROR Stopping due to error (org.apache.kafka.connect.cli.ConnectDistributed)
connect_1 | java.lang.NoClassDefFoundError: com/google/cloud/bigquery/BigQuery
I am facing exactly the same problem:
- My Connect version is confluentinc/cp-kafka-connect:5.5.1
- My KCBQ connector is kcbq-connector-2.1.0-SNAPSHOT.jar
- And I mount the connector like this:
CONNECT_PLUGIN_PATH: /usr/share/java,/usr/share/confluent-hub-components,/data/connect-jars
volumes:
  - ${PWD}/data/jars:/data/connect-jars
Thanks @ahmedjami and @jtriq. I believe I know the causes of your issues. There are two things that need to be addressed.
The docs for the plugin.path property shed some light here:
List of paths separated by commas (,) that contain plugins (connectors, converters, transformations). The list should consist of top level directories that include any combination of:
a) directories immediately containing jars with plugins and their dependencies
b) uber-jars with plugins and their dependencies
c) directories immediately containing the package directory structure of classes of plugins and their dependencies
Note: symlinks will be followed to discover dependencies or plugins.
Examples: plugin.path=/usr/local/share/java,/usr/local/share/kafka/plugins,/opt/connectors
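Concretely, option (a) above means one subdirectory per plugin under the plugin-path root. A runnable sketch of that layout, using a temporary directory and the jar name from the Dockerfile above (paths are illustrative):

```shell
# plugin.path points at the top-level directory; each plugin gets its own
# subdirectory that immediately contains its jars.
PLUGIN_PATH=$(mktemp -d)
mkdir -p "$PLUGIN_PATH/kafka-connect-bigquery"
touch "$PLUGIN_PATH/kafka-connect-bigquery/kcbq-connector-1.6.5.jar"
# Show the resulting structure Connect expects to scan.
find "$PLUGIN_PATH" -mindepth 1
```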
The first problem is that your plugin.path appears to be a directory that itself contains the jars for the connector and its dependencies; instead, it should be a directory of directories, each of which contains the jars for one connector and its dependencies. In @ahmedjami's case, I'd suggest altering the RUN mkdir instruction in your Dockerfile to create a /usr/kafka-connect-plugins/kafka-connect-bigquery directory (the names can of course be altered arbitrarily as long as the directory structure is preserved), then modifying the COPY instruction to place the jars for the connector and its dependencies into /usr/kafka-connect-plugins/kafka-connect-bigquery/, and finally, changing the CONNECT_PLUGIN_PATH environment variable to /usr/kafka-connect-plugins (not /usr/kafka-connect-plugins/kafka-connect-bigquery).
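A minimal sketch of the corrected Dockerfile, assuming the connector jar and all of its dependency jars have been collected into a local lib/ directory (an assumed path, not from the original):

```dockerfile
FROM confluentinc/cp-kafka-connect:5.4.3
# One subdirectory per plugin under the plugin-path root.
RUN mkdir -p /usr/kafka-connect-plugins/kafka-connect-bigquery
# Copy the connector jar AND all of its dependency jars (lib/ is illustrative).
COPY lib/*.jar /usr/kafka-connect-plugins/kafka-connect-bigquery/
```

and then set CONNECT_PLUGIN_PATH to /usr/kafka-connect-plugins in the compose file.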
Secondly, it's necessary to copy not only the connector jar but all of its dependencies onto the image as well. If you're building from source on a version that uses Maven, you'll find all of the necessary jars in the kcbq-connector/target/components/packages/wepay-kafka-connect-bigquery-${VERSION}/wepay-kafka-connect-bigquery-${VERSION}/lib/ directory after running mvn package. I can't remember exactly how to do this with the Gradle build, but I think there was a command that built a fat tar which could be extracted and used similarly.
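The copy step for the Maven case can be sketched as below. Since this snippet can't actually run the Maven build, the build output directory is simulated with a temp dir and placeholder jar names (google-cloud-bigquery is a plausible dependency name, given the missing com.google.cloud.bigquery.BigQuery class, not a verified artifact name):

```shell
# Copy every jar from the build's lib/ directory into the plugin subdirectory.
# The connector jar alone is not enough: missing dependency jars are what
# cause the NoClassDefFoundError above.
PLUGINS=$(mktemp -d)/kafka-connect-plugins
LIB=$(mktemp -d)/lib   # stands in for kcbq-connector/target/.../lib/
mkdir -p "$PLUGINS/kafka-connect-bigquery" "$LIB"
touch "$LIB/kcbq-connector-1.6.5.jar" "$LIB/google-cloud-bigquery.jar"
cp "$LIB"/*.jar "$PLUGINS/kafka-connect-bigquery/"
ls "$PLUGINS/kafka-connect-bigquery"
```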
Hope this helps!
@C0urante thanks a lot for your help 👍
Now Kafka Connect and the connector are starting without any errors, and I'll be able to continue with the rest of the procedure.
Cheers