Comments (4)
The format of compressed blocks is described here
https://github.com/yandex/ClickHouse/blob/master/dbms/include/DB/IO/CompressedStream.h
The only checksum currently supported is CityHash128, and I haven't found a well-known Java implementation of it.
As far as I can see, checksum verification can be turned off on the server by setting the property http_native_compression_disable_checksumming_on_decompress.
So I think it is possible to support decompress=1 even without a checksum.
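To make the role of the checksum concrete, here is a minimal sketch of reading the header of one compressed block. The layout is an assumption based on my reading of the linked CompressedStream.h (16-byte CityHash128 checksum, then a 1-byte method, a 4-byte compressed size that includes the 9-byte header, and a 4-byte uncompressed size, all little-endian); please check it against the header file. The class name CompressedBlockHeaderSketch and the constant 0x82 for LZ4 are likewise assumptions, and the checksum is only skipped here, not verified.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of the driver. Layout is assumed, see the note above.
final class CompressedBlockHeaderSketch {
    static final int CHECKSUM_SIZE = 16; // assumed: CityHash128 of header + compressed data
    static final int HEADER_SIZE = 9;    // assumed: method (1) + compressed size (4) + uncompressed size (4)
    static final int METHOD_LZ4 = 0x82;  // assumed method byte for LZ4

    final int method;
    final long compressedSize;   // assumed to include the 9-byte header
    final long uncompressedSize;

    private CompressedBlockHeaderSketch(int method, long compressedSize, long uncompressedSize) {
        this.method = method;
        this.compressedSize = compressedSize;
        this.uncompressedSize = uncompressedSize;
    }

    /** Parses the header fields of a single block; the checksum bytes are skipped, not checked. */
    static CompressedBlockHeaderSketch parse(byte[] block) {
        if (block.length < CHECKSUM_SIZE + HEADER_SIZE) {
            throw new IllegalArgumentException("block too short: " + block.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(block).order(ByteOrder.LITTLE_ENDIAN);
        buf.position(CHECKSUM_SIZE);                       // skip the 16 checksum bytes
        int method = buf.get() & 0xff;                     // unsigned byte
        long compressedSize = buf.getInt() & 0xffffffffL;  // unsigned 32-bit
        long uncompressedSize = buf.getInt() & 0xffffffffL;
        return new CompressedBlockHeaderSketch(method, compressedSize, uncompressedSize);
    }

    public static void main(String[] args) {
        // Build a fake block: zeroed checksum, LZ4 method, 3 payload bytes.
        ByteBuffer b = ByteBuffer.allocate(28).order(ByteOrder.LITTLE_ENDIAN);
        b.position(CHECKSUM_SIZE);
        b.put((byte) METHOD_LZ4).putInt(HEADER_SIZE + 3).putInt(5).put(new byte[] {1, 2, 3});
        CompressedBlockHeaderSketch h = parse(b.array());
        System.out.println("method=0x" + Integer.toHexString(h.method)
                + " compressed=" + h.compressedSize
                + " uncompressed=" + h.uncompressedSize);
    }
}
```

With checksum verification disabled on the server, a client would only need to produce these header fields correctly; the 16 checksum bytes would still occupy their place in the block.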
Also, as I understand it, gzip compression of the HTTP POST content can be turned on. Currently it is turned off here:
https://github.com/yandex/clickhouse-jdbc/blob/master/src/main/java/ru/yandex/clickhouse/util/ClickHouseHttpClientBuilder.java#L39
It is turned off mostly because compression of SELECT results is already supported and lz4 is faster.
It should be possible to turn this on in the driver configuration, but I haven't tested this.
from clickhouse-java.
Thanks for the pointers!
After poking around a bit more, it seems it is possible to enable gzip compression for the (streamed) POST body like this:
ClickHouseStatementImpl stat = (ClickHouseStatementImpl)connection.createClickHouseStatement();
stat.sendStream(new GzipCompressingEntity(new InputStreamEntity(bais, -1)), "INSERT INTO " + tableForStreaming);
This is an 'almost no hackery' approach and doesn't require changes to the ClickHouse JDBC driver code.
Nevertheless, it would be great if you could add better support for compression in the future (as you mentioned, lz4 is probably faster, so it is probably best to implement ClickHouse-style compression at the driver level).
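For readers who want to see what gzip wrapping does to the request body, here is a minimal sketch that uses only java.util.zip. It is not the code of GzipCompressingEntity or of the driver; the class name GzipBodySketch and the tab-separated sample rows are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration only.
final class GzipBodySketch {
    /** Gzip-compresses raw bytes, roughly what a gzip-encoded POST body carries. */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray(); // read after close so the gzip trailer is included
    }

    /** Reverses gzip(), as the receiving side would before parsing the rows. */
    static byte[] gunzip(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gz.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Made-up rows in a tab-separated layout.
        StringBuilder rows = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            rows.append(i).append('\t').append("value").append('\n');
        }
        byte[] raw = rows.toString().getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(raw);
        System.out.println("raw bytes: " + raw.length + ", gzip bytes: " + packed.length);
    }
}
```

The snippet in the comment above streams the data instead of buffering it in memory, which is preferable for large inserts; the buffered version here is only meant to show the transformation.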
from clickhouse-java.
Merged a basic variant of this support to master; it will be in the next release. It compresses data and works according to my tests, but I haven't evaluated performance yet.
The property is disabled by default for now, and the checksum is neither calculated nor checked on the server side.
Some parameters, such as the default compression buffer size and the compressor option, may still change.
from clickhouse-java.
Added checksumming, so this should be the last part of the issue. Decompress is still disabled by default for now.
from clickhouse-java.