Comments (12)
#626 is definitely related to this. The Dataflow PubSub setup is extremely tricky to get right. Dataflow creates a boatload of CloudBigtableSingleTableWriteFn instances, each with its own buffered mutator. Apparently, there are also cases where the WriteFn will not be cleaned up by Dataflow.
I'll work on this ASAP.
from java-bigtable-hbase.
I was hoping to finish everything by EOD today, but I still have a couple of things to do. I fixed some of the issues. #635 should be a decent bandaid, for now.
from java-bigtable-hbase.
I just deployed a new snapshot with a boatload of changes linked from this issue. Can you please take it for a test drive?
from java-bigtable-hbase.
Took the current 0.2.3-SNAPSHOT for a spin and things look a lot better. Thread count is high at ~7k but doesn't seem to be growing.
Did a jstack of the DF worker java process. Here's a rough breakdown of the thread counts:
- 3600 bigtable-connection-shared-executor-poolY-tX threads
- 1800 bigtable-grpc-elg-X threads
- 2100 reconnection-async-close-X threads
- ~300 threads blocking on loading hbase/bigtable configuration, all with the below stack trace:
"Thread-313" #350 daemon prio=1 os_prio=0 tid=0x00007f93482b7000 nid=0x193 waiting for monitor entry [0x00007f920a7e3000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.util.zip.ZipFile.getEntry(ZipFile.java:308)
- waiting to lock <0x0000000083309e70> (a java.util.jar.JarFile)
at java.util.jar.JarFile.getEntry(JarFile.java:240)
at java.util.jar.JarFile.getJarEntry(JarFile.java:223)
at sun.misc.URLClassPath$JarLoader.getResource(URLClassPath.java:1005)
at sun.misc.URLClassPath$JarLoader.findResource(URLClassPath.java:983)
at sun.misc.URLClassPath$1.next(URLClassPath.java:240)
at sun.misc.URLClassPath$1.hasMoreElements(URLClassPath.java:250)
at java.net.URLClassLoader$3$1.run(URLClassLoader.java:601)
at java.net.URLClassLoader$3$1.run(URLClassLoader.java:599)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader$3.next(URLClassLoader.java:598)
at java.net.URLClassLoader$3.hasMoreElements(URLClassLoader.java:623)
at sun.misc.CompoundEnumeration.next(CompoundEnumeration.java:45)
at sun.misc.CompoundEnumeration.hasMoreElements(CompoundEnumeration.java:54)
at java.util.ServiceLoader$LazyIterator.hasNextService(ServiceLoader.java:354)
at java.util.ServiceLoader$LazyIterator.hasNext(ServiceLoader.java:393)
at java.util.ServiceLoader$1.hasNext(ServiceLoader.java:474)
at javax.xml.parsers.FactoryFinder$1.run(FactoryFinder.java:293)
at java.security.AccessController.doPrivileged(Native Method)
at javax.xml.parsers.FactoryFinder.findServiceProvider(FactoryFinder.java:289)
at javax.xml.parsers.FactoryFinder.find(FactoryFinder.java:267)
at javax.xml.parsers.DocumentBuilderFactory.newInstance(DocumentBuilderFactory.java:120)
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2218)
at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2195)
at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2112)
- locked <0x00000007145de8c8> (a org.apache.hadoop.conf.Configuration)
at org.apache.hadoop.conf.Configuration.set(Configuration.java:989)
at org.apache.hadoop.conf.Configuration.set(Configuration.java:961)
at com.google.cloud.bigtable.dataflow.CloudBigtableConfiguration.toHBaseConfig(CloudBigtableConfiguration.java:182)
at com.google.cloud.bigtable.dataflow.AbstractCloudBigtableTableDoFn.getConnection(AbstractCloudBigtableTableDoFn.java:49)
- eliminated <0x00000007145dba80> (a com.google.cloud.bigtable.dataflow.CloudBigtableIO$CloudBigtableSingleTableWriteFn)
at com.google.cloud.bigtable.dataflow.CloudBigtableIO$CloudBigtableSingleTableWriteFn.getBufferedMutator(CloudBigtableIO.java:615)
- locked <0x00000007145dba80> (a com.google.cloud.bigtable.dataflow.CloudBigtableIO$CloudBigtableSingleTableWriteFn)
at com.google.cloud.bigtable.dataflow.CloudBigtableIO$CloudBigtableSingleTableWriteFn.processElement(CloudBigtableIO.java:640)
at com.google.cloud.dataflow.sdk.util.DoFnRunner.invokeProcessElement(DoFnRunner.java:189)
at com.google.cloud.dataflow.sdk.util.DoFnRunner.processElement(DoFnRunner.java:171)
...
from java-bigtable-hbase.
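A breakdown like the one above can be produced by grouping the thread headers in a jstack dump by name. The following is only a minimal sketch, not the method the commenter used: the class name `ThreadNameTally` and the sample input are hypothetical, and it collapses every run of digits to `X`, so `pool3-t12` becomes `poolX-tX` rather than the `poolY-tX` shorthand used in the comment.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: counts threads in jstack output, grouped by name pattern.
public class ThreadNameTally {
    // A jstack thread header begins with the quoted thread name, e.g.
    // "Thread-313" #350 daemon prio=1 ...
    private static final Pattern HEADER = Pattern.compile("^\"([^\"]+)\"");

    /** Replaces each run of digits with X so numbered threads fall into one group. */
    public static String normalize(String threadName) {
        return threadName.replaceAll("\\d+", "X");
    }

    /** Counts thread headers in a jstack dump, keyed by normalized thread name. */
    public static Map<String, Integer> tally(String jstackOutput) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : jstackOutput.split("\\R")) {
            Matcher m = HEADER.matcher(line);
            if (m.find()) {
                counts.merge(normalize(m.group(1)), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample input; a real dump would come from `jstack <pid>`.
        String sample = String.join("\n",
            "\"bigtable-grpc-elg-1\" #10 daemon prio=5",
            "   java.lang.Thread.State: RUNNABLE",
            "\"bigtable-grpc-elg-2\" #11 daemon prio=5",
            "\"reconnection-async-close-7\" #12 daemon prio=5",
            "\"Thread-313\" #350 daemon prio=1");
        tally(sample).forEach((name, count) -> System.out.println(count + " " + name));
    }
}
```

Stack frames and lock lines are skipped because they do not start with a double quote, so only one count is recorded per thread.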
That's a lot of connections. We have 2 reconnection-async-close threads per connection. It's not good that you have over 1,000 open connections at once. I did some work to reduce the number of Connections to 1 for all writes on a single VM. I see you're doing writes, are you also doing reads? Do you open your own Connections?
from java-bigtable-hbase.
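One common way to get down to a single connection per VM is a reference-counted cache that every writer in the JVM goes through. The sketch below illustrates that general idea only; it is not the library's actual implementation, and `RefCountedCache` and `FakeConnection` are hypothetical names standing in for the real connection types.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: share one resource per key and close it when the last user releases it.
public class RefCountedCache<K, V extends AutoCloseable> {
    private static final class Entry<V> {
        final V value;
        int refs;
        Entry(V value) { this.value = value; }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final Function<K, V> factory;

    public RefCountedCache(Function<K, V> factory) {
        this.factory = factory;
    }

    /** Returns the shared value for the key, creating it on first use. */
    public synchronized V acquire(K key) {
        Entry<V> entry = entries.get(key);
        if (entry == null) {
            entry = new Entry<>(factory.apply(key));
            entries.put(key, entry);
        }
        entry.refs++;
        return entry.value;
    }

    /** Drops one reference; closes and evicts the value when no references remain. */
    public synchronized void release(K key) {
        Entry<V> entry = entries.get(key);
        if (entry == null) {
            return;
        }
        if (--entry.refs == 0) {
            entries.remove(key);
            try {
                entry.value.close();
            } catch (Exception e) {
                throw new IllegalStateException("Failed to close shared resource", e);
            }
        }
    }

    /** Stand-in for a real connection; records only whether it was closed. */
    public static final class FakeConnection implements AutoCloseable {
        public final String target;
        public boolean closed;
        public FakeConnection(String target) { this.target = target; }
        @Override public void close() { closed = true; }
    }

    public static void main(String[] args) {
        RefCountedCache<String, FakeConnection> cache = new RefCountedCache<>(FakeConnection::new);
        FakeConnection a = cache.acquire("my-table");
        FakeConnection b = cache.acquire("my-table");
        System.out.println(a == b);   // both writers share one connection
        cache.release("my-table");
        cache.release("my-table");
        System.out.println(a.closed); // closed after the last release
    }
}
```

With this shape, each writer would call `acquire` when it starts and `release` when it finishes, so the number of open connections tracks the number of distinct keys rather than the number of writers.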
I'll see if I can fix the Configuration issue. We shouldn't be reading XML files from Dataflow.
from java-bigtable-hbase.
We're only doing writes, applying a CloudBigtableIO.writeToTable(cbtConfig) to a PCollection<Mutation> of Put operations, pretty much exactly like in this example. We're not opening our own connections.
from java-bigtable-hbase.
Thanks for your patience with this. I just built a -SNAPSHOT that should fix the issue related to this problem:
java.lang.Thread.State: BLOCKED (on object monitor)
at java.util.zip.ZipFile.getEntry(ZipFile.java:308)
- waiting to lock <0x0000000083309e70> (a java.util.jar.JarFile)
at java.util.jar.JarFile.getEntry(JarFile.java:240)
at java.util.jar.JarFile.getJarEntry(JarFile.java:223)
at sun.misc.URLClassPath$JarLoader.getResource(URLClassPath.java:1005)
...
I'm not sure why we're creating so many connections. I'd like to figure out why that's happening.
In the meantime, I'll put in a fix to share threading resources across connections, since the thread pools auto-expand anyway.
from java-bigtable-hbase.
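Sharing threading resources across connections can be pictured as handing every connection the same on-demand thread pool instead of letting each one build its own. This is a rough sketch of that idea using only `java.util.concurrent`; `SharedExecutorDemo`, `FakeConnection`, and the thread name prefix are hypothetical and do not reflect the library's real classes.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: every connection in the JVM borrows one shared executor.
public class SharedExecutorDemo {
    private static final AtomicInteger THREAD_ID = new AtomicInteger();

    // Daemon threads so the shared pool never keeps the JVM alive on its own.
    private static final ThreadFactory DAEMON_FACTORY = runnable -> {
        Thread thread = new Thread(runnable, "shared-executor-t" + THREAD_ID.incrementAndGet());
        thread.setDaemon(true);
        return thread;
    };

    // A cached pool grows on demand and reclaims idle threads, so one instance
    // can serve many connections without a fixed per-connection thread budget.
    private static final ExecutorService SHARED = Executors.newCachedThreadPool(DAEMON_FACTORY);

    /** Stand-in for a connection that needs an executor for its background work. */
    public static final class FakeConnection {
        private final ExecutorService executor;
        FakeConnection(ExecutorService executor) { this.executor = executor; }
        public ExecutorService executor() { return executor; }
    }

    /** Opens a connection backed by the JVM-wide shared executor. */
    public static FakeConnection openConnection() {
        return new FakeConnection(SHARED);
    }

    public static void main(String[] args) throws Exception {
        FakeConnection a = openConnection();
        FakeConnection b = openConnection();
        System.out.println(a.executor() == b.executor());

        Callable<String> task = () -> Thread.currentThread().getName();
        System.out.println(b.executor().submit(task).get());
    }
}
```

The trade-off is that individual connections must not shut the executor down when they close, since other connections may still be using it.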
Are you able to reproduce the issue of many connections being created?
from java-bigtable-hbase.
Other users were able to reproduce the problem in Dataflow with a Pub/Sub source. I have not tried that specific scenario yet. I'm working with a test that creates a whole bunch of Connections to see the effects. We have some performance testing as well with a lot of connections. I think that I'm going to address the underlying issues first, and then look at the Dataflow components.
Do you have any objections to that?
from java-bigtable-hbase.
Sounds good to me 👍
from java-bigtable-hbase.
This was fixed a while ago. Closing.
from java-bigtable-hbase.