Comments (16)

NikolaBorisov commented on August 17, 2024

@manolama We are also hitting this issue. In the HBase RegionServer log we start getting this:

2019-02-14 22:33:29,653 WARN org.apache.hadoop.hbase.regionserver.RSRpcServices: Client tried to access missing scanner 0

It is printed over and over until the RegionServer runs out of memory and dies because of these requests. If I understand correctly, this happens because OpenTSDB keeps retrying the same requests. Sadly, this brings down our whole OpenTSDB/HBase cluster, because it eventually happens to the RegionServer hosting the .META region. Can you please advise us on how to downgrade, and to which version? We use HBase 2.0.0.

from asynchbase.

NikolaBorisov commented on August 17, 2024

One more comment: I think having this merge request merged is better than the current state. It is better to fail the user request than to loop forever on retries that have no chance of success and take down the whole cluster.

xuming01 commented on August 17, 2024

In addition, I use asynchbase 1.8.2 and HBase 2.1.0.

manolama commented on August 17, 2024

Yeah let's increment the count and change it to NonRecoverableException. Could you issue a PR please?

xuming01 commented on August 17, 2024

> Yeah let's increment the count and change it to NonRecoverableException. Could you issue a PR please?

OK, I will do that.

seanlook commented on August 17, 2024

This looks like a severe bug in OpenTSDB, since I am using the latest version. I will downgrade OpenTSDB for now.

manolama commented on August 17, 2024

The fix for this one will be a bit more involved if we want it to be recoverable/retry-able.
@seanlook you can also just downgrade the HBase client for OpenTSDB without downgrading the entire package if you like.

clinta commented on August 17, 2024

We've also encountered this on HBase 2.1.2. We've tried running with this patch, and it is certainly an improvement: the tsdb servers no longer kill our entire HBase cluster. However, the tsdb servers still stop processing requests and have to be restarted manually.

I've also tried downgrading to asynchbase 1.8.1, but that does not work with HBase 2.

NikolaBorisov commented on August 17, 2024

From reading the code, I think the

rpc.attempt++;

is a good idea anyway, since we never want infinite retries. @manolama Can you give us some guidance on how to build OpenTSDB with a custom patched version of asynchbase?
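
To illustrate why counting attempts matters, here is a minimal, self-contained sketch of a bounded retry loop. It is not the asynchbase code: the `Rpc` and `NonRecoverableException` classes below are simplified stand-ins, and `MAX_ATTEMPTS`, `callWithRetry`, and the cap of 10 are assumptions made up for the example. The point is only that once every failure increments the counter, a request that can never succeed ends in an error for the caller instead of looping forever.

```java
import java.util.concurrent.Callable;

class BoundedRetry {
    /** Simplified stand-in for an error that retrying can never fix. */
    static class NonRecoverableException extends RuntimeException {
        NonRecoverableException(String msg, Throwable cause) {
            super(msg, cause);
        }
    }

    /** Simplified stand-in for an RPC; it only tracks how often it was tried. */
    static class Rpc {
        int attempt = 0;
    }

    /** Assumed retry cap, chosen only for this example. */
    static final int MAX_ATTEMPTS = 10;

    /** Runs the call, retrying on failure until the attempt cap is reached. */
    static <T> T callWithRetry(Rpc rpc, Callable<T> call) {
        while (true) {
            try {
                return call.call();
            } catch (Exception e) {
                // Without this increment the loop below would never terminate
                // for a call that always fails.
                rpc.attempt++;
                if (rpc.attempt >= MAX_ATTEMPTS) {
                    throw new NonRecoverableException(
                        "giving up after " + rpc.attempt + " attempts", e);
                }
            }
        }
    }

    public static void main(String[] args) {
        Rpc rpc = new Rpc();
        try {
            // A call that always fails, like a scan against a scanner that no longer exists.
            callWithRetry(rpc, () -> {
                throw new IllegalStateException("missing scanner");
            });
        } catch (NonRecoverableException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A call that fails a couple of times and then succeeds still returns normally; only a call that keeps failing hits the cap and surfaces the error to the user.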

openaphid commented on August 17, 2024
  1. Clone the OpenTSDB source code.
  2. Upload the custom patched asynchbase jar to a web server.
  3. Create a file containing the MD5 checksum of asynchbase..jar and put it under third_party/hbase/.
  4. Edit third_party/hbase/include.mk and update ASYNCHBASE_VERSION and ASYNCHBASE_BASE_URL.
  5. Run ./build.sh.
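
For step 3, any MD5 tool will do. If you only have a JDK at hand, a small helper like the following sketch prints the hex digest of a file. The class name `Md5OfFile` and the method `md5Hex` are made up for this example. What the checksum file should be named and exactly what it should contain are not covered here, so compare with the checksum files already present under third_party/ before relying on this.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Md5OfFile {
    /** Returns the lowercase hex MD5 digest of the given file. */
    static String md5Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        // Stream the file in chunks so large jars are not loaded into memory at once.
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java Md5OfFile <path-to-jar>");
            return;
        }
        System.out.println(md5Hex(Paths.get(args[0])));
    }
}
```

Run it against the patched jar and copy the printed digest into the checksum file from step 3.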

dilip-devaraj commented on August 17, 2024

@manolama
Are there any issues with the two changes in this PR, given that it still has not been merged?

We are using HBase 0.98 and saw the UnknownScannerException problem when running tsdb 2.4 with asynchbase 1.8.2. We then switched back to tsdb 2.2 and asynchbase 1.7.1, and the problem went away. Since we still needed some features of tsdb 2.4, we tried running tsdb 2.4 with the older asynchbase 1.7.1; however, we still see some UnknownScannerException errors.
Is it safe to use tsdb 2.4 with asynchbase 1.8.2 and the above custom patch?

tgwk commented on August 17, 2024

Just wanted to provide some feedback on this one: we were having major issues for a while. After long investigations and trying a few other things (including adding health checks and auto-restarts by moving to K8s, which helped as a workaround, but we were still getting outages), we ended up applying the patch in PR 202, and that basically solved our issues. Our OpenTSDBs are now much healthier.
(We run HBase 1.1.2 at the moment.)

1256040466zy commented on August 17, 2024

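> From reading the code, I think the
>
> rpc.attempt++;
>
> is a good idea anyway, since we never want infinite retries. @manolama Can you give us some guidance on how to build opentsdb with a custom patched version of asynchbase?

May I ask where the class containing this source code is?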

joshnorell commented on August 17, 2024

I too seem to be noticing this bug in OpenTSDB 2.4 with AsyncHBase 1.8.2. I have tried unsuccessfully to compile AsyncHBase with this change. I believe some operations in the code are incompatible with my JDK version (11), and I am not able to downgrade. Can someone please provide the jar file with this modification? Thank you.

iamgd67 commented on August 17, 2024

This seems related to change 061ec3.
Prior to that change, UnknownScannerException was not retried, so 1.8.1 should be good.

RuralHunter commented on August 17, 2024

Is this project dead? Why is this problem still not fixed?
