hbase-java-api-example

This is a simple example of using HBase on the Trusted Analytics Platform (TAP).

This application uses the HBase service broker (from TAP) and the HBase Client API to connect to HBase. It performs basic operations, such as:

  • list tables
  • show table description (column families)
  • get the last n rows of a given table
  • get the first n rows of a given table
  • create a table

Once deployed to TAP, it exposes this functionality through the following endpoints:

URL                              Method  Operation
/api/tables                      GET     list the tables
/api/tables                      POST    create new table
/api/tables/{name}               GET     describe details of given table
/api/tables/{name}/head          GET     get first rows of given table
/api/tables/{name}/tail          GET     get last rows of given table
/api/tables/{name}/row           POST    add new value for given row
/api/tables/{name}/row/{rowKey}  GET     get row by given row key

You can use the Swagger UI to work with the service:

http://hbase-reader.{domain.com}/swagger-ui.html
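
Outside of Swagger, a client has to assemble the endpoint paths itself. The following is a minimal sketch of such a helper, using only the JDK (Java 10 or newer). The class name HbaseReaderUris and the host hbase-reader.example.com are assumptions made for illustration; they are not part of this project. Only the paths listed in the endpoint table are used, and no query parameters are assumed.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of the app) that builds URIs for the documented endpoints. */
final class HbaseReaderUris {
    private final String baseUrl;

    HbaseReaderUris(String baseUrl) {
        // Drop a trailing slash so that paths can be appended uniformly
        this.baseUrl = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
    }

    /** GET: list the tables; POST: create a new table. */
    URI tables() { return URI.create(baseUrl + "/api/tables"); }

    /** GET: describe the given table. */
    URI table(String name) { return URI.create(tables() + "/" + encode(name)); }

    /** GET: first rows of the given table. */
    URI head(String name) { return URI.create(table(name) + "/head"); }

    /** GET: last rows of the given table. */
    URI tail(String name) { return URI.create(table(name) + "/tail"); }

    /** POST: add a new value for a row. */
    URI newRow(String name) { return URI.create(table(name) + "/row"); }

    /** GET: fetch a row by its key. */
    URI row(String name, String rowKey) {
        return URI.create(newRow(name) + "/" + encode(rowKey));
    }

    private static String encode(String segment) {
        // URLEncoder produces form encoding, where a space becomes '+';
        // a path segment needs %20 instead
        return URLEncoder.encode(segment, StandardCharsets.UTF_8).replace("+", "%20");
    }

    public static void main(String[] args) {
        // Assumed host, for illustration only
        HbaseReaderUris uris = new HbaseReaderUris("http://hbase-reader.example.com");
        System.out.println(uris.tables());
        System.out.println(uris.row("t1", "row 1"));
    }
}
```

The resulting URI objects can be passed to any HTTP client, for example java.net.http.HttpRequest.newBuilder(uri) on Java 11 and later.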

Under the hood

This is a simple Spring Boot application. The key points of interest are:

  • extracting HBase configuration information (required for the connection; provided by hbase-broker and kerberos-broker)
  • connecting to HBase and authenticating with Kerberos
  • using the HBase client to perform some admin operations (in our case: getting information on tables)
  • using the HBase client to perform some operations on tables (in our case: reading data)

The following sections describe the roles of the brokers and the client API.

HBase broker

The TAP HBase broker provisions a namespace for the user. Once the service instance is bound to an app, the broker also provides configuration information through the VCAP_SERVICES environment variable:

{
  "VCAP_SERVICES": {
    "hbase": [
      {
        "credentials": {
          "HADOOP_CONFIG_KEY": { 
            ...
            "hbase.zookeeper.property.clientPort": "2181",
            "hbase.zookeeper.quorum": "cdh-master-0.node.server.com,cdh-master-1.node.server.com,cdh-master-2.node.server.com",
            ...
          },
          "hbase.namespace": "2bd6c4db32236dd4a33d19f8ef76257b4a69ff1b",
          ...
        },
        "label": "hbase",
        "name": "hbase1",
        "plan": "bare",
        "tags": []
      }
   ]
   ...

Essential fragments here are:

  • name key - the service instance name
  • credentials section - crucial configuration information, including:
    • zookeeper settings (required to connect to HBase)
    • hbase.namespace key - the namespace created for the user
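
HBase addresses a table outside the default namespace as namespace:table. How this application reads and applies the hbase.namespace value is not documented here (see the TO DO section), so the following is only a sketch of one way the value could be used, assuming it has already been extracted from the binding. The class name NamespacedTableNames and the table name users are hypothetical.

```java
/** Hypothetical helper (not part of the app) that qualifies table names with a namespace. */
final class NamespacedTableNames {
    private NamespacedTableNames() { }

    /**
     * Returns "namespace:table", the form HBase uses for fully qualified table names.
     * With the HBase client, the two parts could instead be passed to
     * TableName.valueOf(namespace, table).
     */
    static String qualify(String namespace, String table) {
        if (table == null || table.isEmpty()) {
            throw new IllegalArgumentException("table name must not be empty");
        }
        if (namespace == null || namespace.isEmpty()) {
            // No namespace given: leave the name unqualified (default namespace)
            return table;
        }
        return namespace + ":" + table;
    }

    public static void main(String[] args) {
        // Namespace value taken from the example binding above; "users" is a made-up table
        String namespace = "2bd6c4db32236dd4a33d19f8ef76257b4a69ff1b";
        System.out.println(qualify(namespace, "users"));
    }
}
```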

Kerberos broker

In TAP, Kerberos credentials can be obtained from kerberos-broker. After you create a service instance and bind it to an application, the following information is available:

  "kerberos": [
   {
    "credentials": {
     "enabled": true,
     "kcacert": "...",
     "kdc": "...",
     "kpassword": "...",
     "krealm": "...",
     "kuser": "..."
    },
    "label": "kerberos",
    "name": "kerberos-instance",
    "plan": "shared",
    "tags": [
     "kerberos"
    ]
   }
  ]

Connecting to HBase

TAP provides the hadoop-utils library, which contains many useful utilities. For example, connecting to HBase boils down to:

    Hbase.newInstance().createConnection().connect();

hadoop-utils takes care of the configuration and authentication (it reads the data from the HBase and Kerberos service bindings).

HBase Java API (1.1.2)

The HBase project provides a Java client API.

If you want to use the API in your Maven project, the corresponding dependency is:

<dependency>
	<groupId>org.apache.hbase</groupId>
	<artifactId>hbase-client</artifactId>
	<version>1.1.2</version>
</dependency>

("org.apache.hbase:hbase-client:1.1.2" for Gradle).

In our case, we depend on hadoop-utils instead, which brings in all the required dependencies:

<dependency>
	<groupId>org.trustedanalytics</groupId>
	<artifactId>hadoop-utils</artifactId>
	<version>0.6.5</version>
</dependency>

("org.trustedanalytics:hadoop-utils:0.6.5" for Gradle)

You'll find the javadocs here: https://hbase.apache.org/apidocs/index.html

The API allows interaction with HBase for both DDL (administrative tasks such as table creation and deletion) and DML (importing and querying data).

This sample application shows some examples of these operations.

Row get

    Result r = null;
    // Connection and Table are both Closeable, so try-with-resources releases them
    try (Connection connection = hBaseConnectionFactory.connect();
         Table table = connection.getTable(TableName.valueOf(name))) {
        Get get = new Get(Bytes.toBytes(rowKey));
        r = table.get(get);
    } catch (org.apache.hadoop.hbase.TableNotFoundException e) {
        // Translate the HBase exception into the app's own TableNotFoundException
        throw new TableNotFoundException(name);
    } catch (IOException e) {
        LOG.error("Error while talking to HBase.", e);
    }

Table scan

Get the first 10 rows of a given table (by name):

    List<RowValue> result = new ArrayList<>();
    try (Connection connection = hBaseConnectionFactory.connect();
         Table table = connection.getTable(TableName.valueOf(name))) {

        Scan scan = new Scan();
        // PageFilter is applied separately on each region server, so a scan that
        // spans several regions can return more than 10 rows; the loop enforces the limit
        scan.setFilter(new PageFilter(10));

        try (ResultScanner rs = table.getScanner(scan)) {
            for (Result r = rs.next(); r != null && result.size() < 10; r = rs.next()) {
                // conversionsService.constructRowValue is a helper method (defined in the app)
                result.add(conversionsService.constructRowValue(r));
            }
        }
    }

Admin API usage

Fetch the list of tables:

    List<TableDescription> result = null;
    try (Connection connection = hBaseConnectionFactory.connect();
         Admin admin = connection.getAdmin()) {
        HTableDescriptor[] tables = admin.listTables();

        Stream<HTableDescriptor> tableDescriptorsStream = Arrays.stream(tables);

        // conversionsService.constructTableDescription is a helper method (defined in the app)
        result = tableDescriptorsStream.map(conversionsService::constructTableDescription)
                .collect(Collectors.toList());
    } catch (IOException e) {
        LOG.error("Error while talking to HBase.", e);
    }

Of course, obtaining a new connection for every operation is costly: connecting to ZooKeeper and then to HBase takes time. In a real application, you would probably reuse HBase connections.
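
One way to reuse a connection is to create it lazily once and hand the same instance to every caller. The sketch below shows that pattern with a generic holder built only on the JDK; the class name SharedResource is hypothetical and is not part of this application or of hadoop-utils. In the app, the factory could be something like () -> hBaseConnectionFactory.connect(), which is an assumption about how it would be wired in. The HBase javadoc describes Connection as heavyweight to create and thread-safe, which is what makes sharing it reasonable.

```java
import java.util.concurrent.Callable;

/** Hypothetical holder that creates one shared resource lazily and reuses it. */
final class SharedResource<T extends AutoCloseable> implements AutoCloseable {
    private final Callable<T> factory;
    private T instance; // guarded by "this"

    SharedResource(Callable<T> factory) {
        this.factory = factory;
    }

    /** Returns the shared instance, creating it on first use. */
    synchronized T get() throws Exception {
        if (instance == null) {
            instance = factory.call();
        }
        return instance;
    }

    /** Closes the shared instance, if any; a later get() creates a fresh one. */
    @Override
    public synchronized void close() throws Exception {
        if (instance != null) {
            T toClose = instance;
            instance = null;
            toClose.close();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for an expensive connection; counts how often it is created
        int[] created = {0};
        try (SharedResource<AutoCloseable> shared =
                 new SharedResource<AutoCloseable>(() -> { created[0]++; return () -> { }; })) {
            AutoCloseable first = shared.get();
            AutoCloseable second = shared.get();
            System.out.println("same instance: " + (first == second));
            System.out.println("times created: " + created[0]);
        }
    }
}
```

With this approach, request handlers would close only the short-lived Table and Admin objects and leave the shared connection open until the application shuts down.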

Compiling and deploying the example

Manual deployment

App deployment is described in detail on the Platform Wiki: Getting Started Guide.

The procedure boils down to the following steps. After cloning the repository, compile the project with:

./gradlew clean check assemble

(Optional) To update the license headers, use:

./gradlew licenseFormatMain

Before deploying (which is done with cf push), make sure an HBase service instance is available to you.

Also note that building the project with gradlew assemble generates the application manifest file from the template src/cloudfoundry/manifest.yml and copies it into the project root folder.

If you have not already done so, create an instance of the HBase service:

cf create-service hbase bare hbase1

To use this instance, either add it to manifest.yml or bind it to the app through the CLI.

If you have not already done so, create an instance of the Kerberos service:

cf create-service kerberos shared kerberos-instance

You can define the bindings in the services section of the app's manifest file:

---
applications:
- name: hbase-reader
  memory: 1G
  instances: 1
  host: hbase-reader
  path: build/libs/hbase-rest-0.0.2.jar
  services:
      - hbase1
      - kerberos-instance

A sample manifest is provided in this project for your convenience. Modify it to suit your needs (application name, service name, etc.). For example, src/main/resources/application-cloud.properties uses the HBase service name in some keys, so adjust the properties file to match.

After this you are ready to push your application to the platform:

cf push

If you plan to bind the HBase and Kerberos instances to an application that is already running, you can do so with the following commands:

cf bind-service hbase-reader hbase1

cf bind-service hbase-reader kerberos-instance


cf restage hbase-reader

Automated deployment

  • Switch to deploy directory: cd deploy
  • Install tox: sudo -E pip install --upgrade tox
  • Run: tox
  • Activate virtualenv with installed dependencies: . .tox/py27/bin/activate
  • Run the deployment script: python deploy.py, providing the required parameters (run python deploy.py -h to list the parameters with their descriptions).

TO DO

  • update the info about the namespace and service name in application.properties; describe how the namespace is read/used.

Contributors

karol-brejna-i, pgrabusz
