
aws-athena-hive-metastore's Introduction

The source code contains the reference implementation. It is a Maven project with the following modules.

  • hms-service-api: the APIs between the Lambda function and the Athena service clients, which are defined in the HiveMetaStoreService interface. Because this is a service contract, please don’t change anything in this module.
  • hms-lambda-handler: a set of default Lambda handlers that process each Hive metastore API call. The class MetadataHandler is the dispatcher for all API calls. Customers don’t need to change this package either.
  • hms-lambda-layer: a Maven assembly project that packages hms-service-api, hms-lambda-handler, and their dependencies into a zip file, so that the zip file can be registered as a Lambda layer and used by multiple Lambda functions.
  • hms-lambda-func: an example Lambda function, where
    • HiveMetaStoreLambdaFunc: the example Lambda function, which simply extends MetadataHandler.
    • ThriftHiveMetaStoreClient: a Thrift client that communicates with the Hive metastore. This client is written for Hive 2.3.0. For other Hive versions, customers might need to update this class to make sure the response objects are compatible.
    • ThriftHiveMetaStoreClientFactory: controls the behavior of the Lambda function. For example, customers can provide their own set of HandlerProviders by overriding the getHandlerProvider() method.
    • hms.properties: the Lambda function configuration. Most likely customers only need to update the following two properties:
      • hive.metastore.uris: the URIs of the Hive metastore, for example, thrift://ip-172-31-11-81.ec2.internal:9083
      • hive.metastore.response.spill.location: the S3 location where response objects are stored when their size exceeds a given threshold, for example, 4MB. The threshold is defined in the property “hive.metastore.response.spill.threshold”, but we don’t recommend that customers change the default value.
      • These two properties can be overridden by Lambda environment variables (https://docs.aws.amazon.com/lambda/latest/dg/env_variables.html), so customers don’t need to recompile the source code for different Lambda functions with different properties.
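The override behavior described for hms.properties can be pictured with a small, self-contained sketch. It is not the project's actual configuration code. The class name HmsConfigSketch, the environment variable names EXAMPLE_HMS_URIS and EXAMPLE_SPILL_LOCATION, and the bucket s3://example-bucket/hms-spill are all assumptions made for illustration. Only the three property keys and the example Thrift URI come from the text above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Illustrative only: a hypothetical helper, not a class from this project.
class HmsConfigSketch {
    // Property keys named in this README.
    static final String URIS = "hive.metastore.uris";
    static final String SPILL_LOCATION = "hive.metastore.response.spill.location";
    static final String SPILL_THRESHOLD = "hive.metastore.response.spill.threshold";

    /**
     * Returns the environment value when the variable mapped to the key is set
     * and non-empty; otherwise falls back to the value from the properties file.
     */
    static String resolve(Properties fileProps, Map<String, String> env,
                          Map<String, String> envNames, String key) {
        String envName = envNames.get(key);
        String fromEnv = (envName == null) ? null : env.get(envName);
        if (fromEnv != null && !fromEnv.isEmpty()) {
            return fromEnv;
        }
        return fileProps.getProperty(key);
    }

    /** True when a response is larger than the threshold and would be spilled to S3. */
    static boolean shouldSpill(long responseBytes, long thresholdBytes) {
        return responseBytes > thresholdBytes;
    }

    public static void main(String[] args) {
        // Values that would normally be loaded from hms.properties.
        Properties fileProps = new Properties();
        fileProps.setProperty(URIS, "thrift://ip-172-31-11-81.ec2.internal:9083");
        fileProps.setProperty(SPILL_LOCATION, "s3://example-bucket/hms-spill"); // hypothetical bucket

        // Hypothetical environment variable names; check the project source for the real ones.
        Map<String, String> envNames = new HashMap<>();
        envNames.put(URIS, "EXAMPLE_HMS_URIS");
        envNames.put(SPILL_LOCATION, "EXAMPLE_SPILL_LOCATION");

        Map<String, String> env = System.getenv();
        System.out.println(URIS + " = " + resolve(fileProps, env, envNames, URIS));
        System.out.println(SPILL_LOCATION + " = " + resolve(fileProps, env, envNames, SPILL_LOCATION));

        // 4MB is the example threshold mentioned above; a 5MB response would be spilled.
        long threshold = 4L * 1024 * 1024;
        System.out.println("spill 5MB response: " + shouldSpill(5L * 1024 * 1024, threshold));
    }
}
```

With a lookup order like this, the same jar can be deployed as several Lambda functions that point at different metastores, with only the environment variables differing.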

Customers can choose to update the source code and build the artifacts from scratch. To do that, they need to have Apache Maven (https://maven.apache.org/) installed and then run the command “mvn install”. This generates the layer zip file in the output folder called “target” in the module hms-lambda-layer, and the Lambda function jar in the module hms-lambda-func. Before building the artifacts, customers need to update the two properties hive.metastore.uris and hive.metastore.response.spill.location in the file hms.properties in the hms-lambda-func module.

The artifacts consist of the following files:

  • hms-lambda-func-1.0-withdep.jar: an example Lambda function with all runtime dependencies. This jar can be used on its own to define a Lambda function.
  • hms-lambda-layer-1.0-athena.zip: the runtime library for Lambda functions as a Lambda layer (https://docs.aws.amazon.com/lambda/latest/dg/configuration-layers.html).
  • hms-lambda-func-1.0.jar: an example lightweight Lambda function that relies on the layer to provide its runtime dependencies.

aws-athena-hive-metastore's People

Contributors

walterjx, dependabot[bot], chrlso, jongn, tommu180, johnfangjz, amazon-auto
