go-drill's Introduction

go-drill

go-drill is a highly efficient pure Go client and SQL driver for Apache Drill and Dremio. It differs from other clients and drivers by communicating over the native Protobuf API instead of the REST API. The use of Protobuf enables zero-copy access to the returned data, resulting in greater efficiency.

Currently, the driver can be used either without authentication or with authentication via SASL gssapi-krb-5.

In typical use, the driver is initialized with a list of ZooKeeper hosts so that it can locate drillbits. It is also possible to connect directly to a drillbit via the client.

Install

Client

go get -u github.com/factset/go-drill

Driver

go get -u github.com/factset/go-drill/driver

Usage

The driver can be used like any other Go database/sql driver:

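package main

import (
  "database/sql"
  "log"
  "strings"

  _ "github.com/factset/go-drill/driver"
)

func main() {
  props := []string{
    "zk=zookeeper1,zookeeper2,zookeeper3",
    "auth=kerberos",
    "service=<krb_service_name>",
    "cluster=<clustername>",
  }

  // sql.Open may only validate its arguments; a connection is opened when first needed
  db, err := sql.Open("drill", strings.Join(props, ";"))
  if err != nil {
    log.Fatal(err)
  }
  defer db.Close()
}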
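
The connection string is a list of key=value properties joined with ";". As a minimal sketch of that format, the hypothetical helper buildDSN below (not part of go-drill) assembles such a string. The zk and cluster keys come from the example above; the host and port keys are an assumption taken from a user report rather than from documented behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// buildDSN is a hypothetical helper, not part of go-drill. It joins
// key/value pairs into "key=value;key=value" in the order given.
func buildDSN(props ...[2]string) string {
	parts := make([]string, 0, len(props))
	for _, p := range props {
		parts = append(parts, p[0]+"="+p[1])
	}
	return strings.Join(parts, ";")
}

func main() {
	// ZooKeeper-based discovery, using keys from the example above.
	zk := buildDSN(
		[2]string{"zk", "zookeeper1,zookeeper2,zookeeper3"},
		[2]string{"cluster", "<clustername>"},
	)
	fmt.Println(zk)

	// Direct connection to one drillbit; host and port keys are assumed.
	direct := buildDSN(
		[2]string{"host", "localhost"},
		[2]string{"port", "31010"},
	)
	fmt.Println(direct)
}
```

The resulting string would be passed as the second argument to sql.Open("drill", ...).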

Alternatively, you can use the client directly:

package main

import (
  "context"
  "log"

  "github.com/factset/go-drill"
)

func main() {
  // create the client; this does not connect yet
  cl := drill.NewClient(drill.Options{/* fill out options */}, "zookeeper1", "zookeeper2", "zookeeper3")

  // connect the client; err is nil on success and describes the failure otherwise
  if err := cl.Connect(context.Background()); err != nil {
    log.Fatal(err)
  }
}

Developing

Refreshing the Protobuf Definitions

A command is provided to easily refresh the protobuf definitions, provided you have protoc already on your PATH. The source should be in a directory structure like .../github.com/factset/go-drill/ for development, allowing usage of go generate which will run the command.

Alternatively, the provided command drillProto can be used manually via go run ./internal/cmd/drillProto from the root of the source directory.

$ go run ./internal/cmd/drillProto -h
Drill Proto.

Usage:
        drillProto -h | --help
        drillProto download [-o PATH]
        drillProto fixup [-o PATH]
        drillProto gen [-o PATH] ROOTPATH
        drillProto runall [-o PATH] ROOTPATH

Arguments:
        ROOTPATH  location of the root output for the generated .go files

Options:
        -h --help           Show this screen.
        -o PATH --out PATH  .proto destination path [default: protobuf]

drillProto download downloads the .proto files from the Apache Drill GitHub repository to the specified path.

drillProto fixup adds the option go_package = "github.com/factset/go-drill/internal/rpc/proto/..." to each file.

drillProto gen will generate the .pb.go files from the protobuf files, using the provided ROOTPATH as the root output where it will write the files in the structure of <ROOTPATH>/github.com/factset/go-drill/internal/rpc/proto/....

drillProto runall does all of the steps in order as one command.

Regenerate the data vector handling

Running go generate ./internal/data will regenerate the .gen.go files from their templates.

go-drill's People

Contributors

dialtr, zeroshade

go-drill's Issues

SetMaxOpenConns > zookeepers

I'm running into an issue when I set SetMaxOpenConns to something greater than the number of zookeeper nodes available.

I'm not sure whether this is a bug or intentional, though my understanding was that I could create multiple connections to a single cluster.

I believe the error occurs here because eindex is out of range.

return newClient, newClient.ConnectEndpoint(ctx, zook.GetEndpoint(newClient.drillBits[eindex]))
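
One possible workaround, assuming the limit really is tied to the number of hosts in the zk property (which is unconfirmed), is to cap the pool size before calling SetMaxOpenConns. The helpers zkHostCount and capConns below are hypothetical and not part of go-drill; this is only a sketch of the idea.

```go
package main

import (
	"fmt"
	"strings"
)

// zkHostCount returns the number of comma-separated hosts in the zk
// property of a connection string, or 0 if there is no zk property.
// Hypothetical helper, not part of go-drill.
func zkHostCount(dsn string) int {
	for _, prop := range strings.Split(dsn, ";") {
		key, val, ok := strings.Cut(prop, "=")
		if ok && strings.TrimSpace(key) == "zk" && val != "" {
			return len(strings.Split(val, ","))
		}
	}
	return 0
}

// capConns limits requested to hosts when the host count is known.
func capConns(requested, hosts int) int {
	if hosts > 0 && requested > hosts {
		return hosts
	}
	return requested
}

func main() {
	dsn := "zk=localhost:2181,localhost:2182,localhost:2183;cluster=drillbits1;schema=s3"
	limit := capConns(6, zkHostCount(dsn))
	fmt.Println(limit) // prints 3; pass this value to db.SetMaxOpenConns
}
```

This only avoids the panic; it does not answer whether more connections than hosts should be supported.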

Example

package main

import (
	"database/sql"
	"fmt"
	"log"
	"strings"
	"sync"

	_ "github.com/factset/go-drill/driver"
)

type AWS struct {
	key    string
	secret string
}

var maxConnections int = 6

func ConnectDrill() (*sql.DB, error) {
	props := []string{
		"zk=localhost:2181,localhost:2182,localhost:2183",
		"cluster=drillbits1",
		"schema=s3",
	}

	db, err := sql.Open("drill", strings.Join(props, ";"))
	if err != nil {
		return nil, err
	}
	if err = db.Ping(); err != nil {
		return nil, err
	}

	db.SetMaxOpenConns(maxConnections)

	return db, nil
}

func query(con *sql.DB, wg *sync.WaitGroup) {
	defer wg.Done()
	rows, err := con.Query("SELECT * FROM sys.drillbits")

	if err != nil {
		log.Panic(err)
	}

	for rows.Next() {
		rows.Scan()
	}
	fmt.Println("queried")
}

func main() {
	var wg sync.WaitGroup

	con, _ := ConnectDrill()

	wg.Add(10)
	go query(con, &wg)
	go query(con, &wg)
	go query(con, &wg)
	go query(con, &wg)
	go query(con, &wg)
	go query(con, &wg)
	go query(con, &wg)
	go query(con, &wg)
	go query(con, &wg)
	go query(con, &wg)

	wg.Wait()
}

goroutine 75 [running]:
github.com/factset/go-drill.(*Client).NewConnection(0xc0000f1500, {0xa9c7f0, 0xc000024110})
	/home/john/go/pkg/mod/github.com/factset/[email protected]/client.go:173 +0x475
github.com/JohnCoene/go-drill/driver.(*connector).Connect(0x0?, {0xa9c7f0?, 0xc000024110?})
	/home/john/Golang/src/github.com/JohnCoene/go-drill/driver/connector.go:19 +0x2f
database/sql.(*DB).conn(0xc00020cd00, {0xa9c7f0, 0xc000024110}, 0x1)
	/usr/local/go/src/database/sql/sql.go:1395 +0x782
database/sql.(*DB).query(0x0?, {0xa9c7f0, 0xc000024110}, {0x9c8f44, 0x1b}, {0x0, 0x0, 0x0}, 0x0?)
	/usr/local/go/src/database/sql/sql.go:1732 +0x5d
database/sql.(*DB).QueryContext(0x0?, {0xa9c7f0, 0xc000024110}, {0x9c8f44, 0x1b}, {0x0, 0x0, 0x0})
	/usr/local/go/src/database/sql/sql.go:1710 +0xda
database/sql.(*DB).Query(...)
	/usr/local/go/src/database/sql/sql.go:1728
main.query(0x0?, 0x0?)
	/home/john/Golang/src/github.com/JohnCoene/go-drill/cmd/main.go:42 +0x8c
created by main.main
	/home/john/Golang/src/github.com/JohnCoene/go-drill/cmd/main.go:66 +0x299
exit status 2

shell returned 1

And thanks for making this package!

PrepareContext and Prepare calls hang instead of returning a statement handle

Hi Team,

I am new to Apache Drill and am currently using the latest FactSet Drill client library to run prepared statements through database/sql. Calls to both Prepare() and PrepareContext() hang inside the Drill client at https://github.com/factset/go-drill/blob/v1.2.0/client.go#L427, where the client waits for a message on the queryHandle channel, created as queryHandle := make(chan *rpc.CompleteRpcMessage).

Can someone please explain why this happens? If I have missed something, please tell me how to resolve it.

Sample code that reproduces the issue:

package main

import (
	"context"
	"database/sql"
	"fmt"
	"strings"

	_ "github.com/factset/go-drill/driver"
)

func main() {
	sqlDb := DrillConnect()
	var contx = context.Background()
	// this call never returns
	stmt1, err := sqlDb.PrepareContext(contx, "SELECT * FROM dfs./home/ghuchhar/Downloads/apache-drill-1.21.1/sample-data/region.parquet")
	if err != nil {
		fmt.Println(err.Error())
		return
	}
	res, err := stmt1.Query()
	if err != nil {
		fmt.Println(err)
		return
	}
	for res.Next() {
		fmt.Println(res.Columns())
	}
}

func DrillConnect() *sql.DB {
	props := []string{
		"host=0.0.0.0",
		"port=31010",
	}
	db, err := sql.Open("drill", strings.Join(props, ";"))
	if err != nil {
		fmt.Println(err)
		return nil
	}
	if err := db.Ping(); err != nil {
		fmt.Println(err)
		return nil
	}
	return db
}

Thanks in advance.
