clj-hector

A simple Clojure client for Cassandra that wraps Hector. The 0.2.1 release was built against Clojure 1.4.0.

Installation

Add the following to your project.clj:

:dependencies [[org.clojars.paul/clj-hector "0.3.1"]]

Usage

Schema Manipulation

;; Bind the connection to c; naming the var cluster would clash with the cluster function.
(def c (cluster "Pauls Cluster" "localhost"))
(add-keyspace c
              {:name "Keyspace Name"
               :replication 3
               :column-families [{:name "a"}
                                 {:name "b"
                                  :comparator :long}]})
(add-column-family c "Keyspace Name" {:name "c"})
(drop-keyspace c "Keyspace Name")

Basic retrieval of rows

(def c (cluster "Pauls Cluster" "localhost"))
(def ks (keyspace c "Twitter"))
(get-rows ks "Users" ["paul"] :n-serializer :string)

user> (-> (cluster "Pauls Cluster" "localhost")
          (keyspace "Twitter")
          (get-rows "Users" ["paul"] :n-serializer :string))
({"paul" {"age" #<byte[] [B@324a897c>, "login" #<byte[] [B@3b8845af>}})

It's also possible to query for column slices:

user> (-> (cluster "Pauls Cluster" "localhost")
          (keyspace "Twitter")
          (get-columns "Users" "paul" ["age" "login"] :n-serializer :string))

Serializing non-String types

user> (put ks "Users" "Paul" {"age" 30})
#<MutationResultImpl MutationResult took (2us) for query (n/a) on host: localhost(127.0.0.1):9160>
user> (get-rows ks "Users" ["Paul"] :n-serializer :string :v-serializer :integer)
({"Paul" {"age" 30}})

The following serializers are supported:

  • :string
  • :integer
  • :long
  • :bytes

Super Columns

Firstly, the column family will need to support super columns.

user> (add-column-family c "Keyspace Name" {:name "UserRelationships"
                                            :type :super})

Storing super columns works using a nested map structure:

user> (put ks "UserRelationships" "paul" {"SuperCol" {"k" "v"} "SuperCol2" {"k2" "v2"}} :type :super)
#<MutationResultImpl MutationResult took (6us) for query (n/a) on host: localhost(127.0.0.1):9160>
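
The map passed to put above has the shape {super-column-name {column-name value}}. As a minimal sketch in plain Clojure (no Cassandra connection is needed; the relationships and triples names and the data are made up for illustration), such a map can be built and inspected with ordinary map functions before it is handed to put:

```clojure
;; Hypothetical data: super column name -> {column name -> value}.
(def relationships
  (-> {}
      (assoc-in ["SuperCol" "k"] "v")
      (assoc-in ["SuperCol2" "k2"] "v2")))

;; Flatten to [super-column column value] triples to see what would be written.
(def triples
  (for [[super-col columns] relationships
        [col value] columns]
    [super-col col value]))

(println relationships)
(println (sort triples))
```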

Retrieving super columns with get-super-rows:

user> (get-super-rows ks "UserRelationships" ["paul"] ["SuperCol" "SuperCol2"] :s-serializer :string :n-serializer :string :v-serializer :string)
({"paul" ({"SuperCol", {"a" "1", "k" "v"}} {"SuperCol2", {"k2" "v2"}})})

In the above example, note the addition of the s-serializer option: this controls how super column names should be deserialized.

You can also query for a sequence of columns:

user> (get-super-columns ks "UserRelationships" "paul" "SuperCol" ["a" "k"] :s-serializer :string :n-serializer :string :v-serializer :string)
{"a" "1", "k" "v"}

Deleting Rows

It's possible to delete all columns identified by keys with the delete-rows function. This works with both super-column families and regular column families.

To delete the example above:

user> (delete-rows ks "UserRelationships" ["paul"])

user> (get-super-columns ks "UserRelationships" "paul" "SuperCol" ["a" "k"] :s-serializer :string :n-serializer :string :v-serializer :string)
{}

user> (get-super-rows ks "UserRelationships" ["paul"] ["SuperCol" "SuperCol2"] :s-serializer :string :n-serializer :string :v-serializer :string)
({"paul" ()})

TODO: In the above query, a row is returned despite having no results. This should probably just return an empty sequence.
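
Until that changes, callers can drop empty rows themselves. A minimal sketch in plain Clojure, assuming results have the shape shown above (a sequence of maps from row key to a sequence of super columns); remove-empty-rows is a hypothetical helper, not part of clj-hector:

```clojure
(defn remove-empty-rows
  "Drops result maps whose row values are all empty."
  [rows]
  (remove (fn [row] (every? empty? (vals row))) rows))

;; A result containing only an empty row, as in the query above.
(println (remove-empty-rows '({"paul" ()})))

;; A mixed result: only the row with data is kept.
(println (remove-empty-rows '({"paul" ({"SuperCol" {"k" "v"}})} {"bob" ()})))
```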

Query metadata

Hector exposes data about how long queries took to execute (and on which host). This is provided as metadata on the query result maps:

user> (meta (get-rows ks "Users" ["Paul"] :n-serializer :string :v-serializer :integer))
{:exec_us 2, :host #<CassandraHost localhost(127.0.0.1):9160>}
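
This uses Clojure's ordinary metadata mechanism, so the timing travels with the result without changing its value. A minimal sketch in plain Clojure, using a hand-built stand-in for a query result (the fake-result name and its values are illustrative, and the host is a plain string rather than a CassandraHost):

```clojure
;; Stand-in for a query result, with made-up timing metadata attached.
(def fake-result
  (with-meta [{"Paul" {"age" 30}}]
    {:exec_us 2 :host "localhost(127.0.0.1):9160"}))

;; Metadata does not affect equality or the data itself.
(println (= fake-result [{"Paul" {"age" 30}}]))

;; It is read back with meta.
(println (:exec_us (meta fake-result)))

;; Sequence functions such as map return a new sequence without the metadata.
(println (meta (map identity fake-result)))
```

Because transformations like map do not carry metadata over, it is safest to read the timing before processing the result further.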

Experimental Schema Querying

clj-hector allows you to provide default schema settings for the specified column families (see ./test/clj_hector/test/schema.clj for examples).

For example, when operating with the MyColumnFamily column family, you can provide default name and value serializers as follows:

(def MyColumnFamily [:name "MyColumnFamily"
                     :n-serializer :string
                     :v-serializer :string])

Then, when querying, wrap the functions with the with-schemas macro:

(with-schemas [MyColumnFamily]
  (put ks "MyColumnFamily" "row-key" {"k" "v"})
  (get-rows ks "MyColumnFamily" ["row-key"]))
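
One way to picture how per-column-family defaults can be scoped to a block of code is a dynamic var that is rebound for the duration of the body. The sketch below illustrates that general technique only and is not clj-hector's implementation: the names *schemas*, with-defaults, options-for and my-column-family are hypothetical, and the schema is written as a map rather than the vector form shown above.

```clojure
;; Hypothetical: the schemas in effect for the current thread, keyed by name.
(def ^:dynamic *schemas* {})

(defmacro with-defaults
  "Runs body with the given schemas (maps with a :name key) in effect."
  [schemas & body]
  `(binding [*schemas* (into {} (map (juxt :name identity) ~schemas))]
     ~@body))

(defn options-for
  "Merges explicit options over any defaults registered for the column family."
  [column-family explicit]
  (merge (dissoc (get *schemas* column-family) :name) explicit))

(def my-column-family
  {:name "MyColumnFamily" :n-serializer :string :v-serializer :string})

;; Inside the block the defaults apply, and explicit options still win.
(with-defaults [my-column-family]
  (println (options-for "MyColumnFamily" {:v-serializer :long})))

;; Outside the block no defaults are registered.
(println (options-for "MyColumnFamily" {:v-serializer :long}))
```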

Note that it's still very early days; all suggestions and forks are welcome!

TODO

  • Better support for different Hector query types: multimethod dispatch based on the arity of the pk and c args?
  • Paging support for queries (somehow wiring into chunked sequences?)
  • Better support of CassandraHostConfigurator
  • Refactoring

License

Copyright (c) Paul Ingles

Distributed under the Eclipse Public License, the same as Clojure.

clj-hector's People

Contributors

alanpeabody, blakesmith, licenser, nickmbailey, pingles, rplevy-draker, ryfow, thobbs

clj-hector's Issues

Adding Secondary Indexes

I would like to be able to add a secondary index to a column family via clj-hector.

I don't believe there is a way to do this currently.

@rplevy-draker and I would be willing to write the code to add this functionality if you have an idea of how you might like to see it implemented.

Update README and documentation

There have been a few changes to the API. Make sure all the documentation in GitHub Pages, the Wiki and the README is up to date.

Add support for composite column types

I wanted to check whether there is any plan to implement support for composite column types. I was thinking about taking that on, but I realized it's probably going to involve significant rework of your existing codebase and didn't want to begin before first speaking with you.

Upgrade Hector/Cassandra

Once the upgrade to Clojure 1.4.0 is settled, it's probably also worth upgrading Hector and Cassandra.

As of writing, the latest releases are: hector-core 1.0-5, and cassandra-all 1.0.10 (Cassandra has actually been updated to a 1.1.0 release although I've not done any testing with Hector 1.0.5 yet).

I'm going to leave this here so that people using clj-hector more aggressively can comment on the value of upgrading :)

Add support for composite keys

I've pushed a change on my fork to support composite keys, and it works fine in my use. However, as it stands, it won't work if a CF has both a composite key and a composite column (unless the composites are identical). This is because ToClojure/AbstractComposite only looks for the :c-serializer option.
Before I make any of those changes, I wanted to discuss the necessary fix with you (and perhaps mstump). Also, since a CF could have composite values as well, we might want to make provision for that.

Use an in-process Cassandra daemon

Currently the tests rely on having a Cassandra daemon running. cascading.cassandra creates an in-process daemon that the tests then connect to.

This could perhaps be introduced as a with-inproc-daemon macro, or perhaps a with-test-keyspace macro, which could then tidy up the dropping and creating of keyspaces between tests.

Super columns: supercolumn parameter is not optional for super CF Documents

Hello there!

I seem to be having some trouble creating a mutation on a super column family.

First, I create the column family:

letterleaf.tasks.bootstrap> (ddl/add-column-family cass/cluster cass/keyspace-name {:name "Documents" :type :super :comparator :ascii})
 "2611c110-8ce6-11e1-0000-5416249b7faf"

Which yields a schema like so:

create column family Documents
  with column_type = 'Super'
  and comparator = 'AsciiType'
  and subcomparator = 'BytesType'
  and default_validation_class = 'BytesType'
  and key_validation_class = 'BytesType'
  and rows_cached = 0.0
  and row_cache_save_period = 0
  and row_cache_keys_to_save = 0
  and keys_cached = 200000.0
  and key_cache_save_period = 14400
  and read_repair_chance = 1.0
  and gc_grace = 864000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and row_cache_provider = 'ConcurrentLinkedHashCacheProvider'
  and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy';

Then I try to 'put' rows into the column family:

letterleaf.models.document> (hector/put cass/keyspace "Documents" "[email protected]" {"123" {"sec-0" "bob"}})

Which throws this exception:

InvalidRequestException(why:supercolumn parameter is not optional for super CF Documents)
  [Thrown class me.prettyprint.hector.api.exceptions.HInvalidRequestException]

Restarts:
 0: [QUIT] Quit to the SLIME top level
 1: [CAUSE1] Invoke debugger on cause   [Thrown class org.apache.cassandra.thrift.InvalidRequestException]

Backtrace:
  0: ExceptionsTranslatorImpl.java:52 me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate
  1: HConnectionManager.java:252 me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover
  2:   ExecutingKeyspace.java:97 me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation
  3:        MutatorImpl.java:243 me.prettyprint.cassandra.model.MutatorImpl.execute
  4:                core.clj:164 clj-hector.core/put
  5:             RestFn.java:470 clojure.lang.RestFn.invoke
  6:            NO_SOURCE_FILE:1 letterleaf.models.document$eval2464.invoke
  7:          Compiler.java:5424 clojure.lang.Compiler.eval
  8:          Compiler.java:5391 clojure.lang.Compiler.eval
  9:               core.clj:2382 clojure.core/eval
 10:                core.clj:532 swank.core/eval719[fn]
 11:            MultiFn.java:163 clojure.lang.MultiFn.invoke
 12:                basic.clj:54 swank.commands.basic/eval-region
 13:                basic.clj:44 swank.commands.basic/eval-region
 14:                basic.clj:78 swank.commands.basic/eval989[fn]
 15:                Var.java:365 clojure.lang.Var.invoke
 16:            (Unknown Source) letterleaf.models.document$eval2462.invoke
 17:          Compiler.java:5424 clojure.lang.Compiler.eval
 18:          Compiler.java:5391 clojure.lang.Compiler.eval
 19:               core.clj:2382 clojure.core/eval
 20:                core.clj:100 swank.core/eval-in-emacs-package
 21:                core.clj:256 swank.core/eval-for-emacs
 22:                Var.java:373 clojure.lang.Var.invoke
 23:                AFn.java:167 clojure.lang.AFn.applyToHelper
 24:                Var.java:482 clojure.lang.Var.applyTo
 25:                core.clj:540 clojure.core/apply
 26:                core.clj:107 swank.core/eval-from-control
 27:                core.clj:112 swank.core/eval-loop
 28:                core.clj:341 swank.core/spawn-repl-thread[fn]
 29:                AFn.java:159 clojure.lang.AFn.applyToHelper
 30:                AFn.java:151 clojure.lang.AFn.applyTo
 31:                core.clj:540 clojure.core/apply
 32:                core.clj:338 swank.core/spawn-repl-thread[fn]
 33:             RestFn.java:397 clojure.lang.RestFn.invoke
 34:                 AFn.java:24 clojure.lang.AFn.run
 35:             Thread.java:680 java.lang.Thread.run

Can anyone shed any insight into this exception? I'm new to Cassandra, so I'm probably missing something obvious, but I wasn't sure if it was something specifically with clj-hector.

Thanks!

Blake

Consolidate ser/deser in serialize.clj and ddl.clj

We have some mismatch when serializing to/from hector/clojure in the serialize.clj and ddl.clj namespaces. We should rework those namespaces to make sure things match up. The conversion to clojure in serialize.clj is also returning some things still as hector objects (for example comparator type). We should avoid doing that.

This will likely have to be a backwards-incompatible (breaking) change, so we may need to bump the major version number accordingly.

TTL support in defschema

Provide TTL information in defschema. Will need to update put to read values from the schema definitions.

Allow querying on a secondary index

My initial thought is to allow additional options to be passed to the existing 'get-rows' function. Since the MongoDB API is often cited as extremely developer-friendly, maybe we could do something similar to a Clojure library written for it?

https://github.com/aboekhoff/congomongo

Basically, a ':where' option followed by a clause. There will have to be some cleverness in serializing the column name specified in the clause; I think it needs to end up as a byte array before being sent to Cassandra.

How to use CQL

Hi,
I am using clj-hector for one of my projects, a web tool for browsing a Cassandra cluster. I want to execute a query like "select * from my-column-family". Could you suggest which API to use and how to use it? I used the code below, but it does not render the data:
get-rows-cql-query keyspace "select * from Emp"

suppress log output in test run

Cassandra and Hector log a lot of information during the average test run, which masks the output from the tests. I'd like this to be suppressed by default, perhaps with a way of specifically enabling it for certain test runs from the REPL.

Upgrade clojure version

We are on 1.2.1 right now. The tests currently break on 1.3, almost certainly due to the auto-boxing behavior changes in 1.3. There might be some issues with any dynamic vars we have as well.

Also, 1.4 is out, which again changes boxing behavior: I believe ints box to Long in 1.3 but to Integer in 1.4. We should probably just upgrade straight to 1.4.

Wrap createRangeSlicesQuery

It's possible I'm just modeling my data incorrectly, but getting a range of rows with HFactory.createRangeSlicesQuery seems quite useful in Hector. Is there a reason it's not supported in clj-hector?
