strapdata / elassandra

Elassandra = Elasticsearch + Apache Cassandra

Home Page: http://www.elassandra.io

License: Apache License 2.0

Java 99.00% Shell 0.19% Batchfile 0.01% Python 0.04% HTML 0.01% Groovy 0.69% Emacs Lisp 0.01% Perl 0.02% PowerShell 0.04% FreeMarker 0.01% ANTLR 0.02%
cassandra elasticsearch search completion aggregation nosql masterless mission-critical fuzzy-search rest-api lucene kibana logstash spark

elassandra's Introduction


Elassandra is an Apache Cassandra distribution including an Elasticsearch search engine. Elassandra is a multi-master multi-cloud database and search engine with support for replicating across multiple datacenters in active/active mode.

Elasticsearch code is embedded in Cassandra nodes, providing advanced search features on Cassandra tables, while Cassandra serves as the Elasticsearch data and configuration store.

Elassandra architecture

Elassandra supports Cassandra vnodes and scales horizontally by adding more nodes without the need to reshard indices.

Project documentation is available at doc.elassandra.io.

Benefits of Elassandra

For Cassandra users, Elassandra provides Elasticsearch features:

  • Cassandra updates are indexed in Elasticsearch.
  • Full-text and spatial search on your Cassandra data.
  • Real-time aggregation (no need for Spark or Hadoop to GROUP BY); see the sketch after this list.
  • Search on multiple keyspaces and tables in one query.
  • Automatic schema creation and support for nested documents using User Defined Types.
  • Read/write JSON REST access to Cassandra data.
  • Compatibility with numerous Elasticsearch plugins and products like Kibana.
  • Management of concurrent Elasticsearch mapping changes, applied as batched, atomic CQL schema changes.
  • Support for Elasticsearch ingest processors to transform input data.
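
For example, the real-time aggregation point above can be exercised with a plain Elasticsearch terms aggregation. A minimal sketch, assuming the twitter index created in the Quick start below and that the default dynamic mapping exposes a user.keyword sub-field (adjust the field name to your actual mapping):

# Hypothetical example: group tweets by user directly in Elassandra, no Spark or Hadoop job needed.
curl -XGET "localhost:9200/twitter/_search?size=0&pretty" -H 'Content-Type: application/json' -d '{
  "aggs": {
    "tweets_per_user": {
      "terms": { "field": "user.keyword" }
    }
  }
}'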

For Elasticsearch users, Elassandra provides useful features:

  • Elassandra is masterless. Cluster state is managed through Cassandra lightweight transactions.
  • Elassandra is a sharded multi-master database, whereas Elasticsearch is a sharded master-slave system. Thus, Elassandra has no single point of write, helping to achieve high availability.
  • Elassandra inherits Cassandra's data repair mechanisms (hinted handoff, read repair and nodetool repair), providing support for cross-datacenter replication.
  • When adding a node to an Elassandra cluster, only the data pulled from existing nodes is re-indexed in Elasticsearch.
  • Cassandra can be your single datastore for indexed and non-indexed data, making it easier to manage and secure. Source documents are stored in Cassandra, reducing disk usage if you need both a NoSQL database and Elasticsearch.
  • Write operations are not restricted to one primary shard, but distributed across all Cassandra nodes in a virtual datacenter. The number of shards does not limit your write throughput; adding Elassandra nodes increases both read and write throughput.
  • Elasticsearch indices can be replicated among many Cassandra datacenters, allowing you to write to the closest datacenter and search globally.
  • The Cassandra driver is datacenter- and token-aware, providing automatic load balancing and failover.
  • Elassandra efficiently stores Elasticsearch documents in binary SSTables without any JSON overhead.

Quick start

Upgrade Instructions

Elassandra 6.8.4.2+

Since version 6.8.4.2, the gossip X1 application state can be compressed using a system property. Enabling this setting allows the creation of a large number of virtual indices. Before enabling it, upgrade all the 6.8.4.x nodes to 6.8.4.2 (or higher). Once all the nodes are on 6.8.4.2, they are able to decompress the application state even if the setting isn't yet configured locally.

Elassandra 6.2.3.25+

Elassandra uses the Cassandra GOSSIP protocol to manage the Elasticsearch routing table, and Elassandra 6.8.4.2+ adds support for compression of the X1 application state to increase the maximum number of Elasticsearch indices. For backward compatibility, compression is disabled by default, but once all your nodes are upgraded to version 6.8.4.2+, you should enable X1 compression by adding -Des.compress_x1=true to your conf/jvm.options and performing a rolling restart of all nodes. Nodes running version 6.8.4.2+ are able to read both compressed and uncompressed X1.
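
A minimal sketch of that rolling procedure, assuming a tarball installation with CASSANDRA_HOME pointing at the install directory (adapt the paths and the restart step to your packaging):

# On each node already running 6.8.4.2+, append the system property to conf/jvm.options ...
echo "-Des.compress_x1=true" >> "$CASSANDRA_HOME/conf/jvm.options"
# ... then restart that node and wait for it to come back up before moving on to the next one
# (rolling restart), e.g. via your service manager or bin/cassandra -e for a tarball install.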

Elassandra 6.2.3.21+

Before version 6.2.3.21, the Cassandra replication factor for the elastic_admin keyspace (and elastic_admin_[datacenter.group]) was automatically adjusted to the number of nodes in the datacenter. Since version 6.2.3.21, because this had a performance impact on large clusters, it is now up to your Elassandra administrator to properly adjust the replication factor for this keyspace. Keep in mind that Elasticsearch mapping updates rely on a PAXOS transaction that requires a QUORUM of nodes to succeed, so the replication factor should be at least 3 in each datacenter.
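
For illustration only, a cqlsh sketch of adjusting the replication factor; the datacenter name DC1 is a placeholder for your own datacenter names:

# Hypothetical example: set the replication factor to 3 for elastic_admin in datacenter DC1 (placeholder).
bin/cqlsh -e "ALTER KEYSPACE elastic_admin WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '3'};"
# Then run a repair on each node so existing data matches the new replication factor.
bin/nodetool repair elastic_admin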

Elassandra 6.2.3.19+

Elassandra 6.2.3.19 metadata versioning now relies on the Cassandra table elastic_admin.metadata_log (which was elastic_admin.metadata from 6.2.3.8 to 6.2.3.18) to keep the Elasticsearch mapping update history and automatically recover from a possible PAXOS write timeout issue.

When upgrading the first node of a cluster, Elassandra automatically copies the current metadata.version into the new elastic_admin.metadata_log table. To avoid Elasticsearch mapping inconsistencies, avoid mapping updates while the rolling upgrade is in progress. Once all nodes are upgraded, elastic_admin.metadata is no longer used and can be removed. You can then read the mapping update history from the new elastic_admin.metadata_log and see which node updated the mapping, when, and for which reason.
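
A cqlsh sketch of those post-upgrade steps (hypothetical commands based on the description above; run them only once all nodes are upgraded and validated):

# Inspect the mapping update history (which node updated the mapping, when, and why).
bin/cqlsh -e "SELECT * FROM elastic_admin.metadata_log LIMIT 10;"
# The old table is no longer used after the upgrade and can be removed.
bin/cqlsh -e "DROP TABLE elastic_admin.metadata;"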

Elassandra 6.2.3.8+

Elassandra 6.2.3.8+ now fully manages the Elasticsearch mapping in the CQL schema through the use of CQL schema extensions (see system_schema.tables, column extensions). These table extensions and the CQL schema updates resulting from Elasticsearch index creation/modification are applied in batched, atomic schema updates to ensure consistency when concurrent updates occur. Moreover, these extensions are stored in binary form and support partial updates for efficiency. As a result, the Elasticsearch mapping is no longer stored in the elastic_admin.metadata table.

WARNING: During the rolling upgrade, Elasticsearch mapping changes are not propagated between nodes running the new and the old versions, so do not change your mapping while you are upgrading. Once all your nodes have been upgraded to 6.2.3.8+ and validated, apply the following CQL statements to remove the now-unused Elasticsearch metadata:

ALTER TABLE elastic_admin.metadata DROP metadata;
ALTER TABLE elastic_admin.metadata WITH comment = '';

WARNING: Due to the CQL table extensions used by Elassandra, some old versions of cqlsh may fail with the error message "'module' object has no attribute 'viewkeys'". This comes from the old Python Cassandra driver embedded in Cassandra and has been reported in CASSANDRA-14942. Possible workarounds:

  • Use the cqlsh embedded with Elassandra
  • Install a recent version of the cqlsh utility (pip install cqlsh) or run it from a docker image:
docker run -it --rm strapdata/cqlsh:0.1 node.example.com

Elassandra 6.x changes

  • Elasticsearch now supports only one document type per index, backed by one Cassandra table. Unless you specify an Elasticsearch type name in your mapping, data is stored in a Cassandra table named "_doc". If you want to search many Cassandra tables, you now need to create and search many indices (see the sketch after this list).
  • Elasticsearch 6.x manages shard consistency through several metadata fields (_primary_term, _seq_no, _version) that are not used in Elassandra, because replication is fully managed by Cassandra.
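
For instance, since each index is backed by its own Cassandra table in 6.x, querying two tables means querying two indices in one request. A minimal sketch using the standard comma-separated multi-index syntax (the index names twitter and logs are placeholders):

# Hypothetical example: search two indices, i.e. two Cassandra tables, in a single query.
curl "localhost:9200/twitter,logs/_search?q=user:Jimmy&pretty"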

Installation

Ensure Java 8 is installed and JAVA_HOME points to the correct location. The individual steps below are consolidated in the shell sketch after this list.

  • Download and extract the distribution tarball
  • Define the CASSANDRA_HOME environment variable: export CASSANDRA_HOME=<extracted_directory>
  • Run bin/cassandra -e
  • Run bin/nodetool status
  • Run curl -XGET localhost:9200/_cluster/state
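
Put together, the steps look roughly like this (the archive name is a placeholder; use the release you downloaded):

# Hypothetical walk-through of the installation steps above.
tar -xzf elassandra-<version>.tar.gz
export CASSANDRA_HOME="$PWD/elassandra-<version>"
cd "$CASSANDRA_HOME"
bin/cassandra -e                                   # start Cassandra with Elasticsearch enabled
bin/nodetool status                                # check that the node is up and normal
curl -XGET "localhost:9200/_cluster/state?pretty"  # check the Elasticsearch cluster state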

Example

Try indexing a document on a non-existing index:

curl -XPUT 'http://localhost:9200/twitter/_doc/1?pretty' -H 'Content-Type: application/json' -d '{
    "user": "Poulpy",
    "post_date": "2017-10-04T13:12:00Z",
    "message": "Elassandra adds dynamic mapping to Cassandra"
}'

Then look it up in Cassandra:

bin/cqlsh -e "SELECT * from twitter.\"_doc\""

Behind the scenes, Elassandra has created a new keyspace twitter and a table _doc.

admin@cqlsh>DESC KEYSPACE twitter;

CREATE KEYSPACE twitter WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '1'}  AND durable_writes = true;

CREATE TABLE twitter."_doc" (
    "_id" text PRIMARY KEY,
    message list<text>,
    post_date list<timestamp>,
    user list<text>
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
CREATE CUSTOM INDEX elastic__doc_idx ON twitter."_doc" () USING 'org.elassandra.index.ExtendedElasticSecondaryIndex';

By default, multi-valued Elasticsearch fields are mapped to Cassandra lists. Now, insert a row with CQL:

INSERT INTO twitter."_doc" ("_id", user, post_date, message)
VALUES ( '2', ['Jimmy'], [dateof(now())], ['New data is indexed automatically']);
SELECT * FROM twitter."_doc";

 _id | message                                          | post_date                           | user
-----+--------------------------------------------------+-------------------------------------+------------
   2 |            ['New data is indexed automatically'] | ['2019-07-04 06:00:21.893000+0000'] |  ['Jimmy']
   1 | ['Elassandra adds dynamic mapping to Cassandra'] | ['2017-10-04 13:12:00.000000+0000'] | ['Poulpy']

(2 rows)

Then search for it with the Elasticsearch API:

curl "localhost:9200/twitter/_search?q=user:Jimmy&pretty"

And here is a sample response:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.6931472,
    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.6931472,
        "_source" : {
          "post_date" : "2019-07-04T06:00:21.893Z",
          "message" : "New data is indexed automatically",
          "user" : "Jimmy"
        }
      }
    ]
  }
}

Support

License

This software is licensed under the Apache License, version 2 ("ALv2"), quoted below.

Copyright 2015-2018, Strapdata ([email protected]).

Licensed under the Apache License, Version 2.0 (the "License"); you may not
use this file except in compliance with the License. You may obtain a copy of
the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
License for the specific language governing permissions and limitations under
the License.

Acknowledgments

  • Elasticsearch, Logstash, Beats and Kibana are trademarks of Elasticsearch BV, registered in the U.S. and in other countries.
  • Apache Cassandra, Apache Lucene, Apache, Lucene and Cassandra are trademarks of the Apache Software Foundation.
  • Elassandra is a trademark of Strapdata SAS.

elassandra's People

Contributors

areek, bleskes, brwe, cbuescher, clintongormley, colings86, dadoonet, dakrone, dimitris-athanasiou, dnhatn, droberts195, imotov, jasontedor, javanna, jaymode, jimczi, jpountz, kimchy, lcawl, martijnvg, mikemccand, nik9000, rjernst, rmuir, s1monw, spinscale, talevy, tlrx, uboness, ywelsch


elassandra's Issues

Error creating an es-index for an already existing table

Using 2.1.1-14.
So, my table is:

CREATE TABLE IF NOT EXISTS a.b(
  a BIGINT,
  b VARCHAR,
  c VARCHAR,
  d TIMESTAMP,
  PRIMARY KEY ((a,b), c)
)

And using a simple index:

{
    "settings": {"keyspace": "a"},
    "mappings": {
        "b": {
            "properties": {
                "c": {"type": "string", "index": "not_analyzed", "cql_collection": "singleton"},
            }

        }
    }
}

And I get "null pointer exception" when contacting to es. Logs attached.
CRUD on cassandra works fine though. Also "index" on es fails with same error.

system.log.txt

Elastic Index ignores documents created using Cassandra UPDATE

I am using elassandra-2.1.1-17

When creating new documents using INSERT, Elastic indexes them. But when using UPDATE to create the new documents - Elastic ignores them. See example:

cqlsh > CREATE KEYSPACE IF NOT EXISTS test
    WITH replication = {
                           'class': 'NetworkTopologyStrategy',
                           'dc1': '1'
                       }
    AND durable_writes = true;

cqlsh > CREATE TABLE test.t1 (
    name text,
    id int,
    nicks set<text>,
    PRIMARY KEY (name, id)
);
$> curl -XDELETE "localhost:9200/test_index?pretty=true" 

$> curl -XPUT "localhost:9200/test_index?pretty=true" -d \
'{
    "settings": {
        "keyspace":"test"
    },
    "mappings": {
            "t1": {
                "properties": {
                    "name": {"type": "string", "cql_collection": "singleton", "index": "not_analyzed", "cql_primary_key_order": 0, "cql_partition_key": true},
                    "id": {"type": "integer", "cql_collection": "singleton", "index": "no", "cql_primary_key_order": 1},
                    "nicks": {"type": "string", "cql_collection": "set", "index": "not_analyzed"}
                }
            }
    }
}' 

Now add a couple of documents:

cqlsh> UPDATE test.t1 SET nicks = nicks + {'abc'} WHERE name='Moses' AND id=14;
cqlsh> INSERT INTO test.t1 (name, id, nicks) VALUES ('Jerry', 33, {'jj', 'jlo'});
cqlsh> SELECT * FROM test.t1 ;
 name  | id | nicks
-------+----+---------------
 Moses | 14 |       {'abc'}
 Jerry | 33 | {'jj', 'jlo'}
(2 rows)

Check results for Elastic (you only see 1 entry, the one added using INSERT):

$> curl -XGET "localhost:9200/test_index/t1/_search?pretty=true"
{
  "took" : 16,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "test_index",
      "_type" : "t1",
      "_id" : "[\"Jerry\",33]",
      "_score" : 1.0,
      "_source":{"name":"Jerry","id":33,"nicks":["jj","jlo"]}
    } ]
  }
}

Elassandra benefits

Congrats for this solution which seems to be very interesting. I am already using Cassandra and Elasticsearch to serve different storage needs, C* as source of truth and ES as serving layer for user analytics, and I can see the benefits of deep integration from the data pipeline point of view because there is no need to double ingest. But I would like to know if the following assumptions are right:

  • as data is stored in C*, which manages primary keys, Elassandra natively provides idempotence, meaning that when writing for example from a Spark job, if one worker crashes the bulk insert will be replayed but duplicated entries will be overridden, resulting in no duplicates in ES?
  • as you provide a bidirectional relation between C* and ES, you can use both ES-Hadoop and the Spark-Cassandra connector to write to Elassandra?
  • as the number of shards is linked to the number of partitions in Cassandra, index shards will natively grow with the number of C* nodes or vnodes? Does that mean search performance would be better than with a static number of shards in a standard ES index?
  • I am always sceptical about forks because following the source product's evolution is very hard and time-consuming. I have heard in your C* summit video that few C* classes are modified but lots of ES classes are modified (more than 1000?). Aren't you afraid of falling behind the original products' versions? How many resources are deeply involved in this project? For example C* 2.2 is end of life next month but the tick/tock releases aren't very stable...
    Again I think the idea behind your project is very clever and interesting; the main risk would be low adoption, which would reduce the capability to maintain it against the base C* and ES.

Problem while installing

Hi,
I was trying to install Elassandra on CentOS using this page. I am facing a few issues. I am using CentOS release 6.5 (Final).

When I run the command below
yum install elassandra

It fails saying Package elassandra-2.1.1-14.noarch.rpm is not signed. Then I tried installing with --nogpgcheck, and the installation completed.

When I try su - esandra it says user esandra does not exist. I checked my /etc/passwd file and figured out there is a user named `cassandra`.

Using the cassandra user, I tried doing systemctl start elassandra; it fails saying systemctl, command not found.

Then I tried


[cassandra@localhost ~]$ elassandra status
Elassandra is stopped.

[cassandra@localhost ~]$ elassandra start
Starting Elassandra: Password:

Have I missed anything during installation?

Thanks

Building and or tarball download

I am trying to build the source, but I am getting failures due to forbidden method invocations.

Failed to execute goal de.thetaphi:forbiddenapis:2.0:check (check-forbidden-apis) on project elassandra:

The link to the tarball release also appears to be broken. I would really like to check out this project as an alternative to the DataStax distribution and its search based on Solr.

Startup elassandra add please

I wanted to test this product and found out some things that must exist to start Elassandra.

After downloading sigar and adding the sigar-bin to the lib folder it was working.
Also CASSANDRA_CONF must be set :-)

Windows installation ?

I tried to install the tool, but I'm using Windows as my operating system.
Everything seems to be working except that no Elasticsearch instance is launched.
I think it is only launched through bin/cassandra but not from bin/cassandra.bat, but I didn't look further.

Here is the trace of the command "cassandra -e":

WARNING! Powershell script execution unavailable.
   Please use 'powershell Set-ExecutionPolicy Unrestricted'
   on this user-account to run cassandra with fully featured
   functionality on this platform.
Starting with legacy startup options
Starting Cassandra Server

"with compact storage" is not supported by Elassandra ?

When I build a Cassandra table "with compact storage" and create an Elasticsearch index for that keyspace, I don't see the indices being created in the output of the "describe table" cqlsh command. However, Elasticsearch does report successful index creation. And, of course, querying that index brings back zero hits.

CREATE KEYSPACE IF NOT EXISTS vm
        WITH replication = { 'class': 'NetworkTopologyStrategy', 'dc1': '1' }
        AND durable_writes = true;

    CREATE TABLE IF NOT EXISTS vm.fs1 (
        name text, path text, size bigint, snap text, time timestamp, type int, vmid text,
        PRIMARY KEY (vmid, path, name, snap, time, size)
    )
    WITH COMPACT STORAGE;

curl -XPUT "localhost:9200/vm_index?pretty=true" -d '{
    "settings": {
        "keyspace":"vm",
        "analysis": {
            "analyzer":  {"prefix-analyzer" : {"type": "custom","tokenizer": "prefix-tokenizer"}},
            "tokenizer": {"prefix-tokenizer": {"type": "path_hierarchy","delimiter": "/"}}
        }
    },
    "mappings": {
        "fs1": {
            "properties": {
                "time": {"type": "date",   "cql_collection": "singleton","index": "no","format": "y-M-d H:m:s"},
                "path": {"type": "string", "cql_collection": "singleton","index": "analyzed", "analyzer": "prefix-analyzer","search_analyzer": "keyword"},
                "name": {"type": "string", "cql_collection": "singleton","index": "not_analyzed"},
                "size": {"type": "long",   "cql_collection": "singleton","index": "no"},
                "type": {"type": "byte",   "cql_collection": "singleton","index": "no"},
                "vmid": {"type": "string", "cql_collection": "singleton","index": "no"},
                "snap": {"type": "string", "cql_collection": "singleton","index": "no"}
            }
        }
    }
}'

cqlsh> describe table vm.fs1;

CREATE TABLE vm.fs1 (
    vmid text,
    path text,
    name text,
    snap text,
    time timestamp,
    size bigint,
    type int,
    PRIMARY KEY (vmid, path, name, snap, time, size)
) WITH COMPACT STORAGE
    AND CLUSTERING ORDER BY (path ASC, name ASC, snap ASC, time ASC, size ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';

Hybrid cluster/keyspace setup

Is it possible to configure a particular keyspace (and its tables) with 2 datacenters, where one DC has Cassandra-only nodes (serving realtime queries with no secondary index overhead) and a second DC has Elassandra nodes (secondary indexes)?

Frozen Columns

Why are all columns frozen columns, and not only the data type?

Mailing list ?

Some issues would be better suited to a mailing list. That way, the developers would have less work from questions in the issues. Something like a Google group should be easy to set up.

Error Opening zip file or JAR manifest missing jamm-0.3.0.jar

@vroyer

When I git cloned elassandra and, after the build, tried to run Cassandra, I got this error:

Java version: 1.8.0_102-b14

[ec2-user@ip-172-28-198-99 bin]$ ./cassandra
[ec2-user@ip-172-28-198-99 bin]$ Error opening zip file or JAR manifest missing : ./../lib/jamm-0.3.0.jar
Error occurred during initialization of VM
agent library failed to init: instrument

Cheers,
Diego Pacheco

Upgrading Elasticsearch for Elassandra

Hi,

The latest release of Elassandra contains Elasticsearch version 2.1.1.
I want to upgrade Elasticsearch to the latest release, i.e. 2.3.4, so that I can use the latest Kibana release, i.e. 4.5, with Elassandra.
I am unable to find any information on this.
Please provide steps to upgrade.

Thanks.

regards,
Chinmaya

Elasticsearch geo_point causes ERROR during insert into Elassandra with CURL

Hi,
I'm trying to run the Kibana Getting Started tutorial with Elassandra 2.1.1-8 and Kibana 4.3.3 (linux-x64).
To accomplish the Map Visualization I have launched the following command:

****** ELASTICSEARCH COORDINATES DEFINITION ******

curl -XPUT http://localhost:9200/logstash_20150520 -d ' { "mappings": { "log": { "properties": { "geo": { "properties": { "coordinates": { "type": "geo_point" } } } } } } } ';

and then I've done a bulk insert with the following command:

****** ELASTICSEARCH BULK INSERT ******

curl -XPOST 'localhost:9200/_bulk?pretty' --data-binary @logs.jsonl

The bulk fails with the following ERROR (from system.log):

######## START ERROR

2016-04-28 13:08:41,244 ERROR [elasticsearch[localhost][index][T#4]] InternalCassandraClusterService.java:1479 insertDocument [localhost] [logstash_20150520].[log] failed to parse field geo={dest=IN, src=R
U, coordinates={lon=-87.44675972, lat=31.01621528}, srcdest=RU:IN}
java.lang.ClassCastException: null
2016-04-28 13:08:41,244 DEBUG [elasticsearch[localhost][index][T#4]] TransportReplicationAction.java:604 performOnPrimary [localhost] [logstash_20150520][0], node[c7610fa4-e08b-405b-b855-c682a935543a], [P]
, v[1], s[STARTED], a[id=B0BPlnTrS2-fmliQ7rwABw]: Failed to execute [index {[logstash_20150520][log][test], source[
{
"@timestamp" : "2015-05-01T15:57:34.915Z",
"ip" : "166.114.155.140",
"extension" : "jpg",
"response" : "200",
"geo" : {
"coordinates" : {
"lat" : 31.01621528,
"lon" : -87.44675972
},
"src" : "RU",
"dest" : "IN",
"srcdest" : "RU:IN"
},
"@tags" : ["success", "info"],
"utc_time" : "2015-05-20T15:57:34.915Z",
"referer" : "http://twitter.com/success/ellison-onizuka",
"agent" : "Mozilla/5.0 (X11; Linux i686) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/11.0.696.50 Safari/534.24",
"clientip" : "166.114.155.140",
"bytes" : 6839,
"host" : "media-for-the-masses.theacademyofperformingartsandscience.org",
"request" : "/uploads/laurel-b-clark.jpg",
"url" : "https://media-for-the-masses.theacademyofperformingartsandscience.org/uploads/laurel-b-clark.jpg",
"@message" : "166.114.155.140 - - [2015-05-20T15:57:34.915Z] "GET /uploads/laurel-b-clark.jpg HTTP/1.1" 200 6839 "-" "Mozilla/5.0 (X11; Linux i686) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/11.0.696.50 Safari/534.24"",
"spaces" : "this is a thing with lots of spaces wwwwoooooo",
"xss" : "<script>console.log("xss")</script>",
"headings" : ["

daniel-burbank

", "http://twitter.com/success/john-grunsfeld"],
"links" : ["[email protected]", "http://www.slate.com/info/curtis-brown", "www.www.slate.com"],
"relatedContent" : [],
"machine" : {
"os" : "win xp",
"ram" : 32212254720
},
"@Version" : "1"
}]}]
java.lang.ClassCastException: null
2016-04-28 13:08:41,244 TRACE [elasticsearch[localhost][index][T#4]] TransportReplicationAction.java:542 finishAsFailed [localhost] operation failed
java.lang.ClassCastException: null
2016-04-28 13:08:41,244 TRACE [elasticsearch[localhost][index][T#4]] ChildMemoryCircuitBreaker.java:182 addWithoutBreaking [localhost] [request] Adjusted breaker by [16440] bytes, now [16440]
2016-04-28 13:08:41,245 INFO [elasticsearch[localhost][index][T#4]] BytesRestResponse.java:131 convert /logstash_20150520/log/test Params: {id=test, index=logstash_20150520, type=log}
java.lang.ClassCastException: null

######## END ERROR

These are the index definition of Elasticsearch and the Keyspace definition of Cassandra

****** ELASTICSEARCH INDEX ******

"logstash_20150520" : {
"state" : "open",
"settings" : {
"index" : {
"creation_date" : "1461776568462",
"uuid" : "yx1IPFnmT5icKYTcMfZh7w",
"number_of_replicas" : "0",
"number_of_shards" : "1",
"version" : {
"created" : "2010199"
}
}
},
"mappings" : {
"log" : {
"properties" : {
"spaces" : {
"type" : "string"
},
"relatedContent" : {
"properties" : {
"articleTag" : {
"type" : "string"
},
"twitterCard" : {
"type" : "string"
},
"ogImageHeight" : {
"type" : "string"
},
"articlePublished_time" : {
"format" : "strict_date_optional_time||epoch_millis",
"type" : "date"
},
"twitterSite" : {
"type" : "string"
},
"ogDescription" : {
"type" : "string"
},
"url" : {
"type" : "string"
},
"articleModified_time" : {
"format" : "strict_date_optional_time||epoch_millis",
"type" : "date"
},
"ogType" : {
"type" : "string"
},
"twitterImage" : {
"type" : "string"
},
"ogImageWidth" : {
"type" : "string"
},
"ogUrl" : {
"type" : "string"
},
"ogTitle" : {
"type" : "string"
},
"ogImage" : {
"type" : "string"
},
"twitterTitle" : {
"type" : "string"
},
"ogSite_name" : {
"type" : "string"
},
"twitterDescription" : {
"type" : "string"
},
"articleSection" : {
"type" : "string"
}
}
},
"@message" : {
"type" : "string"
},
"bytes" : {
"type" : "long"
},
"geo" : {
"properties" : {
"dest" : {
"type" : "string"
},
"src" : {
"type" : "string"
},
"coordinates" : {
"type" : "geo_point"
},
"srcdest" : {
"type" : "string"
}
}
},
"host" : {
"type" : "string"
},
"clientip" : {
"type" : "string"
},
"@tags" : {
"type" : "string"
},
"xss" : {
"type" : "string"
},
"utc_time" : {
"format" : "strict_date_optional_time||epoch_millis",
"type" : "date"
},
"links" : {
"type" : "string"
},
"machine" : {
"properties" : {
"os" : {
"type" : "string"
},
"ram" : {
"type" : "long"
}
}
},
"@Version" : {
"type" : "string"
},
"agent" : {
"type" : "string"
},
"url" : {
"type" : "string"
},
"memory" : {
"type" : "long"
},
"phpmemory" : {
"type" : "long"
},
"ip" : {
"type" : "string"
},
"response" : {
"type" : "string"
},
"extension" : {
"type" : "string"
},
"headings" : {
"type" : "string"
},
"@timestamp" : {
"format" : "strict_date_optional_time||epoch_millis",
"type" : "date"
},
"request" : {
"type" : "string"
},
"referer" : {
"type" : "string"
}
}
}
},
"aliases" : [ ]
},

****** CASSANDRA KEYSPACE ******

CREATE KEYSPACE logstash_20150520 WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': '1'} AND durable_writes = true;

CREATE TYPE logstash_20150520.log_geo_coordinates (
lat double,
lon double
);

CREATE TYPE logstash_20150520.log_geo (
coordinates frozen<list<frozen<log_geo_coordinates>>>,
srcdest frozen<list>,
dest frozen<list>,
src frozen<list>
);

CREATE TYPE logstash_20150520.log_machine (
ram frozen<list>,
os frozen<list>
);

CREATE TYPE logstash_20150520."log_relatedContent" (
"ogDescription" frozen<list>,
"twitterDescription" frozen<list>,
"ogSite_name" frozen<list>,
"articleSection" frozen<list>,
"articlePublished_time" frozen<list>,
"twitterSite" frozen<list>,
"ogTitle" frozen<list>,
"twitterTitle" frozen<list>,
"twitterImage" frozen<list>,
"ogImageHeight" frozen<list>,
"articleModified_time" frozen<list>,
url frozen<list>,
"ogType" frozen<list>,
"ogUrl" frozen<list>,
"ogImageWidth" frozen<list>,
"ogImage" frozen<list>,
"twitterCard" frozen<list>,
"articleTag" frozen<list>
);

CREATE TABLE logstash_20150520.log (
"_id" text PRIMARY KEY,
"@message" list,
"@tags" list,
"@timestamp" list,
"@Version" list,
agent list,
bytes list,
clientip list,
extension list,
geo list<frozen<log_geo>>,
headings list,
host list,
ip list,
links list,
machine list<frozen<log_machine>>,
memory list,
phpmemory list,
referer list,
"relatedContent" list<frozen<log_relatedContent>>,
request list,
response list,
spaces list,
url list,
utc_time list,
xss list
) WITH bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = 'Auto-created by Elassandra'
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
CREATE CUSTOM INDEX elastic_log_xss_idx ON logstash_20150520.log (xss) USING 'org.elasticsearch.cassandra.ElasticSecondaryIndex';
CREATE CUSTOM INDEX elastic_log_host_idx ON logstash_20150520.log (host) USING 'org.elasticsearch.cassandra.ElasticSecondaryIndex';
CREATE CUSTOM INDEX elastic_log__message_idx ON logstash_20150520.log ("@message") USING 'org.elasticsearch.cassandra.ElasticSecondaryIndex';
CREATE CUSTOM INDEX elastic_log_extension_idx ON logstash_20150520.log (extension) USING 'org.elasticsearch.cassandra.ElasticSecondaryIndex';
CREATE CUSTOM INDEX elastic_log_clientip_idx ON logstash_20150520.log (clientip) USING 'org.elasticsearch.cassandra.ElasticSecondaryIndex';
CREATE CUSTOM INDEX elastic_log_spaces_idx ON logstash_20150520.log (spaces) USING 'org.elasticsearch.cassandra.ElasticSecondaryIndex';
CREATE CUSTOM INDEX elastic_log_bytes_idx ON logstash_20150520.log (bytes) USING 'org.elasticsearch.cassandra.ElasticSecondaryIndex';
CREATE CUSTOM INDEX elastic_log_url_idx ON logstash_20150520.log (url) USING 'org.elasticsearch.cassandra.ElasticSecondaryIndex';
CREATE CUSTOM INDEX elastic_log__version_idx ON logstash_20150520.log ("@Version") USING 'org.elasticsearch.cassandra.ElasticSecondaryIndex';
CREATE CUSTOM INDEX elastic_log_geo_idx ON logstash_20150520.log (geo) USING 'org.elasticsearch.cassandra.ElasticSecondaryIndex';
CREATE CUSTOM INDEX elastic_log_ip_idx ON logstash_20150520.log (ip) USING 'org.elasticsearch.cassandra.ElasticSecondaryIndex';
CREATE CUSTOM INDEX elastic_log_relatedContent_idx ON logstash_20150520.log ("relatedContent") USING 'org.elasticsearch.cassandra.ElasticSecondaryIndex';
CREATE CUSTOM INDEX elastic_log_memory_idx ON logstash_20150520.log (memory) USING 'org.elasticsearch.cassandra.ElasticSecondaryIndex';
CREATE CUSTOM INDEX elastic_log__tags_idx ON logstash_20150520.log ("@tags") USING 'org.elasticsearch.cassandra.ElasticSecondaryIndex';
CREATE CUSTOM INDEX elastic_log_request_idx ON logstash_20150520.log (request) USING 'org.elasticsearch.cassandra.ElasticSecondaryIndex';
CREATE CUSTOM INDEX elastic_log_machine_idx ON logstash_20150520.log (machine) USING 'org.elasticsearch.cassandra.ElasticSecondaryIndex';
CREATE CUSTOM INDEX elastic_log_response_idx ON logstash_20150520.log (response) USING 'org.elasticsearch.cassandra.ElasticSecondaryIndex';
CREATE CUSTOM INDEX elastic_log_headings_idx ON logstash_20150520.log (headings) USING 'org.elasticsearch.cassandra.ElasticSecondaryIndex';
CREATE CUSTOM INDEX elastic_log_referer_idx ON logstash_20150520.log (referer) USING 'org.elasticsearch.cassandra.ElasticSecondaryIndex';
CREATE CUSTOM INDEX elastic_log_links_idx ON logstash_20150520.log (links) USING 'org.elasticsearch.cassandra.ElasticSecondaryIndex';
CREATE CUSTOM INDEX elastic_log__timestamp_idx ON logstash_20150520.log ("@timestamp") USING 'org.elasticsearch.cassandra.ElasticSecondaryIndex';
CREATE CUSTOM INDEX elastic_log_phpmemory_idx ON logstash_20150520.log (phpmemory) USING 'org.elasticsearch.cassandra.ElasticSecondaryIndex';
CREATE CUSTOM INDEX elastic_log_utc_time_idx ON logstash_20150520.log (utc_time) USING 'org.elasticsearch.cassandra.ElasticSecondaryIndex';
CREATE CUSTOM INDEX elastic_log_agent_idx ON logstash_20150520.log (agent) USING 'org.elasticsearch.cassandra.ElasticSecondaryIndex';

Maybe I'm doing something wrong?

If I don't run the command for the geo_point creation on the Elasticsearch index, the bulk insert goes well but lat and lon are simply doubles (not geo coordinates) and kibana can't show them over the Map.

Thanks.

Alberto

Alter Table From Cassandra to Elasticsearch SYNC

I have created a table via Elasticsearch and it was synced to Cassandra:

 _id | message                                    | postDate                            | user
-----+--------------------------------------------+-------------------------------------+------------
   2 |     ['Another tweet, will it be indexed?'] | ['2009-11-15 14:12:12.000000+0000'] | ['kimchy']
   1 | ['Trying out Elassandra, so far so good?'] | ['2009-11-15 13:12:00.000000+0000'] | ['kimchy']

When I altered the table tweet and added an email column:

ALTER TABLE tweet ADD email varchar;

 _id | email | message                                    | postDate                            | user
-----+-------+--------------------------------------------+-------------------------------------+------------
   2 |  null |     ['Another tweet, will it be indexed?'] | ['2009-11-15 14:12:12.000000+0000'] | ['kimchy']
   1 |  null | ['Trying out Elassandra, so far so good?'] | ['2009-11-15 13:12:00.000000+0000'] | ['kimchy']

The new column doesn't appear in elasticsearch:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "twitter",
      "_type" : "tweet",
      "_id" : "2",
      "_score" : 1.0,
      "_source":{"postDate":"2009-11-15T14:12:12.000Z","message":"Another tweet, will it be indexed?","user":"kimchy"}
    }, {
      "_index" : "twitter",
      "_type" : "tweet",
      "_id" : "1",
      "_score" : 1.0,
      "_source":{"postDate":"2009-11-15T13:12:00.000Z","message":"Trying out Elassandra, so far so good?","user":"kimchy"}
    } ]
  }
}

The question is: how am I able to sync changes to the table from Cassandra to Elasticsearch?

Unlike Elasticsearch, can the number of shards be changed in Elassandra as in Cassandra?

Sorry for a silly question. I'm aware the number_of_shards is a fixed value in Elasticsearch, as it cannot be changed after creating the index, whereas new nodes (shard or partition nodes) can be added to the Cassandra ring. From the notes, I guess Elassandra uses the Cassandra config for the keyspace storage layer, its partitioner type, etc. Just to confirm, does Elassandra inherit the number of shards (config) from Cassandra or Elasticsearch? In other words, can the number of shards of an Elassandra index be changed at a later stage, after creating it?

ElasticSearch API issues

Hi,

I am having issues using the ElasticSearch API with Put, Get and Delete operations. If we use a standard ElasticSearch server then:

  1. Put a document on the index - returns "created: true"
  2. Get it successfully - returns "document"
  3. Delete it from the index - returns "found: true"
  4. Try to retrieve it again - returns nothing

If we use Elassandra, 1, 2 and 3 work, but 4 still returns part of the document and an Elasticsearch document id?

Again, if we use a standard ElasticSearch server:

  1. Put a document on the index - returns "created: true"
  2. Put same document on the index - returns "created: false"
  3. Delete it from the index - returns "found: true"
  4. Delete same document from the index - returns "found: false"

If we use Elassandra, the service always returns "created: true" and "found: true"?

Although sometimes we get an error "write_failure_exception" if we try to place the same document on the index again even if we delete it first?

And what is the story with regard to deleting indexes?

Thanks for any help

Peter

Build string on ES startpage is empty

The build number is not substituted in the version output:

{
  "name" : "abc",
  "cluster_name" : "Test Cluster",
  "version" : {
    "number" : "2.1.2",
    "build_hash" : "${buildNumber}",
    "build_timestamp" : "NA",
    "build_snapshot" : true,
    "lucene_version" : "5.3.1"
  },
  "tagline" : "You Know, for Search"
}

elastic search configurations

I'm a bit confused about where to place Elasticsearch config parameters. The docs say "don't touch" elasticsearch.yml, but where do you place default/global index parameters such as index.max_result_window?

Partial/selective indexes

What:
Like postgresql has: https://www.postgresql.org/docs/current/static/indexes-partial.html
Why:
To create smaller/specialized/faster indexes (or just index a subset of a table).
Example:
Move a heavy user's items to its own index. Create a small index for "last hour" data (this will require deleting old indexes).
How:
By using the current partitioned index http://doc.elassandra.io/en/latest/mapping.html#partitioned-index, modified so that if the function returns null or an empty string, no indexing is done into ES.

Makes sense?

How to execute a Cassandra query using a WHERE clause

Hi,

I am trying to execute the select query using the where clause on the user_name field.
CREATE TABLE insta2.tbl_jobseeker (
id int PRIMARY KEY,
user_name text
) WITH bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
CREATE INDEX username ON insta2.tbl_jobseeker (user_name);

But it throws the following exception:
Traceback (most recent call last):
File "/usr/local/bin/cqlsh", line 1258, in perform_simple_statement
result = future.result()
File "cassandra/cluster.py", line 3629, in cassandra.cluster.ResponseFuture.result (cassandra/cluster.c:69380)
raise self._final_exception
ReadFailure: Error from server: code=1300 [Replica(s) failed to execute read] message="Operation failed - received 0 responses and 1 failures" info={'failures': 1, 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}

Indexing data injected from Cassandra [QUESTION]

I'm sorry to open a GitHub issue; I didn't find any other way to ask a question and didn't find the answer in the documentation. I have a lot of data in CSV format, and I want to use Elassandra to perform a full-text search on some of the fields. I also don't want to manually initiate Elasticsearch index creation; I would like the index to be created on the fly when I'm inserting records into Cassandra.

Most of the examples in the project documentation show how to work with Elassandra from the Elasticsearch side, by inserting data using "curl -XPUT" commands. How can I create/update the index on the fly when inserting data using cqlsh commands?

Let's say each record in my CSV file contains four fields: timestamp, text, int and ascii. I want to push all this data into Elassandra (using cqlsh, maybe with some bulk loader) and perform a wildcard search on the text field (using curl -XGET ... _search), without triggering the indexing process myself. Is it possible? Can anybody provide a sequence of commands that I should run?

And one additional question: when I delete data from Cassandra (e.g. using a cqlsh DELETE statement), is the Elasticsearch index updated automatically, or should I trigger re-indexing myself?

Error when mapping a uuid cassandra type

Hello, I am testing Elassandra as a solution because (great job by the way!) I need analytics in my product (I am currently using Cassandra 3, but I am testing Elassandra and will migrate to it when support for Cassandra 3 is released).

I am having issues creating an index over a table with timeuuid/uuid column type. Here is what I am doing so far:

CREATE KEYSPACE IF NOT EXISTS tenantsdatabase WITH replication={ 'class':'NetworkTopologyStrategy', 'dc1':'1' };

CREATE TABLE IF NOT EXISTS tenantsdatabase.forms (
    tenantid varchar,
    relieveid timeuuid, 
    formcontent varchar,
PRIMARY KEY ((tenantid), relieveid));


http://xxx.xxx.xxx.xxx:9200/forms_index

{
   "settings" : { "keyspace" : "tenantsdatabase" },
   "mappings" : {
      "forms" : {
           "discover":".*",
           "properties" : {
              "tenantid" : { "type": "string", "index": "no" },
              "relieveid" : { "type": "string", "index": "analyzed" },
              "formcontent" : { "type" : "string", "index" : "analyzed" }
            }

      }
   }
}


GET http://xxx.xxx.xxx.xxx:9200/forms_index

The problem seems to be the mapping from string to uuid because when I change the type of the table from
relieveid timeuuid
to
relieveid varchar
the index works ok.

My question is: is it possible to map a timeuuid/uuid Cassandra type in an Elasticsearch index? Am I doing something wrong?

(Sorry about my english)

Starting Elassandra [ecstart]

Hello,

I always get this error if I start Elassandra:

What could be the problem?

CASSANDRA_HOME is set correct.

Starting with JVM debug address=4242 suspend=n
Starting with Elasticsearch enabled.
CLASSPATH=/root/elassandra-2.1.1-2//conf:/root/elassandra-2.1.1-2//bin/../lib/HdrHistogram-2.1.6.jar:/root/elassandra-2.1.1-2//bin/../lib/ST4-4.0.8.jar:/root/elassandra-2.1.1-2//bin/../lib/airline-0.6.jar:/root/elassandra-2.1.1-2//bin/../lib/ant-1.8.2.jar:/root/elassandra-2.1.1-2//bin/../lib/antlr-3.5.2.jar:/root/elassandra-2.1.1-2//bin/../lib/antlr-runtime-3.5.jar:/root/elassandra-2.1.1-2//bin/../lib/apache-log4j-extras-1.2.17.jar:/root/elassandra-2.1.1-2//bin/../lib/asm-4.1.jar:/root/elassandra-2.1.1-2//bin/../lib/asm-commons-4.1.jar:/root/elassandra-2.1.1-2//bin/../lib/cassandra-thrift-2.2.4.jar:/root/elassandra-2.1.1-2//bin/../lib/commons-cli-1.3.1.jar:/root/elassandra-2.1.1-2//bin/../lib/commons-codec-1.6.jar:/root/elassandra-2.1.1-2//bin/../lib/commons-lang3-3.1.jar:/root/elassandra-2.1.1-2//bin/../lib/commons-math3-3.2.jar:/root/elassandra-2.1.1-2//bin/../lib/compiler-0.8.13.jar:/root/elassandra-2.1.1-2//bin/../lib/compress-lzf-1.0.2.jar:/root/elassandra-2.1.1-2//bin/../lib/concurrentlinkedhashmap-lru-1.4.jar:/root/elassandra-2.1.1-2//bin/../lib/crc32ex-0.1.1.jar:/root/elassandra-2.1.1-2//bin/../lib/disruptor-3.0.1.jar:/root/elassandra-2.1.1-2//bin/../lib/elassandra-2.1.1-2.jar:/root/elassandra-2.1.1-2//bin/../lib/fastutil-6.5.7.jar:/root/elassandra-2.1.1-2//bin/../lib/groovy-all-2.4.4-indy.jar:/root/elassandra-2.1.1-2//bin/../lib/guava-18.0.jar:/root/elassandra-2.1.1-2//bin/../lib/hibernate-validator-4.3.0.Final.jar:/root/elassandra-2.1.1-2//bin/../lib/high-scale-lib-1.0.6.jar:/root/elassandra-2.1.1-2//bin/../lib/hppc-0.7.1.jar:/root/elassandra-2.1.1-2//bin/../lib/httpcore-4.2.4.jar:/root/elassandra-2.1.1-2//bin/../lib/icu4j-54.1.jar:/root/elassandra-2.1.1-2//bin/../lib/jackson-core-2.6.2.jar:/root/elassandra-2.1.1-2//bin/../lib/jackson-core-asl-1.9.13.jar:/root/elassandra-2.1.1-2//bin/../lib/jackson-dataformat-cbor-2.6.2.jar:/root/elassandra-2.1.1-2//bin/../lib/jackson-dataformat-smile-2.6.2.jar:/root/elassandra-2.1.1-2//bin/../lib/jackson-dataformat-yaml-2.6.2.jar:/root/elassandra-2.1.1-2//bin/../lib/jackson-mapper-asl-1.9.13.jar:/root/elassandra-2.1.1-2//bin/../lib/jamm-0.3.0.jar:/root/elassandra-2.1.1-2//bin/../lib/javassist-3.20.0-GA.jar:/root/elassandra-2.1.1-2//bin/../lib/javax.inject-1.jar:/root/elassandra-2.1.1-2//bin/../lib/jbcrypt-0.3m.jar:/root/elassandra-2.1.1-2//bin/../lib/jboss-logging-3.1.0.CR2.jar:/root/elassandra-2.1.1-2//bin/../lib/jcl-over-slf4j-1.7.7.jar:/root/elassandra-2.1.1-2//bin/../lib/jna-4.1.0.jar:/root/elassandra-2.1.1-2//bin/../lib/joda-convert-1.2.jar:/root/elassandra-2.1.1-2//bin/../lib/joda-time-2.8.2.jar:/root/elassandra-2.1.1-2//bin/../lib/json-simple-1.1.jar:/root/elassandra-2.1.1-2//bin/../lib/jsr166e-1.1.0.jar:/root/elassandra-2.1.1-2//bin/../lib/jts-1.13.jar:/root/elassandra-2.1.1-2//bin/../lib/junit-4.10.jar:/root/elassandra-2.1.1-2//bin/../lib/libthrift-0.9.2.jar:/root/elassandra-2.1.1-2//bin/../lib/log4j-1.2.17.jar:/root/elassandra-2.1.1-2//bin/../lib/log4j-over-slf4j-1.7.7.jar:/root/elassandra-2.1.1-2//bin/../lib/logback-classic-1.1.3.jar:/root/elassandra-2.1.1-2//bin/../lib/logback-core-1.1.3.jar:/root/elassandra-2.1.1-2//bin/../lib/lucene-analyzers-common-5.3.1.jar:/root/elassandra-2.1.1-2//bin/../lib/lucene-analyzers-icu-5.3.1.jar:/root/elassandra-2.1.1-2//bin/../lib/lucene-analyzers-kuromoji-5.3.1.jar:/root/elassandra-2.1.1-2//bin/../lib/lucene-analyzers-phonetic-5.3.1.jar:/root/elassandra-2.1.1-2//bin/../lib/lucene-analyzers-smartcn-5.3.1.jar:/root/elassandra-2.1.1-2//bin/../lib/lucene-analyzers-stempel-5.3.1.jar:/root/elassandra-2.1.1
-2//bin/../lib/lucene-backward-codecs-5.3.1.jar:/root/elassandra-2.1.1-2//bin/../lib/lucene-codecs-5.3.1.jar:/root/elassandra-2.1.1-2//bin/../lib/lucene-core-5.3.1.jar:/root/elassandra-2.1.1-2//bin/../lib/lucene-expressions-5.3.1.jar:/root/elassandra-2.1.1-2//bin/../lib/lucene-grouping-5.3.1.jar:/root/elassandra-2.1.1-2//bin/../lib/lucene-highlighter-5.3.1.jar:/root/elassandra-2.1.1-2//bin/../lib/lucene-join-5.3.1.jar:/root/elassandra-2.1.1-2//bin/../lib/lucene-memory-5.3.1.jar:/root/elassandra-2.1.1-2//bin/../lib/lucene-misc-5.3.1.jar:/root/elassandra-2.1.1-2//bin/../lib/lucene-queries-5.3.1.jar:/root/elassandra-2.1.1-2//bin/../lib/lucene-queryparser-5.3.1.jar:/root/elassandra-2.1.1-2//bin/../lib/lucene-sandbox-5.3.1.jar:/root/elassandra-2.1.1-2//bin/../lib/lucene-spatial-5.3.1.jar:/root/elassandra-2.1.1-2//bin/../lib/lucene-spatial3d-5.3.1.jar:/root/elassandra-2.1.1-2//bin/../lib/lucene-suggest-5.3.1.jar:/root/elassandra-2.1.1-2//bin/../lib/lucene-test-framework-5.3.1.jar:/root/elassandra-2.1.1-2//bin/../lib/lz4-1.3.0.jar:/root/elassandra-2.1.1-2//bin/../lib/metrics-core-3.1.0.jar:/root/elassandra-2.1.1-2//bin/../lib/netty-3.10.5.Final.jar:/root/elassandra-2.1.1-2//bin/../lib/netty-all-4.0.23.Final.jar:/root/elassandra-2.1.1-2//bin/../lib/reporter-config-base-3.0.0.jar:/root/elassandra-2.1.1-2//bin/../lib/reporter-config3-3.0.0.jar:/root/elassandra-2.1.1-2//bin/../lib/sigar-1.6.4.jar:/root/elassandra-2.1.1-2//bin/../lib/slf4j-api-1.7.7.jar:/root/elassandra-2.1.1-2//bin/../lib/snakeyaml-1.15.jar:/root/elassandra-2.1.1-2//bin/../lib/snappy-java-1.1.1.7.jar:/root/elassandra-2.1.1-2//bin/../lib/spatial4j-0.5.jar:/root/elassandra-2.1.1-2//bin/../lib/stream-2.5.2.jar:/root/elassandra-2.1.1-2//bin/../lib/super-csv-2.1.0.jar:/root/elassandra-2.1.1-2//bin/../lib/t-digest-3.0.jar:/root/elassandra-2.1.1-2//bin/../lib/thrift-server-0.3.7.jar:/root/elassandra-2.1.1-2//bin/../lib/validation-api-1.0.0.GA.jar
root@v22016022703732037:~/elassandra-2.1.1-2/bin# CompilerOracle: inline org/apache/cassandra/db/AbstractNativeCell.compareTo (Lorg/apache/cassandra/db/composites/Composite;)I
CompilerOracle: inline org/apache/cassandra/db/composites/AbstractSimpleCellNameType.compareUnsigned (Lorg/apache/cassandra/db/composites/Composite;Lorg/apache/cassandra/db/composites/Composite;)I
CompilerOracle: inline org/apache/cassandra/io/util/Memory.checkBounds (JJ)V
CompilerOracle: inline org/apache/cassandra/io/util/SafeMemory.checkBounds (JJ)V
CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compare (Ljava/nio/ByteBuffer;[B)I
CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compare ([BLjava/nio/ByteBuffer;)I
CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compareUnsigned (Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I
CompilerOracle: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo (Ljava/lang/Object;JILjava/lang/Object;JI)I
CompilerOracle: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo (Ljava/lang/Object;JILjava/nio/ByteBuffer;)I
CompilerOracle: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo (Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I
Listening for transport dt_socket at address: 4242
{2.1.2-SNAPSHOT}: Setup Failed ...
- NullPointerException[null value in entry: path.home=null]
java.lang.NullPointerException: null value in entry: path.home=null
    at com.google.common.collect.CollectPreconditions.checkEntryNotNull(CollectPreconditions.java:33)
    at com.google.common.collect.ImmutableMap.entryOf(ImmutableMap.java:135)
    at com.google.common.collect.ImmutableSortedMap.fromEntries(ImmutableSortedMap.java:282)
    at com.google.common.collect.ImmutableSortedMap.copyOfInternal(ImmutableSortedMap.java:275)
    at com.google.common.collect.ImmutableSortedMap.copyOf(ImmutableSortedMap.java:206)
    at org.elasticsearch.common.settings.Settings.<init>(Settings.java:83)
    at org.elasticsearch.common.settings.Settings$Builder.build(Settings.java:1214)
    at org.apache.cassandra.service.ElassandraDaemon.main(ElassandraDaemon.java:250)

Installation process

Hi,

The tool looks promising indeed. I tried to follow the install process, but it isn't that clear, unfortunately. I am using Ubuntu 14.04.
I have already installed Elasticsearch and Cassandra and they work.
But when I try to add Elassandra, it doesn't combine them together.
Is there a detailed procedure for your installation?

Regards

Elassandra Shield integration

Hi Vincent,
I would like to protect Elasticsearch with the Shield plugin, but when I try to configure it I receive this error:

You must set the ES_CLASSPATH var

After setting the ES_CLASSPATH var I receive another error:

Error: Could not find or load main class org.elasticsearch.shield.authc.esusers.tool.ESUsersTool

Probably I'm doing something wrong; can you tell me how to configure the Shield plugin over Elassandra?

Thank you

Alberto

elastic_admin keyspace issues

Hi,

I installed Elassandra release 2.1.1-15 on Ubuntu 14.04 and ran into an issue with the elastic_admin keyspace failing to be created. Is Elassandra supposed to work out of the box, or are there some other configuration settings I am missing?
Here is my log file regarding the startup and trying to index a twitter user.

Thanks

java.lang.NullPointerException when starting elassandra

Hi,
While starting Elassandra using
./cassandra -e
I got an error in system.log:

2016-05-19 18:58:09,525 ERROR [main] ElassandraDaemon.java:376 main Exception
java.lang.NullPointerException: null
at org.apache.cassandra.service.ElassandraDaemon.activate(ElassandraDaemon.java:112) ~[elassandra-2.1.1-10.jar:na]
at org.apache.cassandra.service.ElassandraDaemon.main(ElassandraDaemon.java:338) ~[elassandra-2.1.1-10.jar:na]

What could be the reason? Is there any need for a separate Elasticsearch server?

Cannot create index pattern in Kibana (cluster setup)

Has anyone else experienced any problems while setting up Kibana 4.3 while having multiple nodes?
When using Elassandra with just a single node, I'm able to set up the index pattern and see my data in Kibana as it's being loaded. However, when I try to do the same thing with several nodes, I cannot create the index pattern (Kibana is able to find the index and recognize the Time-field name, however it doesn't do anything when I click create).

Additionally, online it is posted that I should use the sed command as:

sed -i .bak -e "s/type: ${q}index-pattern${q}/type: ${q}index_pattern${q}/g" -e "s/type = ${q}index-pattern${q}/type = ${q}index_pattern${q}/g" -e "s%${q}index-pattern${q}: ${q}/settings/objects/savedSearches/${q}%${q}index_pattern${q}: ${q}/settings/objects/savedSearches/${q}%g" optimize/bundles/kibana.bundle.js src/ui/public/index_patterns/.js src/ui/public/index_patterns/.js

src/ui/public/index_patterns/*.js is repeated twice. In order to get kibana working on a single node, I also used sed on src/plugins/kibana/public/discover/controllers/discover.js.

Note: I'm using elassandra 2.1.1

full debian and ubuntu support

This weekend I will work on a new startup script and installation guide for installing elassandra on debian or ubuntu. Please no longer use the debian init script because it contains some bugs.
Thanks.

Creating mapping from cassandra table fails on float value

Hi Vincent,
I've created a table in Cassandra with several data types (text, timestamp, bigint, double, boolean, float) and I've done a synchronization to create a Mapping in Elasticsearch with CURL

curl -XPUT "http://localhost:9200/my_keyspace/_mapping/my_table" -d '{ "my_table" : { "columns_regexp" : "a.*", "properties" : { "name" : { "type" : "string", "index" : "not_analyzed" } } } }'

All fields are mapped well except the FLOAT value (querying Elasticsearch, the field exists but is empty).

Let me know if you are able to reproduce the issue.
Thanks,

Alberto

Support for PasswordAuthenticator/CassandraAuthorizer [QUESTION]

I apologize if this is already stated, but I couldn't find anything mentioned in the documentation or the other issues.

Is there support for those using the PasswordAuthenticator/CassandraAuthorizer?

I was able to spin up a cluster with the "AllowAllAuthenticator" and everything works great. I encounter errors when I try to build a cluster (or single node) with:
authenticator: org.apache.cassandra.auth.PasswordAuthenticator
authorizer: org.apache.cassandra.auth.CassandraAuthorizer

Cassandra starts fine, but none of the ElasticSearch functionality seems to work, which probably makes sense if it can't write to the elastic_admin keyspace.

If needed, running:
elassandra-2.1.1-17.9.noarch
elassandra-kibana-4.3.3-1.x86_64

I can send more info if needed.

Does each Elassandra node only have index information for its own node, and are there any benchmarks?

Does each Elassandra node only have index information for its own node, i.e., does it index only the data it owns according to the node's assigned partition keys/tokens?

We had a very bad experience with our old DSE Cassandra 1.2 + Solr setup: any Solr query would fan out to all nodes, because each node only contains index information for the node itself, and when you have an index on non-partition-key columns the indexed data "has to be" searched on all nodes. The more nodes, the higher the IO and the slower the calls.

I could not find any documentation on how it fares on throughput, nor any benchmarks or comparisons with other Cassandra Solr indices out there.
http://elassandra.readthedocs.io/en/latest/integration.html

Allow querying ES index directly with CQL

It would be really great, instead of using the REST API of ES, to allow querying ES directly from CQL.

This has been achieved already by https://github.com/Stratio/cassandra-lucene-index :

SELECT * FROM tweets WHERE expr(tweets_index,'{
    filter : {type:"range", field:"time", lower:"2014/04/25", upper:"2014/05/01"}
}') limit 100;

For standard and basic filtering, users could use CQL directly to query data. Of course, the REST API would still be there for advanced usage.

Elastic Search plugins support

Hello, I am wondering if Elassandra will support the ecosystem of ES plugins, specifically the Watcher plugin for sending alerts?

How to apply filter query?

Hi,
I am trying to apply elasticsearch filter queries to retrieve data.
POST request,
http://localhost:9200/twitter/tweet/_search { "query_string" : { "default_field" : "message", "query" : "hi" } }

But I am getting the following error:
{ "error": { "root_cause": [ { "type": "search_parse_exception", "reason": "failed to parse search source. unknown search element [query_string]", "line": 2, "col": 5 } ], "type": "search_phase_execution_exception", "reason": "all shards failed", "phase": "query_fetch", "grouped": true, "failed_shards": [ { "shard": 0, "index": "twitter", "node": "03ecbd77-ef8f-4502-bcd6-bf453b24d116", "reason": { "type": "search_parse_exception", "reason": "failed to parse search source. unknown search element [query_string]", "line": 2, "col": 5 } } ] }, "status": 400 }

Is it possible to apply all the filter queries in Elassandra?
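
The "unknown search element [query_string]" error above suggests the search body is missing the top-level query object. A minimal sketch of the corrected request, assuming the standard Elasticsearch query DSL (not Elassandra-specific):

# Hypothetical example: wrap query_string inside a top-level "query" object.
curl -XPOST "localhost:9200/twitter/tweet/_search?pretty" -d '{
  "query": {
    "query_string": { "default_field": "message", "query": "hi" }
  }
}'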
