
hive-mongo's People

Contributors

dvasilen, sseveran, yc-huang


hive-mongo's Issues

metadata exception

A SELECT query on a Hive table stored by the MongoDB storage handler generates:
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NumberFormatException: null

Exception in thread "main" java.lang.NoSuchMethodError: com.mongodb.DB.authenticate(Ljava/lang/String;[C)Z

This is the error message:

Exception in thread "main" java.lang.NoSuchMethodError: com.mongodb.DB.authenticate(Ljava/lang/String;[C)Z
at org.yong3.hive.mongo.MongoTable.<init>(MongoTable.java:22)
at org.yong3.hive.mongo.MongoSplit.getSplits(MongoSplit.java:84)
at org.yong3.hive.mongo.MongoInputFormat.getSplits(MongoInputFormat.java:86)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextSplits(FetchOperator.java:371)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:303)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:458)
at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:427)
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1765)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:237)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:169)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:380)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:740)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:685)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)

Can you solve my problem?
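A NoSuchMethodError like this usually means the mongo-java-driver version on the classpath is not the one hive-mongo was compiled against; `DB.authenticate(String, char[])` belongs to the 2.x driver line and was removed in later drivers. As a hedged diagnostic (the class and signature come from the stack trace above; the helper itself is illustrative), a reflection check run with the same jars Hive loads can confirm which signature your driver actually provides:

```java
import java.lang.reflect.Method;

// Hedged diagnostic: report whether a given class/method signature is visible
// on the current classpath. Run it with the same jars Hive is started with.
public class MethodCheck {
    public static String check(String className, String methodName, Class<?>... params) {
        try {
            Method m = Class.forName(className).getMethod(methodName, params);
            return "present: " + m;
        } catch (ClassNotFoundException e) {
            return "class not on classpath: " + className;
        } catch (NoSuchMethodException e) {
            return "class present, method missing: " + methodName;
        }
    }

    public static void main(String[] args) {
        // The signature from the stack trace: DB.authenticate(String, char[])
        System.out.println(check("com.mongodb.DB", "authenticate", String.class, char[].class));
    }
}
```

If this prints "class present, method missing", the driver jar on the classpath is too new (or too old) for the hive-mongo build in use.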

Joining to Hive Data?

Maybe I am assuming too much here, but should we be able to join to Hive data without a problem? I.e., given a Mongo-linked table, should I be able to join that data to a standard Hive table? Here's a query example:

select h.* from
(select mongo_field1, mongo_field2 from mongo_table) m
JOIN
(select hive_field1, hive_field2 from hive_table where day = '2012-09-25') h
ON m.mongo_field1 = h.hive_field1

Note: both fields are of STRING type and both subqueries run as intended. Here is the error I get:
Number of reduce tasks not specified. Estimated from input data size: 1

In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=
In order to set a constant number of reducers:
set mapred.reduce.tasks=
java.lang.ArithmeticException: / by zero
at org.yong3.hive.mongo.MongoSplit.getSplits(MongoSplit.java:86)
at org.yong3.hive.mongo.MongoInputFormat.getSplits(MongoInputFormat.java:84)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:292)
at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1044)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1036)
at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:171)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:918)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:869)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1126)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:869)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:843)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:435)
at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:137)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1326)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1118)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:951)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:689)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:557)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
Job Submission failed with exception 'java.lang.ArithmeticException(/ by zero)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
hive (hive_database)>
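The `/ by zero` at MongoSplit.getSplits:86 suggests a split-size computation that divides by a value which is zero for this query, perhaps the document count or the requested number of splits (an assumption; I have not confirmed this against the hive-mongo source). A guarded version of such a computation, sketched rather than taken from the actual code, might look like:

```java
// Hedged sketch (not the actual hive-mongo code): guard a split-size
// computation so an empty collection or a zero split count degenerates to a
// single split instead of throwing ArithmeticException.
public class SplitMath {
    public static long splitSize(long docCount, int numSplits) {
        if (numSplits <= 0 || docCount <= 0) {
            return 1; // fall back to one (possibly empty) split
        }
        return Math.max(1, docCount / numSplits);
    }

    public static void main(String[] args) {
        System.out.println(splitSize(0, 4));    // empty collection: 1
        System.out.println(splitSize(1000, 4)); // 250
        System.out.println(splitSize(3, 0));    // zero splits requested: 1
    }
}
```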

More conversation about hive-mongo

yc-huang -

I'd like to have an offline conversation about how I may be able to contribute to hive-mongo. Please shoot me a message at [email protected] (this first email is a spam defense; I'll switch to my private one once we talk).

Great work on this.

Feature Request: read bson data directly from dbpath (without mongod running)

It would be really cool if hive-mongo could read directly from MongoDB data files rather than having to go through a mongod process (that way I could run it directly against backups without having to start mongod on them). If this is too difficult or impossible, the next best thing would be the ability to run it against the BSON files produced by mongodump (though at that point, I'm already halfway to exporting the data to another format anyway).

Error in loading storage handler.org.yong3.hive.mongo.MongoStorageHandler

Hi,
Thanks for the Hive plug-in. I ran into an issue while following the README to test the installation.
I am getting an error loading the storage handler. It may be a path or environment-variable issue.

I verified : <mongo_hive_home>/src/main/java/org/yong3/hive/mongo/MongoStorageHandler.java exists.

mongo_hive home dir: /home/ben/tools/mongo_hive

here is the full log

[DEV] ben@bi-all:~/tools$ $HIVE_HOME/bin/hive --auxpath /home//ben/tools/mongo_hive/mongo-java-driver-2.6.3.jar,/home/ben/tools/mongo_hive/guava-r06.jar, /home/ben/tools/mongo_hive/hive-mongo-0.0.1-SNAPSHOT.jar
Logging initialized using configuration in jar:file:/home/ben/tools/hive-0.8.1-bin/lib/hive-common-0.8.1.jar!/hive-log4j.properties
Hive history file=/tmp/ben/hive_job_log_ben_201207211824_1059679816.txt
hive> create external table mongo_events(id int, dt int, ip string, event string) stored by "org.yong3.hive.mongo.MongoStorageHandler" with serdeproperties ( "mongo.column.mapping" = "_id,dt,ip,event" ) tblproperties ( "mongo.host" = "bi-fb-app.dev", "mongo.port" = "27017", "mongo.db" = "game", "mongo.collection" = "event_20120703" );
Failed with exception org.apache.hadoop.hive.ql.metadata.HiveException: Error in loading storage handler.org.yong3.hive.mongo.MongoStorageHandler
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
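For what it's worth, "Error in loading storage handler" generally means Hive could not load the compiled class: having MongoStorageHandler.java under src/ is not enough, since the compiled .class must be inside a jar on the aux path. (One thing visible in the log above: there is a space after the comma before the hive-mongo jar in the --auxpath list, which could make the shell pass that jar as a separate argument; this is an observation, not a confirmed diagnosis.) A minimal check, run with the same jars Hive is given, that the class is visible at runtime:

```java
// Hedged sketch: verify a class can be loaded from the current classpath.
// Run with the same -cp / aux jars that Hive is started with.
public class ClassOnPath {
    public static boolean present(String className) {
        try {
            Class.forName(className, false, ClassOnPath.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(present("org.yong3.hive.mongo.MongoStorageHandler"));
    }
}
```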

How do I compile?

Your readme has great instructions on how to use this once it's built, but unfortunately, no instructions on how to build it. I'm not a Java programmer myself, so a little guidance on this would help a lot. Thank you.

Exception

I am trying to run the example usage:

hive --auxpath Hive-mongo/target/hive-mongo-0.0.1-SNAPSHOT-jar-with-dependencies.jar

The jar file is loaded correctly. But after creating the mongo_users table, I try to populate it as follows and get the exception below. Any idea why this Wrong FS error is occurring here?

hive> insert overwrite table mongo_users select id, name,age from test;

java.lang.IllegalArgumentException: Wrong FS: file://Hive-mongo/target/hive-mongo-0.0.1-SNAPSHOT-jar-with-dependencies.jar, expected: file:///
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:381)
at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:55)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:393)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:213)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:163)
at org.apache.hadoop.mapred.JobClient.copyRemoteFiles(JobClient.java:627)
at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:730)
at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:655)
at org.apache.hadoop.mapred.JobClient.access$300(JobClient.java:174)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:865)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:435)
at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:137)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1326)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1118)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:951)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:689)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:557)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Job Submission failed with exception 'java.lang.IllegalArgumentException(Wrong FS: file://Hive-mongo/target/hive-mongo-0.0.1-SNAPSHOT-jar-with-dependencies.jar, expected: file:///)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
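The "Wrong FS" message hints at the cause: in a file:// URI, the text after the double slash is parsed as an authority (host), so a relative --auxpath like Hive-mongo/target/... turns "Hive-mongo" into a host, while Hadoop's local filesystem expects the empty-authority form file:///. Passing an absolute path to --auxpath should avoid this. A small demonstration (the absolute path below is illustrative, not from the log):

```java
import java.net.URI;

// Hedged demonstration of why a relative aux path yields "Wrong FS":
// "file://Hive-mongo/..." parses "Hive-mongo" as the URI authority, whereas
// Hadoop's local filesystem expects no authority at all (file:///).
public class WrongFsDemo {
    public static void main(String[] args) {
        URI relative = URI.create("file://Hive-mongo/target/hive-mongo.jar");
        System.out.println("authority = " + relative.getAuthority()); // Hive-mongo

        // Hypothetical absolute path: no authority, so the filesystem check passes.
        URI absolute = URI.create("file:///home/user/Hive-mongo/target/hive-mongo.jar");
        System.out.println("authority = " + absolute.getAuthority()); // null
    }
}
```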

java.lang.NoSuchMethodError: org.apache.hadoop.hive.serde2.SerDeUtils.hasAnyNullObject

I am trying to perform an export into Mongo using the hive-mongo driver, but a very frustrating error pops up right at the end of the map-reduce process. The error has to do with the SerDe dependency, which I believe is hive-exec.jar. Is this a bug? Can you please help me solve this issue?

Stacktrace:

[root@localhost ~]# hive
Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties
Hive history file=/tmp/root/hive_job_log_root_201305281708_1510602612.txt
hive> insert into table mongo_popular_routes
> select r.route_short_name as id,
> r.route_long_name as name,
> cast(count(*) / 31 as INT) as daily_movements
> from train_movements tm
> join routes r
> on tm.route_id = r.route_id
> where tm.event_type = 1
> group by r.route_short_name, r.route_long_name;
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=
In order to set a constant number of reducers:
set mapred.reduce.tasks=
Starting Job = job_1369775178800_0001, Tracking URL = http://localhost.localdomain:8088/proxy/application_1369775178800_0001/
Kill Command = /usr/lib/hadoop/bin/hadoop job -Dmapred.job.tracker=127.0.0.1:8021 -kill job_1369775178800_0001
Hadoop job information for Stage-0: number of mappers: 2; number of reducers: 1
2013-05-28 17:09:05,141 Stage-0 map = 0%, reduce = 0%
2013-05-28 17:09:21,597 Stage-0 map = 50%, reduce = 0%, Cumulative CPU 1.16 sec
2013-05-28 17:09:22,850 Stage-0 map = 50%, reduce = 0%, Cumulative CPU 1.16 sec
2013-05-28 17:09:23,991 Stage-0 map = 50%, reduce = 0%, Cumulative CPU 3.6 sec
2013-05-28 17:09:25,087 Stage-0 map = 50%, reduce = 0%, Cumulative CPU 3.6 sec
2013-05-28 17:09:26,460 Stage-0 map = 50%, reduce = 17%, Cumulative CPU 4.09 sec
2013-05-28 17:09:27,665 Stage-0 map = 50%, reduce = 17%, Cumulative CPU 6.11 sec
2013-05-28 17:09:28,849 Stage-0 map = 50%, reduce = 17%, Cumulative CPU 6.11 sec
2013-05-28 17:09:30,031 Stage-0 map = 50%, reduce = 17%, Cumulative CPU 6.14 sec
2013-05-28 17:09:31,185 Stage-0 map = 50%, reduce = 17%, Cumulative CPU 7.09 sec
2013-05-28 17:09:32,395 Stage-0 map = 50%, reduce = 17%, Cumulative CPU 7.13 sec
2013-05-28 17:09:33,567 Stage-0 map = 59%, reduce = 17%, Cumulative CPU 8.02 sec
2013-05-28 17:09:34,718 Stage-0 map = 59%, reduce = 17%, Cumulative CPU 8.02 sec
2013-05-28 17:09:36,034 Stage-0 map = 59%, reduce = 17%, Cumulative CPU 8.05 sec
2013-05-28 17:09:37,230 Stage-0 map = 59%, reduce = 17%, Cumulative CPU 8.97 sec
2013-05-28 17:09:38,435 Stage-0 map = 59%, reduce = 17%, Cumulative CPU 8.97 sec
2013-05-28 17:09:39,691 Stage-0 map = 59%, reduce = 17%, Cumulative CPU 9.83 sec
2013-05-28 17:09:40,803 Stage-0 map = 59%, reduce = 17%, Cumulative CPU 9.83 sec
2013-05-28 17:09:41,992 Stage-0 map = 59%, reduce = 17%, Cumulative CPU 9.86 sec
2013-05-28 17:09:43,117 Stage-0 map = 68%, reduce = 17%, Cumulative CPU 11.04 sec
2013-05-28 17:09:44,227 Stage-0 map = 68%, reduce = 17%, Cumulative CPU 11.04 sec
2013-05-28 17:09:45,405 Stage-0 map = 68%, reduce = 17%, Cumulative CPU 11.05 sec
2013-05-28 17:09:46,539 Stage-0 map = 76%, reduce = 17%, Cumulative CPU 12.23 sec
2013-05-28 17:09:47,696 Stage-0 map = 76%, reduce = 17%, Cumulative CPU 12.23 sec
2013-05-28 17:09:48,832 Stage-0 map = 76%, reduce = 17%, Cumulative CPU 13.52 sec
2013-05-28 17:09:49,934 Stage-0 map = 76%, reduce = 17%, Cumulative CPU 13.52 sec
2013-05-28 17:09:51,061 Stage-0 map = 76%, reduce = 17%, Cumulative CPU 13.54 sec
2013-05-28 17:09:52,166 Stage-0 map = 83%, reduce = 17%, Cumulative CPU 14.95 sec
2013-05-28 17:09:53,299 Stage-0 map = 100%, reduce = 17%, Cumulative CPU 15.57 sec
2013-05-28 17:09:54,578 Stage-0 map = 100%, reduce = 17%, Cumulative CPU 15.79 sec
2013-05-28 17:09:55,653 Stage-0 map = 100%, reduce = 0%, Cumulative CPU 14.86 sec
[... 22 similar "map = 100%, reduce = 0%" progress lines omitted ...]
2013-05-28 17:10:21,294 Stage-0 map = 100%, reduce = 100%, Cumulative CPU 14.86 sec
MapReduce Total cumulative CPU time: 14 seconds 860 msec
Ended Job = job_1369775178800_0001 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1369775178800_0001_m_000001 (and more) from job job_1369775178800_0001

Task with the most failures(4):

Task ID:
task_1369775178800_0001_r_000000

URL:

http://localhost.localdomain:50030/taskdetails.jsp?jobid=job_1369775178800_0001&tipid=task_1369775178800_0001_r_000000

Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: java.lang.NoSuchMethodError: org.apache.hadoop.hive.serde2.SerDeUtils.hasAnyNullObject(Ljava/util/List;Lorg/apache/hadoop/hive/serde2/objectinspector/StandardStructObjectInspector;[Z)Z
at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:268)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:448)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:399)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.hive.serde2.SerDeUtils.hasAnyNullObject(Ljava/util/List;Lorg/apache/hadoop/hive/serde2/objectinspector/StandardStructObjectInspector;[Z)Z
at org.apache.hadoop.hive.ql.exec.JoinOperator.processOp(JoinOperator.java:126)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:247)
... 7 more

Compile and Install

I was reading the closed issue on how to install, and as a complete Java newbie I have some questions about compilation and installation. One comment there had some tips on how to build it, so I copied them here for reference.

First, before I begin, I am excited for this project. It's a great idea, and helps people get the most out of different systems without a lot of bandaids regularly moving data between systems. Thank you for your work on this project.

Q1: When it comes to jar files, how platform-specific are they? Could a jar file be compiled and included in the git repo for those of us who are very new to building jars and all the dependencies involved?

Q2: Which jars specified in the README example come from this project vs. MongoDB vs. other projects?

Q3: Once all jars are collected, can they just be copied to the Hive lib path without specifying them on each job (or maybe only specifying them on create external table jobs)?

Q4: If building is required, can a step-by-step guide be made for those of us who have never done this? I know the guide below is a simple step-by-step, but even that I had issues with (Maven wouldn't install for me). Also, I see the pom.xml file has LOTS of dependencies; how do I provide those to Maven so it knows where they are? And openJDK vs. SunJDK: if I am building this on a system that is already running other Hadoop stuff, will Java installs conflict? So many questions about how Java works!

Q5: I apologize for the complete newbieness of my questions; assume NO knowledge of building Java or how things work from that perspective... I am a data guy :)

Here's a simple guide on how to build; hope it helps:

  1. Make sure you have the Java SDK installed (otherwise download and install it from http://www.oracle.com/technetwork/java/index.html), that the $JAVA_HOME env variable points to the installed directory, and that $JAVA_HOME/bin is included in the $PATH env variable;
  2. download Maven from http://maven.apache.org and install it to a directory (let's say $MAVEN_HOME); add $MAVEN_HOME/bin to $PATH;
  3. git clone Hive-Mongo to a directory; launch a shell, cd to that directory and execute "mvn package"; if everything is OK, you will find "hive-mongo-0.0.1-SNAPSHOT.jar" in the "target" directory.

Exception in thread "main" java.lang.NoSuchMethodError:

Hi to all,

Recently I have been trying to code with Apache River in Java. I am testing sending a user input value from the client, processing it on the server, and returning the result to the client, but I get this error:

Exception in thread "main" java.lang.NoSuchMethodError: com.sun.jini.example.hello.Hello.sayHello()Ljava/lang/String;
at com.sun.jini.example.hello.Client.mainAsSubject(Client.java:142)
at com.sun.jini.example.hello.Client.main(Client.java:83)

Please let me know how to overcome this error so I can continue my project work.

Thank you all in advance.

Additional data types to be added?

I'm obviously asking for too much (without offering any incentive, also), but is there any plan to support additional types, including these (in my order of importance, but please feel free to re-order):

  1. boolean
  2. Date , in format of "Date":{ "$Date":"..."} or "Date":{"$ISODate":"..."}
  3. custom structs

Thanks; this really is the best SerDe out there, with great performance. Any chance we can contribute via donation?

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.hive.serde2.ColumnProjectionUtils.getReadColumnIDs(Lorg/apache/hadoop/conf/Configuration;)Ljava/util/ArrayList;

Hi,
I am currently working with Hive 0.14 and MongoDB 2.6.3. Following the steps, I built successfully and copied hive-mongo-0.0.3.jar, guava-r06.jar, and mongo-2.6.3.jar into $HIVE_HOME/lib and also into $HADOOP_HOME/share/hadoop/{mapred,hdfs,yarn} (Hadoop 2.4). I created a sample external Hive-MongoStorage table (table name: mongodb_archives) and then ran an insert overwrite into that table; it sends data into the MongoDB collection, and I can query the inserted data in the collection. The problem is that whenever I query the Hive-MongoStorage table I get this exception:
hive> select * from mongodb_archives;
OK
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.hive.serde2.ColumnProjectionUtils.getReadColumnIDs(Lorg/apache/hadoop/conf/Configuration;)Ljava/util/ArrayList;
at org.yong3.hive.mongo.MongoInputFormat.getRecordReader(MongoInputFormat.java:39)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:498)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:588)
at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:561)
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1621)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:267)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
[hadoop@hadoop ~]$
This exception causes Hive to drop out of the CLI.
Please help me out with this issue.

Thanks,
Divya N

java.lang.AbstractMethodError in 'insert into' while using org.yong3.hive.mongo.MongoStorageHandler.configureJobConf

Hi,
I created an external table in Hive using MongoStorageHandler, and that succeeded. But when I tried an 'insert into' statement, it failed with the following error:
java.lang.AbstractMethodError: org.yong3.hive.mongo.MongoStorageHandler.configureJobConf(Lorg/apache/hadoop/hive/ql/plan/TableDesc;Lorg/apache/hadoop/mapred/JobConf;)V
at org.apache.hadoop.hive.ql.plan.PlanUtils.configureJobConf(PlanUtils.java:800)
at org.apache.hadoop.hive.ql.plan.MapWork.configureJobConf(MapWork.java:479)
at org.apache.hadoop.hive.ql.plan.MapredWork.configureJobConf(MapredWork.java:70)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:379)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:144)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1414)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1192)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1020)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:781)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

Can anyone help me out? Thanks in advance.

Error in metadata: java.lang.NullPointerException

I am working on testing this; once again, thank you for your efforts. I've compiled it all per the other issue I put in, thanks for that as well. I have MongoDB running on a different host and it is working well; I can also connect to it on 27017 from the host running Hive. I've created a users collection in the test db. When I run the following:

create external table mongo_users(id int, name string, age int)
stored by "org.yong3.hive.mongo.MongoStorageHandler"
with serdeproperties ( "mongo.column.mapping" = "_id,name,age" )
tblproperties ( "mongo.host" = "192.168.0.11", "mongo.port" = "27017",
"mongo.db" = "test", "mongo.collection" = "users" );

With Verbose logging turned on, this is the result. Any advice would be welcome. I am running the current version of MongoDB with Hive 0.9.

hive> create external table mongo_users(id int, name string, age int)
> stored by "org.yong3.hive.mongo.MongoStorageHandler"
> with serdeproperties ( "mongo.column.mapping" = "_id,name,age" )
> tblproperties ( "mongo.host" = "192.168.0.11", "mongo.port" = "27017",
> "mongo.db" = "test", "mongo.collection" = "users" );
12/10/13 10:51:04 INFO ql.Driver:
12/10/13 10:51:04 INFO ql.Driver:
12/10/13 10:51:04 INFO parse.ParseDriver: Parsing command: create external table mongo_users(id int, name string, age int)
stored by "org.yong3.hive.mongo.MongoStorageHandler"
with serdeproperties ( "mongo.column.mapping" = "_id,name,age" )
tblproperties ( "mongo.host" = "192.168.0.11", "mongo.port" = "27017",
"mongo.db" = "test", "mongo.collection" = "users" )
12/10/13 10:51:04 INFO parse.ParseDriver: Parse Completed
12/10/13 10:51:04 INFO parse.SemanticAnalyzer: Starting Semantic Analysis
12/10/13 10:51:04 INFO parse.SemanticAnalyzer: Creating table mongo_users position=22
12/10/13 10:51:04 INFO ql.Driver: Semantic Analysis Completed
12/10/13 10:51:04 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
12/10/13 10:51:04 INFO ql.Driver: </PERFLOG method=compile start=1350143464531 end=1350143464687 duration=156>
12/10/13 10:51:04 INFO ql.Driver:
12/10/13 10:51:04 INFO ql.Driver: Starting command: create external table mongo_users(id int, name string, age int)
stored by "org.yong3.hive.mongo.MongoStorageHandler"
with serdeproperties ( "mongo.column.mapping" = "_id,name,age" )
tblproperties ( "mongo.host" = "192.168.0.11", "mongo.port" = "27017",
"mongo.db" = "test", "mongo.collection" = "users" )
12/10/13 10:51:04 INFO exec.DDLTask: Use StorageHandler-supplied org.yong3.hive.mongo.MongoSerDe for table mongo_users
12/10/13 10:51:04 INFO hive.log: DDL: struct mongo_users { i32 id, string name, i32 age}
FAILED: Error in metadata: java.lang.NullPointerException
12/10/13 10:51:04 ERROR exec.Task: FAILED: Error in metadata: java.lang.NullPointerException
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:544)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3305)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:242)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1326)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1118)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:951)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:689)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:557)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:116)
at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.<init>(StandardStructObjectInspector.java:106)
at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:274)
at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:259)
at org.yong3.hive.mongo.MongoSerDe.initialize(MongoSerDe.java:101)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:203)
at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:260)
at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:253)
at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:490)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:518)
... 17 more

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
12/10/13 10:51:04 ERROR ql.Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
12/10/13 10:51:04 INFO ql.Driver: </PERFLOG method=Driver.execute start=1350143464687 end=1350143464745 duration=58>
12/10/13 10:51:04 INFO ql.Driver:
12/10/13 10:51:04 INFO ql.Driver: </PERFLOG method=releaseLocks start=1350143464745 end=1350143464746 duration=1>
