I'm trying to use the engine from Scala. It's a pretty simple setup, using the example parquet file from the testdata folder. The code looks like this:
import scala.jdk.CollectionConverters._

val ctx = new ExecutionContext(Map.empty[String, String].asJava)
val pqtSource = new ParquetDataSource("data/alltypes_plain.parquet")
println(pqtSource.schema().toString)
ctx.registerDataSource("pdata", pqtSource)
val df2 = ctx.sql("select id, bool_col from pdata")
// execute the plan and print each row
ctx.execute(df2).iterator().asScala.foreach(r => println(r))
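In case the problem is on the interop side rather than in the engine, here is a minimal standalone sketch of the `scala.jdk.CollectionConverters` conversions I'm relying on (independent of the engine API; `ConvertersDemo` is just a name for this sketch):

```scala
import scala.jdk.CollectionConverters._

object ConvertersDemo extends App {
  // Scala Map -> java.util.Map, as passed to the ExecutionContext constructor
  val javaProps: java.util.Map[String, String] = Map.empty[String, String].asJava
  println(javaProps.isEmpty) // prints: true

  // java.util.Iterator -> Scala Iterator, as used on the execute() result
  val javaList = java.util.List.of("a", "b", "c")
  val scalaList = javaList.iterator().asScala.toList
  println(scalaList) // prints: List(a, b, c)
}
```

Both conversions behave as expected in isolation, so the empty-properties map and the result iteration should be reaching the engine unchanged.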
The schema prints correctly:

Schema(fields=[Field(name=id, dataType=Int(32, true)), Field(name=bool_col, dataType=Bool), Field(name=tinyint_col, dataType=Int(32, true)), Field(name=smallint_col, dataType=Int(32, true)), Field(name=int_col, dataType=Int(32, true)), Field(name=bigint_col, dataType=Int(64, true)), Field(name=float_col, dataType=FloatingPoint(SINGLE)), Field(name=double_col, dataType=FloatingPoint(DOUBLE)), Field(name=date_string_col, dataType=Binary), Field(name=string_col, dataType=Binary), Field(name=timestamp_col, dataType=Binary)])
Reading 8 rows
null,null
null,null
null,null
null,null
null,null
null,null
null,null
null,null
I expected the actual id and bool_col values, but every row prints as null,null. Am I missing something in how the results should be read?