suketa / ruby-duckdb
Ruby binding for DuckDB
Home Page: https://github.com/suketa/ruby-duckdb
License: MIT License
brew extract duckdb duckdb/taps --version $DUCKDB_VERSION fails when DUCKDB_VERSION = 0.3.1:
==> Searching repository history
Error: duckdb: undefined method `[]' for nil:NilClass
Error: Process completed with exit code 1.
Should we download the DuckDB macOS binary instead to fix this issue?
use duckdb_append_hugeint method
duckdb 0.2.8 provides a C API for Apache Arrow support.
Implement it in ruby-duckdb.
duckdb provides column_count in the duckdb_result struct in the C header file, but DuckDB::Result does not expose it.
Add a DuckDB::Result#column_count method.
db = DuckDB::Database.open
con = db.connect
con.query('CREATE TABLE users (id INTEGER, name STRING)')
con.query("INSERT INTO users VALUES (1, 'Bob'), (2, 'John')")
r = con.query('SELECT name FROM users')
p r.class # => DuckDB::Result
p r.column_count # => expected 1 but raises NoMethodError. <= fix this
duckdb 0.2.8 added some new DUCKDB_TYPE_XXX values to the C header file.
Support these types in ruby-duckdb.
diff --git a/src/include/duckdb.h b/src/include/duckdb.h
index b7ac54134..218255383 100644
--- a/src/include/duckdb.h
+++ b/src/include/duckdb.h
@@ -42,6 +42,14 @@ typedef enum DUCKDB_TYPE {
DUCKDB_TYPE_INTEGER,
// int64_t
DUCKDB_TYPE_BIGINT,
+ // uint8_t
+ DUCKDB_TYPE_UTINYINT,
+ // uint16_t
+ DUCKDB_TYPE_USMALLINT,
+ // uint32_t
+ DUCKDB_TYPE_UINTEGER,
+ // uint64_t
+ DUCKDB_TYPE_UBIGINT,
// float
DUCKDB_TYPE_FLOAT,
duckdb_append_interval was added to duckdb.h in duckdb 0.2.9.
Implement Appender#append_interval using the Appender#_append_interval private method.
Implement PreparedStatement#bind_interval using PreparedStatement#_bind_interval.
Implement the Appender#_append_timestamp private method using duckdb_append_timestamp.
Appender#_append_timestamp(year, month, day, hour, minute, second, microsecond)
This doesn't seem to work with the brew version of duckdb, and it's not obvious to me which of the install methods (https://duckdb.org/docs/installation/) this is intended to work with.
before #172, implement DuckDB::Appender#_append_date private method.
expected behavior
db = DuckDB::Database.open
con = db.connect
con.query('CREATE TABLE dates (date_value DATE)')
appender = con.appender('dates')
appender.begin_row
appender.send('_append_date', 2021, 9, 12) # insert 2021/9/12
appender.end_row
appender.flush
fix rake test failures with duckdb 0.2.9
1) Error:
DuckDBTest::AppenderTest#test_append:
DuckDB::Error: failed to append
/home/suke/mywork/ruby-duckdb/lib/duckdb/appender.rb:60:in `append_varchar'
/home/suke/mywork/ruby-duckdb/lib/duckdb/appender.rb:60:in `append'
/home/suke/mywork/ruby-duckdb/test/duckdb_test/appender_test.rb:65:in `sub_test_append_column'
/home/suke/mywork/ruby-duckdb/test/duckdb_test/appender_test.rb:200:in `test_append'
2) Error:
DuckDBTest::AppenderTest#test_append_varchar_with_date_column:
DuckDB::Error: failed to append
/home/suke/mywork/ruby-duckdb/test/duckdb_test/appender_test.rb:69:in `append_varchar'
/home/suke/mywork/ruby-duckdb/test/duckdb_test/appender_test.rb:69:in `sub_test_append_column'
/home/suke/mywork/ruby-duckdb/test/duckdb_test/appender_test.rb:161:in `test_append_varchar_with_date_column'
3) Error:
DuckDBTest::AppenderTest#test_append_varchar_with_timestamp_column:
DuckDB::Error: failed to append
/home/suke/mywork/ruby-duckdb/test/duckdb_test/appender_test.rb:69:in `append_varchar'
/home/suke/mywork/ruby-duckdb/test/duckdb_test/appender_test.rb:69:in `sub_test_append_column'
/home/suke/mywork/ruby-duckdb/test/duckdb_test/appender_test.rb:171:in `test_append_varchar_with_timestamp_column'
4) Error:
DuckDBTest::AppenderTest#test_append_varchar_with_time_column:
DuckDB::Error: failed to append
/home/suke/mywork/ruby-duckdb/test/duckdb_test/appender_test.rb:69:in `append_varchar'
/home/suke/mywork/ruby-duckdb/test/duckdb_test/appender_test.rb:69:in `sub_test_append_column'
/home/suke/mywork/ruby-duckdb/test/duckdb_test/appender_test.rb:166:in `test_append_varchar_with_time_column'
related issues:
Implement PreparedStatement#bind_date using the PreparedStatement#_bind_date private method.
class Foo
def to_str
'December, 2nd, 2021'
end
end
require 'date'
require 'duckdb'
db = DuckDB::Database.open
con = db.connect
con.query('CREATE TABLE dates (date_value DATE)')
...
stmt = DuckDB::PreparedStatement.new(con, 'SELECT * FROM dates WHERE date_value = $1')
stmt.bind_date(1, Date.today)
# or
# stmt.bind_date(1, Time.now)
# or
# stmt.bind_date(1, 'December, 2nd, 2021')
# or
# stmt.bind_date(1, Foo.new)
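The argument types listed above (Date, Time, a String, or any object with to_str) could all be reduced to year, month and day before a private binder is called. A minimal sketch, assuming a hypothetical helper named date_parts; it is not the gem's actual implementation:

```ruby
require 'date'

# Hypothetical helper (not part of ruby-duckdb): normalizes the supported
# argument types to [year, month, day].
def date_parts(value)
  date =
    case value
    when Date, Time
      value
    else
      # Date.parse converts to_str objects implicitly and raises for
      # input it cannot parse.
      Date.parse(value)
    end
  [date.year, date.month, date.day]
end

# Stand-in for the Foo class in the issue, with an ISO date string.
class Foo
  def to_str
    '2021-12-02'
  end
end

p date_parts(Date.new(2021, 12, 2))            # => [2021, 12, 2]
p date_parts(Time.new(2021, 12, 2, 10, 30, 0)) # => [2021, 12, 2]
p date_parts('2021-12-02')                     # => [2021, 12, 2]
p date_parts(Foo.new)                          # => [2021, 12, 2]
```

Keeping the coercion in Ruby would let the C side accept plain integers only.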
fix build error with duckdb 0.2.9.
The reason is that duckdb 0.2.9 changed the signature of the duckdb_nparams function.
-DUCKDB_API duckdb_state duckdb_nparams(duckdb_prepared_statement prepared_statement, idx_t *nparams_out); // 0.2.8
+DUCKDB_API idx_t duckdb_nparams(duckdb_prepared_statement prepared_statement); // 0.2.9
bump to duckdb 0.2.9 in CI
change CI to use duckdb v0.2.9 and v0.3.0
The macOS CI uses the latest version of duckdb.
Change it to use the latest 2 versions of duckdb, like the Ubuntu CI.
FYI:
implement Appender#_append_time from duckdb_append_time
Appender#_append_time(hour, min, sec, micros)
Hello,
I'm trying to use ruby-duckdb but I can't use the parquet_scan function. Is Parquet supported?
parallels@ubuntu-linux-20-04-desktop:~/Desktop/Parallels Shared Folders/Home/project/monplein$ irb
3.0.0 :001 > require "duckdb"
=> true
3.0.0 :002 > db = DuckDB::Database.open
=> #<DuckDB::Database:0x0000aaaaccd56660>
3.0.0 :003 > con = db.connect
=> #<DuckDB::Connection:0x0000aaaaccdddfc0>
3.0.0 :004 > con.query("CREATE TABLE station_gas AS SELECT * FROM parquet_scan('file.parquet')")
Traceback (most recent call last):
6: from /usr/share/rvm/rubies/ruby-3.0.0/bin/irb:23:in `<main>'
5: from /usr/share/rvm/rubies/ruby-3.0.0/bin/irb:23:in `load'
4: from /usr/share/rvm/rubies/ruby-3.0.0/lib/ruby/gems/3.0.0/gems/irb-1.3.0/exe/irb:11:in `<top (required)>'
3: from (irb):4:in `<main>'
2: from /usr/share/rvm/gems/ruby-3.0.0/gems/duckdb-0.2.9.0/lib/duckdb/connection.rb:23:in `query'
1: from /usr/share/rvm/gems/ruby-3.0.0/gems/duckdb-0.2.9.0/lib/duckdb/connection.rb:23:in `query_sql'
DuckDB::Error (Catalog Error: Table Function with name parquet_scan does not exist!)
Did you mean "arrow_scan"?
LINE 1: ...ATE TABLE station_gas AS SELECT * FROM parquet_scan('file.parquet')
^
3.0.0 :005 >
With the duckdb binary, I don't have any issue.
parallels@ubuntu-linux-20-04-desktop:~/Desktop/Parallels Shared Folders/Home/project/monplein$ duckdb
v0.0.1-dev0
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
D CREATE TABLE station_gas AS SELECT * FROM parquet_scan('file.parquet');
D select count(1) from station_gas;
┌──────────┐
│ count(1) │
├──────────┤
│ 3324810 │
└──────────┘
implement DuckDB::PreparedStatement#_bind_interval private method using C API duckdb_bind_interval.
stmt._bind_interval(i, months, days, micros)
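The (months, days, micros) triple above can be derived from more familiar units. A minimal sketch, assuming a hypothetical helper named interval_parts with keyword arguments chosen for illustration; nothing here is part of the gem:

```ruby
# Hypothetical helper (not part of ruby-duckdb): folds calendar and clock
# units into the [months, days, micros] triple used by the interval methods.
MICROS_PER_SECOND = 1_000_000

def interval_parts(years: 0, months: 0, days: 0, hours: 0, minutes: 0, seconds: 0, micros: 0)
  total_months = years * 12 + months
  total_seconds = (hours * 60 + minutes) * 60 + seconds
  # seconds may be fractional, so round after scaling to microseconds
  total_micros = (total_seconds * MICROS_PER_SECOND).round + micros
  [total_months, days, total_micros]
end

p interval_parts(years: 1, months: 2, days: 3, hours: 4, minutes: 5, seconds: 6.5)
# => [14, 3, 14706500000]
```

Months and days stay separate from micros because their length in microseconds is not fixed.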
duckdb_append_timestamp was added in duckdb 0.2.9.
Implement the Appender#append_timestamp public method using the Appender#_append_timestamp private method (#185).
appender.append_timestamp(Time.now)
appender.append_timestamp(Date.today)
appender.append_timestamp('2021-09-23 08:45:01')
appender.append_timestamp("aaa") # => raises ArgumentError (from Time.parse)
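The public method described above has to turn its argument into the seven integers that the private method takes. A minimal sketch, assuming a hypothetical helper named timestamp_parts; the gem's real implementation may differ:

```ruby
require 'date'
require 'time'

# Hypothetical helper (not part of ruby-duckdb): returns
# [year, month, day, hour, minute, second, microsecond].
def timestamp_parts(value)
  time =
    case value
    when Time then value
    when Date then value.to_time # a plain Date becomes local midnight
    else Time.parse(value)       # raises ArgumentError for text like "aaa"
    end
  [time.year, time.month, time.day, time.hour, time.min, time.sec, time.usec]
end

p timestamp_parts(Time.new(2021, 9, 23, 8, 45, 1)) # => [2021, 9, 23, 8, 45, 1, 0]
p timestamp_parts('2021-09-23 08:45:01')           # => [2021, 9, 23, 8, 45, 1, 0]
p timestamp_parts(Date.new(2021, 9, 23))           # => [2021, 9, 23, 0, 0, 0, 0]
```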
Fix LoadError on Windows CI
Github Actions: https://github.com/suketa/ruby-duckdb/runs/2816377370
Some static functions in C are not declared.
for example:
https://github.com/suketa/ruby-duckdb/blob/master/ext/duckdb/connection.c
Appender#append_time
does not accept an object that has a to_str method.
For example:
require 'time'
require 'duckdb'
class Foo
def to_str
Time.now.strftime('%Y-%m-%dT%H:%M:%S')
end
end
v = Foo.new
Time.parse(v) # => time object
DuckDB::Database.open do |db|
db.connect do |con|
con.execute('CREATE TABLE t (value TIMESTAMP)')
append = con.appender('t')
append
.begin_row
.append_time(v) # => expected to append but raise ArgumentError.
.end_row
.flush
end
end
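A strict type check such as is_a?(String) would produce this behavior, though the source does not show whether the gem does that. A minimal sketch of the difference, using two hypothetical helpers (strict_parse and lenient_parse) and String.try_convert, which calls to_str when the object defines it:

```ruby
require 'time'

# Stand-in for the Foo class in the issue, with a fixed time string.
class Foo
  def to_str
    '2021-09-23T08:45:01'
  end
end

# Hypothetical strict version: only real String instances pass.
def strict_parse(value)
  raise ArgumentError, "cannot parse #{value.class}" unless value.is_a?(String)

  Time.parse(value)
end

# Hypothetical lenient version: String.try_convert returns the String itself,
# the result of to_str, or nil when the object has no to_str.
def lenient_parse(value)
  str = String.try_convert(value)
  raise ArgumentError, "cannot parse #{value.class}" if str.nil?

  Time.parse(str)
end

v = Foo.new
p lenient_parse(v).hour # => 8
begin
  strict_parse(v)
rescue ArgumentError => e
  puts e.message # prints "cannot parse Foo"
end
```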
Appender#append_timestamp
does not accept an object that has a to_str method.
For example:
require 'time'
require 'duckdb'
class Foo
def to_str
Time.now.strftime('%Y-%m-%dT%H:%M:%S')
end
end
v = Foo.new
Time.parse(v) # => time object
DuckDB::Database.open do |db|
db.connect do |con|
con.execute('CREATE TABLE t (value TIMESTAMP)')
append = con.appender('t')
append
.begin_row
.append_timestamp(v) # => expected to append but raise ArgumentError.
.end_row
.flush
end
end
7z x did not seem to work, so duckdb.dll was not found and the following step failed:
cp duckdb.dll C:/Windows/System32/
shell: C:\Program Files\PowerShell\7\pwsh.EXE -command ". '{0}'"
env:
TMPDIR: D:\a\_temp
HOME: C:\Users\runneradmin
MSYS2_PATH_TYPE: inherit
MAKE: make.exe
Path: C:\Program Files\MongoDB\Server\5.0\bin;C:\aliyun-cli;C:\vcpkg;C:\Program Files (x86)\NSIS\;C:\tools\zstd;C:\Program Files\Mercurial\;C:\hostedtoolcache\windows\stack\2.7.3\x64;C:\cabal\bin;C:\\ghcup\bin;C:\tools\ghc-9.2.1\bin;C:\Program Files\dotnet;C:\mysql\bin;C:\Program Files\R\R-4.1.2\bin\x64;C:\SeleniumWebDrivers\GeckoDriver;C:\Program Files (x86)\sbt\bin;C:\Program Files (x86)\GitHub CLI;C:\Program Files\Git\bin;C:\Program Files (x86)\pipx_bin;C:\hostedtoolcache\windows\go\1.16.13\x64\bin;C:\hostedtoolcache\windows\Python\3.9.10\x64\Scripts;C:\hostedtoolcache\windows\Python\3.9.10\x64;C:\tools\kotlinc\bin;C:\hostedtoolcache\windows\Java_Temurin-Hotspot_jdk\8.0.322-6\x64\bin;C:\npm\prefix;C:\Program Files (x86)\Microsoft SDKs\Azure\CLI2\wbin;C:\ProgramData\kind;C:\Program Files\Microsoft\jdk-11.0.12.7-hotspot\bin;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Windows\System32\OpenSSH\;C:\Program Files\dotnet\;C:\ProgramData\Chocolatey\bin;C:\Program Files\Docker;C:\Program Files\PowerShell\7\;C:\Program Files\Microsoft\Web Platform Installer\;C:\Program Files\Microsoft SQL Server\Client SDK\ODBC\170\Tools\Binn\;C:\Program Files\Microsoft SQL Server\150\Tools\Binn\;C:\Program Files\nodejs\;C:\Program Files\LLVM\bin;C:\Program Files\OpenSSL\bin;C:\Strawberry\c\bin;C:\Strawberry\perl\site\bin;C:\Strawberry\perl\bin;C:\ProgramData\chocolatey\lib\pulumi\tools\Pulumi\bin;C:\Program Files\TortoiseSVN\bin;C:\Program Files\CMake\bin;C:\ProgramData\chocolatey\lib\maven\apache-maven-3.8.4\bin;C:\Program Files\Microsoft Service Fabric\bin\Fabric\Fabric.Code;C:\Program Files\Microsoft SDKs\Service Fabric\Tools\ServiceFabricLocalClusterManager;C:\Program Files\Git\cmd;C:\Program Files\Git\mingw64\bin;C:\Program Files\Git\usr\bin;C:\Program Files\GitHub CLI\;c:\tools\php;C:\Program Files (x86)\sbt\bin;C:\SeleniumWebDrivers\ChromeDriver\;C:\SeleniumWebDrivers\EdgeDriver\;C:\Program Files\Amazon\AWSCLIV2\;C:\Program Files\Amazon\SessionManagerPlugin\bin\;C:\Program Files\Amazon\AWSSAMCLI\bin\;C:\Program Files\Microsoft SQL Server\130\Tools\Binn\;C:\Users\runneradmin\.dotnet\tools;C:\Users\runneradmin\.cargo\bin;C:\Users\runneradmin\AppData\Local\Microsoft\WindowsApps
ACLOCAL_PATH: /mingw64/share/aclocal:/usr/share/aclocal
LANG: en_US.UTF-8
MANPATH: /mingw64/share/man
MINGW_CHOST: x86_64-w64-mingw32
MINGW_PACKAGE_PREFIX: mingw-w64-x86_64
MINGW_PREFIX: /mingw64
MSYSTEM: MINGW64
MSYSTEM_CARCH: x86_64
MSYSTEM_CHOST: x86_64-w64-mingw32
MSYSTEM_PREFIX: /mingw64
PKG_CONFIG_PATH: /mingw64/lib/pkgconfig:/mingw64/share/pkgconfig
PROMPT: $P$G
RI_DEVKIT: c:\msys64
Copy-Item: D:\a\_temp\654f3ffc-577f-4eae-8c2d-b5c0960122ed.ps1:2
Line |
2 | cp duckdb.dll C:/Windows/System32/
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| Cannot find path 'D:\a\ruby-duckdb\ruby-duckdb\duckdb.dll' because it does not exist.
Error: Process completed with exit code 1.
Implement Appender#append_hugeint using Appender#_append_hugeint.
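In the C header, duckdb_hugeint stores a 128-bit value as an unsigned 64-bit lower half and a signed 64-bit upper half, so a Ruby Integer has to be split before it reaches the C side. A minimal sketch, assuming hypothetical helpers hugeint_parts and hugeint_from_parts; how the private method actually receives the halves is not shown in the source:

```ruby
# Hypothetical helpers (not part of ruby-duckdb) for splitting a Ruby Integer
# into the two halves of a signed 128-bit value and joining them again.
UINT64_MASK = 0xFFFF_FFFF_FFFF_FFFF
INT128_MIN = -(2**127)
INT128_MAX = 2**127 - 1

def hugeint_parts(value)
  unless value.between?(INT128_MIN, INT128_MAX)
    raise RangeError, "#{value} does not fit in a signed 128-bit integer"
  end

  lower = value & UINT64_MASK # low 64 bits, always non-negative
  upper = value >> 64         # arithmetic shift keeps the sign
  [lower, upper]
end

def hugeint_from_parts(lower, upper)
  (upper << 64) + lower
end

p hugeint_parts(1)     # => [1, 0]
p hugeint_parts(2**64) # => [0, 1]
p hugeint_parts(-1)    # => [18446744073709551615, -1]
```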
Appender#append_time
Appender#append_datetime
do not accept an object that has a to_str method.
For example:
require 'time'
require 'duckdb'
class Foo
def to_str
Time.now.strftime('%Y-%m-%dT%H:%M:%S')
end
end
v = Foo.new
Time.parse(v) # => time object
DuckDB::Database.open do |db|
db.connect do |con|
con.execute('CREATE TABLE t (value TIMESTAMP)')
append = con.appender('t')
append
.begin_row
.append_timestamp(v) # => expected to append but raise ArgumentError.
.end_row
.flush
end
end
duckdb provides row_count in the duckdb_result struct in the C header file, but DuckDB::Result does not expose it.
Add a DuckDB::Result#row_count method.
db = DuckDB::Database.open
con = db.connect
con.query('CREATE TABLE users (id INTEGER, name STRING)')
con.query("INSERT INTO users VALUES (1, 'Bob'), (2, 'John')")
r = con.query('SELECT name FROM users')
p r.class # => DuckDB::Result
p r.row_count # => expected 2 but raises NoMethodError. <= fix this
write CONTRIBUTION.md
duckdb_append_date was added to duckdb.h in duckdb 0.2.9.
Implement the Appender#append_date public method using the Appender#_append_date private method.
expected behavior:
require 'date'
require 'duckdb'
db = DuckDB::Database.open
con = db.connect
con.query('CREATE TABLE dates (date_value DATE)')
appender = con.appender('dates')
appender.begin_row
appender.append_date(Date.today)
# or
# appender.append_date(Time.now)
# appender.append_date('2021-09-14') # string to be parsed by Date.parse
appender.end_row
appender.flush
Implement DuckDB::Result#columns and DuckDB::Column class.
DuckDB::Column class describes duckdb_column struct in duckdb.
We can add simple instructions to the README.md file for installing the DuckDB C++ environment for Ruby. The easiest way to set it up would be to create two folders inside /usr/local/: one for the duckdb.h and duckdb.hpp header files, and one for the libduckdb.so file. We can then use ldconfig to link the shared libraries, using:
sudo ldconfig /usr/local/lib -v # -v is optional, only for verbose
This would thus eliminate the need to specify the library and include file while installing the gem file.
Ruby 2.5.8 is unsupported now. drop Ruby 2.5.8 from CI
Implement PreparedStatement#bind_timestamp using PreparedStatement#_bind_timestamp.
class Foo
def to_str
'2021-12-11 07:42:17.345'
end
end
require 'duckdb'
db = DuckDB::Database.open
con = db.connect
con.query('CREATE TABLE timestamps (value TIMESTAMP)')
...
stmt = DuckDB::PreparedStatement.new(con, 'SELECT * FROM timestamps WHERE value = $1')
stmt.bind_timestamp(1, Time.now)
stmt.bind_timestamp(1, '2021-12-11 07:42:17.285')
stmt.bind_timestamp(1, Foo.new)
The rows_changed member was added to the duckdb_result struct in duckdb 0.2.8.
Implement DuckDB::Result#rows_changed.
Implement PreparedStatement#bind_time using the PreparedStatement#_bind_time private method.
class Foo
def to_str
'01:01:00'
end
end
require 'duckdb'
db = DuckDB::Database.open
con = db.connect
con.query('CREATE TABLE times (time_value TIME)')
...
stmt = DuckDB::PreparedStatement.new(con, 'SELECT * FROM times WHERE time_value = $1')
stmt.bind_time(1, Time.now)
stmt.bind_time(1, '01:01:00')
stmt.bind_time(1, Foo.new)
duckdb_append_time is available in duckdb 0.2.8.
Implement Appender#append_time public method from Appender#_append_time private method (#184).
time = Time.now
appender.append_time(time)
time = '01:01:01'
appender.append_time(time)
Run options: --seed 41265
# Running:
D:/a/ruby-duckdb/ruby-duckdb/lib/duckdb/config.rb:37: warning: instance variable @key_descriptions not initialized
.................................................................................E...................
Finished in 0.117408s, 860.2510 runs/s, 2231.5421 assertions/s.
1) Error:
DuckDBTest::AppenderTest#test_append:
DuckDB::Error: Catalog Error: Table with name "t" already exists!
D:/a/ruby-duckdb/ruby-duckdb/lib/duckdb/connection.rb:23:in `query_sql'
D:/a/ruby-duckdb/ruby-duckdb/lib/duckdb/connection.rb:23:in `query'
D:/a/ruby-duckdb/ruby-duckdb/test/duckdb_test/appender_test.rb:35:in `create_table'
D:/a/ruby-duckdb/ruby-duckdb/test/duckdb_test/appender_test.rb:39:in `create_appender'
D:/a/ruby-duckdb/ruby-duckdb/test/duckdb_test/appender_test.rb:74:in `sub_test_append_column'
D:/a/ruby-duckdb/ruby-duckdb/test/duckdb_test/appender_test.rb:324:in `test_append'
101 runs, 262 assertions, 0 failures, 1 errors, 0 skips
rake aborted!
Command failed with status (1)
Tasks: TOP => test
(See full trace by running task with --trace)
Implement the Appender#_append_interval private method using duckdb_append_interval.
Appender#_append_interval(months, days, micros)