Code Monkey home page Code Monkey logo

sqlmlutils's Introduction

sqlmlutils

BuildAndTest

sqlmlutils is a package designed to help users interact with SQL databases (SQL Server and Azure SQL Database) and execute R or Python code in SQL from an R/Python client. Currently, only the R version of sqlmlutils is supported in Azure SQL Database. Python support will be added later.

Check out the README in each language folder for language-specific details and code examples!

Installation

To install sqlmlutils, follow the instructions below for Python and R, respectively.

Python: To install from PyPI: Run

pip install sqlmlutils

To install from file, download the latest release from https://github.com/microsoft/sqlmlutils/releases:

pip install sqlmlutils-1.1.0.zip

R:

Download the latest release from https://github.com/microsoft/sqlmlutils/releases.

Windows:

To obtain the version of R your server is currently using, please use this query:

EXEC sp_execute_external_script
    @language = N'R',
    @script = N'
        v = R.version
        OutputDataSet = data.frame(rversion=paste0(v$major, ".", v$minor))',
    @input_data_1 = N'select 1'
WITH RESULT SETS ((rversion varchar(max)));

Get the version of R which the server is using and install it locally. Then, run the following commands with the same version of R.

From command prompt, run

R.exe -e "install.packages('odbc', type='binary')"
R.exe CMD INSTALL sqlmlutils_1.0.0.zip

OR To build a new package file and install, run

.\buildandinstall.cmd

Linux

R.exe -e "install.packages('odbc')"
R.exe CMD INSTALL sqlmlutils_1.0.0.tar.gz

Details

sqlmlutils contains 3 main parts:

  • Execution of Python/R in SQL databases using sp_execute_external_script
  • Creation and execution of stored procedures created from scripts and functions
  • Install and manage packages in SQL databases

For more specifics and examples of how to use each language's API, look at the README in the respective folder.

Execute in SQL

Execute in SQL provides a convenient way for the user to execute arbitrary Python/R code inside a SQL database using an sp_execute_external_script. The user does not have to know any t-sql to use this function. Function arguments are serialized into binary and passed into the t-sql script that is generated. Warnings and printed output will be printed at the end of execution, and any results returned by the function will be passed back to the client.

Stored Procedures (Sprocs)

The goal of this utility is to allow users to create and execute stored procedures on their database without needing to know the exact syntax of creating one. Functions and scripts are wrapped into a stored procedure and registered into a database, then can be executed from the Python/R client.

Package Management

R and Python package management with sqlmlutils is supported in SQL Server 2019 CTP 2.4 and later.

With package management users can install packages to a remote SQL database from a client machine. The packages are downloaded on the client and then sent over to SQL databases where they will be installed into library folders. The folders are per-database so packages will always be installed and made available for a specific database. The package management APIs provided a PUBLIC and PRIVATE folders. Packages in the PUBLIC folder are accessible to all database users. Packages in the PRIVATE folder are only accessible by the user who installed the package.

sqlmlutils's People

Contributors

aaronburtle avatar dphansen avatar erikms avatar jarupatj avatar jonathanzhu11 avatar microsoft-github-policy-service[bot] avatar microsoftopensource avatar msftgits avatar ntakru avatar rajmera3 avatar santina avatar schnalzenbergerm avatar seantleonard avatar tritm78 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

sqlmlutils's Issues

R Package Installation Fails on dependant package cpp11

When installing packages such as tidyverse, broom and haven that depend on the package cpp11, sqlmlutils fails with the error:

Resolving package dependencies for (haven)...
Downloading package [1/7] readr (2.0.2)...
Downloading package [2/7] cpp11 (0.4.0)...
Warning in utils::download.packages(pkg$Package, destdir = destdir, available = binaryPackages, :
no package 'cpp11' at the repositories
Could not find binary version in repo, trying source instead...
Found source package, building into a binary package...
Error: Failed to download package cpp11.

My install (install_packages.R) script is as follows:

library(sqlmlutils)
connection <- connectionInfo(
  server   = "XXXXX",
  database = "XXXXX")

r = getOption("repos")
r["CRAN"] = "http://cran.us.r-project.org"
options(repos = r)

sql_install.packages(connectionString = connection, pkgs = "haven", verbose = TRUE, scope = "PUBLIC")

I am running the script with the command:
C:\Program Files\R\R-3.5.2\bin>RScript.exe "C:\Temp\install_packages.R"

I am able to install the package into R on the same machine without issue
C:\Program Files\R\R-3.5.2\bin>R.exe -e "install.packages('haven', type='binary', repos = 'http://cran.us.r-project.org')"

error in download_script.py

It give an errror on this when run with pip version 20

if pipversion >= LooseVersion("19.3"):
    from pip._internal import pep425tags

I had updated with

if pipversion >= LooseVersion("19.3"):
    from wheel import pep425tags

and it worked

How to Force Upgrade of Python Package?

This is not clearly described in the documentation either here or on Microsoft's website. What is the process to upgrade a Python package to a later version when you receive the following information message:

Package scikit-learn exists on server. Set upgrade to True to force upgrade.".format(pkgname))
The version of scikit-learn you are trying to install is 0.22.1.
The version installed on the server is 0.18.1

What is the syntax I am supposed to follow?

C Stack Usage Error Installing Large Package

I'm trying to install a large R package (~65MB) using sqlmlutils onto SQL Server 2019 CU1.  I get the following error when doing so:
Error: Installation of packages failed with error C stack usage  138399847 is too close to the limit

I can run install.packages() and install the package against a local R installation; that works fine. I was also able to run install.packages() within sp_execute_external_script in SQL Server 2017.

It looks like this issue is related to package size. Almost all of the package size is due to a serialized neural network model. If I strip that out, the package size goes from 67MB to 156KB and installation completes successfully. I'd like to keep the neural network model together with this package, so splitting it out is not a great long-term solution for me.

R Test Warning in `test.sqlPackage.dependencies.R`

Primary warning message: Skipping base packages (lattice)

TODO:
identify why warning appears and mitigate to clean test log.

── Warning ('test.sqlPackage.dependencies.R:148:9'): Installing a package that is already in use ──
Skipping base packages (lattice)
Backtrace:

  1. ├─base::tryCatch(...) at test.sqlPackage.dependencies.R:129:5
  2. │ └─base (local) tryCatchList(expr, classes, parentenv, handlers)
  3. ├─utils::capture.output(...) at test.sqlPackage.dependencies.R:148:9
  4. │ └─base::withVisible(...elt(i))
  5. └─sqlmlutils::sql_install.packages(...)
  6. └─sqlmlutils:::sqlInstallPackagesExtLib(...) at R/R/sqlPackage.R:109:5
  7. └─sqlmlutils:::getDependentPackagesToInstall(...) at R/R/sqlPackage.R:1659:13
    

R/dist contains old version of package

The online documentation for installing R packages to SQL Managed Instance directs users to R/dist in order to install the 'latest' version of the package, except the version at this location is old (0.7.1). This will cause clients who follow the documentation to install an old version of sqlmlutils which will lead to other issues when they try to use the package. The documentation should be updated to direct users to the releases, and/or you should make sure to always place the latest version in R/dist.

Pytorch is not installable using sqlmlutils

From the PyForMLS subdirectory in a cmd window (Windows 10) I launch python (vs 3.5.2)
Then I run these 3 python statements

import sqlmlutils
connection=sqlmlutils.ConnectionInfo(server="localhost",database="mydatabase")
sqlmlutils.SQLPackageManager(connection).install("torch")

The last statement generates this:

STDOUT message(s) from external script:
Processing c:\data\sql\mssql15.mssqlserver\mssql\externallibraries\5\2\1_\pyyamltmp\pyyaml-5.3.1-cp37-cp37m-win_amd64.whl
Installing collected packages: PyYAML
Successfully installed PyYAML-5.3.1
Installed package. Cleaning up temporary directories.
Cleaned up temporary directory.
Package successfully installed.
STDERR message(s) from external script:
Failed building wheel for torch
Failed cleaning build dir for torch
Command ""C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\PYTHON_SERVICES\python.exe" -u -c "import setuptools, tokenize;file='C:\Data\SQL\MSSQL15.MSSQLSERVER\MSSQL\ExternalLibraries\5\2\1\_\pip-req-build-nfi11418\setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record C:\Data\SQL\MSSQL15.MSSQLSERVER\MSSQL\ExternalLibraries\5\2\1_\pip-record-qwk33zxf\install-record.txt --single-version-externally-managed --compile --home=C:\Data\SQL\MSSQL15.MSSQLSERVER\MSSQL\ExternalLibraries\5\2\1_\pip-target-qh7puf9o" failed with error code 1 in C:\Data\SQL\MSSQL15.MSSQLSERVER\MSSQL\ExternalLibraries\5\2\1_\pip-req-build-nfi11418
STDOUT message(s) from external script:
Processing c:\data\sql\mssql15.mssqlserver\mssql\externallibraries\5\2\1_\torchtmp\torch-0.1.2.post2.tar.gz

Building wheels for collected packages: torch
Running setup.py bdist_wheel for torch: started
Running setup.py bdist_wheel for torch: finished with status 'error'
Complete output from command "C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\PYTHON_SERVICES\python.exe" -u -c "import setuptools, tokenize;file='C:\Data\SQL\MSSQL15.MSSQLSERVER\MSSQL\ExternalLibraries\5\2\1\_\pip-req-build-nfi11418\setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" bdist_wheel -d C:\Data\SQL\MSSQL15.MSSQLSERVER\MSSQL\ExternalLibraries\5\2\1_\pip-wheel-dvpxe0od --python-tag cp37:
running bdist_wheel
running build
running build_deps
Traceback (most recent call last):
File "", line 1, in
STDOUT message(s) from external script:
File "C:\Data\SQL\MSSQL15.MSSQLSERVER\MSSQL\ExternalLibraries\5\2\1_\pip-req-build-nfi11418\setup.py", line 265, in
description="Tensors and Dynamic neural networks in Python with strong GPU acceleration",
File "C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\PYTHON_SERVICES\lib\site-packages\setuptools_init_.py", line 143, in setup
return distutils.core.setup(**attrs)
File "C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\PYTHON_SERVICES\lib\distutils\core.py", line 148, in setup
dist.run_commands()
File "C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\PYTHON_SERVICES\lib\distutils\dist.py", line 966, in run_commands
self.run_command(cmd)
File "C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\PYTHON_SERVICES\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()

STDOUT message(s) from external script:
File "C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\PYTHON_SERVICES\lib\site-packages\wheel\bdist_wheel.py", line 188, in run
self.run_command('build')
File "C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\PYTHON_SERVICES\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\PYTHON_SERVICES\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\PYTHON_SERVICES\lib\distutils\command\build.py", line 135, in run
self.run_command(cmd_name)
File "C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\PYTHON_SERVICES\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
Unable to launch runtime for 'Python' script for request id: FF48E36C-0966-405D-8285-F1D50BA09736. Please check the configuration of the 'Python' runtime.
Traceback (most recent call last):
File "C:\Program Files\Microsoft\PyForMLS\lib\site-packages\sqlmlutils\sqlqueryexecutor.py", line 58, in execute
self._mssqlconn.execute_non_query(builder.base_script, builder.params)
File "src_mssql.pyx", line 1033, in _mssql.MSSQLConnection.execute_non_query
File "src_mssql.pyx", line 1061, in _mssql.MSSQLConnection.execute_non_query
File "src_mssql.pyx", line 1634, in _mssql.check_and_raise
File "src_mssql.pyx", line 1683, in _mssql.maybe_raise_MSSQLDatabaseException
_mssql.MSSQLDatabaseException: (39021, b"Unable to launch runtime for 'Python' script for request id: FF48E36C-0966-405D-8285-F1D50BA09736. Please check the configuration of the 'Python' runtime.DB-Lib error message 20018, severity 16:\nGeneral SQL Server error: Check messages from the SQL Server\n")

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "C:\Program Files\Microsoft\PyForMLS\lib\site-packages\sqlmlutils\packagemanagement\sqlpackagemanager.py", line 185, in _install_many
self._install_single(sqlexecutor, target_package_file, scope, True, out_file=out_file)
File "C:\Program Files\Microsoft\PyForMLS\lib\site-packages\sqlmlutils\packagemanagement\sqlpackagemanager.py", line 202, in _install_single
sqlexecutor.execute(builder, out_file=out_file, getResults=False)
File "C:\Program Files\Microsoft\PyForMLS\lib\site-packages\sqlmlutils\sqlqueryexecutor.py", line 61, in execute
raise RuntimeError("Error in SQL Execution") from e
RuntimeError: Error in SQL Execution

The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "", line 1, in
File "C:\Program Files\Microsoft\PyForMLS\lib\site-packages\sqlmlutils\packagemanagement\sqlpackagemanager.py", line 74, in install
self._install_from_pypi(package, upgrade, version, install_dependencies, scope, out_file=out_file)
File "C:\Program Files\Microsoft\PyForMLS\lib\site-packages\sqlmlutils\packagemanagement\sqlpackagemanager.py", line 151, in _install_from_pypi
self._install_from_file(target_package_file, scope, upgrade, out_file=out_file)
File "C:\Program Files\Microsoft\PyForMLS\lib\site-packages\sqlmlutils\packagemanagement\sqlpackagemanager.py", line 174, in _install_from_file
self._install_many(target_package_file, dependencies_to_install, scope, out_file=out_file)
File "C:\Program Files\Microsoft\PyForMLS\lib\site-packages\sqlmlutils\packagemanagement\sqlpackagemanager.py", line 189, in _install_many
raise RuntimeError("Package installation failed, installed dependencies were rolled back.") from e
RuntimeError: Package installation failed, installed dependencies were rolled back.

Cant install the xgboost package

using the current examples I cant install the package.

import sqlmlutils
connection = sqlmlutils.ConnectionInfo(server="name,port", database="dbname", uid="accname", pwd="accpwd")
sqlmlutils.SQLPackageManager(connection).install("XGBoost")

the error I am getting:

Traceback (most recent call last):
File "C:\Users\engserveradmin\AppData\Roaming\Python\Python37\site-packages\sqlmlutils\sqlqueryexecutor.py", line 65, in execute_query
self._cursor.execute(query, params)
pyodbc.Error: ('01000', '[01000] [Microsoft][ODBC SQL Server Driver][DBNETLIB]ConnectionWrite (send()). (10054) (SQLPutData); [01000] [Microsoft][ODBC SQL Server Driver][DBNETLIB]General network error. Check your network documentation. (11)')

I've tried to download the files manually but I am getting the same error

Cannot Create Dir RtmpmaJNMP

Running sqlmlutils 0.7.1 in R against a SQL Server 2019 RC1 machine.

Reproduction Steps

Running a sample package call:
sql_install.packages(connectionString = ad_connection, pkgs = "data.table", scope = "PUBLIC", verbose = TRUE)

ad_connection is a local variable which represents a database where ML Services is set up correctly.

Here is the output:

2019-10-28 09:48:37.69  Starting package install on SQL server (Driver={SQL Server};Server=D2-SQL-R-02;Database=AnomalyDetection;Trusted_Connection=Yes;)...
2019-10-28 09:48:37.69  Verifying permissions to install packages on SQL server...
Warning: unable to access index for repository https://cran.rstudio.com/src/contrib:
  cannot open destfile 'C:\Users\KEVIN~1.FEA\AppData\Local\Temp\RtmpmaJNMP\file75603f8535cd', reason 'No such file or directory'
Warning: unable to access index for repository https://cran.rstudio.com/bin/windows/contrib/3.5:
  cannot open destfile 'C:\Users\KEVIN~1.FEA\AppData\Local\Temp\RtmpmaJNMP\file756038c548a9', reason 'No such file or directory'
Error: Cannot find specified packages (data.table) to install
In addition: Warning message:
In dir.create(downloadDir) :
  cannot create dir 'C:\Users\KEVIN~1.FEA\AppData\Local\Temp\RtmpmaJNMP\download7560232a7e1e', reason 'No such file or directory'

Looks like it's choking on the fact that there is no RtmpmaJNMP folder in my AppData\Local\Temp. I created one and successfully installed data.table.

When installing a local package, I had received a similar warning, but the package did install correctly.

2019-10-28 09:42:55.31  Starting package install on SQL server (Driver={SQL Server};Server=D2-SQL-R-02;Database=AnomalyDetection;Trusted_Connection=Yes;)...
2019-10-28 09:42:55.31  Verifying permissions to install packages on SQL server...
2019-10-28 09:42:55.36  Copying package to Sql server [1/1] AnomalyDetection...
2019-10-28 09:42:56.54  Installing packages to library path, this may take some time...
2019-10-28 09:43:10.77  Successfully installed packages on SQL server (AnomalyDetection).
Warning message:
In dir.create(downloadDir) :
  cannot create dir 'C:\Users\KEVIN~1.FEA\AppData\Local\Temp\RtmpmaJNMP\download756035966c58', reason 'No such file or directory'

Desired Outcome

I would like sql_install.packages to check if the RtmpmaJNMP folder exists and create it if it does not.

Workarounds

Creating the RtmpmaJNMP folder manually resolved my issue. After creating the folder, the install process created two files in it:

  • repos_https%3A%2F%2Fcran.rstudio.com%2Fbin%2Fwindows%2Fcontrib%2F3.5.rds
  • repos_https%3A%2F%2Fcran.rstudio.com%2Fsrc%2Fcontrib.rds

R Version required

Hi All,

When I tried to install 'sqlmlutils' package with R version 3.5.2 , getting message 'package ‘sqlmlutils’ is not available (for R version 3.5.2) '.

Please suggest me required R version to work with 'sqlmlutils' package.

Thanks,
Siva Kumar

Better documentation on installation and usage

We need to update the README.md with instruction of installing the package from GitHub and some usage examples for these scenarios:

  • running SPEES
  • creating stored procedures
  • installing packages

sqlmlutils fail to install packages on SQL Server 2022

Got python to work, but installing packages throws this error.

As a FYI - I could not get sqlmlutils to work on 2019. It would look like it install, but the python package would fail to load properly. Pip Install worked on 2019.

Given the location of Python is different on 2022 to 2019, is it possible to use Pip Install on 2022??

sqlmlutils CALL

import sqlmlutils
connection = sqlmlutils.ConnectionInfo(server="xx\xx", uid="sa", pwd="xxx")
sqlmlutils.SQLPackageManager(connection).install("QuantLib")

FULL ERROR


ProgrammingError Traceback (most recent call last)
File ~\azuredatastudio-python\lib\site-packages\sqlmlutils\sqlqueryexecutor.py:67, in SQLQueryExecutor.execute_query(self, query, params, out_file)
66 if params is not None:
---> 67 self._cursor.execute(query, params)
68 else:

ProgrammingError: ('42000', "[42000] [Microsoft][ODBC SQL Server Driver][SQL Server]A 'Python' script error occurred during execution of 'sp_execute_external_script' with HRESULT 0x80004004. External script request id is CD35FDA4-7E2E-425F-B013-8BD4D069AC74. (39004) (SQLExecDirectW); [42000] [Microsoft][ODBC SQL Server Driver][SQL Server]An external script error occurred: \n\r\nError in execution. Check the output for more information.\r (39019)")

During handling of the above exception, another exception occurred:

RuntimeError Traceback (most recent call last)
Cell In [1], line 3
1 import sqlmlutils
2 connection = sqlmlutils.ConnectionInfo(server="xx\xx", uid="sa", pwd="xxx")
----> 3 sqlmlutils.SQLPackageManager(connection).install("QuantLib")

File ~\azuredatastudio-python\lib\site-packages\sqlmlutils\packagemanagement\sqlpackagemanager.py:78, in SQLPackageManager.install(self, package, upgrade, version, install_dependencies, scope, out_file)
76 self._install_from_file(package, scope, upgrade, out_file=out_file)
77 else:
---> 78 self._install_from_pypi(package, upgrade, version, install_dependencies, scope, out_file=out_file)

File ~\azuredatastudio-python\lib\site-packages\sqlmlutils\packagemanagement\sqlpackagemanager.py:158, in SQLPackageManager._install_from_pypi(self, target_package, upgrade, version, install_dependencies, scope, out_file)
155 target_package = target_package + "==" + version
157 with tempfile.TemporaryDirectory() as temporary_directory:
--> 158 pipdownloader = PipDownloader(self._connection_info, temporary_directory, target_package, language_name = self._language_name)
159 target_package_file = pipdownloader.download_single()
160 self._install_from_file(target_package_file, scope, upgrade, out_file=out_file)

File ~\azuredatastudio-python\lib\site-packages\sqlmlutils\packagemanagement\pipdownloader.py:20, in PipDownloader.init(self, connection, downloaddir, targetpackage, language_name)
18 self._targetpackage = targetpackage
19 self._language_name = language_name
---> 20 server_info = SQLPythonExecutor(connection, self._language_name).execute_function_in_sql(servermethods.get_server_info)
21 globals().update(server_info)

File ~\azuredatastudio-python\lib\site-packages\sqlmlutils\sqlpythonexecutor.py:56, in SQLPythonExecutor.execute_function_in_sql(self, func, input_data_query, *args, **kwargs)
29 def execute_function_in_sql(self,
30 func: Callable, *args,
31 input_data_query: str = "",
32 **kwargs):
33 """Execute a function in SQL Server.
34
35 :param func: function to execute_function_in_sql. NOTE: This function is shipped to SQL as text.
(...)
54 [0.28366218546322625, 0.28366218546322625]
55 """
---> 56 df, _ = execute_query(SpeesBuilderFromFunction(func,
57 self._language_name,
58 input_data_query,
59 *args,
60 **kwargs),
61 self._connection_info)
63 results, output, error = self._get_results(df)
65 if output is not None:

File ~\azuredatastudio-python\lib\site-packages\sqlmlutils\sqlqueryexecutor.py:24, in execute_query(builder, connection, out_file)
22 def execute_query(builder, connection: ConnectionInfo, out_file:str=None):
23 with SQLQueryExecutor(connection=connection) as executor:
---> 24 return executor.execute(builder, out_file=out_file)

File ~\azuredatastudio-python\lib\site-packages\sqlmlutils\sqlqueryexecutor.py:41, in SQLQueryExecutor.execute(self, builder, out_file)
40 def execute(self, builder: SQLBuilder, out_file=None):
---> 41 return self.execute_query(builder.base_script, builder.params, out_file=out_file)

File ~\azuredatastudio-python\lib\site-packages\sqlmlutils\sqlqueryexecutor.py:96, in SQLQueryExecutor.execute_query(self, query, params, out_file)
93 continue
95 except Exception as e:
---> 96 raise RuntimeError("Error in SQL Execution: " + str(e))
98 return df, output_params

RuntimeError: Error in SQL Execution: ('42000', "[42000] [Microsoft][ODBC SQL Server Driver][SQL Server]A 'Python' script error occurred during execution of 'sp_execute_external_script' with HRESULT 0x80004004. External script request id is CD35FDA4-7E2E-425F-B013-8BD4D069AC74. (39004) (SQLExecDirectW); [42000] [Microsoft][ODBC SQL Server Driver][SQL Server]An external script error occurred: \n\r\nError in execution. Check the output for more information.\r (39019)")

R script execution slow after package installation

This issue may not be directly related to this SQLMLUTILS, but maybe you would be able to provide some context or a potential solution.

As we install more libraries, especially tidyverse which installs 70+ sub packages, execution of even the most basic R script goes from sub second execution to 7+ seconds.

Here is the installation script:

library(sqlmlutils)

connection <- connectionInfo(
  server   = "XXXXXX",
  database = "XXXXXX")

r = getOption("repos")
r["CRAN"] = "https://mran.microsoft.com/snapshot/2019-02-01"
options(repos = r)

pkgs <- c("tictoc", "data.table", "tidyverse", "zoo", "DBI")
sql_install.packages(connectionString = connection, pkgs = pkgs, verbose = TRUE, scope = "PUBLIC")

The R script I am testing with is:

EXECUTE sp_execute_external_script @language = N'R'
    , @script = N'
a <- 1
b <- 2
c <- a/b
d <- a*b
print(c(c, d))
'

As a separate test on the same server to rule out any issues with the packages, I installed the packages using the process documented for SQL Server 2017 which installs packages globally for all databases, the script executes in sub seconds times.

Script to test global installation of packages:

r = getOption("repos")
r["CRAN"] = "https://mran.microsoft.com/snapshot/2019-02-01"
options(repos = r)

pkgs <- c("tictoc", "data.table", "tidyverse", "zoo", "DBI")
install.packages(pkgs)

This would be fine, but our plan is to run the database in SQL Managed Instance on Azure and this global approach is not an option. By the way, I did try the test of running the simple script before and after installing packages on Azure and experienced the same behavior, less than a second before packages, 7+ seconds after packages.

Any guidance you could provide would be great!

Installation of any python package fails

Trying to install a python package on SQL2017 always fails with "Incorrect syntax near 'Python'"
e.g.
..
sqlmlutils.SQLPackageManager(connection).install("tensorflow")

fails with this exception:

Traceback (most recent call last):
File "C:\Anaconda3\lib\site-packages\sqlmlutils\sqlqueryexecutor.py", line 44, in execute
self._mssqlconn.execute_query(builder.base_script, builder.params)
File "src_mssql.pyx", line 1064, in _mssql.MSSQLConnection.execute_query
File "src_mssql.pyx", line 1095, in _mssql.MSSQLConnection.execute_query
File "src_mssql.pyx", line 1228, in _mssql.MSSQLConnection.format_and_run_query
File "src_mssql.pyx", line 1639, in _mssql.check_cancel_and_raise
File "src_mssql.pyx", line 1683, in _mssql.maybe_raise_MSSQLDatabaseException
_mssql.MSSQLDatabaseException: (102, b"Incorrect syntax near 'Python'.DB-Lib error message 20018, severity 15:\nGeneral SQL Server error: Check messages from the SQL Server\n")

It seems that the python script created by class CreateLibraryBuilder is invalid.

error to install new package

I am using Azure Data Studio and when I run the command: sqlmlutils.SQLPackageManager(connection).install("text-tools") I got the error below.
My python version is 3.8.2

RuntimeError Traceback (most recent call last)
in
1 import sqlmlutils
2 connection = sqlmlutils.ConnectionInfo(server="myserver", database="mydatabase", uid="myuser", pwd="mypass")
----> 3 sqlmlutils.SQLPackageManager(connection).install("text-tools")

~\AppData\Roaming\Python\Python36\site-packages\sqlmlutils\packagemanagement\sqlpackagemanager.py in install(self, package, upgrade, version, install_dependencies, scope, out_file)
70 self._install_from_file(package, scope, upgrade, out_file=out_file)
71 else:
---> 72 self._install_from_pypi(package, upgrade, version, install_dependencies, scope, out_file=out_file)
73
74 def uninstall(self,

~\AppData\Roaming\Python\Python36\site-packages\sqlmlutils\packagemanagement\sqlpackagemanager.py in _install_from_pypi(self, target_package, upgrade, version, install_dependencies, scope, out_file)
148 with tempfile.TemporaryDirectory() as temporary_directory:
149 pipdownloader = PipDownloader(self._connection_info, temporary_directory, target_package)
--> 150 target_package_file = pipdownloader.download_single()
151 self._install_from_file(target_package_file, scope, upgrade, out_file=out_file)
152

~\AppData\Roaming\Python\Python36\site-packages\sqlmlutils\packagemanagement\pipdownloader.py in download_single(self)
24
25 def download_single(self) -> str:
---> 26 _, pkgsdownloaded = self._download(False)
27 return pkgsdownloaded[0]
28

~\AppData\Roaming\Python\Python36\site-packages\sqlmlutils\packagemanagement\pipdownloader.py in _download(self, withdependencies)
42
43 if len(packagesdownloaded) <= 0:
---> 44 raise RuntimeError("Failed to download any packages, pip returned error: " + error)
45
46 return pkgreqs, packagesdownloaded

RuntimeError: Failed to download any packages, pip returned error: ERROR: Exception:
Traceback (most recent call last):
File "c:\users\user0123\azuredatastudio-python\0.0.1\lib\site-packages\pip_internal\cli\base_command.py", line 153, in _main
status = self.run(options, args)
File "c:\users\user0123\azuredatastudio-python\0.0.1\lib\site-packages\pip_internal\commands\download.py", line 144, in run
resolver.resolve(requirement_set)
File "c:\users\user0123\azuredatastudio-python\0.0.1\lib\site-packages\pip_internal\legacy_resolve.py", line 201, in resolve
self._resolve_one(requirement_set, req)
File "c:\users\user0123\azuredatastudio-python\0.0.1\lib\site-packages\pip_internal\legacy_resolve.py", line 365, in _resolve_one
abstract_dist = self._get_abstract_dist_for(req_to_install)
File "c:\users\user0123\azuredatastudio-python\0.0.1\lib\site-packages\pip_internal\legacy_resolve.py", line 311, in _get_abstract_dist_for
req.populate_link(self.finder, upgrade_allowed, self.require_hashes)
File "c:\users\user0123\azuredatastudio-python\0.0.1\lib\site-packages\pip_internal\req\req_install.py", line 225, in populate_link
self.link = finder.find_requirement(self, upgrade)
File "c:\users\user0123\azuredatastudio-python\0.0.1\lib\site-packages\pip_internal\index.py", line 879, in find_requirement
req.name, specifier=req.specifier, hashes=hashes,
File "c:\users\user0123\azuredatastudio-python\0.0.1\lib\site-packages\pip_internal\index.py", line 861, in find_best_candidate
candidates = self.find_all_candidates(project_name)
File "c:\users\user0123\azuredatastudio-python\0.0.1\lib\site-packages\pip_internal\index.py", line 806, in find_all_candidates
links=page_links,
File "c:\users\user0123\azuredatastudio-python\0.0.1\lib\site-packages\pip_internal\index.py", line 775, in evaluate_links
candidate = self.get_install_candidate(link_evaluator, link)
File "c:\users\user0123\azuredatastudio-python\0.0.1\lib\site-packages\pip_internal\index.py", line 754, in get_install_candidate
is_candidate, result = link_evaluator.evaluate_link(link)
File "c:\users\user0123\azuredatastudio-python\0.0.1\lib\site-packages\pip_internal\index.py", line 193, in evaluate_link
supported_tags = self._target_python.get_tags()
File "c:\users\user0123\azuredatastudio-python\0.0.1\lib\site-packages\pip_internal\models\target_python.py", line 102, in get_tags
impl=self.implementation,
File "c:\users\user0123\azuredatastudio-python\0.0.1\lib\site-packages\pip_internal\pep425tags.py", line 406, in get_supported
if is_manylinux2014_compatible():
File "c:\users\user0123\azuredatastudio-python\0.0.1\lib\site-packages\pip_internal\pep425tags.py", line 252, in is_manylinux2014_compatible
return pip.internal.utils.glibc.have_compatible_glibc(2, 17)
File "c:\users\user0123\azuredatastudio-python\0.0.1\lib\site-packages\pip_internal\utils\glibc.py", line 89, in have_compatible_glibc
version_str = glibc_version_string()
File "c:\users\user0123\azuredatastudio-python\0.0.1\lib\site-packages\pip_internal\utils\glibc.py", line 19, in glibc_version_string
return glibc_version_string_confstr() or glibc_version_string_ctypes()
File "c:\users\user0123\azuredatastudio-python\0.0.1\lib\site-packages\pip_internal\utils\glibc.py", line 51, in glibc_version_string_ctypes
process_namespace = ctypes.CDLL(None)
File "c:\users\user0123\azuredatastudio-python\0.0.1\lib\ctypes_init
.py", line 348, in init
self._handle = _dlopen(self._name, mode)
TypeError: LoadLibrary() argument 1 must be str, not None

sql_install.packages creates external package library for R packages

Hi, I'm checking which packages are already installed (SQL Server 2019) and the related library path using a specific connection string

connection <- connectionInfo( server = "server\\instance", database = "anyDB")

for the SQL server/instance. The default library path is C:\Program Files\Microsoft SQL Server\MSSQL15.MHPROD01\R_SERVICES\library .

sql_installed.packages(connectionString = connection, fields=c("Package", "LibPath", "Attributes", "Scope"))

When running this function to install an arbitrary package (for example "tidyverse" or any other)

sql_install.packages(connectionString = connection, pkgs = "tidyverse", verbose = TRUE, scope = "PUBLIC")

then a new external library path F:\Program Files\Microsoft SQL Server\MSSQL15.MHPROD01\MSSQL\ExternalLibraries\5\1\1\tidyverse\R is created.

Now, when running again

sql_installed.packages(connectionString = connection, fields=c("Package", "LibPath", "Attributes", "Scope"))

I can't find the previously installed new package.

Why is that? And how can I appropriately install new packages using sqlmlutils.

R module installation on SQL Managed Instance

I am using sqlmlutils to install R modules onto SQL Managed instance in Azure

The particular issue that I have is that although the log output for the installation of data.table shows as Version 1.14.6 being downloaded,

2022-11-21 15:04:11.81 Starting package install on SQL server (Driver=SQL Server;Server=;Database=master;uid=;pwd=;)...
2022-11-21 15:04:11.81 Verifying permissions to install packages on SQL server...
2022-11-21 15:04:22.92 Resolving package dependencies for (data.table)...
2022-11-21 15:04:25.88 Downloading package [1/1] data.table (1.14.6)...
2022-11-21 15:04:29.99 Copying package to Sql server [1/1] data.table...
2022-11-21 15:04:32.59 Installing packages to library path, this may take some time...
2022-11-21 15:04:34.79 Successfully installed packages on SQL server (data.table).

from SQL query after module installation, version reported is 1.12.8 (installed into C:/WFRoot/DB8C.1/Fabric/work/Applications/Worker.CL_App14/work/Data/ExternalLibraries/32763/1/1 while all other modules are in C:/WFRoot/Ext/R.9.4.7.1185/library)

Is there any way to get data.table 1.14.6 installed on SQL Managed instance as we'd like to use some of the most recent functions?

`iptools` no longer on CRAN, breaks test.sqlPackage.dependencies.R test

Test "Binary Package install with LinkingTo dependency" now fails since iptools is no longer on CRAN.

A special download process looks to be required to get this working which doesn't look to be possible with SQLMLUtils at this time. Essentially, need to use devtools(<install>), however, that isn't built into sqlmlutils.

Package ‘iptools’ was removed from the CRAN repository.

Formerly available versions can be obtained from the archive.

Archived on 2023-02-20 as issues were not corrected in time.

A summary of the most recent check results can be obtained from the check results archive.

Please use the canonical form https://cran.r-project.org/package=iptools to link to this page.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.