Monday, May 4, 2020

Accessing Python Libraries in SQL Server



I have been exploring to setting up a contextualized search on one of my datasets. For this experiment,  I was planning on using Machine Learning Services  ( MLS) in SQL Server 2019.
And , the library I was word2vec . This library is not part of the Standard libraries that came with SQL Server MLS .

Took sometime , to get this installed in my SQL Server environment. I was hoping , this blog post will be helpful , if anyone is trying to install python libraries in SQL Server .

First , we need to identify the system path , where Python is installed.


C:\>sqlcmd -W -S SERNJ\SQL2019DB
1>
2> EXEC sp_execute_external_script
3>   @language =N'Python',
4>   @script=N'import sys; print("\n".join(sys.path))'
5> go
STDOUT message(s) from external script:

C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\python37.zip
C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\DLLs
C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\lib
C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES
C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\lib\site-packages
C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\lib\site-packages\win32
C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\lib\site-packages\win32\lib
C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\lib\site-packages\Pythonwin
C:\ProgramData\SQL2019DEV\Temp-PY\Appcontainer1\3E21F1D4-0032-473A-A2EB-D6763111F7F9\rxLibs
C:\ProgramData\SQL2019DEV\Temp-PY\Appcontainer1\7AD5E2EF-A002-4B2E-8ED1-4BFC71DBDB57


Then navigate to the python services scripts  ( in my case , C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\Scripts ) directory and install gensim using pip.


C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\Scripts>pip install gensim
pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
Collecting gensim
  Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/gensim/
  Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/gensim/
  Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/gensim/
  Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/gensim/
  Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/gensim/
  Could not fetch URL https://pypi.org/simple/gensim/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/gensim/ (Caused by SSLError("Can't connect to HTTPS URL because the SSL module is not available.")) - skipping
  Could not find a version that satisfies the requirement gensim (from versions: )
No matching distribution found for gensim
pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
Could not fetch URL https://pypi.org/simple/pip/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/pip/ (Caused by SSLError("Can't connect to HTTPS URL because the SSL module is not available.")) - skipping

For the above command to work , we need to add these directories to the path. 

C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\Lib\site-packages\conda
C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\Library\bin
C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\Scripts

Now , try the pip again to install gensim package.

C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\Library>pip install gensim

Collecting gensim
    100% |████████████████████████████████| 24.2MB 1.3MB/s
Collecting Cython==0.29.14 (from gensim)
    100% |████████████████████████████████| 1.7MB 6.5MB/s
Collecting smart-open>=1.8.1 (from gensim)
    100% |████████████████████████████████| 112kB 3.3MB/s
Requirement already satisfied: scipy>=0.18.1 in c:\program files\microsoft sql server\mssql15.sql2019dev\python_services\lib\site-packages (from gensim) (1.1.0)
Requirement already satisfied: six>=1.5.0 in c:\program files\microsoft sql server\mssql15.sql2019dev\python_services\lib\site-packages (from gensim) (1.12.0)
Requirement already satisfied: numpy>=1.11.3 in c:\program files\microsoft sql server\mssql15.sql2019dev\python_services\lib\site-packages (from gensim) (1.15.4)
Requirement already satisfied: requests in c:\program files\microsoft sql server\mssql15.sql2019dev\python_services\lib\site-packages (from smart-open>=1.8.1->gensim) (2.21.0)
Collecting boto (from smart-open>=1.8.1->gensim)
    100% |████████████████████████████████| 1.4MB 6.5MB/s
Collecting boto3 (from smart-open>=1.8.1->gensim)
    100% |████████████████████████████████| 133kB 6.5MB/s
Requirement already satisfied: idna<2 .9="">=2.5 in c:\program files\microsoft sql server\mssql15.sql2019dev\python_services\lib\site-packages (from requests->smart-open>=1.8.1->gensim) (2.8)
Requirement already satisfied: urllib3<1 .25="">=1.21.1 in c:\program files\microsoft sql server\mssql15.sql2019dev\python_services\lib\site-packages (from requests->smart-open>=1.8.1->gensim) (1.24.1)
Requirement already satisfied: certifi>=2017.4.17 in c:\program files\microsoft sql server\mssql15.sql2019dev\python_services\lib\site-packages (from requests->smart-open>=1.8.1->gensim) (2019.3.9)
Requirement already satisfied: chardet<3 .1.0="">=3.0.2 in c:\program files\microsoft sql server\mssql15.sql2019dev\python_services\lib\site-packages (from requests->smart-open>=1.8.1->gensim) (3.0.4)
Collecting s3transfer<0 .4.0="">=0.3.0 (from boto3->smart-open>=1.8.1->gensim)
    100% |████████████████████████████████| 71kB 4.6MB/s
Collecting botocore<1 .17.0="">=1.16.1 (from boto3->smart-open>=1.8.1->gensim)
    100% |████████████████████████████████| 6.2MB 6.7MB/s
Collecting jmespath<1 .0.0="">=0.7.1 (from boto3->smart-open>=1.8.1->gensim)
Requirement already satisfied: docutils<0 .16="">=0.10 in c:\program files\microsoft sql server\mssql15.sql2019dev\python_services\lib\site-packages (from botocore<1 .17.0="">=1.16.1->boto3->smart-open>=1.8.1->gensim) (0.14)
Requirement already satisfied: python-dateutil<3 .0.0="">=2.1 in c:\program files\microsoft sql server\mssql15.sql2019dev\python_services\lib\site-packages (from botocore<1 .17.0="">=1.16.1->boto3->smart-open>=1.8.1->gensim) (2.8.0)
Building wheels for collected packages: smart-open
  Running setup.py bdist_wheel for smart-open ... done
  Stored in directory: C:\Users\zahir\AppData\Local\pip\Cache\wheels\27\65\38\8d7f5fe8d7afb4e4566587b2d1933cec185fba19257836c943
Successfully built smart-open
Installing collected packages: Cython, boto, jmespath, botocore, s3transfer, boto3, smart-open, gensim
  Found existing installation: Cython 0.29.2
    Uninstalling Cython-0.29.2:
      Successfully uninstalled Cython-0.29.2
Successfully installed Cython-0.29.14 boto-2.49.0 boto3-1.13.1 botocore-1.16.1 gensim-3.8.3 jmespath-0.9.5 s3transfer-0.3.3 smart-open-2.0.0


Now , we should be able to use gensim in SQL Server . 
More to come on this. 


Stay tuned. 

No comments:

Post a Comment