Friday, December 11, 2020

Cleanup of Audit Trail in Oracle

 

Its been a while, since I blogged. Lot of things are going on like most of us. 

Hopefully , in the coming weeks and 2021 onwards, things go smooth

It is recommended that we move  objects related to audit trail ( Unified Audit Trail / FGA Audit Trail , ... ) to a different tablespace rather than default tablespace. We can use dbms_audit_mgmt package to perform the move.

In the following example , we moved the audit related objects to a different tablespace TS_AUD_TRAIL


SQL> Select table_name , tablespace_name

  2  from dba_tables where table_name in ( 'AUD$' , 'FGA_LOG$')

  3  ;

 

TABLE_NAME                     TABLESPACE_NAME

------------------------------ ------------------------------

FGA_LOG$                       SYSAUX

AUD$                           SYSAUX

 

SQL> begin

  2    dbms_audit_mgmt.set_audit_trail_location( audit_trail_type =>dbms_audit_mgmt.audit_trail_aud_std   ,audit_trail_location_value => 'TS_AUD_TRAIL' ) ;

  3  end ;

  4  /

 

PL/SQL procedure successfully completed.

 

SQL> begin

       dbms_audit_mgmt.set_audit_trail_location( audit_trail_type =>dbms_audit_mgmt.audit_trail_fga_std   ,audit_trail_location_value => 'TS_AUD_TRAIL' ) ;

     end;

/

 

PL/SQL procedure successfully completed.

 

SQL> Select table_name , tablespace_name

     from dba_tables where table_name in ( 'AUD$' , 'FGA_LOG$')

     /

 

TABLE_NAME                     TABLESPACE_NAME

------------------------------ ------------------------------

FGA_LOG$                       TS_AUD_TRAIL

AUD$                           TS_AUD_TRAIL

 

Also , it is better to do a clean up of these records as it is expected to grow. This is important , if we don't move these tables from the default tablespace. There are few other bugs that also causes SYSAUX to grow.
Of course , we need to make sure that we are in compliance with the corporate's retention policy . 
Management of audit trail can be either be manual or automatic. 

To manually clear the audit trail , we can follow the following steps. 
We will first check , when was the last time , we did manual cleanup. 

SQL> SELECT * FROM DBA_AUDIT_MGMT_LAST_ARCH_TS;

no rows selected


DBA_AUDIT_MGMT_LAST_ARCH_TS shows that there was no previous cleanup. 
Now , I want to purge any records before Jan 1st 2020. For this , I set the date in DBMS_AUDIT_MGMT.SET_LAST_ARCHIVE_TIMESTAMP and then perform the clean up.

SQL> BEGIN
  2    DBMS_AUDIT_MGMT.SET_LAST_ARCHIVE_TIMESTAMP (
  3      AUDIT_TRAIL_TYPE => DBMS_AUDIT_MGMT.AUDIT_TRAIL_UNIFIED,
  4      LAST_ARCHIVE_TIME => '01-JAN-2020');
  5  END;
  6  /

PL/SQL procedure successfully completed.

SQL> BEGIN
  2    DBMS_AUDIT_MGMT.clean_audit_trail(
  3     audit_trail_type        => DBMS_AUDIT_MGMT.AUDIT_TRAIL_UNIFIED,
  4     use_last_arch_timestamp => TRUE);
  5  END;
  6  /

PL/SQL procedure successfully completed.

After the clean up , we re run the query to list the contents from DBA_AUDIT_MGMT_LAST_ARCH_TS. Now , it has the entry .

SQL> SELECT * FROM DBA_AUDIT_MGMT_LAST_ARCH_TS;

AUDIT_TRAIL          RAC_INSTANCE
-------------------- ------------
LAST_ARCHIVE_TS
---------------------------------------------------------------------------
DATABASE_ID CONTAINER_GUID
----------- ---------------------------------
UNIFIED AUDIT TRAIL             0
01-JAN-20 12.00.00.000000 AM +00:00
5553077069 733F2AB490104389843DFGFADFDS


SQL> BEGIN
  2    DBMS_AUDIT_MGMT.SET_LAST_ARCHIVE_TIMESTAMP (
  3      AUDIT_TRAIL_TYPE => DBMS_AUDIT_MGMT.AUDIT_TRAIL_AUD_STD,
  4      LAST_ARCHIVE_TIME => '01-JAN-2020');
  5  END;
  6  /
BEGIN
*
ERROR at line 1:
ORA-46258: Cleanup not initialized for the audit trail
ORA-06512: at "SYS.DBMS_AUDIT_MGMT", line 177
ORA-06512: at "SYS.DBMS_AUDIT_MGMT", line 941
ORA-06512: at line 2


Elapsed: 00:00:00.01
SQL> SELECT JOB_NAME, JOB_STATUS, AUDIT_TRAIL from DBA_AUDIT_MGMT_CLEANUP_JOBS;

no rows selected

Elapsed: 00:00:00.00
SQL> BEGIN
  2  DBMS_AUDIT_MGMT.INIT_CLEANUP(audit_trail_type => DBMS_AUDIT_MGMT.AUDIT_TRAIL_AUD_STD, default_cleanup_interval => 24);
  3  END;
  4  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:15.32
SQL> BEGIN
  2    DBMS_AUDIT_MGMT.SET_LAST_ARCHIVE_TIMESTAMP (
  3      AUDIT_TRAIL_TYPE => DBMS_AUDIT_MGMT.AUDIT_TRAIL_AUD_STD,
  4      LAST_ARCHIVE_TIME => '01-JAN-2020');
  5  END;
  6  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.00
SQL> BEGIN
  2    DBMS_AUDIT_MGMT.clean_audit_trail(
  3     audit_trail_type        => DBMS_AUDIT_MGMT.AUDIT_TRAIL_AUD_STD,
  4     use_last_arch_timestamp => TRUE);
  5  END;
  6  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:07.70
SQL>


By performing clean up of these audit trail , we can keep the tablespace's storage under control. 
Comments welcome. 

 






Wednesday, May 13, 2020

Stored procedure cannot be the input to the Python script in MS SQL Server


I am in the journey of using python in MS SQL Stored procedures.

I am big fan of stored procedures since my earlier days in Sybase in 1993 up till now. 
The same for Oracle packages and procedures. 
With stored procedures , we can encapsulate all the business and data logic.
Being the stored procedure is inside the database , the data and logic is secure , highly performant and so on. 
In essence , it is an API that can be called from any languages / tools. 

MS SQL server started supporting embedding python scripts in T-SQL. 
Unfortunately , the input for the python cannot be stored procedure. 
Hopefully , it is changed in the coming releases. 

Here we go 

a) We will create a table emp with 2 records in it. 

C:\>sqlcmd -W -S DBSER\SQL2019DEV -d rptsdb
1>
2> select * from emp;
3> go
id fname lname
-- ----- -----
1 Jane Doe
1 John Doe

(2 rows affected)

b) Create python external script with simple select as the input. 

1> EXECUTE sp_execute_external_script @language = N'Python',
2> @script = N'
3~ ## print(InputDataSet)
4~ OutputDataSet = InputDataSet
5~ ',
6> @input_data_1 = N'SELECT fname  , lname  FROM emp  ORDER BY 1;'
7> WITH RESULT SETS ((  fname nvarchar(50), lname nvarchar(50)     ));
8> go

fname lname
----- -----
Jane Doe
John Doe

(2 rows affected)

c) Create stored procedure to list the records from the table 'emp' 


1> create procedure p_emp_list
2> as
3> begin
4> SELECT fname , lname
5> FROM  emp
6> end;
7> go

1> exec p_emp_list;
2> go

fname lname
----- -----
Jane Doe
John Doe

(2 rows affected)

d) Let us use the above stored procedure as the input to Python script 

1> EXECUTE sp_execute_external_script @language = N'Python',
2> @script = N'
3~ ## print(InputDataSet)
4~ OutputDataSet = InputDataSet
5~ ',
6> @input_data_1 = N'exec p_emp_list;'
7> WITH RESULT SETS ((  fname nvarchar(50), lname nvarchar(50) ));
8> go

Msg 39001, Level 16, State 2, Server DBSER\SQL2019DEV, Line 1
Only SELECT statement is supported for input data query to 'sp_execute_external_script' stored procedure.
Msg 11536, Level 16, State 1, Server DBSER\SQL2019DEV, Line 1
EXECUTE statement failed because its WITH RESULT SETS clause specified 1 result set(s), but the statement only sent 0 result set(s) at run time.



Monday, May 4, 2020

Accessing Python Libraries in SQL Server



I have been exploring to setting up a contextualized search on one of my datasets. For this experiment,  I was planning on using Machine Learning Services  ( MLS) in SQL Server 2019.
And , the library I was word2vec . This library is not part of the Standard libraries that came with SQL Server MLS .

Took sometime , to get this installed in my SQL Server environment. I was hoping , this blog post will be helpful , if anyone is trying to install python libraries in SQL Server .

First , we need to identify the system path , where Python is installed.


C:\>sqlcmd -W -S SERNJ\SQL2019DB
1>
2> EXEC sp_execute_external_script
3>   @language =N'Python',
4>   @script=N'import sys; print("\n".join(sys.path))'
5> go
STDOUT message(s) from external script:

C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\python37.zip
C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\DLLs
C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\lib
C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES
C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\lib\site-packages
C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\lib\site-packages\win32
C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\lib\site-packages\win32\lib
C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\lib\site-packages\Pythonwin
C:\ProgramData\SQL2019DEV\Temp-PY\Appcontainer1\3E21F1D4-0032-473A-A2EB-D6763111F7F9\rxLibs
C:\ProgramData\SQL2019DEV\Temp-PY\Appcontainer1\7AD5E2EF-A002-4B2E-8ED1-4BFC71DBDB57


Then navigate to the python services scripts  ( in my case , C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\Scripts ) directory and install gensim using pip.


C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\Scripts>pip install gensim
pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
Collecting gensim
  Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/gensim/
  Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/gensim/
  Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/gensim/
  Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/gensim/
  Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/gensim/
  Could not fetch URL https://pypi.org/simple/gensim/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/gensim/ (Caused by SSLError("Can't connect to HTTPS URL because the SSL module is not available.")) - skipping
  Could not find a version that satisfies the requirement gensim (from versions: )
No matching distribution found for gensim
pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
Could not fetch URL https://pypi.org/simple/pip/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/pip/ (Caused by SSLError("Can't connect to HTTPS URL because the SSL module is not available.")) - skipping

For the above command to work , we need to add these directories to the path. 

C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\Lib\site-packages\conda
C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\Library\bin
C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\Scripts

Now , try the pip again to install gensim package.

C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019DEV\PYTHON_SERVICES\Library>pip install gensim

Collecting gensim
    100% |████████████████████████████████| 24.2MB 1.3MB/s
Collecting Cython==0.29.14 (from gensim)
    100% |████████████████████████████████| 1.7MB 6.5MB/s
Collecting smart-open>=1.8.1 (from gensim)
    100% |████████████████████████████████| 112kB 3.3MB/s
Requirement already satisfied: scipy>=0.18.1 in c:\program files\microsoft sql server\mssql15.sql2019dev\python_services\lib\site-packages (from gensim) (1.1.0)
Requirement already satisfied: six>=1.5.0 in c:\program files\microsoft sql server\mssql15.sql2019dev\python_services\lib\site-packages (from gensim) (1.12.0)
Requirement already satisfied: numpy>=1.11.3 in c:\program files\microsoft sql server\mssql15.sql2019dev\python_services\lib\site-packages (from gensim) (1.15.4)
Requirement already satisfied: requests in c:\program files\microsoft sql server\mssql15.sql2019dev\python_services\lib\site-packages (from smart-open>=1.8.1->gensim) (2.21.0)
Collecting boto (from smart-open>=1.8.1->gensim)
    100% |████████████████████████████████| 1.4MB 6.5MB/s
Collecting boto3 (from smart-open>=1.8.1->gensim)
    100% |████████████████████████████████| 133kB 6.5MB/s
Requirement already satisfied: idna<2 .9="">=2.5 in c:\program files\microsoft sql server\mssql15.sql2019dev\python_services\lib\site-packages (from requests->smart-open>=1.8.1->gensim) (2.8)
Requirement already satisfied: urllib3<1 .25="">=1.21.1 in c:\program files\microsoft sql server\mssql15.sql2019dev\python_services\lib\site-packages (from requests->smart-open>=1.8.1->gensim) (1.24.1)
Requirement already satisfied: certifi>=2017.4.17 in c:\program files\microsoft sql server\mssql15.sql2019dev\python_services\lib\site-packages (from requests->smart-open>=1.8.1->gensim) (2019.3.9)
Requirement already satisfied: chardet<3 .1.0="">=3.0.2 in c:\program files\microsoft sql server\mssql15.sql2019dev\python_services\lib\site-packages (from requests->smart-open>=1.8.1->gensim) (3.0.4)
Collecting s3transfer<0 .4.0="">=0.3.0 (from boto3->smart-open>=1.8.1->gensim)
    100% |████████████████████████████████| 71kB 4.6MB/s
Collecting botocore<1 .17.0="">=1.16.1 (from boto3->smart-open>=1.8.1->gensim)
    100% |████████████████████████████████| 6.2MB 6.7MB/s
Collecting jmespath<1 .0.0="">=0.7.1 (from boto3->smart-open>=1.8.1->gensim)
Requirement already satisfied: docutils<0 .16="">=0.10 in c:\program files\microsoft sql server\mssql15.sql2019dev\python_services\lib\site-packages (from botocore<1 .17.0="">=1.16.1->boto3->smart-open>=1.8.1->gensim) (0.14)
Requirement already satisfied: python-dateutil<3 .0.0="">=2.1 in c:\program files\microsoft sql server\mssql15.sql2019dev\python_services\lib\site-packages (from botocore<1 .17.0="">=1.16.1->boto3->smart-open>=1.8.1->gensim) (2.8.0)
Building wheels for collected packages: smart-open
  Running setup.py bdist_wheel for smart-open ... done
  Stored in directory: C:\Users\zahir\AppData\Local\pip\Cache\wheels\27\65\38\8d7f5fe8d7afb4e4566587b2d1933cec185fba19257836c943
Successfully built smart-open
Installing collected packages: Cython, boto, jmespath, botocore, s3transfer, boto3, smart-open, gensim
  Found existing installation: Cython 0.29.2
    Uninstalling Cython-0.29.2:
      Successfully uninstalled Cython-0.29.2
Successfully installed Cython-0.29.14 boto-2.49.0 boto3-1.13.1 botocore-1.16.1 gensim-3.8.3 jmespath-0.9.5 s3transfer-0.3.3 smart-open-2.0.0


Now , we should be able to use gensim in SQL Server . 
More to come on this. 


Stay tuned. 

Tuesday, April 21, 2020

Inline External table in Oracle



It is good to be back. Hope everyone is safe. 
Let us pray for everyone whose lives have been impacted by #covid19.

Oracle has been enhancing "external table" features in each release . 

Here is one of my post describing the support for wildcard characters in the file location. 
 http://mfzahirdba.blogspot.com/2015/07/external-table-enhancements-in-oracle.html

There were few external table enhancements in Oracle 18c. 
One of them is that you do not need to create table first and then reference that external table in your query.

In Oracle 18c and later , I can define the external table definition inline and access the files without defining the external table.

Let us use the same example files in http://mfzahirdba.blogspot.com/2015/07/external-table-enhancements-in-oracle.html

Here is an example . 

Sign on and create directory and grant the access to regular user ( in our case , zahir).

SQL> create directory EX_TAB_DIR as 'c:\datadumps\';

Directory created.

SQL> grant read , write , execute on directory EX_TAB_DIR to zahir;

Grant succeeded.

Sign on to the database with the user account.

SQL> show user
USER is "ZAHIR"


SQL> ho dir c:\datadumps
 Volume in drive C has no label.
 Volume Serial Number is E247-41EF

 Directory of c:\datadumps

04/21/2020  05:09 PM   
          .
04/21/2020  05:09 PM   
          ..
04/21/2020  04:52 PM                28 emp1.txt
04/21/2020  04:52 PM                25 emp2.txt
04/21/2020  04:53 PM                50 emp500.txt
           


SQL> ho type c:\datadumps\emp1.txt type c:\datadumps\emp2.txt type c:\datadumps\emp500.txt

c:\datadumps\emp1.txt


101,zahir,F
102,mohamed,R
The system cannot find the file specified.
Error occurred while processing: type.

c:\datadumps\emp2.txt


201,Hameed,A
252,Abdul,RThe system cannot find the file specified.
Error occurred while processing: type.

c:\datadumps\emp500.txt


511,Balasubramaniam,A
577,Yesudas,A
501,Janaki,A



SQL>  col emp_id format a5
SQL> col emp_name  format a20
SQL> col emp_type format a5
SQL>  SELECT
  2      emp_id,
  3      emp_name,
  4      emp_type
  5  FROM
  6      external
  7  ( ( emp_id   VARCHAR2(1000),
  8             emp_name VARCHAR2(1000),
  9             emp_type VARCHAR2(1000)
 10             )
 11      TYPE ORACLE_LOADER DEFAULT DIRECTORY "EX_TAB_DIR"
 12      ACCESS PARAMETERS
 13      ( debug = 3
 14      RECORDS DELIMITED BY NEWLINE
 15          CHARACTERSET we8mswin1252
 16          BADFILE ex_tab_dir : 'emp_ext.bad_xt'
 17          LOGFILE 'emp_ext.log_xt'
 18      FIELDS TERMINATED BY "," LDRTRIM MISSING FIELD VALUES ARE NULL REJECT ROWS WITH ALL NULL FIELDS
 19  )
 20  LOCATION ( 'emp*.txt' ) reject limit unlimited )
 21   ;

EMP_I EMP_NAME             EMP_T
----- -------------------- -----
101   zahir                F
102   mohamed              R
201   Hameed               A
252   Abdul                R
511   Balasubramaniam      A
577   Yesudas              A
501   Janaki               A

7 rows selected.


In the program listing , the external table is defined inline between the lines 7 and 20.With this method, it is easier to load the data from flat files into the database without defining the table first. 

Comments Welcome.