Wednesday, December 27, 2017

R Dataframe in the eyes of SQL developer.


R has gained lot of momentum in the last few years  for Data Science. At first , for a SQL professional , this may be bit daunting ; however there are lot of similarities  between the RDBMS concepts and R concepts , that will make the learning curve tad easier. One of the similarity is the data frame. 


Data frame is one of the important component in R to capture the data from the external data sources ( aka importing from CSV , loading from RDMS  , and so on ) . 

It is conceptually same as the a table in a RDBMS system.

In the following , I have created a data frame with  3 elements and 6 rows. 
In RDBMS , this is the same as creating a table 'emp' and inserting 6 records. 

emp <- data.frame="" span="">
  name=c("Zahir","Farook","Hameed","Basheer","Aslam","Suhaib"),
  deptno=c(10,20,30,30,20,20),
 city=c("Monroe","Trichy","Kilakarai","Kilakarai","Chennai","Chennai"))  
















When  the data frame is referenced at the prompt , it returns the entire data set. This is similar to "SELECT * FROM EMP",




The function "rbind" is used to insert a record into the existing dataset. 
This is similar to "INSERT INTO EMP values ('Karady' , 100 , 'Colombo') "

emp <- arady="" data.frame="" deptno="c(100),city=c(" emp="" name="c(" olombo="" rbind="" span="">

 

The function "nrow" is used to get the record count of the dataset.
This is similar to " SELECT count(*) from EMP". 


The function "ncol" is used to get the record count of the columns.
This is similar to " SELECT count(*) from information_schema.columns where table_name =EMP'" .


 With the following example , we are filtering the records that have deptno = 30 . This is similar to
"SELECT *  FROM EMP WHERE DEPTNO= 30'.



 We can add , additional filter with the pipe function . Pipe is used for 'OR' condition. 
This is similar to "SELECT *  FROM EMP WHERE DEPTNO= 30 ORCITY ='Chennai'.





As we can see , there  are lot of similarites in with the concept of Table (tuple) and the dataframe. 
This could be a starting point to get familiar with R  for a SQL professional . 

I understand , I have just scratched the surface on the data frame and its functions.

As of now , Oracle and MS SQL Server has incoprated 'R' into their offerings.

Comments Welcome.




Saturday, November 4, 2017

DBSAT in 12c .




Oracle has provided a new tool to assess the security configuration and advise on it. It is a lightweight utility and the most important thing is that it is FREE ( surprise!).  As long as , we have the active support from Oracle , we should be able to download the utility .

For the first release , it is very promising  , albeit the findings for Windows Operating Systems is limited. For Windows , it does not look the OS configurations. 

a) First , a user needs to be created with minimalist privileges . The documentation has the sample script that could be provisioned . 

create user dbsat_user identified by xxxxx;
// If Database Vault is enabled, connect as DV_ACCTMGR to run this command
grant create session to dbsat_user;
grant select_catalog_role to dbsat_user;
grant select on sys.registry$history to dbsat_user;
grant select on sys.dba_users_with_defpwd to dbsat_user; // 11g and 12c
grant select on audsys.aud$unified to dbsat_user; // 12c only
grant audit_viewer to dbsat_user; // 12c
grant capture_admin to dbsat_user;// 12c covers sys.dba_priv_captures, sys.priv_capture$, sys.capture_run_log$
// if Database Vault is enabled, connect as DV_OWNER to run this command
grant DV_SECANALYST to dbsat_user;
 
b) We need to "collect" the configuration  by running the  commands below. As usual , this will run " as the administrator". At the event of  the process , it will ask for the password . Make sure that it is secured as the file has very important information.  


D:\dbsat_output>set path=%path%;d:\dbsat

D:\dbsat_output>dbsat collect dbsat_user/xxxxxxxxxx dbsat_output20171024

This tool is intended to assist in you in identifying potential
vulnerabilities in your system, but you are solely responsible for
your system and the effect and results of the execution of this tool
(including, without limitation, any damage or data loss). Further,
the output generated by this tool may include potentially sensitive
system configuration data and information that could be used by a
skilled attacker to penetrate your system. You are solely responsible
for ensuring that the output of this tool, including any generated
reports, is handled in accordance with your company's policies.

Connecting to the target Oracle database...


SQL*Plus: Release 12.1.0.2.0 Production on Tue Oct 24 13:01:05 2017

Copyright (c) 1982, 2014, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Advanced Analytics
and Real Application Testing options

Database Security Assessment Tool version 1.0.2 (October 2016)
Setup complete.
SQL queries complete.
OS Commands Skipped.
BEGIN
*
ERROR at line 1:
ORA-20002: Complete without OS Commands.
ORA-06512: at line 4


Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Advanced Analytics
and Real Application Testing options
DBSAT Collector completed successfully.

Calling D:\app\oracle\product\12.1.0\dbhome_1\bin\zip.exe to encryptdbsat_output20171024.json...

Enter password:
Verify password:
  adding: dbsat_output20171024.json (164 bytes security) (deflated 88%)
zip completed successfully.


D:\dbsat_output>dir
Volume in drive D is New Volume
Volume Serial Number is 32D6-96CB

Directory of D:\dbsat_output

10/24/2017  01:01 PM              .
10/24/2017  01:01 PM              ..
10/24/2017  01:01 PM            29,227 dbsat_output20171024.zip
               1 File(s)         29,227 bytes
               2 Dir(s)  223,632,429,056 bytes free

c) We need to generate a report based on the file , generated from the last step. At the completion of this step , we will have three files ( text file , excel , html ) , that lists the recommendations. Be sure to safeguard this file , as the exposure to the outside world  would cause security risks that our database has now. 


D:\dbsat_output>dbsat report -a -n dbsat_output20171024

This tool is intended to assist in you in identifying potential
vulnerabilities in your system, but you are solely responsible for
your system and the effect and results of the execution of this tool
(including, without limitation, any damage or data loss). Further,
the output generated by this tool may include potentially sensitive
system configuration data and information that could be used by a
skilled attacker to penetrate your system. You are solely responsible
for ensuring that the output of this tool, including any generated
reports, is handled in accordance with your company's policies.

Archive:  dbsat_output20171024.zip
[dbsat_output20171024.zip] dbsat_output20171024.json password:
password incorrect--reenter:
password incorrect--reenter:
  inflating: dbsat_output20171024.json
Database Security Assessment Tool version 1.0.2 (October 2016)
DBSAT Reporter ran successfully.


Also , to generate the report , you would need python installed in your machine. 
My preference is to use the  html as it has links to the individual sections . 

It lists the vulnerability , opportunity ( to use RAS , OLS , ... ) 

This is one of the "Hands on Lab" sessions that I attended at OpenWorld 2017. It is worth looking into it , as it is a light weight and does the job in few minutes. 


For additional information , please visit  https://docs.oracle.com/cd/E76178_01/SATUG/toc.htm#SATUG-GUID-C7E917BB-EDAC-4123-900A-D4F2E561BFE9

Comments welcome. 


Wednesday, November 1, 2017

datapatch in Oracle 12c.



Post installation of PSU has been changed since oracle 12c . There is a new script ( datapatch) that takes care of all the details. 

Once the opatch has been done to apply the patch , we need to start the services / database in upgrade mode and run the datapatch. 

Here  is the example.


Post Installation using Datapatch
---------------------------------------


C:\26792364>cd %ORACLE_HOME%/OPatch

C:\app\Oracle\product\12.1.0\dbhome_2\OPatch>datapatch -verbose
SQL Patching tool version 12.1.0.2.0 Production on Wed Nov  1 14:42:31 2017
Copyright (c) 2012, 2017, Oracle.  All rights reserved.

Log file for this invocation: c:\app\oracle\cfgtoollogs\sqlpatch\sqlpatch_2256_2017_11_01_14_42_31\sqlpatch_invocation.log

Connecting to database...OK
Bootstrapping registry and package to current versions...done
Determining current state...done

Current state of SQL patches:
Patch 26792364 (WINDOWS ORACLE JAVAVM COMPONENT BUNDLE PATCH 12.1.0.2.171017 (64bit): 26792364):
  Installed in the binary registry only
Bundle series PSU:
  ID 171017 in the binary registry and ID 171017 in the SQL registry

Adding patches to installation queue and performing prereq checks...
Installation queue:
  Nothing to roll back
  The following patches will be applied:
    26792364 (WINDOWS ORACLE JAVAVM COMPONENT BUNDLE PATCH 12.1.0.2.171017 (64bit): 26792364)

Installing patches...
Patch installation complete.  Total patches installed: 1

Validating logfiles...
Patch 26792364 apply: SUCCESS
  logfile: c:\app\oracle\cfgtoollogs\sqlpatch\26792364\21612362/26792364_apply_CSDBQA12_2017Nov01_14_43_23.log (no errors)
SQL Patching tool complete on Wed Nov  1 14:47:53 2017


Confirm by checking the metadata
--------------------------------.


SQL> set linesize 1000
SQL> select * from
  2  ( Select  patch_id , patch_uid , version , status , action_time , flags
  3  from
  4  dba_registry_sqlpatch
  5  order by action_time desc )
  6  Where rownum <= 2 ;

  PATCH_ID  PATCH_UID VERSION              STATUS          ACTION_TIME                                                         FLAGS
---------- ---------- -------------------- --------------- --------------------------------------------------------------------------- ----------
  26792364   21612362 12.1.0.2             SUCCESS         01-NOV-17 02.47.53.329000 PM                                        UJJ

  26720785   21612430 12.1.0.2             SUCCESS         01-NOV-17 02.05.48.787000 PM                                        NB



I understand , oracle is changing the way the releases by going to Release Update / Release Update Revision .  Not sure , if this is going to be changed with the RU / RUR. 

And , I guess , with the autonomous database ( Oracle 18c ) , we need not worry about these kind of tasks . :-) 

Update :

I have delete the references to the Oracle 18c as autonomous . Thanks to Tim Hall for correcting this ( https://oracle-base.com/blog/2018/01/03/oracle-database-18c-is-not-an-autonomous-database/). 


Monday, July 17, 2017

Partitioning made easy - Part 3

Oracle has been simplifying and enhancing the  implementation of its partitioning features in every release.

I have listed some of the paritioning enhancements  in Oracle 12c Release in the following posts.

http://mfzahirdba.blogspot.com/2014/12/partitioning-made-easy.html
http://mfzahirdba.blogspot.com/2015/03/partitioning-made-easy-part-2.html

In Oracle 11g , we had auto creation of  partition in range based partitions.
See http://mfzahirdba.blogspot.com/2011/02/interval-partitioning.html

In Oracle 12c Release 2 ,  Oracle has introduced auto creation of partition in list based partitions.
Here is an example. 

SQL> CREATE TABLE t
  2   (
  3   oid int  ,
  4   oname varchar2(200),
  5   otype VARCHAR2(50)
  6   )
  7   partition BY list
  8    (otype)
  9    ( partition p_tab VALUES  ('TABLE'));

Table created.

SQL> insert into t
  2  select object_id ,
  3  object_name ,
  4  object_type from all_objects
  5  where object_type ='TABLE';

132 rows created.

Now , 132 records were inserted into the table t , as we have listed 'TABLE' as the value of the partitioning element.

If we try to insert records that are other than 'TABLE' , then we are greeted with the error. 

SQL> insert into t
  2  select object_id ,
  3  object_name ,
  4  object_type from all_objects
  5  where object_type ='PROCEDURE';

insert into t select object_id , object_name , object_type from all_objects where object_type ='PROCEDURE'
            *
ERROR at line 1:
ORA-14400: inserted partition key does not map to any partition

In release 2 , if we create the table in the following way , new values o otype will have its own partition. 

Let us create another table and perform the above mentioned steps. 

SQL> CREATE TABLE t1
  2   (
  3   oid int  ,
  4   oname varchar2(200),
  5   otype VARCHAR2(50)
  6   )
  7   partition BY list
  8    (otype) automatic
  9    ( partition p_tab VALUES  ('TABLE'));

Table created.

SQL> insert into t1
  2  select object_id ,
  3  object_name ,
  4  object_type from all_objects
  5  where object_type ='TABLE';

132 rows created.

SQL> insert into t1
  2  select object_id ,
  3  object_name ,
  4  object_type from all_objects
  5  where object_type ='PROCEDURE';

27 rows created.

The second insert created automatically created a partition to insert the new value ( aka , records with otype 'PROCEDURE') and given the partition system defined name . Of course , we can rename this partition to suit our needs later.  

SQL> col table_name format a5
SQL> col partition_name format a25
SQL> col high_value  format a25
SQL> select table_name , partition_name , high_value from user_tab_partitions;

SQL> select table_name , partition_name , high_value from user_tab_partitions;

TABLE PARTITION_NAME            HIGH_VALUE
----- ------------------------- -------------------------
T     P_TAB                     'TABLE'
T1    P_TAB                     'TABLE'
T1    SYS_P212                  'PROCEDURE'


With this enhancement , partitioning is made little bit easier. 
Comments welcome.

Monday, March 6, 2017

LISTAGG in 12c Release 2




Back in 2009 , I blogged about 'LISTAGG' in 11g Release 2  ( http://mfzahirdba.blogspot.de/2009/10/analytics-20.html) to concatenate the columns into one string. 

It was neat but we ran  into a error if the resultant was more than 4000 characters.

With this release ( 12c release 2) , it has gotten little bit better . We can avoid the ORA-01489 by adding few clauses .  

Let us create a test case. 


 1  SELECT owner  ,
  2    object_type ,
  3    listagg(object_name , '~' ) within GROUP (
  4  ORDER BY object_name )
  5  FROM t
  6  WHERE owner IN ('ZAHIR' )
  7  GROUP BY owner ,
  8    object_type
  9* ORDER BY object_type
SQL> /
ERROR:

ORA-01489: result of string concatenation is too long


When we add the overflow clause  as highlighed below, it does not produce error but truncates the result with the wordcount of how many characters  left in the string.

 1  SELECT owner  ,
  2    object_type ,
  3    listagg(object_name , '~' ON overflow
  4  TRUNCATE ) within GROUP (
  5  ORDER BY object_name )
  6  FROM t
  7  WHERE owner IN ('ZAHIR' , 'OUTLN')
  8  GROUP BY owner ,
  9    object_type
 10* ORDER BY object_type
SQL> /


OWNER OBJECT_TYPE
-------------------------------------------------------------------------------------------------------------------------------- -----------------------
LISTAGG(OBJECT_NAME,'~'ONOVERFLOWTRUNCATE)WITHINGROUP(ORDERBYOBJECT_NAME)
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
ZAHIR INDEX
IDX_EMP

ZAHIR INDEX PARTITION
BIN$Sh
....
...
....

~BIN$ShGc5cGpFengUwEAAH80sg==$0~...(600)

By default , the wordcount appears  . We can remove the wordcount as below. 

  1  SELECT owner  ,
  2    object_type ,
  3    listagg(object_name , '~' ON overflow
  4  TRUNCATE WITHOUT COUNT  ) within GROUP (
  5  ORDER BY object_name )
  6  FROM t
  7  WHERE owner IN ('ZAHIR' )
  8  GROUP BY owner ,
  9    object_type
 10* ORDER BY object_type
SQL> /


ZAHIR  INDEX
IDX_EMP

ZAHIR  INDEX PARTITION
BIN$Sh
....
...
....

~BIN$ShGc5cGpFengUwEAAH80sg==$0~... 

We can add explicit information with some static text . In my example below , I have added a text as 'Characters Trimmed' 

  1  SELECT owner  ,
  2    object_type ,
  3    listagg(object_name , '~' ON overflow
  4  TRUNCATE 'Characters  Trimmed'
  5  WITH COUNT ) within GROUP (
  6  ORDER BY object_name )
  7  FROM t
  8  WHERE owner IN ('ZAHIR' )
  9  GROUP BY owner ,
 10    object_type
 11* ORDER BY object_type




ZAHIR  INDEX
IDX_EMP

ZAHIR  INDEX PARTITION
BIN$Sh
....
...
....

~BIN$ShGc5cGpFengUwEAAH80sg==$0~Characters  Trimmed(601)


These are good enhancements . I wish , we have a distinct options in the LISTAGG results . That will be immensely helpful.  Hopefully , we will see this in the next few releases. 

Comments welcome.