Thursday, April 26, 2018

Case Sensitiveness - Part 2



Oracle maintains the case sensitiveness by default.

Prior to Oracle 12c Release 2 , we can achieve the nls settings at the session level.
Please see the blog post here ( http://mfzahirdba.blogspot.com/2010/04/case-sensitiveness.html)


With Oracle 12c Release 2 , we can set it at the column level , table level and of course at the session level.Column level settings takes precedence over other settings .

Here is the example .

Let us create the table in  a traditional way . 

SQL> CREATE TABLE empl_reg (
  2          firstname   VARCHAR2(30) ,
  3          lastname    VARCHAR2(30) ,
  4      sal         INT
  5  );

Table created.

SQL> insert into empl_reg values ( 'Zahir','mohideen', 1000 ) ;

1 row created.

SQL> insert into empl_reg values ( 'zahir','mohideen', 2000 ) ;

1 row created.

SQL> insert into empl_reg values ( 'zAhir','mohIdEen', 5000 ) ;

1 row created.

SQL> insert into empl_reg  values ( 'Abdul','MohIdEen', 50000 ) ;

1 row created.

SQL> commit;

Commit complete.

SQL>  Select * from empl_reg where firstname ='zahir';

FIRSTNA LASTNAME            SAL
------- ------------ ----------
zahir   mohideen           2000

In this case , to get all records with First name as 'Zahir' , as a workaround , we typically use lower/upper function. Sometimes , we will use Function Based Index on this functions to boost the performance. 

SQL> Select * from empl_reg where lower(firstname) ='zahir';

FIRSTNA LASTNAME            SAL
------- ------------ ----------
Zahir   mohideen           1000
zahir   mohideen           2000
zAhir   mohIdEen           5000


Now , let us create the table with columns - collation case insensitive ( ci) way .

SQL> Create table emp
  2  ( firstname varchar2(30) collate binary_ci ,
  3    lastname varchar2(30) collate binary_ci ,
  4    sal int
  5  ) ;

Create table emp
*
ERROR at line 1:
ORA-43929: Collation cannot be specified if parameter MAX_STRING_SIZE=STANDARD is set.


You would need to set the parameter to extended at the CDB , PDB SEED  and PDBs and bounce the instance , as this is a static parameter.

This is a bit tedious workflow . Hopefully , in the next release , this process is made simpler . 


SQL> CREATE TABLE empl (
  2          firstname   VARCHAR2(30) COLLATE binary_ci,
  3          lastname    VARCHAR2(30) COLLATE binary_ci,
  4      sal         INT
  5  );

Table created.

SQL> insert into empl values ( 'Zahir','mohideen', 1000 ) ;

1 row created.

SQL> insert into empl values ( 'zahir','mohideen', 2000 ) ;

1 row created.

SQL> insert into empl values ( 'zAhir','mohIdEen', 5000 ) ;

1 row created.

SQL> insert into empl values ( 'Abdul','MohIdEen', 50000 ) ;

1 row created.

SQL> commit;

Commit complete.

SQL> Select * from empl where firstname ='zahir';

FIRSTNA LASTNAME            SAL
------- ------------ ----------
Zahir   mohideen           1000
zahir   mohideen           2000
zAhir   mohIdEen           5000 

Now , we can get all the three records no matter what the case is .
As we can see, the explain plan does nlsssort .

SQL> explain plan for Select * from empl where firstname ='zahir';

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 3277272027

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    47 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| EMPL |     1 |    47 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

   1 - filter(NLSSORT("FIRSTNAME",'nls_sort=''BINARY_CI''')=HEXTORAW('7A
              6168697200'))

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)


18 rows selected.


SQL> explain plan for Select * from empl_reg where firstname ='zahir';

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1040152178

------------------------------------------------------------------------------
| Id  | Operation         | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |          |     1 |    47 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| EMPL_REG |     1 |    47 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

   1 - filter("FIRSTNAME"='zahir')

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)


17 rows selected.



Comments Welcome!



Wednesday, April 25, 2018

APPROX_PERCENTILE in Oracle 12cR2



In my one of my previous blog post (http://mfzahirdba.blogspot.com/2012/09/difference-between-percentilecont-and.html) , we discussed about the PERCENTILE_CONT , PERCENTILE_DISC and median . 

Beginning in 12c Release , we can get the approximate value of PERCENTILE_CONT and these related functions . 

The adavcantage is that this is MUCH faster than regular PERCENTILE_CONT functions . This is very much useful , when we don't need the exact value for the results . Examples could be analyzing the website visits , surveys , etc. 

As it can be seen from the example below , the execution time differs from the PERCENTILE_CONT / APPROX_PERCENTILE. The difference in the execution time could be significant , if the underlying data object is huge . 



SQL> CREATE
2    TABLE t AS
3  SELECT
4    'TEST' item ,
5    'E'   AS region ,
6    level AS wk ,
7    ROUND(dbms_random.value(1,500)) forecastqty
8  FROM
9    dual
10    CONNECT BY level <= 3000 ;

 Table created.

 Execution #1:

 SQL> SELECT
   2      item,
   3      region,
   4      wk,
   5      PERCENTILE_CONT(0.5) WITHIN GROUP(ORDER BY forecastqty) AS fqty
   6  FROM
   7      t
   8  GROUP BY
   9      item,
  10      region,
 11      wk


 .....
 .....
 ......


 ITEM R         WK       FQTY
 ---- - ---------- ----------
 TEST E      29986        228
 TEST E      29987        329
 TEST E      29988          8
 TEST E      29989        465
 TEST E      29990        411
 TEST E      29991        484
 TEST E      29992        232
 TEST E      29993        416
 TEST E      29994        443
 TEST E      29995         95
 TEST E      29996         48

 ITEM R         WK       FQTY
 ---- - ---------- ----------
 TEST E      29997        385
 TEST E      29998        367
 TEST E      29999        433

 30000 rows selected.

Elapsed: 00:00:11.70


 Execution #2:


SQL> SELECT
  2      item,
  3      region,
  4      wk,
  5      APPROX_PERCENTILE(0.5) WITHIN GROUP(
  6      ORDER BY
  7          forecastqty
  8      )   as approx_fqty
  9  FROM
 10      t
 11  GROUP BY
 12      item,
 13      region,
 14      wk
 15/

 ......
 ....
 ......



ITEM R         WK       FQTY
---- - ---------- ----------
TEST E      29986        228
TEST E      29987        329
TEST E      29988          8
TEST E      29989        465
TEST E      29990        411
TEST E      29991        484
TEST E      29992        232
TEST E      29993        416
TEST E      29994        443
TEST E      29995         95
TEST E      29996         48

ITEM R         WK       FQTY
---- - ---------- ----------
TEST E      29997        385
TEST E      29998        367
TEST E      29999        433

30000 rows selected.

Elapsed: 00:00:09.23



In Oracle 12c Release 2, there are few other new functions for approximation. 


Also , on a similar note .... 


In Oracle 12c Release 1, APPROX_COUNT_DISTINCT was introduced to cater to these situations . 

Please see this link for the deails  

http://mfzahirdba.blogspot.com/2015/01/approxcountdistinct-new-oracle-12c.html

In Oracle 12c Release 2 , we have another parameter (APPROX_FOR_COUNT_DISTINCT)  that can be set at the session level / system level to override the "approxmiation" behavior . Once we set this , the legacy  COUNT functions in our legacy code will go for approximation .  There is no need to revisit all our code to change it to 'APPROX_COUNT_DISTINCT' Again , this can be changed , only if the business use case allows it . 


BTW , this is my 100th blog post. Comments welcome !