ZAHIR MOHIDEEN: April 2018

Thursday, April 26, 2018

Case Sensitiveness - Part 2

Oracle maintains the case sensitiveness by default.

Prior to Oracle 12c Release 2 , we can achieve the nls settings at the session level.
Please see the blog post here ( http://mfzahirdba.blogspot.com/2010/04/case-sensitiveness.html)

With Oracle 12c Release 2 , we can set it at the column level , table level and of course at the session level.Column level settings takes precedence over other settings .

Here is the example .

Let us create the table in a traditional way .

SQL> CREATE TABLE empl_reg (
2 firstname VARCHAR2(30) ,
3 lastname VARCHAR2(30) ,
4 sal INT
5 );

Table created.

SQL> insert into empl_reg values ( 'Zahir','mohideen', 1000 ) ;

1 row created.

SQL> insert into empl_reg values ( 'zahir','mohideen', 2000 ) ;

1 row created.

SQL> insert into empl_reg values ( 'zAhir','mohIdEen', 5000 ) ;

1 row created.

SQL> insert into empl_reg values ( 'Abdul','MohIdEen', 50000 ) ;

1 row created.

SQL> commit;

Commit complete.

SQL> Select * from empl_reg where firstname ='zahir';

FIRSTNA LASTNAME SAL
------- ------------ ----------
zahir mohideen 2000

In this case , to get all records with First name as 'Zahir' , as a workaround , we typically use lower/upper function. Sometimes , we will use Function Based Index on this functions to boost the performance.

SQL> Select * from empl_reg where lower(firstname) ='zahir';

FIRSTNA LASTNAME SAL
------- ------------ ----------
Zahir mohideen 1000
zahir mohideen 2000
zAhir mohIdEen 5000

Now , let us create the table with columns - collation case insensitive ( ci) way .

SQL> Create table emp
2 ( firstname varchar2(30) collate binary_ci ,
3 lastname varchar2(30) collate binary_ci ,
4 sal int
5 ) ;

Create table emp
*
ERROR at line 1:
ORA-43929: Collation cannot be specified if parameter MAX_STRING_SIZE=STANDARD is set.

You would need to set the parameter to extended at the CDB , PDB SEED and PDBs and bounce the instance , as this is a static parameter.

This is a bit tedious workflow . Hopefully , in the next release , this process is made simpler .

SQL> CREATE TABLE empl (
2 firstname VARCHAR2(30) COLLATE binary_ci,
3 lastname VARCHAR2(30) COLLATE binary_ci,
4 sal INT
5 );

Table created.

SQL> insert into empl values ( 'Zahir','mohideen', 1000 ) ;

1 row created.

SQL> insert into empl values ( 'zahir','mohideen', 2000 ) ;

1 row created.

SQL> insert into empl values ( 'zAhir','mohIdEen', 5000 ) ;

1 row created.

SQL> insert into empl values ( 'Abdul','MohIdEen', 50000 ) ;

1 row created.

SQL> commit;

Commit complete.

SQL> Select * from empl where firstname ='zahir';

FIRSTNA LASTNAME SAL
------- ------------ ----------
Zahir mohideen 1000
zahir mohideen 2000
zAhir mohIdEen 5000

Now , we can get all the three records no matter what the case is .
As we can see, the explain plan does nlsssort .

SQL> explain plan for Select * from empl where firstname ='zahir';

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 3277272027

--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 47 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| EMPL | 1 | 47 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

1 - filter(NLSSORT("FIRSTNAME",'nls_sort=''BINARY_CI''')=HEXTORAW('7A
6168697200'))

Note
-----
- dynamic statistics used: dynamic sampling (level=2)

18 rows selected.

SQL> explain plan for Select * from empl_reg where firstname ='zahir';

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1040152178

------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 47 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| EMPL_REG | 1 | 47 | 3 (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

1 - filter("FIRSTNAME"='zahir')

Note
-----
- dynamic statistics used: dynamic sampling (level=2)

17 rows selected.

Comments Welcome!

Wednesday, April 25, 2018

APPROX_PERCENTILE in Oracle 12cR2

In my one of my previous blog post (http://mfzahirdba.blogspot.com/2012/09/difference-between-percentilecont-and.html) , we discussed about the PERCENTILE_CONT , PERCENTILE_DISC and median .

Beginning in 12c Release , we can get the approximate value of PERCENTILE_CONT and these related functions .

The adavcantage is that this is MUCH faster than regular PERCENTILE_CONT functions . This is very much useful , when we don't need the exact value for the results . Examples could be analyzing the website visits , surveys , etc.

As it can be seen from the example below , the execution time differs from the PERCENTILE_CONT / APPROX_PERCENTILE. The difference in the execution time could be significant , if the underlying data object is huge .

SQL> CREATE
2 TABLE t AS
3 SELECT
4 'TEST' item ,
5 'E' AS region ,
6 level AS wk ,
7 ROUND(dbms_random.value(1,500)) forecastqty
8 FROM
9 dual
10 CONNECT BY level <= 3000 ;

Table created.

Execution #1:

SQL> SELECT
2 item,
3 region,
4 wk,
5 PERCENTILE_CONT(0.5) WITHIN GROUP(ORDER BY forecastqty) AS fqty
6 FROM
7 t
8 GROUP BY
9 item,
10 region,
11 wk

.....
.....
......

ITEM R WK FQTY
---- - ---------- ----------
TEST E 29986 228
TEST E 29987 329
TEST E 29988 8
TEST E 29989 465
TEST E 29990 411
TEST E 29991 484
TEST E 29992 232
TEST E 29993 416
TEST E 29994 443
TEST E 29995 95
TEST E 29996 48

ITEM R WK FQTY
---- - ---------- ----------
TEST E 29997 385
TEST E 29998 367
TEST E 29999 433

30000 rows selected.

Elapsed: 00:00:11.70

Execution #2:

SQL> SELECT
2 item,
3 region,
4 wk,
5 APPROX_PERCENTILE(0.5) WITHIN GROUP(
6 ORDER BY
7 forecastqty
8 ) as approx_fqty
9 FROM
10 t
11 GROUP BY
12 item,
13 region,
14 wk
15/

......
....
......

ITEM R WK FQTY
---- - ---------- ----------
TEST E 29986 228
TEST E 29987 329
TEST E 29988 8
TEST E 29989 465
TEST E 29990 411
TEST E 29991 484
TEST E 29992 232
TEST E 29993 416
TEST E 29994 443
TEST E 29995 95
TEST E 29996 48

ITEM R WK FQTY
---- - ---------- ----------
TEST E 29997 385
TEST E 29998 367
TEST E 29999 433

30000 rows selected.

Elapsed: 00:00:09.23

In Oracle 12c Release 2, there are few other new functions for approximation.

Also , on a similar note ....

In Oracle 12c Release 1, APPROX_COUNT_DISTINCT was introduced to cater to these situations .

Please see this link for the deails

http://mfzahirdba.blogspot.com/2015/01/approxcountdistinct-new-oracle-12c.html

In Oracle 12c Release 2 , we have another parameter (APPROX_FOR_COUNT_DISTINCT) that can be set at the session level / system level to override the "approxmiation" behavior . Once we set this , the legacy COUNT functions in our legacy code will go for approximation . There is no need to revisit all our code to change it to 'APPROX_COUNT_DISTINCT' Again , this can be changed , only if the business use case allows it .

BTW , this is my 100th blog post. Comments welcome !