Monday, June 16, 2014

SQL Treasures.

The reason that I call this topic as treasure is that this feature was introduced in Oracle 8.1.6 ( a whiiiiiile back ) . Not many people are using ( to its fullest extent ) .

You would have guessed it by now . Yeah . It is analytics.

Analytics is one of the greatest additon to the SQL family . Though ,it is documented only in Datawarehousing guide . It is equally important in OLTP application .

It has ranking , reporting , statisticial  functions to name a few.

As an example , let us go to scott schema .

In this post , I will cover few basic analytical SQLs . Some of the analytical constructs has been introduced in SQL Server 2005.

Rest of the analytical functions in the subsequent posts.

Example 1: 

Usuage of ROW_NUMBER function .

This is to return  a running sequence number.

SQL> SELECT empno,
  2         ename,
  3         job,
  4         sal,
  5         deptno,
  6         row_number() over(ORDER BY sal DESC) rn
  7  FROM emp;

EMPNO ENAME      JOB             SAL DEPTNO         RN
----- ---------- --------- --------- ------ ----------
 7839 KING       PRESIDENT   5000.00     10          1
 7902 FORD       ANALYST     3000.00     20          2
 7788 SCOTT      ANALYST     3000.00     20          3
 7566 JONES      MANAGER     2975.00     20          4
 7698 BLAKE      MANAGER     2850.00     30          5
 7782 CLARK      MANAGER     2450.00     10          6
 7499 ALLEN      SALESMAN    1600.00     30          7
 7844 TURNER     SALESMAN    1500.00     30          8
 7934 MILLER     CLERK       1300.00     10          9
 7521 WARD       SALESMAN    1250.00     30         10
 7654 MARTIN     SALESMAN    1250.00     30         11
 7876 ADAMS      CLERK       1100.00     20         12
 7900 JAMES      CLERK        950.00     30         13
 7369 SMITH      CLERK        800.00     20         14

Example 2:

The following example has lot of functions .

1. LEAD and LAG will let us to look at the previous and next records's value .

2. RANK and DENSE_RANK let us to rank the particular record based on the column ( we specify ) . The difference between RANK and DENSE_RANK comes when there is a tie in the column value . DENSE_RANK does not skip the rank , where as the RANK does.

SQL> SELECT empno,
  2         ename,
  3         job,
  4         sal,
  5         deptno,
  6         row_number() over(PARTITION BY deptno ORDER BY sal ASC) rn,
  7         rank() over(PARTITION BY deptno ORDER BY sal ASC) rank,
  8         dense_rank() over(PARTITION BY deptno ORDER BY sal ASC) dense_rank,
  9         lag(sal) over(PARTITION BY deptno ORDER BY sal ASC) previous_sal,
 10         lead(sal) over(PARTITION BY deptno ORDER BY sal ASC) next_sal
 11  FROM emp
 12  ORDER BY deptno, sal
 13  /

EMPNO ENAME      JOB             SAL DEPTNO         RN       RANK DENSE_RANK PREVIOUS_SAL   NEXT_SAL
----- ---------- --------- --------- ------ ---------- ---------- ---------- ------------ ----------
 7934 MILLER     CLERK       1300.00     10          1          1          1                    2450
 7782 CLARK      MANAGER     2450.00     10          2          2          2         1300       5000
 7839 KING       PRESIDENT   5000.00     10          3          3          3         2450
 7369 SMITH      CLERK        800.00     20          1          1          1                    1100
 7876 ADAMS      CLERK       1100.00     20          2          2          2          800       2975
 7566 JONES      MANAGER     2975.00     20          3          3          3         1100       3000
 7788 SCOTT      ANALYST     3000.00     20          4          4          4         2975       3000
 7902 FORD       ANALYST     3000.00     20          5          4          4         3000
 7900 JAMES      CLERK        950.00     30          1          1          1                    1250
 7654 MARTIN     SALESMAN    1250.00     30          2          2          2          950       1250
 7521 WARD       SALESMAN    1250.00     30          3          2          2         1250       1500
 7844 TURNER     SALESMAN    1500.00     30          4          4          3         1250       1600
 7499 ALLEN      SALESMAN    1600.00     30          5          5          4         1500       2850
 7698 BLAKE      MANAGER     2850.00     30          6          6          5         1600



Example 3:

NTILE let us to divide the results in equal height . 
In the following example , the result set is divided into three equal parts . 

I find  NTILE  very useful  in do it yourself (DIY) parallel-zing   jobs . In DIY jobs , you could divvy up the result set and send it to parallel jobs either using DBMS_JOB / DBMS_SCHDULER.

SQL> SELECT empno, ename, job, sal, deptno, ntile(3) over(ORDER BY sal) ntile
  2  FROM emp
  3  ORDER BY sal
  4  /

EMPNO ENAME      JOB             SAL DEPTNO      NTILE
----- ---------- --------- --------- ------ ----------
 7369 SMITH      CLERK        800.00     20          1
 7900 JAMES      CLERK        950.00     30          1
 7876 ADAMS      CLERK       1100.00     20          1
 7521 WARD       SALESMAN    1250.00     30          1
 7654 MARTIN     SALESMAN    1250.00     30          1
 7934 MILLER     CLERK       1300.00     10          2
 7844 TURNER     SALESMAN    1500.00     30          2
 7499 ALLEN      SALESMAN    1600.00     30          2
 7782 CLARK      MANAGER     2450.00     10          2
 7698 BLAKE      MANAGER     2850.00     30          2
 7566 JONES      MANAGER     2975.00     20          3
 7788 SCOTT      ANALYST     3000.00     20          3
 7902 FORD       ANALYST     3000.00     20          3
 7839 KING       PRESIDENT   5000.00     10          3



Example 4: 

The first look at the following example may be little bit intimidating . If you look at the second example , it may be clear .

In the latter example , we get the ratio of the salaries in DEPT 10 . The total of sal is 8750 , out of which CLARK's salary is 2450 ( 28% of the department's total salary ).

It helped us in one scenario , where the requirement was to save the report into an excel spreadsheet format .
This report was developed using PowerBuilder . In PowerBuilder , we can save the contents of the datawindow into an excel spreadsheet . The developer has done in the ratio calculations in the front end ( computed column in PB lingo ) . When the data window was saved as an excel , the calculation did not make into an excel ( as only  the result set of  the SQL was saved ) . In this case , the following function came in handy.

SQL> SELECT empno,
  2         ename,
  3         job,
  4         deptno,
  5         sal,
  6         ratio_to_report(sal) over() rr_whole,
  7         ratio_to_report(sal) over(PARTITION BY deptno) rr_deptno
  8  FROM emp
  9  /

EMPNO ENAME      JOB       DEPTNO       SAL   RR_WHOLE  RR_DEPTNO
----- ---------- --------- ------ --------- ---------- ----------
 7782 CLARK      MANAGER       10   2450.00 0.08440999       0.28
 7839 KING       PRESIDENT     10   5000.00 0.17226528 0.57142857
 7934 MILLER     CLERK         10   1300.00 0.04478897 0.14857142
 7566 JONES      MANAGER       20   2975.00 0.10249784 0.27356321
 7902 FORD       ANALYST       20   3000.00 0.10335917 0.27586206
 7876 ADAMS      CLERK         20   1100.00 0.03789836 0.10114942
 7369 SMITH      CLERK         20    800.00 0.02756244 0.07356321
 7788 SCOTT      ANALYST       20   3000.00 0.10335917 0.27586206
 7521 WARD       SALESMAN      30   1250.00 0.04306632 0.13297872
 7844 TURNER     SALESMAN      30   1500.00 0.05167958 0.15957446
 7499 ALLEN      SALESMAN      30   1600.00 0.05512489 0.17021276
 7900 JAMES      CLERK         30    950.00 0.03273040 0.10106382
 7698 BLAKE      MANAGER       30   2850.00 0.09819121 0.30319148
 7654 MARTIN     SALESMAN      30   1250.00 0.04306632 0.13297872



SQL> SELECT empno,
  2         ename,
  3         job,
  4         deptno,
  5         sal,
  6         ratio_to_report(sal) over(PARTITION BY deptno) rr_deptno
  7  FROM emp
  8  Where deptno = 10
  9  /

EMPNO ENAME      JOB       DEPTNO       SAL  RR_DEPTNO
----- ---------- --------- ------ --------- ----------
 7782 CLARK      MANAGER       10   2450.00       0.28
 7839 KING       PRESIDENT     10   5000.00 0.57142857
 7934 MILLER     CLERK         10   1300.00 0.14857142

1 comment:

  1. Very helpful, it will be great if you could add the any new enhancements in 10g or 11g

    ReplyDelete