Windowing clause in Analytical ( Oracle ) / Windows functions ( SQL Server ) plays an important role of how the results are rendered. This can be better explained with the following use case.
Here is an use case.
A employee may have multiple email addresses in a table. The requirement was to display the first non null email address .
Create table tab_t
( id int ,
lname varchar2(20) ,
fname varchar2(20) ,
start_dt date ,
end_dt date ,
sal number(10,2) ,
email varchar2(20)
) ;
insert into tab_t values ( 1 , 'LA1' , 'FI1' , to_date('10-01-1992' , 'MM-DD-YYYY') , null , 1000.00 , 'fn1@email.com');
insert into tab_t values ( 1 , 'LA1' , 'FI1' , to_date('12-24-2000' , 'MM-DD-YYYY') , to_date('02-15-2002' , 'MM-DD-YYYY') , 7.25 , null );
insert into tab_t values ( 1 , 'LA1' , 'FI1' , to_date('05-16-1999' , 'MM-DD-YYYY') , to_date('11-15-2000' , 'MM-DD-YYYY') , 5.25 , null );
insert into tab_t values ( 2 , 'LA2' , 'FI2' , to_date('01-01-1993' , 'MM-DD-YYYY') , to_date('01-01-1995' , 'MM-DD-YYYY'), 75000.00 , 'fn2@email.com');
insert into tab_t values ( 2 , 'LA2' , 'FI2' , to_date('12-24-2000' , 'MM-DD-YYYY') , to_date('02-15-2002' , 'MM-DD-YYYY') , 7.25 , 'fn2@newemail.com' );
insert into tab_t values ( 3 , 'LA3' , 'FI3' , to_date('07-02-1970' , 'MM-DD-YYYY') , to_date('02-04-1980' , 'MM-DD-YYYY') , 3.25 , null );
insert into tab_t values ( 3 , 'LA3' , 'FI3' , to_date('02-05-1980' , 'MM-DD-YYYY') , to_date('02-04-1980' , 'MM-DD-YYYY') , 10250.00 , 'myfirst@email.com' );
insert into tab_t values ( 4 , 'LA4' , 'FI4' , to_date('11-01-2009' , 'MM-DD-YYYY') , null , 90000 , 'mywork@email.com' );
SQL> Select * from tab_t;
Here is the SQL to list the first non null email address and we will filter the results where rn =1 in the second image below .
Listing 2:
SELECT
t.*,
FIRST_VALUE(email) IGNORE NULLS OVER(
PARTITION BY t.id
ORDER BY
coalesce(end_dt,TO_DATE('01-JAN-2050') ) DESC,start_dt DESC
) as fv_email,
ROW_NUMBER() OVER(
PARTITION BY t.id
ORDER BY
coalesce(end_dt,TO_DATE('01-JAN-2050') ) DESC,start_dt DESC
) as rn
FROM
tab_t t
Listing 3:
Select * from
(
SELECT
t.*,
FIRST_VALUE(email) IGNORE NULLS OVER(
PARTITION BY t.id
ORDER BY
coalesce(end_dt,TO_DATE('01-JAN-2050') ) DESC,start_dt DESC
) as fv_email,
ROW_NUMBER() OVER(
PARTITION BY t.id
ORDER BY
coalesce(end_dt,TO_DATE('01-JAN-2050') ) DESC,start_dt DESC
) as rn
FROM
tab_t t
)
Where rn = 1
Looks like , the SQL works fine.
Let us add few more records , where the first active record has null email address.
insert into tab_t values ( 5 , 'LA5' , 'FI5' , to_date('10-01-1992' , 'MM-DD-YYYY') , null , 1000.00 , null);
insert into tab_t values ( 5 , 'LA5' , 'FI5' , to_date('12-24-2000' , 'MM-DD-YYYY') , to_date('02-15-2002' , 'MM-DD-YYYY') , 7.25 , 'email.com' );
insert into tab_t values ( 5 , 'LA5' , 'FI5' , to_date('05-16-1999' , 'MM-DD-YYYY') , to_date('11-15-2000' , 'MM-DD-YYYY') , 5.25 , null );
Let us execute the SQL in Listing 2 and see the results.
Here is the result after executing the SQL in Listing 3.
Obviously , our SQL did not address all the scenarios . We need the email.com as the first non null email address for Employee 5. This is because , how the "window" is defined . By default , the window is from "unbounded rows preceding" to the "CURRENT ROW" . Let us change the "Window" to "UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING" as seen below.
Listing 3:
SELECT
t.*,
FIRST_VALUE(email) IGNORE NULLS OVER(
PARTITION BY t.id
ORDER BY
coalesce(end_dt,TO_DATE('01-JAN-2050') ) DESC,start_dt DESC
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) as fv_email,
ROW_NUMBER() OVER(
PARTITION BY t.id
ORDER BY
coalesce(end_dt,TO_DATE('01-JAN-2050') ) DESC,start_dt DESC
) as rn
FROM
tab_t t
Listing 4:
SQL with filter only the first record.
Select * from
(
SELECT
t.*,
FIRST_VALUE(email) IGNORE NULLS OVER(
PARTITION BY t.id
ORDER BY
coalesce(end_dt,TO_DATE('01-JAN-2050') ) DESC,start_dt DESC
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) as fv_email,
ROW_NUMBER() OVER(
PARTITION BY t.id
ORDER BY
coalesce(end_dt,TO_DATE('01-JAN-2050') ) DESC,start_dt DESC
) as rn
FROM
tab_t t
)
Where rn = 1
Now , with changing the "window " , we are able to get the correct results.
Comments welcome.