Looking back with the analytic function LAG

In this article I will show you how to use the LAG analytic function to look back in your SQL results.

LAG gives you access to another row at a given physical offset prior to that position.
Think about this for a while… you do not need a self join!
On large result sets, this is a major performance compared to using self joins!

Let’s have a look at the salaries of the IT department…

HR>SELECT e.last_name, e.salary
  FROM employees e, departments d
 WHERE department_name = 'IT'
   AND e.department_id = d.department_id
 ORDER BY salary;

LAST_NAME   SALARY
---------- -------
Lorentz       4200
Austin        4800
Pataballa     4800
Ernst         6000
Hunold        9000
5 rows selected

Now for each row I would like to see the value of the previous salary:

HR>SELECT e.last_name, e.salary
      ,LAG(salary, 1, 0) OVER ( ORDER BY salary) AS prev_sal
  FROM employees e, departments d
 WHERE department_name = 'IT'
   AND e.department_id = d.department_id
 ORDER BY salary;

LAST_NAME   SALARY   PREV_SAL
---------- ------- ----------
Lorentz       4200      0
Austin        4800   4200
Pataballa     4800   4800
Ernst         6000   4800
Hunold        9000   6000
5 rows selected.

The default offset is 1 (the value from one row back)
Optionally, you can add a default value.
e.g.: LAG(salary, 1, 0) means the salary from the previous row, if no value is known the value is 0

What I’m really interested in is the difference between the salaries:

HR>SELECT e.last_name, e.salary
      ,(salary - LAG(salary, 1, 0) OVER ( ORDER BY salary)) AS diff
  FROM employees e, departments d
 WHERE department_name = 'IT'
   AND e.department_id = d.department_id
 ORDER BY salary;

LAST_NAME   SALARY   DIFF
---------- ------- ----------
Lorentz       4200   4200
Austin        4800    600
Pataballa     4800      0
Ernst         6000   1200
Hunold        9000   3000
5 rows selected.

Notice the first DIFF value, this is 4200 because the default value is 0
We can change this and have a better looking result:

HR>SELECT e.last_name, e.salary
      ,(salary - LAG(salary, 1, 4200) OVER ( ORDER BY salary)) AS diff
  FROM employees e, departments d
 WHERE department_name = 'IT'
   AND e.department_id = d.department_id
 ORDER BY salary;

LAST_NAME   SALARY   DIFF
---------- ------- ----------
Lorentz       4200      0
Austin        4800    600
Pataballa     4800      0
Ernst         6000   1200
Hunold        9000   3000
5 rows selected.

Related posts

Tags: , ,

One Response to “Looking back with the analytic function LAG”

  1. AndyBW says:

    “We can change this and have a better looking result” – may be if change it to: …LAG(salary, 1, salary)…

Leave a Reply