Advanced SQL Part 3— Window Functions with Django ORM (RANK, ROW_NUMBER, NTILE, FIRST_VALUE)
Welcome to this exploration of Advanced SQL — Window Functions Part 3.
- To learn about frame type — ROWS, RANGE, and GROUPS with Postgres and Django ORM, refer to the part 1 of this blog series
- To learn about functions like PARTITION_BY, EXCLUDE, LEAD/LAG with Postgres and Django ORM, refer to the part 2of this blog series
RECAP:
What are window functions?
With SQL:2003, the ISO SQL Standard introduced window functions, a new mode of row-based computation:
Input Output
Aggregate functions(sum, avg, count) group of rows → row (one per group)
window function row vicinity → row (one per row)
● Window functions operate on a set of rows ( commonly known as window or frame or row vicinity) and return a single value for each row.
● The term window describes the set of rows on which the function operates.
● A window function uses values from the rows in a window.
In the Part 1 of this blog series, we learnt how to create a frame using frame_types — ROWS, RANGE and GROUPS.
In the part 2 of this blog series, we learnt about different functions like EXCLUDE, PARTITION BY, LAG and LEAD
Now we’ll see more about the other functionality that window functions provide.
Table Creations
We will create a dummy table with some records and work with window functions.
DROP TABLE IF EXISTS sample;
CREATE TABLE sample (
row text PRIMARY KEY,
a int,
b BOOLEAN
);
INSERT INTO sample(row, a, b) VALUES
('Q1', 1, false),
('Q2', 2, true),
('Q3', 3, true),
('Q4', 3, false),
('Q5', 3, true),
('Q6', 4, true),
('Q7', 6, false),
('Q8', 6, false),
('Q9', 7, true);
For Django Users
Create a model same as the table above and fill the records using Admin Panel
class Sample(models.Model):
row = models.CharField(max_length=5)
a = models.PositiveIntegerField()
b = models.BooleanField(default=True)
def __str__(self):
return self.row + "-" + str(self.a) +"-" + str(self.b)
1. FIRST_VALUE, LAST_VALUE, NTH_VALUE
Aggregates reduce all rows inside a frame to a single value.
Now for something different:
Positional access to individual rows inside a frame is provided by three window functions:
FIRST_VALUE(e)
LAST_VALUE(e)
NTH_VALUE(e, n)
NTH_VALUE(A, s): will return NULL if the index is not present
NOTE: FIRST_VALUE(e) ≡ NTH_VALUE(e,1).
Query 1 : FIRST_VALUE, LAST_VALUE, NTH_VALUE example
SELECT w."row" AS "current row",
array_agg(w."row") OVER win AS "rows in frame",
FIRST_VALUE(w."row") OVER win AS "first row",
LAST_VALUE(w."row") OVER win AS "last row",
NTH_VALUE(w."row",2) OVER win AS "second row"
FROM sample AS w
WINDOW win AS (ORDER BY w.a ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING)
ORDER BY w.a, w.row;
OUTPUT -
current row | rows in frame | first row | last row | second row
------------+---------------------+-----------+----------+------------
Q1 | {Q1,Q2,Q3} | Q1 | Q3 | Q2
Q2 | {Q1,Q2,Q3,Q4} | Q1 | Q4 | Q2
Q3 | {Q1,Q2,Q3,Q4,Q5} | Q1 | Q5 | Q2
Q4 | {Q2,Q3,Q4,Q5,Q6} | Q2 | Q6 | Q3
Q5 | {Q3,Q4,Q5,Q6,Q7} | Q3 | Q7 | Q4
Q6 | {Q4,Q5,Q6,Q7,Q8} | Q4 | Q8 | Q5
Q7 | {Q5,Q6,Q7,Q8,Q9} | Q5 | Q9 | Q6
Q8 | {Q6,Q7,Q8,Q9} | Q6 | Q9 | Q7
Q9 | {Q7,Q8,Q9} | Q7 | Q9 | Q8
Explanation -
For the current row Q4
Frame type — ROWS
Window — ROWS BETWEEN 2PRECEDING AND 2 FOLLOWING -
{Q2,Q3,Q4,Q5,Q6}
Q2 and Q3 are 2 preceding, Q5 and Q6 are 2 following
FIRST_VALUE = Q2 as that is the first element in the window
LAST_VALUE = Q6 as that is the last element in the window
NTH_VALUE(w.row, 2) — Q3 as it as the 2nd index
Django Query 1 : FIRST_VALUE, LAST_VALUE, NTH_VALUE example
from django.db.models import Avg, F, RowRange, Window, Count, Sum
from django.db.models.functions import Lag, Lead, LastValue, NthValue, FirstValue
window = {
'frame': RowRange(start=-2, end=2),
'order_by': F('a').asc()
}
qs = Sample.objects.annotate(
first = Window(
expression=FirstValue('row'), **window
),
last = Window(
expression=LastValue('row'), **window
),
second_val = Window(
expression=NthValue('row', 2), **window
),
).order_by('a')
for i in qs:
print(i.row, i.a, i.b, i.first, i.last, i.second_val)
OUTPUT -
Q1 1 False Q1 Q3 Q2
Q2 2 True Q1 Q4 Q2
Q3 3 True Q1 Q5 Q2
Q4 3 False Q2 Q6 Q3
Q5 3 True Q3 Q7 Q4
Q6 4 True Q4 Q8 Q5
Q7 6 False Q5 Q9 Q6
Q8 6 False Q6 Q9 Q7
Q9 7 True Q7 Q9 Q8
2. ROW_NUMBER(), DENSE_RANK(), RANK()
Countless problem scenarios involve the number (position) or rank of the current row in an ordered sequence of rows.
Family of window functions to number/rank rows:
ROW_NUMBER()
DENSE_RANK()
RANK()
Scope is the partition (if present) — FRAME is irrelevant.
ROW_NUMBER — gives incremental row number to all the rows
DENSE_RANK — Peer rows (rows with same values) will share a common rank. No rank numbers are lost.
RANK — Peer rows (rows with same values) will share a common rank. In this case rank numbers are lost.
Query 2 : ROW_NUMBER, DENSE_RANK, RANK example
SELECT w."row" AS "current row",
w.a,
ROW_NUMBER() OVER win AS "ROW_NUMBER",
DENSE_RANK() OVER win AS "DENSE_RANK",
RANK() OVER win AS "RANK"
FROM sample AS w
WINDOW win AS (ORDER BY w.a)
ORDER BY w.a;
OUTPUT -
current row | a | ROW_NUMBER | DENSE_RANK | RANK
------------+---+------------+------------+------
Q1 | 1 | 1 | 1 | 1
Q2 | 2 | 2 | 2 | 2
Q3 | 3 | 3 | 3 | 3
Q4 | 3 | 4 | 3 | 3
Q5 | 3 | 5 | 3 | 3
Q6 | 4 | 6 | 4 | 6
Q7 | 6 | 7 | 5 | 7
Q8 | 6 | 8 | 5 | 7
Q9 | 7 | 9 | 6 | 9
Django Query 2 : ROW_NUMBER, DENSE_RANK, RANK example
from django.db.models import Avg, F, RowRange, Window, Count, Sum
from django.db.models.functions import Lag, Lead, DenseRank, Rank, RowNumber
window = {
'order_by': F('a').asc()
}
qs = Sample.objects.annotate(
row_number = Window(
expression=RowNumber(), **window
),
rank = Window(
expression=Rank(), **window
),
dense_rank = Window(
expression=DenseRank(), **window
),
).order_by('a')
for i in qs:
print(i.row, i.a, i.b, i.row_number, i.dense_rank, i.rank)
OUTPUT -
Q1 1 False 1 1 1
Q2 2 True 2 2 2
Q3 3 True 3 3 3
Q4 3 False 4 3 3
Q5 3 True 5 3 3
Q6 4 True 6 4 6
Q7 6 False 7 5 7
Q8 6 False 8 5 7
Q9 7 True 9 6 9
3. NTILE
NTILE creates equal chunks or partitions of data.
Example if NTILE(3) is specified, then chunks of 3 records are created.
Query 3 : NTILE example
SELECT w."row" AS "current row",
w.a,
NTILE(3) OVER win AS "NTILE(3)"
FROM sample AS w
WINDOW win AS (ORDER BY w.a)
ORDER BY w.a;
OUTPUT -
current row | a | NTILE(3)
— — — — — — + — -+ — — — — —
Q1 | 1 | 1
Q2 | 2 | 1
Q3 | 3 | 1
Q4 | 3 | 2
Q5 | 3 | 2
Q6 | 4 | 2
Q7 | 6 | 3
Q8 | 6 | 3
Q9 | 7 | 3
Django Query 3 : NTILE example
from django.db.models.functions import Ntile
window = {
'order_by': F('a').asc()
}
qs = Sample.objects.annotate(
ntile = Window(
expression=Ntile(3), **window
),
).order_by('a')
for i in qs:
print(i.row, i.a, i.b, i.ntile)
OUTPUT -
Q1 1 False 1
Q2 2 True 1
Q3 3 True 1
Q4 3 False 2
Q5 3 True 2
Q6 4 True 2
Q7 6 False 3
Q8 6 False 3
Q9 7 True 3
This is not the end of window functions. We will use all the knowledge gained till now and solve problems in the next part of this window function blog series.
… to be continued
Useful Information
References
Blog: https://www.bluetickconsultants.com/advanced-sql-window-functions-part-3.html
Our Open Source Projects
Open Source Projects: https://www.bluetickconsultants.com/open-source.html
GitHub Open Source repos: https://github.com/orgs/bluetickconsultants/repositories
Author:
Bluetick Consultants LLP — https://www.bluetickconsultants.com