Advanced SQL Part 3— Window Functions with Django ORM (RANK, ROW_NUMBER, NTILE, FIRST_VALUE)

Bluetick Consultants Inc
6 min readMar 9, 2023

--

Welcome to this exploration of Advanced SQL — Window Functions Part 3.

  • To learn about frame type — ROWS, RANGE, and GROUPS with Postgres and Django ORM, refer to the part 1 of this blog series
  • To learn about functions like PARTITION_BY, EXCLUDE, LEAD/LAG with Postgres and Django ORM, refer to the part 2of this blog series

RECAP:

What are window functions?

With SQL:2003, the ISO SQL Standard introduced window functions, a new mode of row-based computation:

        Input                                          Output
Aggregate functions(sum, avg, count) group of rows → row (one per group)
window function row vicinity → row (one per row)

● Window functions operate on a set of rows ( commonly known as window or frame or row vicinity) and return a single value for each row.
● The term window describes the set of rows on which the function operates.
● A window function uses values from the rows in a window.

In the Part 1 of this blog series, we learnt how to create a frame using frame_types — ROWS, RANGE and GROUPS.

In the part 2 of this blog series, we learnt about different functions like EXCLUDE, PARTITION BY, LAG and LEAD

Now we’ll see more about the other functionality that window functions provide.

Table Creations

We will create a dummy table with some records and work with window functions.

DROP TABLE IF EXISTS sample;
CREATE TABLE sample (
row text PRIMARY KEY,
a int,
b BOOLEAN
);
INSERT INTO sample(row, a, b) VALUES
('Q1', 1, false),
('Q2', 2, true),
('Q3', 3, true),
('Q4', 3, false),
('Q5', 3, true),
('Q6', 4, true),
('Q7', 6, false),
('Q8', 6, false),
('Q9', 7, true);

For Django Users

Create a model same as the table above and fill the records using Admin Panel

class Sample(models.Model):
row = models.CharField(max_length=5)
a = models.PositiveIntegerField()
b = models.BooleanField(default=True)

def __str__(self):
return self.row + "-" + str(self.a) +"-" + str(self.b)

1. FIRST_VALUE, LAST_VALUE, NTH_VALUE

Aggregates reduce all rows inside a frame to a single value.
Now for something different:

Positional access to individual rows inside a frame is provided by three window functions:
FIRST_VALUE(e)
LAST_VALUE(e)
NTH_VALUE(e, n)

NTH_VALUE(A, s): will return NULL if the index is not present

NOTE: FIRST_VALUE(e) ≡ NTH_VALUE(e,1).

Query 1 : FIRST_VALUE, LAST_VALUE, NTH_VALUE example

SELECT w."row"                       AS "current row",
array_agg(w."row") OVER win AS "rows in frame",
FIRST_VALUE(w."row") OVER win AS "first row",
LAST_VALUE(w."row") OVER win AS "last row",
NTH_VALUE(w."row",2) OVER win AS "second row"
FROM sample AS w
WINDOW win AS (ORDER BY w.a ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING)
ORDER BY w.a, w.row;

OUTPUT -

current row |  rows in frame      | first row | last row | second row
------------+---------------------+-----------+----------+------------
Q1 | {Q1,Q2,Q3} | Q1 | Q3 | Q2
Q2 | {Q1,Q2,Q3,Q4} | Q1 | Q4 | Q2
Q3 | {Q1,Q2,Q3,Q4,Q5} | Q1 | Q5 | Q2
Q4 | {Q2,Q3,Q4,Q5,Q6} | Q2 | Q6 | Q3
Q5 | {Q3,Q4,Q5,Q6,Q7} | Q3 | Q7 | Q4
Q6 | {Q4,Q5,Q6,Q7,Q8} | Q4 | Q8 | Q5
Q7 | {Q5,Q6,Q7,Q8,Q9} | Q5 | Q9 | Q6
Q8 | {Q6,Q7,Q8,Q9} | Q6 | Q9 | Q7
Q9 | {Q7,Q8,Q9} | Q7 | Q9 | Q8

Explanation -

For the current row Q4
Frame type — ROWS

Window — ROWS BETWEEN 2PRECEDING AND 2 FOLLOWING -
{Q2,Q3,Q4,Q5,Q6}
Q2 and Q3 are 2 preceding, Q5 and Q6 are 2 following

FIRST_VALUE = Q2 as that is the first element in the window
LAST_VALUE = Q6 as that is the last element in the window
NTH_VALUE(w.row, 2) — Q3 as it as the 2nd index

Django Query 1 : FIRST_VALUE, LAST_VALUE, NTH_VALUE example

from django.db.models import Avg, F, RowRange, Window, Count, Sum
from django.db.models.functions import Lag, Lead, LastValue, NthValue, FirstValue

window = {
'frame': RowRange(start=-2, end=2),
'order_by': F('a').asc()
}

qs = Sample.objects.annotate(
first = Window(
expression=FirstValue('row'), **window
),
last = Window(
expression=LastValue('row'), **window
),
second_val = Window(
expression=NthValue('row', 2), **window
),
).order_by('a')

for i in qs:
print(i.row, i.a, i.b, i.first, i.last, i.second_val)

OUTPUT -

Q1 1 False Q1 Q3 Q2
Q2 2 True Q1 Q4 Q2
Q3 3 True Q1 Q5 Q2
Q4 3 False Q2 Q6 Q3
Q5 3 True Q3 Q7 Q4
Q6 4 True Q4 Q8 Q5
Q7 6 False Q5 Q9 Q6
Q8 6 False Q6 Q9 Q7
Q9 7 True Q7 Q9 Q8

2. ROW_NUMBER(), DENSE_RANK(), RANK()

Countless problem scenarios involve the number (position) or rank of the current row in an ordered sequence of rows.
Family of window functions to number/rank rows:
ROW_NUMBER()
DENSE_RANK()
RANK()

Scope is the partition (if present) — FRAME is irrelevant.

ROW_NUMBER — gives incremental row number to all the rows
DENSE_RANK — Peer rows (rows with same values) will share a common rank. No rank numbers are lost.
RANK — Peer rows (rows with same values) will share a common rank. In this case rank numbers are lost.

Query 2 : ROW_NUMBER, DENSE_RANK, RANK example

SELECT w."row"               AS "current row",
w.a,
ROW_NUMBER() OVER win AS "ROW_NUMBER",
DENSE_RANK() OVER win AS "DENSE_RANK",
RANK() OVER win AS "RANK"
FROM sample AS w
WINDOW win AS (ORDER BY w.a)
ORDER BY w.a;

OUTPUT -

current row | a | ROW_NUMBER | DENSE_RANK | RANK
------------+---+------------+------------+------
Q1 | 1 | 1 | 1 | 1
Q2 | 2 | 2 | 2 | 2
Q3 | 3 | 3 | 3 | 3
Q4 | 3 | 4 | 3 | 3
Q5 | 3 | 5 | 3 | 3
Q6 | 4 | 6 | 4 | 6
Q7 | 6 | 7 | 5 | 7
Q8 | 6 | 8 | 5 | 7
Q9 | 7 | 9 | 6 | 9

Django Query 2 : ROW_NUMBER, DENSE_RANK, RANK example

from django.db.models import Avg, F, RowRange, Window, Count, Sum
from django.db.models.functions import Lag, Lead, DenseRank, Rank, RowNumber

window = {
'order_by': F('a').asc()
}

qs = Sample.objects.annotate(
row_number = Window(
expression=RowNumber(), **window
),
rank = Window(
expression=Rank(), **window
),
dense_rank = Window(
expression=DenseRank(), **window
),
).order_by('a')

for i in qs:
print(i.row, i.a, i.b, i.row_number, i.dense_rank, i.rank)

OUTPUT -

Q1 1 False 1 1 1
Q2 2 True 2 2 2
Q3 3 True 3 3 3
Q4 3 False 4 3 3
Q5 3 True 5 3 3
Q6 4 True 6 4 6
Q7 6 False 7 5 7
Q8 6 False 8 5 7
Q9 7 True 9 6 9

3. NTILE

NTILE creates equal chunks or partitions of data.

Example if NTILE(3) is specified, then chunks of 3 records are created.

Query 3 : NTILE example

SELECT w."row"                 AS "current row",
w.a,
NTILE(3) OVER win AS "NTILE(3)"
FROM sample AS w
WINDOW win AS (ORDER BY w.a)
ORDER BY w.a;

OUTPUT -

current row | a | NTILE(3)
— — — — — — + — -+ — — — — —
Q1 | 1 | 1
Q2 | 2 | 1
Q3 | 3 | 1
Q4 | 3 | 2
Q5 | 3 | 2
Q6 | 4 | 2
Q7 | 6 | 3
Q8 | 6 | 3
Q9 | 7 | 3

Django Query 3 : NTILE example

from django.db.models.functions import Ntile

window = {
'order_by': F('a').asc()
}

qs = Sample.objects.annotate(
ntile = Window(
expression=Ntile(3), **window
),
).order_by('a')

for i in qs:
print(i.row, i.a, i.b, i.ntile)

OUTPUT -

Q1 1 False 1
Q2 2 True 1
Q3 3 True 1
Q4 3 False 2
Q5 3 True 2
Q6 4 True 2
Q7 6 False 3
Q8 6 False 3
Q9 7 True 3

This is not the end of window functions. We will use all the knowledge gained till now and solve problems in the next part of this window function blog series.

… to be continued

Useful Information

References

Blog: https://www.bluetickconsultants.com/advanced-sql-window-functions-part-3.html

Next part: https://www.bluetickconsultants.com/blogs.html

Our Open Source Projects

Open Source Projects: https://www.bluetickconsultants.com/open-source.html

GitHub Open Source repos: https://github.com/orgs/bluetickconsultants/repositories

Author:

Bluetick Consultants LLP https://www.bluetickconsultants.com

--

--

Bluetick Consultants Inc

Bluetick Consultants LLP: Driving Digital Transformation with Innovations like Generative AI, Cloud Migration, Talent Augmentation & More.