Running totals using windowed functions

[Category: Programming and Databases] [link] [Date: 2013-08-14 17:39:10]

Many modern databases support windowed aggregate functions. One of the most useful applications of these are computing running totals for either entire data sets or by a sub-grouping. On older database systems, one was reduced to while loops, cursors, CTE's, or perhaps very inefficient nested select queries to generate running totals. Thankfully, there are better options now. The below example is in PostgreSQL (supported 8.4+). A similar strategy works in Microsoft SQL but only in 2012+.

-- create data
drop table if exists pricedata;
create temp table pricedata (category text,item text,price float8);
insert into pricedata
values ('Fishing','Tackle Box',20)
       ,('Camping','Sleeping Bag',50)
-- use windowed aggregate function with over ()
select category,item
    ,sum(price) over (partition by category order by item) totalprice
from pricedata;

We get a logical result of:

category  item          totalprice
Baseball  Bat           40
Camping   Sleeping Bag  50
Camping   Stove         80
Fishing   Bait          5
Fishing   Rod           105
Fishing   Tackle Box    125

It's fast, convenient, and pretty powerful.

Montage video of Planet Earth

[Category: Music] [link] [Date: 2013-08-03 01:38:32]

After listening to the song "DEA" by The American Dollar on repeat for the last few weeks, I noticed that it would fit very well as music to one of my favorite montage videos of all time: a BBC Planet Earth montage promotional video that originally features "Hoppipolla" by Sigur Ros. I hadn't done any video editing in quite some time and enjoyed the opportunity to combine two of my favorite things. After extracting the clips and a few evenings of editing, I was satisfied with the result. I did, however, run out of clips near the end (I was mostly shortening them from their original length).

Using a recursive CTE to traverse a simple directed graph

[Category: Programming and Databases] [link] [Date: 2013-07-26 18:28:53]

Many modern databases support the use of Common Table Expressions (CTE). Numerous times these have saved me the effort of writing code outside of the database when attempting to do basic routing, tree walking, or graph traversal. I, personally, find the CTE syntax somewhat difficult to follow. There are numerous examples online already showing CTE usage, but I thought I would throw my hat into the ring with basic example showing an interaction with a simple directed graph in PostgreSQL.

Suppose there is a directed graph, like the one seen below, and we want to walk the path from the node with id=3 to the node where id=0. The representation of the simple directed graph as a table gives us each id and a parentid.

A simple directed graph

We write a CTE like below

-- create data
drop table if exists relations;
create temp table relations (id int4, parentid int4);
insert into relations
values (0,-1),
-- walk graph
with recursive link(id,parentid) as
     select id,parentid
     from relations
     where id=3
     union all
     from relations r, link l
     where and r.parentid not in (-1)
select id,parentid
from link;

When the above code sample is run, it returns

id   parentid
3    2
2    1
1    0

So how does this wizardry work? Notice the union all statement within the recursive statement. The part before the union all is the "anchor". This defines the starting point for the CTE. Think of it like the first record (we start at id=3 with parentid=2). If this query returns more than one record, each record will be evaluated recursively, and it may be necessary to include another field to be able to separate the resulting records. The lower query is where the magic happens. The statement should call itself again.

To invoke the recursion, simply call the recursive statement. This is what the last select statement does. In this case "link" is the name of the recursive statement. Make sure that the recursive statement eventually returns no records, or else it will run forever. If you have a non-directed graph with loops, you will need to store visited locations to avoid such problems.

< Newer Posts  |  3  |  Older Posts >