recursive queries

I love recursion and parent-child relationships.
I just found a wonderful example and had to share it.

(Note: I am not the author, just republishing.)

Many SQL implementations don’t have loops, making some kinds of analysis very difficult. Postgres, SQL Server, and several others have the next best thing — recursive CTEs!

We’ll use them to solve the Traveling Salesman Problem and find the shortest round-trip route through several US cities.

With Recursive

Normal CTEs are great at helping to organize large queries. They are a simple way to make temporary tables you can access later in your query.

Recursive CTEs are more powerful – they reference themselves and allow you to explore hierarchical data. While that may sound complicated, the underlying concept is very similar to a for loop in other programming languages.

These CTEs have two parts: an anchor member and a recursive member. The anchor member selects the starting rows for the recursive steps.

The recursive member generates more rows for the CTE by first joining against the anchor rows, and then joining against the rows created in the previous recursion step. The recursive member comes after a union all in the CTE definition.

Here’s a simple recursive CTE that generates the numbers 1 to 10. The anchor member selects the value 1, and the recursive member adds to it up to the number 10:

with recursive incrementer(prev_val) as (
  select 1 -- anchor member
  union all
  select -- recursive member
    incrementer.prev_val + 1
  from incrementer
  where prev_val < 10 -- termination condition
)
select * from incrementer

The first time the recursive CTE runs it generates a single row, 1, using the anchor member. In the second execution, the recursive member joins against the 1 and outputs a second row, 2. In the third execution, the recursive step joins against only the row produced by the previous step (the 2) and adds the row 3.

Because each step sees only the rows from the step before it, the CTE above never produces the same value twice. With union all, any duplicates that a recursive member does generate are kept; writing union instead discards duplicate rows.

Notice how the CTE specifies its output as the named value prev_val. This lets us refer to the output of the previous recursive step.

And at the very end there is a termination condition to halt the recursion once the value reaches 10. Without this condition, the CTE would enter an infinite loop!
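
To try the incrementer without a Postgres server, one option is Python's built-in sqlite3 module, since SQLite also accepts with recursive. The sketch below assumes that setup (it is not part of the original article). The loop_incrementer function is only a mental model of how the rows build up, not the database's actual implementation.

```python
import sqlite3

INCREMENTER_SQL = """
with recursive incrementer(prev_val) as (
  select 1                           -- anchor member
  union all
  select incrementer.prev_val + 1    -- recursive member
  from incrementer
  where prev_val < 10                -- termination condition
)
select prev_val from incrementer
"""

def run_incrementer():
    """Run the CTE in an in-memory SQLite database and return its values."""
    conn = sqlite3.connect(":memory:")
    try:
        return [row[0] for row in conn.execute(INCREMENTER_SQL)]
    finally:
        conn.close()

def loop_incrementer(limit=10):
    """Loop-style model: each pass works on the rows the previous pass produced."""
    result = [1]      # rows from the anchor member
    working = [1]     # rows produced by the most recent pass
    while working:    # stop once a pass produces no new rows
        working = [v + 1 for v in working if v < limit]  # recursive member
        result.extend(working)
    return result

if __name__ == "__main__":
    print(run_incrementer())
    print(loop_incrementer())
```

Dropping the where clause from the SQL (or the v < limit filter from the loop) removes the only thing that ever makes a pass come back empty, which is why the termination condition matters.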

Under the hood, the database is building up a table named after this recursive CTE using unions.

Recursive CTEs can also have many parameters. Here’s one that takes the sum, double, and square of starting values of 1, 2 and 3:

with recursive cruncher(inc, double, square) as (
  select 1, 2.0, 3.0 -- anchor member
  union all
  select -- recursive member
    cruncher.inc + 1,
    cruncher.double * 2,
    cruncher.square ^ 2
  from cruncher
  where inc < 10
)
select * from cruncher
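
For readers who think in loops, the same three-column recursion can be sketched in plain Python. This is an illustration only: the function name cruncher_rows is made up here, and Python floats stand in for the database's numeric type, so the largest squares are approximate.

```python
def cruncher_rows(limit=10):
    """Mirror the cruncher CTE: start at (1, 2.0, 3.0) and step while inc < limit."""
    inc, double, square = 1, 2.0, 3.0        # anchor member
    rows = [(inc, double, square)]
    while inc < limit:                       # termination condition
        # recursive member: add one, double, and square the previous row
        inc, double, square = inc + 1, double * 2, square ** 2
        rows.append((inc, double, square))
    return rows

if __name__ == "__main__":
    for row in cruncher_rows():
        print(row)
```

Each new row is computed from the previous row only, which is the same shape as the recursive member reading from cruncher.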

With recursive CTEs we can solve the Traveling Salesman Problem.

Finding the Shortest Path

There are many algorithms for finding the shortest round-trip path through several cities. We’ll use the simplest: brute force. Our recursive CTE will enumerate all possible routes and their total distances. We’ll then sort to find the shortest.

First, a list of cities with Periscope customers, along with their latitudes and longitudes:

create table places as (
  -- longitudes are degrees west, stored as positive numbers
  select
    'Seattle' as name, 47.6097 as lat, 122.3331 as lon
    union all select 'San Francisco', 37.7833, 122.4167
    union all select 'Austin', 30.2500, 97.7500
    union all select 'New York', 40.7127, 74.0059
    union all select 'Boston', 42.3601, 71.0589
    union all select 'Chicago', 41.8369, 87.6847
    union all select 'Los Angeles', 34.0500, 118.2500
    union all select 'Denver', 39.7392, 104.9903
)

And we’ll need a distance function to compute how far two lat/lons are from each other (thanks to strkol on

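create or replace function lat_lon_distance(
  lat1 float, lon1 float, lat2 float, lon2 float
) returns float as $$
declare
  -- 69.1 is roughly the number of miles per degree of latitude
  x float = 69.1 * (lat2 - lat1);
  -- dividing by 57.3 converts degrees to radians for cos()
  y float = 69.1 * (lon2 - lon1) * cos(lat1 / 57.3);
begin
  return sqrt(x * x + y * y);
end
$$ language plpgsql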
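
The same formula is easy to check outside the database. Here is a minimal Python sketch of it, assuming the same constants: 69.1 as approximate miles per degree of latitude and 57.3 as degrees per radian. It treats the map as locally flat, so it is an approximation rather than a true great-circle distance.

```python
import math

def lat_lon_distance(lat1, lon1, lat2, lon2):
    """Approximate distance in miles between two lat/lon points (flat-map formula)."""
    x = 69.1 * (lat2 - lat1)                           # north-south component
    y = 69.1 * (lon2 - lon1) * math.cos(lat1 / 57.3)   # east-west, scaled by latitude
    return math.sqrt(x * x + y * y)

if __name__ == "__main__":
    # San Francisco to Seattle, using the coordinates from the places table
    print(lat_lon_distance(37.7833, 122.4167, 47.6097, 122.3331))
```

For San Francisco to Seattle this comes out at roughly 679 miles. Note that only lat1 feeds the cosine, so the distance from A to B can differ slightly from the distance from B to A.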

Our CTE will use San Francisco as its anchor city, and then recurse from there to every other city:

with recursive travel(places_chain, last_lat, last_lon,
    total_distance, num_places) as (
  select -- anchor member
    name, lat, lon, 0::float, 1
    from places
    where name = 'San Francisco'
  union all
  select -- recursive member
    -- add to the current places_chain
    travel.places_chain || ' -> ' || places.name,
    places.lat,
    places.lon,
    -- add to the current total_distance
    travel.total_distance +
      lat_lon_distance(last_lat, last_lon, places.lat, places.lon),
    travel.num_places + 1
  from
    places, travel
  where
    -- skip any place already in the chain
    position(places.name in travel.places_chain) = 0
)

The parameters in the CTE are:

  • places_chain: The list of places visited so far, which will be different for each instance of the recursion
  • last_lat and last_lon: The latitude and longitude of the last place in the places_chain
  • total_distance: The distance traveled going from one place to the next in the places_chain
  • num_places: The number of places in places_chain — we’ll use this to tell which routes are complete because they visited all cities

In the recursive member, the where clause ensures that we never repeat a place. If we’ve already visited Denver, position(...) will return a number greater than 0, invalidating this instance of the recursion.

We can see all possible routes by selecting all 8-city chains:

select * from travel where num_places = 8
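
As a sanity check on how many rows to expect: with San Francisco fixed as the start, the other seven cities can be ordered in 7! = 5040 ways, so the select above should return 5040 chains. The Python sketch below counts them independently of the database. The function names are hypothetical, and math.perm needs Python 3.8 or later.

```python
from itertools import permutations
from math import perm

CITIES = ["Seattle", "San Francisco", "Austin", "New York",
          "Boston", "Chicago", "Los Angeles", "Denver"]

def complete_routes(start="San Francisco"):
    """Every ordering of the other cities, each prefixed with the start city."""
    others = [city for city in CITIES if city != start]
    return [(start,) + rest for rest in permutations(others)]

def total_chain_rows(num_cities=len(CITIES)):
    """Rows the travel CTE holds overall: chains of every length from 1 to num_cities."""
    remaining = num_cities - 1
    # perm(remaining, k) counts the chains that visit k cities after the start
    return sum(perm(remaining, k) for k in range(remaining + 1))

if __name__ == "__main__":
    print(len(complete_routes()))   # complete 8-city chains
    print(total_chain_rows())       # all partial and complete chains
```

The second number (13700 for eight cities) is why brute force is fine here but grows quickly: every added city multiplies the number of complete chains.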

We need to add in the distance from the last city back to San Francisco to complete the round-trip. We could hard code San Francisco’s lat/lon, but a join is more elegant. Once that’s done we sort by distance and show the smallest:

select
  travel.places_chain || ' -> ' || places.name,
  total_distance + lat_lon_distance(
      travel.last_lat, travel.last_lon,
      places.lat, places.lon) as final_dist
from travel, places
where
  travel.num_places = 8
  and places.name = 'San Francisco'
order by 2 -- ascending!
limit 1

Even though this query is significantly more complicated than the incrementer query earlier, the database is doing the same things behind the scenes. In the query plan, the top branch creates the CTE’s rows and the bottom branch is the final join and sort.

Run this query and you’ll see the shortest route takes 6671 miles and visits the cities in this order:

San Francisco -> Seattle -> Denver ->
Chicago -> Boston -> New York -> Austin ->
Los Angeles -> San Francisco
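
To reproduce this without a Postgres server, the whole pipeline can be sketched with Python's sqlite3 module. Several pieces here are substitutions, not the article's code: SQLite has no plpgsql, so lat_lon_distance is a Python function registered with create_function; instr(chain, name) replaces position(name in chain); and 0.0 replaces 0::float.

```python
import math
import sqlite3

def lat_lon_distance(lat1, lon1, lat2, lon2):
    """Same flat-map distance formula as the SQL function, in miles."""
    x = 69.1 * (lat2 - lat1)
    y = 69.1 * (lon2 - lon1) * math.cos(lat1 / 57.3)
    return math.sqrt(x * x + y * y)

PLACES = [
    ("Seattle", 47.6097, 122.3331), ("San Francisco", 37.7833, 122.4167),
    ("Austin", 30.2500, 97.7500), ("New York", 40.7127, 74.0059),
    ("Boston", 42.3601, 71.0589), ("Chicago", 41.8369, 87.6847),
    ("Los Angeles", 34.0500, 118.2500), ("Denver", 39.7392, 104.9903),
]

SHORTEST_SQL = """
with recursive travel(places_chain, last_lat, last_lon,
    total_distance, num_places) as (
  select name, lat, lon, 0.0, 1                 -- anchor member
  from places
  where name = 'San Francisco'
  union all
  select                                        -- recursive member
    travel.places_chain || ' -> ' || places.name,
    places.lat,
    places.lon,
    travel.total_distance + lat_lon_distance(
        travel.last_lat, travel.last_lon, places.lat, places.lon),
    travel.num_places + 1
  from places, travel
  where instr(travel.places_chain, places.name) = 0  -- never repeat a place
)
select
  travel.places_chain || ' -> ' || places.name,
  travel.total_distance + lat_lon_distance(
      travel.last_lat, travel.last_lon, places.lat, places.lon) as final_dist
from travel, places
where travel.num_places = 8
  and places.name = 'San Francisco'
order by 2
limit 1
"""

def shortest_round_trip():
    """Return (route, miles) for the shortest round trip found by the CTE."""
    conn = sqlite3.connect(":memory:")
    try:
        # Expose the Python distance function to SQL under the article's name.
        conn.create_function("lat_lon_distance", 4, lat_lon_distance)
        conn.execute("create table places (name text, lat real, lon real)")
        conn.executemany("insert into places values (?, ?, ?)", PLACES)
        return conn.execute(SHORTEST_SQL).fetchone()
    finally:
        conn.close()

if __name__ == "__main__":
    route, miles = shortest_round_trip()
    print(route)
    print(round(miles))
```

Because floating-point details differ between databases, the distance printed here may not match the figure above to the last digit.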

Thanks to recursive CTEs, we can solve the Traveling Salesman Problem in SQL!


Yes, I’m just as disappointed with HP as anyone could be.. I just hope that Windows 8 fills the void!

Yes, it was short-sighted for HP to pour $10 billion into WebOS.. and I think that they should have kept on spending until there was nothing left..

but calling them Jewlett-Packard?
Just because they’re too cheap to innovate?

I just don’t know how stuff like this gets past editors!!!

Optic Fusion | Cage And Bandwidth Specials

Optic Fusion Fall Specials

Single Server and 5 Mbps for only $99 per month.
Half Cabinets starting at $300 per month.
Full Cabinets starting at $650 per month.
Full cabinet with 100 Mbps for only $1450 per month.
Bulk bandwidth available for $15 per meg.
New customers only, limited quantities available.
We offer a wide variety of packages and discounts, email us at today and let us build a bundled package for you!

GOOGLE+. Why discussions about "real names" are out-of-date – Forbes

The internet is not a virtual world anymore; the internet makes our life a connected life. So should we abandon anonymity? Let’s take the problem from a different point of view: in the “unconnected” life, you can publish and talk anonymously if you want to. It’s not easy, or usual, but it’s sometimes necessary, because there can’t be democracy without protected places. When we vote, we need to be anonymous, when you criticize a regime (even a democratic one), an institution, or a company you sometimes need to be anonymous to protect yourself (especially when you work there). Transparency sometimes needs a little “non-transparence,” which means anonymous talk, to help information emerge. In real life, some tools are available to protect our identities and generate those private areas that are essential to a balanced life and society. We need to find these new private areas, these new “secret squares,” for the connected life. But, as in real life, anonymity has to be the exception, not the rule.

So, yes, the question of “real names” is outdated. The real question should be: now that Facebook, Twitter and Google+ have become public spaces where people can meet to share or to protest, is there a danger in housing these public places in the exclusive hands of private companies? If “Internet” is a new country, then who will protect freedom in its public places?

via GOOGLE+. Why discussions about “real names” are out-of-date – Forbes.