Last week I wrote about some of the newer innovations in the SQL space. This got me thinking. What other language features would be useful in a SQL environment? This post explores some ideas I’ve had around this. It’s purely speculative, but perhaps it will spark some ideas for you. 1. Optional GROUP BY clause In most GROUP BY queries, every non-aggregate must be listed in the GROUP BY clause. Sure, we have options, like GROUP BY 1, 2, 3 or GROUP BY ALL, but while these are convenient, they’re also busywork for most queries. So my first suggestion is…
On 30 April 2025, DataBricks announced their SQL pipe operator. Without giving you a full rundown of what it is and what it does, here’s a quick example which should give you a good idea. This query: Is now equivalent to this: There’s more to it of course, but that’s the gist of it. It follows precedent set by other languages. Here’s how it might be written in R using tidyverse syntax. Here’s how you might express it in Python using a method chain. Ok so you get the picture. It’s not an unusual step for DataBricks to have taken.…
When working with databases, it’s crucial to know the best methods to query data effectively. In SQL, two powerful operations that allow you to compare sets of data are the INTERSECT operator and the INNER JOIN clause. Both commands serve the purpose of identifying commonalities between datasets, but they do so in subtly different ways that can affect the output and performance of your queries. What is the INTERSECT Operator? The INTERSECT operator is used to return a distinct intersection of two sets, which means it will only return the rows that are present in both query result sets. Here…
The full script for this post can be found here. This post uses the AdventureWorksDW database in SQL Server 2019 Express. Depending on your preference, you can either watch this video or read below. The goal Suppose we have a request to produce a report: Create a report showing internet sales amount with these columns: Order year Country Product category Sales amount Include a sub-total row showing total sales amount for each year. The value in the Country column on the sub-total row should be “All countries in YYYY”. The value in the Product category column on the sub-total row…
The full script for this post can be downloaded here. The following post was written primarily in Postgres 14 and tested in both Postgres and MySQL. While the data generation and use of EXCEPT are not supported natively in MySQL, the principles of row comparisons are equivalent. Setup To demonstrate how a row constructor comparison can be confusing, let’s create some dummy data with two columns of integers: Comparisons PostgreSQL allows us to compare two sets of values using what is known as a row constructor comparison. A row can be constructed with parentheses, like this: We can create…
What is Advanced SQL, anyway? I think about this sometimes. Is it window functions? CTEs? I don’t know. Some of it is trickier than others. I think it’s relative to how much practice you’ve had and depends a great deal on the data in front of you. Anyway, I saw this post on LinkedIn by Ankit Bansal. It struck me as an interesting problem. I wanted to come up with a solution before watching the author’s video. It’s only 8 minutes, so probably worth watching. I was happy to find that my solution was close but not identical. In case…