Thursday, June 26, 2014

SQL GROUP BY Example

The GROUP BY clause causes data in a table (or any data source) to be divided into groups based on the expressions (or conditions) given in the GROUP BY clause. Seems confusing? Probably.. Let's see what it means,

Say you have Column1, Column 2..etc on your table.

Now you say GROUP BY Column1 , what this does is it divides your table's data based on Column1. So all data which has  Column1='Apple' will be in one group, then all data which has  Column1='Orange' will be in another group.

Again if you had GROUP BY Column1, Column2 then your data will be divided into groups of Column1 + Column2 combinations . If Column2 had values of Green and Red in different rows then one group would be based on Apple+Red another group Apple+Green. So you get the point.

So in a nutshell,

GROUP BY creates groups for output rows, according to unique combination of values specified in the GROUP BY clause

But why would you need to divide them into groups? So we can operate on those groups.

I'll use AdventureWorks database as an example.

Download it here,
https://msftdbprodsamples.codeplex.com/releases/view/93587


SELECT SalesPersonID, YEAR(OrderDate) AS OrderYear,
COUNT(CustomerID) AS All_Customers
FROM Sales.SalesOrderHeader
GROUP BY SalesPersonID, YEAR(OrderDate);

The above query will return all unique combinations of SalesPersonID  + YEAR(OrderDate) and display all customers each of those combinations.

In other words this means you are seeing all customers for a particular Sales Person in a given year.

Point to note:

SELECT DISTINCT SalesPersonID ,YEAR(OrderDate) AS OrderYear
FROM Sales.SalesOrderHeader

and

SELECT  SalesPersonID ,YEAR(OrderDate) AS OrderYear
FROM Sales.SalesOrderHeader
GROUP BY SalesPersonID, YEAR(OrderDate);

Will give you the same results.