The SQL adventure of “Order By” and “Group By”

ORDER BY and GROUP BY DML clauses:

Most DBMS support DDL, DML and DCL commands. The DDL or Data Definition Language is used to define the structure of the database, the DML or Data Manipulation Language is used to manage the data and the DCL or Data Control Language is used to control the accessibility and privileges on the data. While each one of these languages are highly important in creating and maintaining the integrity, security and reliability of the data, DML nonetheless appears to be the most utilized language in the world of DBMS. This, is mainly due to its data retrieval statements “SELECT”. [1]

In standard ANSI SQL, the select statement is design to retrieve all the unique tuples of the selected dimension from the tables or joined tables based on the criteria defined with no specific order or grouping. The “ORDER BY” and “GROUP BY” clause address that issue by enhancing the readability of the data retrieved. [2]

Considering that SQL by default returns unique records only, the value of the “GROUP BY” is mainly revealed when coupled with an aggregator in the select clause as shown on query 1.

Query 1: Group by Query

This will display unique column_name(s) and aggregated values based on column_name(s) with similar value. Let’s consider the following Order table on Table 1:

Table 1: Order Table

The query 2 below would return the results on Table 2

Query 2: Number of Orders Query

Table 2: Number of Orders Table

This shows that John Smith has 1 order in the system, Alan Porter has 3 and German Wright has 2. With some of the SQL dialects such as T-SQL or PL/SQL, where duplicate tuples are not systematically removed, The “GROUP BY” Clause can be used instead of “DISTINCT” without Aggregators in the select clause to remove the duplicates. [3]

The “ORDER BY” forces a specific order to the data based on preselected columns as shown on query 3.

Query 3: Generic Order

This ensures that the order presented sequentially follow the specified order established for the columns listed in the “ORDER BY” clause.

Query 4 would return the data on table 3

Query 4: Sequential order query

Table 3: Sequential order table

Considering that the “ASC” syntax is the default direction for most SQL dialects, it is for the most part not required. “The ORDER BY” helps the viewer determine a specific order of the data that can be utilized for ranking. [4]

Unlike most programming languages, SQL does not have a set execution order. However, most SQL dialects execute their data retrieval statements on the following Logical Query Processing Phase:

1.      FROM clause,

2.      WHERE clause,

3.      GROUP BY clause,

4.      HAVING clause,

5.      SELECT clause,

6.      ORDER BY clause.

View that the “ORDER BY” Close gets executed last, even though both “GROUP BY” and “ORDER BY” can be simultaneously executed most DBMS vendors will require that the “GROUP BY” be declared before the “ORDER BY”.

Therefore, the statement on query 5 will return the records on table 4 where the records are initially grouped based on the CustomerName and the table ordered in an Ascendant fashion based on the Grouped CustomerName. [5].

Query 5: Orders Group query

Table 4: Order Group Table

Data Manipulation (DML) in addition to be the most utilized language of the DBMS, is mainly used to retrieve and present the data to non-technical viewers. Therefore the ability to format the data using clauses such as “GROUP BY” and “ORDER BY” is key.  

Leave a Reply