Understanding RANK and DENSE_RANK Functions in SQL

The RANK and DENSE_RANK functions are two powerful analytical functions used in SQL to assign ranks to rows within a result set. These functions are commonly used for data analysis, reporting, and business intelligence purposes.

Introduction to RANK Function

The RANK function assigns a unique rank to each row within a result set based on the values in a specified column or expression. If two or more rows have the same value, they will be assigned the same rank, and the next rank will be skipped. For example, if two rows are ranked first, the next row will be ranked third.

Introduction to DENSE_RANK Function

The DENSE_RANK function is similar to the RANK function, but it does not skip ranks when there are ties. Instead, it assigns consecutive ranks to each unique value in the specified column or expression. If two rows have the same value, they will be assigned the same rank, and the next row will be assigned the next rank without skipping any ranks.

Key Differences Between RANK and DENSE_RANK

The key differences between RANK and DENSE_RANK functions are:

The RANK function skips ranks when there are ties, while the DENSE_RANK function does not.
The RANK function assigns a unique rank to each row based on the values in a specified column or expression, while the DENSE_RANK function assigns consecutive ranks to each unique value.

Example Usage of RANK and DENSE_RANK

To illustrate the differences between RANK and DENSE_RANK, let’s consider an example table called employees with columns for employee name and salary:

CREATE TABLE employees (
    emp_name VARCHAR(255),
    salary DECIMAL(10, 2)
);

INSERT INTO employees (emp_name, salary) VALUES
('John Doe', 50000.00),
('Jane Doe', 60000.00),
('Bob Smith', 70000.00),
('Alice Johnson', 80000.00),
('Mike Brown', 90000.00);

To rank the employees based on their salaries using RANK and DENSE_RANK, we can use the following queries:

SELECT 
    emp_name,
    salary,
    RANK() OVER (ORDER BY salary DESC) AS rank,
    DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank
FROM employees;

The result will be:

| emp_name | salary | rank | dense_rank |
|—————|———|——|————|
| Mike Brown | 90000.00 | 1 | 1 |
| Alice Johnson | 80000.00 | 2 | 2 |
| Bob Smith | 70000.00 | 3 | 3 |
| Jane Doe | 60000.00 | 4 | 4 |
| John Doe | 50000.00 | 5 | 5 |

If we add another employee with the same salary as one of the existing employees, the result will be:

INSERT INTO employees (emp_name, salary) VALUES ('Emily Davis', 60000.00);

SELECT 
    emp_name,
    salary,
    RANK() OVER (ORDER BY salary DESC) AS rank,
    DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank
FROM employees;

The result will be:

| emp_name | salary | rank | dense_rank |
|—————|———|——|————|
| Mike Brown | 90000.00 | 1 | 1 |
| Alice Johnson | 80000.00 | 2 | 2 |
| Bob Smith | 70000.00 | 3 | 3 |
| Jane Doe | 60000.00 | 4 | 4 |
| Emily Davis | 60000.00 | 4 | 4 |
| John Doe | 50000.00 | 6 | 5 |

As you can see, the RANK function skips ranks when there are ties (Jane Doe and Emily Davis have the same rank of 4), while the DENSE_RANK function assigns consecutive ranks without skipping any ranks.

Handling NULL Values

When working with RANK and DENSE_RANK, it’s essential to consider how NULL values will be handled. By default, NULL values are sorted first or last depending on the database system being used. To specify how NULL values should be handled, you can use the NULLS FIRST or NULLS LAST clause in the ORDER BY statement.

For example:

SELECT 
    emp_name,
    salary,
    RANK() OVER (ORDER BY salary DESC NULLS LAST) AS rank,
    DENSE_RANK() OVER (ORDER BY salary DESC NULLS LAST) AS dense_rank
FROM employees;

This will sort the rows with NULL salaries last.

Conclusion

In conclusion, RANK and DENSE_RANK are two powerful analytical functions in SQL that can be used to assign ranks to rows within a result set. The key differences between these functions lie in how they handle ties and whether they skip ranks or not. Understanding the behavior of these functions is crucial for data analysis, reporting, and business intelligence purposes.