Counting distinct values is a common task when working with databases, and SQL provides several ways to achieve this. In this tutorial, we will explore how to count distinct values using the COUNT(DISTINCT) function.
Introduction to COUNT(DISTINCT)
COUNT(DISTINCT) is the COUNT aggregate function with the DISTINCT keyword applied to its argument. It returns the number of unique, non-null values in a column or expression: NULL values are skipped, and each remaining value is counted once, however many times it appears. The argument is usually a column name or an expression; whether a subquery is accepted as the argument depends on the database system.
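The NULL handling described above is easy to check by comparing three counts over the same column. The following is a minimal sketch, assuming Python's built-in sqlite3 module as the database; the visits table and its rows are hypothetical and exist only for this illustration.

```python
import sqlite3

# In-memory SQLite database; the table and rows are made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (visitor TEXT)")
conn.executemany(
    "INSERT INTO visits (visitor) VALUES (?)",
    [("ann",), ("bob",), ("ann",), (None,), (None,)],
)

# COUNT(*) counts rows, COUNT(col) counts non-null values,
# COUNT(DISTINCT col) counts unique non-null values.
all_rows, non_null, distinct_non_null = conn.execute(
    "SELECT COUNT(*), COUNT(visitor), COUNT(DISTINCT visitor) FROM visits"
).fetchone()

print(all_rows, non_null, distinct_non_null)  # 5 3 2
conn.close()
```

The two NULL rows contribute to COUNT(*) only, and the repeated value "ann" is counted once by COUNT(DISTINCT visitor).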
Basic Syntax
The basic syntax for using COUNT(DISTINCT) is as follows:
SELECT COUNT(DISTINCT column_name) AS count
FROM table_name;
This will return the number of unique, non-null values in the specified column.
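The argument does not have to be a bare column name; an expression works as well. The sketch below, which again assumes Python's built-in sqlite3 module and a hypothetical staff table, normalizes case and surrounding whitespace before counting, so that spelling variants of the same department are not counted separately.

```python
import sqlite3

# In-memory SQLite database; the staff table is a made-up example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (department TEXT)")
conn.executemany(
    "INSERT INTO staff (department) VALUES (?)",
    [("Sales",), ("sales ",), ("IT",)],
)

# First count: raw column values.
# Second count: values after trimming whitespace and lowercasing.
raw_count, normalized_count = conn.execute(
    "SELECT COUNT(DISTINCT department), "
    "COUNT(DISTINCT LOWER(TRIM(department))) "
    "FROM staff"
).fetchone()

print(raw_count, normalized_count)  # 3 2
conn.close()
```

LOWER and TRIM are available in SQLite and in most other systems, but function names and default string comparison rules vary, so the raw count in particular may differ elsewhere.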
Example Use Case
Suppose we have a table called employees with columns name, department, and salary. We want to count the number of distinct departments.
CREATE TABLE employees (
name VARCHAR(255),
department VARCHAR(255),
salary DECIMAL(10, 2)
);
INSERT INTO employees (name, department, salary) VALUES
('John Doe', 'Sales', 50000.00),
('Jane Smith', 'Marketing', 60000.00),
('Bob Johnson', 'Sales', 70000.00),
('Alice Brown', 'IT', 80000.00);
SELECT COUNT(DISTINCT department) AS distinct_departments
FROM employees;
This will return the number of unique departments, which is 3 (Sales, Marketing, and IT). Sales appears in two rows but is counted once.
Counting Distinct Values with GROUP BY
We often need a distinct count for each group of rows rather than for the whole table. The GROUP BY clause can be used in conjunction with COUNT(DISTINCT) to achieve this.
SELECT department, COUNT(DISTINCT name) AS distinct_employees
FROM employees
GROUP BY department;
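This will return one row per department with the number of unique employee names in it. With the sample data above, Sales has 2, and Marketing and IT have 1 each. Note that two different employees who share a name are counted once, so a unique identifier column is a safer choice when one exists.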
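To try the grouped query end to end, the sketch below loads the tutorial's sample rows into an in-memory SQLite database through Python's built-in sqlite3 module. The ORDER BY clause is an addition for this sketch only, so that the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "name VARCHAR(255), department VARCHAR(255), salary DECIMAL(10, 2))"
)
conn.executemany(
    "INSERT INTO employees (name, department, salary) VALUES (?, ?, ?)",
    [
        ("John Doe", "Sales", 50000.00),
        ("Jane Smith", "Marketing", 60000.00),
        ("Bob Johnson", "Sales", 70000.00),
        ("Alice Brown", "IT", 80000.00),
    ],
)

# One row per department, each with its own distinct-name count.
rows = conn.execute(
    "SELECT department, COUNT(DISTINCT name) AS distinct_employees "
    "FROM employees "
    "GROUP BY department "
    "ORDER BY department"
).fetchall()

print(rows)  # [('IT', 1), ('Marketing', 1), ('Sales', 2)]
conn.close()
```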
Using Derived Tables
In some cases, we may need to count distinct values based on a derived table. A derived table is a subquery in the FROM clause; its result set exists only for the duration of the query and serves as the source for the outer query.
SELECT COUNT(DISTINCT name) AS distinct_employees
FROM (
SELECT DISTINCT name, department
FROM employees
) AS derived_table;
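This will return the number of unique employee names in the derived table. In this particular query the inner DISTINCT does not change the result, because COUNT(DISTINCT name) already collapses duplicates; the pattern becomes useful when the inner query filters, joins, or aggregates the data first.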
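One case where a derived table does real work is counting distinct combinations of several columns, since passing multiple columns to COUNT(DISTINCT ...) is not supported by every database system. The following is a sketch under that assumption, using Python's built-in sqlite3 module; the rows are hypothetical and include a repeated pair on purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT)")
conn.executemany(
    "INSERT INTO employees (name, department) VALUES (?, ?)",
    [
        ("John Doe", "Sales"),
        ("John Doe", "Sales"),  # exact duplicate of the row above
        ("John Doe", "IT"),     # same name, different department
        ("Jane Smith", "Marketing"),
    ],
)

# The derived table removes duplicate (name, department) pairs;
# the outer COUNT(*) then counts the pairs that remain.
pair_count = conn.execute(
    "SELECT COUNT(*) FROM ("
    "  SELECT DISTINCT name, department FROM employees"
    ") AS derived_table"
).fetchone()[0]

# For comparison: distinct names alone.
name_count = conn.execute(
    "SELECT COUNT(DISTINCT name) FROM employees"
).fetchone()[0]

print(pair_count, name_count)  # 3 2
conn.close()
```

Unlike COUNT(DISTINCT column), SELECT DISTINCT keeps rows that contain NULL, so pairs with a NULL in either column would be included in the pair count.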
Best Practices
When using COUNT(DISTINCT), keep the following best practices in mind:
- Always specify the column or expression to count distinct values for.
- Use meaningful aliases for columns and tables to improve readability.
- Avoid using SELECT * in queries that count distinct values; select only the columns you need, as reading unnecessary columns can lead to performance issues.
Conclusion
Counting distinct values is an essential task in database querying. The COUNT(DISTINCT) function provides a powerful way to achieve this. By following the examples and best practices outlined in this tutorial, you can effectively count distinct values in your SQL queries.