Counting Distinct Values with SQL

Counting distinct values is a common task when working with databases. This tutorial shows how to do it with COUNT(DISTINCT ...): on its own, combined with GROUP BY, and over a derived table.

Introduction to COUNT(DISTINCT)

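COUNT(DISTINCT ...) is the COUNT aggregate function with the DISTINCT keyword applied to its argument. It counts the unique, non-null values of a column or expression: NULLs are ignored, and each distinct value is counted once no matter how many rows contain it. The argument is normally a column name or an expression; support for anything more elaborate, such as a subquery or a list of several expressions, varies between databases.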
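The difference between counting rows, counting non-null values, and counting distinct values is easiest to see side by side. The following is a minimal sketch, assuming SQLite accessed through Python's built-in sqlite3 module; the readings table and its sensor column are hypothetical names invented for this illustration.

```python
import sqlite3

# Hypothetical table `readings` with a nullable `sensor` column (not from the tutorial).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT)")
conn.executemany(
    "INSERT INTO readings (sensor) VALUES (?)",
    [("a",), ("b",), ("a",), (None,), (None,)],  # two NULLs, one repeated value
)

all_rows, non_null_values, distinct_values = conn.execute(
    """
    SELECT COUNT(*)               AS all_rows,         -- every row, NULLs included
           COUNT(sensor)          AS non_null_values,  -- NULLs skipped
           COUNT(DISTINCT sensor) AS distinct_values   -- NULLs skipped, duplicates collapsed
    FROM readings
    """
).fetchone()
conn.close()

print(all_rows, non_null_values, distinct_values)  # 5 3 2
```

The two NULL rows contribute to COUNT(*) only, and the repeated value 'a' is counted once by COUNT(DISTINCT sensor).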

Basic Syntax

The basic syntax for using COUNT(DISTINCT) is as follows:

SELECT COUNT(DISTINCT column_name) AS distinct_count
FROM table_name;

This will return the number of unique, non-null values in the specified column.

Example Use Case

Suppose we have a table called employees with columns name, department, and salary. We want to count the number of distinct departments.

CREATE TABLE employees (
    name VARCHAR(255),
    department VARCHAR(255),
    salary DECIMAL(10, 2)
);

INSERT INTO employees (name, department, salary) VALUES
('John Doe', 'Sales', 50000.00),
('Jane Smith', 'Marketing', 60000.00),
('Bob Johnson', 'Sales', 70000.00),
('Alice Brown', 'IT', 80000.00);

SELECT COUNT(DISTINCT department) AS distinct_departments
FROM employees;

This returns 3, because the four rows contain three different departments: Sales, Marketing, and IT.
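To check the result, the example can be run end to end. This is a sketch that assumes SQLite through Python's built-in sqlite3 module rather than any particular production database; it also selects COUNT(department) for comparison, which is an addition to the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    " name VARCHAR(255), department VARCHAR(255), salary DECIMAL(10, 2))"
)
conn.executemany(
    "INSERT INTO employees (name, department, salary) VALUES (?, ?, ?)",
    [
        ("John Doe", "Sales", 50000.00),
        ("Jane Smith", "Marketing", 60000.00),
        ("Bob Johnson", "Sales", 70000.00),
        ("Alice Brown", "IT", 80000.00),
    ],
)

# COUNT(department) counts every non-null value; COUNT(DISTINCT department)
# counts each department once.
department_values, distinct_departments = conn.execute(
    "SELECT COUNT(department), COUNT(DISTINCT department) FROM employees"
).fetchone()
conn.close()

print(department_values, distinct_departments)  # 4 3
```

Sales appears twice in the table, which accounts for the gap between the two counts.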

Counting Distinct Values with Group By

To count distinct values within each group rather than across the whole table, combine COUNT(DISTINCT ...) with a GROUP BY clause.

SELECT department, COUNT(DISTINCT name) AS distinct_employees
FROM employees
GROUP BY department;

This returns one row per department with the number of unique employee names in it. For the sample data, Sales has 2, while Marketing and IT have 1 each.
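With the sample data every name is unique, so COUNT(DISTINCT name) and COUNT(*) agree in each group. The sketch below, again assuming SQLite via Python's sqlite3 module, adds one hypothetical duplicate row for John Doe (not part of the original data) so that the two counts diverge; the ORDER BY is also an addition, there only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    " name VARCHAR(255), department VARCHAR(255), salary DECIMAL(10, 2))"
)
conn.executemany(
    "INSERT INTO employees (name, department, salary) VALUES (?, ?, ?)",
    [
        ("John Doe", "Sales", 50000.00),
        ("Jane Smith", "Marketing", 60000.00),
        ("Bob Johnson", "Sales", 70000.00),
        ("Alice Brown", "IT", 80000.00),
        ("John Doe", "Sales", 55000.00),  # hypothetical duplicate, added for illustration
    ],
)

per_department = conn.execute(
    """
    SELECT department,
           COUNT(*)             AS row_count,
           COUNT(DISTINCT name) AS distinct_employees
    FROM employees
    GROUP BY department
    ORDER BY department
    """
).fetchall()
conn.close()

for department, row_count, distinct_employees in per_department:
    print(department, row_count, distinct_employees)
# IT 1 1
# Marketing 1 1
# Sales 3 2
```

Sales has three rows but only two distinct names, which is the case where COUNT(DISTINCT ...) gives a different answer from a plain row count.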

Using Derived Tables

In some cases, we may need to count distinct values in the result of another query. A derived table is a subquery in the FROM clause; its result set exists only for the duration of the query and can be read like a table.

SELECT COUNT(DISTINCT name) AS distinct_employees
FROM (
    SELECT DISTINCT name, department
    FROM employees
) AS derived_table;

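This returns the number of unique names in the derived table, which is 4 for the sample data. Because the outer query already applies DISTINCT to name, the inner DISTINCT does not change the result here; the pattern pays off when the inner query filters, joins, or aggregates before the count.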
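A derived table is also a portable way to count distinct combinations of several columns: deduplicate the pairs in the inner query, then count the rows outside. The sketch below assumes SQLite via Python's sqlite3 module and adds one hypothetical row placing John Doe in a second department (not part of the original data) so that names and name/department pairs give different counts; the alias pairs is likewise invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    " name VARCHAR(255), department VARCHAR(255), salary DECIMAL(10, 2))"
)
conn.executemany(
    "INSERT INTO employees (name, department, salary) VALUES (?, ?, ?)",
    [
        ("John Doe", "Sales", 50000.00),
        ("Jane Smith", "Marketing", 60000.00),
        ("Bob Johnson", "Sales", 70000.00),
        ("Alice Brown", "IT", 80000.00),
        ("John Doe", "IT", 20000.00),  # hypothetical second assignment, for illustration
    ],
)

distinct_names, distinct_pairs = conn.execute(
    """
    SELECT COUNT(DISTINCT name) AS distinct_names,  -- unique names only
           COUNT(*)             AS distinct_pairs   -- one row per unique (name, department)
    FROM (
        SELECT DISTINCT name, department
        FROM employees
    ) AS pairs
    """
).fetchone()
conn.close()

print(distinct_names, distinct_pairs)  # 4 5
```

John Doe contributes two pairs but one name, so the pair count is one higher than the name count.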

Best Practices

When using COUNT(DISTINCT), keep the following best practices in mind:

  • Always name the column or expression to count; COUNT(DISTINCT *) is rejected by most databases.
  • Use meaningful aliases for columns and tables to improve readability.
  • Avoid SELECT * in subqueries that feed a distinct count; select only the columns you need, as the extra columns can hurt performance.

Conclusion

COUNT(DISTINCT ...) counts the unique, non-null values of a column or expression. It works per group when combined with GROUP BY, and it can be applied to a derived table when the data needs preparing first. The examples and best practices above cover the most common cases.
