Insert or Update: Mastering UPSERT in PostgreSQL

In many applications, you need to insert new data into a table while updating existing records if they already exist. This is known as an "upsert" operation, short for "update or insert." In this tutorial, we will explore the various ways to perform upsert operations in PostgreSQL.

Understanding UPSERT

Before diving into the solutions, let’s understand what upsert means and why it’s essential in database management. An upsert is a combination of update and insert operations. If a record with the specified key already exists in the table, an update operation is performed; otherwise, a new record is inserted.

PostgreSQL 9.5 and Later: ON CONFLICT Clause

Starting from PostgreSQL 9.5, you can use the ON CONFLICT clause to perform upsert operations. The basic syntax is as follows:

INSERT INTO table_name (column1, column2)
VALUES (value1, value2)
ON CONFLICT (constraint_name) DO UPDATE
SET column1 = EXCLUDED.column1,
    column2 = EXCLUDED.column2;

In this example:

table_name is the name of your table.
column1 and column2 are the columns you want to insert or update.
value1 and value2 are the values you want to insert.
constraint_name is the name of the unique constraint that determines whether a conflict occurs.
EXCLUDED.column1 and EXCLUDED.column2 refer to the values that would have been inserted if there were no conflict.

Example Usage

Let’s create a sample table called employees with columns id, name, and salary.

CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    name VARCHAR(255),
    salary DECIMAL(10, 2)
);

Now, let’s insert or update an employee using the upsert method:

INSERT INTO employees (id, name, salary)
VALUES (1, 'John Doe', 50000.00)
ON CONFLICT (id) DO UPDATE
SET name = EXCLUDED.name,
    salary = EXCLUDED.salary;

This statement will insert a new employee if the id does not exist; otherwise, it will update the existing record.

Older Versions of PostgreSQL: Using Functions or CTEs

For versions prior to 9.5, you can achieve similar results using functions or Common Table Expressions (CTEs).

Using Functions

You can create a function that attempts to insert a record and updates it if a unique constraint violation occurs:

CREATE OR REPLACE FUNCTION upsert_employees(
    p_id INTEGER,
    p_name VARCHAR(255),
    p_salary DECIMAL(10, 2)
)
RETURNS VOID AS $$
BEGIN
    LOOP
        UPDATE employees SET name = p_name, salary = p_salary WHERE id = p_id;
        IF FOUND THEN
            RETURN;
        END IF;
        BEGIN
            INSERT INTO employees (id, name, salary) VALUES (p_id, p_name, p_salary);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- Do nothing and loop to try the UPDATE again.
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

Using CTEs

CTEs provide another way to perform upsert operations. Here’s an example:

WITH new_values (id, name, salary) AS (
    VALUES (1, 'John Doe', 50000.00)
),
upsert AS (
    UPDATE employees e
        SET name = nv.name,
            salary = nv.salary
    FROM new_values nv
    WHERE e.id = nv.id
    RETURNING e.*
)
INSERT INTO employees (id, name, salary)
SELECT id, name, salary
FROM new_values
WHERE NOT EXISTS (SELECT 1 FROM upsert up WHERE up.id = new_values.id);

Conclusion

In conclusion, performing upsert operations in PostgreSQL can be achieved through various methods, depending on the version of your database. For PostgreSQL 9.5 and later, the ON CONFLICT clause provides a straightforward way to insert or update records based on unique constraints. For older versions, using functions or CTEs can help you achieve similar results.

Best Practices

Always specify the column names in your INSERT statements.
Use transactions when performing multiple operations that should be treated as a single unit of work.
Be mindful of concurrency and potential race conditions when implementing upsert logic.

By following these guidelines and examples, you can effectively manage data insertion and updates in PostgreSQL, ensuring data consistency and reliability in your applications.