In many applications, you need to insert new data into a table while updating existing records if they already exist. This is known as an "upsert" operation, short for "update or insert." In this tutorial, we will explore the various ways to perform upsert operations in PostgreSQL.
Understanding UPSERT
Before diving into the solutions, let’s understand what upsert means and why it’s essential in database management. An upsert is a combination of update and insert operations. If a record with the specified key already exists in the table, an update operation is performed; otherwise, a new record is inserted.
PostgreSQL 9.5 and Later: ON CONFLICT Clause
Starting from PostgreSQL 9.5, you can use the ON CONFLICT
clause to perform upsert operations. The basic syntax is as follows:
INSERT INTO table_name (column1, column2)
VALUES (value1, value2)
ON CONFLICT (constraint_name) DO UPDATE
SET column1 = EXCLUDED.column1,
column2 = EXCLUDED.column2;
In this example:
table_name
is the name of your table.column1
andcolumn2
are the columns you want to insert or update.value1
andvalue2
are the values you want to insert.constraint_name
is the name of the unique constraint that determines whether a conflict occurs.EXCLUDED.column1
andEXCLUDED.column2
refer to the values that would have been inserted if there were no conflict.
Example Usage
Let’s create a sample table called employees
with columns id
, name
, and salary
.
CREATE TABLE employees (
id INTEGER PRIMARY KEY,
name VARCHAR(255),
salary DECIMAL(10, 2)
);
Now, let’s insert or update an employee using the upsert method:
INSERT INTO employees (id, name, salary)
VALUES (1, 'John Doe', 50000.00)
ON CONFLICT (id) DO UPDATE
SET name = EXCLUDED.name,
salary = EXCLUDED.salary;
This statement will insert a new employee if the id
does not exist; otherwise, it will update the existing record.
Older Versions of PostgreSQL: Using Functions or CTEs
For versions prior to 9.5, you can achieve similar results using functions or Common Table Expressions (CTEs).
Using Functions
You can create a function that attempts to insert a record and updates it if a unique constraint violation occurs:
CREATE OR REPLACE FUNCTION upsert_employees(
p_id INTEGER,
p_name VARCHAR(255),
p_salary DECIMAL(10, 2)
)
RETURNS VOID AS $$
BEGIN
LOOP
UPDATE employees SET name = p_name, salary = p_salary WHERE id = p_id;
IF FOUND THEN
RETURN;
END IF;
BEGIN
INSERT INTO employees (id, name, salary) VALUES (p_id, p_name, p_salary);
RETURN;
EXCEPTION WHEN unique_violation THEN
-- Do nothing and loop to try the UPDATE again.
END;
END LOOP;
END;
$$ LANGUAGE plpgsql;
Using CTEs
CTEs provide another way to perform upsert operations. Here’s an example:
WITH new_values (id, name, salary) AS (
VALUES (1, 'John Doe', 50000.00)
),
upsert AS (
UPDATE employees e
SET name = nv.name,
salary = nv.salary
FROM new_values nv
WHERE e.id = nv.id
RETURNING e.*
)
INSERT INTO employees (id, name, salary)
SELECT id, name, salary
FROM new_values
WHERE NOT EXISTS (SELECT 1 FROM upsert up WHERE up.id = new_values.id);
Conclusion
In conclusion, performing upsert operations in PostgreSQL can be achieved through various methods, depending on the version of your database. For PostgreSQL 9.5 and later, the ON CONFLICT
clause provides a straightforward way to insert or update records based on unique constraints. For older versions, using functions or CTEs can help you achieve similar results.
Best Practices
- Always specify the column names in your
INSERT
statements. - Use transactions when performing multiple operations that should be treated as a single unit of work.
- Be mindful of concurrency and potential race conditions when implementing upsert logic.
By following these guidelines and examples, you can effectively manage data insertion and updates in PostgreSQL, ensuring data consistency and reliability in your applications.