Copying Dictionaries in Python: Shallow vs. Deep Copies

Understanding Object References in Python

In Python, a variable does not hold a value directly; it holds a reference to an object in memory. This matters most for mutable data structures such as dictionaries. When you assign one dictionary to another name (e.g., dict2 = dict1), you’re not creating a new dictionary; you’re creating a second reference to the same dictionary object. Any change made through dict2 is therefore visible through dict1, because both names refer to one object.
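A minimal sketch of this aliasing behavior follows. The key names and the "changed" value are illustrative placeholders, and the `is` operator is used only to confirm that both names refer to one object.

```python
# Plain assignment binds a second name to the same dictionary object.
dict1 = {"key1": "value1", "key2": "value2"}
dict2 = dict1

# The change is made through dict2 ...
dict2["key2"] = "changed"

# ... but it is visible through dict1 as well.
print(dict1 is dict2)  # True
print(dict1)           # {'key1': 'value1', 'key2': 'changed'}
```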

Creating Independent Copies

To avoid this behavior and work with a truly independent copy of a dictionary, you need to explicitly create a new dictionary object. Python provides several ways to achieve this:

1. Shallow Copy

A shallow copy creates a new dictionary object, but it doesn’t create copies of the objects within the dictionary. Instead, it copies references to those inner objects. This works well if your dictionary contains only immutable objects (like numbers, strings, and tuples of immutable values).

You can create a shallow copy using either of these methods:

  • Using the copy() method:

    dict1 = {"key1": "value1", "key2": "value2"}
    dict2 = dict1.copy()
    
    dict2["key2"] = "WHY?!"
    print(dict1)  # Output: {'key1': 'value1', 'key2': 'value2'}
    print(dict2)  # Output: {'key1': 'value1', 'key2': 'WHY?!'}
    
  • Using the dict() constructor:

    dict1 = {"key1": "value1", "key2": "value2"}
    dict2 = dict(dict1)
    
    dict2["key2"] = "WHY?!"
    print(dict1)  # Output: {'key1': 'value1', 'key2': 'value2'}
    print(dict2)  # Output: {'key1': 'value1', 'key2': 'WHY?!'}
    

In these examples, dict1 and dict2 are separate dictionary objects, so reassigning a key in dict2 leaves dict1 unchanged. However, if the dictionary contains mutable objects (like lists or other dictionaries) as values, both dictionaries hold references to the same inner objects, and modifying one of those inner objects in place through either dict1 or dict2 is visible through both.
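The difference between reassigning a top-level key and mutating a shared inner object can be sketched as follows. The names `original` and `shallow` and the sample data are illustrative assumptions, not part of the examples above.

```python
original = {"name": "config", "tags": ["a", "b"]}
shallow = original.copy()

# Rebinding a top-level key touches only the copy.
shallow["name"] = "copy"

# Mutating the shared inner list in place shows up in both dictionaries.
shallow["tags"].append("c")

print(original)  # {'name': 'config', 'tags': ['a', 'b', 'c']}
print(shallow)   # {'name': 'copy', 'tags': ['a', 'b', 'c']}
print(original["tags"] is shallow["tags"])  # True
```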

2. Deep Copy

A deep copy creates a new dictionary object and recursively copies all of the objects found within it. This ensures that you have a completely independent copy of the dictionary and its contents, even if it contains nested mutable objects.

To create a deep copy, you need to use the deepcopy() function from the copy module:

import copy

dict1 = {"key1": "value1", "key2": {"nested_key": "nested_value"}}
dict2 = copy.deepcopy(dict1)

dict2["key2"]["nested_key"] = "WHY?!"

print(dict1) # Output: {'key1': 'value1', 'key2': {'nested_key': 'nested_value'}}
print(dict2) # Output: {'key1': 'value1', 'key2': {'nested_key': 'WHY?!'}}

As you can see, modifying the nested dictionary within dict2 does not affect dict1. This is because deepcopy() created entirely new objects for all nested structures.
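One way to check this independence directly is to compare object identities with `is`. The sketch below uses illustrative variable names and also uses `copy.copy()`, the copy module’s general-purpose shallow copy, for contrast.

```python
import copy

original = {"key1": "value1", "key2": {"nested_key": "nested_value"}}
shallow = copy.copy(original)
deep = copy.deepcopy(original)

# The shallow copy shares the nested dictionary with the original.
print(shallow["key2"] is original["key2"])  # True

# The deep copy holds its own nested dictionary with equal contents.
print(deep["key2"] is original["key2"])     # False
print(deep == original)                     # True
```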

Choosing the Right Copy Method

  • Shallow copy: Use a shallow copy when your dictionary contains only immutable objects, or when you specifically want to share references to inner mutable objects. It’s faster and more memory-efficient than a deep copy.

  • Deep copy: Use a deep copy when your dictionary contains nested mutable objects and you need to ensure complete independence between the original and the copy. This prevents unexpected side effects when modifying the copied dictionary. However, deep copying is more computationally expensive and requires more memory.
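To get a rough sense of the cost difference on your own data, you can time both approaches with the standard timeit module. This is only a sketch: the dictionary shape and the repetition count are arbitrary assumptions, and the absolute numbers will vary by machine and Python version.

```python
import copy
import timeit

# Hypothetical sample data: 1,000 keys, each mapping to a small list.
data = {i: [i, i + 1] for i in range(1000)}

# Total seconds for 100 repetitions of each kind of copy.
shallow_seconds = timeit.timeit(lambda: data.copy(), number=100)
deep_seconds = timeit.timeit(lambda: copy.deepcopy(data), number=100)

print(f"shallow copy: {shallow_seconds:.4f} s")
print(f"deep copy:    {deep_seconds:.4f} s")
```

If the measured gap matters for your workload and the nested objects are never mutated, a shallow copy may be enough.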
