In Python, it’s often necessary to divide a list into smaller, equal-sized chunks. This can be useful when working with large datasets, processing data in parallel, or creating batches of items for further processing.
Introduction to List Chunking
List chunking is the process of dividing a list into smaller sublists of a specified size. There are several ways to achieve this in Python, and we’ll explore some of the most common methods below.
Using Generators
One memory-efficient way to split a list into chunks is by using generators. A generator is a special type of function that produces a sequence of values on the fly, one at a time, rather than computing them all at once and storing them in memory.
Here’s an example of a generator-based chunking function:
def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]
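This function takes two arguments: lst, the list to be chunked, and n, the size of each chunk. It uses a loop to iterate over the list in steps of n, yielding each chunk as it’s generated. If the length of lst is not a multiple of n, the last chunk contains the remaining items and is shorter than n.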
To use this function, you can simply call it with a list and a chunk size:
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9]
chunked_numbers = list(chunks(numbers, 3))
print(chunked_numbers) # Output: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
Note that the list() function is used to convert the generator output to a list of chunks.
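The behavior with an uneven length is easiest to see in a short run. The sketch below repeats the chunks function so that it runs on its own; the letters list is sample data made up for illustration. It also consumes the generator directly in a for loop, which avoids building the full list of chunks when you only need one chunk at a time.

```python
def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]


letters = list("abcdefg")  # illustrative data: 7 items, not a multiple of 3

# Iterate lazily: each chunk is sliced only when the loop asks for it.
for chunk in chunks(letters, 3):
    print(chunk)
# ['a', 'b', 'c']
# ['d', 'e', 'f']
# ['g']
```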
Using List Comprehensions
Another way to split a list into chunks is by using list comprehensions. A list comprehension is a concise way to create a new list from an existing iterable.
Here’s an example of a list comprehension-based chunking expression:
chunked_numbers = [numbers[i:i + 3] for i in range(0, len(numbers), 3)]
print(chunked_numbers) # Output: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
This expression uses a list comprehension to create a new list of chunks. The range() call steps through the indices of the original list 3 at a time, and the slice numbers[i:i + 3] extracts the chunk that starts at each index.
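To reuse the expression with different chunk sizes, it can be wrapped in a small function. This is a minimal sketch: the name chunk_list and the size check are assumptions added here for illustration, not part of the expression above. The check is there because a negative step would make range() produce nothing and silently return an empty list.

```python
def chunk_list(seq, size):
    """Return consecutive slices of seq, each at most `size` items long."""
    # Hypothetical guard: reject zero and negative sizes explicitly.
    if size < 1:
        raise ValueError("size must be a positive integer")
    return [seq[i:i + size] for i in range(0, len(seq), size)]


print(chunk_list([1, 2, 3, 4, 5, 6, 7], 3))  # [[1, 2, 3], [4, 5, 6], [7]]
print(chunk_list("abcde", 2))                # ['ab', 'cd', 'e']
```

Because it relies only on len() and slicing, the same function also works on strings and tuples, as the second call shows.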
Using NumPy
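If you’re working with numerical data, you can use the NumPy library to split your arrays into chunks. NumPy provides an array_split function that divides an array into a given number of sections. Note that an integer second argument is the number of sections, not the size of each chunk. When the array does not divide evenly, the sections are as close to equal as possible, with the earlier sections holding one extra element.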
Here’s an example of using array_split:
import numpy as np
numbers = np.arange(1, 10)
chunked_numbers = np.array_split(numbers, 3)  # 3 is the number of sections
print(chunked_numbers) # Output: [array([1, 2, 3]), array([4, 5, 6]), array([7, 8, 9])]
Note that array_split returns a list of NumPy arrays, rather than a list of lists.
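With nine elements and three sections, the section count and the chunk size happen to coincide, which hides what the second argument means. The sketch below uses ten elements to show the difference, and then shows one way to get chunks of a fixed size by passing split indices instead of a section count. The variable names are illustrative.

```python
import numpy as np

numbers = np.arange(1, 11)  # 10 elements: 1 through 10

# An integer second argument is the number of sections, not the chunk size.
by_count = np.array_split(numbers, 3)
print([len(part) for part in by_count])  # [4, 3, 3]

# A sequence of split indices gives chunks of a fixed size instead (here 3);
# the last chunk holds whatever is left over.
size = 3
by_size = np.array_split(numbers, np.arange(size, len(numbers), size))
print([part.tolist() for part in by_size])
# [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]
```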
Using itertools
Finally, you can use the itertools module from the standard library to split your lists into chunks. The batched function (available in Python 3.12 and later) provides a convenient way to divide any iterable into batches.
Here’s an example of using batched:
import itertools
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9]
chunked_numbers = list(itertools.batched(numbers, 3))
print(chunked_numbers) # Output: [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
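This code uses the batched function to divide the input list into batches of size 3. Each batch is a tuple rather than a list, and the last batch may be shorter if the input does not divide evenly.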
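On Python versions before 3.12, where batched is not available, similar behavior can be approximated with itertools.islice. The following is a rough sketch rather than a drop-in replacement: the name batched_fallback is made up for this example, and the walrus operator it uses requires Python 3.8 or later.

```python
from itertools import islice


def batched_fallback(iterable, n):
    """Yield tuples of up to n items from any iterable (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least one")
    it = iter(iterable)
    # islice pulls at most n items per pass; an empty tuple ends the loop.
    while batch := tuple(islice(it, n)):
        yield batch


print(list(batched_fallback([1, 2, 3, 4, 5, 6, 7], 3)))
# [(1, 2, 3), (4, 5, 6), (7,)]
```

Unlike the slicing-based approaches, this works on iterables that have no length, such as generators and file objects.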
Conclusion
In this tutorial, we’ve explored several ways to split a list into equal-sized chunks in Python. The generator and the list comprehension rely only on slicing, so they work with any sequence on any Python version. NumPy’s array_split fits numerical arrays when you want a fixed number of sections, and itertools.batched handles any iterable on Python 3.12 and later.