Lists in Python are arrays - with all the consequences.
Why is this important?Permalink
It is vital to understand how Python implements lists and how they work, so we can decide if they are the proper collection for our values.
Different collections have different performance characteristics for reading and writing.
Choosing a proper collection ensures we will get the performance we need for a given task.
What is an array?Permalink
Traditionally (in C and similar languages), array stores values of the same type and stores them sequentially in physical memory, one after the other.
This memory layout makes arrays fast to read as we just read one chunk of memory sequentially, and we don't need to jump back and forth.
Arrays are also very fast when accessing items by index - as we can get the memory address of an item in the array by doing simple arithmetic.
Traditionally we can't change the length of the array.
If we want to add or remove items from an array, we must create a new array and copy values to it.
Because the extra memory we need is right after our array's memory and is probably occupied. So if we want to add an item to the array, we need to allocate a new chunk of memory that will be big enough for all items and copy all items from the old array and write our new item there.
So now, when we understand what an array is and how it works, we can discuss how a list in Python behaves.
How list works in PythonPermalink
How do lists work in Python?
Here is a link to the Python documentation that discusses list implementation briefly.
The documentation says lists are variable-length arrays.
A variable-length array means that its length can change, either by adding/inserting items or removing them. This flexibility is good, and it looks simple from the outside.
Internally though, Python still has to reallocate memory and move items for us, so we don't have to do it. But it still needs to happen.
That's why it is slow to insert into an array. Or append to it.
Python applies some cleverness to make appending faster, as mentioned in the documentation.
When we append to a list, Python allocates more memory for a list than needed. This overallocation makes appending faster because when we append next time, some extra memory is already allocated, and allocation and copying don't need to happen again.
And the bigger the list is, the more memory Python over allocates (as it assumes more appends are coming).
import sys numbers =  sys.getsizeof(numbers) # outputs: 56 # List allocates 56 bytes of memory numbers.append(1) sys.getsizeof(numbers) # outputs: 88 # List now allocates 88 bytes, list had to be reallocated numbers.append(2) sys.getsizeof(numbers) # outputs: 88 # List still allocates 88 bytes, no reallocation or copying happened numbers.append(3) sys.getsizeof(numbers) # outputs: 88 # List still allocates 88 bytes, no reallocation or copying happened # Let's add 10 more numbers for n in range (10): numbers.append(n) sys.getsizeof(numbers) # outputs: 184 numbers.append(22) sys.getsizeof(numbers) # outputs:184 # List still allocates 184 bytes, no reallocation or copying happened
List stores pointersPermalink
Before, we mentioned that arrays store values of the same type.
But a list doesn't have to contain items of the same type. How's that?
The reason is that Python's list doesn't store actual values but rather pointers (references) to them.
Pointer is the memory address. So pointer tells us at which address the actual data are. Each pointer has the same size, so a list in Python is an array of pointers to values, not an array of actual values.
So Python's lists are true arrays with some extra bits to make them more convenient to use.
- Python lists are arrays with variable lengths.
- That makes them very fast for reading and accessing items by index.
- But slow for inserting/removing.
- Python has some optimisations, so append is not that slow on average.
You might also like
Join the newsletter
Subscribe to get new articles about Python, code and programming into your inbox!