Tuple comprehensions in Python

Tuple comprehensions in Python

In Python, we have comprehensions for lists, sets and dicts.

But there is no comprehension for tuples. Or is there?

Why there is not a tuple comprehension?Permalink

Firstly, the tuple is usually not used to store a series of homogenous data. We use it more often to store heterogeneous data as a quick way to group related data together.

Secondly, we create a tuple with round brackets: t1 = (1, 2), so tuple comprehension would probably need to look like this:

my_tuple = (item for item in some_iterable)

But this syntax is already used for generator expressions. So the above code returns a generator object. We will discuss generator expressions some other time. (If you don't know what they are yet, don't worry, you can go far without them.)

Why might we need it?Permalink

Sometimes we want to store homogenous data in a tuple.

One reason might be that there is no immutable (frozen) list in Python (not in the standard library). Some teams want to have an immutable collection type in place of a list to make sure that once a list is created, it is not changed.

Tuple fits this requirement. It behaves like a list and has all the characteristics of a list when reading a tuple. (And that is enough because since it is immutable, we don't want to write to it).

So what can we do if we want to use a tuple instead of a list but still enjoy the beauty of a comprehension?

We can have tuple comprehension (almost)Permalink

Ok, so how can we have tuple comprehension?

Usually, a tuple is constructed only with round brackets or even without:

# with brackets
t1 = (1, 2, 3)
# outputs: (1, 2, 3)

# without brackets
t2 = 8,9
# outputs: (8, 9)

But a tuple is a proper class (object), so we can also create a tuple as any other object - instantiate it using its class name:

t3 = tuple()
t3
# outputs: ()

Ok, this is not useful. Let's have a look at the documentation to see what the tuple class looks like: tuple documentation

So tuple can be constructed using tuple class and passing it some iterable, e.g. list:

t4 = tuple([22, 33])
t4
# outputs: (22, 33)

If it takes a list, we can use list comprehension too:

numbers = [1, 2, 3]
t5 = tuple([n * 2 for n in numbers])
t5
# outputs: (2, 4, 6)

This code is a bit lengthy; there are a lot of brackets.

It also allocates memory for all the items we want to have in a tuple twice. We first create a list (allocates memory) which is then used to create a tuple with the same items(allocates memory again). Yes, memory for the list is released right after the tuple is created, but the memory still needs to be available.

We can do better.

As seen in the documentation, a tuple accepts iterable as an argument.

As the documentation says:

iterable may be either a sequence, a container that supports iteration, or an iterator object.

tuple documentation

An iterator is an object that enables us to iterate (traverse) over an iterable, but it is also iterable itself.

We don't want to unnecessarily allocate memory for all items; thus, we will use a generator. A generator generates one value at a time (instead of allocating memory for all items).

And the easiest way to create a generator is to use generator expression:

(n * 2 for n in numbers)
# outputs <generator object <genexpr> at 0x10c643370>

We can use this directly when creating our tuple:

t6 = tuple((n * 2 for n in numbers))
t6
# outputs: (2, 4, 6)

That's better (in terms of memory), but there are still a lot of brackets.

Luckily generator expression doesn't have to be parenthesised if it is sole argument into the function so we can write it like this:

t7 = tuple(n * 2 for n in numbers)
t7
# outputs: (2, 4, 6)

And that's it. That is as close to tuple comprehension as we can get. It does require using tuple in front of our "comprehension", but otherwise, it's the same. It is also memory efficient. It doesn't create a list and then a tuple, so it doesn't allocate memory unnecessarily.

PerformancePermalink

I've got some bad news here. It is slower than list comprehension.

Here is the simple measurement for list comprehension that creates a list from 100 numbers and doubles each of them:

% python3 -m timeit -s 'numbers=list(range(100))' 'l = [i*2 for i in numbers]'
100000 loops, best of 5: 3.4 usec per loop

3.4 microseconds.

And this is the measurement when we create a tuple instead, using our generator:

 % python3 -m timeit -s 'numbers=list(range(100))' 't = tuple(i*2 for i in numbers)'
50000 loops, best of 5: 4.79 usec per loop

4.79 microseconds. So it's about 40% slower.

Let's try a version where we create a tuple by passing it a list comprehension:

% python3 -m timeit -s 'numbers=list(range(100))' 't = tuple([i*2 for i in numbers])'
100000 loops, best of 5: 3.6 usec per loop

3.6 microseconds. So this is still faster to execute than the version with a generator, and we still end up with a tuple. As mentioned before, it allocates memory twice, so be careful when working with large data.

Performance (or memory usage) is the price we need to pay if we want to use a tuple as an immutable list.

SummaryPermalink

  • There is no real tuple comprehension in Python.
  • We can get very close to it in terms of syntax if we use the tuple class and generator tuple(i for i in iterable).
  • But it is slower than list comprehensions.

You might also like

Join the newsletter

Subscribe to get new articles about Python, code and programming into your inbox!