Sequences and comprehensions
Download exercises zip
We can write elegant and compact code with sequences. First we will see how to scan sequences with iterators, and then how to build them with list comprehensions.
What to do
Unzip the exercises zip in a folder; you should obtain something like this:
sequences
sequences1.ipynb
sequences1-sol.ipynb
sequences2-chal.ipynb
jupman.py
WARNING: to correctly visualize the notebook, it MUST be in an unzipped folder !
open Jupyter Notebook from that folder. Two things should open, first a console and then a browser. The browser should show a file list: navigate the list and open the notebook
sequences1.ipynb
Go on reading the exercises file; sometimes you will find paragraphs marked Exercises which will ask you to write Python commands in the following cells.
Shortcut keys:
to execute Python code inside a Jupyter cell, press
Control + Enter
to execute Python code inside a Jupyter cell AND select next cell, press
Shift + Enter
to execute Python code inside a Jupyter cell AND create a new cell afterwards, press
Alt + Enter
If the notebooks look stuck, try to select
Kernel -> Restart
Iterables - lists
When dealing with loops we often talked about iterating sequences, but what exactly does it mean for a sequence to be iterable? Concretely, it means we can call the function iter
on that sequence.
Let’s try for example with familiar lists:
[2]:
iter(['a','b','c','d'])
[2]:
<list_iterator at 0x7f8e886ef8d0>
We notice Python just created an object of type list_iterator
.
NOTE: the list was not shown!
You can imagine an iterator as a sort of still machine which, each time it is activated, produces one element of the sequence.
Typically, an iterator only knows its position inside the sequence, and can provide us with the sequence elements one by one if we keep asking with calls to the function next
:
[3]:
iterator = iter(['a','b','c','d'])
[4]:
next(iterator)
[4]:
'a'
[5]:
next(iterator)
[5]:
'b'
[6]:
next(iterator)
[6]:
'c'
[7]:
next(iterator)
[7]:
'd'
Note how the iterator has a state to keep track of where it is in the sequence (in other words, it’s stateful). The state is changed at each call of function next
.
If we try asking for more elements than are available, Python raises the exception StopIteration
:
next(iterator)
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
<ipython-input-65-4518bd5da67f> in <module>()
----> 1 next(iterator)
StopIteration:
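As a minimal sketch of what a for loop does behind the scenes, we can drive the iterator by hand and stop when StopIteration arrives. The names letters, collected and leftover are our own, invented only for this illustration:

```python
letters = ['a', 'b', 'c', 'd']
iterator = iter(letters)      # ask the list for an iterator
collected = []                # hypothetical name, just for this example

while True:
    try:
        element = next(iterator)   # advances the iterator state by one step
    except StopIteration:
        break                      # the iterator is exhausted: leave the loop
    collected.append(element)

print(collected)

# next also accepts a default value, returned instead of raising StopIteration
leftover = next(iterator, 'nothing left')
print(leftover)
```

In everyday code a plain for loop is preferable; the manual version is only meant to show where StopIteration comes into play.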
V COMMANDMENT You shall never ever redefine next
and iter
system functions.
DO NOT use them as variables !!
Iterables - range
We iterated a list, which is a sequence completely materialized in memory, and scanned it with the iterator object. There are also other peculiar sequences which are not materialized in memory, like for example range
.
Previously we used range
in for loops to obtain a sequence of numbers, but what exactly is range
doing? Let’s try calling it on its own:
[8]:
range(4)
[8]:
range(0, 4)
Maybe we expected a sequence of numbers, instead, Python is showing us an object of type range
(shown with its lower limit made explicit).
NOTE: No number sequence is currently present in memory
We only have a ‘still’ iterable object, which if we want can provide us with numbers
How can we ask for numbers?
We’ve seen we can use a for
loop:
[9]:
for x in range(4):
print(x)
0
1
2
3
As an alternative, we can pass range
to the function iter
which produces an iterator.
WARNING: range
is iterable but it is NOT an iterator !!
To obtain the iterator we must call the iter
function on the range
object
[10]:
iterator = iter(range(4))
iter
also produces a ‘still’ object, which hasn’t materialized numbers in memory yet:
[11]:
iterator
[11]:
<range_iterator at 0x7f8e88783030>
In order to ask for the numbers we must use the function next
:
[12]:
next(iterator)
[12]:
0
[13]:
next(iterator)
[13]:
1
[14]:
next(iterator)
[14]:
2
[15]:
next(iterator)
[15]:
3
Note the iterator has a state, which is changed at each next
call to keep track of where it is in the sequence.
If we try asking for more elements than actually available, Python raises a StopIteration
exception:
next(iterator)
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
<ipython-input-65-4518bd5da67f> in <module>()
----> 1 next(iterator)
StopIteration:
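To see the difference between an iterable and an iterator more concretely, here is a small sketch (all variable names are ours). Calling next directly on a range fails, while each call to iter gives a fresh, independent iterator:

```python
numbers = range(4)

# a range is iterable, but it is not an iterator: next refuses it
try:
    next(numbers)
    accepted = True
except TypeError:
    accepted = False
print(accepted)

# every call to iter produces a new iterator with its own state
first = iter(numbers)
second = iter(numbers)

a = next(first)
b = next(first)
c = next(second)   # second starts from the beginning, regardless of first
print(a, b, c)
```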
Materializing a sequence
We said a range
object does not physically materialize in memory all the numbers at the same time. We can get them one by one by only using the iterator. What if we wanted a list with all the numbers? In the tutorial on lists we’ve seen that by passing a sequence to function list
, a new list is created with all the sequence elements. We talked generically about a sequence, but the more correct term would
have been iterable.
If we pass any iterable object to list
, then a new list will be built - we’ve seen range
is iterable so let’s try:
[16]:
list(range(4))
[16]:
[0, 1, 2, 3]
Voilà ! Now the sequence is all physically present in memory.
WARNING: list
consumes the iterator!
If you try calling twice list
on the same iterator, you will get an empty list:
[17]:
sequence = range(4)
iterator = iter(sequence)
[18]:
new1 = list(iterator)
[19]:
new1
[19]:
[0, 1, 2, 3]
[20]:
new2 = list(iterator)
[21]:
new2
[21]:
[]
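By contrast, the range object itself is not consumed. A short sketch, with names invented for the example, comparing the two situations:

```python
sequence = range(4)

# list reads the range through a fresh iterator each time
from_range1 = list(sequence)
from_range2 = list(sequence)
print(from_range1, from_range2)

# a single iterator, instead, can be scanned only once
iterator = iter(sequence)
from_iterator1 = list(iterator)
from_iterator2 = list(iterator)
print(from_iterator1, from_iterator2)
```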
What if we wanted to directly access a specific position in the sequence generated by the iterator? Let’s try extracting the element at index 2:
[22]:
sequence = range(4)
iterator = iter(sequence)
iterator[2]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-129-3c080cc9e700> in <module>()
1 sequence = range(4)
2 iterator = iter(sequence)
----> 3 iterator[2]
TypeError: 'range_iterator' object is not subscriptable
… sadly we get an error!
We are left with only two alternatives. Either:
a) we first convert to a list and then use square brackets, or
b) we call next 3 times (remember indexes start from zero)
Option a) very often looks handy, but careful: converting an iterator into a list creates a NEW list in memory. If the list is very big and/or this operation is repeated many times, you risk occupying memory for nothing.
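If we only need one element, a possible middle ground is itertools.islice from the standard library, which is not covered in this tutorial: it skips ahead in the iterator without building a list. The sketch below is just one way to do it, with variable names of our choosing. Note also that a range object, unlike its iterator, supports square brackets directly:

```python
from itertools import islice

sequence = range(4)
iterator = iter(sequence)

# skip the elements at index 0 and 1, then take the next one (index 2)
element = next(islice(iterator, 2, None))
print(element)

# the iterator has advanced: only what comes after index 2 remains
remaining = list(iterator)
print(remaining)

# the range itself can be indexed without creating any iterator
print(sequence[2])
```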
Let’s see the example in Python Tutor again:
[23]:
# WARNING: FOR PYTHON TUTOR TO WORK, REMEMBER TO EXECUTE THIS CELL with Shift+Enter
# (it's sufficient to execute it only once)
import jupman
[24]:
sequence = range(4)
iterator = iter(sequence)
new1 = list(iterator)
new2 = list(iterator)
jupman.pytut()
[24]:
QUESTION: Which object occupies more memory? a
or b
?
a = range(10)
b = range(10000000)
QUESTION: Which object occupies more memory? a
or b
?
a = list(range(10))
b = list(range(10000000))
Questions - range
Look at the following expressions, and for each try guessing the result (or if it gives an error):
range(3)
range()
list(range(-3))
range(3,6)
list(range(5,4))
list(range(3,3))
range(3) + range(6)
list(range(3)) + list(range(6))
list(range(0,6,2))
list(range(9,6,-1))
reversed
reversed
is a function which takes a sequence as parameter and PRODUCES a NEW iterator which lets us run through the sequence in reverse order.
WARNING: by calling reversed
we directly obtain an iterator !
So you do not need to make further calls to iter
as done with range
!
Let’s have a better look with an example:
[25]:
la = ['s','c','a','n']
[26]:
reversed(la)
[26]:
<list_reverseiterator at 0x7f8e886ad9d0>
We see reversed
has produced an iterator as result (not a reversed list)
INFO: iterators occupy a small amount of memory
Creating an iterator from a sequence only creates a sort of pointer, it does not create new memory regions.
Furthermore, we see the original list associated to la
was not changed:
[27]:
print(la)
['s', 'c', 'a', 'n']
WARNING: the function reversed
is different from the reverse method
Note the final d! If we tried to call it as a method we would get an error:
>>> la.reversed()
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-182-c8d1eec57fdd> in <module>
----> 1 la.reversed()
AttributeError: 'list' object has no attribute 'reversed'
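The two differ in behaviour as well as in name. A minimal sketch (the variable names are ours): the function reversed leaves the list alone and hands back an iterator, while the method reverse rearranges the list in place and returns None.

```python
la = ['s', 'c', 'a', 'n']

backwards = list(reversed(la))   # function: la is not modified
print(backwards)
print(la)
la_before = list(la)             # copy, kept only to compare later

returned = la.reverse()          # method: modifies la, returns None
print(returned)
print(la)
```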
Iterating with next
How can we obtain a reversed list in memory? In other words, how can we operate the iterator machine?
We can ask the iterator for one element at a time with the function next
:
[28]:
la = ['a','b','c']
[29]:
iterator = reversed(la)
[30]:
next(iterator)
[30]:
'c'
[31]:
next(iterator)
[31]:
'b'
[32]:
next(iterator)
[32]:
'a'
Once the iterator is exhausted, by calling next
again we will get an error:
next(iterator)
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
<ipython-input-248-4518bd5da67f> in <module>
----> 1 next(iterator)
StopIteration:
Let’s try manually creating a destination list lb
and adding elements we obtain one by one:
[33]:
la = ['a','b','c']
iterator = reversed(la)
lb = []
lb.append(next(iterator))
lb.append(next(iterator))
lb.append(next(iterator))
print(lb)
jupman.pytut()
['c', 'b', 'a']
[33]:
Exercise - sconcerto
Write some code which given a list of characters la
, puts in a list lb
all the characters at odd position taken from reversed list la
.
use
reversed
and next
DO NOT modify
la
DO NOT use negative indexes
DO NOT use
list
Example - given:
# 8 7 6 5 4 3 2 1 0
la = ['s', 'c', 'o', 'n', 'c', 'e', 'r', 't', 'o']
lb = []
After your code it must show:
>>> print(lb)
['t', 'e', 'n', 'c']
>>> print(la)
['s', 'c', 'o', 'n', 'c', 'e', 'r', 't', 'o']
We invite you to solve the problem in several ways:
WAY 1 - without a loop: Suppose the list length is fixed, and repeatedly call next
without using a loop
WAY 2 - while: Suppose you have a list of arbitrary length, and try generalizing the previous code by using a while
loop, and calling next
inside
HINT 1: keep track of the position in which you are with a counter
i
HINT 2: you cannot call
len
on an iterator, so in the while
condition you will have to use the original list length
WAY 3 - for: this is the most elegant way. Suppose you have a list of arbitrary length and use a loop like for x in reversed(la)
HINT: you will still need to keep track of the position in which you are with an
i
counter
[34]:
# WAY 1: MANUAL
# 8 7 6 5 4 3 2 1 0
la = ['s', 'c', 'o', 'n', 'c', 'e', 'r', 't', 'o']
lb = []
# write here
[35]:
# WAY 2: WHILE
# 8 7 6 5 4 3 2 1 0
la = ['s', 'c', 'o', 'n', 'c', 'e', 'r', 't', 'o']
lb = []
# write here
[36]:
# WAY 3: for
# 8 7 6 5 4 3 2 1 0
la = ['s', 'c', 'o', 'n', 'c', 'e', 'r', 't', 'o']
lb = []
# write here
Materializing an iterator
Luckily enough, we can obtain a list from an iterator with a less laborious method.
We’ve seen that when we want to create a new list from a sequence, we can use list
as if it were a function. We can also do it in this case, interpreting the iterator as if it were a sequence:
[37]:
la = ['s', 'c', 'a', 'n']
list( reversed(la) )
[37]:
['n', 'a', 'c', 's']
Notice we generated a NEW list, the original one associated to la
is always the same:
[38]:
la
[38]:
['s', 'c', 'a', 'n']
Let’s see what happens using Python Tutor (we created some extra variables to highlight the relevant steps):
[39]:
la = ['s', 'c', 'a', 'n']
iterator = reversed(la)
new = list(iterator)
print("la is",la)
print("new is",new)
jupman.pytut()
la is ['s', 'c', 'a', 'n']
new is ['n', 'a', 'c', 's']
[39]:
QUESTION Which effect is the following code producing?
la = ['b','r','i','d','g','e']
lb = list(reversed(reversed(la)))
sorted
The function sorted
takes as parameter a sequence and returns a NEW sorted list.
WARNING: sorted
returns a LIST, not an iterator!
[40]:
sorted(['g','a','e','d','b'])
[40]:
['a', 'b', 'd', 'e', 'g']
WARNING: sorted
is a function, different from the sort method!
Note the final ed! If we tried to call it as a method we would get an error:
>>> la.sorted()
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-182-c8d1eec57fdd> in <module>
----> 1 la.sorted()
AttributeError: 'list' object has no attribute 'sorted'
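The difference is not only in the name. A small sketch with names of our choosing: the function sorted builds a new list from any iterable, while the method sort reorders the list in place and returns None.

```python
la = ['g', 'a', 'e', 'd', 'b']

lb = sorted(la)          # new list, la is left untouched
print(lb)
print(la)

# sorted accepts any iterable, for example a string
from_string = sorted('gaedb')
print(from_string)

returned = la.sort()     # in place: la changes, None is returned
print(returned)
print(la)
```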
Exercise - reversort
✪ Given a list of names, write some code to produce a list sorted in reverse
There are at least a couple of ways to do it in a single line of code, find them both
INPUT:
['Maria','Paolo','Giovanni','Alessia','Greta']
OUTPUT:
['Paolo', 'Maria', 'Greta', 'Giovanni', 'Alessia']
[41]:
# write here
zip
Suppose we have two lists paintings
and years
, with respectively the names of famous paintings and the years in which they were painted:
[42]:
paintings = ["The Mona Lisa", "The Birth of Venus", "Sunflowers"]
years = [1503, 1482, 1888]
We want to produce a new list which contains some tuples which associate each painting with the year it was made:
[('The Mona Lisa', 1503),
('The Birth of Venus', 1482),
('Sunflowers', 1888)]
There are various ways to do it but certainly the most elegant is by using the function zip
which produces an iterator:
[43]:
zip(paintings, years)
[43]:
<zip at 0x7f8e88550c80>
Even if the word ‘iterator’ does not appear in the object name, we can still use it as such with next
:
[44]:
iterator = zip(paintings, years)
next(iterator)
[44]:
('The Mona Lisa', 1503)
[45]:
next(iterator)
[45]:
('The Birth of Venus', 1482)
[46]:
next(iterator)
[46]:
('Sunflowers', 1888)
As done previously, we can convert everything to a list with list
:
[47]:
paintings = ["The Mona Lisa", "The Birth of Venus", "Sunflowers"]
years = [1503, 1482, 1888]
list(zip(paintings,years))
[47]:
[('The Mona Lisa', 1503), ('The Birth of Venus', 1482), ('Sunflowers', 1888)]
If the lists have different lengths, the sequence produced by zip
will be as long as the shortest input sequence:
[48]:
list(zip([1,2,3], ['a','b','c','d','e']))
[48]:
[(1, 'a'), (2, 'b'), (3, 'c')]
If we want, we can pass an arbitrary number of sequences - for example, by passing three of them we will obtain triplets of values:
[49]:
songs = ['Imagine', 'Hey Jude', 'Satisfaction', 'Yesterday' ]
authors = ['John Lennon','The Beatles', 'The Rolling Stones', 'The Beatles']
years = [1971, 1968, 1965, 1965]
list(zip(songs, authors, years))
[49]:
[('Imagine', 'John Lennon', 1971),
('Hey Jude', 'The Beatles', 1968),
('Satisfaction', 'The Rolling Stones', 1965),
('Yesterday', 'The Beatles', 1965)]
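Since zip produces its tuples one at a time, it combines well with a for loop. As a sketch (the variable names title, year and captions are our own), we can unpack each tuple directly in the loop header:

```python
paintings = ["The Mona Lisa", "The Birth of Venus", "Sunflowers"]
years = [1503, 1482, 1888]

captions = []
for title, year in zip(paintings, years):   # each tuple is unpacked into two variables
    captions.append(title + ' (' + str(year) + ')')

print(captions)
```

This way no intermediate list of tuples is ever built: the pairs are consumed as soon as zip produces them.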
Exercise - ladder
Given a number n
, create a list of tuples that for each integer number \(x\) such that \(0 \leq x < n\) associates the number \(n - 1 - x\)
INPUT:
n=5
OUTPUT:
[(0, 4), (1, 3), (2, 2), (3, 1), (4, 0)]
[50]:
n = 5
# write here
List comprehensions
List comprehensions are handy when you need to generate a NEW list by executing the same operation on all the elements of a sequence. Comprehensions start and end with square brackets [
]
so their syntax resembles that of lists, but inside they contain a special for
to loop inside a sequence:
[51]:
numbers = [2,5,3,4]
doubled = [x*2 for x in numbers]
doubled
[51]:
[4, 10, 6, 8]
Note the variable numbers
is still associated to the original list:
[52]:
numbers
[52]:
[2, 5, 3, 4]
What happened ? We wrote the name of a variable x
we just invented, and we told Python to go through the list numbers
: at each iteration, the variable x
is associated to a different value of the list numbers
. This value can be reused in the expression we wrote on the left of the for
, which in this case is x*2
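To make the mechanism concrete, the comprehension can be read as shorthand for a loop that appends to an empty list. This is only a sketch of the equivalence, and doubled_loop is a name we made up for the comparison:

```python
numbers = [2, 5, 3, 4]

# explicit version: create an empty list and fill it one element at a time
doubled_loop = []
for x in numbers:
    doubled_loop.append(x * 2)

# compact version: same result with a list comprehension
doubled = [x * 2 for x in numbers]

print(doubled_loop)
print(doubled)
```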
As the name of the variable we used x
, but we could have used any other name, for example this code is equivalent to the previous one:
[53]:
numbers = [2,5,3,4]
doubled = [number * 2 for number in numbers]
doubled
[53]:
[4, 10, 6, 8]
On the left of the for
we can write any expression which produces a value, for example here we write x + 1
to increment all the numbers of the original list:
[54]:
numbers = [2,5,3,4]
augmented = [x + 1 for x in numbers]
augmented
[54]:
[3, 6, 4, 5]
QUESTION: What is this code going to produce? If we visualize it in Python Tutor, will la
and lb
point to different objects?
la = [7,5,6,9]
lb = [x for x in la]
[55]:
la = [7,5,6,9]
lb = [x for x in la]
jupman.pytut()
[55]:
List comprehensions on strings
QUESTION: What is this code going to produce?
[x for x in 'question']
Let’s now suppose we have a list of animals
and we want to produce another one with the same names in uppercase. We can do it in a compact way with a list comprehension like this:
[56]:
animals = ['dogs', 'cats', 'squirrels', 'elks']
new_list = [animal.upper() for animal in animals]
[57]:
new_list
[57]:
['DOGS', 'CATS', 'SQUIRRELS', 'ELKS']
In the left part, reserved for the expression, we used the method .upper()
on the string variable animal
. We know strings are immutable, so we’re sure the method call produces a NEW string. Let’s see what happened with Python Tutor:
[58]:
animals = ['dogs', 'cats', 'squirrels', 'elks']
new_list = [animal.upper() for animal in animals]
jupman.pytut()
[58]:
✪ EXERCISE: Try writing here a list comprehension to put all characters as lowercase (.lower()
method)
[59]:
animals = ['doGS', 'caTS', 'SQUIrreLs', 'ELks']
# write here
Questions - List comprehensions
Look at the following code fragments, and for each try guessing the result it produces (or if it gives an error):
[x for [4,2,5]]
x for x in range(3)
[x for y in 'cartoccio']
[for x in 'zappa']
[for [3,4,5]]
[k + 1 for k in 'bozza']
[k + 1 for k in range(5)]
[k > 3 for k in range(7)]
[s + s for s in ['lam','pa','da']]
la = ['x','z','z']; [x for x in la] + [y for y in la]
[x.split('-') for x in ['a-b', 'c-d', 'e-f']]
['@'.join(x) for x in [['a','b.com'],['c','d.org'],['e','f.net'] ]]
['z' for y in 'borgo'].count('z') == len('borgo')
m = [['a','b'],['c','d'],['e','f'] ]
la = [x.pop() for x in m] # not advisable - why ?
print(' m:', m)
print('la:',la)
Exercises - list comprehension
Exercise - Bubble bubble
✪ Given a list of strings, produce a sequence with all the strings replicated 4 times
INPUT:
['chewing','gum','bubble']
OUTPUT:
['chewingchewingchewingchewing', 'gumgumgumgum', 'bubblebubblebubblebubble']
[60]:
import math
bubble_bubble = ['chewing','gum','bubble']
# write here
Exercise - root
✪ Given a list of numbers, produce a list with the square root of the input numbers
INPUT:
[16,25,81]
OUTPUT:
[4.0, 5.0, 9.0]
[61]:
import math
# write here
Exercise - When The Telephone Rings
✪ Given a list of strings, produce a list with the first characters of each string
INPUT:
['When','The','Telephone','Rings']
OUTPUT:
['W', 'T', 'T', 'R']
[62]:
# write here
Exercise - don’t worry
✪ Given a list of strings, produce a list with the lengths of all the strings
INPUT:
["don't", 'worry','and', 'be','happy']
OUTPUT:
[5, 5, 3, 2, 5]
[63]:
# write here
Exercise - greater than 3
✪ Given a list of numbers, produce a list with True
if the corresponding element is greater than 3
, False
otherwise
INPUT:
[4,1,0,5,0,9,1]
OUTPUT:
[True, False, False, True, False, True, False]
[64]:
# write here
Exercise - even
✪ Given a list of numbers, produce a list with True
if the corresponding element is even, False otherwise
INPUT:
[3,2,4,1,5,3,2,9]
OUTPUT:
[False, True, True, False, False, False, True, False]
[65]:
# write here
Exercise - both ends
✪ Given a list of strings having at least two characters each, produce a list of strings with the first and last characters of each
INPUT:
['departing', 'for', 'the', 'battlefront']
OUTPUT:
['dg', 'fr', 'te', 'bt']
[66]:
# write here
Exercise - dashes
✪ Given a list of lists of characters, produce a list of strings with characters separated by dashes
INPUT:
[['a','b'],['c','d','e'], ['f','g']]
OUTPUT:
['a-b', 'c-d-e', 'f-g']
[67]:
# write here
Exercise - lollosa
✪ Given a string s
, produce a list of tuples having for each character the number of occurrences of that character in the string
INPUT:
s = 'lollosa'
OUTPUT:
[('l', 3), ('o', 2), ('l', 3), ('l', 3), ('o', 2), ('s', 1), ('a', 1)]
[68]:
s = 'lollosa'
# write here
Exercise - dog cat
✪ Given a list of strings of at least two characters each, produce a list with the strings without initial and final characters
INPUT:
['donkey','eagle','ox', 'dog' ]
OUTPUT:
['onke', 'agl', '', 'o']
[69]:
# write here
Exercise - smurfs
✪ Given some names produce a list with the names sorted alphabetically and all in uppercase
INPUT:
['Brainy', 'Hefty', 'Smurfette', 'Clumsy']
OUTPUT:
['BRAINY', 'CLUMSY', 'HEFTY', 'SMURFETTE']
[70]:
# write here
Exercise - precious metals
✪ Given two lists values
and metals
produce a list containing all the value-metal pairs as tuples
INPUT:
values = [10,25,50]
metals = ['silver','gold','platinum']
OUTPUT: [(10, 'silver'), (25, 'gold'), (50, 'platinum')]
[71]:
values = [10,25,50]
metals = ['silver','gold','platinum']
# write here
Filtered list comprehensions
During the construction of a list comprehension we can filter the elements taken from the sequence by using an if
. For example, the following expression takes from the sequence only numbers greater than 5
:
[72]:
[x for x in [7,4,8,2,9] if x > 5]
[72]:
[7, 8, 9]
After the if
we can put any expression which reuses the variable on which we are iterating, for example if we are iterating a string we can keep only the uppercase characters:
[73]:
[x for x in 'The World Goes Round' if x.isupper()]
[73]:
['T', 'W', 'G', 'R']
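A filtered comprehension can be read as a loop with an if inside. A minimal sketch, where selected_loop is a hypothetical name used only for the comparison:

```python
numbers = [7, 4, 8, 2, 9]

# explicit version: only elements passing the test are appended
selected_loop = []
for x in numbers:
    if x > 5:
        selected_loop.append(x)

# compact version: the if goes at the end of the comprehension
selected = [x for x in numbers if x > 5]

print(selected_loop)
print(selected)
```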
WARNING: else
is not supported
For example, writing this generates an error:
[x for x in [7,4,8,2,9] if x > 5 else x + 1] # WRONG!
File "<ipython-input-74-9ba5c135c58c>", line 1
[x for x in [7,4,8,2,9] if x > 5 else x + 1]
^
SyntaxError: invalid syntax
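Note the restriction concerns the filter at the end. If what we want is to transform elements differently depending on a condition, without discarding any, one option is a conditional expression placed to the left of the for. This is a different construct from the filter, shown here only as a sketch with a variable name of our own:

```python
numbers = [7, 4, 8, 2, 9]

# conditional expression on the left: every element is kept,
# numbers greater than 5 stay as they are, the others are incremented
adjusted = [x if x > 5 else x + 1 for x in numbers]
print(adjusted)
```

The resulting list always has the same length as the original one, because nothing is filtered out.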
Questions - filtered list comprehensions
Look at the following code fragments, and for each try guessing the result it produces (or if it gives an error):
[x for x in range(100) if False]
[x for x in range(3) if True]
[x for x in range(6) if x > 3 else 55]
[x for x in range(6) if x % 2 == 0]
[x for x in {'a','b','c'}] # careful about ordering
[x for x in [[5], [2,3], [4,2,3], [4]] if len(x) > 2]
[(x,x) for x in 'xyxyxxy' if x != 'x' ]
[x for x in ['abCdEFg'] if x.upper() == x]
la = [1,2,3,4,5]; [x for x in la if x > la[len(la)//2]]
Exercises - filtered list comprehensions
Exercise - savannah
Given a list of strings, produce a list with only the strings of length greater than 6:
INPUT:
['zebra', 'leopard', 'giraffe', 'gnu', 'rhinoceros', 'lion']
OUTPUT:
['leopard', 'giraffe', 'rhinoceros']
[74]:
# write here
Exercise - puZZled
Given a list of strings, produce a list with only the strings which contain at least one 'z'
. The selected strings must be transformed so as to put each z
in uppercase.
INPUT:
['puzzled', 'park','Aztec', 'run', 'mask', 'zodiac']
OUTPUT:
['puZZled', 'AZtec', 'Zodiac']
[75]:
[x.replace('z','Z') for x in ['puzzled', 'park','Aztec', 'run', 'mask', 'zodiac'] if 'z' in x]
Exercise - Data science
Produce a string with the words of the input string alternated uppercase / lowercase
INPUT:
[76]:
phrase = """Data science is an interdisciplinary field
that uses scientific methods, processes, algorithms and systems
to extract knowledge and insights from noisy, structured
and unstructured data, and apply knowledge and actionable insights
from data across a broad range of application domains."""
OUTPUT (only one line):
DATA science IS an INTERDISCIPLINARY field THAT uses SCIENTIFIC methods, PROCESSES, algorithms AND systems TO extract KNOWLEDGE and INSIGHTS from NOISY, structured AND unstructured DATA, and APPLY knowledge AND actionable INSIGHTS from DATA across A broad RANGE of APPLICATION domains.
✪✪✪ WRITE ONLY ONE code line
✪✪✪✪ USE ONLY ONE list comprehension
[77]:
phrase = """Data science is an interdisciplinary field
that uses scientific methods, processes, algorithms and systems
to extract knowledge and insights from noisy, structured
and unstructured data, and apply knowledge and actionable insights
from data across a broad range of application domains."""
# write here
Continue
Go on with the challenges