Lists 4 - Search methods
Download exercises zip
Lists offer several different methods to perform searches and transformations inside them, but beware: the power is nothing without control! Sometimes you might feel the need to use them, but very often they hide traps you will later regret. So whenever you write code with one of these methods, always ask yourself the questions we will stress.
Method | Returns | Description
---|---|---
str1.split(str2) | list | Produces a list with all the words in str1 separated by str2
list.count(obj) | int | Counts the occurrences of an element
list.index(obj) | int | Searches for the first occurrence of an element and returns its position
list.remove(obj) | None | Removes the first occurrence of an element
What to do
Unzip the exercises zip in a folder; you should obtain something like this:
lists
lists1.ipynb
lists1-sol.ipynb
lists2.ipynb
lists2-sol.ipynb
lists3.ipynb
lists3-sol.ipynb
lists4.ipynb
lists4-sol.ipynb
lists5-chal.ipynb
jupman.py
WARNING: to correctly visualize the notebook, it MUST be in an unzipped folder!
Open Jupyter Notebook from that folder. Two things should open, first a console and then a browser. The browser should show a file list: navigate the list and open the notebook lists4.ipynb
Go on reading the exercises file; sometimes you will find paragraphs marked Exercises which will ask you to write Python commands in the following cells.
Shortcut keys:
to execute Python code inside a Jupyter cell, press Control + Enter
to execute Python code inside a Jupyter cell AND select the next cell, press Shift + Enter
to execute Python code inside a Jupyter cell AND create a new cell afterwards, press Alt + Enter
If the notebooks look stuck, try to select Kernel -> Restart
split method - from strings to lists
The split method of strings must be called on a string, and a separator must be passed as parameter, which can be a single character or a substring. The result is a list of strings without the separator.
[2]:
"Finally the pirates shared the treasure".split("the")
[2]:
['Finally ', ' pirates shared ', ' treasure']
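In practice this method is the inverse of the join method we’ve already seen: join glues the strings of a list together using a separator, while split cuts a string into a list at each occurrence of the separator. Note that both are methods of strings: join is called on the separator string and receives the list as parameter, while split is called on the string to cut.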
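As a small sketch of this relationship (the variable names are only illustrative), splitting on a separator and then joining the pieces with the same separator should give back the original string:

```python
sep = "the"
original = "Finally the pirates shared the treasure"

pieces = original.split(sep)   # list of strings, separator removed
rebuilt = sep.join(pieces)     # join is called on the separator string

print(pieces)                  # ['Finally ', ' pirates shared ', ' treasure']
print(rebuilt == original)     # True
```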
By calling split without arguments, generic blanks are used as separators (space, newline \n, tab \t, etc.):
[3]:
s = "Finally the\npirates\tshared the treasure"
print(s)
Finally the
pirates shared the treasure
[4]:
s.split()
[4]:
['Finally', 'the', 'pirates', 'shared', 'the', 'treasure']
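Calling split without arguments is not the same as passing a single space. The following sketch, on a made-up string, suggests the difference: without arguments, runs of blanks count as one separator, while with ' ' every single space is a cut point and tabs are left alone.

```python
text = "a  b \tc"          # two spaces between a and b, a space and a tab before c

print(text.split())        # ['a', 'b', 'c']
print(text.split(' '))     # ['a', '', 'b', '\tc']
```

Notice the empty string produced between the two consecutive spaces.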
It’s also possible to limit the number of splits performed by specifying the parameter maxsplit (the result will then hold at most maxsplit + 1 elements):
[5]:
s.split(maxsplit=2)
[5]:
['Finally', 'the', 'pirates\tshared the treasure']
WARNING: What happens if the string does not contain the separator? Remember to also consider this case!
[6]:
"I talk and overtalk and I never ever take a break".split(',')
[6]:
['I talk and overtalk and I never ever take a break']
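One possible way to handle this case, shown here only as a sketch with a made-up string and an arbitrary fallback value, is to check the length of the result before reading the second piece:

```python
line = "no commas here"
parts = line.split(',')

# When the separator is absent, split returns a list holding the whole string
# as its only element, so parts[1] would raise an IndexError.
if len(parts) > 1:
    second = parts[1]
else:
    second = ""      # fallback value chosen only for this sketch

print(len(parts))    # 1
print(repr(second))  # ''
```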
QUESTION: Look at this code. Will it print something? Or will it produce an error?
"revolving\tdoor".split()
"take great\t\ncare".split()
"do not\tforget\nabout\tme".split('\t')
"non ti scordar\ndi\tme".split(' ')
"The Guardian of the Abyss stared at us".split('abyss')[1]
"".split('abyss')[0]
"abyss_OOOO_abyss".split('abyss')[0]
Exercise - trash dance
You’ve been hired to dance in the latest video of the notorious band Melodic Trash. You can’t miss this golden opportunity. Excited, you start reading the score, but you find a lot of errors - of course the band doesn’t need to know how to write scores to get TV time. There are strange symbols, and the last bar (the one after the sixth) is too long and needs to be split, one word per row. Write some code which fixes the score and stores it in a list dance.
DO NOT write string constants from the input in your code (so no "Ra Ta Pam" …)
Example - given:
music = "Zam Dam\tZa Bum Bum\tZam\tBam To Tum\tRa Ta Pam\tBar Ra\tRammaGumma Unza\n\t\nTACAUACA \n BOOMBOOM!"
after your code it must result:
>>> print(dance)
['Zam Dam',
'Za Bum Bum',
'Zam',
'Bam To Tum',
'Ra Ta Pam',
'Bar Ra',
'RammaGumma',
'Unza',
'TACAUACA',
'BOOMBOOM!']
[7]:
music = "Zam Dam\tZa Bum Bum\tZam\tBam To Tum\tRa Ta Pam\tBar Ra\tRammaGumma Unza\n\t\nTACAUACA \n BOOMBOOM!"
# write here
Exercise - Trash in tour
The Melodic Trash band strikes again! In a new tour they present the summer hits. The record company only provides the sales numbers in Anglo-Saxon format, so before communicating them to Italian media we need a conversion.
Write some code which, given the hits and a position in the hit parade (from 1 to 4), prints the sales number.
NOTE: commas must be substituted with dots
Example - given:
hits = """6,230,650 - I love you like the moldy tomatoes in the fridge
2,000,123 - The pain of living filthy rich
100,000 - Groupies are never enough
837 - Do you remember the trashcans in the summer..."""
position = 1 # the tomatoes
#position = 4 # the trashcans
Prints:
Number 1 in hit parade "I love you like the moldy tomatoes in the fridge" sold 6.230.650 copies
[8]:
hits = """6,230,650 - I love you like the moldy tomatoes in the fridge
2,000,123 - The pain of living filthy rich
100,000 - Groupies are never enough
837 - Do you remember the trashcans in the summer..."""
position = 1 # the tomatoes
#position = 4 # the trashcans
# write here
Exercise - manylines
Given the following string of text:
"""This is a string
of text on
several lines which tells nothing."""
print it
print how many lines, words and characters it contains
sort the words in alphabetical order and print the first and last ones in lexicographical order
You should obtain:
This is a string
of text on
several lines which tells nothing.
Lines: 3 words: 12 chars: 62
['T', 'h', 'i', 's', ' ', 'i', 's', ' ', 'a', ' ', 's', 't', 'r', 'i', 'n', 'g', '\n', 'o', 'f', ' ', 't', 'e', 'x', 't', ' ', 'o', 'n', '\n', 's', 'e', 'v', 'e', 'r', 'a', 'l', ' ', 'l', 'i', 'n', 'e', 's', ' ', 'w', 'h', 'i', 'c', 'h', ' ', 't', 'e', 'l', 'l', 's', ' ', 'n', 'o', 't', 'h', 'i', 'n', 'g', '.']
62
First word: This
Last word : which
['This', 'a', 'is', 'lines', 'nothing.', 'of', 'on', 'several', 'string', 'tells', 'text', 'which']
[9]:
s = """This is a string
of text on
several lines which tells nothing."""
# write here
Exercise - takechars
✪ Given a phrase which contains exactly 3 words and always has a number \(n\) as its central word, write some code which PRINTS the first \(n\) characters of the third word.
Example - given:
phrase = "Take 4 letters"
your code must print:
lett
[10]:
phrase = "Take 4 letters" # lett
#phrase= "Getting 5 caratters" # carat
#phrase= "Take 10 characters" # characters
# write here
count method
We can find the number of occurrences of a certain element in a list by using the method count:
[11]:
la = ['a', 'n', 'a', 'c', 'o', 'n', 'd', 'a']
[12]:
la.count('n')
[12]:
2
[13]:
la.count('a')
[13]:
3
[14]:
la.count('d')
[14]:
1
Do not abuse count
WARNING: count is often used in wrong / inefficient ways
Always ask yourself:
Could the list contain duplicates? Remember they will get counted!
Could the list contain no duplicates? Remember to also handle this case!
count performs a search on the whole list, which could be inefficient: is it really needed, or do we already know the interval where to search?
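To make these questions concrete, here is a minimal sketch reusing the list from the examples above. It assumes we are only interested in the first three cells, and uses a slice to restrict the search (keep in mind that a slice creates a new list):

```python
la = ['a', 'n', 'a', 'c', 'o', 'n', 'd', 'a']

print(la.count('z'))       # 0  (a missing element gives 0, not an error)
print(la.count('a'))       # 3  (the whole list is scanned)
print(la[:3].count('a'))   # 2  (only the first three cells: 'a', 'n', 'a')
```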
QUESTION: Look at the following code fragments, and for each of them try guessing the result (or if it produces an error)
['A','aa','a','aaAah',"a", "aaaa"[1], " a "].count("a")

["the", "punishment", "of", "the","fools"].count('Fools') == 1

lst = ['oasis','date','oasis','coconut','date','coconut']
print(lst.count('date') == 1)

lst = ['oasis','date','oasis','coconut','date','coconut']
print(lst[4] == 'date')

['2',2,"2",2,float("2"),2.0, 4/2,"1+1",int('3')-float('1')].count(2)

[].count([])

[[],[],[]].count([])
Exercise - country life
Given a list country, write some code which prints True if the number of elements el1 in the first half is equal to the number of elements el2 in the second half.
[15]:
el1,el2 = 'shovels', 'hoes' # True
#el1,el2 = 'shovels', 'shovels' # False
#el1,el2 = 'wheelbarrows', 'plows' # True
#el1,el2 = 'shovels', 'wheelbarrows' # False
country = ['plows','wheelbarrows', 'shovels', 'wheelbarrows', 'shovels','hoes', 'wheelbarrows',
'hoes', 'plows', 'wheelbarrows', 'plows', 'shovels','plows','hoes']
# write here
index method
The index method allows us to find the index of the FIRST occurrence of an element.
[16]:
#      0   1   2   3   4
la = ['p','a','e','s','e']
[17]:
la.index('p')
[17]:
0
[18]:
la.index('a')
[18]:
1
[19]:
la.index('e') # we find the FIRST occurrence
[19]:
2
If the element we’re looking for is not present, we will get an error:
>>> la.index('z')
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-303-32d9c064ebe0> in <module>
----> 1 la.index('z')
ValueError: 'z' is not in list
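If the element might be missing, the error can be intercepted. The following is only a sketch: position_or_default is a hypothetical helper written for this example (not a built-in), and -1 is an arbitrary value chosen to signal "not found".

```python
la = ['p', 'a', 'e', 's', 'e']

def position_or_default(lst, element, default=-1):
    """Hypothetical helper: index of the first occurrence, or default if absent."""
    try:
        return lst.index(element)
    except ValueError:
        return default

print(position_or_default(la, 'e'))   # 2
print(position_or_default(la, 'z'))   # -1
```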
Optionally, you can specify an index to start from (included):
[20]:
# 0   1   2   3   4   5   6   7   8   9   10
['a','c','c','a','p','a','r','r','a','r','e'].index('a',6)
[20]:
8
And also where to end (excluded):
# 0   1   2   3   4   5   6   7   8   9   10
['a','c','c','a','p','a','r','r','a','r','e'].index('a',6,8)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-17-7f344c26b62e> in <module>
1 # 0 1 2 3 4 5 6 7 8 9 10
----> 2 ['a','c','c','a','p','a','r','r','a','r','e'].index('a',6,8)
ValueError: 'a' is not in list
Do not abuse index
WARNING: index is often used in wrong / inefficient ways
Always ask yourself:
Could the list contain duplicates? Remember only the first will be found!
Could the list not contain the searched element? Remember to also handle this case!
index performs a search on the whole list, which could be inefficient: is it really needed, or do we already know the interval where to search?
If we want to know whether an element is in a position we already know, index is useless: it’s enough to write my_list[3] == element. If you used index, it could find duplicate elements which are before or after the one we are interested in!
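A short sketch of this last point, on a made-up list with duplicates: the direct comparison answers the question about position 3, while index stops at an earlier duplicate and gives a misleading answer.

```python
my_list = ['e', 'p', 'e', 'e', 's']
element = 'e'

# Question: is there an 'e' at position 3?
print(my_list[3] == element)          # True: direct check, no search needed
print(my_list.index(element) == 3)    # False: index stops at the duplicate in position 0
```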
QUESTION: Look at the following code fragments, and for each one try guessing the result it produces (or if it gives error).
['arc','boat','hollow','dune'].index('hollow') == ['arc','boat','hollow','dune'].index('hollow',1)

['azure','blue','sky blue','smurfs'][-1:].index('sky blue')

road = ['asphalt','bitumen','cement','gravel']
print('mortar' in road or road.index('mortar'))

road = ['asphalt','bitumen','cement','gravel']
print('mortar' in road and road.index('mortar'))

road = ['asphalt','bitumen','mortar','gravel']
print('mortar' in road and road.index('mortar'))

la = [0,5,10]
la.reverse()
print(la.index(5) > la.index(10))
Exercise - Spatoč
In the past you met the Slavic painter Spatoč when he was still dirt poor. He gifted you with 2 or 3 paintings (you don’t remember) of dubious artistic value that you hid in the attic, but now, watching TV, you just noticed that Spatoč has gained international fame. You run to the attic to retrieve the paintings, which are lost among junk. Every painting is contained in a [ ] box, but you don’t know in which rack it is. Write some code which prints where they are.
racks are numbered from 1. If the third painting was not found, print 0.
DO NOT use loops nor if
HINT: printing the first two is easy - to print the last one have a look at Booleans - evaluation order
Example 1 - given:
[21]:
#        1  2     3             4      5
attic = [3, '\\', ['painting'], '---', ['painting'],
#        6     7           8        9             10
         5.23, ['shovel'], ['ski'], ["painting"], ['lamp']]
prints:
rack of first painting : 3
rack of second painting: 5
rack of third painting : 9
Example 2 - given:
[22]:
#        1            2    3       4            5          6          7
attic = [['painting'],'--',['ski'],['painting'],['statue'],['shovel'],['boots']]
prints
rack of first painting : 1
rack of second painting: 4
rack of third painting : 0
[23]:
#        1 2    3            4     5            6    7          8       9             10
attic = [3,'\\',['painting'],'---',['painting'],5.23,['shovel'],['ski'],['painting'], ['lamp']]
# 3,5,9
#         1            2    3       4            5          6          7
#attic = [['painting'],'--',['ski'],['painting'],['statue'],['shovel'],['boots']]
# 1,4,0
# write here
remove method
remove takes an object as parameter, searches for the FIRST cell containing that object and eliminates it:
[24]:
#     0 1 2 3 4 5
la = [6,7,9,5,9,8]    # 9 is found both at index 2 and at index 4
[25]:
la.remove(9) # searches first cell containing 9
[26]:
la
[26]:
[6, 7, 5, 9, 8]
As you can see, the cell which was at index 2 and that contained the FIRST occurrence of 9 has been eliminated. The cell containing the SECOND occurrence of 9 is still there.
If you try removing an object which is not present, you will receive an error:
la.remove(666)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-121-5d04a71f9d33> in <module>
----> 1 la.remove(666)
ValueError: list.remove(x): x not in list
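A simple way to avoid the error, sketched here under the assumption that a missing element should just be ignored, is to test membership before removing (note this scans the list twice, once for in and once for remove):

```python
la = [6, 7, 5, 9, 8]
unwanted = 666

if unwanted in la:       # membership test first: remove is never called on a missing element
    la.remove(unwanted)

print(la)                # [6, 7, 5, 9, 8]
```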
Do not abuse remove
WARNING: remove is often used in wrong / inefficient ways
Always ask yourself:
Could the list contain duplicates? Remember only the first will be removed!
Could the list not contain the searched element? Remember to also handle this case!
remove performs a search on the whole list, which could be inefficient: is it really needed, or do we already know the position i where the element to be removed is? In such a case it’s much better to use .pop(i)
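As an illustration of the last point, here is a sketch reusing the list from the earlier example, assuming we already know we want to drop the 9 at index 4:

```python
la = [6, 7, 9, 5, 9, 8]
i = 4                  # we already know the position of the 9 we want to drop

removed = la.pop(i)    # removes the cell at index i and returns its content

print(removed)         # 9
print(la)              # [6, 7, 9, 5, 8]
# la.remove(9) would instead have dropped the 9 at index 2
```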
QUESTION: Look at the following code fragments, and for each try guessing the result (or if it produces an error).
la = ['a','b','c','b']
la.remove('b')
print(la)

la = ['a','b','c','b']
x = la.remove('b')
print(x)
print(la)

la = ['a','d','c','d']
la.remove('b')
print(la)

la = ['a','bb','c','bbb']
la.remove('b')
print(la)

la = ['a','b','c','b']
la.remove('B')
print(la)

la = ['a',9,'99',9,'c',str(9),'999']
la.remove("9")
print(la)

la = ["don't", "trick","me"]
la.remove("don't").remove("trick").remove("me")
print(la)

la = ["don't", "trick","me"]
la.remove("don't")
la.remove("trick")
la.remove("me")
print(la)

la = [4,5,7,10]
11 in la or la.remove(11)
print(la)

la = [4,5,7,10]
11 in la and la.remove(11)
print(la)

la = [4,5,7,10]
5 in la and la.remove(5)
print(la)

la = [9, [9], [[9]], [[[9]]] ]
la.remove([9])
print(la)

la = [9, [9], [[9]], [[[9]]] ]
la.remove([[9]])
print(la)
Exercise - nob
Write some code which removes from list la all the numbers contained in the 3 elements list lb.
your code must work with any list la and lb of three elements
you can assume that list la contains exactly TWO occurrences of all the elements of lb (plus also other numbers)
Example - given:
lb = [8,7,4]
la = [7,8,11,8,7,4,5,4]
after your code it must result:
>>> print(la)
[11, 5]
[27]:
lb = [8,7,4]
la = [7,8,11,8,7,4,5,4]
# write here
Continue
Go on with the first challenges