Lists 4 - Search methods

Download exercises zip 

Lists offer several methods for searching and transforming their contents, but beware: power is nothing without control! These methods are tempting, yet they often hide traps you will later regret. So whenever you write code with one of them, always ask yourself the questions stressed in this notebook.

Method	Returns	Description
str1.split(str2)	`list`	Produces a list with all the substrings of str1 separated by str2
list.count(obj)	`int`	Counts the occurrences of an element
list.index(obj)	`int`	Searches for the first occurrence of an element and returns its position
list.remove(obj)	`None`	Removes the first occurrence of an element

What to do

Unzip the exercises zip into a folder; you should obtain something like this:

lists
    lists1.ipynb
    lists1-sol.ipynb
    lists2.ipynb
    lists2-sol.ipynb
    lists3.ipynb
    lists3-sol.ipynb
    lists4.ipynb
    lists4-sol.ipynb
    lists5-chal.ipynb
    jupman.py

WARNING: to display the notebook correctly, it MUST be in an unzipped folder!

Open Jupyter Notebook from that folder. Two things should open: first a console, then a browser. The browser should show a file list: navigate it and open the notebook lists4.ipynb
Go on reading the exercises file: now and then you will find paragraphs marked Exercises, which ask you to write Python commands in the cells that follow.

Shortcut keys:

to execute Python code inside a Jupyter cell, press Control + Enter
to execute Python code inside a Jupyter cell AND select next cell, press Shift + Enter
to execute Python code inside a Jupyter cell AND create a new cell afterwards, press Alt + Enter
If the notebooks look stuck, try to select Kernel -> Restart

`split` method - from strings to lists

The split method is called on a string and takes a separator as parameter, which can be a single character or a longer substring. The result is a list of strings with the separator removed.

[2]:

"Finally the pirates shared the treasure".split("the")

[2]:

['Finally ', ' pirates shared ', ' treasure']

In practice split is the inverse of the join method we’ve already seen: join glues a list of strings into a single string, while split breaks a string into a list of strings. Keep in mind that both are methods of strings, not of lists.
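To make the relation between the two methods concrete, here is a minimal sketch of a round trip. The variable names are chosen only for illustration, and the round trip is guaranteed only when the same explicit separator is passed to both methods.

```python
# Illustrative data, not part of the exercises
sentence = "Finally the pirates shared the treasure"

words = sentence.split(" ")   # string -> list of strings
rebuilt = " ".join(words)     # list of strings -> string

print(words)                  # ['Finally', 'the', 'pirates', 'shared', 'the', 'treasure']
print(rebuilt == sentence)    # True
```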

Calling split without arguments uses any run of whitespace (spaces, newlines \n, tabs \t, etc.) as separator:

[3]:

s = "Finally the\npirates\tshared     the treasure"
print(s)

Finally the
pirates shared     the treasure

[4]:

s.split()

[4]:

['Finally', 'the', 'pirates', 'shared', 'the', 'treasure']

It’s also possible to limit the number of splits performed by specifying the parameter maxsplit; the rest of the string is returned unsplit as the last element:

[5]:

s.split(maxsplit=2)

[5]:

['Finally', 'the', 'pirates\tshared     the treasure']
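A common use of maxsplit is cutting a string at its first separator only. The sketch below assumes a made-up line of the form key: value, where the value may itself contain the separator; the names line, key and value are illustrative.

```python
# Hypothetical input line: the value contains the separator too
line = "title: Lists 4: Search methods"

# maxsplit=1 performs at most one split, so we get exactly two pieces
key, value = line.split(": ", maxsplit=1)

print(key)     # title
print(value)   # Lists 4: Search methods
```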

WARNING: What happens if the string does not contain the separator? Remember to also consider this case!

[6]:

"I talk and overtalk and I never ever take a break".split(',')

[6]:

['I talk and overtalk and I never ever take a break']
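In the case above the result holds exactly one string, so reading any element beyond the first would fail. A minimal sketch of one way to guard against this, with variable names and fallback value chosen for illustration:

```python
text = "I talk and overtalk and I never ever take a break"
parts = text.split(',')

# No comma in the text: parts holds the whole string as its only element
print(len(parts))          # 1

# Check the length before reading anything beyond the first element
if len(parts) > 1:
    after_comma = parts[1]
else:
    after_comma = ''       # fallback chosen for this sketch

print(repr(after_comma))   # ''
```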

QUESTION: Look at this code. Will it print something? Or will it produce an error?

```
"revolving\tdoor".split()
```
```
"take great\t\ncare".split()
```

"do not\tforget\nabout\tme".split('\t')

```
"do not forget\nabout\tme".split(' ')
```

"The Guardian of the Abyss stared at us".split('abyss')[1]

```
"".split('abyss')[0]
```
```
"abyss_OOOO_abyss".split('abyss')[0]
```

Exercise - trash dance

You’ve been hired to dance in the latest video of the notorious band Melodic Trash. You can’t miss this golden opportunity. Excited, you start reading the score, but you find a lot of errors (of course the band doesn’t need to know how to write scores to get TV time). There are strange symbols, and the last bar, the one after the sixth, is too long: its words must be put one per row. Write some code which puts the fixed score in a list dance.

DO NOT write string constants from the input in your code (so no "Ra Ta Pam" …)

Example - given:

music = "Zam Dam\tZa Bum Bum\tZam\tBam To Tum\tRa Ta Pam\tBar Ra\tRammaGumma  Unza\n\t\nTACAUACA \n BOOMBOOM!"

after your code it must result:

>>> print(dance)
['Zam Dam',
 'Za Bum Bum',
 'Zam',
 'Bam To Tum',
 'Ra Ta Pam',
 'Bar Ra',
 'RammaGumma',
 'Unza',
 'TACAUACA',
 'BOOMBOOM!']

[7]:

music = "Zam Dam\tZa Bum Bum\tZam\tBam To Tum\tRa Ta Pam\tBar Ra\tRammaGumma  Unza\n\t\nTACAUACA \n BOOMBOOM!"

# write here

Exercise - Trash in tour

The Melodic Trash band strikes again! In a new tour they present their summer hits. The record company only provides the sales numbers in Anglo-Saxon format, so before communicating them to the Italian media we need a conversion.

Write some code which, given the hits and a position in the hit parade (from 1 to 4), prints the sales number.

NOTE: commas must be substituted with dots

Example - given:

hits = """6,230,650 - I love you like the moldy tomatoes in the fridge
2,000,123 - The pain of living filthy rich
100,000 - Groupies are never enough
837 - Do you remember the trashcans in the summer..."""

position = 1   # the tomatoes
#position = 4  # the trashcans

Prints:

Number 1 in hit parade "I love you like the moldy tomatoes in the fridge" sold 6.230.650 copies

[8]:

hits = """6,230,650 - I love you like the moldy tomatoes in the fridge
2,000,123 - The pain of living filthy rich
100,000 - Groupies are never enough
837 - Do you remember the trashcans in the summer..."""

position = 1   # the tomatoes
#position = 4  # the trashcans

# write here

Exercise - manylines

Given the following string of text:

"""This is a string
of text on
several lines which tells nothing."""

print it
print how many lines, words and characters it contains
sort the words in alphabetical order and print the first and last ones in lexicographical order

You should obtain:

This is a string
of text on
several lines which tells nothing.

Lines: 3   words: 12   chars: 62

['T', 'h', 'i', 's', ' ', 'i', 's', ' ', 'a', ' ', 's', 't', 'r', 'i', 'n', 'g', '\n', 'o', 'f', ' ', 't', 'e', 'x', 't', ' ', 'o', 'n', '\n', 's', 'e', 'v', 'e', 'r', 'a', 'l', ' ', 'l', 'i', 'n', 'e', 's', ' ', 'w', 'h', 'i', 'c', 'h', ' ', 't', 'e', 'l', 'l', 's', ' ', 'n', 'o', 't', 'h', 'i', 'n', 'g', '.']
62

First word: This
Last word : which
['This', 'a', 'is', 'lines', 'nothing.', 'of', 'on', 'several', 'string', 'tells', 'text', 'which']

[9]:

s = """This is a string
of text on
several lines which tells nothing."""

# write here

Exercise - takechars

✪ Given a phrase which always contains exactly 3 words, the middle one being a number \(n\), write some code which PRINTS the first \(n\) characters of the third word.

Example - given:

phrase = "Take 4 letters"

your code must print:

lett

[10]:

phrase = "Take 4 letters"        # lett
#phrase= "Getting 5 caratters"   # carat
#phrase= "Take 10 characters"    # characters

# write here

`count` method

We can find the number of occurrences of a certain element in a list by using the method count:

[11]:

la = ['a', 'n', 'a', 'c', 'o', 'n', 'd', 'a']

[12]:

la.count('n')

[12]:

2

[13]:

la.count('a')

[13]:

3

[14]:

la.count('d')

[14]:

1

Do not abuse count

WARNING: count is often used in wrong / inefficient ways

Always ask yourself:

Could the list contain duplicates? Remember they will get counted!
Could the list contain no duplicates? Remember to also handle this case!
count scans the whole list, which could be inefficient: is it really needed, or do we already know the interval where to search?
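The questions above can be turned into a few habits, sketched here on the list used earlier. The variable names and the slice bounds are arbitrary choices made for illustration.

```python
la = ['a', 'n', 'a', 'c', 'o', 'n', 'd', 'a']

# An absent element is not an error: count simply returns 0
absent = la.count('z')
print(absent)       # 0

# To know only WHETHER an element is present, the in operator is enough
# and can stop at the first match, while count always reads the whole list
present = 'd' in la
print(present)      # True

# If we know the interval of interest, we can count inside a slice
# (note the slice creates a new, shorter list)
in_slice = la[2:6].count('n')
print(in_slice)     # 1
```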

QUESTION: Look at the following code fragments, and for each of them try guessing the result (or if it produces an error)

['A','aa','a','aaAah',"a", "aaaa"[1], " a "].count("a")

["the", "punishment", "of", "the","fools"].count('Fools') == 1

lst = ['oasis','date','oasis','coconut','date','coconut']
print(lst.count('date') == 1)

lst = ['oasis','date','oasis','coconut','date','coconut']
print(lst[4] == 'date')

['2',2,"2",2,float("2"),2.0, 4/2,"1+1",int('3')-float('1')].count(2)

```
[].count([])
```
```
[[],[],[]].count([])
```

Exercise - country life

Given a list country, write some code which prints True if the number of occurrences of el1 in the first half equals the number of occurrences of el2 in the second half.

[15]:

el1,el2 = 'shovels', 'hoes'          # True
#el1,el2 = 'shovels', 'shovels'      # False
#el1,el2 = 'wheelbarrows', 'plows'   # True
#el1,el2 = 'shovels', 'wheelbarrows' # False

country = ['plows','wheelbarrows', 'shovels',      'wheelbarrows', 'shovels','hoes', 'wheelbarrows',
           'hoes', 'plows',        'wheelbarrows', 'plows',        'shovels','plows','hoes']

# write here

`index` method

The index method allows us to find the index of the FIRST occurrence of an element.

[16]:

#      0   1   2   3   4
la = ['p','a','e','s','e']

[17]:

la.index('p')

[17]:

0

[18]:

la.index('a')

[18]:

1

[19]:

la.index('e')  # we find the FIRST occurrence

[19]:

2

If the element we’re looking for is not present, we will get an error:

>>> la.index('z')

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-303-32d9c064ebe0> in <module>
----> 1 la.index('z')

ValueError: 'z' is not in list

Optionally, you can specify an index to start from (inclusive):

[20]:

# 0   1   2   3   4   5   6   7   8   9   10
['a','c','c','a','p','a','r','r','a','r','e'].index('a',6)

[20]:

8

And also where to end (exclusive):

# 0   1   2   3   4   5   6   7   8   9   10
['a','c','c','a','p','a','r','r','a','r','e'].index('a',6,8)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-17-7f344c26b62e> in <module>
      1 # 0   1   2   3   4   5   6   7   8   9   10
----> 2 ['a','c','c','a','p','a','r','r','a','r','e'].index('a',6,8)

ValueError: 'a' is not in list
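The start parameter is handy for reaching occurrences after the first one. A minimal sketch on the same list of letters, with illustrative variable names:

```python
letters = ['a', 'c', 'c', 'a', 'p', 'a', 'r', 'r', 'a', 'r', 'e']

first = letters.index('a')               # search from the beginning
second = letters.index('a', first + 1)   # resume right after the first match

print(first)    # 0
print(second)   # 3
```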

Do not abuse index

WARNING: index is often used in wrong / inefficient ways

Always ask yourself:

Could the list contain duplicates? Remember only the first will be found!
Could the list not contain the searched element? Remember to also handle this case!
index searches the list from the start, which could be inefficient: is it really needed, or do we already know the interval where to search?
If we want to know whether an element is at a position we already know, index is useless: it’s enough to write my_list[3] == element. If you used index, it could find a duplicate element which comes before the one we are interested in!
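One possible way to handle a missing element is to test membership before calling index. This is only a sketch: the fallback value -1 is an arbitrary choice, the variable names are illustrative, and the list is scanned twice (once by in, once by index).

```python
la = ['p', 'a', 'e', 's', 'e']
target = 'z'

if target in la:
    position = la.index(target)
else:
    position = -1    # arbitrary marker meaning "not found"

print(position)      # -1
```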

QUESTION: Look at the following code fragments, and for each one try guessing the result it produces (or whether it gives an error).

['arc','boat','hollow','dune'].index('hollow') == ['arc','boat','hollow','dune'].index('hollow',1)

['azure','blue','sky blue','smurfs'][-1:].index('sky blue')

road = ['asphalt','bitumen','cement','gravel']
print('mortar' in road or road.index('mortar'))

road = ['asphalt','bitumen','cement','gravel']
print('mortar' in road and road.index('mortar'))

road = ['asphalt','bitumen','mortar','gravel']
print('mortar' in road and road.index('mortar'))

la = [0,5,10]
la.reverse()
print(la.index(5) > la.index(10))

Exercise - Spatoč

In the past you met the Slavic painter Spatoč when he was still dirt poor. He gifted you with 2 or 3 paintings (you don’t remember) of dubious artistic value that you hid in the attic, but now watching TV you just noticed that Spatoč has gained international fame. You run to the attic to retrieve the paintings, which are lost among junk. Every painting is contained in a [ ] box, but you don’t know in which rack it is. Write some code which prints where they are.

racks are numbered from 1. If the third painting is not found, print 0.
DO NOT use loops or if
HINT: printing the first two is easy; to print the last one have a look at Booleans - evaluation order

Example 1 - given:

[21]:

      #  1      2           3             4             5
attic = [3,    '\\',       ['painting'], '---',        ['painting'],
      #  6      7           8             9             10
         5.23, ['shovel'], ['ski'],      ["painting"], ['lamp']]

prints:

rack of first painting : 3
rack of second painting: 5
rack of third painting : 9

Example 2 - given:

[22]:

        # 1           2     3       4            5          6          7
attic = [['painting'],'--',['ski'],['painting'],['statue'],['shovel'],['boots']]

prints

rack of first painting : 1
rack of second painting: 4
rack of third painting : 0

[23]:

      #  1 2     3           4      5           6     7          8       9             10
attic = [3,'\\',['painting'],'---',['painting'],5.23,['shovel'],['ski'],['painting'], ['lamp']]
#  3,5,9
         # 1           2     3       4            5          6          7
#attic = [['painting'],'--',['ski'],['painting'],['statue'],['shovel'],['boots']]
#  1,4,0

# write here

`remove` method

remove takes an object as parameter, searches for the FIRST cell containing that object and eliminates it:

[24]:

#     0 1 2 3 4 5
la = [6,7,9,5,9,8]   # 9 is in the cells at index 2 and 4

[25]:

la.remove(9)   # searches first cell containing 9

[26]:

la

[26]:

[6, 7, 5, 9, 8]

As you can see, the cell which was at index 2 and that contained the FIRST occurrence of 9 has been eliminated. The cell containing the SECOND occurrence of 9 is still there.

If you try removing an object which is not present, you will receive an error:

la.remove(666)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-121-5d04a71f9d33> in <module>
----> 1 la.remove(666)

ValueError: list.remove(x): x not in list

Do not abuse remove

WARNING: remove is often used in wrong / inefficient ways

Always ask yourself:

Could the list contain duplicates? Remember only the first will be removed!
Could the list not contain the searched element? Remember to also handle this case!
remove searches the list from the start, which could be inefficient: is it really needed, or do we already know the position i of the element to be removed? In that case it’s much better to use .pop(i)
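The last two questions can be sketched as follows on the list used earlier. The variable name removed is illustrative, and the membership check is just one possible way to guard the call.

```python
la = [6, 7, 9, 5, 9, 8]

# We already know the position of the cell to delete: no search is needed
removed = la.pop(4)     # pop also returns the removed element
print(removed)          # 9
print(la)               # [6, 7, 9, 5, 8]

# When the element might be missing, check before calling remove
if 666 in la:
    la.remove(666)
print(la)               # [6, 7, 9, 5, 8]
```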

QUESTION: Look at the following code fragments, and for each try guessing the result (or if it produces an error).

la = ['a','b','c','b']
la.remove('b')
print(la)

la = ['a','b','c','b']
x = la.remove('b')
print(x)
print(la)

la = ['a','d','c','d']
la.remove('b')
print(la)

la = ['a','bb','c','bbb']
la.remove('b')
print(la)

la = ['a','b','c','b']
la.remove('B')
print(la)

la = ['a',9,'99',9,'c',str(9),'999']
la.remove("9")
print(la)

la = ["don't", "trick","me"]
la.remove("don't").remove("trick").remove("me")
print(la)

la = ["don't", "trick","me"]
la.remove("don't")
la.remove("trick")
la.remove("me")
print(la)

la = [4,5,7,10]
11 in la or la.remove(11)
print(la)

la = [4,5,7,10]
11 in la and la.remove(11)
print(la)

la = [4,5,7,10]
5 in la and la.remove(5)
print(la)

la = [9, [9], [[9]], [[[9]]] ]
la.remove([9])
print(la)

la = [9, [9], [[9]], [[[9]]] ]
la.remove([[9]])
print(la)

Exercise - nob

Write some code which removes from list la all the numbers contained in the three-element list lb.

your code must work with any list la and any list lb of three elements
you can assume that list la contains exactly TWO occurrences of each element of lb (plus possibly other numbers)

Example - given:

lb = [8,7,4]
la = [7,8,11,8,7,4,5,4]

after your code it must result:

>>> print(la)
[11, 5]

[27]:

lb = [8,7,4]
la = [7,8,11,8,7,4,5,4]

# write here

Continue

Go on with the first challenges
