Metamath

Download worked project 

Metamath is a language that can express theorems, accompanied by proofs that can be verified by a computer program. Its website lets you browse from complex theorems up to the most basic axioms they rely on to be proven.

The purpose of this project is to visualize the steps of the proof as a graph, and visualize statement frequencies.

DISCLAIMER: No panic !

You DO NOT need to understand any of the mathematics which follows. Here we are only interested in parsing the data and visualize it

REQUIREMENTS: Having read Relational data tutorial , which contains also instructions for installing required libraries.

What to do

Unzip exercises zip in a folder, you should obtain something like this:

metamath-prj
    metamath.ipynb
    metamath-sol.ipynb
    db.mm
    proof.txt
    soft.py
    jupman.py

WARNING: to correctly visualize the notebook, it MUST be in an unzipped folder !

open Jupyter Notebook from that folder. Two things should open, first a console and then a browser. The browser should show a file list: navigate the list and open the notebook metamath.ipynb
Go on reading the notebook, and write in the appropriate cells when asked

Shortcut keys:

to execute Python code inside a Jupyter cell, press Control + Enter
to execute Python code inside a Jupyter cell AND select next cell, press Shift + Enter
to execute Python code inside a Jupyter cell AND a create a new cell aftwerwards, press Alt + Enter
If the notebooks look stuck, try to select Kernel -> Restart

Metamath db

For this exercise, we have two files to consider:

db.mm contains the description of a simple algebra where you can only add zero to variables
proof.txt contains the awesome proof that… any variable is equal to itself

First you will load db.mm and parse text file into Python, here is the full content:

$( Declare the constant symbols we will use $)
    $c 0 + = -> ( ) term wff |- $.
$( Declare the metavariables we will use $)
    $v t r s P Q $.
$( Specify properties of the metavariables $)
    tt $f term t $.
    tr $f term r $.
    ts $f term s $.
    wp $f wff P $.
    wq $f wff Q $.
$( Define "term" and "wff" $)
    tze $a term 0 $.
    tpl $a term ( t + r ) $.
    weq $a wff t = r $.
    wim $a wff ( P -> Q ) $.
$( State the axioms $)
    a1 $a |- ( t = r -> ( t = s -> r = s ) ) $.
    a2 $a |- ( t + 0 ) = t $.
$( Define the modus ponens inference rule $)
    ${
       min $e |- P $.
       maj $e |- ( P -> Q ) $.
       mp  $a |- Q $.
    $}

Format description:

Each row is a statement
Words are separated by spaces. Each word that appears in a statement is called a token
Tokens starting with dollar $ are called keywords, you may have $(, $), $c, $v, $a,$f,${,$}, $.
Statements may be identified with a unique arbitrary label, which is placed at the beginning of the row. For example, tt, weq, maj are all labels (in the file there are more):
- tt $f term t $.
- weq $a wff t = r $.
- maj $e |- ( P -> Q ) $.
Some rows have no label, examples:
- $c 0 + = -> ( ) term wff |- $.
- $v t r s P Q $.
- $( State the axioms $)
- ${
- $}
in each row, after the first dollar keyword, you may have an arbitratry sequence of characters terminated by a dollar followed by a dot $.. You don’t need to care about the sequence meaning! Examples:
- tt $f term t $. has sequence term t
- weq $a wff t = r $. has sequence wff t = r
- $v t r s P Q $. has sequence t r s P Q

Now implement function parse_db which scans the file line by line (it is a text file, so you can use line files examples), parses ONLY rows with labels, and RETURN a dictionary mapping labels to remaining data in the row represented as a dictionary, formatted like this (showing here only first three labels):

{'a1': {'keyword': '$a',
        'sequence': '|- ( t = r -> ( t = s -> r = s ) )'},
 'a2': {'keyword': '$a',
        'sequence': '|- ( t + 0 ) = t'},
 'maj': {'keyword': '$e',
         'sequence': '|- ( P -> Q )'},
 'min': {'keyword': '$e', 'sequence': '|- P'},
 'mp':  {'keyword': '$a', 'sequence': '|- Q'},
 'tpl': {'keyword': '$a', 'sequence': 'term ( t + r )'},
 'tr':  {'keyword': '$f', 'sequence': 'term r'},
 'ts':  {'keyword': '$f', 'sequence': 'term s'},
 'tt':  {'keyword': '$f', 'sequence': 'term t'},
 'tze': {'keyword': '$a', 'sequence': 'term 0'},
 'weq': {'keyword': '$a', 'sequence': 'wff t = r'},
 'wim': {'keyword': '$a', 'sequence': 'wff ( P -> Q )'},
 'wp':  {'keyword': '$f', 'sequence': 'wff P'},
 'wq':  {'keyword': '$f', 'sequence': 'wff Q'}
}

1. Metamath db

Show solution

[3]:

def parse_db(filepath):

    ret = {}
    with open(filepath, encoding='utf-8') as f:
        line=f.readline().strip()
        while line != "":
            #print(line)

            if line.startswith('$('):
                label = ''
                keyword = '$('
                sequence = ''
            elif line.split()[0].startswith('${'):
                label = ''
                keyword = '${'
                sequence = ''
            elif line.split()[0].startswith('$}'):
                label = ''
                keyword = '$}'
                sequence = ''
            elif line.split()[0].startswith('$'):
                label = ''
                keyword = line.split()[0]
                sequence = line.split()[1][:-2].strip()
            else:
                label = line.split(' $')[0].strip()
                keyword = line.split()[1]
                if line.endswith('$.'):
                    sequence = line.split(keyword)[1][1:-2].strip()

            if label:
                ret[label] = {
                    'keyword' : keyword,
                    'sequence' : sequence
                }
                #print('   DEBUG: FOUND', label, ':', ret[label])
            #else:
                #print('   DEBUG: DISCARDED')
            line=f.readline().strip()
    return ret


db_mm = parse_db('db.mm')

assert db_mm['tt'] == {'keyword': '$f', 'sequence': 'term t'}
assert db_mm['maj'] == {'keyword': '$e', 'sequence': '|- ( P -> Q )'}
# careful 'mp' label shouldn't have spaces inside !
assert 'mp' in db_mm
assert db_mm['mp'] == {'keyword': '$a', 'sequence': '|- Q'}


from pprint import pprint
pprint(db_mm)

{'a1': {'keyword': '$a', 'sequence': '|- ( t = r -> ( t = s -> r = s ) )'},
 'a2': {'keyword': '$a', 'sequence': '|- ( t + 0 ) = t'},
 'maj': {'keyword': '$e', 'sequence': '|- ( P -> Q )'},
 'min': {'keyword': '$e', 'sequence': '|- P'},
 'mp': {'keyword': '$a', 'sequence': '|- Q'},
 'tpl': {'keyword': '$a', 'sequence': 'term ( t + r )'},
 'tr': {'keyword': '$f', 'sequence': 'term r'},
 'ts': {'keyword': '$f', 'sequence': 'term s'},
 'tt': {'keyword': '$f', 'sequence': 'term t'},
 'tze': {'keyword': '$a', 'sequence': 'term 0'},
 'weq': {'keyword': '$a', 'sequence': 'wff t = r'},
 'wim': {'keyword': '$a', 'sequence': 'wff ( P -> Q )'},
 'wp': {'keyword': '$f', 'sequence': 'wff P'},
 'wq': {'keyword': '$f', 'sequence': 'wff Q'}}

[3]:

def parse_db(filepath):
    raise Exception('TODO IMPLEMENT ME !')

db_mm = parse_db('db.mm')

assert db_mm['tt'] == {'keyword': '$f', 'sequence': 'term t'}
assert db_mm['maj'] == {'keyword': '$e', 'sequence': '|- ( P -> Q )'}
# careful 'mp' label shouldn't have spaces inside !
assert 'mp' in db_mm
assert db_mm['mp'] == {'keyword': '$a', 'sequence': '|- Q'}


from pprint import pprint
pprint(db_mm)

2.1 Metamath proof

A proof file is made of steps, one per row. Each statement, in order to be proven, needs other steps to be proven until very basic facts called axioms are reached, which need no further proof (typically proofs in Metamath are shown in much shorter format, but here we use a more explicit way)

So a proof can be nicely displayed as a tree of the steps it is made of, where the top node is the step to be proven and the axioms are the leaves of the tree.

Complete content of proof.txt:

 1 tt            $f term t
 2 tze           $a term 0
 3 1,2 tpl       $a term ( t + 0 )
 4 tt            $f term t
 5 3,4 weq       $a wff ( t + 0 ) = t
 6 tt            $f term t
 7 tt            $f term t
 8 6,7 weq       $a wff t = t
 9 tt            $f term t
10 9 a2          $a |- ( t + 0 ) = t
11 tt            $f term t
12 tze           $a term 0
13 11,12 tpl     $a term ( t + 0 )
14 tt            $f term t
15 13,14 weq     $a wff ( t + 0 ) = t
16 tt            $f term t
17 tze           $a term 0
18 16,17 tpl     $a term ( t + 0 )
19 tt            $f term t
20 18,19 weq     $a wff ( t + 0 ) = t
21 tt            $f term t
22 tt            $f term t
23 21,22 weq     $a wff t = t
24 20,23 wim     $a wff ( ( t + 0 ) = t -> t = t )
25 tt            $f term t
26 25 a2         $a |- ( t + 0 ) = t
27 tt            $f term t
28 tze           $a term 0
29 27,28 tpl     $a term ( t + 0 )
30 tt            $f term t
31 tt            $f term t
32 29,30,31 a1   $a |- ( ( t + 0 ) = t -> ( ( t + 0 ) = t -> t = t ) )
33 15,24,26,32 mp  $a |- ( ( t + 0 ) = t -> t = t )
34 5,8,10,33 mp  $a |- t = t

Each line represents a step of the proof. Last line is the final goal of the proof.

Each line contains, in order:

a step number at the beginning, starting from 1 (step_id)
possibly a list of other step_ids, separated by commas, like 29,30,31 - they are references to previous rows
label of the db_mm statement referenced by the step, like tt, tze, weq - that label must have been defined somewhere in db.mm file
statement type: a token starting with a dollar, like $a, $f
a sequence of characters, like (for you they are just characters, don’t care about the meaning !):
- term ( t + 0 )
- |- ( ( t + 0 ) = t -> ( ( t + 0 ) = t -> t = t ) )

Implement function parse_proof, which takes a filepath to the proof and RETURN a list of steps expressed as a dictionary, in this format (showing here only first 5 items):

NOTE: referenced step_ids are integer numbers and they are the original ones from the file, meaning they start from one.

[
    {'keyword': '$f',
     'label': 'tt',
     'sequence': 'term t',
     'step_ids': []},
    {'keyword': '$a',
     'label': 'tze',
     'sequence': 'term 0',
     'step_ids': []},
    {'keyword': '$a',
     'label': 'tpl',
     'sequence': 'term ( t + 0 )',
     'step_ids': [1,2]},
    {'keyword': '$f',
     'label': 'tt',
     'sequence': 'term t',
     'step_ids': []},
    {'keyword': '$a',
     'label': 'weq',
     'sequence': 'wff ( t + 0 ) = t',
     'step_ids': [3,4]},
     .
     .
     .
]

Show solution

[5]:

def parse_proof(filepath):

    ret = []

    with open(filepath, encoding='utf-8') as f:
        line=f.readline().strip()

        while line != "":

            step_id = int(line.split(' ')[0])
            label = line.split('$')[0].strip().split(' ')[-1]
            keyword = '$' + line.split('$')[1][:1]
            sequence = line.split('$')[1][2:]
            candidate_step_ids = line.split(' ')[1]

            if candidate_step_ids != label:
                step_ids = [int(x) for x in line.split(' ')[1].split(',')]
            else:
                step_ids = []
            #print('deps =', deps)

            ret.append( {
                'step_ids': step_ids,
                'sequence': sequence,
                'label': label,
                'keyword': keyword
            })

            line=f.readline().strip()
        return ret


proof = parse_proof('proof.txt')

assert proof[0] == {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []}
assert proof[1] == {'keyword': '$a', 'label': 'tze', 'sequence': 'term 0', 'step_ids': []}
assert proof[2] == {'keyword': '$a',
                    'label': 'tpl',
                    'sequence': 'term ( t + 0 )',
                    'step_ids': [1, 2]}
assert proof[4] == {'keyword': '$a',
                    'label': 'weq',
                    'sequence': 'wff ( t + 0 ) = t',
                    'step_ids': [3,4]}
assert proof[33] == { 'keyword': '$a',
                      'label': 'mp',
                      'sequence': '|- t = t',
                      'step_ids': [5, 8, 10, 33]}

pprint(proof)

[{'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$a', 'label': 'tze', 'sequence': 'term 0', 'step_ids': []},
 {'keyword': '$a',
  'label': 'tpl',
  'sequence': 'term ( t + 0 )',
  'step_ids': [1, 2]},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$a',
  'label': 'weq',
  'sequence': 'wff ( t + 0 ) = t',
  'step_ids': [3, 4]},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$a', 'label': 'weq', 'sequence': 'wff t = t', 'step_ids': [6, 7]},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$a',
  'label': 'a2',
  'sequence': '|- ( t + 0 ) = t',
  'step_ids': [9]},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$a', 'label': 'tze', 'sequence': 'term 0', 'step_ids': []},
 {'keyword': '$a',
  'label': 'tpl',
  'sequence': 'term ( t + 0 )',
  'step_ids': [11, 12]},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$a',
  'label': 'weq',
  'sequence': 'wff ( t + 0 ) = t',
  'step_ids': [13, 14]},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$a', 'label': 'tze', 'sequence': 'term 0', 'step_ids': []},
 {'keyword': '$a',
  'label': 'tpl',
  'sequence': 'term ( t + 0 )',
  'step_ids': [16, 17]},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$a',
  'label': 'weq',
  'sequence': 'wff ( t + 0 ) = t',
  'step_ids': [18, 19]},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$a',
  'label': 'weq',
  'sequence': 'wff t = t',
  'step_ids': [21, 22]},
 {'keyword': '$a',
  'label': 'wim',
  'sequence': 'wff ( ( t + 0 ) = t -> t = t )',
  'step_ids': [20, 23]},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$a',
  'label': 'a2',
  'sequence': '|- ( t + 0 ) = t',
  'step_ids': [25]},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$a', 'label': 'tze', 'sequence': 'term 0', 'step_ids': []},
 {'keyword': '$a',
  'label': 'tpl',
  'sequence': 'term ( t + 0 )',
  'step_ids': [27, 28]},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$a',
  'label': 'a1',
  'sequence': '|- ( ( t + 0 ) = t -> ( ( t + 0 ) = t -> t = t ) )',
  'step_ids': [29, 30, 31]},
 {'keyword': '$a',
  'label': 'mp',
  'sequence': '|- ( ( t + 0 ) = t -> t = t )',
  'step_ids': [15, 24, 26, 32]},
 {'keyword': '$a',
  'label': 'mp',
  'sequence': '|- t = t',
  'step_ids': [5, 8, 10, 33]}]

[5]:

def parse_proof(filepath):
    raise Exception('TODO IMPLEMENT ME !')

proof = parse_proof('proof.txt')

assert proof[0] == {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []}
assert proof[1] == {'keyword': '$a', 'label': 'tze', 'sequence': 'term 0', 'step_ids': []}
assert proof[2] == {'keyword': '$a',
                    'label': 'tpl',
                    'sequence': 'term ( t + 0 )',
                    'step_ids': [1, 2]}
assert proof[4] == {'keyword': '$a',
                    'label': 'weq',
                    'sequence': 'wff ( t + 0 ) = t',
                    'step_ids': [3,4]}
assert proof[33] == { 'keyword': '$a',
                      'label': 'mp',
                      'sequence': '|- t = t',
                      'step_ids': [5, 8, 10, 33]}

pprint(proof)

[{'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$a', 'label': 'tze', 'sequence': 'term 0', 'step_ids': []},
 {'keyword': '$a',
  'label': 'tpl',
  'sequence': 'term ( t + 0 )',
  'step_ids': [1, 2]},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$a',
  'label': 'weq',
  'sequence': 'wff ( t + 0 ) = t',
  'step_ids': [3, 4]},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$a', 'label': 'weq', 'sequence': 'wff t = t', 'step_ids': [6, 7]},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$a',
  'label': 'a2',
  'sequence': '|- ( t + 0 ) = t',
  'step_ids': [9]},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$a', 'label': 'tze', 'sequence': 'term 0', 'step_ids': []},
 {'keyword': '$a',
  'label': 'tpl',
  'sequence': 'term ( t + 0 )',
  'step_ids': [11, 12]},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$a',
  'label': 'weq',
  'sequence': 'wff ( t + 0 ) = t',
  'step_ids': [13, 14]},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$a', 'label': 'tze', 'sequence': 'term 0', 'step_ids': []},
 {'keyword': '$a',
  'label': 'tpl',
  'sequence': 'term ( t + 0 )',
  'step_ids': [16, 17]},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$a',
  'label': 'weq',
  'sequence': 'wff ( t + 0 ) = t',
  'step_ids': [18, 19]},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$a',
  'label': 'weq',
  'sequence': 'wff t = t',
  'step_ids': [21, 22]},
 {'keyword': '$a',
  'label': 'wim',
  'sequence': 'wff ( ( t + 0 ) = t -> t = t )',
  'step_ids': [20, 23]},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$a',
  'label': 'a2',
  'sequence': '|- ( t + 0 ) = t',
  'step_ids': [25]},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$a', 'label': 'tze', 'sequence': 'term 0', 'step_ids': []},
 {'keyword': '$a',
  'label': 'tpl',
  'sequence': 'term ( t + 0 )',
  'step_ids': [27, 28]},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$f', 'label': 'tt', 'sequence': 'term t', 'step_ids': []},
 {'keyword': '$a',
  'label': 'a1',
  'sequence': '|- ( ( t + 0 ) = t -> ( ( t + 0 ) = t -> t = t ) )',
  'step_ids': [29, 30, 31]},
 {'keyword': '$a',
  'label': 'mp',
  'sequence': '|- ( ( t + 0 ) = t -> t = t )',
  'step_ids': [15, 24, 26, 32]},
 {'keyword': '$a',
  'label': 'mp',
  'sequence': '|- t = t',
  'step_ids': [5, 8, 10, 33]}]

2. Checking proof

If you’ve done everything properly, by executing following cells you should be be able to see nice graphs.

IMPORTANT: You do not need to implement anything!

Just look if results match expected graphs

Overview plot

Here we only show step numbers using function draw_proof defined in sciprog library

[6]:

from soft import draw_proof
# uncomment and check
#draw_proof(proof, db_mm, only_ids=True)  # all graph, only numbers

[7]:

print()
print('************************ EXPECTED COMPLETE GRAPH  **********************************')
draw_proof(proof, db_mm, only_ids=True)


************************ EXPECTED COMPLETE GRAPH  **********************************

../../_images/projects_metamath_metamath-sol_26_1.png

Detail plot

Here we show data from both the proof and the db_mm we calculated earlier. To avoid having a huge graph we only focus on subtree starting from step_id 24.

To understand what is shown, look at node 20: - first line contains statement wff ( t + 0 ) = t taken from line 20 of proof file - second line weq: wff t = r is taken from db_mm, and means rule labeled weq was used to derive the statement in the first line.

[8]:

# uncomment and check
#draw_proof(proof, db_mm, step_id=24)

[9]:

print()
print('************************* EXPECTED DETAIL GRAPH  *******************************')
draw_proof(proof, db_mm, step_id=24)


************************* EXPECTED DETAIL GRAPH  *******************************

../../_images/projects_metamath_metamath-sol_29_1.png

3. Top statements

We can measure the importance of theorems and definitions (in general, statements) by counting how many times they are referenced in proofs.

3.1 plot: Write some code to plot the histogram of statement labels referenced by steps in proof, from most to least frequently referenced.

A label gets a count each time a step references another step with that label.

For example, in the subgraph above:

tt is referenced 4 times, that is, there are 4 steps referencing other steps which contain the label tt
weq is referenced 2 times
tpl and tze are referenced 1 time each
wim is referenced 0 times (it is only present in the last node, which being the root node cannot be referenced by any step)

NOTE: the previous counts are just for the subgraph example.

In your exercise, you will need to consider all the steps

3.2 print list: Below the graph, print the list of labels from most to least frequent, associating them to corresponding statement sequence taken from db_mm

Expected output:

tt :     term t
weq :    wff t = r
tze :    term 0
tpl :    term ( t + r )
a2 :     |- ( t + 0 ) = t
wim :    wff ( P -> Q )
a1 :     |- ( t = r -> ( t = s -> r = s ) )
mp :     |- Q

Show solution

[10]:

import numpy as np
import matplotlib.pyplot as plt


# write here

[ ]:

Metamath

Download worked project

What to do

Metamath db

1. Metamath db

2.1 Metamath proof

2. Checking proof

Overview plot

Detail plot

3. Top statements

Download worked project 