Isar for Metamath?

Isabelle is a program that checks your math. You tell it, in a very technical computer language, how to prove that some line of mathematical reasoning is correct, and it verifies that you’ve told it right. It’s also a great help in searching for proofs to begin with, because it has powerful techniques to simplify and transform mathematical statements into simpler or just different versions that help you understand the statements.

Isar is a language built on top of Isabelle. Unlike most computer proof languages, It allows steps in a proof to be done without mentioning the theorem being used by name in certain circumstances, allows explicit naming and reference of statements, and allows backwards and forwards proof to be mixed together arbitrarily. (In these respects it’s similar to the difference between assembly language and higher-level programming languages.) You can state subgoals explicitly on your way to proving the main goal if you like (similar to subroutines). A key strength of Isar is its flexibility—you can state a given proof in many different ways, which means that you can usually find a way that’s fairly easy to read.

What’s more, the ideas behind Isar don’t have a whole lot to do with the underlying logical or even metalogical framework. In other words, Isar is to a large extent orthogonal to Isabelle. Isabelle/Isar uses unification and resolution in an intuitionistic metalogic as its principles, but they could be substantially different while keeping Isar useful.

Metamath is a system that’s quite similar to Isabelle at heart, but much, much, much simpler. It’s correspondingly less powerful (practically speaking, not formally) and harder to use. Its main advantage is that its proofs, while longer, are very explicit and thus possible to follow with a minimum of prior knowledge. (And, as far as I’m concerned, the simplicity is also quite nice in that I can easily use Metamath ideas to design my own math system and have a lot more confidence in the results than I would with a system of my own design or a system that imitated the design of a more complex system like Isabelle.)

After three long years, I’m finally getting back to this. As you might have guessed, I’ve been rethinking a lot of my approach. While I still sort of like the direction I was taking initially, I now know how to do a much better job of building the system to yield correct results elegantly: take some lessons from Isabelle and Metamath. I was aware of neither back then.

I think the direction I’m heading with my project is to try to replicate the high-level nature of Isar with the ability to drill down you get with Metamath. And I’m finding (although maybe I could learn…) that Metamath-style proofs are quite hard to read, even with the features I added to the proof display.

I’m really missing Isar-style abbreviation, unification-based or otherwise, and, of course, the ability to intermix informal descriptions of the proof with the formal version. I think slawekk was right when he told me that I need to get as close as possible to human-language proof presentation. I think this makes sense because human-language presentation has evolved to near-optimally communicate results (to other members of the math specialty the paper was published in, but the same language is used in math textbooks too, just with more detail).

The problem with these informal proofs is in their lack of detail. They skip over parts that the audience is expected to know already, thereby limiting the audience to those people who do, in fact, already know. Isar can even have the same problem, even though it’s completely explicit, because of the complexity of the language that can be invoked with a word (like “by (blast)+”). In fact, it’s inherent in the nature of any high level view of a proof that it skip over detail. It really is a dichotomy between presenting the forest as a whole or the individual trees.

But this problem can be addressed! Simply allow the reader to either hide or show the detail as desired. It’s really that simple. Isar already provides for this, and quite well, I think. On the other hand, Isar doesn’t let you get down to the smallest, most explicit level in the proofs. For one, you can leave rules and steps implicit (in well-defined ways) and there’s not currently a way to view Isar proofs that lets you fill in that detail. For another, if you’re already at Isar’s lowest level (unification and resolution) and still remain confused, you have no recourse. There’s no way to expand those basic steps into more detail, or a pretty graphical presentation, that really shows what is happening at the same level as, say, Metamath.

And then there is another way to deal with the problem. It’s to create a new Isar-like language for Metamath, one that applies the same principles that make Isar relatively readable to the formal language of Metamath.

The disadvantage of Isar is that the bottom level may not be as easy to comprehend as Metamath’s. The disadvantage of Metamath is that the proof language wouldn’t have as rich an environment to draw on to make it flexible. (Flexibility is a key feature of a good proof language.) Now, the language could incorporate things like unification, even higher-order, or resolution, but it would be at the cost of making the correspondence between the Isar-like proof and the Metamath proof less clear and direct. You could have the detail where it could be expanded out, but the reader still has to understand the principles being used to manipulate the proof. In other words, you would have to learn what resolution and unification are to understand the detailed level of a proof.

I suppose you could hope that expanding to that level of detail isn’t necessary because the rest of the proof is clear enough already. But that’s already the approach taken by Isar, and I don’t think it has worked with this reader. On the other hand, unification isn’t terribly hard to learn, and resolution is easy to display graphically and prettily, which might make it easy to get a handle on.

At this point, I’m thoroughly undecided on both questions: whether to work on Isar or Metamath, and if the latter, what to make the proof language look like.



5 Responses to “Isar for Metamath?”

Slawekk says:

An essential element of the system with an Isar-like high level language expanding into Metamath would be a prover for Metamath. That would be some software that would be able to take statement in the high-level language like

from A1 A2 have “a + b =c” using theorem1 theorem2

(where A1 and A2 are labels of some assertions) and generate a Metamath lemma with the proof out of that.
The lemma would have the hypotheses A1 A2, assertion “a + b = c” and use facts theorem1 and theorem2 (and others added as necessary by the prover) in the proof. I think that will be the most challenging part.
I have an idea on one way how to approach this and I am curious what you think about it.
Metamath proofs are just sequences of references to already proven facts. So what the prover is supposed to do is accept premises and assertions for very simple lemmas and be able to generate a sequence of references that constitutes a proof. One way to do it is to just apply facts that unify with the current set of goals (the first set would have one element: the thesis) in hope that we would be lucky and after a couple of iterations reduce to the premises of the lemma. Of course for any given goal there are many lemmas whose theses match (unify with) the goal and the question is in what order we should try to apply them. We can think about this order a strategy for proving. And here is the idea: suppose we have a whole population of such proving agents, each with its own ordered set of known facts to try. Initially the individuals in this population have some randomly chosen strategies (order of facts to apply). From the existing proofs in set.mm we can generate hundreds of thousands of small lemmas that are fragments of those proofs. Then we we use those lemma as problems for which proving agents are rewarded with points when they solve them. Those points can be viewed as fitness to breed this population in a genetic algorithm fashion so that hudreds of generations and hours of CPU later we evolve some set of strategies that perform best in practice. We can even have sexual reproduction with two kinds of provers - one attempting forward proof and one specializing in backward proofs. It would be fun. That should get us to the point that the prover is usable as the back end of the high-level Isar-like language. Then we continue evolving the population while writing formalized mathematics in the Isar-like language and the prover keeps improving.

pdf23ds says:

Hmm. I really don’t know about that approach. It seems to me like the intelligent selection of good lemmas that goes on when building a theory is a form of compression, and that neural networks rely on the absence of compression to be able to do their work. So this may not be an area where they’re really applicable. There’s also the downside of the behavior being harder to predict, and regressions being impossible to troubleshoot.

I think artificial intelligence is a fascinating area of research, but I’d rather have my proof tools be in service of AI rather than the other way around. Of course, if you’re right I hope someone proves me wrong. :-)

Correct me if I’m wrong, but the “prover” part of the Isar language is a simple unification search*, based on the intro, dest, and a few other attributes on theorems. Replacing this lightweight piece of the language with something heavyweight and opaque like NNs and you have a totally different beast on your hands. Isar’s success would be no indication of this tool’s success.

* That reminds me. I think that you can’t do a unification search in Metamath as-is. The mmj2 tool can do it because it parses out metamath statements and unifies with the AST. I think this really needs to be a fundamental part of the langauge, like it is in Isabelle.

Slawekk says:

the intelligent selection of good lemmas that goes on when building a theory is a form of compression, and that neural networks rely on the absence of compression to be able to do their work.

What I was talking about was not related to the building of theory or neural networks or AI. I was talking about the problem you will encounter when you try to implement high-level proof language on top of the Metamath system. High level proof language by definition takes larger steps than native (low-level) Metamath language. That means that there will be gaps in the reasoning and to make the whole thing worthwhile you will need a prover that would fill the gaps. A piece of software that would be able to generate Metamath proofs of very simple lemmas (the lemmas being specified by a human in the form of statements in the high -level language that need to be expanded into complete Metamath proofs).

> the “prover” part of the Isar language is a simple unification search*, based on the intro, dest, and a few other attributes on theorems.

Maybe the problem is simpler than I think. But the fact is that I can take relatively (in comparison to Metamath) large proof steps in Isabelle/Isar and this is because there are hundreds of theorems in the simplifier that are used when Isabelle searches for the proof (of a statement proven “by simp”). The simplifier is tuned quite well, yet it still sometimes goes into an infinite loop or can not prove some things that I feel are much simpler than others that it can prove. You will have to do something similar for Metamath. The question is: how do you tune the proof search so that it works well for the high level language? My suggestion is to express the algorithm for the proof search in a way that can be used for genetic algorithm optimization (no relation to neural networks) and then evolve (train) the population of different algorithms using problems generated from exiting proofs taken from set.mm.

> Replacing this lightweight piece of the language with something heavyweight and opaque

Note that we don’t care how big and opaque the prover is. Only the verifier has to be lean and simple. The prover can be a total mess, as long as it is able to find proofs effectively.

pdf23ds says:

You’re right, I misunderstood what algorithm you were proposing. I have in fact used genetic algorithms for things other than neural nets, so I’m not sure why I conflated the two. Perhaps because it wasn’t obvious to me how else to use it, but of course using it to order the theorems would be one obvious way.

A problem with using the genetic algorithm to order the theorems to search is that it would be logic-dependent, and thus a clean build could conceivably take days unless that data were stored somehow in the source.

All in all, I think this is probably a good idea that could work equally well with Isabelle as with a Metamath simplifier, and is orthogonal to the problem of designing a good declarative proof language. But it could potentially make producing a working simplifier quite a bit simpler.

Note that we don’t care how big and opaque the prover is. Only the verifier has to be lean and simple. The prover can be a total mess, as long as it is able to find proofs effectively.

Well, the more opaque it is the harder it is to break down what it did after the fact if the inference it made was (itself) too opaque. In other words, it doesn’t make for readable proofs. But this is just a potential problem that could be avoided with enough care.

Slawekk says:

Great, I see I managed to make myself more clear in the second commet. Let’s followup on that.

using the genetic algorithm to order the theorems to search

The idea is not to use the genetic algorithm to find the optimal proof strategy (order of theorems to apply). There is no one optimal proof search strategy that is best for everything. The result of the optimization would be a whole population of strategies, maybe a hundred or more.
For example in Isabelle/ZF there are Isar steps that Isabelle accepts when done “by simp” but not when done “by blast” and vice versa. I usually first try “simp” and if that fails I try “blast”. It would be best if Isabelle run it both ways in parallel and returned the one that is the first to find a proof. The idea with a population of provers pushes this to extreme.

But I think the gratest advantage of this approach is that you can continue to evolve the prover population while you create mathematics. As the material progresses and you go from simple stuff (like, say, topological groups) to more complicated subjects (like calculus) the optimal strategies are different. From the user point of view it looks like your system learns and improves on its own while being used.

Leave a Reply

To include an em dash, use three hypens with no surrounding spaces, or two with surrounding spaces.