I think this is really cool. It's at least shaped like something really cool. But I need to have my hand held a little bit more than this page is set up for. Is there like a... for dummies version?
It's a stripped down model of computation like the SKI calculus [0] or its cousin the lambda calculus, which are formal systems with precise rules for mechanically evaluating or reducing expressions.
It differs from the SKI calculus in that it can reflect on its own program structure in ways the SKI calculus cannot - deciding if two programs are equal, for instance [1]. Further, unlike the lambda calculus, reducing a program with the reduction rules given [2] eventually converges upon a stable "normal form" of the program, expressed in irreducible terms, instead of leading to possibly infinite chains of reduction [3]. This allows for reflection without needing to "quote" or serialize the program into a stable data structure or other representation to sidestep the possibility of infinite reduction. This is similar to the notion of homoiconicity in Lisp.
It would appear that programs even in the SKI calculus and other combinatory logics can reduce to base combinators in finitely many steps, but when those combinators are applied to arguments, the possibility of non-termination arises [0]:
> Combinatory logic [9,12] is very close to λ-calculus, and widely considered to be equivalent to it [8], but there are important differences. In particular, by modifying the usual account of recursion, i.e. by modifying the usual account of fixpoint functions, it is possible to ensure that all programs are represented by stable combinators, i.e. combinators that are irreducible or in normal form. Only when applied to arguments does non-termination become possible.
> That is, we may define a program to be a combinator in normal form. As such it is both an executable, since applications of it will trigger evaluation, and also a syntax tree, i.e. a binary tree of applications, with leaves labelled by the primitive combinators, say S and K.
This thread is perhaps phrased confusingly. If I understand correctly, the claim is only that any program (including recursive ones) can be expressed in a finite normal form, which may of course diverge when "run" (applied to an argument). The same is true for basically every language though, e.g. consider the following Python:
omega = lambda x: x(x)
omega(omega)(42)
The expression `omega(omega)` diverges, but only because we've applied it to an argument. The function `omega` itself is in an irreducible normal form, namely `lambda x: x(x)`.
On the other hand, consider the function call with argument `42`: in that case, the function is `omega(omega)`, which is not in a normal form, and in fact diverges. In a system like SK, and presumably this tree calculus, we can form a similar expression, where the "function part" (i.e. left sub-tree) diverges with no normal form. That's unavoidable, due to the halting problem.
I think the claim is that we never need to write such expressions, since there's always an alternative which behaves the same but whose "function part" has a normal form.
As a (slightly) more practical example, consider some code like:
adjust = (lambda x: x + 1) if increase else (lambda x: x - 1)
The function `adjust` is not in a normal form, since it depends on the value of `increase` (when truthy, `adjust` will increment; when falsy it'll decrement). Yet most programmers would probably avoid that, in favour of this:
adjust = lambda x: x + 1 if increase else x - 1
Both implementations of `adjust` will behave the same, but the second is in a normal form.
I'm rusty, but don't the two differ in how they depend on `increase`? The second adjust lets you decide to assign a value to the variable `increase` much later, whereas the first adjust fixes the value of `increase` at the point of evaluating that very assignment statement. There's a name for this, late binding or dynamic scoping or something.
Yes, my argument relies on immutability of all values (including `increase`). Also, under lazy evaluation we can call `adjust2` immediately, since we know it's a `lambda`; yet attempting to call `adjust1` will force evaluation of `increase`; not good if `increase = provable(collatz_conjecture)`!
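The binding difference discussed above can be checked directly in Python (a small sketch; `adjust1` and `adjust2` are just labels for the two definitions from earlier in the thread):

```python
increase = True

# First form: the conditional is evaluated *now*, at assignment time,
# so one concrete lambda is chosen and increase is never consulted again.
adjust1 = (lambda x: x + 1) if increase else (lambda x: x - 1)

# Second form: the lambda body reads the name increase at *call* time.
adjust2 = lambda x: x + 1 if increase else x - 1

increase = False
print(adjust1(10))  # 11: still the incrementing lambda chosen earlier
print(adjust2(10))  # 9: sees the new value of increase
```

So with mutation in play the two really do diverge, which is why the argument above needs immutability.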
Reading the technical papers around Tree Calculus, I now see this claim a little more clearly. It's based around Tree Calculus rejecting eta-equivalence.
In Lambda Calculus, eta-equivalence says that `λf. λx. f x` is equivalent to `λf. f`, i.e. a function which passes its argument straight into `f` and returns the result unmodified, is indistinguishable from the function `f` itself. Those two functions could beta-reduce in a different order, depending on the evaluation strategy we use; but such distinctions are unobservable from within Lambda Calculus itself. In fact, we can make a rule, called eta-reduction, which transforms one into the other:
λx. F x --> F
This is sound and, if we apply it before beta-reduction, ensures that both forms will also reduce in the same way (and hence have the same normal forms, if they exist). Note that Python has a single evaluation strategy, which will not reduce an expression like `lambda f: lambda x: f(x)` to `lambda f: f`; hence it doesn't implement eta-reduction.
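That Python performs no eta-reduction is easy to observe (a tiny sketch, using the built-in `abs` as an arbitrary `f`): the eta-expanded wrapper is extensionally equal to `f`, yet they remain distinct objects.

```python
f = abs
wrapper = lambda x: f(x)   # the eta-expansion of f

# Extensionally indistinguishable: same result on every argument...
assert all(wrapper(n) == f(n) for n in range(-5, 6))

# ...but Python never reduces the wrapper to f itself.
assert wrapper is not f
```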
In SK combinatory logic, eta-reduction is also sound; e.g. if 'xyz' reduces to 'yz' then 'xy' is equivalent to 'y'. This is obvious with a combinator like I = SKK, since Iy reduces to y via the ordinary S and K rules, so Iyz reduces to yz in the same way, and the above implication is trivial. However, there are other combinators which don't reduce all the way until they get another argument, i.e. a combinator W where Wy does not reduce to y, but Wyz does reduce to yz (the Tree Calculus papers give examples, referred to as "wait" combinators). It's fine to use an eta-equivalence rule like "if xyz ~= yz then xy ~= y" for reasoning about SK programs, but to actually reduce real expressions we may need a whole bunch of rules, to cover the most common "wait" combinators. The reason this is sound is that the only ways an SK expression can "use" an argument it's applied to are to (a) discard it, (b) apply some other expression to it (potentially copying it a few times), or (c) apply it to some other expression. Cases (a) and (b) do not depend on the particular value of the expression, and the behaviour of (c) respects eta-equivalence.
Tree Calculus can also use a given expression in another way: it can branch on the structure. That doesn't respect eta-equivalence, and hence Tree Calculus can use this to distinguish between two combinators P and Q even if Px == Qx for all x.
Thanks for the links, I'll be reading up on these.
When I signed up for a computer science degree, I was hoping I'd learn this stuff (if not this calculus, then enough context to grapple with it at least). What I actually got was software engineering. Snore.
The Theoretical Programming Languages class I took for my minor at Princeton was okay, basically they just taught from Benjamin Pierce's textbook. I'm sure there are other undergrad level intro class syllabi available if that's something you have spare time to self study.
It's a very funny way to structure the main landing page—it takes cues from trendy programming language and framework websites (single-word headings that feel slightly buzzwordy, animated code samples, etc), but the body text is overwhelmingly dense and lengthy academic language, and even then it doesn't actually provide enough detail to really understand what is going on.
I spent a while parsing the paragraphs in the hope of understanding what this is, only to discover that in spite of its verbosity the text on the landing page is no more informative than PL landing pages usually are—it tells you a lot about what the author thinks is great about the language while not actually explaining how any of it works. I guess I need to go to the specification for that?
> it tells you a lot about what the author thinks is great about the language
I did not get so far as to understand that it was a language. I thought maybe it was some kind of higher order function to be implemented in a language of your choosing. Like a map/reduce sort of thing.
It's an example of combinatory logic. For another example check out SKI combinators. SKI is actually kind of close to this, but by using three combinators instead of one it's a bit simpler to understand (IMO).
Note that the I combinator is redundant since `I = SK`, so all we really need is `SK`. There are alternatives like Iota with a single combinator, but they're essentially an obfuscated mix of S and K; so it's usually clearer to stick with SK.
Note that the first two rules of this Tree Calculus are precisely those of K and S.
Oops, sorry; I got part way through writing I = S K K, but went off to another tab to double-check, and forgot to update that expression when I came back!
You know, it's entirely possible that it isn't a language and I just assumed that into existence. It's a very confusing landing page.
Edit: no, it's definitely a language of some sort:
> The syntax and semantics of Tree Calculus are minimal and self-contained, relying on no platform-specific concepts or abstractions. As a result, it is trivial to write interpreters on any platform or in any programming language (demo). ...
> The ability to bootstrap the full power of Tree Calculus anywhere makes it an excellent configuration as code language in a heterogeneous system. Developers could write Tree Calculus programs that generate, say, JSON, and use an interpreter ...
Thanks for the feedback! Tree Calculus is a calculus/logic; see the Specification page or the book by Barry Jay (linked on that page) for far better and more detailed explanations. It only defines what I chose to call "t" on the website; Barry uses "Δ" in his book and papers.
So without anything else, we'd have to talk about programs in terms of "(t t) t ..." or binary trees, which gets unwieldy quickly. The first natural step, for practical purposes, is to allow definitions "foo = ...", then some syntactic sugar for lists, functions, etc. Oops, and now we have a "language". If you open the "Playground" page there's a summary of what exactly TC is and what is syntactic sugar (and really nothing more!) on top of it.
I like to think that the line is so blurry precisely because TC needs nothing but a bit of syntactic sugar to feel like a usable PL haha.
Right, it's a programming language the way the lambda calculus or pi calculus or whatever are programming languages—I did understand that much!
I love the idea of having a website like this to introduce people to one of the less popular calculi, and the playground is a great touch. It might be helpful to have an introductory paragraph that explains exactly what the tree calculus is, starting from what a "calculus" is—your target audience seems to be people who aren't otherwise going to go out of their way to read Barry's papers, which means you can't assume as much background as you currently do. As a reference, I'm a casual PL nerd who actually has read academic papers related to some of the less well-known calculi with an eye towards implementing them, so I'm on the more informed end of the spectrum of your target audience.
Have you considered making this site open source? No pressure if not, but if so I'd be happy to contribute to polishing up the landing page. I'm very interested in learning more about this anyway, and I'm more than willing to help!
The specification says that the syntax of expressions in this thing is `E ::= t | E E`. This is a bit confusing, because it might lead you to believe that all expressions just look like `t t t t t t t`. In reality, you are supposed to keep the implied bracketing, so expressions really look like `(t t) (t ((t t) (t t)))` (note that at the top level and in each parenthesis, there are always exactly two subexpressions). Essentially, the space character becomes a binary operator (similar to how we often write multiplication with no symbol).
The expressions are a bit heavy on parentheses, so we say that this binary operator is left associative. This means that an expression `a b c` should be interpreted as `(a b) c`, an expression `a b c d` should be interpreted as `((a b) c) d`, and so on. If you think about it, this means that you can always get rid of an opening parenthesis at the left edge of an expression, i.e. we can assume that an expression always starts without one.
With this out of the way, we can now understand where the trees come from: As there is only one terminal symbol, `t`, after removing unnecessary parentheses, every expression will always start with `t`, which is followed by a number of other expressions. To draw this as a tree, draw a node representing the initial `t`, and a sub-tree for each of the follow-up expressions (by applying the same procedure to them).
In this view, the semantic rules at the top of the specification page now tell you how to "simplify" a tree whenever there is a node with three or more sub-trees, or alternatively, how to reduce an expression that is a `t` followed by three or more sub-expressions. (In the syntax view, you replace the `t` and the first three expressions following it by what's on the right of the arrow. In the tree view, you replace the node and its first three children by some other sub-tree, then you attach the remaining children to the root of the new sub-tree.)
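To make this concrete, here is a toy evaluator sketch (my own representation and my own reading of the specification's five rules, so treat it as illustrative, not authoritative): a tree is a Python list of its children, the leaf `t` is `[]`, and applying a tree to an argument appends that argument as a new rightmost child. Reduction fires whenever a node ends up with three or more children.

```python
def apply(f, x):
    """Apply tree f to tree x: x becomes f's next (rightmost) child."""
    return reduce_tree(list(f) + [x])

def reduce_tree(node):
    """Reduce while the root has three or more children."""
    while len(node) >= 3:
        a, b, c, rest = node[0], node[1], node[2], node[3:]
        if len(a) == 0:        # t t y z       -> y
            node = b
        elif len(a) == 1:      # t (t x) y z   -> y z (x z)
            node = apply(apply(b, c), apply(a[0], c))
        else:                  # t (t w x) y z -> branch on the shape of z
            w, x = a[0], a[1]
            if len(c) == 0:    #   z = t       -> w
                node = w
            elif len(c) == 1:  #   z = t u     -> x u
                node = apply(x, c[0])
            else:              #   z = t u v   -> y u v
                node = apply(apply(b, c[0]), c[1])
        for arg in rest:       # re-attach the remaining children
            node = apply(node, arg)
    return node

t = []                               # the leaf
tt = apply(t, t)                     # t t, behaves like K
i = apply(apply(t, tt), tt)          # t (t t) (t t), an identity combinator
assert apply(apply(tt, i), t) == i   # t t y z -> y
assert apply(i, tt) == tt            # I y     -> y
```

Note how binary trees (every node with at most two children) never trigger the `while` loop, matching the claim that they are the normal forms.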
A defining sentence at the top of the page, something like “Tree Calculus is a [noun phrase] for [summary of purpose],” would be helpful. Wikipedia articles typically begin with such sentences:
“Lambda calculus (also written as λ-calculus) is a formal system in mathematical logic for expressing computation based on function abstraction and application using variable binding and substitution.”
“In mathematics, matrix calculus is a specialized notation for doing multivariable calculus, especially over spaces of matrices.”
“The felicific calculus is an algorithm formulated by utilitarian philosopher Jeremy Bentham (1748–1832) for calculating the degree or amount of pleasure that a specific action is likely to induce.”
It's a programming language whose programs (and whose values) are unlabeled trees.
An unlabeled tree is a tree-shaped data structure whose nodes hold no data. The children of a node are ordered, though. The "Tree Calculus" defines a set of rules by which you can "evaluate" an unlabeled tree to get another unlabeled tree. If you apply these rules repeatedly, either you'll get into an infinite loop, or you'll get a tree that the rules say doesn't change anymore. The rules are designed so that they don't affect binary trees, so if you evaluate a binary tree you'll get the same tree back out and the computation is "done". These rules are written as a small-step semantics (a standard way to write down evaluation rules in PL) in the "specifications" page.
They claim that:
- The evaluation rules for trees are Turing complete, meaning that you can express any computation, e.g. any JS program, using the Tree Calculus. More precisely, the claim is that there's a straightforward way to convert any (say) JS program into a tree, and any tree into a JS value, and now you can use tree evaluation to run a JS program by doing (i) convert the JS program into a tree, (ii) evaluate the tree to get another tree, and finally (iii) convert the tree into a JS value, which is the correct output of the JS program. To prove this you wouldn't actually use JS as the language, you'd use something simpler that we already know is Turing complete like the lambda calculus, but it's the same idea. Though glancing at the page they might have actually done this for JS.
- The evaluation is asymptotically optimal, meaning that for any programming language P (like JS), there's a conversion f(prog) from programs in P to Tree Calculus trees and constants a and b such that running_time(f(prog)) <= a+b*running_time(prog). That is, you can run programs in any language using the Tree Calculus with ~constant overhead. This is true for all the programming languages you love, e.g. you could write a JS program that takes a Java program, compiles it to bytecode, and runs that bytecode, and unless you did this unreasonably badly the running time won't be worse than a factor of 1,000,000.
- A whole bunch of other stuff too. It's all plausible at first glance (i.e., I don't think they're making any of it up), but not obviously consequential.
What's it good for:
Some PL people might think it's cool. Conceivably useful to simplify some theory of computation proofs.
If you find this sort of thing interesting, though, I'd recommend learning the lambda calculus instead. The lambda calculus is simpler, more well known, and more useful (it's a model of functions, instead of some made up rules about trees).
> The lambda calculus is simpler, more well known, and more useful (it's a model of functions, instead of some made up rules about trees).
Lambda calculus is actually quite tricky; e.g. I've implemented it many times over the decades, both for fun and for serious applications, and have still never properly implemented capture-avoiding substitution (I've tried, but usually give up and reach for an existing implementation).
Also, notice that the tree calculus reduction rules are confluent, disjoint and branch on their left subtrees (similar to SK calculus). Hence they are also a "model of functions" like lambda calculus; or, alternatively, lambda calculus is also "some made up rules about [expression] trees".
> If you find this sort of thing interesting, though, I'd recommend learning the lambda calculus instead. The lambda calculus is simpler, more well known, and more useful (it's a model of functions, instead of some made up rules about trees).
I was with you all the way up until here.
The lambda calculus is only more useful because it's become the basis for a lot of existing programming languages. Its made up rules are no more fundamental than the made-up rules of any other calculus, tree calculus included. They just seem more fundamental because they form the basis of most functional programming today.
I'm also unconvinced that the lambda calculus itself is meaningfully simpler. It typically has three syntactic components and two reduction operations, compared to one syntactic component and five reduction operations—that gives the lambda calculus an edge in moving parts, but a very small one.
The only way in which I could agree with you is that learning the lambda calculus first is going to be easier because there's so much more material on it. But that's not because of the fundamental properties of the lambda calculus, it's because Church came first.
> The lambda calculus is only more useful because it's become the basis for a lot of existing programming languages.
But there's a reason for that. Functions make a good basis for programming languages. It's not the only good basis! Concatenative combinators make a good basis too: see Factor and Joy.
If you take the lambda calculus and add numbers and a few operations on them, it's easy to program in it. Likewise for the concatenative calculus. But not so for NAND gates or SK combinators. You certainly can do anything you want with them, but doing so feels more like solving a logic puzzle than programming. I am likewise skeptical about the tree calculus.
I'm skeptical that we'd feel that way if Church didn't come first. His work defined the field and shaped the way we all think about programming in profound ways that make it very difficult to distinguish how natural one is over the other.
Turing published the same year as Church, a couple of months later. Church's advantage was very small.
A Turing machine includes writing over cells, so it models a calculating man with infinite paper, an infinite pencil and an infinite eraser. Lambda calculus models a calculating man with infinite paper and an infinite pencil. Lambda calculus has a very substantial advantage here.
So if it's just an unlabeled tree (nodes hold no data) then the only information is the child order / count, correct? So part of it is somehow mapping some high level information to combination of nodes and children, and back (after some manipulation), correct? Or am I misunderstanding everything?
Correct. I think of it this way: The reduction rules prescribe an encoding for functions, but don't prescribe one for other (traditional) data. But there are very canonical choices of course, which the demos on the website follow:
* false := t and true := t t
* Pairs are t first_elem second_elem
* Lists could be encoded as (t first_elem (t second_elem (t third_elem ...))) with empty list being t
* Natural numbers as lists of booleans (LSB first)
* Strings as lists of natural numbers (unicode code points)
These choices will affect what the functions that operate on data look like, concretely.
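The bullets above can be sketched in a few lines of Python (a toy representation of my own, where a tree is a list of its children and the leaf `t` is `[]`; data trees built this way are binary, so no reduction is ever needed):

```python
t = []            # the leaf

false = t         # false := t
true = [t]        # true  := t t (t applied to t)

def pair(a, b):
    # t first_elem second_elem
    return [a, b]

def cons(head, tail):
    # lists: t first (t second (t third ...)); the empty list is t
    return [head, tail]

nil = t

def nat(n):
    # naturals as lists of booleans, least-significant bit first
    if n == 0:
        return nil
    return cons(true if n & 1 else false, nat(n >> 1))

def string(s):
    # strings as lists of naturals (Unicode code points)
    out = nil
    for ch in reversed(s):
        out = cons(nat(ord(ch)), out)
    return out

assert nat(2) == cons(false, cons(true, nil))   # binary 10, LSB first
```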
people are always so impressed by this kind of stuff but i'll never get why. doesn't your intuition already lead you to recognize what's going on? or do you recognize and are still impressed?
look if it walks like an expression tree, talks like an expression tree, and quacks like... something else... then it's basically an expression tree with extra sugar and spice. that doesn't mean the sugar and spice isn't fun, it means the basic idea is exactly what you think it is - expression trees (plus in this case enough semantics to define combinators).
I'm very much impressed with this. I have never seen a programming language that allowed me to de/serialize functions. Let alone calculate the inverse of a function. If you're saying one or more languages with these features already exist then I'm very interested in names, links or references.
> have never seen a programming language that allowed me to de/serialize functions.
You can pickle functions in python? You can trivially serialize any lisp function (I'm not a lisp fan). Plenty of programming languages with both macros and first class function objects (those that can be passed around and thus have data representations).
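For instance (a sketch; `double` is a throwaway example function): `pickle` serializes a named top-level function by reference to its qualified name, and the `marshal` module can dump its actual compiled code object.

```python
import marshal
import pickle
import types

def double(x):
    return 2 * x

# pickle stores a reference (module + qualified name), not the code itself
blob = pickle.dumps(double)
assert pickle.loads(blob)(21) == 42

# marshal serializes the compiled code object; rebuild a function from it
code = marshal.loads(marshal.dumps(double.__code__))
revived = types.FunctionType(code, {})
assert revived(21) == 42
```

Note that the pickle round-trip only works where the named function is importable on the loading side, which is part of the "quoting" caveat discussed downthread.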
> Let alone calculate the inverse of a function
Note it says "try to compute the inverse" because actually computing inverses is equivalent to the halting problem.
"If it seems to good to be true it probably is" could be adapted here to "If it seems too magical to be true, it's probably just cherry-picked".
> You can pickle functions in python? You can trivially serialize any lisp function (I'm not a lisp fan).
The point of the tree calculus appears to be that it doesn't require the intermediate step of "pickling" or, as the author calls it, "quoting" the program to produce a data structure or other representation of the program [0]:
> Previous accounts of self-interpretation had to work with programs that were not normal forms, that were unstable. Stability was imposed by first quoting the program to produce a data structure, by putting on some make-up. In tree calculus, the programs are already data structures, so that no pre-processing is required; both of the self-evaluators above act on the program and its input directly. In short, tree calculus supports honest reflection without make-up.
It sounds similar to the notion of homoiconicity as in Lisp, but probably more precisely or even strongly stated.
> Plenty of programming languages with both macros and first class function objects (those that can be passed around and thus have data representations).
A language may have first class function objects, but its actual structure may be opaque and not open to reflection or manipulation (beyond of course just munging the source code as plaintext). You can maybe create a function literal, pass the function around and to higher-order functions, but you can't inspect or modify its internal structure, or decide program equality (based on either exact structure, or one program reducing to another according to the reduction rules of the calculus).
Lastly the tree calculus would also appear to differ from the lambda calculus in that programs are stable and won't reduce infinitely, instead converging on some normal form of irreducible terms. [1]
More precisely, the distinction would seem to be that programs in the tree calculus can analyze themselves with reference only to the reduction rules of the calculus, not needing to reach for some meta-language or theory outside the calculus that works on source code or some AST representation of the program [0]:
> Reflective programs are programs that can act on themselves to query their own structure. The querying is important: that the identity function can be applied to itself does not make it reflective. A simple example is the size function defined in Chapter 5. When applied to itself, the result is (the tree of) the number 508 [...] Self-evaluation in a calculus provides good evidence for the ability to perform program analysis and optimisation within the calculus itself. Traditionally, self-interpreters were allowed to act on the syntax tree of a program, i.e. its quotation. [...] When quotation lies outside of the formal calculus then interpretation is separated from computation proper, so that some sort of staging is required.
A demo of a size function is given there [1], implemented directly in the tree calculus.
By the measuring stick of "anything that gratifies one's intellectual curiosity," I'd say this submission has been a roaring success for a lot of people. The nice thing about HN is that if something on the front page doesn't gratify your personal curiosity there are 29 more things that might.
Well, "hoax" may indeed not be the best word, because it implies intent. I've known of another highly intelligent software developer that invented similar symbology, but was suffering from a mental disorder which made him believe his work was useful even though it's only uses were 'self-referential'. Highly intelligent people, including software developers, are not immune from such disorders.
The Y combinator is a simple syntax for representing the concept of passing a function to itself, so it's useful for symbolic manipulation.
However the "Tree Calculus", just from those 5 little rules, is supposedly able to do all kinds of magic. I'm not buying it. It's nothing but musings and examples of binary trees, and how you can convert various data structures to binary trees.