All elementary functions from a single binary operator
ELI5/TLDR
Your scientific calculator has dozens of buttons — sin, cos, log, square root, and so on. This paper shows you only need one. A single operation called EML, which just does “e to the x minus log of y,” can rebuild every single function on that calculator when paired with the number 1. It is the continuous-math equivalent of the NAND gate, which alone can build any digital circuit. The author also shows you can use this to build trainable formula-discovery circuits that recover exact mathematical expressions from raw data.
The Full Story
The NAND gate of continuous math
In digital electronics, there is a famous fact: one gate, NAND, can build any Boolean circuit. Every AND, OR, NOT, XOR — all of them are just NAND gates wired together in the right pattern. It is a beautiful reduction. The entire universe of digital logic collapses into one repeating brick.
Continuous mathematics — the world of sin, cos, exp, log, square roots, and all the functions on your scientific calculator — never had anything like this. You could reduce the list somewhat. Euler’s formula connects trig functions to the exponential. Logarithms turn multiplication into addition. But you always seemed to need at least a few distinct operations.
Andrzej Odrzywołek, a theoretical physicist at Jagiellonian University in Krakow, found that this impression is wrong. One operation is enough.
The operator
The operator is called EML, for Exp-Minus-Log:
eml(x, y) = exp(x) - ln(y)
That is it. Take the exponential of the first input, subtract the natural log of the second. Together with the constant 1, this single operation can reconstruct every function on a standard scientific calculator.
Think of it like a Swiss Army knife where every tool is actually the same blade, just folded differently. The exponential function? Easy: eml(x, 1) = exp(x) - ln(1) = exp(x), because ln(1) is zero. The natural log is a bit more work — you nest three EML calls — but it works. And from exp and ln, you can bootstrap everything else: addition, subtraction, multiplication, division, exponentiation, trig functions, hyperbolic functions, their inverses, and constants like e, pi, and the imaginary unit i.
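To make the two identities above concrete, here is a minimal Python sketch restricted to real inputs with y > 0. The helper names exp_via_eml and ln_via_eml are made up for illustration, and the three-call nesting shown for ln is one arrangement that can be checked by hand; it may not be written exactly the way the paper writes it.

```python
import math

def eml(x: float, y: float) -> float:
    """Exp-Minus-Log: exp(x) - ln(y), here for real x and real y > 0 only."""
    return math.exp(x) - math.log(y)

def exp_via_eml(x: float) -> float:
    # ln(1) = 0, so the second argument drops out.
    return eml(x, 1.0)

def ln_via_eml(x: float) -> float:
    # Step 1: eml(1, x)        = e - ln(x)
    # Step 2: eml(step1, 1)    = exp(e - ln(x)) = e^e / x
    # Step 3: eml(1, step2)    = e - (e - ln(x)) = ln(x)
    return eml(1.0, eml(eml(1.0, x), 1.0))

if __name__ == "__main__":
    for x in (0.5, 1.0, 2.5):
        print(x, exp_via_eml(x), math.exp(x), ln_via_eml(x), math.log(x))
```

The only leaves used are the constant 1 and the variable x, which is the whole point: nothing but eml nodes sits between the inputs and the result.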
How he found it
The method was essentially brute force with clever bookkeeping. Odrzywołek started with the full 36-button calculator — all the constants, functions, and operations you would expect — and systematically asked: can I remove one button and still express everything else?
He built progressively smaller “calculators”:
- Calc 3 (6 primitives): exp, ln, negation, reciprocal, addition, and the ability to generate constants internally
- Calc 2 (4 primitives): exp, ln, and subtraction — subtraction being crucial because it is non-commutative, giving you both growth and inversion
- Calc 1 (4 primitives): binary exponentiation, binary logarithm, and the constant e or pi
- Calc 0 (3 primitives): exp, binary logarithm — absorbing the constant e into the exponential itself
At Calc 0, the pattern was clear: a single binary operator might exist. But it was not going to be any named function. He began enumerating candidate binary operations built from elementary pieces and testing each one. After numerous failures and a few false positives, EML emerged.
The testing itself is clever. Instead of trying to symbolically prove every identity (computationally intractable at the nesting depths involved), he substituted specific transcendental constants — like the Euler-Mascheroni constant — for variables, computed numerical values, and used an “inverse symbolic calculator” to check whether the outputs matched known expressions. Under standard mathematical conjectures, accidental numerical collisions are vanishingly unlikely. Full symbolic verification was then done separately.
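The numerical trick can be sketched in a few lines. This is a toy stand-in, not the author's pipeline: the lookup table `known` plays the role of the inverse symbolic calculator, the candidate list is hand-picked, and ordinary double precision is used where a real search would presumably want many more digits.

```python
import math

# Euler-Mascheroni constant, used as a "generic" test point for x.
GAMMA = 0.5772156649015329

def eml(x, y):
    return math.exp(x) - math.log(y)

# Hypothetical candidate expressions whose meaning we want to identify.
candidates = {
    "eml(x, 1)": lambda x: eml(x, 1.0),
    "eml(1, x)": lambda x: eml(1.0, x),
    "eml(1, eml(eml(1, x), 1))": lambda x: eml(1.0, eml(eml(1.0, x), 1.0)),
}

# Tiny stand-in for an inverse symbolic calculator: named reference functions.
known = {
    "exp(x)": math.exp,
    "ln(x)": math.log,
    "e - ln(x)": lambda x: math.e - math.log(x),
    "sqrt(x)": math.sqrt,
}

def identify(f, tol=1e-12):
    """Return the names of known functions whose value at GAMMA matches f(GAMMA)."""
    value = f(GAMMA)
    return [name for name, g in known.items() if abs(value - g(GAMMA)) < tol]

if __name__ == "__main__":
    for name, f in candidates.items():
        print(name, "->", identify(f))
```

A match at one transcendental point is evidence, not proof, which is why the separate symbolic verification step still matters.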
What the expressions look like
EML expressions are binary trees where every internal node is the same operation. The grammar is absurdly simple:
S -> 1 | eml(S, S)
That is a context-free grammar with one terminal symbol and one production rule. Every elementary function is a sentence in this language.
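Because the grammar is so small, enumerating its sentences takes only a few lines. The sketch below lists every sentence with a given number of eml nodes; the function name `sentences` is an illustrative choice, and only the terminal 1 from the grammar above is used (a variable leaf would be an extra assumption).

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def sentences(n_ops: int) -> tuple:
    """All sentences of S -> 1 | eml(S, S) containing exactly n_ops eml nodes."""
    if n_ops == 0:
        return ("1",)
    out = []
    # Split the remaining n_ops - 1 operators between the left and right subtree.
    for left_ops in range(n_ops):
        for left in sentences(left_ops):
            for right in sentences(n_ops - 1 - left_ops):
                out.append(f"eml({left}, {right})")
    return tuple(out)

if __name__ == "__main__":
    for n in range(4):
        print(n, len(sentences(n)), sentences(n)[:2])
```

The counts follow the Catalan numbers (1, 1, 2, 5, 14, ...), the usual count of binary tree shapes, which gives a feel for how quickly an exhaustive search space grows.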
The simplest expression is the constant e itself: eml(1, 1) = exp(1) - ln(1) = e. The exponential is depth 1. The natural log is depth 3 (seven nodes). Negation is depth 4. Multiplication requires depth 8 in the unoptimized compiler output, though direct exhaustive search finds shorter versions (about 17 nodes when written in Reverse Polish Notation).
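Depth and node counts are easy to check mechanically. In the sketch below a tree is a nested pair, a leaf is the string "1" or "x", and the ln tree uses one three-call nesting that can be verified by hand (assumed here, not quoted from the paper). All names are illustrative.

```python
import math

# A leaf is "1" or "x"; an internal eml node is a pair (left, right).
E = ("1", "1")                     # eml(1, 1) = e
EXP = ("x", "1")                   # eml(x, 1) = exp(x)
LN = ("1", (("1", "x"), "1"))      # eml(1, eml(eml(1, x), 1)) = ln(x)

def depth(t):
    """Number of eml levels between the root and the deepest leaf."""
    return 0 if isinstance(t, str) else 1 + max(depth(t[0]), depth(t[1]))

def nodes(t):
    """Total node count: leaves plus eml nodes."""
    return 1 if isinstance(t, str) else 1 + nodes(t[0]) + nodes(t[1])

def rpn(t):
    """Reverse Polish Notation: operands first, then the operator."""
    return t if isinstance(t, str) else f"{rpn(t[0])} {rpn(t[1])} eml"

def evaluate(t, x):
    """Evaluate the tree at a real x (second arguments must stay positive)."""
    if isinstance(t, str):
        return 1.0 if t == "1" else x
    return math.exp(evaluate(t[0], x)) - math.log(evaluate(t[1], x))

if __name__ == "__main__":
    for name, tree in (("e", E), ("exp", EXP), ("ln", LN)):
        print(name, depth(tree), nodes(tree), rpn(tree))
```

In RPN every token is one node, so the length of the RPN string is the same thing as the node count.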
The expressions get large. But “large” here means trees of maybe a few hundred nodes — tiny by the standards of modern neural networks with trillions of parameters.
Symbolic regression: discovering formulas from data
Here is where the paper gets genuinely practical. Because every elementary function is an EML tree of identical nodes, you can build a “master formula” — a complete binary tree of fixed depth where each node’s inputs are learnable linear combinations of the constant 1, the variable x, and the output of the node below.
Think of it like a neural network, but instead of layers of different neurons with different activation functions, every node does exactly the same thing: EML. The learnable parameters are just switches that route different inputs to different nodes.
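A rough sketch of the forward pass may help. The node layout, the weight triple (c_one, c_x, c_child), and the use of None for a pruned subtree are all assumptions made for brevity; the paper's actual parameterization (for example, how weights are constrained) is not reproduced here.

```python
import math

def eml(a, b):
    return math.exp(a) - math.log(b)

def evaluate(node, x):
    """node = (left_w, right_w, left_child, right_child).

    Each weight triple (c_one, c_x, c_child) mixes the constant 1, the
    variable x, and the output of the subtree below. A child of None stands
    for a subtree that is not wired in (an assumption to keep this short).
    """
    left_w, right_w, left_child, right_child = node

    def mix(w, child):
        c_one, c_x, c_child = w
        below = evaluate(child, x) if child is not None else 0.0
        return c_one * 1.0 + c_x * x + c_child * below

    return eml(mix(left_w, left_child), mix(right_w, right_child))

# Hard 0/1 routing that reproduces ln(x) = eml(1, eml(eml(1, x), 1)).
INNER = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), None, None)       # eml(1, x)
MIDDLE = ((0.0, 0.0, 1.0), (1.0, 0.0, 0.0), INNER, None)     # eml(INNER, 1)
LN_TREE = ((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), None, MIDDLE)   # eml(1, MIDDLE)

if __name__ == "__main__":
    print(evaluate(LN_TREE, 2.5), math.log(2.5))
```

With every weight at exactly 0 or 1 the tree is an ordinary EML expression; with fractional weights it is a smooth blend of nearby expressions, and that smoothness is what makes gradient-based training possible.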
You then train this tree on data using standard gradient descent (Adam optimizer), and if the data was generated by an elementary function, the trained weights snap to exact 0-or-1 values that recover the precise formula. Not an approximation. The exact symbolic expression.
When successful, the snapped weights yield mean squared errors at the level of machine epsilon squared (~10^-32), consistent with exact symbolic recovery.
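The snapping step itself can be illustrated on the smallest possible case, a single EML node fitted to exp(x). The "soft" weights below are invented to look like the end of a training run; no optimizer is run here, and the four-weight layout is an assumption for illustration.

```python
import math

def model(w, x):
    """One EML node whose inputs are linear mixes of 1 and x."""
    a_one, a_x, b_one, b_x = w
    return math.exp(a_one + a_x * x) - math.log(b_one + b_x * x)

def mse(w, xs, target):
    return sum((model(w, x) - target(x)) ** 2 for x in xs) / len(xs)

def snap(w):
    """Round every weight to the nearest integer (0 or 1 for near-binary weights)."""
    return [float(round(v)) for v in w]

xs = [0.1 * k for k in range(1, 21)]
soft = [0.0003, 0.9998, 1.0002, -0.0001]   # hypothetical end-of-training weights
hard = snap(soft)                           # routing for eml(x, 1) = exp(x)

if __name__ == "__main__":
    print("soft MSE:", mse(soft, xs, math.exp))
    print("hard weights:", hard, "MSE:", mse(hard, xs, math.exp))
```

In this toy case the snapped model is literally exp(x), so its error is exactly zero; for deeper trees the residue would instead sit at floating-point rounding level, which is where the ~10^-32 figure comes from.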
The success rates from over 1000 experimental runs: 100% recovery at tree depth 2, about 25% at depths 3-4, and below 1% at depth 5. When the correct weights are perturbed by noise and re-optimized, recovery is 100% even at depths 5 and 6 — the basins of attraction exist, they are just hard to find from random initialization.
The cousins and the open questions
EML is not unique. At least two close relatives exist:
- EDL: exp(x) / ln(y), paired with the constant e
- -EML: ln(x) - exp(y), which is the negation of EML with swapped arguments, i.e. -eml(y, x)
The author speculates an entire continuous family of such operators may exist. A ternary variant T(x, y, z) = (e^x / ln(x)) * (ln(z) / e^y) has the nice property that T(x, x, x) = 1: it generates its own constant from any input, which EML cannot do.
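The relationships between these operators are quick to check numerically. The sketch below uses real arguments only, and the function names are illustrative; it confirms that ln(x) - exp(y) equals -eml(y, x), that EDL with e in the second slot reduces to exp, and that the ternary variant returns 1 on equal arguments (for x > 0 and x not equal to 1, so that ln(x) is defined and nonzero).

```python
import math

def eml(x, y):
    return math.exp(x) - math.log(y)

def edl(x, y):
    return math.exp(x) / math.log(y)

def neg_eml(x, y):
    return math.log(x) - math.exp(y)

def ternary(x, y, z):
    return (math.exp(x) / math.log(x)) * (math.log(z) / math.exp(y))

if __name__ == "__main__":
    x, y = 0.7, 1.9
    print(neg_eml(x, y), -eml(y, x))   # same value: sign flip plus argument swap
    print(edl(x, math.e), math.exp(x)) # ln(e) = 1, so EDL reduces to exp
    print(ternary(x, x, x))            # close to 1 up to rounding
```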
One limitation: EML requires complex arithmetic internally, even when computing real functions. Generating pi and trig functions requires going through ln(-1) = i*pi. This mirrors how quantum computing uses complex amplitudes to compute real probabilities.
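The detour through complex numbers is easy to see with Python's cmath module, which uses the principal branch of the logarithm. This sketch only shows the ingredients (ln(-1) = i*pi, and a real cosine read off a complex exponential); how the paper reaches -1 and i from the constant 1 by pure EML nesting is not reproduced here, and the helper name cos_via_exp is illustrative.

```python
import cmath
import math

def eml(x: complex, y: complex) -> complex:
    """Complex-valued EML using the principal branch of the logarithm."""
    return cmath.exp(x) - cmath.log(y)

# pi appears as the imaginary part of ln(-1).
pi_from_log = cmath.log(-1).imag

def cos_via_exp(x: float) -> float:
    # Euler's formula: exp(i*x) = cos(x) + i*sin(x); the result is real.
    return cmath.exp(1j * x).real

if __name__ == "__main__":
    print(pi_from_log, math.pi)
    print(eml(0, -1))                  # exp(0) - ln(-1) = 1 - i*pi
    print(cos_via_exp(1.0), math.cos(1.0))
```

The inputs and outputs of cos_via_exp are real, but the intermediate value is not, which is exactly the limitation described above.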
Whether a purely real-valued continuous Sheffer operator exists remains open. Whether a unary Sheffer exists — one that could serve as both a neural activation function and a generator of all elementary functions — is another open question.
Claude’s Take
This is a genuinely surprising result. The existence of a continuous Sheffer operator was, as the author notes, “not anticipated.” The paper has the satisfying quality of a mathematical discovery that, once stated, feels almost inevitable — of course exp and ln should be able to bootstrap everything, we already knew they were the backbone — but the specific construction and the proof that a single binary combination suffices is new and non-obvious.
The methodology is sound but necessarily heuristic on the discovery side. The author is upfront about this: the exhaustive numerical search finds candidates, and separate symbolic verification confirms them. The verification is provided in supplementary materials. The symbolic regression application is genuinely interesting but clearly at proof-of-concept stage — the scaling problems (recovery rates dropping sharply with depth) are honestly reported.
A few things to flag: the paper is from a single author in theoretical physics, not a major ML lab, which makes the symbolic regression claims more preliminary but the core mathematical result no less valid. The practical utility is speculative at this stage — whether EML trees will actually outperform existing symbolic regression methods remains to be seen. The analog computing angle is intriguing but undeveloped.
The writing is clear and the paper does not oversell. The core result — that elementary functions form a simpler class than previously recognized, with a single generating operator — is the kind of structural insight that tends to find unexpected applications.
claude_score: 8 — This genuinely teaches something new and surprising. The core mathematical result is elegant and well-established. The ML application is promising but early. Docking a point for the symbolic regression being proof-of-concept rather than competitive with existing methods, and another for some of the more speculative claims about analog computing and activation functions that remain unsubstantiated. But the central discovery is the real contribution, and it delivers.
Further Reading
- Sheffer (1913), “A set of five independent postulates for Boolean algebras” — The original NAND gate paper. Where the idea of a single sufficient logical operator was born.
- AI Feynman / Udrescu & Tegmark (2020) — Physics-inspired symbolic regression. The leading existing method that EML trees would need to beat to matter practically.
- PySR / Cranmer (2023) — The current state-of-the-art open-source symbolic regression tool. Context for understanding where EML fits in the landscape.
- KAN: Kolmogorov-Arnold Networks / Liu et al. (2025) — A recent alternative architecture for interpretable function learning. Different approach to the same broad problem of making neural nets discover exact formulas.
- Smith et al. (2024), “An aperiodic monotile” — The “einstein” tiling discovery. Another recent example of finding a single primitive that generates unexpected richness — in geometry rather than analysis.