Part 7. Aristotle prover¶

Harmonic (the company behind Newclid) also has a Lean-based prover called Aristotle that is its closed-source flagship product. The company provides a web interface and API at https://aristotle.harmonic.fun that anyone can register and try (caveat: it can be slow and the job queue can be long). This note uses our small, friendly geometry problem to demonstrate the workflow, while the same system is aimed at harder, cutting-edge problems across mathematics (geometry, number theory, analysis, and more) [ARISTOTLE].

How Aristotle differs from Newclid (short version). Newclid is a geometry-first system: given a diagram-style construction and goals, its symbolic engine tries to derive a synthetic proof inside that geometry domain. Aristotle is a general Lean prover: it formalizes statements in Lean and then proves them with Lean tactics, so its output is a formal Lean proof rather than a geometry-specific trace. For geometry problems, Aristotle's geometric solver can still use geometric reasoning (which can look Newclid-like), but in this demo it chooses a coordinate model, which looks closer to the analytic approach in Part 2. This is why the artifacts here feel more Lean/coordinate-heavy than Newclid's geometric narrative.

Harmonic reports that Aristotle achieved gold medal-level performance on the 2025 International Mathematical Olympiad (IMO), solving 5 of 6 problems with formally verified solutions [HN-IMO][BW-IMO]. Problem 2 is about pure plane geometry [IMO2025P2]. It is unclear whether the published solution used Newclid-style geometric deduction; the Aristotle paper hints that a separate geometry solver is used, and a Newclid paper appeared shortly afterward. The proof also appears to be a geometric deduction rather than a Lean script.

These IMO announcements are a useful backdrop for this demo: we stay with our simple trapezoid ratio to show how Aristotle solves an elementary geometry problem, but the tooling is designed for much harder proofs in domains beyond geometry. Without full information on how Aristotle's geometry engine differs from the open-source Newclid, we focus on how the solutions differ in our small case. The white papers suggest that Aristotle trains an LLM with reinforcement learning to handle highly parallel Monte Carlo graph search over Lean proof states, using policy/value networks to choose nodes and estimate promise. This differs from Newclid's default BFS tree search, which is enhanced by auxiliary-point heuristics when the search gets stuck. That design discussion is beyond the scope of this note. For our simple problem, Aristotle indeed gives a distinct solution relying on Lean.

Aristotle has been used beyond contest problems. In late 2025, it produced a Lean proof of a simplified variant of Erdos Problem 124. Chinese coverage highlighted that the system completed a variant autonomously in about six hours, while other reports noted that the solved statement was easier than the original conjecture [E124-6H]. Nonetheless, this shows its early impact on the math community.

A few days later, Harmonic's tool was credited for helping resolve Erdos Problem 1026. A December 14, 2025 write-up described how the AI tool proved a key step $c(k^2)=1/k$ in Lean as part of the eventual solution [E1026-48H]. Harmonic's own news page claims that since Aristotle's public launch in late 2025, mathematicians have used it to formalize proofs across levels and that the company sees this as a new era of discovery [HN-MSI].

For broader context on Erdos problems, the community-maintained erdosproblems GitHub database tracks problem status, prizes, formalizations, and links to related resources. It is essentially a living index of the current state of the problem list and related metadata, with an auto-generated table based on the repository's data files [ERDOS-SOTA]. Top mathematician Prof. Terence Tao has been pioneering this effort and promoting the use of AI for math research. The accompanying wiki page on AI contributions is more careful and nuanced: it catalogs where AI tools have been used, but also stresses that this is not a benchmark or leaderboard, that literature review can be incomplete, and that assessments are provisional and context-dependent. The page is best read as a reference for tracing claims and understanding the role AI has actually played [ERDOS-AI].

Aristotle prompt interface

Figure 7.1: Aristotle's prompt interface. It accepts formal Lean input as well as informal text or file uploads.

As Figure 7.1 shows, the command-line application of Aristotle can offer an interactive proof interface. Appendix 7A uses the Lean proof given by Aristotle for the input in Figure 7.1 as a beginners' tutorial to Lean and Aristotle. They also offer a Python API, which is used in the code below to solve our polygon problem, including both questions. Note that the problem text is written in informal plain English, which is accepted by their API as well as the prompt, aside from the alternative Lean input format. Under the hood the model clearly utilizes a strong LLM for understanding such descriptions. In fact, the team has described its solver architecture as a pretrained LLM further trained with reinforcement learning for reasoning [ARISTOTLE].
In [ ]:
import os
import logging
from pathlib import Path
import os
import re

def load_aristotle_api_key():
    key_file = Path.home() / ".aristotle_api_key"
    text = key_file.read_text(encoding="utf-8")

    m = re.search(r'ARISTOTLE_API_KEY="([^"]+)"', text)
    if not m:
        raise RuntimeError("ARISTOTLE_API_KEY not found in ~/.aristotle_api_key")
    return m.group(1)

os.environ["ARISTOTLE_API_KEY"] = load_aristotle_api_key()
print("ARISTOTLE_API_KEY loaded into this kernel")

import aristotlelib
from aristotlelib.project import Project, ProjectInputType

async def prove_polygon_ratio():
    api_key = os.getenv("ARISTOTLE_API_KEY")
    if api_key:
        aristotlelib.set_api_key(api_key)
    else:
        raise RuntimeError("ARISTOTLE_API_KEY environment variable not set")

    logging.basicConfig(level=logging.INFO, format="%(levelname)s - %(message)s")

    problem_text = """
Polygon ratio problem (informal statement).
In trapezoid ABCD, AD ∥ BC, ∠C = 90°, and BC = CD.
Point E lies on CD with DE = AD. Through E draw EF ⟂ AB with F the foot.
Draw BD; it meets AE and EF at G and H, respectively. Prove:
1) BF * HE = GE * BH.
2) BE * DH = BD * EH.
""".strip() + "\n"

    await Project.prove_from_file(
        input_content=problem_text,
        output_file_path="aristotle/polygon_ratio_problem_aristotle.lean",
        validate_lean_project=False,
        project_input_type=ProjectInputType.INFORMAL,
        auto_add_imports=False,
    )

await prove_polygon_ratio()
Aristotle web interface screenshot

Figure 7.2 The Aristotle web UI at harmonic.fun. The demo job below was queued for a while and then ran for ~20 minutes, suggesting a resource constrained run for the trial account possibly not even on the GPU.

Lean quick intro¶

Lean is an interactive theorem prover: humans (or automated tactics) provide the proof steps, and Lean checks every step for correctness. In typical use, the proof is supplied by the user via tactics or proof terms, though automation can discharge many routine goals.

Before Lean became popular, prominent proof assistants and formal verification systems included Coq, Isabelle/HOL, HOL Light, ACL2, and Mizar. Lean was created at Microsoft Research by Leonardo de Moura and collaborators. It began as a general-purpose theorem prover for formal verification and mathematics, and later grew a large community library (mathlib). Rough timeline (high-level): early 2010s initial Lean releases; mid-2010s Lean 3 matured and became the main community platform; in the early 2020s Lean 4 modernized the system and tooling.

In recent years, researchers and tool builders have started using Lean not just for verification but also for assisting proof discovery: automated tactics, proof search, and AI-guided systems propose steps that Lean then verifies. This notebook uses Aristotle to generate a Lean proof automatically, while Lean itself remains the verifier. [LEAN]

Proof 7.1 Aristotle (Lean files + reading guide)¶

Human-readable summary (abstracted based on Lean)¶

Aristotle proves the ratio identities by coordinatizing the trapezoid in the complex plane. It sets $C=(0,0)$, $B=(a,0)$, $D=(0,a)$, $A=(x,a)$, and $E=(0,a-x)$, so the geometric constraints are encoded directly. The auxiliary points $F,G,H$ are then defined algebraically (projection and intersections). The two target identities are reduced to equalities of squared distances, denominators are cleared, and the resulting expressions are simplified by algebraic tactics. In short: the geometry is converted to algebra, and Lean verifies the algebraic identities. This highlights the step of converting the informal input description into a formal format, saving some work for the human.

Human-readable analytical proof (abstracted based on Lean)¶

  1. Coordinate model. Set $C=(0,0)$, $B=(a,0)$, $D=(0,a)$, $A=(x,a)$, $E=(0,a-x)$. This makes $AD\parallel BC$, $\angle C=90^\circ$, $BC=CD$, and $DE=AD$ automatic.

  2. How $H$ is found. The diagonal $BD$ has direction $(-a,\,a)$ and can be parametrized as $(a\,(1-s),\,a\,s)$. The line through $E$ perpendicular to $AB$ has direction $(a,\,-(x-a))$ and can be parametrized as $(0,a-x)+u\,(a,\,-(x-a))$. Solving $u a=a\,(1-s)$ and $a-x-u(x-a)=a\,s$ gives $u=\tfrac{x}{2a-x}$, hence $$ H=\Big(\frac{ax}{2a-x},\,\frac{2a(a-x)}{2a-x}\Big). $$

  3. How $F$ is found. $F$ is the orthogonal projection of $E$ onto the line $AB$. In coordinates this uses the standard projection formula $$ F=B+t(A-B),\quad t=\frac{\langle E-B,\,A-B\rangle}{\langle A-B,\,A-B\rangle}. $$ This yields explicit coordinates for $F$, which Lean uses algebraically rather than geometrically.

  4. Second identity. Compute squared lengths: $BE^2=a^2+(a-x)^2$, $BD^2=2a^2$, $DH^2=\tfrac{2a^2x^2}{(2a-x)^2}$, and $EH^2=\tfrac{x^2(a^2+(a-x)^2)}{(2a-x)^2}$. Thus $BE^2\,DH^2=BD^2\,EH^2$, implying $BE\cdot DH=BD\cdot EH$.

  5. First identity. The proof is analogous: substitute the explicit $F,G,H$ coordinates, expand squared distances, clear denominators, and simplify. The Lean proof follows the same algebraic pattern as the second identity, so we omit the details.

Input file (informal statement) - @polygon_ratio_problem.lean¶

This is the informal problem text sent to Aristotle (comments start with --):

-- Polygon ratio problem (informal statement for Aristotle)
-- In trapezoid ABCD, AD ∥ BC, ∠C = 90°, and BC = CD.
-- Point E lies on CD with DE = AD. Through E draw EF ⟂ AB with F the foot.
-- Draw BD; it meets AE and EF at G and H, respectively. Prove:
-- 1) BF * HE = GE * BH
-- 2) BE * DH = BD * EH

Aristotle output (formal Lean proof) - polygon_ratio_problem_aristotle.lean¶

Below are key parts of the generated Lean file with line numbers, each followed by extended commentary that explains the Lean keywords and how the proof progresses.

Lean output (lines 1-53):

 1: /-
 2: This file was generated by Aristotle.
 3: 
 4: Lean version: leanprover/lean4:v4.24.0
 5: Mathlib version: f897ebcf72cd16f89ab4577d0c826cd14afaafc7
 6: This project request had uuid: defb37a9-102b-470c-8442-c3396eccd211
 7: 
 8: To cite Aristotle, tag @Aristotle-Harmonic on GitHub PRs/issues, and add as co-author to commits:
 9: Co-authored-by: Aristotle (Harmonic) <aristotle-harmonic@harmonic.fun>
10: -/
11: 
12: /-
13: We define the points of the trapezoid in the complex plane and prove the two required geometric identities using algebraic simplification of the distances.
14: The points are defined as follows:
15: - C is the origin (0,0).
16: - B is at (a,0).
17: - D is at (0,a).
18: - A is at (x,a).
19: - E is at (0, a-x).
20: - F is the projection of E onto AB.
21: - G is the intersection of AE and BD.
22: - H is the intersection of EF and BD.
23: 
24: We prove:
25: 1. BF * HE = GE * BH (`polygon_ratio_1`)
26: 2. BE * DH = BD * EH (`polygon_ratio_2`)
27: -/
28: 
29: import Mathlib
30: 
31: set_option linter.mathlibStandardSet false
32: 
33: open scoped BigOperators
34: open scoped Real
35: open scoped Nat
36: open scoped Classical
37: open scoped Pointwise
38: 
39: set_option maxHeartbeats 0
40: set_option maxRecDepth 4000
41: set_option synthInstance.maxHeartbeats 20000
42: set_option synthInstance.maxSize 128
43: 
44: set_option relaxedAutoImplicit false
45: set_option autoImplicit false
46: 
47: noncomputable section
48: 
49: /-
50: Definitions of points A, B, C, D, E, F, G, H in the complex plane for the trapezoid problem.
51: -/
52: open Complex
53: 

Commentary (lines 1-53): This block is metadata and explanatory prose rather than executable code. The multi-line comment delimiters /- ... -/ mark documentation. The header records the Lean and mathlib versions, which matter because tactics and lemmas can change across releases. The project UUID identifies the Aristotle run. The next comment describes the coordinate model and names the two target identities, giving the reader a roadmap before formal definitions begin. The import Mathlib and set_option lines also live in this block: they load the library and tune the proof engine (heartbeats, recursion depth, instance search limits) so algebraic automation can complete. This section moves the proof from narrative intent to a precise plan.

Lean output (lines 54-76):

54: noncomputable def pC : ℂ := 0
55: noncomputable def pD (a : ℝ) : ℂ := I * a
56: noncomputable def pB (a : ℝ) : ℂ := a
57: noncomputable def pA (a x : ℝ) : ℂ := x + I * a
58: noncomputable def pE (a x : ℝ) : ℂ := I * (a - x)
59: 
60: noncomputable def pF (a x : ℝ) : ℂ :=
61:   let AB := pA a x - pB a
62:   let EB := pE a x - pB a
63:   let t := (EB * star AB).re / (AB * star AB).re
64:   pB a + t * AB
65: 
66: noncomputable def pG (a x : ℝ) : ℂ := (x / 2) + I * (a - x / 2)
67: 
68: noncomputable def pH (a x : ℝ) : ℂ :=
69:   let den := 2 * a - x
70:   (a * x / den) + I * (2 * a * (a - x) / den)
71: 
72: /-
73: Prove that BF * HE = GE * BH in the configured trapezoid.
74: -/
75: open Complex Metric
76: 

Commentary (lines 54-76): This block is the fully translated Lean version of the geometric setup. noncomputable and open Complex make complex-number geometry available. The def keyword introduces named definitions; here pA, pB, pC, pD, pE, pF, pG, pH are points encoded as complex numbers, with I the imaginary unit. pF encodes orthogonal projection using complex conjugation (star) and real parts (.re) as dot products. pG and pH are explicit intersection formulas. This is the precise coordinate model that the later algebraic proofs use.

Lean output (lines 77-92):

77: theorem polygon_ratio_1 (a x : ℝ) (ha : a ≠ 0) (hx : x ≠ 0) (hden : 2 * a ≠ x) :
78:   dist (pF a x) (pB a) * dist (pE a x) (pH a x) = dist (pG a x) (pE a x) * dist (pH a x) (pB a) := by
79:   unfold pF pE pG pH at *;
80:   unfold pA pB at *;
81:   norm_num [ Complex.dist_eq, Complex.normSq, Complex.norm_def ];
82:   norm_num [ Complex.normSq, Complex.div_re, Complex.div_im ] at *;
83:   field_simp;
84:   rw [ ← Real.sqrt_mul <| by positivity, ← Real.sqrt_mul <| by positivity ];
85:   rw [ ← Real.sqrt_mul <| by positivity, ← Real.sqrt_mul <| by positivity ] ; ring;
86:   grind
87: 
88: /-
89: Prove that BE * DH = BD * EH in the configured trapezoid.
90: -/
91: open Complex Metric
92: 

Commentary (lines 77-92): theorem introduces a proposition to prove. The objective statement is on line 77: it asserts the first ratio identity as an equality of distances. The hypotheses ha, hx, and hden assert non-degeneracy so denominators are safe (e.g., 2a-x ≠ 0). The proof enters tactic mode with by. unfold expands definitions, turning geometric objects into explicit algebra. norm_num expands norms and distances into algebraic expressions. field_simp clears denominators. rw is the rewrite tactic: it replaces the current goal using a known equality (here, square-root identities), so expressions are put into a form that ring and grind can finish. This block advances the proof by reducing geometry to a polynomial identity and closing it mechanically.

Lean output (lines 93-101):

 93: theorem polygon_ratio_2 (a x : ℝ) (ha : a ≠ 0) (hx : x ≠ 0) (hden : 2 * a ≠ x) :
 94:   dist (pE a x) (pB a) * dist (pD a) (pH a x) = dist (pD a) (pB a) * dist (pE a x) (pH a x) := by
 95:   unfold pB pD pE pH;
 96:   norm_num [ Complex.normSq, Complex.norm_def, Complex.dist_eq ];
 97:   norm_cast; norm_num [ Complex.normSq, Complex.div_re, Complex.div_im, ha, hx, hden ] ; ring;
 98:   rw [ ← Real.sqrt_mul, ← Real.sqrt_mul ];
 99:   · grind;
100:   · positivity;
101:   · nlinarith [ sq_nonneg ( a - x ) ]

Commentary (lines 93-101): polygon_ratio_2 mirrors the first proof. norm_cast resolves coercions between numeric types. positivity discharges non-negativity side conditions required for square roots. nlinarith solves remaining nonlinear arithmetic constraints. Conceptually, this block repeats the same algebraic pipeline: expand distances, clear denominators, and verify the polynomial equality. It completes the proof of the second identity under the same non-degeneracy assumptions.

Note on proof style. In this problem Aristotle proceeds by a coordinate model, so the proof is essentially algebraic and very similar in spirit to Part 2. That choice is not a rule of Aristotle; it is just the strategy selected for this run. In contrast, Newclid (Part 6) is designed to stay in the geometry domain and produce a synthetic-style trace from diagram constraints. So the key difference to keep in mind is not correctness but style of reasoning and output: Aristotle outputs a Lean proof (general formal system), while Newclid outputs a geometry-specific derivation. For other problems, Aristotle can also use geometric/angle-chasing reasoning, as seen in the IMO 2025 P2 formalization [IMO2025P2], but here it stayed with coordinates for simplicity.

All in all, tools like Lean + mathlib (Lean language and theorem database), Aristotle (an LLM trained with RL for Lean), and ChatGPT 5.2 Pro (and other frontier LLMs) are among the most advanced AI tools available for mathematicians. Together they point toward a frontier of machine-assisted mathematical discovery.

References¶

  • [ARISTOTLE] Harmonic. "Aristotle" (web interface). Summary: Web interface and API access for the Aristotle prover. URL: https://aristotle.harmonic.fun; arxiv link of the paper at https://arxiv.org/pdf/2510.01346
  • [E124-6H] 36Kr (2025-11-). "AI solved a 30-year math problem in 6 hours. Terence Tao: ChatGPT and others failed." Summary: Coverage of a simplified Erdős #124 variant solved with Aristotle. URL: https://eu.36kr.com/en/p/3576638922980231
  • [E1026-48H] 36Kr (2025-12-14). "Terence Tao is shocked: The “pit” dug by a mathematician in 1975 was filled by AI and netizens around the world in just 48 hours." Summary: Coverage of Erdős #1026 and an Aristotle-assisted Lean step. URL: https://eu.36kr.com/en/p/3596176018898947
  • [HN-MSI] Harmonic News (2025). "Harmonic announces mathematician sponsorships to accelerate mathematical superintelligence." Summary: Harmonic’s announcement and claims about Aristotle usage. URL: https://harmonic.fun/news
  • [HN-IMO] Harmonic News (2025-07-28). "Aristotle Achieves Gold Medal-Level Performance at the International Mathematical Olympiad". Summary: Harmonic announcement of IMO 2025 results and formal verification. URL: https://harmonic.fun/news
  • [BW-IMO] Business Wire (2025-07-28). "Harmonic Announces IMO Gold Medal-Level Performance & Launch of First Mathematical Superintelligence (MSI) AI App." Summary: Press release on IMO 2025 result and Aristotle launch. URL: https://www.businesswire.com/news/home/20250728394917/en/Harmonic-Announces-IMO-Gold-Medal-Level-Performance-Launch-of-First-Mathematical-Superintelligence-MSI-AI-App
  • [IMO2025P2] Harmonic AI (2025). "IMO 2025 P2 formalization (Lean)." Summary: Lean formalization file for IMO 2025 Problem 2. URL: https://github.com/harmonic-ai/IMO2025/blob/main/HarmonicLean/IMO2025P2.txt
  • [ERDOS-SOTA] Terence Tao et al. (2025). "Erdos Problems" (overview). Summary: Community-maintained status tracker for Erdos problems. URL: https://github.com/teorth/erdosproblemstab=readme-ov-file
  • [ERDOS-AI] Terence Tao et al. (2025). "AI contributions to Erdos problems" (wiki). Summary: Notes on AI involvement in Erdos problem progress. URL: https://github.com/teorth/erdosproblems/wiki/AI-contributions-to-Erd%C5%91s-problems
  • [LEAN] Lean Prover (2025). "Lean Theorem Prover" (official site). Summary: Official documentation and downloads. URL: https://leanprover.github.io/

Appendix 7A: Introduction of Lean and Aristotle with the Angle Bisector Theorem (understand the output angle_bisector_problem.lean)¶

We will use the bisector problem defined in Figure 7.1 as a pilot to give another intro to Lean language and Aristotle. The input statement shown in Figure 7.1 contains a small typo, but it still produced the angle_bisector_problem.lean downloade in the ./aristotle folder. While our focus is to better understand Lean language syntax and its application to this simple problem, Aristotle clearly pivoted quickly to find the typo of the statement. Aristotle's run either hit a wall and then suggested the corrected objective, or its LLM was capable to spot the error right away. In fact since Aristotle has been trained on top of LLM, such correction is fully within its capacity [ARISTOTLE]. However, this still indicates its human-like that's mostly missing from any of the methods mentioned in Part 5 or 6 of the notes. Other than that this appendix summarizes the Lean output and how to read it. Every line of the Lean file is quoted (with line numbers, unchanged content), broken into frequent chunks, with very beginner-friendly commentary—including the deeper explanations of the exact? “holes” (what they’re supposed to fill, and how to discover the lemma names) without modifying the file. While for a simple bisector theorem this Lean output seems to be an over kill, it nevertheless shows all the necessary logics to deduct the famous output using other known facts (in this case the Law of Sines).

Aristotle prompt interface

Figure 7A.1: The lean proof environment of vscode with Lean4 extension. 1. the lean file get all the color, warning, or error (red squiggly lines under the error) if any. 2. the lean_viwer shows the warnings, and translated the exact? to the right suggestions. 3. the command line way of using lake env lean filename.lean to show the status, alternatively

First: what’s going on here¶

This .lean file is a formal proof script. Think of it as:

  • Theorem statement = a precise claim (a “type” in Lean).
  • Proof = an object/program that convinces Lean’s kernel the claim is true.

So a Lean theorem is like a function:

  • input: hypotheses (assumptions)
  • output: the conclusion
  • body: the proof steps The original request was: AB/AC = AD/DC. The classical Angle Bisector Theorem is: AB/AC = BD/DC. Notice: the conclusion compares how the angle bisector cuts the opposite side BC (that’s BD and DC). Using AD instead is usually false: AD is a “cevian length” that varies in a more complicated way than the side-splitting ratio.

As show below in the Lean output, Aristotle explicitly repairs the statement by replacing AD with BD by the built in language model most likely at certain point of its effort to prove the statement. It is unforunately lack of logging from its API to reveal more details of this critical "thinking" step. Yet once the statement is repaired, Lean can proceed with a formal derivation in the rest of the file. The mean Lean library Aristotle uses is MathLib, which contains many theorems [MATHLIB], but the angle bisector theorem isn't one of them; so this suggestion wasn't directly from MathLib database, but more related to the LLM intelligence.

Lean / mathlib mini dictionary (for a reader who knows nothing about it)¶

  • /- ... -/ means coments that can span multiple lines.
  • ¬ P means “not P”.
  • ∧ means “and”.
  • := by ... means “here is the proof, written in tactic steps”.
  • by_cases h : X means “split into cases: (1) h is true, (2) h is false”.
  • simp / simp_all means “simplify using known rewrite rules (and in simp_all, also using all hypotheses)”.
  • rw [h] means “rewrite using equality h”.
  • ⊢ P means "the goal is P / it suffices to prove P"
  • exact t means “the proof is exactly t”.
  • exact? means “Lean, please search for an expression/lemma that fits here.”

Geometry notations:

  • dist A B = length AB.
  • angle B A D = ∠BAD (vertex at A).
  • {A, B, C} is a finite set of points.
  • Collinear ℝ {A, B, C} means “A,B,C are on one line.”
  • affineSegment ℝ B C = the closed segment from B to C (includes endpoints).
  • Sbtw ℝ B D C = “D is strictly between B and C” (on the segment, not equal to endpoints). Below let's read-along with the original file

Chunk 1 — Metadata + the repaired statement + the proof plan¶

001: /-
002: This file was generated by Aristotle.
003:
004: Lean version: leanprover/lean4:v4.24.0
005: Mathlib version: f897ebcf72cd16f89ab4577d0c826cd14afaafc7
006: This project request had uuid: 86e2b544-c6c8-49e3-bca1-e4f015c16ba1
007:
008: To cite Aristotle, tag @Aristotle-Harmonic on GitHub PRs/issues, and add as co-author to commits:
009: Co-authored-by: Aristotle (Harmonic) <aristotle-harmonic@harmonic.fun>
010: -/
011:
012: /-
013: We prove the Angle Bisector Theorem.
014: The user's request "AB/AC=AD/DC" contains a typo (AD should be BD).
015: We prove the standard theorem: if AD bisects angle A and D is on BC, then AB/AC = BD/DC.
016:
017: The proof proceeds by:
018: 1. Establishing that D lies strictly between B and C (`sbtw_of_angle_bisector_intersect`).
019: 2. Showing that sin(∠ADB) = sin(∠ADC) because the angles are supplementary (`sin_angle_eq_sin_angle_of_sbtw`).
020: 3. Applying the Law of Sines to triangles ABD and ACD.
021:    - In ABD: BD / sin(A/2) = AB / sin(∠ADB)
022:    - In ACD: DC / sin(A/2) = AC / sin(∠ADC)
023: 4. Dividing these equations and using the equality of sines gives the result.
024: -/

Commentary (beginner mode):

  • Lines 4–6 pin down versions. Proof libraries change; this tells us exactly what Aristotle used.
  • Lines 14–15 are the crucial “typo fix.” This is not cosmetic. If we try to prove the wrong statement in Lean, it’s never succeed because it’s simply not derivable from the hypotheses. A smarter prover might use numerical solver to find counter examples. The human provers might offer constructive correction to the theorem; and so did Aristotle in this case.
  • Lines 17–23 give a human proof outline. That’s already a hint at how these systems work: the agent often chooses a familiar skeleton (Law of Sines) and then Lean forces all the details to be made explicit. In the paper session 2.3.2, it was discussed that Law of sines can be added by its algebraic reasoning engine Yuclid which slows down 10x; so Sine Law is turned off in its recommended configurations [ARISTOTLE]. So it is interesting the rule is used in this proof in 2026 API.

Chunk 2 — Importing mathlib + turning off proof timeouts¶

025:
026: import Mathlib
027:
028: set_option linter.mathlibStandardSet false
029:
030: open scoped BigOperators
031: open scoped Real
032: open scoped Nat
033: open scoped Classical
034: open scoped Pointwise
035:
036: set_option maxHeartbeats 0
037: set_option maxRecDepth 4000
038: set_option synthInstance.maxHeartbeats 20000
039: set_option synthInstance.maxSize 128
040:
041: set_option relaxedAutoImplicit false
042: set_option autoImplicit false
043:
044: noncomputable section

Commentary:

  • import Mathlib loads the standard math library (geometry, trig, algebra, automation tactics). The Law of Sines is one of its theorems.
  • “Heartbeats” are Lean’s internal fuel to stop runaway automation. Setting them huge (or 0 = unlimited) is common in machine-generated proofs that rely on search.
  • autoImplicit false forces variables to be declared explicitly. This reduces “mystery implicit variables,” which helps both machines and humans debug.

Chunk 3 — Opening the geometry world + declaring the ambient space¶

045:
046: /-
047: In a non-degenerate triangle ABC, if D is the intersection of the angle bisector of A with the segment BC, then D is strictly between B and C.
048: -/
049: open EuclideanGeometry Metric
050:
051: variable {V : Type*} {P : Type*} [NormedAddCommGroup V] [InnerProductSpace ℝ V] [MetricSpace P] [NormedAddTorsor V P]
052:

Commentary:

  • open EuclideanGeometry Metric enables names like dist, angle, segment definitions, etc.
  • Line 51 is Lean being extremely general: it’s saying “we are in some Euclidean affine space.”
    • P = points
    • V = vectors (displacements between points)
    • the typeclass brackets [...] are Lean’s way of saying “assume V behaves like a real inner product space, and P is an affine space modeled on V.” Note that V and P define the affine space instead of treating all points as a vector space which would require the definition of origin that's completely avoided here. With this setup, two points A and B can have subtraction to define the vector AB as B -ᵥ A.

Chunk 4 — Lemma 1: from “D lies on segment BC + bisector” to “D is strictly between B and C”¶

053: lemma sbtw_of_angle_bisector_intersect (A B C D : P)
054:   (h_triangle : ¬ Collinear ℝ {A, B, C})
055:   (h_D_on_BC : D ∈ affineSegment ℝ B C)
056:   (h_bisector : angle B A D = angle C A D) :
057:   Sbtw ℝ B D C := by
058:   -- By contradiction, assume $D = B$.
059:   by_cases hDB : D = B;
060:   · simp_all +decide [ EuclideanGeometry.angle ];
061:     rw [ InnerProductGeometry.angle_self ] at h_bisector;
062:     · rw [ eq_comm, InnerProductGeometry.angle_eq_zero_iff ] at h_bisector;
063:       refine' h_triangle _;
064:       rw [ collinear_iff_exists_forall_eq_smul_vadd ];
065:       use A, C -ᵥ A;
066:       simp +zetaDelta at *;
067:       exact ⟨ ⟨ 0, by simp +decide ⟩, ⟨ h_bisector.2.choose, by rw [ ← h_bisector.2.choose_spec.2, vsub_vadd ] ⟩, ⟨ 1, by simp +decide ⟩ ⟩;
068:     · exact vsub_ne_zero.mpr ( by rintro rfl; exact h_triangle <| by simp +decide [ collinear_pair ] );
069:   · by_cases hDC : D = C <;> simp_all +decide [ Sbtw ];
070:     · simp_all +decide [ EuclideanGeometry.angle ];
071:       rw [ InnerProductGeometry.angle_self ] at h_bisector;
072:       · rw [ InnerProductGeometry.angle_eq_zero_iff ] at h_bisector;
073:         refine' h_triangle _;
074:         rw [ collinear_iff_exists_forall_eq_smul_vadd ];
075:         exact ⟨ A, B -ᵥ A, fun p hp => by rcases hp with ( rfl | rfl | rfl ) <;> [ exact ⟨ 0, by simp +decide ⟩ ; exact ⟨ 1, by simp +decide ⟩ ; exact ⟨ h_bisector.2.choose, by rw [ ← h_bisector.2.choose_spec.2, vsub_vadd ] ⟩ ] ⟩;
076:       · simp_all +decide [ sub_eq_zero ];
077:         rintro rfl; simp_all +decide [ collinear_pair ];
078:     · exact?
079:

What this lemma is really doing (human geometry)¶

  • h_D_on_BC says D is on the closed segment BC. That still allows D = B or D = C.
  • Sbtw ℝ B D C says D is strictly inside the segment (not endpoints).
  • So the plan is: rule out endpoint cases using the “non-degenerate triangle” hypothesis.

Why line 59 does a case split¶

by_cases hDB : D = B means:

  • Case 1: D is B (degenerate)
  • Case 2: D is not B (the “normal” case) Here Case 2 is further break into the following by_cases hDC : D = C means:
  • Case 1: D is C (degenerate)
  • Case 2: D is not C (the “normal” case) So this is a nested case. This is a classic Lean move: explicitly isolate degenerate cases because later trig and division facts can fail or behave differently there.

What the angle/collinear stuff is about (lines 60–67)¶

  • angle BAB = angle CAB and in mathlib, degenerate angles are defined in a particular way; but the net effect is that we can force a contradiction that A,B,C become collinear, which violates h_triangle. So lines 63–67 are Lean laboriously constructing a formal proof of collinearity in mathlib’s chosen definition (collinear_iff_exists_forall_eq_smul_vadd).

The key “automation hole” at line 78: what exact? is expected to find here?¶

At line 69 we’re already in the branch “D ≠ B”, and inside it we split on D = C. The last remaining branch is:

  • D ∈ segment BC
  • D ≠ B
  • D ≠ C → therefore D is strictly between B and C. That last implication is extremely standard, so Aristotle writes exact?. So,
  • Sbtw is “between + not endpoints”.
  • We already have “between-ish” from h_D_on_BC.
  • We already have “not endpoints” from the case splits.
  • So there should be a lemma of the form:

If D ∈ affineSegment ℝ B C and D ≠ B and D ≠ C then Sbtw ℝ B D C.

Let's discover the exact lemma name in practice (no file modifications needed):

  • Put the cursor on exact? in an editor with Lean (VS Code + Lean extension). Lean will often suggest a replacement proof term / lemma application. In our case it suggested ⊢ Wbtw ℝ B D C, which is just the hypothesis H_D_on_BC since this branch already has hDB : ¬D = B and hDC : ¬D = C
  • Alternatively, we can try simp or aesop in a scratch buffer (not altering the original) and see what it uses.

In my case I have to replace exact? to exact h_D_on_BC because exact? doesn't resolve the proof strangely.

Chunk 5 — Lemma 2: “straight line at D” ⇒ equal sines of supplementary angles¶

080: /-
081: If D is strictly between B and C, then sin(angle ADB) = sin(angle ADC).
082: -/
083: open EuclideanGeometry Metric
084:
085: variable {V : Type*} {P : Type*} [NormedAddCommGroup V] [InnerProductSpace ℝ V] [MetricSpace P] [NormedAddTorsor V P]
086:
087: lemma sin_angle_eq_sin_angle_of_sbtw (A B C D : P) (h : Sbtw ℝ B D C) :
088:   Real.sin (angle A D B) = Real.sin (angle A D C) := by
089:   -- By the properties of angles in a triangle, we know that angle BDC = π.
090:   have h_angle_BDC : angle B D C = Real.pi := by
091:     exact?;
092:   -- By the properties of angles in a triangle, we know that angle ADB + angle ADC = π.
093:   have h_angle_sum : angle A D B + angle A D C = Real.pi := by
094:     exact?;
095:   rw [ ← Real.sin_pi_sub, ← h_angle_sum, add_sub_cancel_left ]
096:

If B–D–C is a straight line, then ∠BDC = π, and ∠ADB + ∠ADC = π, so sin(∠ADB) = sin(∠ADC) by sin(π−x)=sin(x).

What line 90–91 (exact?) is doing: We have h : Sbtw ℝ B D C. From “strictly between”, mathlib usually provides a lemma saying:

  • the angle ∠BDC is a straight angle, i.e. equals π. So exact? at line 91 is expected to find “Sbtw implies angle = π” (with the correct ordering of points). Again, in Lean, the order of points matters a lot. angle B D C is ∠BDC (vertex D). A lemma that proves angle B C D = π would not help here.

What line 93–94 (exact?) is doing¶

Given ∠BDC = π, mathlib usually has a lemma saying:

  • if rays DB and DC form π, then for any A, the adjacent angles ∠ADB and ∠ADC sum to π. Again, exact? is the placeholder for “the library already knows supplementary angles sum to π in this configuration.”

Line 95: the trig identity move¶

rw [ ← Real.sin_pi_sub, ← h_angle_sum, add_sub_cancel_left ] is a compact chain:

  • rewrite using sin(π − x) = sin x,
  • replace π by (∠ADB + ∠ADC),
  • cancel algebraically to turn sin(∠ADC) into sin(∠ADB).

Chunk 6 — Restating the theorem goal section header¶

097: /-
098: The Angle Bisector Theorem: In a triangle ABC, if the bisector of angle A intersects the segment BC at D, then AB/AC = BD/DC.
099: -/
100: open EuclideanGeometry Metric
101:
102: variable {V : Type*} {P : Type*} [NormedAddCommGroup V] [InnerProductSpace ℝ V] [MetricSpace P] [NormedAddTorsor V P]
103:

Commentary: This is just a fresh section header. Machines often repeat context to keep things local and avoid implicit state. This is also the first time the program introduce Metric to the system, which is needed for the distance.

Chunk 7 — Main theorem (part 1): encoding Law of Sines twice¶

104: theorem angle_bisector_theorem_corrected (A B C D : P)
105:   (h_triangle : ¬ Collinear ℝ {A, B, C})
106:   (h_D_on_BC : D ∈ affineSegment ℝ B C)
107:   (h_bisector : angle B A D = angle C A D) :
108:   dist A B / dist A C = dist B D / dist D C := by
109:   -- By definition of angle bisector, we know that $A B / A C = B D / D C$.
110:   have h_angle_bisector : Real.sin (angle B A D) / dist B D = Real.sin (angle A D B) / dist A B ∧ Real.sin (angle C A D) / dist D C = Real.sin (angle A D C) / dist A C := by
111:     apply And.intro;
112:     · convert EuclideanGeometry.sin_angle_div_dist_eq_sin_angle_div_dist _ _ using 1;
113:       congr! 1;
114:       · exact dist_comm _ _;
115:       · rw [ EuclideanGeometry.sin_angle_div_dist_eq_sin_angle_div_dist ];
116:         · rw [ dist_comm ];
117:         · rintro rfl; simp_all +decide [ collinear_pair ];
118:         · rintro rfl;
119:           refine' h_triangle _;
120:           rw [ collinear_iff_exists_forall_eq_smul_vadd ];
121:           rcases h_D_on_BC with ⟨ r, hr, rfl ⟩;
122:           exact ⟨ B, C -ᵥ B, fun p hp => by rcases hp with ( rfl | rfl | rfl ) <;> [ exact ⟨ r, by simp +decide [ AffineMap.lineMap_apply ] ⟩ ; exact ⟨ 0, by simp +decide ⟩ ; exact ⟨ 1, by simp +decide ⟩ ] ⟩;
123:       · rintro rfl;
124:         refine' h_triangle _;
125:         rw [ collinear_iff_exists_forall_eq_smul_vadd ];
126:         obtain ⟨ r, hr ⟩ := h_D_on_BC;
127:         use B, C -ᵥ B;
128:         simp +decide [ ← hr.2, AffineMap.lineMap_apply ];
129:         exact ⟨ ⟨ r, rfl ⟩, ⟨ 0, by simp +decide ⟩, ⟨ 1, by simp +decide ⟩ ⟩;
130:       · rintro rfl;
131:         simp_all +decide [ EuclideanGeometry.angle ];
132:         rw [ InnerProductGeometry.angle_self ] at h_bisector;
133:         · rw [ eq_comm, InnerProductGeometry.angle_eq_zero_iff ] at h_bisector;
134:           refine' h_triangle _;
135:           rw [ collinear_iff_exists_forall_eq_smul_vadd ];
136:           use A, C -ᵥ A;
137:           simp +zetaDelta at *;
138:           exact ⟨ ⟨ 0, by simp +decide ⟩, ⟨ h_bisector.2.choose, by rw [ ← h_bisector.2.choose_spec.2, vsub_vadd ] ⟩, ⟨ 1, by simp +decide ⟩ ⟩;
139:         · intro h; simp_all +decide [ sub_eq_zero ];
140:           exact h_triangle ( collinear_pair ℝ A C );
141:     · convert EuclideanGeometry.sin_angle_div_dist_eq_sin_angle_div_dist _ _ using 1;
142:       · rw [ EuclideanGeometry.sin_angle_div_dist_eq_sin_angle_div_dist ];
143:         · rw [ dist_comm C A ];
144:         · rintro rfl; simp_all +decide [ collinear_pair ];
145:         · rintro rfl;
146:           refine' h_triangle _;
147:           rw [ collinear_iff_exists_forall_eq_smul_vadd ];
148:           rcases h_D_on_BC with ⟨ r, hr, rfl ⟩;
149:           exact ⟨ B, C -ᵥ B, fun p hp => by rcases hp with ( rfl | rfl | rfl ) <;> [ exact ⟨ r, by simp +decide [ AffineMap.lineMap_apply ] ⟩ ; exact ⟨ 0, by simp +decide ⟩ ; exact ⟨ 1, by simp +decide ⟩ ] ⟩;
150:       · rintro rfl;
151:         refine' h_triangle _;
152:         rw [ collinear_iff_exists_forall_eq_smul_vadd ];
153:         rcases h_D_on_BC with ⟨ r, hr, rfl ⟩;
154:         exact ⟨ B, C -ᵥ B, fun p hp => by rcases hp with ( rfl | rfl | rfl ) <;> [ exact ⟨ r, by simp +decide [ AffineMap.lineMap_apply ] ⟩ ; exact ⟨ 0, by simp +decide ⟩ ; exact ⟨ 1, by simp +decide ⟩ ] ⟩;
155:       · rintro rfl;
156:         simp_all +decide [ EuclideanGeometry.angle ];
157:         rw [ InnerProductGeometry.angle_self ] at h_bisector;
158:         · rw [ InnerProductGeometry.angle_eq_zero_iff ] at h_bisector;
159:           refine' h_triangle _;
160:           rw [ collinear_iff_exists_forall_eq_smul_vadd ];
161:           exact ⟨ A, B -ᵥ A, fun p hp => by rcases hp with ( rfl | rfl | rfl ) <;> [ exact ⟨ 0, by simp +decide ⟩ ; exact ⟨ 1, by simp +decide ⟩ ; exact ⟨ h_bisector.2.choose, by rw [ ← h_bisector.2.choose_spec.2, vsub_vadd ] ⟩ ] ⟩;
162:         · exact vsub_ne_zero.mpr ( by rintro rfl; simp_all +decide [ collinear_pair ] );

What’s the conceptual move here?¶

Line 110 defines h_angle_bisector as two Law-of-Sines-style equalities (joined by ∧ = “and”):

  • One equality will correspond to triangle A D B.
  • The other corresponds to triangle A D C. The lemma used is:
  • EuclideanGeometry.sin_angle_div_dist_eq_sin_angle_div_dist That’s the library’s packaged “Law of Sines symmetry” in the form: $$\frac{\sin(\angle)}{\text{opposite side}} = \frac{\sin(\angle)}{\text{opposite side}}$$ but written in mathlib’s preferred ordering.

Why so many lines about degeneracy?¶

Because these identities involve dividing by distances. Lean will not let us divide by zero silently. So we see many sub-branches like:

  • rintro rfl; ... which is Lean-speak for:
    • “Assume two points are equal (so the distance becomes 0). Show this contradicts h_triangle.” That is “formal hygiene”: humans skip it; Lean forces it.

Why the convert ... using 1 trick? Because the library lemma’s exact statement may not match the goal’s exact arrangement of points/distances. convert says:

“This lemma gives something equivalent; let me adjust the expression slightly to make it match my goal.” Then congr! helps prove the adjusted pieces are equal (e.g., swapping dist A B to dist B A using dist_comm).

Chunk 8 — Main theorem (part 2): use the sine equality, clear denominators, finish¶

163:   -- Since $\sin(\angle A D B) = \sin(\angle A D C)$, we can equate the two expressions from the Angle Bisector Theorem.
164:   have h_sin_eq' : Real.sin (angle A D B) = Real.sin (angle A D C) := by
165:     apply sin_angle_eq_sin_angle_of_sbtw;
166:     exact?;
167:   by_cases hAB : dist A B = 0 <;> by_cases hAC : dist A C = 0 <;> simp_all +decide [ div_eq_mul_inv ];
168:   · exact False.elim ( h_triangle <| collinear_pair ℝ _ _ );
169:   · exact Or.inr ( h_angle_bisector.2.resolve_left ( by norm_num [ mul_div ] ) );
170:   · by_cases hBD : dist B D = 0 <;> by_cases hDC : dist D C = 0 <;> simp_all +decide [ mul_assoc, mul_comm, mul_left_comm ];
171:     have h_sin_ne_zero : Real.sin (angle C A D) ≠ 0 := by
172:       intro h_sin_zero_triangle; simp_all +decide [ EuclideanGeometry.angle ] ;
173:       rw [ Real.sin_eq_zero_iff_of_lt_of_lt ] at h_sin_zero_triangle <;> try linarith [ Real.pi_pos, InnerProductGeometry.angle_nonneg ( C -ᵥ A ) ( D -ᵥ A ), InnerProductGeometry.angle_le_pi ( C -ᵥ A ) ( D -ᵥ A ) ] ;
174:       · rw [ InnerProductGeometry.angle_eq_zero_iff ] at h_sin_zero_triangle;
175:         obtain ⟨ r, hr, hr' ⟩ := h_sin_zero_triangle.2; simp_all +decide [ sub_eq_iff_eq_add ] ;
176:         rw [ InnerProductGeometry.angle_eq_zero_iff ] at h_bisector ; simp_all +decide [ sub_eq_iff_eq_add ] ;
177:         obtain ⟨ s, hs, hs' ⟩ := h_bisector.2; simp_all +decide [ collinear_pair ] ;
178:         refine' h_triangle _;
179:         rw [ collinear_iff_exists_forall_eq_smul_vadd ] ; use A ; simp +decide [ hs', hr' ];
180:         exact ⟨ B -ᵥ A, ⟨ 0, by simp +decide ⟩, ⟨ 1, by simp +decide ⟩, ⟨ s, by rw [ ← hs', vsub_vadd ] ⟩ ⟩;
181:       · refine' lt_of_le_of_ne ( InnerProductGeometry.angle_le_pi _ _ ) _;
182:         intro h; simp_all +decide [ InnerProductGeometry.angle_eq_pi_iff ] ;
183:         obtain ⟨ r, hr, hr' ⟩ := h.2;
184:         obtain ⟨ s, hs, hs' ⟩ := h_bisector.2;
185:         -- Since $r < 0$ and $s < 0$, we have $r • (C -ᵥ A) = s • (B -ᵥ A)$ implies $C -ᵥ A = \frac{s}{r} • (B -ᵥ A)$.
186:         have h_eq : C -ᵥ A = (s / r) • (B -ᵥ A) := by
187:           simp_all +decide [ div_eq_inv_mul, MulAction.mul_smul ]; --I had to remove MulAction to finish the proof
188:           rw [ inv_smul_smul₀ ( ne_of_lt hr ) ];
189:         refine' h_triangle _;
190:         rw [ collinear_iff_exists_forall_eq_smul_vadd ];
191:         use A, B -ᵥ A;
192:         simp +decide [ h_eq ];
193:         exact ⟨ ⟨ 0, by simp +decide ⟩, ⟨ 1, by simp +decide ⟩, ⟨ s / r, by rw [ ← h_eq, vsub_vadd ] ⟩ ⟩;
194:     simp_all +decide [ ← div_eq_inv_mul ];
195:     rw [ div_eq_div_iff ] at h_angle_bisector <;> simp_all +decide [ dist_comm ];
196:     grind
197:

The key “exact?” at line 166 (what it must supply) We’re proving h_sin_eq' by applying Lemma 2:

  • apply sin_angle_eq_sin_angle_of_sbtw; That lemma needs an input:
  • Sbtw ℝ B D C But here, in the main theorem, we only have:
  • h_triangle, h_D_on_BC, h_bisector So line 166 exact? is expected to discover:
  • “from those three, we can prove Sbtw ℝ B D C” which is precisely Lemma 1 (sbtw_of_angle_bisector_intersect). So the chain is:
  • Lemma 1: bisector + D on segment + non-collinear triangle ⇒ Sbtw
  • Lemma 2: Sbtw ⇒ sin(∠ADB)=sin(∠ADC)

Lines 167–170: case splits on distances:

  • dist A B / dist A C
  • dist B D / dist D C If any denominator is zero, division is problematic. So Lean forces us to split:
  • Is dist A C = 0? (meaning A=C)
  • Is dist D C = 0? (meaning D=C) etc. Humans would say “in a nondegenerate triangle, these aren’t zero.” Lean makes us prove it.

Lines 171–193: proving a sine is nonzero. This is another “formal hygiene” piece:

  • if sin(angle C A D) = 0, then the angle is 0 or π (for angles in $[0,\pi]$).
  • either one implies collinearity, contradicting h_triangle. So it is a formal version of: “In a real triangle, angle CAD can’t be 0 or π.” Interestingly, line 187 had to be modified in my case, since MulAction namespace isn't found. The fix is to remove MulAction.

Line 195: div_eq_div_iff = “cross-multiply”. This is the standard algebra step: $\frac{x}{y} = \frac{u}{v} \Rightarrow x v = u y$ (with side conditions about nonzero denominators).

Line 196 grind: this is the “big automation broom” that sweeps up routine algebraic rearrangements and simplifications. grind is not a magical mathematical insight. It’s mostly:

  • rewriting,
  • cancelling,
  • commutativity/associativity bookkeeping,
  • and pushing simp-like rules repeatedly.

Chunk 9 — Final wrapper theorem: “Lean, find the proof we already built”¶

198: /-
199: The Angle Bisector Theorem: In a triangle ABC, if the bisector of angle A intersects the segment BC at D, then AB/AC = BD/DC.
200: -/
201: open EuclideanGeometry Metric
202:
203: variable {V : Type*} {P : Type*} [NormedAddCommGroup V] [InnerProductSpace ℝ V] [MetricSpace P] [NormedAddTorsor V P]
204:
205: theorem angle_bisector_theorem_final (A B C D : P)
206:   (h_triangle : ¬ Collinear ℝ {A, B, C})
207:   (h_D_on_BC : D ∈ affineSegment ℝ B C)
208:   (h_bisector : angle B A D = angle C A D) :
209:   dist A B / dist A C = dist B D / dist D C := by
210:   exact?

What exact? at line 210 is intended to do? At this point, the file already has a theorem with the same statement:

  • angle_bisector_theorem_corrected (line 104) So line 210 exact? is basically: “Please locate a theorem in the environment whose type matches my goal, and apply it.” In a “human-clean” proof, we’d typically write: exact angle_bisector_theorem_corrected A B C D h_triangle h_D_on_BC h_bisector But Aristotle leaves it as exact? to let Lean’s search find it automatically. If exact? fails here, it usually means:
  • the earlier theorem is not in scope (not the case here),
  • or the goal is syntactically different (e.g., distances swapped), and we need a simp [dist_comm] first.

In conclusion, this Appendix shows we can read this file as a collaboration:

  • The agent chooses a known proof strategy (Law of Sines) and scaffolds the structure.
  • Lean + mathlib force every missing detail to be filled: nondegeneracy, rewriting, collinearity contradictions, denominator nonzero, angle bounds.
  • exact? is where the agent trusts the library already has the right lemma and asks Lean to retrieve it.
  • grind is where the agent asks automation to finish routine algebra. The beautiful part: once Lean accepts it, there’s no “handwaving” left—every step is reducible to kernel-checked primitives.
  • unlike the lean file generated by Aristotle for the main polygon problem that worked straight out of the box, we had to modify like 078 and 187 slightly to get this Lean proof work in my system.

Appendix 7B: Why polygon_ratio_problem_aristotle.lean looks simpler than angle_bisector_problem.lean¶

Even though the polygon ratio statement is mathematically more elaborate, its Lean development is shorter because it is proved analytically (in coordinates) rather than synthetically (in abstract Euclidean geometry). In Lean, the choice of proof style often dominates file length more than the inherent difficulty of the theorem.

Key reasons (from a Lean perspective):

  1. Coordinates reduce geometry to algebra. In the polygon proof, one fixes a coordinate model (here, the complex plane) and represents points explicitly. Once the configuration is expressed in coordinates, many geometric claims become equalities or inequalities of algebraic expressions (often involving norms). At that stage, Lean’s algebraic automation—ring, nlinarith, field_simp, norm_num, and related lemmas—can close goals with comparatively little guidance.

  2. Synthetic Euclidean geometry carries substantial overhead. The angle-bisector file works in a coordinate-free setting. As a result, the proof must explicitly manage: nondegeneracy hypotheses (e.g., noncollinearity), collinearity and betweenness predicates, angle definitions and their domain conditions, and the “glue” lemmas that connect these notions. In Lean, these are not merely background facts: they are additional objects and obligations that must be introduced, transported, and rewritten at each step. The resulting script length reflects the infrastructure of the framework as much as the theorem itself.

  3. The polygon proof operates in a normalized model. The analytic approach typically proves the claim for a specific normal form—e.g., a parameterized configuration in a fixed coordinate system, with a condition such as 2*a ≠ x to exclude degenerate cases. Mathematically, this is often justified “up to similarity” (or another equivalence), but the final statement is less general in form than a fully coordinate-free theorem phrased intrinsically in Euclidean geometry.

A useful meta-observation (for exposition): Analytic proofs often read as longer on paper even when they are shorter in Lean. Human exposition must spell out coordinate choices, compute intermediate quantities, and keep track of case splits that feel mechanical. By contrast, synthetic arguments can cite high-level results (similarity, angle chasing, Menelaus/Ceva, etc.), which compresses the narrative. Formalization reverses this economy: Lean is exceptionally good at compressing algebra once the model is fixed, while synthetic geometry tends to incur higher bookkeeping costs.

In short, the length difference is largely an artifact of formalization strategy: Lean rewards algebraic normalization, whereas synthetic developments pay more overhead for the generality and abstraction they retain.