Kimina-Prover:在大型形式化推理模型上应用测试时强化学习搜索
Numina & Kimi 团队

我们很高兴地宣布发布 Kimina-Prover-72B,这是我们基于 Qwen2.5-72B [2],通过 Kimi k1.5[1] RL 管道训练的最新定理证明模型。同时,我们还发布了两种蒸馏变体:Kimina-Prover-Distill-8B 和 1.7B(分别基于 Qwen3-8B 和 Qwen3-1.7B[3])。
我们的主要创新包括:
测试时强化学习搜索(Test-Time Reinforcement Learning Search):一个可训练的智能体证明框架,它使模型能够递归地发现、组合和应用多个引理来构建复杂的证明,并基于一种新颖的引理启用模式。
错误修正能力:Kimina-Prover 可以读取和解释 Lean 的错误消息,并提出有针对性的修复,与从头开始重新生成证明相比,显示出显著更高的样本效率。
这些进步使 Kimina-Prover 能够解决具有挑战性的数学问题并超越现有方法。如图 1 所示,在广泛使用的 miniF2F 基准测试中,Kimina-Prover 达到了 92.2% 的最先进通过率。
引言
我们专注于 Lean 4 语言中的自动化定理证明 (ATP),旨在自动化形式化数学证明的构建。神经定理证明的最新进展显著提高了 AI 系统辅助或自动化这一过程的能力。值得注意的进展包括 Google DeepMind 的 AlphaProof[4],它在国际数学奥林匹克竞赛级别的问题上表现出色。结合强化学习的开源系统,如 DeepSeek-Prover-V2[5],也取得了最先进的成果。此外,像 DSP+[6] 这样的神经符号智能体方法表明,通过在模块化框架中利用现成模型,无需大规模训练也能获得具有竞争力的性能。
我们早期的 Kimina-Prover Preview[7] 工作引入了一种用于 Lean 形式化定理证明的大型语言模型,在 miniF2F 基准测试上建立了新的性能基线。该模型采用大规模强化学习管道进行训练,采用推理驱动的探索范式,并证明了更大的模型可以作为更强大的形式化推理器。其结构化推理模式实现了高效的证明搜索,并模仿了人类解决问题的策略。
在初次成功之后,我们通过进一步的强化学习迭代继续改进模型。然而,单步推理不足以解决需要长而多阶段证明的复杂问题。为了解决这一限制,我们引入了测试时强化学习(TTRL)搜索框架,该框架使模型能够自主发现、组合和重用多个中间引理。该框架通过将难题分解为可重用的子组件来支持更深、更长期的推理。
TTRL 搜索的一个关键要素是引理启用模式,它允许模型在证明构建过程中识别和应用中间引理。这种对中间结果的结构化重用显著扩展了模型的解决问题能力,超越了单步生成。
为了进一步增强鲁棒性,我们还集成了错误修正机制,该机制可以解释 Lean 的错误消息并提出有针对性的修正。这使得模型能够通过迭代反馈改进其输出,从而提高证明的可靠性和整体样本效率。
新的里程碑
模型 | pass@1 | pass@32 | pass@1024 |
---|---|---|---|
Kimina-Prover-1.7B | 46.7 | 73.4 | — |
DSP+ | 52.5 | 71.3 | 80.7 |
DeepSeek-Prover-V2-7B | 58.6 | 75.6 | 79.9 |
Kimina-Prover-8B | 61.1 | 78.3 | — |
DeepSeek-Prover-V2-671B | 61.9 | 82.4 | 86.6 |
Kimina-Prover-72B | 63.9 | 84.0 | 87.7 |
所提出的技术结合起来显著提高了形式化定理证明的性能。在 miniF2F 基准测试中,Kimina-Prover 在 pass@32 下达到了 84.0% 的通过率,在增加一轮错误校正后达到了 86.4%。在 pass@1024 下,通过率达到了 87.7%。应用完整的测试时强化学习(TTRL)搜索框架,最终通过率达到 92.2%,估计通过上限约为 42,000。然而,这个通过预算在未来版本中可以大幅优化,因为当前大部分采样都花在证明无用或冗余的引理上。
值得注意的是,这些结果表明了证明器扩展行为的转变。虽然早期版本在对数尺度上随采样预算的增加表现出近似线性的改进,但当前系统在 pass@1024 之后显示出收益递减。这表明进一步的收益较少依赖于增加采样,而更多地需要更复杂的搜索策略,例如 TTRL 引入的策略。
方法论
引理启用模式
引理启用模式旨在使模型具备识别和利用输入中提供的有用引理的能力。为了支持此功能,在强化学习(RL)训练期间,会将一个由一到三个形式化引理组成的随机子集预置到问题上下文中,使证明器能够接触到可能有助于构建最终证明的潜在有用中间结果。这些引理通过两步管道准备:(1) 通用 LLM 生成自然语言的候选引理;(2) 然后使用我们的 Kimina-Autoformalizer-7b
将其翻译成形式化语句。
初步观察表明,模型结合所提供引理的倾向较低。为了解决这个问题,我们在强化学习框架中引入了一种基于偏好的奖励塑造策略。当一个定理可以通过多个轨迹被证明时,成功利用所提供引理的解决方案会获得更高的奖励,而未利用引理的解决方案则会受到惩罚。这种方法被证明是有效的,训练后引理利用率稳定地提高到 30-40%。重要的是,这种方法不仅鼓励了引理的使用,还促进了选择性:模型学会了在有用时策略性地应用引理,同时忽略不相关的引理,展示出更高效和类似人类的推理行为。
TTRL 搜索
引理启用模式允许模型将预生成的引理作为构建证明的中间步骤。然而,我们观察到随机采样和插入引理不足以解决需要高度结构化和深度嵌套推理的复杂问题。为了解决这一限制,我们开发了测试时强化学习搜索框架——一种可训练的智能体方法,系统地组织、过滤和组合大量候选引理以构建完整证明。该框架将过程从随机探索转变为更具战略性的目标导向搜索。
如图 2 所示,我们将每个问题的搜索范围定义为问题本身及其相关的候选引理。TTRL 搜索在每个搜索范围内跟踪引理利用分数,以衡量每个引理对最终证明贡献的频率和有效性。在每个 RL 训练迭代开始时,对于每个问题(即搜索范围),我们通过预置不同的引理组合来构建 K = 10 个输入变体。为了平衡探索和利用,60% 的输入是使用排名靠前的引理(即利用分数最高的引理)构建的,将模型集中在最有希望的证明路径上。其余 40% 则通过包含这些排名靠前的引理以及一到四个随机选择的额外引理来形成,鼓励探索新颖且可能 有用的引理组合。
为确保质量,过滤机制会剪枝那些持续未能做出有意义贡献的引理:任何在 50 次插入尝试后未能达到至少 τ=0.10 利用分数的引理都将从搜索池中移除。
TTRL 的一个关键特性是其递归搜索机制。搜索范围不限于原始定理,也为每个引理维护,从而允许框架将问题递归分解为更小的子问题。一个并行的子引理生成过程贯穿始终,并且当定理或引理在 N = 128 次尝试后仍未能找到证明时,会生成新的候选子引理。这种递归策略使得推理深度可以在测试时扩展,显著扩展了模型的有效问题解决能力。
为了保持逻辑正确性,我们解决了错误形式化引理可能导致平凡或无效证明的故障模式。在这种情况下,模型可能会利用不一致性来构建看似有效但不正确的解决方案。为了防止这种情况,我们引入了一个否定证明过程:对于每个新生成的引理,我们尝试证明其逻辑否定。如果否定语句是可证明的,则表明原始引理在逻辑上不一致,并立即丢弃。此步骤确保了整个证明构建过程的可靠性和正确性。
在 Kimina-Prover 中启用错误修正
最近先进的定理证明模型的一个关键限制是它们缺乏根据证明助手的反馈来纠正证明的能力——而人类用户则经常利用这些能力。为了弥补这一差距,我们开发了一个专用框架,将错误修正能力集成到 Kimina-Prover 中。
用于错误修正的 SFT 数据生成。通用大型语言模型在解释 Lean 的错误消息和提出有效修正时成功率较低。为了克服这一点,我们构建了一个专门的监督微调(SFT)数据集,用于错误修正。该数据集包含形式为(错误证明,Lean 反馈,正确证明)的三元组。为了丰富监督信号,我们提示 Claude 3.7 sonnet[8] 合成逐步推理链,解释如何使用提供的反馈将错误证明转换为正确证明。结果是一个高质量的数据集,不仅包括初始和已修正的证明,还包括中间推理,从而促进更有效的学习。
批量失败重放策略。将错误校正直接集成到强化学习(RL)循环中最初被证明是无效的,因为 SFT 模型在修复错误方面的成功率较低(约 1%),导致奖励稀疏和训练不稳定。为了解决这个问题,我们设计了批量失败重放策略。在 RL 迭代期间,我们不立即尝试纠正错误,而是收集迭代 N 中的所有失败证明尝试。在随后的迭代 N+1 中,训练批次由这些先前的失败(例如 500 个样本)和提示集中的标准问题(例如另外 500 个样本)的固定大小并集组成。这确保了在每个训练步骤中持续大量地接触错误校正任务,使模型能够以稳定且数据高效的方式逐渐学习有效的错误处理行为。
16+16 尝试并修正 | 32×1 暴力 | 32+32 尝试并修正 | |
---|---|---|---|
kimina-prover | 35.6 | 28.8 | 44.1 |
这种训练方法显著提高了模型从失败中恢复的能力。模型从纠正基本语法错误,发展到解决复杂的逻辑错误,最终在初始尝试失败时发现替代证明策略。至关重要的是,这种能力提高了样本效率。如表 2 所示,我们比较了在固定计算预算下的不同策略。16+16 尝试并修复策略——包括 16 次初始证明尝试,每次尝试后进行一次错误修复尝试——达到了 35.6% 的成功率,优于 32×1 暴力基线,后者在 32 次独立尝试中达到了 28.8%。进一步将样本预算增加到 32+32 并进行错误修正,成功率达到 44.1%。这些结果表明,让模型能够修正自己的错误比重复试错更有效地利用计算资源。
其他改进
除了我们的核心 TTRL 搜索和错误修正功能外,我们还开发了其他几种新颖的技术来增强模型的学习过程和解决问题的能力。
提示集整理与迭代。有效的数据整理对强化学习(RL)效率至关重要。我们最初的 30 万个问题提示集被精简为大约 9 万个以竞赛为重点的子集,其中侧重于来自 NuminaMath 1.5[9] 的 Olympiads-ref 子集等来源的问题。在 RL 期间,我们应用了动态过滤:成功率一直很高的问题被移除以保持挑战性,而持续困难的问题则使用通用 LLM 分解为更简单的子问题或变体。最终的提示集结合了用于广泛覆盖的自动形式化语句、用于质量控制的人工标注问题以及用于有针对性技能开发的增强变体。整理后的提示集将开源。
Lean 数据上的持续预训练。为了解决大多数基础模型在 Lean 熟练度方面的局限性,我们应用了持续预训练(CPT)。我们构建了一个 60 亿词元的 CPT 数据集,数据来源于多个地方:来自 GitHub 等在线平台的 2.6 亿词元,来自我们 RL 管道的 55 亿词元的编译器验证推出数据,以及以 state–tactic–state
和 state–tactic–error
格式表示的额外结构化数据,以增强多样性。这些结构化数据为基础模型补充了额外的领域知识。
随机证明裁剪数据增强。为了更好地利用缺乏中间推理步骤的高质量人工标注证明,我们开发了随机证明裁剪技术。该方法使用两种变体增强此类数据:
- 证明截断:移除证明的最后一部分,模型必须完成它。
- 证明填充:将随机选择的内部块替换为占位符标记
sorry
,模型必须重建缺失的步骤。
该策略允许模型学习如何扩展现有推理线并填补逻辑空白,从而有效地将人工生成的内容集成到训练过程中。
非证明问题解决。许多问题以要求最终答案的形式呈现,这无法自然地构架为传统的证明任务。受 CombiBench[10] 评估方法的启发,我们引入了非证明问题解决能力,将答案生成与证明构建统一起来。在这种设置下,模型会获得一个最终答案字段留空的问题。它在一次推理中执行两阶段过程:首先推导出正确答案,然后生成完整的形式化证明来证明该答案。这种方法确保了模型的推理明确地以达到正确解决方案为条件。
证明案例示例
这里我们提供了一些由我们最新的 Kimina-prover 模型发现的证明案例。已解决的 miniF2F 问题的完整列表可在 GitHub 上找到
TTRL 搜索证明示例
系统使用 TTRL 搜索方法递归地将问题分解为四层引理,最终生成了一个 520 行的形式化证明。在这个递归证明过程中,最初,只有一个引理 sum_cexp_div_pow_ne_zero
仍待解决。虽然这个结论在非形式上证明起来似乎很简单,但在 Lean 4 中严格证明它仍然需要很多步骤。我们的系统递归地生成了另外两层引理,并为 sum_cexp_div_pow_ne_zero
获得了 260 行的形式化证明。此证明在 Mathlib v4.15 下有效
imo 1969 p2 的证明示例
证明结构

形式化代码
import Mathlib
open Finset in
theorem norm_le_sum_norm_of_sum_eq_zero {N : ℕ} (hN : N > 0) (z : ℕ → ℂ)
(hsum : ∑ i ∈ range N, z i = 0) :
‖z 0‖ ≤ ∑ i ∈ Icc 1 (N - 1), ‖z i‖ := by
by_cases h : N > 1
·
have h6 : ∑ i in range N, z i = z 0 + ∑ i in Icc 1 (N - 1), z i := by
have h10 : range N = {0} ∪ Icc 1 (N - 1) := by
ext x
simp
omega
rw [h10]
rw [sum_union]
·
simp
·
apply disjoint_left.mpr
intro x hx1 hx2
simp at hx1
simp [hx1] at hx2
have h3 : z 0 = - (∑ i ∈ Icc 1 (N - 1), z i) := by
have h4 : ∑ i in range N, z i = 0 := hsum
rw [h6] at h4
calc
z 0 = (z 0 + ∑ i in Icc 1 (N - 1), z i) - ∑ i in Icc 1 (N - 1), z i := by ring
_ = (0 : ℂ) - ∑ i in Icc 1 (N - 1), z i := by rw [h4]
_ = - (∑ i in Icc 1 (N - 1), z i) := by ring
have h11 : ‖z 0‖ = ‖(∑ i ∈ Icc 1 (N - 1), z i : ℂ)‖ := by
rw [h3]
simp [norm_neg]
have h12 : ‖(∑ i ∈ Icc 1 (N - 1), z i : ℂ)‖ ≤ ∑ i ∈ Icc 1 (N - 1), ‖z i‖ := by
apply norm_sum_le
linarith [h11, h12]
·
have h5 : N = 1 := by
omega
rw [h5]
have h6 : ∑ i in range 1, z i = 0 := by
have h7 : N = 1 := h5
have h8 : ∑ i in range N, z i = 0 := hsum
rw [show N = 1 by omega] at h8
simpa using h8
have h7 : z 0 = 0 := by
simp at h6
all_goals
try tauto
rw [show z 0 = (0 : ℂ) by exact h7]
simp
open Finset in
theorem sum_pow_one_half_from_two_to_N {N : ℕ} (hN : 0 < N) :
∑ j ∈ Icc 1 (N - 1), (1 / 2 : ℝ) ^ (j + 1) = 1 / 2 - 1 / 2 ^ N := by
have h2 : ∀ (k : ℕ), ∑ j in Icc 1 (k), (1 / 2 : ℝ) ^ (j + 1) = 1 / 2 - 1 / 2 ^ (k + 1 : ℕ) := by
intro k
induction k with
| zero =>
simp
| succ k ihk =>
rw [sum_Icc_succ_top (by linarith)]
rw [ihk]
field_simp at *
ring_nf
have h4 : ∑ j in Icc 1 (N - 1), (1 / 2 : ℝ) ^ (j + 1) = 1 / 2 - 1 / 2 ^ (N : ℕ) := by
have h5 : ∑ j in Icc 1 (N - 1), (1 / 2 : ℝ) ^ (j + 1) = 1 / 2 - 1 / 2 ^ ((N - 1 : ℕ) + 1 : ℕ) := by
apply h2 (N - 1)
rw [h5]
have h6 : (N - 1 : ℕ) + 1 = N := by
omega
rw [h6]
exact h4
open Complex Finset in
theorem sum_eq_zero_of_sq_add_sq_eq_geom_seq_false {N : ℕ} (hN : 0 < N) (x y : Fin N → ℝ)
(hxsum : ∑ i : Fin N, x i = 0) (hysum : ∑ i : Fin N, y i = 0)
(hxy : ∀ j : Fin N, (x j)^2 + (y j)^2 = (1/4)^(j.1+1)) :
False := by
let z : ℕ → ℂ := fun i => if h : i < N then (x ⟨i, h⟩ : ℂ) + (y ⟨i, h⟩ : ℂ) * I else 0
have hsumz : ∑ i in range N, z i = 0 := by
have h4 : ∑ i in range N, z i = (∑ i : Fin N, ((x i : ℂ) + (y i : ℂ) * I)) := by
simp [z, Finset.sum_fin_eq_sum_range, Finset.sum_congr]
rw [h4]
simp [Complex.ext_iff, Finset.sum_congr, hxsum, hysum]
have hz0 : ‖z 0‖ ≤ ∑ i ∈ Icc 1 (N - 1), ‖z i‖ := norm_le_sum_norm_of_sum_eq_zero hN z hsumz
have h2 : ‖z 0‖ = (1 / 2 : ℝ) := by
have h6 : z 0 = (x ⟨0, by omega⟩ : ℂ) + (y ⟨0, by omega⟩ : ℂ) * I := by
have h7 : (0 : ℕ) < N := by omega
simp [z, h7]
rw [h6]
have h7 : ((x ⟨0, by omega⟩ : ℝ) ^ 2 + (y ⟨0, by omega⟩ : ℝ) ^ 2) = (1 / 4 : ℝ) := by
have h8 := hxy (⟨0, by omega⟩)
norm_num at h8 ⊢
nlinarith
have h8 : ‖((x ⟨0, by omega⟩ : ℂ) + (y ⟨0, by omega⟩ : ℂ) * I)‖ = Real.sqrt ((x ⟨0, by omega⟩ : ℝ) ^ 2 + (y ⟨0, by omega⟩ : ℝ) ^ 2) := by
simp [Complex.norm_eq_abs, Complex.abs, Complex.normSq]
all_goals
ring_nf
rw [h8, h7]
have h9 : Real.sqrt ((1 / 4 : ℝ)) = (1 / 2 : ℝ) := by
rw [Real.sqrt_eq_iff_mul_self_eq] <;> norm_num
rw [h9]
have h3 : ∑ i ∈ Icc 1 (N - 1), ‖z i‖ = ∑ i ∈ Icc 1 (N - 1), (1 / 2 : ℝ) ^ (i + 1 : ℕ) := by
apply Finset.sum_congr
.
rfl
.
intro i hi
have h10 : i ≥ 1 := by
have h11 : i ∈ Icc (1 : ℕ) (N - 1) := by
simpa using hi
simp at h11 ⊢
omega
have h11 : i ≤ N - 1 := by
have h12 : i ∈ Icc (1 : ℕ) (N - 1) := by
simpa using hi
simp at h12 ⊢
omega
have h12 : i < N := by
omega
have h13 : z i = (x ⟨i, by omega⟩ : ℂ) + (y ⟨i, by omega⟩ : ℂ) * I := by
have h14 : i < N := by omega
simp [z, h14]
have h14 : ‖z i‖ = (1 / 2 : ℝ) ^ (i + 1 : ℕ) := by
rw [h13]
have h14 : ((x ⟨i, by omega⟩ : ℝ) ^ 2 + (y ⟨i, by omega⟩ : ℝ) ^ 2) = (1 / 4 : ℝ) ^ (i + 1 : ℕ) := by
have h15 := hxy (⟨i, by omega⟩)
norm_num at h15 ⊢
nlinarith
have h15 : ‖((x ⟨i, by omega⟩ : ℂ) + (y ⟨i, by omega⟩ : ℂ) * I)‖ = Real.sqrt ((x ⟨i, by omega⟩ : ℝ) ^ 2 + (y ⟨i, by omega⟩ : ℝ) ^ 2) := by
simp [Complex.norm_eq_abs, Complex.abs, Complex.normSq]
all_goals
ring_nf
rw [h15]
have h16 : Real.sqrt ((x ⟨i, by omega⟩ : ℝ) ^ 2 + (y ⟨i, by omega⟩ : ℝ) ^ 2) = (1 / 2 : ℝ) ^ (i + 1 : ℕ) := by
have h17 : ((x ⟨i, by omega⟩ : ℝ) ^ 2 + (y ⟨i, by omega⟩ : ℝ) ^ 2) = (1 / 4 : ℝ) ^ (i + 1 : ℕ) := h14
rw [h17]
have h18 : Real.sqrt ((1 / 4 : ℝ) ^ (i + 1 : ℕ)) = (1 / 2 : ℝ) ^ (i + 1 : ℕ) := by
have h19 : ((1 / 4 : ℝ) : ℝ) = (1 / 2 : ℝ) ^ (2 : ℕ) := by norm_num
rw [h19]
have h20 : Real.sqrt (((1 / 2 : ℝ) ^ (2 : ℕ)) ^ (i + 1 : ℕ)) = (1 / 2 : ℝ) ^ (i + 1 : ℕ) := by
have h21 : ((1 / 2 : ℝ) ^ (2 : ℕ)) ^ (i + 1 : ℕ) = ((1 / 2 : ℝ) ^ (i + 1 : ℕ)) ^ 2 := by
have h22 : ((1 / 2 : ℝ) ^ (2 : ℕ)) ^ (i + 1 : ℕ) = (1 / 2 : ℝ) ^ (2 * (i + 1 : ℕ) : ℕ) := by
rw [← pow_mul]
all_goals ring_nf
have h23 : ((1 / 2 : ℝ) ^ (i + 1 : ℕ)) ^ 2 = (1 / 2 : ℝ) ^ (2 * (i + 1 : ℕ) : ℕ) := by
ring_nf
rw [h22, h23]
rw [h21]
have h22 : Real.sqrt (((1 / 2 : ℝ) ^ (i + 1 : ℕ)) ^ 2) = (1 / 2 : ℝ) ^ (i + 1 : ℕ) := by
rw [Real.sqrt_sq (show (1 / 2 : ℝ) ^ (i + 1 : ℕ) ≥ 0 by positivity)]
rw [h22]
exact h20
exact h18
exact h16
linarith
rw [h3] at hz0
rw [h2] at hz0
have h6 : ∑ i ∈ Icc 1 (N - 1), (1 / 2 : ℝ) ^ (i + 1 : ℕ) = (1 / 2 - 1 / 2 ^ N : ℝ) := by
have h7 : ∑ i ∈ Icc 1 (N - 1), (1 / 2 : ℝ) ^ (i + 1 : ℕ) = (1 / 2 - 1 / 2 ^ N : ℝ) := sum_pow_one_half_from_two_to_N hN
linarith
rw [h6] at hz0
have h7 : (1 / 2 ^ N : ℝ) > 0 := by
have h10 : (1 / 2 ^ N : ℝ) > 0 := by
have h11 : (1 / 2 ^ N : ℝ) = (1 / 2 : ℝ) ^ N := by
ring_nf
rw [h11]
positivity
linarith
have h11 : (1 / 2 - 1 / 2 ^ N : ℝ) < (1 / 2 : ℝ) := by
have h12 : (1 / 2 ^ N : ℝ) > 0 := h7
linarith
linarith
open BigOperators Real Nat Topology Rat in
theorem sum_cexp_div_pow_ne_zero {N : ℕ} (hN : N > 0) (a : Fin N → ℝ) :
((∑ j : Fin N, cos (a j) / 2 ^ (j : ℕ))^2 +
(∑ j : Fin N, sin (a j) / 2 ^ (j : ℕ))^2) > 0 := by
by_contra h
push_neg at h
have h1 : (∑ j : Fin N, cos (a j) / 2 ^ (j : ℕ)) ^ 2 ≥ 0 := by
apply sq_nonneg
have h2 : (∑ j : Fin N, sin (a j) / 2 ^ (j : ℕ)) ^ 2 ≥ 0 := by
apply sq_nonneg
have h3 : (∑ j : Fin N, cos (a j) / 2 ^ (j : ℕ)) ^ 2 = 0 := by
linarith
have h4 : (∑ j : Fin N, sin (a j) / 2 ^ (j : ℕ)) ^ 2 = 0 := by
linarith
have h5 : (∑ j : Fin N, cos (a j) / 2 ^ (j : ℕ)) = 0 := by
nlinarith
have h6 : (∑ j : Fin N, sin (a j) / 2 ^ (j : ℕ)) = 0 := by
nlinarith
let x : Fin N → ℝ := fun j => cos (a j) / 2 ^ (j.1 + 1 : ℕ)
let y : Fin N → ℝ := fun j => sin (a j) / 2 ^ (j.1 + 1 : ℕ)
have hxsum : ∑ i : Fin N, x i = 0 := by
have eq2 : ∑ i : Fin N, x i = (∑ i : Fin N, (cos (a i) / 2 ^ (i.1 : ℕ) : ℝ)) / 2 := by
have eq3 : ∑ i : Fin N, x i = ∑ i : Fin N, (cos (a i) / 2 ^ (i.1 + 1 : ℕ) : ℝ) := by
congr
rw [eq3]
have eq4 : ∑ i : Fin N, (cos (a i) / 2 ^ (i.1 + 1 : ℕ) : ℝ) = (∑ i : Fin N, (cos (a i) / 2 ^ (i.1 : ℕ) : ℝ)) / 2 := by
calc
∑ i : Fin N, (cos (a i) / 2 ^ (i.1 + 1 : ℕ) : ℝ)
= ∑ i : Fin N, ((cos (a i) / 2 ^ (i.1 : ℕ) : ℝ) / 2) := by
apply Finset.sum_congr
.
rfl
intro i hi
have h12 : (cos (a i) / 2 ^ (i.1 + 1 : ℕ) : ℝ) = (cos (a i) / 2 ^ (i.1 : ℕ) : ℝ) / 2 := by
have h11 : (2 : ℝ) ^ (i.1 + 1 : ℕ) = (2 : ℝ) ^ (i.1 : ℕ) * 2 := by
ring_nf
rw [h11]
field_simp
all_goals ring
exact h12
_ = (∑ i : Fin N, (cos (a i) / 2 ^ (i.1 : ℕ) : ℝ)) / 2 := by
simp [Finset.sum_div]
exact eq4
rw [eq2]
rw [show ∑ i : Fin N, (cos (a i) / 2 ^ (i.1 : ℕ) : ℝ) = (0 : ℝ) by simpa using h5]
all_goals norm_num
have hysum : ∑ i : Fin N, y i = 0 := by
have eq2 : ∑ i : Fin N, y i = (∑ i : Fin N, (sin (a i) / 2 ^ (i.1 : ℕ) : ℝ)) / 2 := by
have eq3 : ∑ i : Fin N, y i = ∑ i : Fin N, (sin (a i) / 2 ^ (i.1 + 1 : ℕ) : ℝ) := by
congr
rw [eq3]
have eq4 : ∑ i : Fin N, (sin (a i) / 2 ^ (i.1 + 1 : ℕ) : ℝ) = (∑ i : Fin N, (sin (a i) / 2 ^ (i.1 : ℕ) : ℝ)) / 2 := by
calc
∑ i : Fin N, (sin (a i) / 2 ^ (i.1 + 1 : ℕ) : ℝ)
= ∑ i : Fin N, ((sin (a i) / 2 ^ (i.1 : ℕ) : ℝ) / 2) := by
apply Finset.sum_congr
.
rfl
intro i hi
have h12 : (sin (a i) / 2 ^ (i.1 + 1 : ℕ) : ℝ) = (sin (a i) / 2 ^ (i.1 : ℕ) : ℝ) / 2 := by
have h11 : (2 : ℝ) ^ (i.1 + 1 : ℕ) = (2 : ℝ) ^ (i.1 : ℕ) * 2 := by
ring_nf
rw [h11]
field_simp
all_goals ring
exact h12
_ = (∑ i : Fin N, (sin (a i) / 2 ^ (i.1 : ℕ) : ℝ)) / 2 := by
simp [Finset.sum_div]
exact eq4
rw [eq2]
rw [show ∑ i : Fin N, (sin (a i) / 2 ^ (i.1 : ℕ) : ℝ) = (0 : ℝ) by simpa using h6]
all_goals norm_num
have hxy : ∀ j : Fin N, (x j) ^ 2 + (y j) ^ 2 = (1 / 4 : ℝ) ^ (j.1 + 1 : ℕ) := by
intro j
have h11 : cos (a j) ^ 2 + sin (a j) ^ 2 = 1 := by
exact Real.cos_sq_add_sin_sq (a j)
simp [x, y]
have h12 : (4 : ℝ) ^ (j.1 + 1 : ℕ) = (2 : ℝ) ^ (2 * (j.1 + 1 : ℕ)) := by
have h13 : (4 : ℝ) = (2 : ℝ) ^ (2 : ℕ) := by norm_num
rw [h13]
rw [← pow_mul]
have h13 : (2 : ℝ) ^ (2 * (j.1 + 1 : ℕ)) = ((2 : ℝ) ^ (j.1 + 1 : ℕ)) ^ 2 := by
ring_nf
field_simp [h11, h12, h13]
all_goals
ring_nf
have h12 : False := sum_eq_zero_of_sq_add_sq_eq_geom_seq_false (show 0 < N by omega) x y hxsum hysum hxy
tauto
open BigOperators Real Nat Topology Rat in
theorem mul_cos_sub_mul_sin_eq_mul_cos_add (C S : ℝ) (h : C ^ 2 + S ^ 2 ≠ 0) :
∃ R α : ℝ, R > 0 ∧ ∀ x : ℝ, C * cos x - S * sin x = R * cos (x + α) := by
have h1 : C ^ 2 + S ^ 2 > 0 := by
by_contra h2
push_neg at h2
have h3 : C ^ 2 ≥ 0 := sq_nonneg C
have h4 : S ^ 2 ≥ 0 := sq_nonneg S
have h5 : C ^ 2 + S ^ 2 = 0 := by nlinarith
tauto
have h2 : ∃ R : ℝ, R > 0 ∧ R ^ 2 = C ^ 2 + S ^ 2 := by
use Real.sqrt (C ^ 2 + S ^ 2)
constructor
· -- Show that sqrt (C^2 + S^2) > 0
apply Real.sqrt_pos.mpr
linarith
· -- Show that (sqrt (C^2 + S^2)) ^ 2 = C^2 + S^2
rw [Real.sq_sqrt]
positivity
rcases h2 with ⟨R, hR_pos, hR_sq⟩
have h12 : (C / R) ^ 2 + (S / R) ^ 2 = 1 := by
have h14 : R ≠ 0 := by linarith
have h15 : R ^ 2 = C ^ 2 + S ^ 2 := hR_sq
field_simp at *
nlinarith
have h13 : ∃ α : ℝ, Real.cos α = C / R ∧ Real.sin α = S / R := by
have h9 : (C / R) ^ 2 + (S / R) ^ 2 = 1 := h12
by_cases hC : C ≥ 0
· -- C ≥ 0
use Real.arcsin (S / R)
constructor
· -- Show cos (arcsin (S / R)) = C / R
have h14 : Real.cos (Real.arcsin (S / R)) = Real.sqrt (1 - (S / R) ^ 2) := by
rw [Real.cos_arcsin]
have h15 : Real.sqrt (1 - (S / R) ^ 2) = C / R := by
have h151 : (C / R) ^ 2 = 1 - (S / R) ^ 2 := by
linarith [h9]
have h161 : Real.sqrt (1 - (S / R) ^ 2) = Real.sqrt ((C / R) ^ 2) := by
rw [show 1 - (S / R) ^ 2 = (C / R) ^ 2 by linarith [h9]]
rw [h161]
have h171 : Real.sqrt ((C / R) ^ 2) = C / R := by
apply Real.sqrt_sq (show 0 ≤ C / R by
have h211 : R > 0 := hR_pos
apply div_nonneg
linarith
linarith
)
rw [h171]
rw [h14, h15]
· -- Show sin (arcsin (S / R)) = S / R
have h20 : -1 ≤ S / R := by
have h6 : (S / R) ^ 2 ≤ 1 := by nlinarith [h9]
nlinarith [sq_nonneg (S / R - 1), sq_nonneg (S / R + 1)]
have h21 : S / R ≤ 1 := by
have h6 : (S / R) ^ 2 ≤ 1 := by nlinarith [h9]
nlinarith [sq_nonneg (S / R - 1), sq_nonneg (S / R + 1)]
have h18 : Real.sin (Real.arcsin (S / R)) = S / R := by
apply Real.sin_arcsin
all_goals linarith
rw [h18]
· -- C < 0
have hC2 : C < (0 : ℝ) := by linarith
have hC3 : C / R < 0 := by
have hR_pos1 : R > 0 := hR_pos
apply div_neg_of_neg_of_pos hC2 (by linarith)
use Real.pi - Real.arcsin (S / R)
constructor
· -- Show cos (π - arcsin (S / R)) = C / R
have h28 : Real.cos (Real.pi - Real.arcsin (S / R)) = - Real.sqrt (1 - (S / R) ^ 2) := by
have h1 : Real.cos (Real.pi - Real.arcsin (S / R)) = - Real.cos (Real.arcsin (S / R)) := by
rw [Real.cos_pi_sub]
have h2 : Real.cos (Real.arcsin (S / R)) = Real.sqrt (1 - (S / R) ^ 2) := by
rw [Real.cos_arcsin]
rw [h1, h2]
have h29 : Real.sqrt (1 - (S / R) ^ 2) = - (C / R) := by
have h301 : (C / R) ^ 2 = 1 - (S / R) ^ 2 := by
linarith [h9]
have h311 : Real.sqrt (1 - (S / R) ^ 2) = Real.sqrt ((C / R) ^ 2) := by
rw [show 1 - (S / R) ^ 2 = (C / R) ^ 2 by linarith [h9]]
rw [h311]
have h321 : Real.sqrt ((C / R) ^ 2) = - (C / R) := by
have h331 : C / R < 0 := hC3
have : Real.sqrt ((C / R) ^ 2) = - (C / R) := by
rw [Real.sqrt_sq_eq_abs]
rw [abs_of_neg h331]
linarith
linarith
rw [h28, h29]
all_goals nlinarith
· -- Show sin (π - arcsin (S / R)) = S / R
have h30 : Real.sin (Real.pi - Real.arcsin (S / R)) = Real.sin (Real.arcsin (S / R)) := by
rw [Real.sin_pi_sub]
have h31 : Real.sin (Real.arcsin (S / R)) = S / R := by
have h20 : -1 ≤ S / R := by
have h6 : (S / R) ^ 2 ≤ 1 := by nlinarith [h9]
nlinarith [sq_nonneg (S / R - 1), sq_nonneg (S / R + 1)]
have h21 : S / R ≤ 1 := by
have h6 : (S / R) ^ 2 ≤ 1 := by nlinarith [h9]
nlinarith [sq_nonneg (S / R - 1), sq_nonneg (S / R + 1)]
apply Real.sin_arcsin
all_goals linarith
rw [h30, h31]
rcases h13 with ⟨α, h14, h15⟩
use R, α
constructor
· -- Show R > 0
linarith
· -- Show ∀ x : ℝ, C * cos x - S * sin x = R * cos (x + α)
intro x
have h21 : Real.cos (x + α) = Real.cos x * Real.cos α - Real.sin x * Real.sin α := by
rw [Real.cos_add]
have h22 : Real.cos α = C / R := by
linarith [h14]
have h23 : Real.sin α = S / R := by
linarith [h15]
calc
C * Real.cos x - S * Real.sin x
= R * (Real.cos x * (C / R) - Real.sin x * (S / R)) := by
field_simp [show R ≠ 0 by linarith]
all_goals ring
_ = R * (Real.cos x * Real.cos α - Real.sin x * Real.sin α) := by
rw [show Real.cos α = C / R by linarith [h14], show Real.sin α = S / R by linarith [h15]]
_ = R * Real.cos (x + α) := by
rw [h21]
theorem sum_cos_div_two_pow_eq_mul_cos (N : ℕ) (a : ℕ → ℝ) (hN : N > 0) :
∃ R0 α : ℝ, R0 > 0 ∧ ∀ x : ℝ, ∑ j : Fin N, Real.cos (a j + x) / 2 ^ j.1 =
(R0 : ℝ) * Real.cos (x + α) := by
have h2 : ((∑ j : Fin N, Real.cos (a j) / 2 ^ (j : ℕ)) ^ 2 + (∑ j : Fin N, Real.sin (a j) / 2 ^ (j : ℕ)) ^ 2) > 0 := by
apply sum_cexp_div_pow_ne_zero (by omega) (fun j : Fin N => a j)
have h4 : (∑ j : Fin N, Real.cos (a j) / 2 ^ (j : ℕ)) ^ 2 + (∑ j : Fin N, Real.sin (a j) / 2 ^ (j : ℕ)) ^ 2 ≠ 0 := by
linarith
have h5 : ∃ R α : ℝ, R > 0 ∧ ∀ x : ℝ, (∑ j : Fin N, Real.cos (a j) / 2 ^ (j : ℕ)) * Real.cos x - (∑ j : Fin N, Real.sin (a j) / 2 ^ (j : ℕ)) * Real.sin x = R * Real.cos (x + α) := by
apply mul_cos_sub_mul_sin_eq_mul_cos_add (∑ j : Fin N, Real.cos (a j) / 2 ^ (j : ℕ)) (∑ j : Fin N, Real.sin (a j) / 2 ^ (j : ℕ)) (by
exact h4
)
rcases h5 with ⟨R0, α, hR0_pos, h_eq⟩
use R0, α
constructor
.
exact hR0_pos
.
intro x
have h_eq2 : ∑ j : Fin N, Real.cos (a j + x) / 2 ^ j.1 = (∑ j : Fin N, Real.cos (a j) / 2 ^ (j : ℕ)) * Real.cos x - (∑ j : Fin N, Real.sin (a j) / 2 ^ (j : ℕ)) * Real.sin x := by
have h8 : ∀ j : Fin N, Real.cos (a j + x) / (2 : ℝ) ^ (j : ℕ) = (Real.cos (a j) * Real.cos x - Real.sin (a j) * Real.sin x) / (2 : ℝ) ^ (j : ℕ) := by
intro j
have h9 : Real.cos (a j + x) = Real.cos (a j) * Real.cos x - Real.sin (a j) * Real.sin x := by
rw [Real.cos_add]
rw [h9]
have h10 : ∑ j : Fin N, Real.cos (a j + x) / 2 ^ j.1 = ∑ j : Fin N, ((Real.cos (a j) * Real.cos x - Real.sin (a j) * Real.sin x) / (2 : ℝ) ^ (j : ℕ)) := by
apply Finset.sum_congr
.
rfl
.
intro j hj
have h9 : Real.cos (a j + x) / (2 : ℝ) ^ (j : ℕ) = (Real.cos (a j) * Real.cos x - Real.sin (a j) * Real.sin x) / (2 : ℝ) ^ (j : ℕ) := h8 j
simpa using h9
rw [h10]
have h11 : ∑ j : Fin N, ((Real.cos (a j) * Real.cos x - Real.sin (a j) * Real.sin x) / (2 : ℝ) ^ (j : ℕ)) =
(∑ j : Fin N, Real.cos (a j) / (2 : ℝ) ^ (j : ℕ)) * Real.cos x - (∑ j : Fin N, Real.sin (a j) / (2 : ℝ) ^ (j : ℕ)) * Real.sin x := by
have h12 : ∀ j : Fin N, ((Real.cos (a j) * Real.cos x - Real.sin (a j) * Real.sin x) / (2 : ℝ) ^ (j : ℕ)) =
(Real.cos (a j) / (2 : ℝ) ^ (j : ℕ)) * Real.cos x - (Real.sin (a j) / (2 : ℝ) ^ (j : ℕ)) * Real.sin x := by
intro j
ring_nf
calc
∑ j : Fin N, ((Real.cos (a j) * Real.cos x - Real.sin (a j) * Real.sin x) / (2 : ℝ) ^ (j : ℕ))
= ∑ j : Fin N, ((Real.cos (a j) / (2 : ℝ) ^ (j : ℕ)) * Real.cos x - (Real.sin (a j) / (2 : ℝ) ^ (j : ℕ)) * Real.sin x) := by
apply Finset.sum_congr
.
rfl
.
intro j hj
specialize h12 j
linarith
_ = (∑ j : Fin N, (Real.cos (a j) / (2 : ℝ) ^ (j : ℕ)) * Real.cos x) - (∑ j : Fin N, (Real.sin (a j) / (2 : ℝ) ^ (j : ℕ)) * Real.sin x) := by
rw [Finset.sum_sub_distrib]
_ = (∑ j : Fin N, Real.cos (a j) / (2 : ℝ) ^ (j : ℕ)) * Real.cos x - (∑ j : Fin N, Real.sin (a j) / (2 : ℝ) ^ (j : ℕ)) * Real.sin x := by
rw [Finset.sum_mul, Finset.sum_mul]
exact h11
rw [h_eq2]
specialize h_eq x
linarith
open Real Set in
theorem sub_eq_int_mul_pi_of_cos_eq_zero :
∀ (θ₁ θ₂ : ℝ), cos θ₁ = 0 → cos θ₂ = 0 → ∃ m : ℤ, θ₂ - θ₁ = m * π := by
intro θ₁ θ₂ h₁ h₂
have h1 : ∃ k : ℤ, θ₁ = Real.pi / 2 + k * Real.pi := by
rw [Real.cos_eq_zero_iff] at h₁
rcases h₁ with ⟨k, hk⟩
use k
linarith
rcases h1 with ⟨k, hk1⟩
have h2 : ∃ l : ℤ, θ₂ = Real.pi / 2 + l * Real.pi := by
rw [Real.cos_eq_zero_iff] at h₂
rcases h₂ with ⟨l, hl⟩
use l
linarith
rcases h2 with ⟨l, hl2⟩
use l - k
rw [hl2, hk1]
field_simp at *
<;> ring_nf <;> norm_num
open BigOperators Real Nat Topology Rat in
theorem imo_1969_p2 (m n : ℝ) (k : ℕ) (a : ℕ → ℝ) (y : ℝ → ℝ) (h₀ : 0 < k)
(h₁ : ∀ x, y x = ∑ i in Finset.range k, Real.cos (a i + x) / 2 ^ i) (h₂ : y m = 0)
(h₃ : y n = 0) : ∃ t : ℤ, m - n = t * Real.pi := by
have h4 : y m = ∑ i in Finset.range k, Real.cos (a i + m) / 2 ^ i := by
specialize h₁ m
simpa using h₁
have h5 : y n = ∑ i in Finset.range k, Real.cos (a i + n) / 2 ^ i := by
specialize h₁ n
simpa using h₁
rw [h4] at h₂
rw [h5] at h₃
have h9 : ∃ R0 α : ℝ, R0 > 0 ∧ ∀ x : ℝ, ∑ j : Fin k, Real.cos (a j + x) / 2 ^ j.1 = R0 * Real.cos (x + α) := by
apply sum_cos_div_two_pow_eq_mul_cos k (fun (j : ℕ) => a j) (by omega)
rcases h9 with ⟨R0, α, hR0, h_eq3⟩
have h10 : ∑ i in Finset.range k, Real.cos (a i + m) / 2 ^ i = R0 * Real.cos (m + α) := by
have h11 : ∑ i in Finset.range k, Real.cos (a i + m) / 2 ^ i = ∑ j : Fin k, Real.cos (a j + m) / 2 ^ j.1 := by
simp [Finset.sum_range]
rw [h11]
specialize h_eq3 m
simpa using h_eq3
have h11 : ∑ i in Finset.range k, Real.cos (a i + n) / 2 ^ i = R0 * Real.cos (n + α) := by
have h12 : ∑ i in Finset.range k, Real.cos (a i + n) / 2 ^ i = ∑ j : Fin k, Real.cos (a j + n) / 2 ^ j.1 := by
simp [Finset.sum_range]
rw [h12]
specialize h_eq3 n
simpa using h_eq3
have h10' : R0 * Real.cos (m + α) = 0 := by
have h_eq10 : ∑ i in Finset.range k, Real.cos (a i + m) / 2 ^ i = 0 := by
linarith [h₂, h10]
linarith [h10, h_eq10]
have h11' : R0 * Real.cos (n + α) = 0 := by
have h_eq11 : ∑ i in Finset.range k, Real.cos (a i + n) / 2 ^ i = 0 := by
linarith [h₃, h11]
linarith [h11, h_eq11]
have h12 : Real.cos (m + α) = 0 := by
apply (mul_left_inj' (ne_of_gt hR0)).mp
linarith
have h13 : Real.cos (n + α) = 0 := by
apply (mul_left_inj' (ne_of_gt hR0)).mp
linarith
have h14 : ∃ m1 : ℤ, (n + α) - (m + α) = m1 * Real.pi := by
apply sub_eq_int_mul_pi_of_cos_eq_zero (m + α) (n + α) h12 h13
rcases h14 with ⟨m1, hm1⟩
use -m1
have hm4 : (m : ℝ) - (n : ℝ) = - (m1 * Real.pi) := by
have hm3 : (n + α : ℝ) - (m + α : ℝ) = (m1 : ℝ) * Real.pi := by
linarith [hm1]
nlinarith
have hm5 : (-m1 : ℤ) * Real.pi = - (m1 * Real.pi) := by
simp
rw [hm5]
linarith [hm4]
错误修正示例
在下面的示例中,初始证明由于直接将 simp
应用于 ((Real.sqrt 3 / 2 : ℂ) + (1 / 2) * Complex.I) ^ 12
导致搜索复杂性过高而超时。通过逐步计算中间幂(如 $z^3$、然后是 $z^6$、最后是 $z^{12}$)来分解证明,显著降低了编译器的计算负担。
错误修正的证明示例
错误修正前的证明
import Mathlib
theorem algebra_299086 {z : ℂ} (hz : z = (Real.sqrt 3 / 2) + (1 / 2) * .I) : z ^ 2016 = 1 := by
have h1 : z = (Real.sqrt 3 / 2) + (1 / 2) * Complex.I := hz
-- Prove z^12 = 1
have h2 : z ^ 12 = 1 := by
rw [h1]
have h3 : ((Real.sqrt 3 / 2 : ℂ) + (1 / 2) * Complex.I) ^ 12 = 1 := by
simp [pow_succ, pow_zero, Complex.ext_iff, Complex.I_mul_I, mul_add, add_mul, mul_comm, mul_assoc, pow_two, pow_three, pow_succ, Real.sqrt_sq]
field_simp [Complex.ext_iff, mul_add, add_mul, pow_two, pow_three]
<;> ring_nf <;> norm_num [Real.sqrt_sq]
exact h3
calc
z ^ 2016 = (z ^ 12) ^ 168 := by ring
_ = 1 ^ 168 := by rw [h2]
_ = 1 := by norm_num
错误信息
Error message:
Lean process timed out
错误修正后的证明
import Mathlib
theorem algebra_299086 {z : ℂ} (hz : z = (Real.sqrt 3 / 2) + (1 / 2) * .I) : z ^ 2016 = 1 := by
have h1 : z = (Real.sqrt 3 / 2) + (1 / 2) * Complex.I := by
exact_mod_cast hz
have h2 : z ^ 3 = Complex.I := by
rw [h1]
simp [pow_three, Complex.ext_iff, Complex.add_re, Complex.add_im, Complex.mul_re, Complex.mul_im, mul_add, add_mul, pow_two, mul_comm, mul_assoc]
field_simp
<;> ring_nf <;> norm_num
have h3 : z ^ 6 = -1 := by
calc
z ^ 6 = (z ^ 3) ^ 2 := by ring
_ = Complex.I ^ 2 := by rw [h2]
_ = -1 := by simp [Complex.I_sq]
have h4 : z ^ 12 = 1 := by
calc
z ^ 12 = (z ^ 6) ^ 2 := by ring
_ = (-1) ^ 2 := by rw [h3]
_ = 1 := by norm_num
calc
z ^ 2016 = (z ^ 12) ^ 168 := by
ring_nf
_ = (1) ^ 168 := by
rw [h4]
_ = 1 := by
norm_num
引用
如果您想引用 Kimina-Prover,目前请随意引用我们的预览论文。完整论文正在准备中,即将发布。
@article{kimina_prover_2025,
title = {Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning},
author = {Wang, Haiming and Unsal, Mert and Lin, Xiaohan and Baksys, Mantas and Liu, Junqi and Santos, Marco Dos and Sung, Flood and Vinyes, Marina and Ying, Zhenzhe and Zhu, Zekai and Lu, Jianqiao and Saxcé, Hugues de and Bailey, Bolton and Song, Chendong and Xiao, Chenjun and Zhang, Dehao and Zhang, Ebony and Pu, Frederick and Zhu, Han and Liu, Jiawei and Bayer, Jonas and Michel, Julien and Yu, Longhui and Dreyfus-Schmidt, Léo and Tunstall, Lewis and Pagani, Luigi and Machado, Moreira and Bourigault, Pauline and Wang, Ran and Polu, Stanislas and Barroyer, Thibaut and Li, Wen-Ding and Niu, Yazhe and Fleureau, Yann and Hu, Yangyang and Yu, Zhouliang and Wang, Zihan and Yang, Zhilin and Liu, Zhengying and Li, Jia},
year = {2025},
url = {http://arxiv.org/abs/2504.11354},
}
参考文献
[1]Kimi 团队等。"Kimi k1.5:用 LLMs 扩展强化学习。" arXiv 预印本 arXiv:2501.12599 (2025)。
[2]Qwen 等。"Qwen2.5 技术报告" arXiv 预印本 arXiv:2412.15115 (2024)。
[3]Yang, An 等。"Qwen3 技术报告。" arXiv 预印本 arXiv:2505.09388 (2025)。
[4]https://deepmind.google/discover/blog/ai-solves-imo-problems-at-silver-medal-level/
[5]Ren, Z. Z., et al. "Deepseek-prover-v2: 通过强化学习进行子目标分解,推进形式化数学推理。" arXiv 预印本 arXiv:2504.21801 (2025)。
[6]Cao, Chenrui 等。"在推理模型时代复兴 DSP 以进行高级定理证明。" arXiv 预印本 arXiv:2506.11487 (2025)。
[7]Wang, Haiming 等。"Kimina-prover 预览:通过强化学习迈向大型形式化推理模型。" arXiv 预印本 arXiv:2504.11354 (2025)。
[8]https://assets.anthropic.com/m/785e231869ea8b3b/original/claude-3-7-sonnet-system-card.pdf
[9]Li, Jia 等。"Numinamath:AI4Maths 中最大的公共数据集,包含 86 万对竞赛数学问题和解决方案。" Hugging Face 存储库 13 (2024): 9。
[10]Liu, Junqi 等。"CombiBench:评估 LLM 在组合数学方面的能力。" arXiv 预印本 arXiv:2505.03171 (2025)。