Formalizing Discrete Probability in Lean 4: The One-Time Pad#
FM Crypto Meeting#
Overview#
- Goal: Formalize basic discrete probability in Lean 4
- Case Study: One-Time Pad (OTP) and Perfect Secrecy
- Key Concepts:
- Probability Mass Functions (PMFs)
- Independence and joint distributions
- Conditional probability
- Bijections preserving uniform distributions
The One-Time Pad#
Informal Definition#
- Message space: \(M = \{0,1\}^n\)
- Key space: \(K = \{0,1\}^n\)
- Ciphertext space: \(C = \{0,1\}^n\)
- Encryption: \(\text{Enc}(m, k) = m \oplus k\)
- Decryption: \(\text{Dec}(c, k) = c \oplus k\)
Key Property#
\[\text{Dec}(\text{Enc}(m, k), k) = (m \oplus k) \oplus k = m\]
Lean 4 Formalization: Types#
import Mathlib.Data.Vector.Basic
def Plaintext (n : Nat) := List.Vector Bool n
def Key (n : Nat) := List.Vector Bool n
def Ciphertext (n : Nat) := List.Vector Bool n
-- Element-wise XOR
def vec_xor {n : Nat} (v₁ v₂ : List.Vector Bool n) :=
map₂ xor v₁ v₂
def encrypt {n : Nat} (m : Plaintext n) (k : Key n) : Ciphertext n :=
vec_xor m k
Correctness of Encryption/Decryption#
lemma encrypt_decrypt {n : Nat} (m : Plaintext n) (k : Key n) :
decrypt (encrypt m k) k = m := by
unfold encrypt decrypt vec_xor
apply ext -- vector extensionality
intro i -- prove element-wise
simp only [get_map₂]
-- Goal: xor (xor (get m i) (get k i)) (get k i) = get m i
simp -- Uses xor properties automatically
Key insight: Reduce vector equality to element-wise boolean equality
Probability Mass Functions in Lean 4#
Mathlib's PMF type#
import Mathlib.Probability.ProbabilityMassFunction.Constructions
-- PMF α represents discrete probability distributions over α
-- (μ : PMF α) assigns probability μ a to element a : α
Uniform distribution over keys#
noncomputable def μK {n : ℕ} : PMF (Key n) :=
uniformOfFintype (Key n)
-- For any key k: μK k = 1 / 2^n
Independence and Joint Distributions#
Independent product of PMFs#
noncomputable def μMK {n : ℕ} (μM : PMF (Plaintext n)) :
PMF (Plaintext n × Key n) :=
PMF.bind μM (fun m => PMF.map (fun k => (m, k)) μK)
-- P(M = m, K = k) = P(M = m) · P(K = k)
Ciphertext distribution#
noncomputable def μC {n : Nat} (μM : PMF (Plaintext n)) :
PMF (Ciphertext n) :=
PMF.bind (μMK μM) (fun mk =>
PMF.pure (encrypt mk.1 mk.2))
Key Lemma 1: XOR is a Bijection#
def xorEquiv {n : ℕ} (m : Plaintext n) : Key n ≃ Ciphertext n where
toFun := encrypt m -- k ↦ m ⊕ k
invFun := vec_xor m -- c ↦ m ⊕ c
left_inv := by -- m ⊕ (m ⊕ k) = k
intro k
apply ext
simp [encrypt, vec_xor, get_map₂, xor_aab_eq_b]
right_inv := by -- (m ⊕ c) ⊕ m = c
intro c
apply ext
simp [encrypt, vec_xor, get_map₂, xor_aab_eq_b]
For fixed \(m\), the map \(k \mapsto m \oplus k\) is a bijection!
Key Lemma 2: Bijections Preserve Uniform Distributions#
lemma map_uniformOfFintype_equiv
{α β : Type*} [Fintype α] [Fintype β] [DecidableEq β]
[Nonempty α] [Nonempty β] (e : α ≃ β) :
PMF.map e (uniformOfFintype α) = uniformOfFintype β
Intuition#
- If we have a uniform distribution on \(\alpha\)
- And apply a bijection \(e : \alpha \to \beta\)
- We get a uniform distribution on \(\beta\)
- Crucial: bijections preserve cardinality!
Perfect Secrecy: Statement#
Informal Version#
For all messages \(m\) and ciphertexts \(c\):
\[P(M = m \; | \; C = c) = P(M = m)\]
Lean 4 Version#
theorem perfect_secrecy {n : Nat} (μM : PMF (Plaintext n))
(m₀ : Plaintext n) (c₀ : Ciphertext n) :
(μC_M m₀) c₀ * μM m₀ / (μC μM) c₀ = μM m₀
Where:
μC_M m₀
is the conditional distribution \(P(C \; | \; M = m_0)\)μC μM
is the marginal distribution \(P(C)\)
Proof Strategy#
-
Show conditional distribution is uniform:
\[P(C = c \; | \; M = m) = 2^{-n}\] -
Show marginal distribution is uniform:
\[P(C = c) = 2^{-n}\] -
Apply Bayes' theorem:
\[\begin{align*} P(M = m | C = c) =& \frac{P(C = c \; | \; M = m) · P(M = m)}{P(C = c)}\\[8pt] &= \frac{2^{-n} · P(M = m)}{2^{-n}}\\[8pt] &= P(M = m) \end{align*}\]
Step 1: Conditional Distribution is Uniform#
lemma C_given_M_eq_inv_card_key {n : ℕ}
(m : Plaintext n) (c : Ciphertext n) :
(μC_M m) c = 1 / card (Key n) := by
-- μC_M m = map (encrypt m) μK
-- encrypt m is a bijection (xorEquiv m)
-- So map (encrypt m) μK = uniformOfFintype (Ciphertext n)
have hμ : μC_M m = uniformOfFintype (Ciphertext n) := by
apply map_uniformOfFintype_equiv (xorEquiv m)
simpa [hμ, uniformOfFintype_apply]
using card_congr (xorEquiv m)
Step 2: Marginal Distribution is Uniform#
lemma prob_C_uniform_ennreal {n : Nat} (μM : PMF (Plaintext n))
(c : Ciphertext n) :
(μC μM) c = (card (Key n) : ENNReal)⁻¹
Key insight:#
- For each \(m\), there's exactly one \(k\) such that \(m \oplus k = c\)
- Namely, \(k = m \oplus c\)
- So we can rewrite the sum over \((m,k)\) pairs as a sum over \(m\) alone
Working with ENNReal#
Challenge: Division in probability#
- PMFs use
NNReal
(non-negative reals) - Division requires
ENNReal
(extended non-negative reals) - Need to handle \(0\) and \(\infty\) carefully
Key properties used:#
-- For x ≠ 0 and x ≠ ∞:
x * y / x = y
-- Uniform distribution has probability 1/|S|
(uniformOfFintype S) s = (card S)⁻¹
Lessons Learned#
1. Type Classes Matter#
Fintype
for finite typesNonempty
to avoid division by zeroDecidableEq
for computability
2. Bijections are Powerful#
- XOR with fixed value is a bijection
- Bijections preserve uniform distributions
- Can transform complex sums using bijections
3. PMF Library is Well-Designed#
bind
for dependent distributionsmap
for transforming distributions- Uniform distributions built-in
Challenges in Formalization#
1. Vector Equality#
- Must use extensionality (
ext
) - Reduce to element-wise proofs
simp
is very helpful withget_map₂
2. Infinite Sums#
tsum
requires careful manipulation- Product types need
tsum_prod'
- Conditional sums use
tsum_ite_eq
3. Real Number Arithmetic#
- Coercion between
Nat
,NNReal
,ENNReal
- Division requires non-zero denominators
- Must track when values are finite
Demo: Interactive Proof Development#
Let's see how Lean 4's tactics work in practice:
example {n : Nat} (m : Plaintext n) (k : Key n) :
encrypt m k = encrypt m k := by
-- Lean's proof state shows current goal
rfl -- reflexivity
example {n : Nat} (m : Plaintext n) (k₁ k₂ : Key n)
(h : encrypt m k₁ = encrypt m k₂) : k₁ = k₂ := by
-- Use key uniqueness
have h_unique := key_uniqueness m k₁ (encrypt m k₂)
rw [h_unique.mp h]
-- Alternative: use xorEquiv directly
Future Directions#
Extensions#
-
Other Cryptographic Constructions
- Stream ciphers
- Block ciphers (with appropriate modes)
- Public key encryption
-
Advanced Probability
- Computational indistinguishability
- Negligible functions
- Probabilistic polynomial time
-
Security Proofs
- Semantic security
- CPA/CCA security
- Reduction proofs
Takeaways#
- Lean 4 + Mathlib provides excellent support for probability
- Type-driven development helps catch errors early
- Bijections are a key tool in cryptographic proofs
- Perfect secrecy is elegantly expressible and provable
Resources#
- Slides https://formalmethods.io/crypto
- Mathlib docs https://leanprover-community.github.io/mathlib4_docs/
- Mathlib source https://github.com/leanprover-community/mathlib4
Questions?#
Quick Reference#
PMF α
- Probability mass function over typeα
uniformOfFintype
- Uniform distributionList.Vector
- Fixed-length vectorsENNReal
- Extended non-negative reals≃
- Type equivalence (bijection)
Thank you!