Improving the lower bound for the unit distance problem

May 21, 2026 · 15 min · 3000 words · nor

Introduction

Recently, an OpenAI internal model disproved the Erdős unit distance conjecture. The original model-generated writeup is Planar Point Sets with Many Unit Distances, and the human-verified version is at Remarks on the disproof of the unit distance conjecture and arXiv:2605.20695.

The conjecture was as follows:

Does every set of nn distinct points in 2\mathbb R^2 contain at most n1+O(1/loglogn)n^{1+O(1/\log\log n)} many pairs which are distance 11 apart?

For background, see Erdős Problem #90. As that page notes, the original source is Erdős' 1946 paper On sets of distances of nn points.

The following was proved by the OpenAI model:

There exists ϵ>0\epsilon > 0 such that the following holds. There exists a sequence of point sets 𝒫i\mathcal P_i in 2\mathbb R^2 such that |𝒫i||\mathcal P_i| \to \infty and the number of unit distances in 𝒫i\mathcal P_i is at least |𝒫i|1+ϵ|\mathcal P_i|^{1+\epsilon} for all ii.

Sawin (2026), in An explicit lower bound for the unit distance problem, refined the argument and arrived at a much more practical lower bound on the constant. His theorem gives, for nn arbitrarily large, a set of points U2U \subseteq \mathbb R^2 with |U|=n|U|=n and at least n1.014114/Cn^{1.014114}/C unit-distance pairs, for an absolute constant CC.

The best known upper bound so far is O(n4/3)O(n^{4/3}), due to Spencer, Szemerédi, and Trotter. Thus, the best possible value of ϵ\epsilon in a lower bound of the form n1+ϵn^{1+\epsilon} is at most 1/31/3. Sawin's bound ϵ=0.014114\epsilon=0.014114\ldots is therefore about a factor of 2424 away from the upper bound.

Sawin notes that his parameters are likely far from optimized, and explicitly suggests searching for better values of the parameters T,S,k,RT,S_{\mathbb Q},k,R.

We use GPT 5.5 Pro to improve the lower bound. The claim here is that we can improve the exponent to 1.031721.03172.

For nn arbitrarily large, there exists a set of points U2U \subset \mathbb R^2 such that |U|=n|U| = n and |{(v1,v2)U2:|v1v2|=1}|n1.03172C,|\{ (v_1,v_2) \in U^2 : |v_1-v_2|=1\}| \geq \frac{n^{1.03172}}{C}, where C=32q421,q odd primeq. C=32\prod_{q\leq 421,\ q \textrm{ odd prime}} q.

With ϵ=0.03172\epsilon=0.03172, the gap between this exponent gain and the upper-bound ceiling 1/31/3 is about a factor of 10.510.5.

Important caveats

This claim is tentative and subject to further verification. I've mostly checked the mechanical parts of the argument, but it's possible that there are subtleties in Sawin's paper that might break the following argument, especially since I do not have formal training in algebraic number theory.

UPD: The proof seems to be sound, upon independent verification.

Proof

The proof of the claim has little novelty, if any, and is largely the same as Sawin (2026). I will only discuss what changes. Before that, some discussion of what the original proof does is in order.

Setup

They prove a general criterion which turns suitable number fields into lower bounds for the unit distance problem. In the notation of his paper, if one has Galois CM fields K/FK/F of arbitrarily large degree with relative root discriminant λ\lambda, together with a finite set SS_{\mathbb Q} of rational primes satisfying the required splitting and inertia conditions, then one obtains an exponent gain

δ=log(11/R)+12log(2π/e)+pS12e(p)f(p)log(k(p)+1)14logλ12loglogλlog(2RpSpk(p)/(2e(p))+1). \delta = \frac{ \log(1-1/R)+\frac12\log(2\pi/e) +\sum_{p\in S_{\mathbb Q}} \frac{1}{2e(p)f(p)}\log(k(p)+1) -\frac14\log \lambda -\frac12\log\log\lambda }{ \log\left(2R\prod_{p\in S_{\mathbb Q}}p^{k(p)/(2e(p))}+1\right) }.

Here e(p)e(p) is the ramification index of pp in FF, f(p)f(p) is an upper bound for the inertia degree of pp in FF, k(p)k(p) is a positive integer weight, and R>1R>1 is a real parameter. The resulting unit-distance lower bound is |{(v1,v2)U2:|v1v2|=1}|(|U|)1+δ8λ2|\{(v_1,v_2)\in U^2: |v_1-v_2|=1\}| \geq \frac{(|U|)^{1+\delta}}{8\lambda^2} for arbitrarily large |U||U|.

In Sawin's construction, the fields come from a Golod--Shafarevich tower associated to a quadratic field

Q=(P),P=qTq, Q=\mathbb Q(\sqrt P), \qquad P=\prod_{q\in T}q,

where TT is a finite set of odd primes with an odd number of elements congruent to 33 modulo 44.

The local-condition budget in Sawin's field-existence lemma is |T|+|S|+|{pS:p split in Q}|+1(|T|1)24.|T|+|S_{\mathbb Q}| +|\{p\in S_{\mathbb Q}:p\textrm{ split in }Q\}| +1 \leq \frac{(|T|-1)^2}{4}. The extra split-prime term is important: if pp splits in QQ, there are two primes of QQ above pp, and both cost a relation in the Golod--Shafarevich count.

The new parameters

Let TT be the set of the first 8181 odd primes:

T={3,5,7,,421}. T=\{3,5,7,\ldots,421\}.

We pick

P=qTq,Q=(P). P=\prod_{q\in T}q, \qquad Q=\mathbb Q(\sqrt P).

There are 4141 primes qTq\in T with q3(mod4)q\equiv 3\pmod 4, hence P3(mod4)P\equiv 3\pmod 4, and the discriminant of QQ is 4P4P.

Define

S={p4643:p prime}{4643<p16063:p prime and inert in Q}. S_{\mathbb Q} = \{p\leq 4643:p\textrm{ prime}\} \cup \{4643<p\leq 16063:p\textrm{ prime and inert in }Q\}.

Equivalently, for p2Pp\nmid 2P, the second condition is (Pp)=1\left(\frac{P}{p}\right)=-1.

For pSp\in S_{\mathbb Q}, define k(p)k(p) as

{22,p=2,13,p=3,9,p=5,7,p=7,6,p=11,5,13p17,4,19p31,3,37p89,2,97p593,1,599p16063.\begin{cases} 22,&p=2,\\\\ 13,&p=3,\\\\ 9,&p=5,\\\\ 7,&p=7,\\\\ 6,&p=11,\\\\ 5,&13\leq p\leq 17,\\\\ 4,&19\leq p\leq 31,\\\\ 3,&37\leq p\leq 89,\\\\ 2,&97\leq p\leq 593,\\\\ 1,&599\leq p\leq 16063. \end{cases}

Finally take R=32.5237. R=32.5237.

For these fields, the ramification index used in the specialized formula is $e(p) = $

{2,p=2 or pT,1,otherwise,\begin{cases} 2,&p=2\textrm{ or }p\in T,\\\\ 1,&\textrm{otherwise}, \end{cases}

and the construction gives inertia degree at most 22, so we take f(p)=2f(p)=2 for all pSp\in S_{\mathbb Q}.

The finite computation

The finite computation gives

|S|=1264.|S_{\mathbb Q}|=1264.

Among these primes, the splitting behavior in Q/Q/\mathbb Q is 8282 ramified, 928928 intert and 254254 split.

The 8282 ramified primes are 22 and the 8181 primes in TT.

Thus the Golod--Shafarevich budget is exactly saturated:

|T|+|S|+|{pS:p split in Q}|+1=81+1264+254+1=1600,|T|+|S_{\mathbb Q}| +|\{p\in S_{\mathbb Q}:p\textrm{ split in }Q\}| +1 =81+1264+254+1=1600,

while

(|T|1)24=8024=1600. \frac{(|T|-1)^2}{4}=\frac{80^2}{4}=1600.

The local splitting condition also has to be checked. Every pSp\in S_{\mathbb Q} must either satisfy

p1(mod4), p\equiv 1\pmod 4,

or be inert in (q)\mathbb Q(\sqrt q) for at least one qTq\in T. Equivalently, if we set Δq\Delta_q to

{q,q1(mod4),4q,q3(mod4),\begin{cases} q,&q\equiv 1\pmod 4,\\\\ 4q,&q\equiv 3\pmod 4, \end{cases}

then for every pSp\in S_{\mathbb Q} not congruent to 1(mod4)1\pmod 4, one must have

(Δqp)=1 \left(\frac{\Delta_q}{p}\right)=-1

for some qTq\in T. The verification script we provide below checks this directly.

The exponent

With the above parameters, Sawin's expression specializes to

δ=ND, \delta=\frac{N}{D},

where the numerator and denominator are, respectively,

N=log(11/R)+12log(2π/e)+_pS_14e(p)log(k(p)+1)18log(4P)12loglog4P\begin{aligned} N={}&\log(1-1/R)+\frac12\log(2\pi/e) +\sum\_{p\in S\_{\mathbb Q}}\frac{1}{4e(p)}\log(k(p)+1)\\\\ &-\frac18\log(4P)-\frac12\log\log\sqrt{4P} \end{aligned}

and

D=log(2RpSpk(p)/(2e(p))+1). D= \log\left(2R\prod_{p\in S_{\mathbb Q}}p^{k(p)/(2e(p))}+1\right).

The computation gives

N=168.748528745514809679312047915 N=168.748528745514809679312047915\ldots

and

D=5319.58546756942831932697442321. D=5319.58546756942831932697442321\ldots.

Therefore

δ=ND=0.0317221200362850258500342889102>0.03172. \delta=\frac ND =0.0317221200362850258500342889102\ldots>0.03172.

Finally, in this construction the relative root discriminant is

λ=4P. \lambda=\sqrt{4P}.

Hence

8λ2=8(4P)=32P=32q421,q odd primeq. 8\lambda^2=8(4P)=32P =32\prod_{q\leq 421,\ q\textrm{ odd prime}}q.

This gives the constant claimed above.

Note that this is not the best constant possible, but in the asymptotic limit, this does not matter.

Verification script

Script
from __future__ import annotations

import math
from decimal import Decimal, getcontext

import mpmath as mp
import sympy as sp

try:
    from sympy.functions.combinatorial.numbers import kronecker_symbol
except:
    def jacobi_symbol_int(a: int, n: int) -> int:
        """Jacobi symbol (a/n), for n positive odd."""
        if n <= 0 or n % 2 == 0:
            raise ValueError("n must be a positive odd integer")

        a %= n
        result = 1

        while a:
            while a % 2 == 0:
                a //= 2
                r = n % 8
                if r in (3, 5):
                    result = -result

            a, n = n, a

            if a % 4 == 3 and n % 4 == 3:
                result = -result

            a %= n

        return result if n == 1 else 0


    def kronecker_symbol(a: int, n: int) -> int:
        """Kronecker symbol (a/n), for arbitrary integers a,n."""
        a = int(a)
        n = int(n)

        if n == 0:
            return 1 if abs(a) == 1 else 0

        result = 1

        # Unit factor -1 in the denominator.
        if n < 0:
            n = -n
            if a < 0:
                result = -result

        # Power-of-2 part of the denominator.
        while n % 2 == 0:
            n //= 2

            if a % 2 == 0:
                return 0

            r = a % 8
            result *= 1 if r in (1, 7) else -1

        # Remaining denominator is positive odd.
        if n == 1:
            return result

        return result * jacobi_symbol_int(a, n)

mp.mp.dps = 100
getcontext().prec = 80


def first_n_odd_primes(n: int) -> list[int]:
    out: list[int] = []
    p = 3
    while len(out) < n:
        if sp.isprime(p):
            out.append(p)
        p += 2
    return out


def fundamental_discriminant_prime_q(q: int) -> int:
    # For q an odd prime, the quadratic field Q(sqrt(q)) has discriminant
    # q if q == 1 mod 4, and 4q if q == 3 mod 4.
    return q if q % 4 == 1 else 4 * q


def k_value(p: int) -> int:
    if p == 2:
        return 22
    if p == 3:
        return 13
    if p == 5:
        return 9
    if p == 7:
        return 7
    if p == 11:
        return 6
    if 13 <= p <= 17:
        return 5
    if 19 <= p <= 31:
        return 4
    if 37 <= p <= 89:
        return 3
    if 97 <= p <= 593:
        return 2
    if p >= 599:
        return 1
    raise ValueError(f"Unhandled prime {p}")


T = first_n_odd_primes(81)
assert T[-1] == 421
assert sum(1 for q in T if q % 4 == 3) == 41
P = math.prod(T)
C = 32 * P

S_small = list(sp.primerange(2, 4644))
S_large_inert = [p for p in sp.primerange(4644, 16064) if kronecker_symbol(P, p) == -1]
S = S_small + S_large_inert
assert len(S_small) == 627
assert len(S_large_inert) == 637
assert len(S) == 1264
assert len(set(S)) == len(S)
assert S == sorted(S)

split_count = sum(1 for p in S if p not in T and p != 2 and kronecker_symbol(P, p) == 1)
assert split_count == 254
assert 81 + len(S) + split_count + 1 == (81 - 1) ** 2 // 4 == 1600

# Local splitting hypothesis of Lemma field-existence.
bad_local: list[int] = []
for p in S:
    if p % 4 == 1:
        continue
    if any(kronecker_symbol(fundamental_discriminant_prime_q(q), p) == -1 for q in T):
        continue
    bad_local.append(p)
assert not bad_local, bad_local[:10]

e = {p: (2 if p == 2 or p in T else 1) for p in S}
k = {p: k_value(p) for p in S}
R = mp.mpf(325237) / mp.mpf(10000)

num = (
    mp.log(1 - 1 / R)
    + mp.log(2 * mp.pi / mp.e) / 2
    + mp.fsum(mp.log(k[p] + 1) / (4 * e[p]) for p in S)
    - mp.log(4 * P) / 8
    - mp.log(mp.log(mp.sqrt(4 * P))) / 2
)
# Use log(A+1) = log(A) + log(1+1/A), with log(A) computed as a sum.
logA = mp.log(2 * R) + mp.fsum(k[p] * mp.log(p) / (2 * e[p]) for p in S)
den = logA + mp.log(1 + mp.e ** (-logA))
delta = num / den

assert num > mp.mpf("168.748")
assert den < mp.mpf("5319.586")
assert delta > mp.mpf("0.03172")
assert Decimal("168.748") / Decimal("5319.586") > Decimal("0.03172")

# Check the upper-bound numerical claim.
c = mp.mpf("4.116")
upper_sum = mp.mpf("0")
positive = []
for p in list(sp.primerange(2, 1000)):
    vals = [c * mp.log(kk + 1) - kk * mp.log(p) for kk in range(50)]
    m = max(vals)
    kk = vals.index(m)
    if m > 0:
        upper_sum += m
        positive.append((p, kk, m))
upper_bound_expr = (
    -c * mp.log(2) / 2
    - mp.log(2)
    + c * mp.log(1 - 1 / (c + 1))
    - mp.log(c + 1)
    + upper_sum / 2
)
assert upper_bound_expr < 0

print("T_size", len(T))
print("T_last", T[-1])
print("T_3mod4", sum(1 for q in T if q % 4 == 3))
print("S_size", len(S))
print("S_split_count", split_count)
print("GS_left", 81 + len(S) + split_count + 1)
print("C", C)
print("numerator", mp.nstr(num, 30))
print("denominator", mp.nstr(den, 30))
print("delta", mp.nstr(delta, 30))
print("upper_bound_expr", mp.nstr(upper_bound_expr, 30))
print("positive upper-bound maxima", [(p, kk) for p, kk, _ in positive])

The verification script checks the parameter arithmetic, the local splitting condition, the Golod--Shafarevich budget, and the numerical value of δ\delta. Its output is:

Output
T_size 81
T_last 421
T_3mod4 41
S_size 1264
S_split_count 254
GS_left 1600
C 47082029846439902373300552676857151217413642304918530623715869353452183912844092721301455161128887627577037327100579567727757921375947966068356869517751730533224975774230240
numerator 168.748528745514809679312047915
denominator 5319.58546756942831932697442321
delta 0.0317221200362850258500342889102
upper_bound_expr -0.00105208369181224469571344885647
positive upper-bound maxima [(2, 5), (3, 3), (5, 2), (7, 1), (11, 1), (13, 1), (17, 1)]

The most important checks are:

  • |T|=81|T|=81 and maxT=421\max T=421,
  • |S|=1264|S_{\mathbb Q}|=1264,
  • exactly 254254 primes in SS_{\mathbb Q} split in QQ,
  • the Golod--Shafarevich budget is exactly 16001600 on both sides,
  • every pSp\in S_{\mathbb Q} satisfies the required local splitting condition,
  • N>168.748N>168.748,
  • D<5319.586D<5319.586,
  • N/D>0.03172N/D>0.03172.

References

  • Noga Alon, Thomas F. Bloom, W. T. Gowers, Daniel Litt, Will Sawin, Arul Shankar, Jacob Tsimerman, Victor Wang, and Melanie Matchett Wood. Remarks on the disproof of the unit distance conjecture. 2026. PDF, arXiv:2605.20695.

  • Paul Erdős. On sets of distances of nn points. American Mathematical Monthly 53 (1946), 248--250. PDF.

  • Paul Erdős. Some of my favourite problems which recently have been solved. Proceedings of the International Mathematical Conference, Singapore 1981, North-Holland Math. Stud. 74, North-Holland, 1982, pp. 59--79.

  • Paul Erdős. Some problems in number theory, combinatorics and combinatorial geometry. Mathematica Pannonica 5(2), 1994, 261--269.

  • OpenAI. Planar Point Sets with Many Unit Distances. 2026. PDF.

  • Will Sawin. An explicit lower bound for the unit distance problem. 2026. arXiv:2605.20579.

  • Joel Spencer, Endre Szemerédi, and William T. Trotter, Jr. Unit distances in the Euclidean plane. In Graph Theory and Combinatorics (Cambridge, 1983), Academic Press, London, 1984, pp. 293--303. PDF.

  • Thomas F. Bloom. Erdős Problem #90. erdosproblems.com/90. Accessed 2026-05-21.

Acknowledgements

The original proof by Will Sawin (and before that, by the internal OpenAI model) did most of the heavy lifting.

GPT 5.5 Pro came up with the construction and proof (with mild steering from me) as well as most of the technical parts of this writeup.

Thanks to AcerFur for going through the above post prior to publication!

After the above post was finalized, I was also informed about claims of slightly better bounds: e.g., here.

Cite this post
@online{improving-unit-distance-lower-bounds,
  author    = {nor},
  title     = {Improving the lower bound for the unit distance problem},
  year      = {2026},
  month     = {05},
  day       = {21},
  url       = {https://nor-blog.pages.dev/posts/2026-05-21-improving-unit-distance-lower-bounds/},
}