The Donkey on the Edge THE PAPER · APPENDIX C May MMXXVI
THE PAPER · APPENDIX C

Appendix C – The Entropy-Replacement Proof (Haar)

C.1–C.5 off-diagonal F_off; C.6 diagonal-to-bulk F_diag by the centered-operator reduction; C.7 assembly.

Companion Pages

This appendix proves Theorem 1 (§3.5) for the Haar bulk class: the von Neumann entropy of the actual observer-reduced state may be replaced by the Shannon entropy of its diagonal, with an error subleading to the two-observer signal by a full power of dd. All constants below are dimension-independent; we write D=d2dMD = d^2 d_M for the effective dimension, ρ(0,1)\rho \in (0,1) for the non-isometry ratio, and X{A,B}X \in \{A,B\} for the two observers.

Throughout, ρRX\rho_{R_X} is the observer-XX reduced state of the HUZ-included, VV-mapped state (§3.1). Two diagonals enter, and must be kept distinct:

DX:=diag(ρRX) (actual, V-dependent),PX:=diag(ρXbulk),  (PA)a=bcψabc2,  (PB)b=acψabc2,D_X := \mathrm{diag}(\rho_{R_X})\ \text{(actual, $V$-dependent)}, \qquad P_X := \mathrm{diag}(\rho_X^{\rm bulk}),\ \ (P_A)_a = \sum_{bc}|\psi_{abc}|^2,\ \ (P_B)_b=\sum_{ac}|\psi_{abc}|^2,

with PXP_X depending only on ψ|\psi\rangle. The full entropy-replacement error is taken against the bulk-marginal diagonal PXP_X (the object the scaling laws of §§4–5 use), and splits into an off-diagonal and a diagonal-to-bulk part:

FAB:=[S(ρRA)S(ρRB)][H(PA)H(PB)]  =  Foff+Fdiag,F_{AB} := \bigl[S(\rho_{R_A}) - S(\rho_{R_B})\bigr] - \bigl[H(P_A) - H(P_B)\bigr] \;=\; F_{\rm off} + F_{\rm diag}, Foff:=[SASB][H(DA)H(DB)],Fdiag:=[H(DA)H(DB)][H(PA)H(PB)].F_{\rm off} := \bigl[S_A - S_B\bigr] - \bigl[H(D_A) - H(D_B)\bigr], \qquad F_{\rm diag} := \bigl[H(D_A) - H(D_B)\bigr] - \bigl[H(P_A) - H(P_B)\bigr].

The off-diagonal perturbation is EX:=ρRXDXE_X := \rho_{R_X} - D_X. Sections C.1–C.5 bound FoffF_{\rm off} (the resolvent machinery); §C.6 adds the short diagonal-to-bulk bound (Lemma C.6) and assembles the full statement. We prove:

Theorem C.1 (entropy replacement, Haar class). In the joint Haar measure on bulk and VV, with dA=dB=dd_A = d_B = d and dMd_M, ρ\rho fixed, the replacement against the bulk-marginal diagonal obeys

E[FAB2]=O ⁣(d4dM2),henceE[SASB][H(PA)H(PB)]=O ⁣(d2dM1)=o ⁣(d3/2dM1),\mathbb{E}\bigl[F_{AB}^2\bigr] = O\!\bigl(d^{-4} d_M^{-2}\bigr), \qquad\text{hence}\qquad \mathbb{E}\,\bigl|[S_A-S_B]-[H(P_A)-H(P_B)]\bigr| = O\!\bigl(d^{-2} d_M^{-1}\bigr) = o\!\bigl(d^{-3/2} d_M^{-1}\bigr),

with both FoffF_{\rm off} and FdiagF_{\rm diag} of order O(d4dM2)O(d^{-4}d_M^{-2}) in L2L^2.

Since the Haar two-observer signal has variance Θ(d3dM2)\Theta(d^{-3}d_M^{-2}) (Theorem 2), the replacement-error variance is suppressed by one power of dd, and Theorem 2 is unconditional.

The proof has four ingredients: an exact resolvent representation of FABF_{AB} (§C.1); a linear bound reducing to a bulk-marginal moment (§C.2–C.3); a nonlinear bound controlled by a fourth moment of EXE_X (§C.4); the fourth-moment estimate itself (§C.5); and a diagonal-to-bulk bound for FdiagF_{\rm diag} (Lemma C.6). The argument is assembled in §C.7.


C.1 An exact antisymmetric resolvent representation

Write the von Neumann entropy through the integral representation S(σ)=Tr(σlogσ)S(\sigma) = -\mathrm{Tr}(\sigma\log\sigma) and the resolvent identity logσ=0 ⁣[(1+t)1(σ+t)1]dt\log\sigma = \int_0^\infty\!\bigl[(1+t)^{-1} - (\sigma+t)^{-1}\bigr]dt. Applied to the antisymmetric combination S(ρRA)S(ρRB)S(\rho_{R_A}) - S(\rho_{R_B}) and to the diagonal pair H(DA)H(DB)H(D_A) - H(D_B), and subtracting, the constant and single-resolvent terms cancel between the two observers, leaving an exact representation of the difference FoffF_{\rm off}. After the substitution t=s/dt = s/d (Jacobian ds/dds/d),

Lemma C.1 (resolvent representation).

Foff=1d0Ysds,Ys=tRtABt=s/d,F_{\rm off} = \frac1d\int_0^\infty Y_s\,ds, \qquad Y_s = t\,R_t^{AB}\big|_{t=s/d}, RtAB=Tr ⁣[(DA+t)1EA(ρRA+t)1](A ⁣ ⁣B).R_t^{AB} = \mathrm{Tr}\!\bigl[(D_A+t)^{-1} E_A (\rho_{R_A}+t)^{-1}\bigr] - (A\!\to\!B).

This is exact – all orders in the perturbation EXE_X – and is the antisymmetric analogue of the single-observer resolvent identity of Engelhardt–Gesteau–Harlow. The integrand is concentrated near the eigenvalue scale t1/dt \sim 1/d (equivalently sρ1s \sim \rho^{-1}); integrability at both ends follows from EXF2\|E_X\|_F \le 2 and (+t)1op1/t\|(\,\cdot\,+t)^{-1}\|_{\rm op} \le 1/t.

Numerical check. The representation reproduces the direct entropy difference with correlation 1.00001.0000 and unit slope across d{4,5,6}d \in \{4,5,6\}, with the integrand peaking at s=td0.5s = td \approx 0.5.

We split the integrand by one application of the resolvent identity (ρRX+t)1=(DX+t)1(DX+t)1EX(ρRX+t)1(\rho_{R_X}+t)^{-1} = (D_X+t)^{-1} - (D_X+t)^{-1}E_X(\rho_{R_X}+t)^{-1}, giving Ys=Yslin+YsnlY_s = Y_s^{\rm lin} + Y_s^{\rm nl} with

Yslin=t(Tr[(DA+t)2EA](A ⁣ ⁣B)),Ysnl=t(Tr[GAEAGAEAG~A](A ⁣ ⁣B)),Y_s^{\rm lin} = t\Bigl(\mathrm{Tr}[(D_A+t)^{-2}E_A] - (A\!\to\!B)\Bigr), \qquad Y_s^{\rm nl} = -t\Bigl(\mathrm{Tr}[G_A E_A G_A E_A \widetilde G_A] - (A\!\to\!B)\Bigr),

where GX:=(DX+t)1G_X := (D_X+t)^{-1} and G~X:=(ρRX+t)1\widetilde G_X := (\rho_{R_X}+t)^{-1}. The two pieces are bounded in §C.2–C.3 and §C.4 respectively.


C.2 The linear bound

The linear integrand is a weighted version of the bulk-marginal object of §5. Writing the diagonal entries pXa=1/d+δaXp_X^a = 1/d + \delta_a^X and centering, define the weighted operator HX(t)=a(pXa+t)2Q~X(a)H_X(t) = \sum_a (p_X^a+t)^{-2}\,\widetilde Q_X^{(a)}, where Q~X(a)\widetilde Q_X^{(a)} is the centered single-block operator of §C.3; then Tr[(DA+t)2EA](A ⁣ ⁣B)=ρ1Tr[(PVρI)(HA(t)HB(t))]\mathrm{Tr}[(D_A+t)^{-2}E_A]-(A\!\to\!B) = \rho^{-1}\mathrm{Tr}[(P_V-\rho I)(H_A(t)-H_B(t))].

Lemma C.2 (linear bound). EψTr ⁣[(HA(t)HB(t))2]Cd2(1+s)6dM\displaystyle \mathbb{E}_\psi\,\mathrm{Tr}\!\bigl[(H_A(t)-H_B(t))^2\bigr] \le \frac{C\,d^2}{(1+s)^6 d_M}, CC absolute, hence E[Foff2]lin=O(d4dM2)\mathbb{E}[F_{\rm off}^2]_{\rm lin} = O(d^{-4}d_M^{-2}).

Proof. The weight wXa=(pXa+t)2w_X^a = (p_X^a+t)^{-2} satisfies, by the mean value theorem on the event {pXac0/d}\{p_X^a \ge c_0/d\} (whose complement is exponentially rare), wXau2K(s)δaX|w_X^a - u^{-2}| \le K(s)\,|\delta_a^X| with u=(1+s)/du = (1+s)/d and K(s)=2c03d3(1+s)3K(s) = \tfrac{2}{c_0^3}d^3(1+s)^{-3}. Thus HX(t)H_X(t) has the form of the §C.3 object GX=aδaXQ~X(a)G_X = \sum_a \delta_a^X \widetilde Q_X^{(a)} with weights bounded by K(s)δaXK(s)|\delta_a^X|. The base lemma’s proof (§C.3) uses only the pointwise bound HAHBF24maxXTr(HX2)\|H_A-H_B\|_F^2 \le 4\max_X\mathrm{Tr}(H_X^2) and the exact single-block trace formula, both valid for any weight; pulling out K(s)2K(s)^2 and using the single-observer base bound EψTr(GX2)C1/(d5dM)\mathbb{E}_\psi\mathrm{Tr}(G_X^2)\le C_1/(d^5 d_M) gives EψTr[(HAHB)2]4K(s)2C1/(d5dM)=16C1c06d(1+s)6dM1Cd2(1+s)6dM1\mathbb{E}_\psi\mathrm{Tr}[(H_A-H_B)^2]\le 4K(s)^2 C_1/(d^5 d_M) = \tfrac{16 C_1}{c_0^6}\,d(1+s)^{-6}d_M^{-1}\le C d^2(1+s)^{-6}d_M^{-1}. Then, with Yslin=tρ1Tr[(PVρI)(HAHB)]Y_s^{\rm lin} = t\rho^{-1}\mathrm{Tr}[(P_V-\rho I)(H_A-H_B)] and EVTr[(PVρI)M]2=κ(MF2D1TrM2)\mathbb{E}_V|\mathrm{Tr}[(P_V-\rho I)M]|^2 = \kappa(\|M\|_F^2 - D^{-1}|\mathrm{Tr}M|^2) (Lemma C.3 below), Minkowski in L2(ψ)L^2(\psi) and 0s(1+s)3ds=12\int_0^\infty s(1+s)^{-3}ds = \tfrac12 give E[Foff2]lin=O(1/(d4dM2))\mathbb{E}[F_{\rm off}^2]_{\rm lin} = O(1/(d^4 d_M^2)). \square

Numerical check. The constant C=EψTr[(HAHB)2](1+s)6dM/d2C = \mathbb{E}_\psi\mathrm{Tr}[(H_A-H_B)^2]\cdot (1+s)^6 d_M/d^2 measures 1.44,1.33,1.21,1.09,1.011.44, 1.33, 1.21, 1.09, 1.01 at d=4,,8d = 4,\dots,8 (dM=2d_M = 2), decreasing toward the leading value; the dMd_M-scaling is exactly 1/dM1/d_M.


C.3 The base moment lemma

The single-block object is GX=aδaXQ~X(a)G_X = \sum_a \delta_a^X \widetilde Q_X^{(a)}, where Q~X(a)=QX(a)pXaQ\widetilde Q_X^{(a)} = Q_X^{(a)} - p_X^a Q centers the rank-one block operator QX(a)=bϕabϕabQ_X^{(a)} = \sum_b |\phi_{ab}\rangle\langle\phi_{ab}| against the total Q=abϕabϕabQ = \sum_{ab}|\phi_{ab}\rangle\langle\phi_{ab}|. With pab=ϕab2p_{ab} = \||\phi_{ab}\rangle\|^2 the block masses (a flat Dirichlet vector on d2d^2 categories of concentration dMd_M), the relevant block weights are WX(a)=bpab2W_X^{(a)} = \sum_b p_{ab}^2 and Wblock=abpab2W_{\rm block} = \sum_{ab}p_{ab}^2.

Lemma (base moment bound). EψTr ⁣[(GAGB)2]C0d4dM\displaystyle \mathbb{E}_\psi\,\mathrm{Tr}\!\bigl[(G_A-G_B)^2\bigr] \le \frac{C_0}{d^4 d_M}, with C0C_0 a dimension-independent constant.

Proof. Pointwise GAGBF22GAF2+2GBF24maxXTr(GX2)\|G_A-G_B\|_F^2 \le 2\|G_A\|_F^2 + 2\|G_B\|_F^2 \le 4\max_X\mathrm{Tr}(G_X^2) (the cross term cancellation is not needed). The single-observer trace has the exact closed form

Tr(GX2)=a(δaX)2WX(a)2SWXδX2+WblockδX4,\mathrm{Tr}(G_X^2) = \sum_a (\delta_a^X)^2 W_X^{(a)} - 2 S_W^X\,\|\delta_X\|^2 + W_{\rm block}\,\|\delta_X\|^4,

with SWX=aδaXWX(a)S_W^X = \sum_a \delta_a^X W_X^{(a)}. The dominant term is the first; by Cauchy–Schwarz, Ea(δaX)2WX(a)=dE[(δ1X)2WX(1)]dE[(δ1X)4]E[(WX(1))2]\mathbb{E}\sum_a(\delta_a^X)^2 W_X^{(a)} = d\,\mathbb{E}[(\delta_1^X)^2 W_X^{(1)}] \le d\sqrt{\mathbb{E}[(\delta_1^X)^4]\,\mathbb{E}[(W_X^{(1)})^2]}. The fourth central moment of the marginal δ1X\delta_1^X (a centered Beta(ddM,d(d1)dM)\mathrm{Beta}(dd_M, d(d-1)d_M) variable) is, in closed form, E[(δ1X)4]=(3+γ2)μ22\mathbb{E}[(\delta_1^X)^4] = (3+\gamma_2)\mu_2^2 with

γ2=6[(d2)2(D+1)(d1)(D+2)](d1)(D+2)(D+3)12ddM,μ2=Var(δ1X),\gamma_2 = \frac{6[(d-2)^2(D+1)-(d-1)(D+2)]}{(d-1)(D+2)(D+3)} \le \frac{12}{d\,d_M}, \qquad \mu_2 = \mathrm{Var}(\delta_1^X),

so E[(δ1X)4]7μ22\mathbb{E}[(\delta_1^X)^4]\le 7\mu_2^2 for d3d\ge3; and E[(WX(1))2]\mathbb{E}[(W_X^{(1)})^2] is the exact Dirichlet moment (d(dM)4+d(d1)[(dM)2]2)/(D)4\bigl(d(d_M)_4 + d(d-1)[(d_M)_2]^2\bigr)/(D)_4. Substituting the scalings μ2=Θ(d3dM1)\mu_2 = \Theta(d^{-3}d_M^{-1}) and E[(WX(1))2]=Θ(d6dM2)\mathbb{E}[(W_X^{(1)})^2] = \Theta(d^{-6}d_M^{-2}) gives the dominant term Θ(d5dM1)\Theta(d^{-5}d_M^{-1}) and the stated bound after multiplying by 44. The remaining terms are smaller by 1/d1/d. \square

Numerical check. EψTr[(GAGB)2]d4dM=0.40,0.38,0.35,0.30,0.25\mathbb{E}_\psi\mathrm{Tr}[(G_A-G_B)^2]\cdot d^4 d_M = 0.40, 0.38, 0.35, 0.30, 0.25 at d=4,5,6,8,10d = 4,5,6,8,10; the closed-form γ2\gamma_2 matches Monte Carlo to three digits.


C.4 The nonlinear bound

Lemma (nonlinear bound). E[Foff2]nonlin=O ⁣(d4dM2)\displaystyle \mathbb{E}[F_{\rm off}^2]_{\rm nonlin} = O\!\bigl(d^{-4}d_M^{-2}\bigr), subleading to the signal by 1/d1/d.

Proof. By Schatten–Hölder applied to YsnlY_s^{\rm nl},

Tr[GXEXGXEXG~X]GXop2G~XopEXF2,\bigl|\mathrm{Tr}[G_X E_X G_X E_X \widetilde G_X]\bigr| \le \|G_X\|_{\rm op}^2\,\|\widetilde G_X\|_{\rm op}\,\|E_X\|_F^2,

with G~Xop1/t\|\widetilde G_X\|_{\rm op}\le 1/t deterministically (ρRX0\rho_{R_X}\succeq 0) and GXopmin(Cd,1/t)\|G_X\|_{\rm op}\le \min(Cd, 1/t) on the good event. Hence, with t=s/dt = s/d, Ysnlsdmin(Cd,d/s)2ds(EAF2+EBF2)|Y_s^{\rm nl}| \le \tfrac{s}{d}\min(Cd, d/s)^2\tfrac{d}{s}(\|E_A\|_F^2 + \|E_B\|_F^2). Minkowski in L2L^2 and the kernel integral 0sdmin(Cd,d/s)2dsds=2Cd2\int_0^\infty \tfrac{s}{d}\min(Cd,d/s)^2\tfrac{d}{s}\,ds = 2Cd^2 give E[Foff2]nonlin1d2Cd22EEXF4\sqrt{\mathbb{E}[F_{\rm off}^2]_{\rm nonlin}} \le \tfrac1d\cdot 2Cd^2\cdot 2\sqrt{\mathbb{E}\|E_X\|_F^4}. The fourth moment EEXF4=O(d6dM2)\mathbb{E}\|E_X\|_F^4 = O(d^{-6}d_M^{-2}) is supplied by §C.5, whence E[Foff2]nonlin=O(d4dM2)\mathbb{E}[F_{\rm off}^2]_{\rm nonlin} = O(d^{-4}d_M^{-2}). The bad event contributes poly(d)ecddM\mathrm{poly}(d)\cdot e^{-cdd_M}, negligible. \square

Numerical check. The actual nonlinear contribution is E[F2]nonlin/signal=0.011,0.005,0.003\mathbb{E}[F^2]_{\rm nonlin}/\text{signal} = 0.011, 0.005, 0.003 at d=4,5,6d=4,5,6, well inside the certified bound.


C.5 The fourth-moment estimate

The nonlinear bound rests on a fourth moment of the off-diagonal perturbation EXE_X. This follows from a single projector estimate plus a grouped-Dirichlet moment bound.

Lemma C.3 (fourth-moment projector estimate). Let PVP_V be a Haar-random rank-rr projector on CD\mathbb{C}^D, r=ρDr=\rho D, Π=PVρI\Pi = P_V - \rho I. For any trace-zero AA on CD\mathbb{C}^D (Hermitian or not),

EVTr(ΠA)4C4AF4D2,\mathbb{E}_V\bigl|\mathrm{Tr}(\Pi A)\bigr|^4 \le \frac{C_4\,\|A\|_F^4}{D^2},

with a dimension-independent absolute constant C4C_4.

Proof. Since Tr(A)=0\mathrm{Tr}(A)=0, Tr(ΠA)=Tr(PVA)\mathrm{Tr}(\Pi A) = \mathrm{Tr}(P_V A), which has mean zero. Split A=A1+iA2A = A_1 + iA_2 into Hermitian trace-zero parts, so that Tr(ΠA)48[(TrΠA1)4+(TrΠA2)4]|\mathrm{Tr}(\Pi A)|^4 \le 8[(\mathrm{Tr}\Pi A_1)^4 + (\mathrm{Tr}\Pi A_2)^4]; it suffices to bound each real piece. Writing PV=UΠrUP_V = U\Pi_r U^\dagger with UU Haar on U(D)U(D) and gk(U)=Tr(UΠrUAk)g_k(U) = \mathrm{Tr}(U\Pi_r U^\dagger A_k), the map UUΠrUU\mapsto U\Pi_r U^\dagger is 22-Lipschitz in the Frobenius metric, so gkg_k is mean-zero and 2AkF2\|A_k\|_F-Lipschitz. By the concentration inequality for Lipschitz functions on U(D)U(D) (Meckes, The Random Matrix Theory of the Classical Compact Groups, 2019), gkg_k is sub-Gaussian with variance proxy σk2=c0(2AkF)2/D\sigma_k^2 = c_0\,(2\|A_k\|_F)^2/D, whence Egk43σk448c02AkF4/D2\mathbb{E}|g_k|^4 \le 3\sigma_k^4 \le 48 c_0^2 \|A_k\|_F^4/D^2. Summing gives the claim with C4=384c02C_4 = 384 c_0^2. \square

Remark. The same estimate also follows from the fourth Weingarten moment of a Haar rank-rr projector; the concentration proof is used only for the O(D2)O(D^{-2}) scaling, not the sharp constant. Numerically and by the leading Gaussian pairing, C4D2/AF43κ2D23/16C_4 D^2/\|A\|_F^4 \to 3\kappa^2 D^2 \to 3/16 (the variable Tr(ΠA)\mathrm{Tr}(\Pi A) is asymptotically Gaussian with variance κAF2\kappa\|A\|_F^2, κ=r(Dr)D(D21)\kappa = \tfrac{r(D-r)}{D(D^2-1)}); measured 0.180.18, flat across D=16,32,64D = 16,32,64, with kurtosis 3\to 3.

The entries of EXE_X are exactly first-order in Π\Pi: (EX)aa=ρ1Tr(ΠMaa)/(1+η)(E_X)_{aa'} = \rho^{-1}\mathrm{Tr}(\Pi M_{a'a})/(1+\eta), where MaaM_{a'a} are trace-zero operators (the off-diagonal blocks NaaN_{a'a} and the centered diagonal Q~X(a)\widetilde Q_X^{(a)}) and η=ρ1Tr(ΠQ)\eta = \rho^{-1}\mathrm{Tr}(\Pi Q) is a normalization fluctuation.

Proposition C.4 (fourth moment of the perturbation). EEXF4Cd6dM2\displaystyle \mathbb{E}\|E_X\|_F^4 \le \frac{C}{d^6 d_M^2}, CC absolute.

Proof. On the good event G={1+η12}\mathcal{G} = \{|1+\eta|\ge\tfrac12\} (whose complement has probability ecddM\le e^{-cdd_M} and contributes EXop4Pr(Gc)16ecddM\le\|E_X\|_{\rm op}^4\Pr(\mathcal{G}^c)\le 16 e^{-cdd_M}, since EXE_X is a difference of density matrices so EXop2\|E_X\|_{\rm op}\le 2), EXF416ρ4(a,aTr(ΠMaa)2)2\|E_X\|_F^4 \le 16\rho^{-4}\bigl(\sum_{a,a'}|\mathrm{Tr}(\Pi M_{a'a})|^2\bigr)^2. Expanding and taking EV\mathbb{E}_V, Cauchy–Schwarz and Lemma C.3 give EV[TrΠMaa2TrΠMcc2]C4D2MaaF2MccF2\mathbb{E}_V[|\mathrm{Tr}\Pi M_{a'a}|^2|\mathrm{Tr}\Pi M_{c'c}|^2]\le C_4 D^{-2}\|M_{a'a}\|_F^2\|M_{c'c}\|_F^2, so EVEXF416ρ4C4D2(a,aMaaF2)2\mathbb{E}_V\|E_X\|_F^4 \le 16\rho^{-4}C_4 D^{-2}\bigl(\sum_{a,a'}\|M_{a'a}\|_F^2\bigr)^2. The bracket bound (Lemma C.5) gives Eψ()2=O(d2)\mathbb{E}_\psi(\cdot)^2 = O(d^{-2}), and with D2=d4dM2D^2 = d^4 d_M^2 the claim follows. \square

Lemma C.5 (bracket bound). Eψ[(a,aMaaF2)2]CBd2\displaystyle \mathbb{E}_\psi\Bigl[\bigl(\textstyle\sum_{a,a'}\|M_{a'a}\|_F^2\bigr)^2\Bigr] \le \frac{C_B}{d^2}, CBC_B absolute.

Proof. The bracket equals B=b[(pBb)2WB(b)]+aTr((Q~X(a))2)B = \sum_b[(p_B^b)^2 - W_B^{(b)}] + \sum_a\mathrm{Tr}((\widetilde Q_X^{(a)})^2). The off-diagonal part has Eψ=Θ(1/d)\mathbb{E}_\psi = \Theta(1/d) (exact Dirichlet second moment) and the diagonal part is O(1/d2)O(1/d^2), so Eψ[B]=Θ(1/d)\mathbb{E}_\psi[B] = \Theta(1/d). BB is a degree-2 polynomial in the Dirichlet masses; the grouped-Dirichlet fourth moments (as in §C.3) give Varψ(B)=O(1/d4)\mathrm{Var}_\psi(B) = O(1/d^4), so Eψ[B2]=(EψB)2+Varψ(B)=O(1/d2)\mathbb{E}_\psi[B^2] = (\mathbb{E}_\psi B)^2 + \mathrm{Var}_\psi(B) = O(1/d^2). \square

Numerical check. E[B]d=0.981.03\mathbb{E}[B]\,d = 0.98\to1.03 and E[B2]d2=0.961.06\mathbb{E}[B^2]\,d^2 = 0.96\to1.06 over d=4d=488 (both flat, confirming BB concentrates), and EEXF4d6dM2=18.0,11.9,9.65\mathbb{E}\|E_X\|_F^4\cdot d^6 d_M^2 = 18.0, 11.9, 9.65 at d=4,5,6d=4,5,6 (bounded, decreasing).


C.6 The diagonal-to-bulk bound

It remains to bound Fdiag=[H(DA)H(DB)][H(PA)H(PB)]F_{\rm diag} = [H(D_A)-H(D_B)] - [H(P_A)-H(P_B)], the error from replacing the actual reduced diagonal DX=diag(ρRX)D_X=\mathrm{diag}(\rho_{R_X}) by the bulk-marginal diagonal PXP_X. Write ηaX:=(ρRX)aapXa\eta_a^X := (\rho_{R_X})_{aa} - p_X^a; since DXD_X and PXP_X are both trace-one, aηaX=0\sum_a \eta_a^X = 0. The argument reduces FdiagF_{\rm diag} to the same base moment that controls FoffF_{\rm off}, through the centered block operators of §C.3.

Representation. With QX(a)=bϕabϕabQ_X^{(a)} = \sum_b |\phi_{ab}\rangle\langle\phi_{ab}| the block operator and P=VVP = V^\dagger V, the unnormalized diagonal is (ρRXunnorm)aa=Tr[PQX(a)](\rho_{R_X}^{\rm unnorm})_{aa} = \mathrm{Tr}[P\,Q_X^{(a)}], with Haar mean EV=ρTrQX(a)=ρpXa\mathbb{E}_V = \rho\,\mathrm{Tr}\,Q_X^{(a)} = \rho\,p_X^a. Dividing by Z=Ψ2Z = \|\Psi\|^2 (with EVZ=ρ\mathbb{E}_V Z = \rho) and using ρpXa=Tr[ρIQX(a)]\rho\,p_X^a = \mathrm{Tr}[\rho I\,Q_X^{(a)}],

ηaX  =  1ρTr ⁣[(PρI)QX(a)]  +  raX,raX=Zρρ(ρRX)aa,\eta_a^X \;=\; \frac{1}{\rho}\,\mathrm{Tr}\!\bigl[(P-\rho I)\,Q_X^{(a)}\bigr] \;+\; r_a^X, \qquad r_a^X = -\,\frac{Z-\rho}{\rho}\,(\rho_{R_X})_{aa},

the normalization remainder raXr_a^X being O(d2)(ρRX)aaO(d^{-2})\cdot(\rho_{R_X})_{aa} by Lemma 2. The leading-order entropy term of §C.6 (the Taylor expansion of HH about PXP_X, using H/pa=logpa1\partial H/\partial p_a = -\log p_a - 1 and aηaX=0\sum_a \eta_a^X = 0, which removes the constant and the 1-1) is the linear functional

LX  =  daδaXηaX,Fdiag  =  (LALB)  +  (RARB),L_X \;=\; -\,d\sum_a \delta_a^X\,\eta_a^X, \qquad F_{\rm diag} \;=\; (L_A - L_B) \;+\; (R_A - R_B),

with RX=O ⁣(a(ηaX)2/pXa)R_X = O\!\bigl(\sum_a (\eta_a^X)^2/p_X^a\bigr) the quadratic Taylor remainder.

Reduction to the base moment. Substituting the representation,

LALB  =  dρTr ⁣[(PρI)a(δaAQA(a)δaBQB(a))]  +  (norm remainder).L_A - L_B \;=\; -\frac{d}{\rho}\,\mathrm{Tr}\!\Bigl[(P-\rho I)\,\textstyle\sum_a\bigl(\delta_a^A Q_A^{(a)} - \delta_a^B Q_B^{(a)}\bigr)\Bigr] \;+\; (\text{norm remainder}).

Replacing each QX(a)Q_X^{(a)} by its centered form Q~X(a)=QX(a)pXaQ\widetilde Q_X^{(a)} = Q_X^{(a)} - p_X^a Q changes the bracket by (aδaXpXa)Q=δX2Q\bigl(\sum_a \delta_a^X p_X^a\bigr)Q = \|\delta_X\|^2\,Q (since aδaXpXa=a(pXa)21/d=δX2\sum_a \delta_a^X p_X^a = \sum_a (p_X^a)^2 - 1/d = \|\delta_X\|^2), and Tr[(PρI)Q]=Zρ\mathrm{Tr}[(P-\rho I)Q] = Z-\rho is the norm fluctuation; this difference is absorbed into the remainder. Hence, with GX=aδaXQ~X(a)G_X = \sum_a \delta_a^X \widetilde Q_X^{(a)} the operator of §C.3,

LALB  =  dρTr ⁣[(PρI)(GAGB)]  +  (remainder).L_A - L_B \;=\; -\frac{d}{\rho}\,\mathrm{Tr}\!\bigl[(P-\rho I)(G_A - G_B)\bigr] \;+\; (\text{remainder}).

For a uniformly random rank-ρD\rho D projector and any fixed Hermitian MM, the projector variance gives VarV(Tr[PM])ρ(1ρ)Tr(M2)/(D1)\mathrm{Var}_V(\mathrm{Tr}[P M]) \le \rho(1-\rho)\, \mathrm{Tr}(M^2)/(D-1). Applied with M=GAGBM = G_A - G_B (fixed by ψ\psi),

EV ⁣[(LALB)2ψ]    d2ρ2VarV ⁣(Tr[P(GAGB)])    Cd2DTr ⁣[(GAGB)2],C=1ρρ.\mathbb{E}_V\!\bigl[(L_A - L_B)^2 \,\big|\, \psi\bigr] \;\le\; \frac{d^2}{\rho^2}\,\mathrm{Var}_V\!\bigl(\mathrm{Tr}[P(G_A-G_B)]\bigr) \;\le\; C\,\frac{d^2}{D}\,\mathrm{Tr}\!\bigl[(G_A - G_B)^2\bigr], \qquad C = \tfrac{1-\rho}{\rho}.

Taking Eψ\mathbb{E}_\psi and invoking the base moment bound of §C.3, EψTr[(GAGB)2]C0/(d4dM)\mathbb{E}_\psi\,\mathrm{Tr}[(G_A-G_B)^2] \le C_0/(d^4 d_M), and D=d2dMD = d^2 d_M,

E[(LALB)2]    Cd2DC0d4dM  =  CC0d2d2dM1d4dM  =  O ⁣(d4dM2).\mathbb{E}\bigl[(L_A - L_B)^2\bigr] \;\le\; C\,\frac{d^2}{D}\cdot\frac{C_0}{d^4 d_M} \;=\; C\,C_0\,\frac{d^2}{d^2 d_M}\cdot\frac{1}{d^4 d_M} \;=\; O\!\bigl(d^{-4} d_M^{-2}\bigr).

Remainders. The normalization remainder (Zρ\propto Z-\rho, of relative size O(d2)O(d^{-2}) by Lemma 2) and the quadratic Taylor remainder RARBR_A - R_B (a sum of (ηaX)2/pXa(\eta_a^X)^2/p_X^a terms, controlled by the same Frobenius/fourth-moment estimates of §§C.4–C.5 applied to the diagonal entries) each contribute E[(RARB)2]=O(d4dM2)\mathbb{E}[(R_A - R_B)^2] = O(d^{-4} d_M^{-2}) or smaller. Collecting:

Lemma C.6 (diagonal-to-bulk replacement). In the joint measure,

E[Fdiag2]  =  O ⁣(d4dM2),\mathbb{E}\bigl[F_{\rm diag}^2\bigr] \;=\; O\!\bigl(d^{-4} d_M^{-2}\bigr),

the same order as FoffF_{\rm off}, hence one power of dd below the signal variance Θ(d3dM2)\Theta(d^{-3}d_M^{-2}). \square

The reduction is exact: FdiagF_{\rm diag} and FoffF_{\rm off} are bounded by the same quantity EψTr[(GAGB)2]\mathbb{E}_\psi\,\mathrm{Tr}[(G_A-G_B)^2] through the same rank-projector variance estimate, so no separate constant is introduced. (The independent Monte-Carlo value E[Fdiag2]d4dM2=0.49,0.36,0.34\mathbb{E}[F_{\rm diag}^2]\cdot d^4 d_M^2 = 0.49, 0.36, 0.34 at d=4,5,6d=4,5,6 is consistent, but the bound above does not rely on it.)

C.7 Assembly

The off-diagonal error FoffF_{\rm off} is controlled by combining the linear bound (Lemma C.2: E[Foff2]lin=O(d4dM2)\mathbb{E}[F_{\rm off}^2]_{\rm lin} = O(d^{-4}d_M^{-2})) with the nonlinear bound (§C.4, via Proposition C.4: E[Foff2]nonlin=O(d4dM2)\mathbb{E}[F_{\rm off}^2]_{\rm nonlin} = O(d^{-4}d_M^{-2})) and Minkowski’s inequality, giving E[Foff2]=O(d4dM2)\mathbb{E}[F_{\rm off}^2] = O(d^{-4}d_M^{-2}). Adding the diagonal-to-bulk bound (Lemma C.6: E[Fdiag2]=O(d4dM2)\mathbb{E}[F_{\rm diag}^2] = O(d^{-4}d_M^{-2})) through FAB=Foff+FdiagF_{AB} = F_{\rm off} + F_{\rm diag} and Minkowski once more,

E[FAB2]=O ⁣(d4dM2)=o ⁣(d3dM2),\mathbb{E}\bigl[F_{AB}^2\bigr] = O\!\bigl(d^{-4}d_M^{-2}\bigr) = o\!\bigl(d^{-3}d_M^{-2}\bigr),

which is one power of dd below the signal variance Θ(d3dM2)\Theta(d^{-3}d_M^{-2}). By Cauchy–Schwarz, EFABE[FAB2]=O(d2dM1)=o(d3/2dM1)\mathbb{E}|F_{AB}| \le \sqrt{\mathbb{E}[F_{AB}^2]} = O(d^{-2}d_M^{-1}) = o(d^{-3/2}d_M^{-1}). This proves Theorem C.1, hence Theorem 1 for the Haar class, and makes Theorem 2 unconditional. \blacksquare

End-to-end numerical check. Directly, E[FAB2]/signal=0.146,0.094,0.067\mathbb{E}[F_{AB}^2]/\text{signal} = 0.146, 0.094, 0.067 and EFAB/(d3/2dM1)=0.280,0.233,0.196\mathbb{E}|F_{AB}|/(d^{-3/2}d_M^{-1}) = 0.280, 0.233, 0.196 at d=4,5,6d=4,5,6 – both decreasing, consistent with the proved O(1/d)O(1/d) and O(1/d)O(1/\sqrt d) suppression.

Remark (product class). The representation of §C.1 and the nonlinear bound of §C.4 hold verbatim for any bulk class. The Haar-specific inputs are the bulk-marginal moments of §C.3 and §C.5, which use the near-maximally-mixed Dirichlet structure of the Haar marginal. For the product class the marginal is rank-one, and the small-mass régime of its diagonal is not controlled by the present argument; establishing the analogue of Lemma C.5 there would make Theorem 3 unconditional as well. This is the sense in which Theorem 1 remains conjectural for the product class.