当前位置: 首页 > news >正文

文献分享: Muvera多向量到单向量的转化方法(Part3——引理证明的补充)

原文章

由于 CSDN \text{CSDN} CSDN字数严格的限制,很多详细过程腰斩了,完整版见标题的超链接

1. \textbf{1. } 1. 事实 A.2 \textbf{A.2} A.2的证明

1.0. \textbf{1.0. } 1.0. 定理的内容

👉前提:设随机矩阵 S = ( s i j ) ∈ R d proj × d \mathbf{S}{=}(s_{ij}){∈}\mathbb{R}^{d_\text{proj}{×}d} S=(sij)Rdproj×d

  1. 每元素 s i j s_{ij} sij独立且同分布于 D \mathcal{D} D,其中 D \mathcal{D} D关于竖轴对称且 ∀ s i j ∼ D ∀s_{ij}{∼}\mathcal{D} sijD E [ s i j 2 ] = 1 \mathbb{E}[s_{ij}^2]{=}1 E[sij2]=1
  2. 对任意固定向量 u ∈ R d × 1 u{∈}\mathbb{R}^{d{×}1} uRd×1(即每个分量 u j u_{j} uj不随机),定义 ψ ( u ) = 1 t ( S u ) ψ{(u)}{=}\cfrac{1}{\sqrt{t}}(\mathbf{S}u) ψ(u)=t 1(Su)

👉结论 1 1 1 ∥ ψ ( u ) ∥ 2 \left\|ψ(u)\right\|^2 ψ(u)2 ∥ u ∥ 2 \|u\|^2 u2逼近

  1. 负半边:令 E [ s i j 4 ] = B < ∞ \mathbb{E}[s_{ij}^4]{=}B{<}\infty E[sij4]=B<,对 ∀ ε > 0 ∀{ε}{>}0 ε>0 Pr ⁡ [ ∥ u ′ ∥ 2 ≤ ( 1 – ε ) ∥ u ∥ 2 ] ≤ e – ( ε 2 – ε 3 ) t 2 ( B + 1 ) \Pr\left[\left\|u^{\prime}\right\|^2{≤}(1–ε)\|u\|^2\right]{≤}e^{–\frac{\left(ε^2–ε^3\right)t}{2(B{+}1)}} Pr[u2(1–ε)u2]e2(B+1)(ε2ε3)t
  2. 正半边:若 ∃ L > 0 \text{∃}L{>}0 L>0 ∀ m > 0 ∀m{>}0 m>0满足矩 E [ s 2 m ] ≤ ( 2 m ) ! 2 m m ! L 2 m \mathbb{E}\left[s^{2m}\right]{≤}\cfrac{(2m)!}{2^mm!}L^{2m} E[s2m]2mm!(2m)!L2m,对 ∀ ε > 0 ∀{ε}{>}0 ε>0 Pr ⁡ [ ∥ u ′ ∥ 2 ≥ ( 1 + ε ) L 2 ∥ u ∥ 2 ] ≤ ( ( 1 + ε ) e – ε ) t 2 ≤ e – ( ε 2 – ε 3 ) t 4 \Pr\left[\left\|u^{\prime}\right\|^2{≥}(1{+}ε)L^2\|u\|^2\right]{≤}\left((1{+}ε)e^{–ε}\right)^{\frac{t}{2}}{≤}e^{–\frac{\left(ε^2–ε^3\right)t}{4}} Pr[u2(1+ε)L2u2]((1+ε)eε)2te4(ε2ε3)t

1.1. \textbf{1.1. } 1.1. 矩分析

➡️设 X i = ∑ j = 1 d s i j u j = S i ⋅ u \displaystyle{}X_i{=}\sum_{j{=}1}^ds_{ij}u_{j}{=}\mathbf{S}_{i\cdot}u Xi=j=1dsijuj=Siu,则 ψ i ( u ) = 1 d proj ( S i ⋅ u ) = 1 d proj X i ψ_i(u){=}\cfrac{1}{\sqrt{d_\text{proj}}}\left(\mathbf{S}_{i\cdot}u\right){=}\cfrac{1}{\sqrt{d_\text{proj}}}X_i ψi(u)=dproj 1(Siu)=dproj 1Xi

➡️分析 X i X_i Xi 1/2/4 \text{1/2/4} 1/2/4阶矩

  1. 一阶矩: E [ X i ] = E [ ∑ j = 1 d s i j u j ] ⇒ 由于每个 u j 固定 E [ X i ] = ∑ j = 1 d u j E [ s i j ] ⇒ s i j 的分布关于竖轴对称 E [ X i ] = 0 \displaystyle\mathbb{E}\left[X_i\right]{=}\mathbb{E}\left[\sum_{j{=}1}^ds_{ij}u_{j}\right]\xRightarrow{由于每个u_{j}固定}\mathbb{E}\left[X_i\right]{=}\sum_{j{=}1}^du_{j}\mathbb{E}\left[s_{ij}\right]\xRightarrow{s_{ij}的分布关于竖轴对称}\mathbb{E}\left[X_i\right]{=}0 E[Xi]=E[j=1dsijuj]由于每个uj固定 E[Xi]=j=1dujE[sij]sij的分布关于竖轴对称 E[Xi]=0
  2. 二阶矩: E [ X i 2 ] = E [ ( ∑ j = 1 d s i j u j ) 2 ] = E [ ∑ j = 1 d ( s i j u j ) 2 ] + E [ 2 ∑ j 1 < j 2 ( s i j 1 u j 1 ) ( s i j 2 u j 2 ) ] = 1 \displaystyle\mathbb{E}\left[X_i^2\right]{=}\mathbb{E}\left[\left(\sum_{j{=}1}^ds_{ij}u_j\right)^2\right]{=}\mathbb{E}\left[\sum_{j{=}1}^d\left(s_{ij}u_j\right)^2\right]{+}\mathbb{E}\left[2\sum_{j_1{<}j_2}\left(s_{ij_1}u_{j_1}\right)\left(s_{ij_2}u_{j_2}\right)\right]{=}1 E[Xi2]=E (j=1dsijuj)2 =E[j=1d(sijuj)2]+E[2j1<j2(sij1uj1)(sij2uj2)]=1
  3. 四阶矩: E [ X i 4 ] = ∑ j 1 , 2 , 3 , 4 E [ s i j 1 s i j 2 s i j 3 s i j 4 ] \displaystyle\mathbb{E}\left[X_i^4\right]{=}\sum_{j_{1,2,3,4}}\mathbb{E}\left[s_{ij_1}s_{ij_2}s_{ij_3}s_{ij_4}\right] E[Xi4]=j1,2,3,4E[sij1sij2sij3sij4]
    • s i j s_{ij} sij相互独立且关于竖轴对称,故 E [ s i j 2 n +1 ] = E [ s i j 2 n ] E [ s i j ] = 0 \mathbb{E}\left[s_{ij}^{2n\text{+1}}\right]{=}\mathbb{E}\left[s_{ij}^{2n}\right]\mathbb{E}\left[s_{ij}\right]{=}0 E[sij2n+1]=E[sij2n]E[sij]=0 E [ s i j 2 n ] ≠ 0 \mathbb{E}\left[s_{ij}^{2n}\right]≠0 E[sij2n]=0
    • E [ s i j 1 s i j 2 s i j 3 s i j 4 ] \mathbb{E}\left[s_{ij_1}s_{ij_2}s_{ij_3}s_{ij_4}\right] E[sij1sij2sij3sij4]四种情况
      情形次数 E [ s i j 1 s i j 2 s i j 3 s i j 4 ] \boldsymbol{\mathbb{E}\left[s_{ij_1}s_{ij_2}s_{ij_3}s_{ij_4}\right]} E[sij1sij2sij3sij4]
      j 1 = j 2 = j 3 = j 4 j_1=j_2=j_3=j_4 j1=j2=j3=j4 1 1 1 B B B
      j 1 = j 2 ≠ j 3 = j 4 j_1=j_2≠j_3=j_4 j1=j2=j3=j4 C 4 2 / 2 =3 {C_4^2}/{2}\text{=3} C42/2=3 1 1 1
      j 1 = j 2 ≠ j 3 ≠ j 4 j_1=j_2≠j_3≠j_4 j1=j2=j3=j4 N/A \text{N/A} N/A 0 0 0
      j 1 ≠ j 2 ≠ j 3 ≠ j 4 j_1≠j_2≠j_3≠j_4 j1=j2=j3=j4 N/A \text{N/A} N/A 0 0 0
    • E [ X i 4 ] \displaystyle\mathbb{E}\left[X_i^4\right] E[Xi4]的情况
      索引情形次数 E [ X i 4 ] \boldsymbol{\mathbb{E}\left[X_i^4\right]} E[Xi4]
      j 1 = j 2 = j 3 = j 4 j_1=j_2=j_3=j_4 j1=j2=j3=j4 1 1 1 ∑ j = 1 d u j 4 E [ s i j 4 ] = B ∑ j = 1 d u j 4 \displaystyle\sum_{j{=}1}^du_j^4\mathbb{E}\left[s_{ij}^4\right]{=}B\sum_{j{=}1}^du_j^4 j=1duj4E[sij4]=Bj=1duj4
      j 1 = j 2 ≠ j 3 = j 4 j_1=j_2≠j_3=j_4 j1=j2=j3=j4 3 3 3 ∑ j α , j β u j α 2 u j β 2 – ∑ j α = j β u j α 2 u j β 2 = 1 − ∑ j = 1 d u j 4 \displaystyle\sum_{j_{α}\text{,}j_{\beta}}u_{j_{α}}^2u_{j_{\beta}}^2–\sum_{j_{α}{=}j_{\beta}}u_{j_{α}}^2u_{j_{\beta}}^2{=}1{-}\sum_{j{=}1}^du_j^4 jα,jβujα2ujβ2jα=jβujα2ujβ2=1j=1duj4
    • 合并得 E [ X i 4 ] = ( B – 3 ) ∑ j = 1 d u j 4 + 3 \mathbb{E}\left[X_i^4\right]\displaystyle{=}(B–3)\sum_{j=1}^du_j^4{+}3 E[Xi4]=(B–3)j=1duj4+3,而 ∑ j = 1 d u j 4 ≤ ∑ j = 1 d u j 2 =1 \displaystyle\sum_{j{=}1}^du_j^4{≤}\sum_{j{=}1}^du_j^2\text{=1} j=1duj4j=1duj2=1 E [ X i 4 ] ≤ B \displaystyle\mathbb{E}\left[X_i^4\right]{≤}B E[Xi4]B

➡️分析 X i X_i Xi类二阶矩母函数 E [ e α X i 2 ] ≤ 1 1 – 2 α L 2 \mathbb{E}\left[e^{αX_i^2}\right]{≤}\cfrac{1}{\sqrt{1–2αL^{2}}} E[eαXi2]1–2αL2 1

  1. 构造变量 s i j ′ ∼ N ( 0 , L ) s_{ij}^{\prime}{∼}N(0,L) sijN(0,L),定义其线性变换 Z i = ∑ j = 1 d s i j ′ u j \displaystyle{Z}_i{=}\sum_{j{=}1}^ds_{ij}^{\prime}u_{j} Zi=j=1dsijuj
  2. 正态分布性质中有 E [ s i j ′ 2 m ] = ( 2 m – 1 ) ! ! σ 2 m = ( 2 m ) ! 2 m m ! σ 2 m \mathbb{E}\left[s_{ij}^{\prime\text{ }2m}\right]{=}(2m–1)!!σ^{2m}{=}\cfrac{(2m)!}{2^mm!}σ^{2m} E[sij 2m]=(2m–1)!!σ2m=2mm!(2m)!σ2m,结合条件 E [ s i j 2 m ] ≤ ( 2 m ) ! 2 m m ! L 2 m \mathbb{E}\left[s^{2m}_{ij}\right]{≤}\cfrac{(2m)!}{2^mm!}L^{2m} E[sij2m]2mm!(2m)!L2m E [ s i j 2 m ] ≤ E [ s i j ′ 2 m ] \mathbb{E}\left[s^{2m}_{ij}\right]{≤}\mathbb{E}\left[s_{ij}^{\prime\text{ }2m}\right] E[sij2m]E[sij 2m]
  3. E [ s i j 2 m ] ≤ E [ s i j ′ 2 m ] \mathbb{E}\left[s_{ij}^{2m}\right]{≤}\mathbb{E}\left[s_{ij}^{\prime\text{ }2m}\right] E[sij2m]E[sij 2m]可得 E [ X i 2 m ] ≤ E [ Z i 2 m ] \mathbb{E}\left[X_i^{2m}\right]{≤}\mathbb{E}\left[Z_i^{2m}\right] E[Xi2m]E[Zi2m]
    • D \mathcal{D} D N ( 0 , L ) N(0,L) N(0,L)皆关于竖轴对称,则 E [ s i j 2 m + 1 ] = E [ s i j ′ 2 m + 1 ] = 0 \mathbb{E}\left[s^{2m{+}1}_{ij}\right]{=}\mathbb{E}\left[s^{\prime{\text{ }}2m{+}1}_{ij}\right]{=}0 E[sij2m+1]=E[sij 2m+1]=0,故在 E [ X i 2 m ] \mathbb{E}\left[X_i^{2m}\right] E[Xi2m] E [ Z i 2 m ] \mathbb{E}\left[Z_i^{2m}\right] E[Zi2m]展开直接忽略奇数项(如下)
      • E [ X i 2 m ] = ∑ ∑ α = 2 m ∀ α = 2 k ( 2 m ) ! ∏ i =1 d α i ! ( ∏ j = 1 d u j α j ) ∏ j = 1 d E [ s i j α j ] \displaystyle\mathbb{E}\left[X_i^{2m}\right]{=}\sum_{\substack{\sum\alpha=2m\\∀\alpha{=}2k}}\frac{(2m)!}{\prod_{i\text{=1}}^{d}\alpha_i\text{!}}\left(\prod_{j=1}^du_j^{\alpha_j}\right)\prod_{j=1}^d\mathbb{E}\left[s_{ij}^{\alpha_j}\right] E[Xi2m]=α=2mα=2ki=1dαi!(2m)!(j=1dujαj)j=1dE[sijαj]
      • E [ Z i 2 m ] \mathbb{E}\left[Z_i^{2m}\right] E[Zi2m]将上式 s i j s_{ij} sij换成 s i j ′ s_{ij}^\prime sij即可
    • 两式 E [ s i j α j ] \mathbb{E}\left[s_{ij}^{\alpha_j}\right] E[sijαj] E [ s i j ′ α j ] \mathbb{E}\left[s_{ij}^{\prime{\text{ }}\alpha_j}\right] E[sij αj] α j \alpha_j αj为偶数,故 E [ s i j α j ] ≤ E [ s i j ′ α j ] \mathbb{E}\left[s_{ij}^{\alpha_j}\right]{≤}\mathbb{E}\left[s_{ij}^{\prime{\text{ }}\alpha_j}\right] E[sijαj]E[sij αj],故 E [ X i 2 m ] ≤ E [ Z i 2 m ] \mathbb{E}\left[X_i^{2m}\right]{≤}\mathbb{E}\left[Z_i^{2m}\right] E[Xi2m]E[Zi2m]
  4. E [ X i 2 m ] ≤ E [ Z i 2 m ] \mathbb{E}\left[X_i^{2m}\right]{≤}\mathbb{E}\left[Z_i^{2m}\right] E[Xi2m]E[Zi2m] E [ e α X i 2 ] ≤ E [ e α Z i 2 ] \mathbb{E}\left[e^{αX_i^2}\right]{≤}\mathbb{E}\left[e^{αZ_i^2}\right] E[eαXi2]E[eαZi2],有正态分布性质知当 α < 1 2 L 2 α{<}\cfrac{1}{2L^2} α<2L21 E [ e α X i 2 ] ≤ E [ e α Z i 2 ] = 1 1 – 2 α L 2 \mathbb{E}\left[e^{αX_i^2}\right]{≤}\mathbb{E}\left[e^{αZ_i^2}\right]{=}\cfrac{1}{\sqrt{1–2αL^{2}}} E[eαXi2]E[eαZi2]=1–2αL2 1

1.2. \textbf{1.2. } 1.2. 负半边的证明

➡️设 Y = ∑ i = 1 d proj X i 2 \displaystyle{Y}{=}\sum_{i{=}1}^{d_\text{proj}}X_i^2 Y=i=1dprojXi2,默认 ∥ u ∥ = 1 \|u\|{=}1 u=1

  1. E [ Y ] = E [ ∑ i = 1 d proj X i 2 ] = ∑ i = 1 d proj E [ X i 2 ] = d proj \displaystyle\mathbb{E}[Y]{=}\mathbb{E}\left[\sum_{i{=}1}^{d_\text{proj}}X_i^2\right]{=}\sum_{i{=}1}^{d_\text{proj}}\mathbb{E}\left[X_i^2\right]{=}{d_\text{proj}} E[Y]=E i=1dprojXi2 =i=1dprojE[Xi2]=dproj
  2. Y = ∑ i = 1 d proj ( ψ i ( u ) d proj ) 2 = d proj ∥ ψ ( u ) ∥ 2 \displaystyle{Y}{=}\sum_{i=1}^{d_\text{proj}}\left(ψ_i(u)\sqrt{d_\text{proj}}\right)^2{=}d_\text{proj}\left\|ψ{(u)}\right\|^2 Y=i=1dproj(ψi(u)dproj )2=dprojψ(u)2,故 Pr ⁡ [ ∥ ψ ( u ) ∥ 2 ≤ ( 1 – ε ) ∥ u ∥ 2 ] = Pr ⁡ [ Y ≤ ( 1 – ε ) d proj ] \Pr\left[\left\|ψ{(u)}\right\|^{2}{≤}(1–ε)\|u\|^{2}\right]{=}\Pr[Y{≤}(1–ε){d_\text{proj}}] Pr[ψ(u)2(1–ε)u2]=Pr[Y(1–ε)dproj]

➡️马可夫不等式 Pr ⁡ [ Y ≤ ( 1 – ε ) d proj ] = Pr ⁡ [ e – α Y ≥ e – α ( 1 – ε ) d proj ] ≤ E [ e – α Y ] e – α ( 1 – ε ) d proj \Pr[Y{≤}(1–ε){d_\text{proj}}]{=}\Pr\left[e^{–αY}{≥}e^{–α(1–ε){d_\text{proj}}}\right]{≤}\cfrac{\mathbb{E}\left[e^{–αY}\right]}{e^{–α(1–ε){d_\text{proj}}}} Pr[Y(1–ε)dproj]=Pr[eαYeα(1–ε)dproj]eα(1–ε)dprojE[eαY]

  1. 分子 E [ e – α Y ] = E [ e – α ( X 1 2 + ⋅ ⋅ ⋅ + X d proj 2 ) ] = ∏ i = 1 d proj E [ e – α X i 2 ] = E d proj [ e – α X i 2 ] \mathbb{E}\left[e^{–αY}\right]{=}\displaystyle\mathbb{E}\left[e^{–α\left(X^2_1{+}···{+}X^2_{d_\text{proj}}\right)}\right]{=}\prod_{i=1}^{d_\text{proj}}\mathbb{E}\left[e^{–αX_i^2}\right]{=}\mathbb{E}^{d_\text{proj}}\left[e^{–αX_i^2}\right] E[eαY]=E[eα(X12+⋅⋅⋅+Xdproj2)]=i=1dprojE[eαXi2]=Edproj[eαXi2]
  2. 已证 E [ X i 4 ] ≤ B + 1 \mathbb{E}\left[X_i^4\right]{≤}B{+}1 E[Xi4]B+1,故泰勒展开得 E [ e – α X i 2 ] ≤ 1 – α + α 2 2 E [ X i 4 ] ≤ 1 – α + α 2 2 ( B + 1 ) \mathbb{E}\left[e^{–αX_i^2}\right]{≤}1–α{+}\cfrac{α^2}{2}\mathbb{E}\left[X_i^4\right]{≤}1–α{+}\cfrac{α^2}{2}(B{+}1) E[eαXi2]1–α+2α2E[Xi4]1–α+2α2(B+1)
  3. 代回得 Pr ⁡ [ ∥ ψ ( u ) ∥ 2 < ( 1 – ε ) ∥ u ∥ 2 ] ≤ ( ( 1 – α + α 2 2 ( B + 1 ) ) e α ( 1 – ε ) ) d proj \Pr\left[\left\|ψ{(u)}\right\|^{2}{<}(1–ε)\|u\|^{2}\right]{≤}\left(\left(1–α{+}\cfrac{α^2}{2}(B{+}1)\right)e^{α(1–ε)}\right)^{d_\text{proj}} Pr[ψ(u)2<(1–ε)u2]((1–α+2α2(B+1))eα(1–ε))dproj
  4. 又有 1 + ( – α + α 2 2 ( B + 1 ) ) ≤ e ( – α + α 2 2 ( B + 1 ) ) 1{+}\left(–α{+}\cfrac{α^2}{2}(B{+}1)\right){≤}e^{\left(–α{+}\frac{α^2}{2}(B{+}1)\right)} 1+(α+2α2(B+1))e(α+2α2(B+1)),故令 α = ε 1 + B α{=}\cfrac{ε}{1{+}B} α=1+Bε Pr ⁡ [ ∥ ψ ( u ) ∥ 2 < ( 1 – ε ) ∥ u ∥ 2 ] ≤ e ( – ε 2 2 ( B + 1 ) ) d proj \Pr\left[\left\|ψ{(u)}\right\|^{2}{<}(1–ε)\|u\|^{2}\right]{≤}e^{\left(–\frac{ε^2}{2(B{+}1)}\right){d_\text{proj}}} Pr[ψ(u)2<(1–ε)u2]e(2(B+1)ε2)dproj
  5. 1≤ e – – ε 3 2 ( B + 1 ) \text{1≤}e^{–\frac{–ε^3}{2(B{+}1)}} 1≤e2(B+1)ε3( ε > 0 ε{>}0 ε>0),所以 Pr ⁡ [ ∥ ψ ( u ) ∥ 2 < ( 1 – ε ) ∥ u ∥ 2 ] ≤ e ( – ε 2 2 ( B + 1 ) ) t ≤ e ( – ε 2 – ε 3 2 ( B + 1 ) ) t \Pr\left[\left\|ψ{(u)}\right\|^{2}{<}(1–ε)\|u\|^{2}\right]{≤}e^{\left(–\frac{ε^2}{2(B{+}1)}\right)t}{≤}e^{\left(–\frac{ε^2–ε^3}{2(B{+}1)}\right)t} Pr[ψ(u)2<(1–ε)u2]e(2(B+1)ε2)te(2(B+1)ε2ε3)t,证毕

1.3. \textbf{1.3. } 1.3. 正半边的证明

➡️设 Y = ∑ i = 1 d proj X i 2 \displaystyle{Y}{=}\sum_{i{=}1}^{d_\text{proj}}X_i^2 Y=i=1dprojXi2,默认 ∥ u ∥ = 1 \|u\|{=}1 u=1

  1. E [ Y ] = E [ ∑ i = 1 d proj X i 2 ] = ∑ i = 1 d proj E [ X i 2 ] = d proj \displaystyle\mathbb{E}[Y]{=}\mathbb{E}\left[\sum_{i{=}1}^{d_\text{proj}}X_i^2\right]{=}\sum_{i{=}1}^{d_\text{proj}}\mathbb{E}\left[X_i^2\right]{=}{d_\text{proj}} E[Y]=E i=1dprojXi2 =i=1dprojE[Xi2]=dproj
  2. Y = ∑ i = 1 d proj ( ψ i ( u ) d proj ) 2 = d proj ∑ i = 1 d proj ψ i 2 ( u ) = d proj ∥ ψ ( u ) ∥ 2 \displaystyle{}Y{=}\sum_{i=1}^{d_\text{proj}}\left(ψ_i(u)\sqrt{d_\text{proj}}\right)^2{=}d_\text{proj}\sum_{i{=}1}^{d_\text{proj}}ψ^2_i(u){=}d_\text{proj}\left\|ψ(u)\right\|^2 Y=i=1dproj(ψi(u)dproj )2=dproji=1dprojψi2(u)=dprojψ(u)2,故 Pr ⁡ [ ∥ ψ ( u ) ∥ 2 ≥ ( 1 + ε ) L 2 ∥ u ∥ 2 ] = Pr ⁡ [ Y ≥ ( 1 + ε ) L 2 d proj ] \Pr\left[\left\|ψ{(u)}\right\|^{2}{≥}(1{+}ε)L^2\|u\|^{2}\right]{=}\Pr[Y{≥}(1{+}ε)L^2{d_\text{proj}}] Pr[ψ(u)2(1+ε)L2u2]=Pr[Y(1+ε)L2dproj]

➡️马可夫不等式 Pr ⁡ [ Y ≥ ( 1 + ε ) L 2 d proj ] = Pr ⁡ [ e α Y ≥ e α ( 1 + ε ) L 2 d proj ] ≤ E [ e α Y ] e α ( 1 + ε ) L 2 d proj \Pr[Y{≥}(1{+}ε)L^2{d_\text{proj}}]{=}\Pr\left[e^{αY}{≥}e^{α(1{+}ε)L^2{d_\text{proj}}}\right]{≤}\cfrac{\mathbb{E}\left[e^{αY}\right]}{e^{ α(1{+}ε)L^2{d_\text{proj}}}} Pr[Y(1+ε)L2dproj]=Pr[eαYeα(1+ε)L2dproj]eα(1+ε)L2dprojE[eαY]

  1. E [ e α Y ] = E [ e α ( X 1 2 + X 2 2 + ⋅ ⋅ ⋅ + X d proj 2 ) ] = E [ e α X 1 2 ⋅ ⋅ ⋅ e α X d proj 2 ] = ∏ i = 1 d proj E [ e α X i 2 ] = E d proj [ e α X i 2 ] \displaystyle\mathbb{E}\left[e^{αY}\right]{=}\mathbb{E}\left[e^{α\left(X^2_1{+}X^2_2{+}···{+}X^2_{d_\text{proj}}\right)}\right]{=}\mathbb{E}\left[e^{αX^2_1}···e^{αX^2_{d_\text{proj}}}\right]{=}\prod_{i=1}^{d_\text{proj}}\mathbb{E}\left[e^{αX_i^2}\right]{=}\mathbb{E}^{d_\text{proj}}\left[e^{αX_i^2}\right] E[eαY]=E[eα(X12+X22+⋅⋅⋅+Xdproj2)]=E[eαX12⋅⋅⋅eαXdproj2]=i=1dprojE[eαXi2]=Edproj[eαXi2]
  2. 其中 E [ e α Y ] = E d proj [ e α X i 2 ] ≤ ( 1 1 – 2 α L 2 ) d proj 2 \displaystyle\mathbb{E}\left[e^{αY}\right]{=}\mathbb{E}^{d_\text{proj}}\left[e^{αX_i^2}\right]{≤}\left(\frac{1}{1–2\alpha L^{2}}\right)^{\frac{{d_\text{proj}}}{2}} E[eαY]=Edproj[eαXi2](1–2αL21)2dproj,即 Pr ⁡ [ ∥ ψ ( u ) ∥ 2 ≥ ( 1 + ε ) L 2 ∥ u ∥ 2 ] ≤ ( e – 2 α L 2 ( 1 + ε ) 1 – 2 α L 2 ) d proj 2 \Pr\left[\left\|ψ{(u)}\right\|^{2}{≥}(1{+}ε)L^2\|u\|^{2}\right]{≤}\left(\cfrac{e^{–2\alpha L^{2}(1{+}ε)}}{1–2\alpha L^{2}}\right)^{\frac{{d_\text{proj}}}{2}} Pr[ψ(u)2(1+ε)L2u2](1–2αL2e–2αL2(1+ε))2dproj
  3. 分析 ϕ ( α , ε ) = ( e – 2 α L 2 ( 1 + ε ) 1 – 2 α L 2 ) d proj 2 ϕ(\alpha,ε){=}\left(\cfrac{e^{–2\alpha L^{2}(1{+}ε)}}{1–2\alpha L^{2}}\right)^{\frac{{d_\text{proj}}}{2}} ϕ(α,ε)=(1–2αL2e–2αL2(1+ε))2dproj
    • α = ε 2 L 2 ( 1 + ε ) α{=}\cfrac{ε}{2L^2(1{+}ε)} α=2L2(1+ε)ε,得 ϕ ( α , ε ) = ϕ ( ε ) = ( ( 1 + ε ) e – ε ) d proj 2 ϕ(\alpha,ε){=}ϕ(ε){=}\left((1{+}ε)e^{–ε}\right)^{\frac{{d_\text{proj}}}{2}} ϕ(α,ε)=ϕ(ε)=((1+ε)eε)2dproj
    • 泰勒展开 ϕ ( ε ) = ( ( 1 + ε ) e – ε ) d proj 2 = e d proj 2 ( ln ⁡ ( 1 + ε ) – ε ) ≤ e d proj 2 ( – e 2 2 + s 3 2 ) ϕ(ε){=}\left((1{+}ε) e^{–ε}\right)^{\frac{{d_\text{proj}}}{2}}=e^{\frac{{d_\text{proj}}}{2}(\ln (1+ε)–ε)} \leq e^{\frac{{d_\text{proj}}}{2}\left(–\frac{e^2}{2}+\frac{s^3}{2}\right)} ϕ(ε)=((1+ε)eε)2dproj=e2dproj(ln(1+ε)ε)e2dproj(2e2+2s3)
    • Pr ⁡ [ ∥ ψ ( u ) ∥ 2 ≥ ( 1 + ε ) L 2 ∥ u ∥ 2 ] ≤ ( ( 1 + ε ) e – ε ) d proj 2 ≤ e – ( ε 2 – ε 3 ) t 4 \Pr\left[\left\|ψ{(u)}\right\|^2{≥}(1{+}ε)L^2\|u\|^2\right]{≤}\left((1{+}ε)e^{–ε}\right)^{\frac{{d_\text{proj}}}{2}}{≤}e^{–\frac{\left(ε^2–ε^3\right)t}{4}} Pr[ψ(u)2(1+ε)L2u2]((1+ε)eε)2dproje4(ε2ε3)t,证毕

1.4. \textbf{1.4. } 1.4. 事实 A.2 \textbf{A.2} A.2的推论

➡️推论一:让 s i j ∼ U { – 1 , 1 } s_{ij}{∼}U\{–1,1\} sijU{–1,1},则 Pr ⁡ [ ∥ ψ ( u ) ∥ 2 ≤ ( 1 – ε ) ∥ u ∥ 2 ] ≤ e – ( ε 2 – ε 3 ) t 4 \Pr\left[\left\|ψ{(u)}\right\|^2{≤}(1–ε)\|u\|^2\right]{≤}e^{–\frac{\left(ε^2–ε^3\right)t}{4}} Pr[ψ(u)2(1–ε)u2]e4(ε2ε3)t Pr ⁡ [ ∥ ψ ( u ) ∥ 2 ≥ ( 1 + ε ) ∥ u ∥ 2 ] ≤ e – ( ε 2 – ε 3 ) t 4 \Pr\left[\left\|ψ{(u)}\right\|^2{≥}(1{+}ε)\|u\|^2\right]{≤}e^{–\frac{\left(ε^2–ε^3\right)t}{4}} Pr[ψ(u)2(1+ε)u2]e4(ε2ε3)t

  1. s i j ∼ U { – 1 , 1 } s_{ij}{∼}U\{–1,1\} sijU{–1,1},有 E [ s i j 2 m ] = 1 2 ( − 1 ) 2 m + 1 2 1 2 m = 1 \mathbb{E}\left[s_{ij}^{2m}\right]{=}\cfrac{1}{2}(-1)^{2m}{+}\cfrac{1}{2}1^{2m}{=}1 E[sij2m]=21(1)2m+2112m=1 E [ s i j ] = 0 \mathbb{E}[s_{ij}]{=}0 E[sij]=0(符合前提),且 E [ s i j 4 ] = B = 1 \mathbb{E}\left[s_{ij}^{4}\right]{=}B{=}1 E[sij4]=B=1
  2. 再取 L =1 L\text{=1} L=1代入套定理,推论得证

➡️推论二:对 ∀ ε δ > 0 / ∀ d ≥ 1 / ∀ x y ∈ R d ∀εδ{>}0/∀{d}{≥}1/∀xy{∈}\mathbb{R}^{d} εδ>0/∀d1/∀xyRd t = O ( 1 ε 2 log ⁡ 1 δ ) t{=}O\left(\cfrac{1}{ε^{2}}\log\cfrac{1}{δ}\right) t=O(ε21logδ1),让 s i j ∼ U { – 1 , 1 } s_{ij}{∼}U\{–1,1\} sijU{–1,1} Pr [ ∣ ⟨ ψ ( u ) , ψ ( v ) ⟩ – ⟨ u , v ⟩ ∣ ≤ ε ] ≥ 1 – δ \text{Pr}\left[|⟨ψ(u),ψ(v)⟩–⟨u,v⟩|{≤}ε\right]{≥}1–δ Pr[ψ(u),ψ(v)⟩u,vε]1–δ

  1. 极化恒等式
    • ⟨ ψ ( u ) , ψ ( v ) ⟩ = 1 d proj ⟨ S u , S v ⟩ = 1 4 ( ∥ S ( u + v ) ∥ 2 d proj – ∥ S ( u – v ) ∥ 2 d proj ) ⟨ψ(u),ψ(v)⟩{=}\cfrac{1}{d_{\text{proj}}}⟨\mathbf{S}u,\mathbf{S}v⟩{=}\cfrac{1}{4}\left(\cfrac{\|\mathbf{S}(u{+}v)\|^2}{d_{\text{proj}}}–\cfrac{\|\mathbf{S}(u–v)\|^2}{d_{\text{proj}}}\right) ψ(u),ψ(v)⟩=dproj1Su,Sv=41(dprojS(u+v)2dprojS(uv)2),及 ⟨ u , v ⟩ = 1 4 ( ∥ u + v ∥ 2 – ∥ u – v ∥ 2 ) ⟨u,v⟩{=}\cfrac{1}{4}\left(\|u{+}v\|^2–\|u–v\|^2\right) u,v=41(u+v2–∥uv2)
    • 二者偏差为 ∣ 1 d proj ⟨ S u , S v ⟩ – ⟨ u , v ⟩ ∣ = 1 4 ( ( 1 d proj ∥ S ( u + v ) ∥ 2 – ∥ u + v ∥ 2 ) – ( 1 d proj ∥ S ( u – v ) ∥ 2 – ∥ u – v ∥ 2 ) ) \left|\cfrac{1}{d_{\text{proj}}}⟨\mathbf{S}u,\mathbf{S}v⟩–⟨u,v⟩\right|{=}\cfrac{1}{4}\left(\left(\cfrac{1}{d_{\text{proj}}}\|\mathbf{S}(u{+}v)\|^2–\|u{+}v\|^2\right)–\left(\cfrac{1}{d_{\text{proj}}}\|\mathbf{S}(u–v)\|^2–\|u–v\|^2\right)\right) dproj1Su,Svu,v =41((dproj1S(u+v)2–∥u+v2)(dproj1S(uv)2–∥uv2))
  2. u + v u{+}v u+v u – v u–v uv应用推论一
    • u ± v u{±}v u±v Pr ⁡ [ 1 d proj ∥ S ( u ± v ) ∥ 2 ≤ ( 1 − ε ) ∥ u ± v ∥ 2 ] ≤ e – ( ε 2 – ε 3 ) t 4 \Pr\left[\cfrac{1}{d_{\text{proj}}}\|\mathbf{S}(u{±}v)\|^2{≤}(1{-}ε)\|u{±}v\|^2\right]{≤}e^{–\frac{\left(ε^2–ε^3\right)t}{4}} Pr[dproj1S(u±v)2(1ε)u±v2]e4(ε2ε3)t 1 − ε 1{-}ε 1ε换成 1 + ε 1{+}ε 1+ε也成立
    • 合并后 Pr ⁡ [ 1 d proj ∥ S ( u + v ) ∥ 2 ∈ ( 1 ± ε ) ∥ u + v ∥ 2 ] ≥ 1 – 2 e – ( ε 2 – ε 3 ) t 4 \Pr\left[\cfrac{1}{d_{\text{proj}}}\left\|\mathbf{S}(u{+}v)\right\|^2{∈}(1{±}ε)\|u{+}v\|^2\right]{≥}1–2e^{–\frac{\left(ε^2–ε^3\right)t}{4}} Pr[dproj1S(u+v)2(1±ε)u+v2]1–2e4(ε2ε3)t u + v u{+}v u+v换成 u − v u{-}v uv也成立
    • 进而 max ⁡ { 1 d proj ∥ S ( u + v ) ∥ 2 − ∥ u + v ∥ 2 } = ε ∥ u + v ∥ 2 \max\left\{\cfrac{1}{d_{\text{proj}}}\|\mathbf{S}(u{+}v)\|^2{-}\|u{+}v\|^2\right\}{=}ε\|u{+}v\|^2 max{dproj1S(u+v)2u+v2}=εu+v2 min ⁡ { 1 d proj ∥ S ( u − v ) ∥ 2 − ∥ u − v ∥ 2 } = – ε ∥ u – v ∥ 2 \min\left\{\cfrac{1}{d_{\text{proj}}}\|\mathbf{S}(u{-}v)\|^2{-}\|u{-}v\|^2\right\}{=}–ε\|u–v\|^2 min{dproj1S(uv)2uv2}=εuv2
    • 代回 ∣ 1 d proj ⟨ S u , S v ⟩ – ⟨ u , v ⟩ ∣ ≤ ε 4 ( ∥ u + v ∥ 2 + ∥ u – v ∥ 2 ) = ε \left|\cfrac{1}{d_{\text{proj}}}⟨\mathbf{S}u,\mathbf{S}v⟩–⟨u,v⟩\right|{≤}\cfrac{ε}{4}\left(\|u{+}v\|^2{+}\|u–v\|^2\right){=}ε dproj1Su,Svu,v 4ε(u+v2+uv2)=ε
  3. 概率修正:
    • t = O ( 1 ε 2 log ⁡ 1 δ ) t{=}O\left(\cfrac{1}{ε^{2}}\log\cfrac{1}{δ}\right) t=O(ε21logδ1)假设 t ≥ 4 ε 2 – ε 3 ln ⁡ 4 δ t{≥}\cfrac{4}{ε^2–ε^3}\ln\cfrac{4}{δ} tε2ε34lnδ4,即 2 e – ( ε 2 – ε 3 ) t 4 ≤ δ 2 2e^{–\frac{\left(ε^2–ε^3\right)t}{4}}{≤}\cfrac{δ}{2} 2e4(ε2ε3)t2δ
    • Pr ⁡ [ 1 d proj ∥ S ( u + v ) ∥ 2 ∈ ( 1 ± ε ) ∥ u + v ∥ 2 ] ≥ 1 – δ 2 \Pr\left[\cfrac{1}{d_{\text{proj}}}\left\|\mathbf{S}(u{+}v)\right\|^2{∈}(1{±}ε)\|u{+}v\|^2\right]{≥}1–\cfrac{δ}{2} Pr[dproj1S(u+v)2(1±ε)u+v2]1–2δ u + v u{+}v u+v换成 u − v u{-}v uv也成立
    • Pr [ ∣ ⟨ ψ ( u ) , ψ ( v ) ⟩ – ⟨ u , v ⟩ ∣ ≤ ε ] ≥ 1 – δ \text{Pr}\left[|⟨ψ(u),ψ(v)⟩–⟨u,v⟩|{≤}ε\right]{≥}1–δ Pr[ψ(u),ψ(v)⟩u,vε]1–δ

2. \textbf{2. } 2. 事实 A.4 \textbf{A.4} A.4的证明

2.1. Muvera \textbf{2.1. Muvera} 2.1. Muvera前两步

1️⃣文本嵌入:得到查询和段落文本各自的多向量嵌入

  1. 查询嵌入 Q Q Q { q 1 , . . . , q m } \{q_1,...,q_m\} {q1,...,qm},其中 q i ⊆ R d q_i{⊆}\mathbb{R}^{d} qiRd即为固定 d d d
  2. 段落嵌入 P P P { p 1 , . . . , p n } \{p_1,...,p_n\} {p1,...,pn},其中 p i ⊆ R d p_i{⊆}\mathbb{R}^{d} piRd即为固定 d d d

2️⃣向量分桶:用 SimHash \text{SimHash} SimHash将原有空间分为 2 k sim 2^{k_{\text{sim}}} 2ksim个桶,每个桶用长为 k sim k_{\text{sim}} ksim的二进制向量编码

  1. 法向抽取:从高斯分布中抽取 k sim ≥ 1 k_{\text{sim}}{≥}1 ksim1个向量 g 1 , … , g k sim ∈ R d g_{1},\ldots,g_{k_{\text{sim}}}{∈}\mathbb{R}^{d} g1,,gksimRd,作为 k sim k_{\text{sim}} ksim个超平面的法向量
  2. 空间划分: φ ( x ) = ( 1 ( ⟨ g 1 , x ⟩ > 0 ) , … , 1 ( ⟨ g k sim , x ⟩ > 0 ) ) φ(x){=}(\mathbf{1}(⟨g_{1},x⟩{>}0),\ldots,\mathbf{1}(⟨g_{k_{\text{sim}}},x⟩{>}0)) φ(x)=(1(⟨g1,x>0),,1(⟨gksim,x>0))
    • 1 ( ⟨ g i , x ⟩ > 0 ) \mathbf{1}(⟨g_i,x⟩{>}0) 1(⟨gi,x>0):当 ⟨ g i , x ⟩ > 0 ⟨g_i,x⟩{>}0 gi,x>0成立(即 x x x投影在超平面 g i g_i gi的正侧)时,将该位设为 1 1 1
  3. 向量分桶:让所有的 m + n m{+}n m+n个嵌入通过 φ ( ⋅ ) φ(\cdot) φ()得到长 k sim k_{\text{sim}} ksim的二进制编码,相同编码者放入同一桶

2.2. \textbf{2.2. } 2.2. 声明的内容

👉记号:记 ∀ x , y ∈ R b ∀x,y{∈}\mathbb{R}^b x,yRb夹角为 θ ( x , y ) ∈ [ 0 , π ] θ(x,y){∈}[0,\pi] θ(x,y)[0,π],二进制编码 x , y x,y x,y的海明距离为 ∥ x – y ∥ 0 \|x–y\|_{0} xy0

👉前提 1 1 1:对 ∀ q i ∈ Q ∀q_i{∈}Q qiQ以及 ∀ p j ∈ P ∀p_j{∈}P pjP,给定 ∀ ε ≤ 1 2 ∀ε\text{≤}\cfrac{1}{2} ε21(与定理 2.1 \text{2.1} 2.1统一)与 ∀ δ ≤ ε ∀δ{≤}ε δε

👉前提 2 2 2:令 k sim = O ( 1 ε log ⁡ m δ ) k_{\text{sim}}{=}O\left(\cfrac{1}{ε}\log{\cfrac{m}{δ}}\right) ksim=O(ε1logδm),其中 m = ∣ Q ∣ + ∣ P ∣ m{=}|Q|{+}|P| m=Q+P

👉结论: ∣ ∥ φ ( q i ) – φ ( p j ) ∥ 0 − k s i m θ ( q i , p j ) π ∣ ≤ ε k s i m \left|\|φ(q_i)–φ(p_j)\|_0{-}\cfrac{k_{\mathrm{sim}}θ(q_i,p_j)}{\pi}\right|\text{≤}\sqrt{ε}k_{\mathrm{sim}} φ(qi)φ(pj)0πksimθ(qi,pj) ε ksim Pr ⁡ ≥ 1 – ( ε δ m 2 ) \Pr{≥}1–\left(\cfrac{εδ}{m^2}\right) Pr1–(m2εδ)的概率成立

2.3. \textbf{2.3. } 2.3. 声明的证明

➡️由以下分析可得 Pr ⁡ [ 1 ( ⟨ g k , q i ⟩ > 0 ) ≠ 1 ( ⟨ g k , p j ⟩ > 0 ) ] = θ ( q i , p j ) π \Pr[\mathbf{1}(⟨g_k,q_i⟩{>}0)\neq\mathbf{1}(⟨g_k, p_j⟩{>}0)]{=}\cfrac{θ(q_i,p_j)}{\pi} Pr[1(⟨gk,qi>0)=1(⟨gk,pj>0)]=πθ(qi,pj)

  1. 通过 Gram-Schmidt \text{Gram-Schmidt} Gram-Schmidt过程,对 q i , p j q_i,p_j qi,pj进行 R \mathbf{R} R旋转
    • d d d维空间基为 B = { e ⃗ 1 , e ⃗ 2 , . . . , e ⃗ d } B{=}\{\vec{e}_1,\vec{e}_2,...,\vec{e}_d\} B={e 1,e 2,...,e d}满足 ∀ e ⃗ i , e ⃗ j ∈ B ∀{\vec{e}_i,\vec{e}_j}{∈}B e i,e jB e ⃗ i ⊥ e ⃗ j \vec{e}_i\text{⊥}\vec{e}_j e ie j,让旋转矩阵 R \mathbf{R} R q i q_i qi旋转到 e ⃗ 1 \vec{e}_1 e 1方向即 R q i = ( 1 , 0 , . . . , 0 ) d \mathbf{R}q_i{=}(1,0,...,0)^d Rqi=(1,0,...,0)d
    • 只考虑 e ⃗ 1 \vec{e}_1 e 1 e ⃗ 2 \vec{e}_2 e 2组成的二维平面,则 R p j = e ⃗ 1 cos ⁡ ( R q i , R p j ) + e ⃗ 2 sin ⁡ ( R q i , R p j ) \mathbf{R}p_j{=}\vec{e}_1\cos{(\mathbf{R}q_i,\mathbf{R}p_j)}{+}\vec{e}_2\sin{(\mathbf{R}q_i,\mathbf{R}p_j)} Rpj=e 1cos(Rqi,Rpj)+e 2sin(Rqi,Rpj)
    • θ ( R q i , R p j ) = θ ( q i , p j ) θ{(\mathbf{R}q_i,\mathbf{R}p_j)}{=}θ{(q_i,p_j)} θ(Rqi,Rpj)=θ(qi,pj),则 R p j = e 1 cos ⁡ θ ( q i , p j ) + e 2 sin ⁡ θ ( q i , p j ) = ( cos ⁡ θ ( q i , p j ) , sin ⁡ θ ( q i , p j ) , 0 , . . . , 0 ) d \mathbf{R}p_j{=}e_1\cos{θ(q_i,p_j)}{+}e_2\sinθ{(q_i,p_j)}{=}(\cos{θ(q_i,p_j)},\sin{θ(q_i,p_j)},0,...,0)^d Rpj=e1cosθ(qi,pj)+e2sinθ(qi,pj)=(cosθ(qi,pj),sinθ(qi,pj),0,...,0)d
  2. g k g_k gk进行 R \mathbf{R} R旋转
    • (旋转不变性)对高斯向量 g k ∈ R d × 1 ∼ N ( 0 , I d ) g_k{∈}\mathbb{R}^{d\text{×}1}\text{∼}\mathcal{N}(\textbf{0},\boldsymbol{\text{I}_d}) gkRd×1N(0,Id)及旋转矩阵 R ∈ R d × d \mathbf{R}{∈}\mathbb{R}^{d\text{×}d} RRd×d,有 R g k ∼ N ( 0 , I d ) \mathbf{R}g_k\text{∼}\mathcal{N}(\textbf{0},\boldsymbol{\text{I}_d}) RgkN(0,Id)
    • R g k = ( g k 1 , . . . , g k d ) ∼ N ( 0 , I d ) \mathbf{R}g_k{=}(g_{k_1},...,g_{k_d})\text{∼}\mathcal{N}(\textbf{0},\boldsymbol{I_d}) Rgk=(gk1,...,gkd)N(0,Id),其中 g k 1 ∼ N ( 0 , 1 ) g_{k_1}\text{∼}{\mathcal{N}(0,1)} gk1N(0,1) g k 2 ∼ N ( 0 , 1 ) g_{k_2}\text{∼}{\mathcal{N}(0,1)} gk2N(0,1)
    • R \mathbf{R} R在对 x , y x,y x,y旋转时仅前两位起作用,故可令 R g k = ( g k 1 , g k 2 , 0 , . . . , 0 ) d \mathbf{R}g_k{=}(g_{k_1},g_{k_2},0,...,0)^d Rgk=(gk1,gk2,0,...,0)d,或 R g = ( cos ⁡ ϕ , sin ⁡ ϕ , 0 , . . . , 0 ) d \mathbf{R}g{=}(\cosϕ,\sinϕ,0,...,0)^d Rg=(cosϕ,sinϕ,0,...,0)d
  3. 求相应内积
    • ⟨ g k , q i ⟩ = ⟨ R g k , R q i ⟩ = ( g k 1 , g k 2 , . . . , g k d ) ( 1 , 0 , . . . , 0 ) d T = cos ⁡ ϕ ⟨g_k,q_i⟩{=}⟨\mathbf{R}g_k,\mathbf{R}q_i⟩{=}(g_{k_1},g_{k_2},...,g_{k_d})(1,0,...,0)^{dT}{=}\cosϕ gk,qi=Rgk,Rqi=(gk1,gk2,...,gkd)(1,0,...,0)dT=cosϕ
    • ⟨ g k , p j ⟩ = ⟨ R g k , R p j ⟩ = ( cos ⁡ ϕ , sin ⁡ ϕ , 0 , . . . , 0 ) d ( cos ⁡ θ ( q i , p j ) , sin ⁡ θ ( q i , p j ) , 0 , . . . , 0 ) d T = cos ⁡ ( ϕ – θ ( q i , p j ) ) ⟨g_k,p_j⟩{=}⟨\mathbf{R}g_k,\mathbf{R}p_j⟩{=}(\cosϕ,\sinϕ,0,...,0)^d(\cos{θ(q_i,p_j)},\sin{θ(q_i,p_j)},0,...,0)^{dT}{=}\cos(ϕ–θ(q_i,p_j)) gk,pj=Rgk,Rpj=(cosϕ,sinϕ,0,...,0)d(cosθ(qi,pj),sinθ(qi,pj),0,...,0)dT=cos(ϕθ(qi,pj))
  4. 整理 Pr ⁡ [ 1 ( ⟨ g k , q i ⟩ > 0 ) ≠ 1 ( ⟨ g k , p j ⟩ > 0 ) ] \Pr[\mathbf{1}(⟨g_k,q_i⟩{>}0)\neq\mathbf{1}(⟨g_{k},p_j⟩{>}0)] Pr[1(⟨gk,qi>0)=1(⟨gk,pj>0)],变为 Pr ⁡ [ 1 ( cos ⁡ ϕ > 0 ) ≠ 1 ( cos ⁡ ( ϕ – θ ( q i , p j ) ) > 0 ) ] \Pr\left[\mathbf{1}\left(\cosϕ{>}0\right)\neq\mathbf{1}\left(\cos(ϕ–θ(q_i,p_j)){>}0\right)\right] Pr[1(cosϕ>0)=1(cos(ϕθ(qi,pj))>0)]
    • 情形 1 1 1 cos ⁡ ϕ > 0 \cosϕ{>}0 cosϕ>0 cos ⁡ ( ϕ – θ ( q i , p j ) ) ≤ 0 \cos(ϕ–θ(q_i,p_j))\text{≤}0 cos(ϕθ(qi,pj))0,则 ϕ ∈ ( – π 2 , π 2 ) ϕ{∈}\left(–\cfrac{\pi}{2},\cfrac{\pi}{2}\right) ϕ(2π,2π) ϕ ∈ ( θ ( q i , p j ) + π 2 , θ ( q i , p j ) + 3 π 2 ) ϕ{∈}\left(θ(q_i,p_j){+}\cfrac{\pi}{2},θ(q_i,p_j){+}\cfrac{3\pi}{2}\right) ϕ(θ(qi,pj)+2π,θ(qi,pj)+23π),交集长 ∣ ϕ ∣ = θ ( q i , p j ) |ϕ|{=}θ(q_i,p_j) ϕ=θ(qi,pj)
    • 情形 2 2 2 cos ⁡ ϕ < 0 \cosϕ\text{<}0 cosϕ<0 cos ⁡ ( ϕ – θ ( q i , p j ) ) ≥ 0 \cos(ϕ–θ(q_i,p_j)){≥}0 cos(ϕθ(qi,pj))0,则 ϕ ∈ ( π 2 , 3 π 2 ) ϕ{∈}\left(\cfrac{\pi}{2},\cfrac{3\pi}{2}\right) ϕ(2π,23π) ϕ ∈ ( θ ( q i , p j ) – π 2 , θ ( q i , p j ) + π 2 ) ϕ{∈}\left(θ(q_i,p_j)\text{–}\cfrac{\pi}{2},θ(q_i,p_j){+}\cfrac{\pi}{2}\right) ϕ(θ(qi,pj)2π,θ(qi,pj)+2π),交集长 ∣ ϕ ∣ = θ ( q i , p j ) |ϕ|{=}θ(q_i,p_j) ϕ=θ(qi,pj)
    • ϕ ϕ ϕ总范围为 ∣ ϕ ∣ max ⁡ = 2 π |ϕ|_{\max}{=}2\pi ϕmax=2π,故 Pr ⁡ [ 1 ( ⟨ g k , q i ⟩ > 0 ) ≠ 1 ( ⟨ g k , p j ⟩ > 0 ) ] = 2 θ ( q i , p j ) 2 π \Pr[\mathbf{1}(⟨g_k,q_i⟩{>}0)\neq\mathbf{1}(⟨g_{k},p_j⟩{>}0)]{=}\cfrac{2θ(q_i,p_j)}{2\pi} Pr[1(⟨gk,qi>0)=1(⟨gk,pj>0)]=2π2θ(qi,pj)

➡️构造变量 Z k Z_k Zk

  1. 对每个 k ∈ { 1 , 2 , . . . , k sim } k{∈}\{1,2,...,k_{\text{sim}}\} k{1,2,...,ksim}及对应的高斯向量 g k g_k gk,定义 Z k = 1 ( ⟨ g k , q i ⟩ > 0 ) ⊕ 1 ( ⟨ g k , p j ⟩ > 0 ) Z_k{=}\mathbf{1}(⟨ g_k, q_i⟩{>}0){⊕}\mathbf{1}(⟨ g_k, p_j⟩{>}0) Zk=1(⟨gk,qi>0)1(⟨gk,pj>0),即二者不相等时 Z k = 1 Z_k{=}1 Zk=1
    erzthyjgdkvghvvzvgret
  2. 由海明距离的定义(两二进制编码上下对齐后有多少对应位不同),则有 ∥ φ ( q i ) – φ ( p j ) ∥ 0 = ∑ k = 1 k sim Z k \|φ(q_i)–φ(p_j)\|_0{=}\displaystyle\sum_{k{=}1}^{k_{\text{sim}}}Z_k φ(qi)φ(pj)0=k=1ksimZk
  3. Pr ⁡ [ 1 ( ⟨ g k , q i ⟩ > 0 ) ≠ 1 ( ⟨ g k , p j ⟩ > 0 ) ] = θ ( q i , p j ) π \Pr[\mathbf{1}(⟨g_k,q_i⟩{>}0)\neq\mathbf{1}(⟨g_k, p_j⟩{>}0)]{=}\cfrac{θ(q_i,p_j)}{\pi} Pr[1(⟨gk,qi>0)=1(⟨gk,pj>0)]=πθ(qi,pj) E [ Z k ] = 1 ( θ ( q i , p j ) π ) + 0 ( 1 – θ ( q i , p j ) π ) = θ ( q i , p j ) π \mathbb{E}\left[Z_k\right]{=}1\left(\cfrac{θ(q_i,p_j)}{\pi}\right){+}0\left(1–\cfrac{θ(q_i,p_j)}{\pi}\right){=}\cfrac{θ(q_i,p_j)}{\pi} E[Zk]=1(πθ(qi,pj))+0(1–πθ(qi,pj))=πθ(qi,pj)
  4. ∣ ∥ φ ( q i ) – φ ( p j ) ∥ 0 − k s i m θ ( q i , p j ) π ∣ = ∣ ∑ k = 1 k sim Z k − E [ ∑ k = 1 k sim Z k ] ∣ \left|\|φ(q_i)–φ(p_j)\|_0{-}\cfrac{k_{\mathrm{sim}}θ(q_i,p_j)}{\pi}\right|{=}\left|\displaystyle\sum_{k{=}1}^{k_{\text{sim}}}Z_k{-}\mathbb{E}\left[\sum_{k{=}1}^{k_{\text{sim}}}Z_k\right]\right| φ(qi)φ(pj)0πksimθ(qi,pj) = k=1ksimZkE[k=1ksimZk] ,故需证 Pr ⁡ [ ∣ ∑ k = 1 k sim Z k − E [ ∑ k = 1 k sim Z k ] ∣ ≥ ε k sim ] ≤ ( ε δ m 2 ) \Pr\left[\left|\displaystyle\sum_{k{=}1}^{k_{\text{sim}}}Z_k{-}\mathbb{E}\left[\sum_{k{=}1}^{k_{\text{sim}}}Z_k\right]\right|{≥}\sqrt{ε}k_{\text{sim}}\right]\text{≤}\left(\cfrac{εδ}{m^2}\right) Pr[ k=1ksimZkE[k=1ksimZk] ε ksim](m2εδ)

➡️进一步转换

  1. Hoeffding \text{Hoeffding} Hoeffding不等式,则 Z k ∈ [ 0 , 1 ] Z_k{∈}[0,1] Zk[0,1] Pr ⁡ [ ∣ ∑ k = 1 n Z k − E [ ∑ k = 1 n Z k ] ∣ ≥ t ] ≤ 2 e – 2 t 2 n \Pr\left[\left|\displaystyle\sum_{k{=}1}^{n}Z_k{-}\mathbb{E}\left[\sum_{k{=}1}^{n}Z_k\right]\right|{≥}t\right]\text{≤}2e^{–\frac{2t^2}{n}} Pr[ k=1nZkE[k=1nZk] t]2en2t2
  2. t = ε k sim t{=}\sqrt{ε}k_{\text{sim}} t=ε ksim n = k sim n{=}k_{\text{sim}} n=ksim,则 Pr ⁡ [ ∣ ∑ k = 1 k sim Z k − E [ ∑ k = 1 k sim Z k ] ∣ ≥ ε k sim ] ≤ 2 e – 2 ε k sim \Pr\left[\left|\displaystyle\sum_{k{=}1}^{k_{\text{sim}}}Z_k{-}\mathbb{E}\left[\sum_{k{=}1}^{k_{\text{sim}}}Z_k\right]\right|{≥}\sqrt{ε}k_{\text{sim}}\right]\text{≤}2e^{–2εk_{\text{sim}}} Pr[ k=1ksimZkE[k=1ksimZk] ε ksim]2e–2εksim,需证 2 e – 2 ε k sim ≤ ε δ m 2 2e^{–2εk_{\text{sim}}}\text{≤}\cfrac{εδ}{m^2} 2e–2εksimm2εδ k s i m ≥ 1 2 ε ln ⁡ 2 m 2 ε δ k_{\mathrm{sim}}{≥}\cfrac{1}{2ε}\ln\cfrac{2m^2}{εδ} ksim2ε1lnεδ2m2
  3. 只需验证 k sim = O ( 1 ε log ⁡ m δ ) ≥ 1 2 ε ln ⁡ 2 m 2 ε δ k_{\text{sim}}{=}O\left(\cfrac{1}{ε}\log{\cfrac{m}{δ}}\right){≥}\cfrac{1}{2ε}\ln\cfrac{2m^2}{εδ} ksim=O(ε1logδm)2ε1lnεδ2m2
    • C ε ln ⁡ m δ ≥ k sim ≥ 1 2 ε ln ⁡ 2 m 2 ε δ \cfrac{C}{ε}\ln\cfrac{m}{δ}{≥}k_{\text{sim}}{≥}\cfrac{1}{2ε}\ln\cfrac{2m^2}{ε δ} εClnδmksim2ε1lnεδ2m2,需验证 C ε ln ⁡ m δ ≥ 1 2 ε ln ⁡ 2 m 2 ε δ \cfrac{C}{ε}\ln\cfrac{m}{δ}{≥}\cfrac{1}{2ε}\ln\cfrac{2m^2}{εδ} εClnδm2ε1lnεδ2m2,即上界足以覆盖下界
    • ( 2 C – 2 ) ln ⁡ m – ( 2 C – 1 ) ln ⁡ δ ≥ ln ⁡ 2 – ln ⁡ ε (2C–2)\ln{m}–(2C–1)\ln{δ}{≥}\ln2–\ln{ε} (2C–2)lnm(2C–1)lnδln2–lnε
    • { 2 C – 2 ≥ 1 2 C – 1 ≥ 1 \begin{cases}2C–2{≥}1\\2C–1{≥}1\end{cases} {2C–212C–11 { 2 ≤ m δ ≤ ε \begin{cases}2{≤}m\\δ{≤}ε\end{cases} {2mδε则上式成立,解得 C ≥ 3 2 C{≥}\cfrac{3}{2} C23 m ≥ 2 , δ ≤ ε m{≥}2,δ{≤}ε m2,δε,其中 m = ∣ Q ∣ + ∣ P ∣ ≥ 2 m{=}|Q|{+}|P|{≥}2 m=Q+P2隐性成立
    • 只需让 δ ≤ ε δ{≤}ε δε结论 O ( 1 ε log ⁡ m δ ) ≥ 1 2 ε ln ⁡ 2 m 2 ε δ O\left(\cfrac{1}{ε}\log{\cfrac{m}{δ}}\right){≥}\cfrac{1}{2ε}\ln\cfrac{2m^2}{εδ} O(ε1logδm)2ε1lnεδ2m2就成立,证毕

http://www.mrgr.cn/news/97540.html

相关文章:

  • 沧州铁狮子
  • ES:geoip_databases
  • 巧用数论与动态规划破解包子凑数问题
  • 计算机面试八股(自整)
  • 无人机装调与测试
  • vue3 脚手架初始化项目生成文件的介绍
  • 七种驱动器综合对比——《器件手册--驱动器》
  • 十四届蓝桥杯Java省赛 B组(持续更新..)
  • babel-runtime 如何缩小打包体积
  • docker的几种网络模式
  • Java反射实战-特殊嵌套格式JSON自定义解析装配
  • 初阶C++笔记第一篇:C++基础语法
  • I²S协议概述与信号线说明
  • 10-MySQL-性能优化思路
  • 网络相关题目
  • LeetCode 热题 100_完全平方数(84_279_中等_C++)(动态规划(完全背包))
  • DMA 概念与讲解
  • 深度学习驱动的车牌识别:技术演进与未来挑战
  • 【c++深入系列】:类和对象详解(下)
  • 【团体程序涉及天梯赛】L1~L2实战反思合集(C++)