Holevo's theorem

Upper bound on the knowable information of a quantum state

Holevo's theorem is an important limitative theorem in quantum computing, an interdisciplinary field of physics and computer science. It is sometimes called Holevo's bound, since it establishes an upper bound to the amount of information that can be known about a quantum state (accessible information). It was published by Alexander Holevo in 1973.

Statement of the theorem

Suppose Alice wants to send a classical message to Bob by encoding it into a quantum state, and suppose she can prepare a state from some fixed set { ρ 1 , . . . , ρ n } {\displaystyle \{\rho _{1},...,\rho _{n}\}} , with the i-th state prepared with probability p i {\displaystyle p_{i}} . Let X {\displaystyle X} be the classical register containing the choice of state made by Alice. Bob's objective is to recover the value of X {\displaystyle X} from measurement results on the state he received. Let Y {\displaystyle Y} be the classical register containing Bob's measurement outcome. Note that Y {\displaystyle Y} is therefore a random variable whose probability distribution depends on Bob's choice of measurement.

Holevo's theorem bounds the amount of correlation between the classical registers X {\displaystyle X} and Y {\displaystyle Y} , regardless of Bob's measurement choice, in terms of the Holevo information. This is useful in practice because the Holevo information does not depend on the measurement choice, and therefore its computation does not require performing an optimization over the possible measurements.

More precisely, define the accessible information between X {\displaystyle X} and Y {\displaystyle Y} as the (classical) mutual information between the two registers maximized over all possible choices of measurements on Bob's side:

I a c c ( X : Y ) = sup { Π i B } i I ( X : Y | { Π i B } i ) , {\displaystyle I_{\rm {acc}}(X:Y)=\sup _{\{\Pi _{i}^{B}\}_{i}}I(X:Y|\{\Pi _{i}^{B}\}_{i}),}
where I ( X : Y | { Π i B } i ) {\displaystyle I(X:Y|\{\Pi _{i}^{B}\}_{i})} is the (classical) mutual information of the joint probability distribution given by p i j = p i Tr ( Π j B ρ i ) {\displaystyle p_{ij}=p_{i}\operatorname {Tr} (\Pi _{j}^{B}\rho _{i})} . There is currently no known formula to analytically solve the optimization in the definition of accessible information in the general case. Nonetheless, we always have the upper bound:
I a c c ( X : Y ) χ ( η ) S ( i p i ρ i ) i p i S ( ρ i ) , {\displaystyle I_{\rm {acc}}(X:Y)\leq \chi (\eta )\equiv S\left(\sum _{i}p_{i}\rho _{i}\right)-\sum _{i}p_{i}S(\rho _{i}),}
where η { ( p i , ρ i ) } i {\displaystyle \eta \equiv \{(p_{i},\rho _{i})\}_{i}} is the ensemble of states Alice is using to send information, and S {\displaystyle S} is the von Neumann entropy. This χ ( η ) {\displaystyle \chi (\eta )} is called the Holevo information or Holevo χ quantity.

Note that the Holevo information also equals the quantum mutual information of the classical-quantum state corresponding to the ensemble:

χ ( η ) = I ( i p i | i i | ρ i ) , {\displaystyle \chi (\eta )=I\left(\sum _{i}p_{i}|i\rangle \!\langle i|\otimes \rho _{i}\right),}
with I ( ρ A B ) S ( ρ A ) + S ( ρ B ) S ( ρ A B ) {\displaystyle I(\rho _{AB})\equiv S(\rho _{A})+S(\rho _{B})-S(\rho _{AB})} the quantum mutual information of the bipartite state ρ A B {\displaystyle \rho _{AB}} . It follows that Holevo's theorem can be concisely summarized as a bound on the accessible information in terms of the quantum mutual information for classical-quantum states.

Proof

Consider the composite system that describes the entire communication process, which involves Alice's classical input X {\displaystyle X} , the quantum system Q {\displaystyle Q} , and Bob's classical output Y {\displaystyle Y} . The classical input X {\displaystyle X} can be written as a classical register ρ X := x = 1 n p x | x x | {\displaystyle \rho ^{X}:=\sum \nolimits _{x=1}^{n}p_{x}|x\rangle \langle x|} with respect to some orthonormal basis { | x } x = 1 n {\displaystyle \{|x\rangle \}_{x=1}^{n}} . By writing X {\displaystyle X} in this manner, the von Neumann entropy S ( X ) {\displaystyle S(X)} of the state ρ X {\displaystyle \rho ^{X}} corresponds to the Shannon entropy H ( X ) {\displaystyle H(X)} of the probability distribution { p x } x = 1 n {\displaystyle \{p_{x}\}_{x=1}^{n}} :

S ( X ) = tr ( ρ X log ρ X ) = tr ( x = 1 n p x log p x | x x | ) = x = 1 n p x log p x = H ( X ) . {\displaystyle S(X)=-\operatorname {tr} \left(\rho ^{X}\log \rho ^{X}\right)=-\operatorname {tr} \left(\sum _{x=1}^{n}p_{x}\log p_{x}|x\rangle \langle x|\right)=-\sum _{x=1}^{n}p_{x}\log p_{x}=H(X).}

The initial state of the system, where Alice prepares the state ρ x {\displaystyle \rho _{x}} with probability p x {\displaystyle p_{x}} , is described by

ρ X Q := x = 1 n p x | x x | ρ x . {\displaystyle \rho ^{XQ}:=\sum _{x=1}^{n}p_{x}|x\rangle \langle x|\otimes \rho _{x}.}

Afterwards, Alice sends the quantum state to Bob. As Bob only has access to the quantum system Q {\displaystyle Q} but not the input X {\displaystyle X} , he receives a mixed state of the form ρ := tr X ( ρ X Q ) = x = 1 n p x ρ x {\displaystyle \rho :=\operatorname {tr} _{X}\left(\rho ^{XQ}\right)=\sum \nolimits _{x=1}^{n}p_{x}\rho _{x}} . Bob measures this state with respect to the POVM elements { E y } y = 1 m {\displaystyle \{E_{y}\}_{y=1}^{m}} , and the probabilities { q y } y = 1 m {\displaystyle \{q_{y}\}_{y=1}^{m}} of measuring the outcomes y = 1 , 2 , , m {\displaystyle y=1,2,\dots ,m} form the classical output Y {\displaystyle Y} . This measurement process can be described as a quantum instrument

E Q ( ρ x ) = y = 1 m q y | x ρ y | x | y y | , {\displaystyle {\mathcal {E}}^{Q}(\rho _{x})=\sum _{y=1}^{m}q_{y|x}\rho _{y|x}\otimes |y\rangle \langle y|,}

where q y | x = tr ( E y ρ x ) {\displaystyle q_{y|x}=\operatorname {tr} \left(E_{y}\rho _{x}\right)} is the probability of outcome y {\displaystyle y} given the state ρ x {\displaystyle \rho _{x}} , while ρ y | x = W E y ρ x E y W / q y | x {\displaystyle \rho _{y|x}=W{\sqrt {E_{y}}}\rho _{x}{\sqrt {E_{y}}}W^{\dagger }/q_{y|x}} for some unitary W {\displaystyle W} is the normalised post-measurement state. Then, the state of the entire system after the measurement process is

ρ X Q Y := [ I X E Q ] ( ρ X Q ) = x = 1 n y = 1 m p x q y | x | x x | ρ y | x | y y | . {\displaystyle \rho ^{XQ'Y}:=\left[{\mathcal {I}}^{X}\otimes {\mathcal {E}}^{Q}\right]\!\left(\rho ^{XQ}\right)=\sum _{x=1}^{n}\sum _{y=1}^{m}p_{x}q_{y|x}|x\rangle \langle x|\otimes \rho _{y|x}\otimes |y\rangle \langle y|.}

Here I X {\displaystyle {\mathcal {I}}^{X}} is the identity channel on the system X {\displaystyle X} . Since E Q {\displaystyle {\mathcal {E}}^{Q}} is a quantum channel, and the quantum mutual information is monotonic under completely positive trace-preserving maps,[1] S ( X : Q Y ) S ( X : Q ) {\displaystyle S(X:Q'Y)\leq S(X:Q)} . Additionally, as the partial trace over Q {\displaystyle Q'} is also completely positive and trace-preserving, S ( X : Y ) S ( X : Q Y ) {\displaystyle S(X:Y)\leq S(X:Q'Y)} . These two inequalities give

S ( X : Y ) S ( X : Q ) . {\displaystyle S(X:Y)\leq S(X:Q).}

On the left-hand side, the quantities of interest depend only on

ρ X Y := tr Q ( ρ X Q Y ) = x = 1 n y = 1 m p x q y | x | x x | | y y | = x = 1 n y = 1 m p x , y | x , y x , y | , {\displaystyle \rho ^{XY}:=\operatorname {tr} _{Q'}\left(\rho ^{XQ'Y}\right)=\sum _{x=1}^{n}\sum _{y=1}^{m}p_{x}q_{y|x}|x\rangle \langle x|\otimes |y\rangle \langle y|=\sum _{x=1}^{n}\sum _{y=1}^{m}p_{x,y}|x,y\rangle \langle x,y|,}

with joint probabilities p x , y = p x q y | x {\displaystyle p_{x,y}=p_{x}q_{y|x}} . Clearly, ρ X Y {\displaystyle \rho ^{XY}} and ρ Y := tr X ( ρ X Y ) {\displaystyle \rho ^{Y}:=\operatorname {tr} _{X}(\rho ^{XY})} , which are in the same form as ρ X {\displaystyle \rho ^{X}} , describe classical registers. Hence,

S ( X : Y ) = S ( X ) + S ( Y ) S ( X Y ) = H ( X ) + H ( Y ) H ( X Y ) = I ( X : Y ) . {\displaystyle S(X:Y)=S(X)+S(Y)-S(XY)=H(X)+H(Y)-H(XY)=I(X:Y).}

Meanwhile, S ( X : Q ) {\displaystyle S(X:Q)} depends on the term

log ρ X Q = log ( x = 1 n p x | x x | ρ x ) = x = 1 n | x x | log ( p x ρ x ) = x = 1 n log p x | x x | I Q + x = 1 n | x x | log ρ x , {\displaystyle \log \rho ^{XQ}=\log \left(\sum _{x=1}^{n}p_{x}|x\rangle \langle x|\otimes \rho _{x}\right)=\sum _{x=1}^{n}|x\rangle \langle x|\otimes \log \left(p_{x}\rho _{x}\right)=\sum _{x=1}^{n}\log p_{x}|x\rangle \langle x|\otimes I^{Q}+\sum _{x=1}^{n}|x\rangle \langle x|\otimes \log \rho _{x},}

where I Q {\displaystyle I^{Q}} is the identity operator on the quantum system Q {\displaystyle Q} . Then, the right-hand side is

S ( X : Q ) = S ( X ) + S ( Q ) S ( X Q ) = S ( X ) + S ( ρ ) + tr ( ρ X Q log ρ X Q ) = S ( X ) + S ( ρ ) + tr ( x = 1 n p x log p x | x x | ρ x ) + tr ( x = 1 n p x | x x | ρ x log ρ x ) = S ( X ) + S ( ρ ) + tr ( x = 1 n p x log p x | x x | ) S ( X ) + tr ( x = 1 n p x ρ x log ρ x ) = S ( ρ ) + x = 1 n p x tr ( ρ x log ρ x ) S ( ρ x ) = S ( ρ ) x = 1 n p x S ( ρ x ) , {\displaystyle {\begin{aligned}S(X:Q)&=S(X)+S(Q)-S(XQ)\\&=S(X)+S(\rho )+\operatorname {tr} \left(\rho ^{XQ}\log \rho ^{XQ}\right)\\&=S(X)+S(\rho )+\operatorname {tr} \left(\sum _{x=1}^{n}p_{x}\log p_{x}|x\rangle \langle x|\otimes \rho _{x}\right)+\operatorname {tr} \left(\sum _{x=1}^{n}p_{x}|x\rangle \langle x|\otimes \rho _{x}\log \rho _{x}\right)\\&=S(X)+S(\rho )+\underbrace {\operatorname {tr} \left(\sum _{x=1}^{n}p_{x}\log p_{x}|x\rangle \langle x|\right)} _{-S(X)}+\operatorname {tr} \left(\sum _{x=1}^{n}p_{x}\rho _{x}\log \rho _{x}\right)\\&=S(\rho )+\sum _{x=1}^{n}p_{x}\underbrace {\operatorname {tr} \left(\rho _{x}\log \rho _{x}\right)} _{-S(\rho _{x})}\\&=S(\rho )-\sum _{x=1}^{n}p_{x}S(\rho _{x}),\end{aligned}}}

which completes the proof.

Comments and remarks

In essence, the Holevo bound proves that given n qubits, although they can "carry" a larger amount of (classical) information (thanks to quantum superposition), the amount of classical information that can be retrieved, i.e. accessed, can be only up to n classical (non-quantum encoded) bits. It was also established, both theoretically and experimentally, that there are computations where quantum bits carry more information through the process of the computation than is possible classically.[2]

See also

References

  1. ^ Preskill, John (June 2016). "Chapter 10. Quantum Shannon Theory" (PDF). Quantum Information. pp. 23–24. Retrieved 30 June 2021.
  2. ^ Maslov, Dmitri; Kim, Jin-Sung; Bravyi, Sergey; Yoder, Theodore J.; Sheldon, Sarah (2021-06-28). "Quantum advantage for computations with limited space". Nature Physics. 17 (8): 894–897. arXiv:2008.06478. Bibcode:2021NatPh..17..894M. doi:10.1038/s41567-021-01271-7. S2CID 221136153.

Further reading

  • Holevo, Alexander S. (1973). "Bounds for the quantity of information transmitted by a quantum communication channel". Problems of Information Transmission. 9: 177–183.
  • Nielsen, Michael A.; Chuang, Isaac L. (2000). Quantum Computation and Quantum Information. Cambridge, UK: Cambridge University Press. ISBN 978-0-521-63235-5. OCLC 43641333. (see page 531, subsection 12.1.1 - equation (12.6) )
  • Wilde, Mark M. (2011). "From Classical to Quantum Shannon Theory". arXiv:1106.1445v2 [quant-ph].. See in particular Section 11.6 and following. Holevo's theorem is presented as exercise 11.9.1 on page 288.
  • v
  • t
  • e
GeneralTheoremsQuantum
communication
Quantum cryptography
Quantum algorithmsQuantum
complexity theoryQuantum
processor benchmarksQuantum
computing modelsQuantum
error correctionPhysical
implementations
Quantum optics
Ultracold atoms
Spin-based
Superconducting
Quantum
programming
  • Quantum information science
  • Quantum mechanics topics