The quantum Chernoff bound as a measure of distinguishability between density matrices: application to qubit and Gaussian states

Hypothesis testing is a fundamental issue in statistical inference and has been a crucial element in the development of information sciences. The Chernoff bound gives the minimal Bayesian error probability when discriminating two hypotheses given a large number of observations. Recently the combined work of Audenaert et al. [Phys. Rev. Lett. 98, 160501] and Nussbaum and Szkola [quant-ph/0607216] has proved the quantum analog of this bound, which applies when the hypotheses correspond to two quantum states. Based on the quantum Chernoff bound, we define a physically meaningful distinguishability measure and its corresponding metric in the space of states; the latter is shown to coincide with the Wigner-Yanase metric. Along the same lines, we define a second, more easily implementable, distinguishability measure based on the error probability of discrimination when the same local measurement is performed on every copy. We study some general properties of these measures, including the probability distribution of density matrices, defined via the volume element induced by the metric, and illustrate their use in the paradigmatic cases of qubits and Gaussian infinite-dimensional states.


I. INTRODUCTION
About fifty years ago Herman Chernoff proved his famous bound, which characterizes the asymptotic behavior of the minimal probability of error when discriminating two hypotheses given a large number of observations [1]. Its quantum analog was recently conjectured [2] and finally proven by combining the results of two recent publications [3,4]. In this quantum setting one is confronted with the problem of knowing the minimum error probability in identifying one of two possible known states of which $N$ identical copies are given. Hereafter we will refer to this minimum simply as the error probability $P_e$. This problem is widely known as quantum state discrimination (see [5] and [6] for two reviews on the recent and more historical developments of this field, respectively). Its difficulty (but also its appeal) lies in the fact that quantum mechanics only allows for full discrimination of such states when they are orthogonal. This has both fundamental and practical implications that lie at the heart of quantum mechanics and its applications.
For these past fifty years the classical Chernoff bound (as well as hypothesis testing in general) has proved to be extremely useful in all branches of science. Likewise, one would expect its quantum version to be far more than a mere academic issue. The characterization and control of quantum devices is a necessary requirement for quantum computation and communication, and quantum hypothesis testing is specially designed for assessing the performance of these tasks. Particularly important examples for which state discrimination plays an essential role are quantum cryptography [7], classical capacity of quantum channels [8], or even quantum algorithms [9]. Equally important are some new theorems concerning different quantum extensions of hypothesis testing: the quantum Stein's lemma, proved some years ago [10,11], and the quantum Hoeffding bound, recently established in [12,13,14].
In this paper we study the classical and the quantum Chernoff bounds in connection to measures of distinguishability for quantum states, putting special emphasis on the qubit and Gaussian cases. We start by reviewing classical and quantum hypothesis testing and the corresponding Chernoff bounds in Sec. II and Sec. III, respectively (the latter includes the aforementioned recent results by Nussbaum and Szkola [4] and Audenaert et al. [3]). In Sec. IV we discuss the notion of a distinguishability measure for quantum states. We briefly motivate an important instance of such a notion based on classical statistical measures, that is, the quantum fidelity, and move to a fully operational alternative, based on the asymptotic rate exponent of the error probability in symmetric quantum hypothesis testing: the quantum Chernoff measure. We also discuss a similar distinguishability measure derived from the same rate exponent when the decision is based on $N$ identical single-copy (local) measurements, instead of the collective measurements on the $N$ copies assumed in the derivation of the quantum Chernoff bound. In Sec. V we study the metrics induced by the previously defined measures of distinguishability and give explicit expressions for general $d$-dimensional systems. We also give the probability distribution of the eigenvalues of a $d \times d$ density matrix based on the quantum Chernoff metric (induced by the corresponding distinguishability measure). We find that the metric based on local measurements is discontinuous and has to be defined piecewise: on the set of pure states, where it agrees with the Fubini-Study metric, and, separately, on the set of strictly mixed states, where it agrees with one-half the Bures-Uhlmann metric. The quantum Chernoff metric, in contrast, is continuous and smoothly interpolates between the Fubini-Study and one-half the Bures-Uhlmann metrics. In Sec. VI we concentrate on the particular case of two-level systems and study in some depth the differences between the quantum Chernoff measure and metric and those based on identical local measurements. In Sec. VII we give explicit expressions of the quantum Chernoff measure and its corresponding induced metric for general Gaussian states. Finally, we state our conclusions in Sec. VIII.

II. CLASSICAL HYPOTHESIS TESTING: CHERNOFF BOUND
One of the most fundamental problems in statistical decision theory is that of choosing between two possible explanations or models, which we will refer to as hypotheses $H_0$ and $H_1$, where the decision is based on a set of data collected from measurements or observations. For example, a medical team has to decide whether a patient is healthy (hypothesis $H_0$) or has a certain disease (hypothesis $H_1$) in view of the results of some clinical test. Often, $H_0$ is called the working hypothesis or null hypothesis, while $H_1$ is called the alternative hypothesis. In general these two hypotheses do not have to be treated on equal footing, since wrongly accepting or rejecting one of them might have very different consequences. These two types of errors, i.e., the rejection of a true null hypothesis or the acceptance of a false null hypothesis, are called type I or type II errors respectively, and their corresponding probabilities will be denoted by $p(1|H_0) \equiv p_0(1)$ and $p(0|H_1) \equiv p_1(0)$ throughout the paper. In our example, failure to diagnose the disease is a type II error, whereas it is a type I error to wrongly conclude that the healthy patient has the disease. Of course it would be desirable to minimize the two types of errors at the same time. However, this is typically not possible since reducing those of one type entails increasing those of the other type. Hence, a common way to proceed is to minimize the errors of one type, while keeping those of the other type bounded by a constant (which may depend on the number of observations). Another (Bayesian-like) approach consists in minimizing a linear combination of the two error probabilities, $P_e = \pi_0\, p(1|H_0) + \pi_1\, p(0|H_1)$, where $\pi_0$ and $\pi_1$ can be interpreted as the a priori probabilities that we assign to the occurrence of each hypothesis. In this paper we consider this latter approach, which is known as symmetric hypothesis testing.
For the sake of simplicity, we assume to start with that $\pi_0 = \pi_1 = 1/2$, and we deal with tests that have only two possible outcomes, $b = 0, 1$. This is, for example, the situation that corresponds to the identification of a biased coin that can be (with equal probability) of one of two types: 0 or 1 (corresponding to hypothesis $H_0$ or $H_1$ respectively). If it is of type 0 the probabilities of obtaining head and tail are respectively $p_0(0) = p$ and $p_0(1) = 1 - p \equiv \bar p$, while if it is of type 1 we write $p_1(0) = q$ and $p_1(1) = 1 - q \equiv \bar q$. The test consists in tossing the coin, which has two possible outcomes: either head ($b = 0$) or tail ($b = 1$).
If we can toss the coin only once (single observation), it is easy to convince oneself that the minimum (average) probability of error is attained when we accept the hypothesis (decide that the tossed coin is of the type) for which the observed outcome occurs with largest probability. Therefore
$$P_e = \frac12 \sum_b \min\{p_0(b), p_1(b)\} \le \frac12 \min_{s\in[0,1]} \sum_b p_0^s(b)\, p_1^{1-s}(b) \equiv P_{CC}, \tag{1}$$
where we have used the inequality $\min\{p, q\} \le p^s q^{1-s}$, valid for $p, q \ge 0$ and $0 \le s \le 1$.
The subscript CC stands for classical Chernoff. This expression also holds for tests with more than two outcomes. We just need to extend the sum over $b$ to the entire range of possible outcomes. In what follows, we leave the range of $b$ unspecified whenever an expression is valid for an arbitrary number of outcomes. Next, let us assume we can toss the coin $N$ times.
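The set of possible outcomes (the sample space) is the $N$-fold Cartesian product of $\{0, 1\}$ (or {head, tail}). The two probability distributions of these outcomes, $p_0^{(N)}(b^{(N)})$ and $p_1^{(N)}(b^{(N)})$, will be given by the product of the corresponding single-observation distributions, $p_i^{(N)}(b^{(N)}) = \prod_{k=1}^{N} p_i(b_k)$ with $b^{(N)} = (b_1, \dots, b_N) \in \{0,1\}^{\times N}$, and one immediately obtains [15]
$$P_e \le \frac12 \min_{s\in[0,1]} \Big[\sum_b p_0^s(b)\, p_1^{1-s}(b)\Big]^N. \tag{2}$$
This is the Chernoff bound [1]. It is especially important because it can be proved to give the exact asymptotic rate exponent of the error probability, that is,
$$-\lim_{N\to\infty} \frac1N \log P_e = -\min_{s\in[0,1]} \log \sum_b p_0^s(b)\, p_1^{1-s}(b) \equiv C(p_0, p_1). \tag{3}$$
The so-called Chernoff information, or Chernoff distance, $C(p_0, p_1)$, can also be written in terms of the Kullback-Leibler divergence $K(p_0/p_1) = \sum_b p_0(b) \log[p_0(b)/p_1(b)]$ [15]:
$$C(p_0, p_1) = K(p_{s^*}/p_0) = K(p_{s^*}/p_1), \tag{4}$$
where
$$p_s(b) = \frac{p_0^s(b)\, p_1^{1-s}(b)}{\sum_{b'} p_0^s(b')\, p_1^{1-s}(b')} \tag{5}$$
is a family of probability distributions known as the Hellinger arc that interpolates between $p_0$ and $p_1$, and $s^*$ is the value of $s$ at which the second equality in (4) holds.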
In other words, it is the point at which $p_s$ is equidistant to both $p_0$ and $p_1$ (in terms of Kullback-Leibler distance). It can be shown that $s^*$ is also the value of $s$ that minimizes the right hand side of (3). For the case of measurements with two outcomes, such as the example of the coins discussed above, one can give a closed expression for the Chernoff distance, which we denote in this binary case as $C(p, q)$:
$$C(p, q) = \xi \log\frac{\xi}{p} + \bar\xi \log\frac{\bar\xi}{\bar p}, \tag{6}$$
with
$$\xi \equiv \frac{\log(\bar q/\bar p)}{\log(p/\bar p) + \log(\bar q/q)}; \qquad \bar\xi \equiv 1 - \xi.$$
The parameter $\xi$ has a very straightforward interpretation. If $N_0$ is the number of heads (of 0's) after $N$ trials, which according to the distribution $p_0$ occurs with probability
$$P_0(N_0) = \binom{N}{N_0}\, p^{N_0}\, \bar p^{\,N - N_0} \tag{7}$$
[according to the distribution $p_1$ it occurs with probability $P_1(N_0)$, defined the same way but with $p$ replaced by $q$], then $\xi$ is the fraction of heads above which one must decide in favor of $p_0$. That is, if $N_0 \ge \xi N$ one accepts hypothesis $H_0$, while if $N_0 < \xi N$ one accepts $H_1$. Asymptotically, the contribution to the error probability is dominated by situations where $N_0 = \xi N$, i.e., by events that occur with the same probability for both hypotheses (see Fig. 1). The probability of such events is clearly a lower bound on the probability of error. It is straightforward to check that $-\lim_{N\to\infty} \log P_0(\xi N)/N$ [or equivalently $-\lim_{N\to\infty} \log P_1(\xi N)/N$] coincides with the upper bound given by the Chernoff distance $C(p, q)$. This proves that the Chernoff bound is indeed attainable.
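The coin example of this section can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names, the coin biases $p = 0.7$, $q = 0.4$, and the grid minimization over $s$ are illustrative assumptions. It compares the closed expression for $C(p, q)$ with a direct minimization over $s$, and the exact $N$-toss error probability with the bound $\frac12 e^{-N C(p,q)}$.

```python
import math

def chernoff_coefficient(p, q, s):
    """sum_b p0(b)^s p1(b)^(1-s) for two coins with head probabilities p and q."""
    return p**s * q**(1 - s) + (1 - p)**s * (1 - q)**(1 - s)

def chernoff_distance_numeric(p, q, grid=20001):
    """C(p, q) = -min_s log sum_b p0^s p1^(1-s), minimized on a uniform grid in s."""
    return -math.log(min(chernoff_coefficient(p, q, k / (grid - 1)) for k in range(grid)))

def chernoff_distance_closed(p, q):
    """Closed form C(p, q) = xi log(xi/p) + (1 - xi) log((1 - xi)/(1 - p)); also returns xi."""
    xi = math.log((1 - q) / (1 - p)) / (math.log(p / (1 - p)) + math.log((1 - q) / q))
    return xi * math.log(xi / p) + (1 - xi) * math.log((1 - xi) / (1 - p)), xi

def exact_error(p, q, n):
    """Minimum error for n tosses and equal priors: (1/2) sum_k min{P0(k), P1(k)},
    where k is the number of heads (a sufficient statistic)."""
    def binom(k, w):
        return math.comb(n, k) * w**k * (1 - w)**(n - k)
    return 0.5 * sum(min(binom(k, p), binom(k, q)) for k in range(n + 1))

# Illustrative (hypothetical) coin biases.
p, q = 0.7, 0.4
c_closed, xi = chernoff_distance_closed(p, q)
c_numeric = chernoff_distance_numeric(p, q)
print("xi =", xi, " C closed =", c_closed, " C numeric =", c_numeric)
for n in (1, 10, 100, 400):
    pe = exact_error(p, q, n)
    # columns: N, exact P_e, Chernoff bound, empirical rate -log(P_e)/N
    print(n, pe, 0.5 * math.exp(-n * c_closed), -math.log(pe) / n)
```

The empirical rate approaches $C(p, q)$ only slowly, since the exact error probability carries a sub-exponential prefactor that the rate exponent does not capture.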
FIG. 1: (Color online) Each curve represents the probability to obtain $N_0$ heads after $N$ tosses of a biased coin that can be of one of two types, 0 or 1. The probability that the coin of type 0 (1) produces a head at any given toss is $p$ ($q$). For large $N$ these curves approach Gaussian distributions centered at $pN$ and $qN$, respectively. The point $\xi N$ where they cross defines the decision boundary (see main text). The error probability is given by the shaded area.

III. QUANTUM HYPOTHESIS TESTING: THE QUANTUM CHERNOFF BOUND
We now tackle discrimination (symmetric hypothesis testing) in a quantum scenario. We consider two sources, 0 and 1, that produce states described respectively by the density matrices $\rho_0$ and $\rho_1$ acting on a Hilbert space $\mathcal{H}$. We are given $N$ copies of a state $\rho$ with the promise that they have been produced either by the source 0 (with prior probability $\pi_0$) or by the source 1 (with prior probability $\pi_1 = 1 - \pi_0$). Accordingly, we can formulate two hypotheses ($H_0$ and $H_1$) about the identity (0 or 1, respectively) of the source that has produced these copies. We wish to find a protocol to determine, with minimal error probability, which hypothesis better explains the nature of the $N$ copies. No matter how complicated this protocol might be, it is clear that the output must be classical: we have to settle for one of the two hypotheses. Therefore the protocol develops in two stages. First, to obtain information about the states we must necessarily make a (quantum) measurement, which in contrast to the classical world is an inherently random and destructive process. Second, one has to provide a classical algorithm that processes the measurement outcomes (classical data) and produces the best answer ($H_0$ or $H_1$). Quantum mechanics allows for a convenient description of this two-step process by assigning to each answer, $H_0$ and $H_1$, a single POVM (positive operator valued measure) element $E_0$ and $E_1$ respectively ($E_b \ge 0$ acts on $\mathcal{H}^{\otimes N}$; $E_0 + E_1 = \mathbb{1}$). The probability that this POVM measurement gives the answer $H_b$ conditioned on the copies having been produced by source $i$ is $p_i(b) = \mathrm{tr}(E_b\, \rho_i^{\otimes N})$. The problem thus reduces to finding the set of operators $\{E_b\}_{b=0}^{1}$ that minimize the mean probability of error,
$$P_e = \pi_0\, \mathrm{tr}(E_1 \rho_0^{\otimes N}) + \pi_1\, \mathrm{tr}(E_0 \rho_1^{\otimes N}). \tag{8}$$
For the simplest case of a single copy ($N = 1$) and two equiprobable hypotheses ($\pi_0 = \pi_1 = 1/2$) it is [16]
$$P_e = \frac12 \left[\mathrm{tr}(E_1 \rho_0) + \mathrm{tr}(E_0 \rho_1)\right]. \tag{9}$$
Since $E_0 = \mathbb{1} - E_1$, we can introduce the Helstrom matrix $\Gamma \equiv \rho_1 - \rho_0$, as is common in quantum state discrimination, and write
$$P_e = \frac12 \left[1 - \mathrm{tr}(E_1 \Gamma)\right], \tag{10}$$
which only needs to be optimized with respect to $E_1$.
We note that $\Gamma$ has some negative eigenvalues, as $\mathrm{tr}\,\Gamma = 0$. This necessarily implies that the minimum error probability is attained if $E_1$ is the projector on the subspace of positive eigenvalues of $\Gamma$. We will denote this projector by $\{\Gamma > 0\}$ and define the positive part of $\Gamma$ as $\Gamma_+ = \{\Gamma > 0\}\Gamma$. Taking into account that $\Gamma$ is traceless, we obtain
$$\mathrm{tr}\,\Gamma_+ = \frac12\, \mathrm{tr}|\Gamma|, \tag{11}$$
where the matrix $|A|$ (absolute value of $A$) is defined to be $|A| = \sqrt{A^\dagger A}$. We arrive at the final result [16],
$$P_e = \frac12 \left(1 - \frac12\, \mathrm{tr}|\rho_1 - \rho_0|\right). \tag{12}$$
The problem of discriminating multiple copies (arbitrary $N$) is thus formally solved by replacing $\rho_i$ by $\rho_i^{\otimes N}$ in the above equations. Indeed, if we do not have any restrictions on the type of measurements performed on the $N$ copies, the optimal measurement is $E_1 = \{\rho_1^{\otimes N} - \rho_0^{\otimes N} > 0\}$ (equivalently, $\{\rho_0^{\otimes N} - \rho_1^{\otimes N} < 0\}$), and the mean probability of error is just
$$P_e = \frac12 \left(1 - \frac12\, \mathrm{tr}\big|\rho_1^{\otimes N} - \rho_0^{\otimes N}\big|\right). \tag{13}$$
However, the computation of the trace norm of the Helstrom matrix in (13) is tedious and, moreover, this equation provides little information about the large-$N$ behavior of the error probability, which is what the Chernoff bound is about. The quantum version of the Chernoff (upper) bound was presented very recently in [3]. There it is shown that
$$P_e \le \frac12 \min_{s\in[0,1]} \mathrm{tr}\big(\rho_0^s\, \rho_1^{1-s}\big) \equiv P_{QC} \tag{14}$$
(the subscript QC stands for quantum Chernoff), which holds for arbitrary density matrices. Moreover, this bound can be very efficiently computed. The bound (14) is a straightforward application of the following theorem [3]:
Theorem 1 Let $A$ and $B$ be two positive operators, then for all $0 \le s \le 1$,
$$\mathrm{tr}\big(A^s B^{1-s}\big) \ge \frac12\, \mathrm{tr}\big(A + B - |A - B|\big). \tag{15}$$
The proof of this theorem involves advanced methods in matrix algebra and we refer the interested reader to [3]. Instead, here we will give a simple proof of the inequality (14) where instead of minimizing over $s$, the particular value $s = 1/2$ will be chosen. We first notice that one obtains an upper bound on $P_e$ by picking any particular positive operator $E_1$ (and, accordingly, $E_0$) in (9). A convenient choice is $\tilde E_1 = \{\rho_1^{1/2} - \rho_0^{1/2} > 0\}$, with $\tilde E_0 = \mathbb{1} - \tilde E_1$, where, as above, $\{A > 0\}$ stands for the projector onto the subspace spanned by the eigenstates of $A$ with positive eigenvalue.
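After the following series of inequalities we arrive at the desired result [17]:
$$P_e \le \frac12 \left[\mathrm{tr}(\rho_0 \tilde E_1) + \mathrm{tr}(\rho_1 \tilde E_0)\right] \le \frac12 \left[\mathrm{tr}\big(\rho_0^{1/2} \rho_1^{1/2} \tilde E_1\big) + \mathrm{tr}\big(\rho_0^{1/2} \rho_1^{1/2} \tilde E_0\big)\right] = \frac12\, \mathrm{tr}\big(\rho_0^{1/2} \rho_1^{1/2}\big). \tag{16}$$
The second inequality holds because $(\rho_1^{1/2} - \rho_0^{1/2})\tilde E_1$ and $(\rho_0^{1/2} - \rho_1^{1/2})\tilde E_0$ are positive operators, so that their products with $\rho_0^{1/2}$ and $\rho_1^{1/2}$, respectively, have non-negative trace. The general proof (for all $s$) follows the same steps, but the inequality analogous to the second one in (16) requires two additional non-obvious relations. These follow immediately from the following non-trivial lemma, which constitutes the core of the proof [3]:
Lemma 1 Let $A$ and $B$ be two positive operators, then for all $0 \le t \le 1$,
$$\mathrm{tr}\big[\{A - B > 0\}\, B\, (A^t - B^t)\big] \ge 0. \tag{19}$$
Before proceeding with the asymptotic limit, several comments about (14) are in order.
(i) The exponential fall-off of the probability of error when a number $N$ of copies is available follows immediately from $\mathrm{tr}(A \otimes B) = \mathrm{tr}A\, \mathrm{tr}B$:
$$P_e \le \frac12 \min_{s\in[0,1]} \big[\mathrm{tr}(\rho_0^s \rho_1^{1-s})\big]^N = \frac12 \exp\Big\{N \min_{s\in[0,1]} \log \mathrm{tr}(\rho_0^s \rho_1^{1-s})\Big\}. \tag{20}$$
Remarkably enough, this rate exponent, which we may call quantum Chernoff information because of its analogy with $C(p_0, p_1)$, is asymptotically attainable, as follows from the results of [4]. This is the quantum extension of the classical result (3) and was first conjectured by Ogawa and Hayashi in [2].
(ii) If the two matrices $\rho_0$ and $\rho_1$ commute the bound reduces to the classical Chernoff bound (1), where the two probability distributions are given by the spectrum of the two density matrices.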
(iii) The function $Q_s = \mathrm{tr}(\rho_0^s \rho_1^{1-s})$ (whose minimum gives the best bound) is a convex function of $s$ in $[0, 1]$, which means that a stationary point will automatically be the global minimum (see [3] for a proof). This is a very useful fact when computing the quantum Chernoff bound (14).
(iv) $Q \equiv \min_s Q_s$ is jointly concave in $(\rho_0, \rho_1)$, unitarily invariant, and non-decreasing under trace preserving quantum operations [3].
(v) The quantum Chernoff bound gives a tighter bound than that given by the quantum fidelity
$$F(\rho_0, \rho_1) = \big(\mathrm{tr}\big|\rho_0^{1/2} \rho_1^{1/2}\big|\big)^2, \tag{21}$$
which is the most widely used quantum distinguishability measure (see next section). This follows from the following set of inequalities:
$$P_e \le \frac12 \min_{s\in[0,1]} \mathrm{tr}(\rho_0^s \rho_1^{1-s}) \le \frac12\, \mathrm{tr}\big(\rho_0^{1/2} \rho_1^{1/2}\big) \le \frac12\, \mathrm{tr}\big|\rho_0^{1/2} \rho_1^{1/2}\big| = \frac12 \sqrt{F(\rho_0, \rho_1)}. \tag{22}$$
In fact, the fidelity also provides a lower bound on the probability of error [18]:
$$P_e \ge \frac12 \left(1 - \sqrt{1 - F(\rho_0, \rho_1)}\right). \tag{23}$$
In the case where one of the states (say $\rho_0$) is pure the upper bound on the error probability can be made tighter [19,20]:
$$P_e \le \frac12\, F(\rho_0, \rho_1). \tag{24}$$
(vi) The quantum Chernoff bound can be easily extended to the case where the two states $\rho_0$ and $\rho_1$ (sources) are not equiprobable:
$$P_e \le \min_{s\in[0,1]} \pi_0^s\, \pi_1^{1-s}\, \mathrm{tr}(\rho_0^s \rho_1^{1-s}). \tag{25}$$
(vii) The permutation invariance of the $N$-copy density matrices, $\rho_i^{\otimes N}$, guarantees that the optimal collective measurement can be implemented efficiently (with a polynomial-size circuit known as quantum Schur transform) [21], and hence that the minimum probability of error is achievable with reasonable resources.
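For a single qubit the quantities in comments (iii)-(v) can be evaluated in closed form, which gives a quick numerical check of the chain of bounds (22)-(23). The sketch below is illustrative and not taken from the paper: it assumes the Bloch parametrization $\rho_i = (\mathbb{1} + \vec r_i \cdot \vec\sigma)/2$, uses the standard qubit expressions $\mathrm{tr}|\rho_1 - \rho_0| = |\vec r_1 - \vec r_0|$ and $F = \frac12[1 + \vec r_0\cdot\vec r_1 + \sqrt{(1 - r_0^2)(1 - r_1^2)}]$, and minimizes over $s$ on a grid; the function names and the two Bloch vectors are hypothetical.

```python
import math

def _norm(r):
    return math.sqrt(sum(x * x for x in r))

def _pow(x, s):
    """x**s restricted to the support (0**s := 0), so that rho**0 is the support projector."""
    return x**s if x > 1e-15 else 0.0

def q_s(r0, r1, s):
    """Q_s = tr(rho0^s rho1^(1-s)) for qubit states rho_i = (1 + r_i . sigma)/2."""
    n0, n1 = _norm(r0), _norm(r1)
    # Cosine of the angle between the Bloch vectors (its value is irrelevant if either vanishes).
    c = sum(a * b for a, b in zip(r0, r1)) / (n0 * n1) if n0 * n1 > 0 else 0.0
    total = 0.0
    for a in (+1, -1):                      # eigenvalue (1 + a*n0)/2 of rho0
        for b in (+1, -1):                  # eigenvalue (1 + b*n1)/2 of rho1
            overlap = (1 + a * b * c) / 2   # trace of the product of the two eigenprojectors
            total += _pow((1 + a * n0) / 2, s) * _pow((1 + b * n1) / 2, 1 - s) * overlap
    return total

def helstrom_error(r0, r1):
    """Eq. (12): P_e = (1/2)(1 - (1/2) tr|rho1 - rho0|), with tr|rho1 - rho0| = |r1 - r0|."""
    return 0.5 * (1 - 0.5 * _norm([b - a for a, b in zip(r0, r1)]))

def chernoff_bound(r0, r1, grid=2001):
    """Eq. (14): P_QC = (1/2) min_s Q_s, minimized on a uniform grid in s."""
    return 0.5 * min(q_s(r0, r1, k / (grid - 1)) for k in range(grid))

def fidelity(r0, r1):
    """Eq. (21) for qubits: F = (1/2)[1 + r0.r1 + sqrt((1 - r0^2)(1 - r1^2))]."""
    dot = sum(a * b for a, b in zip(r0, r1))
    n0, n1 = _norm(r0), _norm(r1)
    return 0.5 * (1 + dot + math.sqrt(max(0.0, (1 - n0**2) * (1 - n1**2))))

# Two illustrative mixed states (hypothetical Bloch vectors).
r0, r1 = (0.0, 0.0, 0.8), (0.6, 0.0, 0.3)
pe, pqc, f = helstrom_error(r0, r1), chernoff_bound(r0, r1), fidelity(r0, r1)
lower, upper = 0.5 * (1 - math.sqrt(1 - f)), 0.5 * math.sqrt(f)
# Expected ordering: fidelity lower bound <= P_e <= P_QC <= fidelity upper bound.
print(lower, pe, pqc, upper)
```

For commuting states (parallel or antiparallel Bloch vectors) `q_s` reduces to the classical sum $\sum_b p_0^s(b)\, p_1^{1-s}(b)$ over the two spectra, in line with comment (ii).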
As stated above, for multiple-copy discrimination the error probability decreases exponentially with the number $N$ of copies: $P_e \sim \exp[-N D(\rho_0, \rho_1)]$ as $N$ goes to infinity [15]. The error (rate) exponent $D(\rho_0, \rho_1)$ is defined generically by
$$D(\rho_0, \rho_1) = -\lim_{N\to\infty} \frac1N \log P_e \tag{26}$$
and characterizes the asymptotic behavior of the error probability. From (20) we readily see that if the best (joint) measurement is used it coincides with the quantum Chernoff information,
$$D_{QC}(\rho_0, \rho_1) = -\min_{s\in[0,1]} \log \mathrm{tr}(\rho_0^s \rho_1^{1-s}), \tag{27}$$
where the equality holds because of the attainability of (20) discussed above and we have added the subscript QC. Moreover, this asymptotic value is also attained by the square root (or "pretty-good") measurement (see [22,23] for the precise definition). This immediately follows from the known bounds [24,25]
$$P_e \le P_e^{\rm sq} \le 2 P_e,$$
where $P_e^{\rm sq}$ is the error probability of discrimination when the square root measurement is used.
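Before closing this section, we briefly come back to the fidelity bounds in (22)-(24) and simply note that the first two inequalities translate into the following bounds on the rate exponent:
$$-\frac12 \log F(\rho_0, \rho_1) \le D_{QC}(\rho_0, \rho_1) \le -\log F(\rho_0, \rho_1). \tag{28}$$
If one of the states is pure, Eq. (24) implies that the factor 1/2 in (28) becomes 1 and we have the exact relation
$$D_{QC}(\rho_0, \rho_1) = -\log F(\rho_0, \rho_1). \tag{29}$$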
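The step from the single-copy fidelity bounds to bounds on the rate exponent can be spelled out as follows. This is a sketch under two assumptions that are standard but not stated explicitly in the text: the fidelity of (21) is multiplicative under tensor products, and $1 - \sqrt{1 - x} \ge x/2$ for $0 \le x \le 1$. Here $F$ is shorthand for $F(\rho_0, \rho_1)$.

```latex
% Assumptions: F is the (squared) fidelity of Eq. (21); F is multiplicative on product states;
% and 1 - sqrt(1 - x) >= x/2 on [0, 1].
\begin{align}
F\big(\rho_0^{\otimes N}, \rho_1^{\otimes N}\big) &= F^{N}, \\
\tfrac14\, F^{N} \;\le\; \tfrac12\Big(1 - \sqrt{1 - F^{N}}\Big) &\le P_e \le \tfrac12\, F^{N/2}, \\
-\tfrac12 \log F + \frac{\log 2}{N} \;\le\; -\frac1N \log P_e &\le -\log F + \frac{\log 4}{N}.
\end{align}
```

Taking $N \to \infty$ removes the $O(1/N)$ terms and leaves the two-sided bound on the rate exponent in terms of $-\log F$.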

IV. DISTINGUISHABILITY MEASURES
In this section we aim to define a measure of distinguishability between states using the results reviewed in Sec. III. Before doing so we will briefly outline how classical statistical methods can be used to (partially) accomplish this goal. We will then discuss an operational measure of distinguishability based on the error probability in multiple-copy state discrimination, leading to the quantum Chernoff measure. Finally we will define the analogous quantity for local discrimination protocols.

A. Classical statistical approach
The notion of distance between states is a fundamental issue that has been studied for a long time. A straightforward way to define such a distance is to take any suitable norm in the space of states. However, a more physical approach, kick-started by the pioneering work in [26], is to relate the inherently probabilistic nature of quantum measurements to classical statistical measures of distinguishability between probability distributions.
In particular, the author in [26] uses the notion of statistical distance,
$$d_S(p_0, p_1) = \arccos \sqrt{\mathcal{F}(p_0, p_1)}, \tag{30}$$
as a measure of distinguishability between the probability distributions $p_0$ and $p_1$, where
$$\mathcal{F}(p_0, p_1) = \Big[\sum_b \sqrt{p_0(b)\, p_1(b)}\Big]^2 \tag{31}$$
is the statistical fidelity. Accordingly, he defines a distinguishability measure between quantum states $\rho_0$ and $\rho_1$ by maximizing $d_S(p_0, p_1)$ [i.e., minimizing $\mathcal{F}(p_0, p_1)$] over all possible POVM measurements, characterized by all possible sets of operators $\{E_b\}$, with $p_i(b) = \mathrm{tr}(E_b \rho_i)$. The statistical distance as such makes sense only when the number of samplings of the probability distribution is large. Hence, in the quantum extension of this notion it is implicitly assumed that one performs the same measurement on each of a large number $N$ of copies of the state $\rho \in \{\rho_0, \rho_1\}$. The optimization over such local repeated measurements leads to one of the most widely used distinguishability measures [27]: the (quantum) fidelity $F(\rho_0, \rho_1)$, defined in (21).
The fidelity, or statistical distance, has many desirable properties: (i) it is easily computable; (ii) for pure states it reduces to the standard distance given by the angle between rays in the Hilbert space $\mathcal{H}$; (iii) as mentioned above, it provides bounds on $P_e$. Nevertheless, a strict physical interpretation is so far unclear, and its definition is based on repeated local measurements, while quantum mechanics allows for much more general ways to access the information contained in the $N$ copies, via collective measurements on all of them.

B. Quantum Chernoff distance
A very natural and also operational distinguishability measure is provided by the error probability of discrimination. As a first candidate, one could take this very error probability $P_e$ for a given fixed number $N$ of copies. However, the choice of a particular $N$ in such a definition would not only be arbitrary but also problematic, since one can find examples [15] where the ordering of the error probabilities $P_e$ of two pairs of states obtained for $N$ copies is reversed for a different number $M$ of copies. A straightforward way to go around this problem is to use the asymptotic expressions for $N \to \infty$ and define the distinguishability measure as the largest rate exponent in (26). We further note that the presence of the logarithm ensures that $D(\rho_0, \rho_1) = 0$ if and only if $\rho_0 = \rho_1$, while the minus sign makes distinguishability decrease as discrimination becomes more difficult, i.e., as $P_e$ increases. The quantum Chernoff information, $D_{QC}(\rho_0, \rho_1)$, is therefore a physically meaningful and efficiently computable distinguishability measure. Note that (27) does not stricto sensu define a distance, since it does not fulfil the triangle inequality. It has however all of the other properties that one should expect from a reasonable measure. This, in itself, is already a remarkable fact since, as far as measures and metrics are concerned, there is usually a compromise among operational definiteness, computability and contractivity [28]. For instance, the distance proposed in [29], although having an operational definition, is not contractive.
We point out that another operational distinguishability measure can be obtained in asymmetric hypothesis testing by minimizing the type II error rate while keeping the type I error rate upper-bounded by a fixed value. The optimal error rate in this situation is provided by the quantum Stein's lemma [10,11] and leads to the well known quantum relative entropy. Despite having an operational meaning, the quantum relative entropy has two obvious drawbacks as a distinguishability measure: it is not symmetric in its arguments and it diverges if one of the states is pure.

C. Classical Chernoff distance: local measurements
In the derivation of the quantum Chernoff bound one optimizes over all possible quantum measurements, in particular over quantum joint measurements on $\mathcal{H}^{\otimes N}$ that act over all the $N$ copies coherently. It is of great interest, both theoretically and in practice, to know whether such joint measurements are strictly necessary to attain the bound or one can make do with separable ones (which include those that can be implemented with local operations and classical communication, simply known as LOCC measurements). As far as we are aware, the answer to this is unknown. This question is also relevant in connection with the operational meaning attached to $D(\rho_0, \rho_1)$. In this section we focus on this operational aspect and compute $D(\rho_0, \rho_1)$ from its definition in (26) assuming that the discrimination protocol $P_e$ refers to is constrained to make use of the same individual measurements, defined by a local POVM $\{E_b\}_{b=1}^{M}$, on each of the $N$ available copies. We loosely refer to these protocols as local. Local protocols are relevant from the theoretical point of view since they help to elucidate the role of quantum correlated measurements in asymptotic hypothesis testing. For example, in quantum phase estimation local measurements suffice to achieve the collective bounds [22]. Here, we will show that these protocols do not achieve the quantum Chernoff bound. In addition, from a more practical point of view, local protocols are much simpler to implement experimentally, especially in a situation where the number of sub-systems is increasingly large.
In such a local protocol, after the measurements have been performed we have a sample of $N$ elements of the probability distribution $p_i(b) = \mathrm{tr}(E_b \rho_i)$, $i = 0, 1$, based on which we have to discriminate between the candidates $H_0$ and $H_1$. In such a scenario the error probability, which we call $P_e^{\rm loc}$, can be obtained using the classical Chernoff bound (1) applied to the distributions $p_0$ and $p_1$. One can thus define the error exponent (26) and thereby introduce a new operational distinguishability measure based on local discrimination:
$$D_{CC}(\rho_0, \rho_1) = \max_{\{E_b\}} C(p_0, p_1) = -\min_{\{E_b\}} \min_{s\in[0,1]} \log \sum_b [\mathrm{tr}(E_b \rho_0)]^s\, [\mathrm{tr}(E_b \rho_1)]^{1-s},$$
where the subscript CC reminds us that we have made use of the classical Chernoff bound. The measure $D_{CC}(\rho_0, \rho_1)$ is obtained by maximizing the rate exponent over all possible single-copy generalized measurements $\{E_b\}_{b=1}^{M}$ (just as is done for the fidelity). Unfortunately, there is no simple closed expression for this maximum for general mixed states. However, we do encounter again the relation (22) with the fidelity: since the square root of the statistical fidelity $\mathcal{F}(p_0, p_1)$ upper bounds $P_{CC}$ in (1), it also upper bounds the local error probability $P_e^{\rm loc}$. That is,
$$P_e^{\rm loc} \le \frac12\, [F(\rho_0, \rho_1)]^{N/2}$$
and
$$D_{CC}(\rho_0, \rho_1) \ge -\frac12 \log F(\rho_0, \rho_1). \tag{34}$$
Since $D_{QC}(\rho_0, \rho_1) \ge D_{CC}(\rho_0, \rho_1)$, we note that whenever $D_{QC}(\rho_0, \rho_1) = -(1/2) \log F(\rho_0, \rho_1)$ the inequality (34) has to be saturated. This, in turn, means that in this situation one can optimally discriminate between $H_0$ and $H_1$ just by performing a fixed local measurement on each of the $N$ copies (no collective measurements are required to attain the quantum Chernoff bound).
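The ordering $-\frac12 \log F \le D_{CC} \le D_{QC}$ can be explored numerically for qubits. The sketch below is illustrative only and rests on several assumptions that are not part of the paper: states are given by Bloch vectors lying in the $x$-$z$ plane, the local measurement is restricted to projective measurements along a direction $\vec m$ in that same plane (so the scan yields a lower estimate of $D_{CC}$, which is defined over all POVMs), and both minimizations over $s$ use the same uniform grid. All function names are hypothetical.

```python
import math

def _norm(r):
    return math.sqrt(sum(x * x for x in r))

def _pow(x, s):
    """x**s restricted to the support (0**s := 0)."""
    return x**s if x > 1e-15 else 0.0

def q_s(r0, r1, s):
    """tr(rho0^s rho1^(1-s)) for qubit states rho_i = (1 + r_i . sigma)/2."""
    n0, n1 = _norm(r0), _norm(r1)
    c = sum(a * b for a, b in zip(r0, r1)) / (n0 * n1) if n0 * n1 > 0 else 0.0
    total = 0.0
    for a in (+1, -1):
        for b in (+1, -1):
            total += (_pow((1 + a * n0) / 2, s) * _pow((1 + b * n1) / 2, 1 - s)
                      * (1 + a * b * c) / 2)
    return total

def local_coefficient(r0, r1, theta, s):
    """sum_b p0(b)^s p1(b)^(1-s) for a projective measurement along m = (sin t, 0, cos t)."""
    m = (math.sin(theta), 0.0, math.cos(theta))
    total = 0.0
    for b in (+1, -1):
        p0 = (1 + b * sum(x * y for x, y in zip(m, r0))) / 2
        p1 = (1 + b * sum(x * y for x, y in zip(m, r1))) / 2
        total += p0**s * p1**(1 - s)
    return total

def fidelity(r0, r1):
    """Qubit fidelity F = (1/2)[1 + r0.r1 + sqrt((1 - r0^2)(1 - r1^2))]."""
    dot = sum(a * b for a, b in zip(r0, r1))
    return 0.5 * (1 + dot + math.sqrt(max(0.0, (1 - _norm(r0)**2) * (1 - _norm(r1)**2))))

S_GRID = [k / 200 for k in range(201)]            # includes s = 1/2
T_GRID = [math.pi * k / 720 for k in range(720)]  # m and -m define the same measurement

# Illustrative mixed states in the x-z plane (hypothetical values).
r0, r1 = (0.0, 0.0, 0.8), (0.6, 0.0, 0.3)
f = fidelity(r0, r1)
d_qc = -math.log(min(q_s(r0, r1, s) for s in S_GRID))
d_loc = max(-math.log(min(local_coefficient(r0, r1, t, s) for s in S_GRID)) for t in T_GRID)
# Expected ordering: -(1/2) log F <= best local rate <= D_QC <= -log F.
print(-0.5 * math.log(f), d_loc, d_qc, -math.log(f))
```

The gap between the second and third printed numbers illustrates, for this particular pair of mixed states and this restricted family of measurements, that repeating a fixed local measurement does not reach the collective rate.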
There is still another important situation when the quantum Chernoff bound is attainable by local measurements: when one of the states (say ρ 0 ) is pure.
If this is the case, Eq. (24) holds and $D_{QC}(\rho_0, \rho_1) = -\log F(\rho_0, \rho_1)$. To prove that $D_{CC}(\rho_0, \rho_1) = D_{QC}(\rho_0, \rho_1)$, let us consider the two-outcome measurement defined by $E_0 = \rho_0$, $E_1 = \mathbb{1} - \rho_0$. Note that $p_0(1) = \mathrm{tr}(E_1 \rho_0) = 0$ and $p_0(0) = \mathrm{tr}(E_0 \rho_0) = 1$. After performing this measurement on each of the $N$ copies the protocol proceeds as follows: we accept $H_0$ if all of the outcomes are 0, otherwise we accept $H_1$. One may refer to this classical data processing as unanimity vote [30]. The error probability can be easily computed by noticing that no error occurs unless we get $N$ times the outcome 0 [since $p_0(1) = 0$]. Therefore,
$$P_e^{\rm loc} = \frac12\, [p_1(0)]^N = \frac12\, [\mathrm{tr}(\rho_0 \rho_1)]^N = \frac12\, [F(\rho_0, \rho_1)]^N, \tag{35}$$
where the last equality holds because $\rho_0$ is assumed to be a pure state. From this equation it follows immediately that $D_{CC}(\rho_0, \rho_1) = -\log F(\rho_0, \rho_1) = D_{QC}(\rho_0, \rho_1)$, and the quantum Chernoff bound is attainable by local measurements. It also follows from the first equality in (35) that this result corresponds to taking the limit $s \to 0$ in (1).
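The unanimity-vote protocol is simple enough to evaluate exactly. The following sketch is illustrative and not from the paper: it assumes a pure qubit state $\rho_0$ with unit Bloch vector $\vec n$, a mixed qubit state $\rho_1$ with Bloch vector $\vec r_1$, and uses $\mathrm{tr}(\rho_0 \rho_1) = (1 + \vec n \cdot \vec r_1)/2$; the names and numerical values are hypothetical.

```python
import math

def overlap(n, r1):
    """tr(rho0 rho1) = (1 + n . r1)/2 for a pure qubit state with unit Bloch vector n."""
    return (1 + sum(a * b for a, b in zip(n, r1))) / 2

def unanimity_error(f, copies):
    """Error of the unanimity vote with equal priors: H0 is never misidentified,
    H1 is misidentified only when all outcomes are 0, which has probability f**copies."""
    return 0.5 * f**copies

def local_coefficient(f, s):
    """sum_b p0(b)^s p1(b)^(1-s) with p0 = (1, 0), p1 = (f, 1 - f), for 0 < s <= 1.
    The b = 1 term vanishes because p0(1) = 0."""
    return f**(1 - s)

# Illustrative states (hypothetical values).
n, r1 = (0.0, 0.0, 1.0), (0.6, 0.0, 0.3)
f = overlap(n, r1)
for copies in (1, 10, 100):
    pe = unanimity_error(f, copies)
    # columns: N, P_e^loc, empirical rate -log(P_e)/N, asymptotic rate -log F
    print(copies, pe, -math.log(pe) / copies, -math.log(f))
# The classical Chernoff coefficient decreases towards f as s -> 0.
print([local_coefficient(f, s) for s in (0.5, 0.1, 0.001)])
```

Since `local_coefficient` is increasing in $s$, its infimum is approached as $s \to 0$, which is the limit referred to at the end of the paragraph above.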

V. METRIC
The set of states of a quantum system, as that of classical probability distributions on a given sample space, can be endowed with a metric structure [31], and thus thought of as a Riemannian manifold. This enables us to relate geometrical concepts (e.g., distance, volume, curvature, parallel transport) to physical ones (e.g., state discrimination and estimation, geometrical phases). Among the novel applications of metrics in quantum information, they have been recently used to characterize quantum phase transitions [32].
The first step towards this geometric approach to quantum states is to define the line element ds or (infinitesimal) distance between two neighboring "points" ρ and ρ − dρ. All local properties follow from this definition. More precisely, they follow from the metric, i.e., from the set of coefficients of ds 2 when written as a quadratic form in the differentials of the coordinates (parameters) that specify the quantum states. There is, however, no unique choice of ds unless some monotonicity conditions are invoked.
For classical probability distributions, $\{p(b)\}$, a line element is singled out (up to a proportionality factor) by imposing that it be non-increasing under stochastic maps. It is the well known Fisher metric (in what follows the terms metric and line element will be used interchangeably):
$$ds_F^2 = \frac14 \sum_b \frac{[dp(b)]^2}{p(b)}. \tag{36}$$
In contrast to the classical case, the monotonicity condition under completely positive (quantum stochastic) maps does not define a metric uniquely, which explains why a substantial body of research on quantum metrics has emerged over the last years. Among the main developments, Petz [33] has characterized the family of quantum contractive metrics by establishing a correspondence with operator-monotone functions.
An alternative, more physical approach is to define a line element from a suitable distinguishability measure between infinitesimally close states. A remarkable example is given in [34]. In this seminal paper Braunstein and Caves consider a one-parameter family of states $\rho(\theta)$ and map the problem of distinguishability to that of estimating the parameter $\theta$ optimally. They define a line element, $ds_{BC}^2$, as $d\theta^2$ expressed in the appropriate units of statistical deviation (roughly speaking, $d\theta^2$ divided by the minimal error in the estimation of $\theta$). By making use of classical statistical methods (Cramér-Rao bound) they find
$$ds_{BC}^2 = \max_{\{E_b\}} \sum_b \frac{[dp(b|\theta)]^2}{p(b|\theta)}, \tag{37}$$
where $p(b|\theta) = \mathrm{tr}[E_b\, \rho(\theta)]$, and the maximization is over all possible POVM measurements $\{E_b\}$ on a single copy of $\rho(\theta)$. They also succeed in giving a closed expression for $ds_{BC}^2$ and show that their metric coincides up to a factor with that induced by the Bures-Uhlmann distance [35,36]
$$d_{BU}(\rho_0, \rho_1) = \Big\{2\Big[1 - \mathrm{tr}\big|\rho_0^{1/2} \rho_1^{1/2}\big|\Big]\Big\}^{1/2}. \tag{38}$$
More precisely, they show that $ds_{BC}^2 = 4\, ds_{BU}^2$, where [see also (69) below]
$$ds_{BU}^2 = d_{BU}^2(\rho, \rho - d\rho) = 2\Big[1 - \mathrm{tr}\big|\rho^{1/2} (\rho - d\rho)^{1/2}\big|\Big], \tag{39}$$
and a series expansion to $O(d\rho^2)$ is understood in the right hand side of this equation. We note in passing that for commuting states, i.e., classical probability distributions, the Bures-Uhlmann line element $ds_{BU}^2$ coincides with the Fisher metric (36). A quantum metric with such normalization is said to be Fisher adjusted.
Although one can obtain a finite distance d BC (ρ 0 , ρ 1 ) for arbitrary states ρ 0 and ρ 1 by integrating ds BC along geodesics, it is important to notice that the operational meaning of the Braunstein and Caves metric is lost in the process.
In the spirit of Braunstein and Caves' physical approach to metrics, we next consider the distinguishability measures $D_{QC}$ and $D_{CC}$, discussed in Section IV, for infinitesimally close states and derive line elements with the same operational meaning, which we call $ds_{QC}$ and $ds_{CC}$ respectively. For $ds_{QC}$ we also give the volume element and the prior probability distribution, whereas those corresponding to the metric $ds_{CC}$ can be easily found in the literature since, as will be shown, $ds_{CC}^2$ is proportional to the widely-studied Bures metric $ds_{BU}^2$. Before we start we would like to point out that one could also consider line elements induced by other quantities, such as the quantum relative entropy, which, as we saw above, also has a clear operational interpretation. The quantum relative entropy induces the so-called Kubo-Mori metric [37], which has the drawback of being singular for pure states.

A. Quantum Chernoff metric
For neighboring density matrices ρ and ρ − dρ (e.g., those for which their independent matrix elements differ by an infinitesimal amount) the distinguishability measure D(ρ, ρ − dρ) defines a metric, as in (39). For the quantum Chernoff measure, D QC , this metric can be computed from Eq. (27) [38]: where the dots stand for higher order terms in dρ that will not contribute to ds 2 and we have also used that log y = y − 1 + . . .. We now recall the integral representation and its derivative, These representations hold for a > 0 and can be straightforwardly extended to positive matrices. In particular, using (41) and the convergent sequence which also holds for matrices provided a > b, one can write, up to second order in dρ, where c s = π −1 sin(sπ). Inserting this expansion in (40) one finds The first term in the integrand vanishes, as can be seen by using (42) and tr dρ = 0, while the second term can be computed in the eigenbasis {|i } of ρ; ρ = i λ i |i i|: where in the second equality we have taken into account that dρ = dρ † , which enabled us to symmetrize the expression in parenthesis that multiplies | i|dρ|j | 2 in the sum (this symmetrization gives the factor 1/2). The quantum Chernoff metric can be finally written as, The quantum Chernoff metric belongs to the family of contractive quantum metrics, as it should, since by construction the probability of error cannot be improved by a pre-processing of the states. In fact the quantum Chernoff metric coincides with a member of this family that has been explicitly written by Petz in [39] and with the so called Wigner-Yanase metric, which has been recently studied in depth by the authors of [40]. In particular, the geodesic distance, the geodesic path, and the scalar curvature of the quantum Chernoff metric can be read off from their Eqs. (5.1-5.3).
By separating diagonal from off-diagonal terms, the metric in (47) can also be written as Next, we wish to identify the degrees of freedom in the off-diagonal terms. We will see that they correspond to infinitesimal unitary transformations acting on ρ (which leave its eigenvalues unchanged). This is most conveniently done by parameterizing ρ by its eigenvalues and eigenvectors, namely by λ i and the components of |i onto a given canonical basis {|α k }: (naturally, it also holds that U ki = k|U |i ). A neighboring density matrix ρ ′ = i λ ′ i |i ′ i ′ | is thus parameterized by λ ′ i = λ i + dλ i and U ′ ki = U ki + dU ki = α k |i ′ . We further note that |i ′ = (1 1 + δT )|i , where δT is antihermitian, δT † = −δT . It is actually the infinitesimal generator along the direction in parameter space that takes {|i } into {|i ′ }. It follows that dU ki = α k |δT |i . The matrix elements of dρ can be expressed as and those of δT as where we have used (49) in going from the first to the second line [the very same matrix elements of δT can also be written as (dU U † ) ij in the eigenbasis of ρ]. Substituting these relations back into (48) we obtain The same expression can also be derived by differentiating where ρ (0) ≡ i λ i |α i α i | is diagonal in the canonical basis and has the spectrum of ρ.
Eq. (52) displays the metric ds 2 QC in a very suggestive form. Any density matrix can be parameterized by its eigenvalues {λ i } and the unitary matrix U that diagonalizes it. Eq. (52) expresses the infinitesimal distance between two such matrices in terms of these very parameters. The first term is immediately recognized as the (Fisher) metric on the (d − 1)-dimensional simplex of eigenvalues of ρ, which is assumed to be d × d throughout the rest of this section (note that i λ i = 1, which implies i dλ i = 0). Thus, stricto senso, it should be expressed in terms of a set of d − 1 independent eigenvalues. If we choose this set to be {λ i } d−1 i=1 the first term in (52) becomes where the subscript F stands for Fisher, and It follows that the determinant of g F , which we will need below, is The second term in (52) contains the factors |(U † dU ) ij | 2 , which are invariant under leftmultiplication [since the left-hand side of (51) is independent of the choice of basis {|α }]. Hence, the normalized volume element induced by these terms will coincide with the (unique) Haar measure dV H of U (d)/[U (1)] d , known as the flag manifold F l (d) C (see e.g., [41] and references therein). Using the wedge product of differential forms, this Haar measure can be written as where C H is a normalization constant so that dV H = 1.
Note that the one-form basis in (57) contains $2 \times [d(d-1)/2]$ (real and independent) elements, which indeed coincides with the $d^2 - d$ independent parameters of $U(d)/[U(1)]^d$. Volume elements (derived from metrics) are of great interest because they give a canonical way of defining prior probability distributions on continuous sets. According to this approach, Eqs. (52-57) provide a means to define such a probability distribution for general density matrices: if $\theta = (\theta_1, \theta_2, \ldots)$ is a set of independent real parameters that specifies the density matrices as $\rho(\theta)$ and the metric is written as $ds^2 = d\theta\, g\, d\theta^t$ (i.e., $g$ is the metric tensor), then we can define the prior $P[\rho(\theta)]$ through the relation $P[\rho(\theta)] \prod_\alpha d\theta_\alpha = dV / \int dV$, where $dV = \sqrt{\det g}\, \prod_\alpha d\theta_\alpha$. It follows from (52) that $P[\rho(\theta)]$ is the product of two independent probability distributions: one that depends exclusively on the parameters encoded in the unitary matrix $U$ and expresses the fact that they are simply distributed according to the Haar measure $dV_H$; and one, denoted as $P(\{\lambda_i\})$, that gives the probability distribution of eigenvalues. The latter can be written as
$$P(\{\lambda_i\}) = C_d\, \frac{\prod_{i<j} \left(\sqrt{\lambda_i} - \sqrt{\lambda_j}\right)^2}{\sqrt{\lambda_1 \cdots \lambda_d}}, \qquad (58)$$
where for a given dimension $d$ the constant $C_d$ is chosen to ensure that the probability adds up to one.
The prior distribution on the simplex of eigenvalues of ρ for the Bures metric (see below), analogous to $P(\{\lambda_i\})$ in (58), was proposed in [42], but it took considerable effort to compute the right normalization constant. Slater [43] gave values for dimensions d = 3, 4, 5 and finally Sommers and Życzkowski [44] managed to give a general expression for arbitrary finite dimensions. Here we will compute $C_d$ following similar techniques.
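The coefficient $C_d$ is defined by the normalization condition $C_d^{-1} = I(1)$, where
$$I(r) = \int_0^\infty \prod_{i=1}^{d} d\lambda_i\; \delta\Big(r^2 - \sum_i \lambda_i\Big)\, \frac{\prod_{i<j} \left(\sqrt{\lambda_i} - \sqrt{\lambda_j}\right)^2}{\sqrt{\lambda_1 \cdots \lambda_d}}.$$
Although we only need this integral for $r = 1$, the introduction of this radial parameter $r$ enables us to compute the normalization $I(1)$ more easily. We first note that by re-scaling $\lambda_i \to r^2 \lambda_i$ one gets
$$I(r) = r^{d^2-2}\, I(1)$$
[i.e., $I(r)$ is a homogeneous function of $r$ of degree $d^2-2$], and thus
$$\int_0^\infty dr\, r\, e^{-r^2} I(r) = I(1) \int_0^\infty dr\, r^{d^2-1} e^{-r^2} = \frac{1}{2}\, \Gamma(d^2/2)\, I(1).$$
It follows from this equation that
$$I(1) = \frac{1}{\Gamma(d^2/2)} \int_0^\infty \prod_{i=1}^{d} d\lambda_i\; e^{-\sum_i \lambda_i}\, \frac{\prod_{i<j} \left(\sqrt{\lambda_i} - \sqrt{\lambda_j}\right)^2}{\sqrt{\lambda_1 \cdots \lambda_d}}.$$
This expression can be further simplified by the change of variables $\lambda_i \to t_i = \sqrt{\lambda_i}$, which leads to
$$I(1) = \frac{2^d}{\Gamma(d^2/2)} \int_0^\infty \prod_{i=1}^{d} dt_i\; e^{-\sum_i t_i^2} \prod_{i<j} (t_i - t_j)^2. \qquad (63)$$
By expanding the square of the Vandermonde determinant $\prod_{i<j} (t_i - t_j)$, one could in principle compute $C_d$ in terms of Euler gamma functions. However, this is very impractical since the number of terms in such an expansion grows exponentially with $d$. A much more efficient way to proceed is as follows. Let $\{P_k(t) = a_k t^k + a_{k-1} t^{k-1} + \ldots + a_1 t + a_0\}$, $a_k \neq 0$, be a family of orthonormal polynomials on the interval $[0, \infty)$ with a weight function of Hermite type, so that
$$\int_0^\infty dt\, e^{-t^2} P_k(t) P_l(t) = \delta_{kl}. \qquad (64)$$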
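As a rough numerical illustration of the orthonormality condition above, the following sketch builds the first few polynomials $P_k$ by Gram-Schmidt on the monomials, using the moments $\int_0^\infty t^n e^{-t^2} dt = \Gamma((n+1)/2)/2$. The function names (`moment`, `inner`, `orthonormal_polynomials`) are hypothetical, chosen only for this example, and plain floating-point Gram-Schmidt is assumed to be accurate enough for the small degrees considered.

```python
import math

def moment(n):
    """Moment of the weight: integral of t^n * exp(-t^2) over [0, inf) = Gamma((n+1)/2) / 2."""
    return 0.5 * math.gamma(0.5 * (n + 1))

def inner(p, q):
    """Inner product of two polynomials given as coefficient lists [c0, c1, ...]."""
    return sum(pi * qj * moment(i + j)
               for i, pi in enumerate(p)
               for j, qj in enumerate(q))

def orthonormal_polynomials(kmax):
    """Gram-Schmidt on 1, t, ..., t^kmax; returns one coefficient list per degree."""
    polys = []
    for k in range(kmax + 1):
        v = [0.0] * k + [1.0]            # start from the monomial t^k
        for p in polys:                  # subtract the components along lower-degree P_j
            c = inner(v, p)
            for i, pi in enumerate(p):
                v[i] -= c * pi
        norm = math.sqrt(inner(v, v))
        polys.append([vi / norm for vi in v])
    return polys

polys = orthonormal_polynomials(5)
leading = [p[-1] for p in polys]         # leading coefficients a_0, ..., a_5
```

The list `leading` then holds the leading coefficients $a_k$ for $k \le 5$; for larger degrees the monomial basis becomes ill-conditioned and higher-precision arithmetic would probably be needed.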
Note that {P k (t)} are not Hermite polynomials, since the integration range is [0, ∞) instead of (−∞, ∞). Now, if we define the renormalized polynomials Q k (t) ≡ P k (t)/a k it is not hard to show that . (65) Substituting in to (63) and using the orthonormality of P k , one has In contrast to the examples considered in Ref. [44], and as far as we are aware, there is no known closed expression for the leading coefficients a k for the case at hand. However, Eq. (66) provides an efficient way of computing the quantum Chernoff normalization constant C d ; e.g., by applying the Gram-Schmidt orthogonalization algorithm [with the internal product defined in Eq. (64)] one easily obtains the coefficients a k , and thereby C d . We give the value of this constant for d ≤ 6: where ds 2 F is the Fisher metric (36), with p(b) = tr(ρE b ), dp(b) = tr(dρE b ) and s * = 1/2 being the value of s that achieves this minimum in (32). The maximization of (36) over the local measurements {E b } M b=1 , which commutes with the minimization over s as long as p(b) = 0, 1, results in [34] or equivalently, where we use the same notation as in (47) and (52), respectively. This is the Bures-Uhlmann metric, which, as mentioned above, can be also obtained from the Bures distance (38) [45]. From (68) we then have for strictly mixed states (the last equality holds to order dρ 2 ). The corresponding prior probability distribution (quantum Jeffreys prior) was derived and calculated in [42,43,44]. If one of the states is pure (say ρ 0 , as in previous sections) then the classical distribution p(b) becomes degenerate [p(0) = 1] for the optimal choice E 0 = ρ 0 (recall the last comments in Sec. IV C), and the previous derivation does not hold. In this case, the optimal choice of s in (1) is obtained by taking the limit s → 0, as we already discussed in Sec. IV C. 
Recalling the first equality in (35), we obtain D CC (ρ, ρ − dρ) = − log[p(0) − dp(0)] = dp(0) [note that dp(0) ≥ 0 since 1 ≥ p(0) − dp(0) = 1 − dp(0)], which is linear in dp(b) and therefore does not define a proper metric in probability space. From the results of Sec. IV C we also know that if one of the states is pure then D CC (ρ 0 , ρ 1 ) = − log F (ρ 0 , ρ 1 ) and therefore for pure states. This agrees with the previous discussion since dp(0) = 1 − F (ρ, ρ − dρ) if ρ is a pure state. Eq. (72) has to be taken with special care. It gives a valid metric for the set of pure states (which only includes variations in the unitary parameters), i.e., when ρ−dρ is also a pure state (ρ−dρ = U ρU † ). Moreover, for pure states ds 2 CC coincides with the Fubini-Study metric [recall that the Bures-Uhlmann metric is Fubini-Study adjusted [44], hence this statement follows from Eq. (72)].
By combining Eqs. (71) and (72), we see that $ds^2_{CC}$ shows a discontinuity when the mixed state ρ approaches the set of pure states. The quantum Chernoff metric (47) does not have this pathology. This can be seen by comparing the $i < j$ ($d\lambda_i = 0$) terms in (52) with those in (70) (the diagonal terms $i = j$ coincide). As $\lambda_j \to \delta_{1j}$ (ρ approaches a pure state), we readily see that $ds^2_{QC} \to ds^2_{BU}$. In the opposite situation, when ρ approaches the completely mixed state $\mathbb{1}/d$, we can write $\lambda_i = 1/d + \epsilon_i$, where $\epsilon_i$ approaches zero. Expanding the $i < j$ terms in both (52) and (70) we can check that $ds^2_{QC} = \frac{1}{2} ds^2_{BU}$ up to terms of order $\epsilon^3$. We conclude that the quantum Chernoff metric smoothly interpolates between the two components (that on strictly mixed states and that on pure states) of the local metric $ds^2_{CC}$. We will come back to this point in the next section, where qubit states are discussed as an example to illustrate the results in this and in previous sections.
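The two limits just described can be checked numerically with a small sketch. It assumes (this is an assumption of the example, restated from the standard eigenbasis forms of these metrics) that the $i<j$ terms of $ds^2_{QC}$ and $ds^2_{BU}$ multiply the same factor $|(U^\dagger dU)_{ij}|^2$ with weights $(\sqrt{\lambda_i}-\sqrt{\lambda_j})^2$ and $(\lambda_i-\lambda_j)^2/(\lambda_i+\lambda_j)$ respectively. The helper names `w_qc` and `w_bu` are hypothetical.

```python
import math

def w_qc(li, lj):
    """Assumed off-diagonal weight of the quantum Chernoff metric."""
    return (math.sqrt(li) - math.sqrt(lj)) ** 2

def w_bu(li, lj):
    """Assumed off-diagonal weight of the Bures-Uhlmann metric."""
    return (li - lj) ** 2 / (li + lj)

# Near a pure state: one eigenvalue close to 1, the other close to 0.
tiny = 1e-8
ratio_near_pure = w_qc(1 - tiny, tiny) / w_bu(1 - tiny, tiny)

# Near the completely mixed state in dimension d: eigenvalues 1/d +/- eps.
d, eps = 3, 1e-4
ratio_near_mixed = w_qc(1 / d + eps, 1 / d - eps) / w_bu(1 / d + eps, 1 / d - eps)
```

Under these assumptions `ratio_near_pure` tends to 1 and `ratio_near_mixed` tends to 1/2, in line with the interpolation between $ds^2_{BU}$ and $\frac{1}{2} ds^2_{BU}$ discussed above.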

VI. QUBIT STATES
In this section we apply our results to qubit mixed states, that is, general two-dimensional states. We will first study the distinguishability measures D QC and D CC and then move on to the corresponding metrics and priors.
For qubits one has $\rho_i = (\mathbb{1} + \vec{r}_i \cdot \vec{\sigma})/2$, $i = 0, 1$, where $\vec{r}_i$ is the Bloch vector of $\rho_i$, of length $r_i \le 1$. Writing $\lambda_{i\pm} = (1 \pm r_i)/2$ for the eigenvalues of $\rho_i$, one finds
$$Q_s = \frac{1}{2}\Big[ \big(\lambda_{0+}^{s} + \lambda_{0-}^{s}\big)\big(\lambda_{1+}^{1-s} + \lambda_{1-}^{1-s}\big) + \cos\theta\, \big(\lambda_{0+}^{s} - \lambda_{0-}^{s}\big)\big(\lambda_{1+}^{1-s} - \lambda_{1-}^{1-s}\big) \Big],$$
where θ is the angle between $\vec{r}_0$ and $\vec{r}_1$. The value of $s$ that minimizes $Q_s$ and hence gives (14) and (27) is in general a function of $r_i$ and θ. However, one can check that in the particular case $r_0 = r = r_1$ the minimum is at $s^* = 1/2$ (see footnote 5). In Fig. 2 we plot the quantum Chernoff distinguishability measure $D_{QC}(\rho_0, \rho_1)$ and the measure based on local measurements $D_{CC}(\rho_0, \rho_1)$ together with the bounds (28) provided by the fidelity, for states of equal purity $r_0 = r_1 = r$ and for θ = π/2. Notice that in general local measurements perform much worse than the collective ones and $D_{CC}(\rho_0, \rho_1)$ runs remarkably close to (actually, coincides with) the fidelity lower bound (28) for most values of $r$. However, as it approaches the pure-state regime ($r \to 1$) it rapidly increases towards its upper bound. The reason for this rapid change can be understood by recalling the unanimity vote protocol discussed in Sec. IV C. For two pure states, $\rho_i = |\psi_i\rangle\langle\psi_i|$ (as corresponds to $r = 1$), it boils down to [30] projecting along one of the states, say $|\psi_0\rangle$, and its orthogonal, $|\psi_0^\perp\rangle$. After performing this measurement on each of the $N$ copies, if all of them project on $|\psi_0\rangle$, one claims that the unknown state is $|\psi_0\rangle$ (hypothesis $H_0$). However, if at least one of them projects on $|\psi_0^\perp\rangle$ the guess is $|\psi_1\rangle$ (one accepts $H_1$). This corresponds to ξ = 1 in (7). For pure states this protocol reaches the joint-measurement Chernoff bound by making use of a much less demanding local-measurement protocol (see also [30,46] for the optimal local strategy for finite $N$).
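The statement that the minimum sits at $s^* = 1/2$ for equal purities can be probed with a short numerical sketch. It assumes $Q_s = \mathrm{tr}(\rho_0^s \rho_1^{1-s})$ and uses the decomposition $\rho^x = a\,\mathbb{1} + b\,\hat{n}\cdot\vec{\sigma}$ with $a, b = (\lambda_+^x \pm \lambda_-^x)/2$; the function names and the simple grid search over $s$ are illustrative choices, not the procedure used for Fig. 2.

```python
import math

def q_s(s, r0, r1, theta):
    """tr(rho0^s rho1^(1-s)) for qubits with Bloch lengths r0, r1 and relative angle theta."""
    def parts(r, x):
        lp, lm = (1 + r) / 2, (1 - r) / 2      # eigenvalues (1 +/- r)/2
        return (lp ** x + lm ** x) / 2, (lp ** x - lm ** x) / 2
    a0, b0 = parts(r0, s)
    a1, b1 = parts(r1, 1 - s)
    # tr[(a0 + b0 n0.sigma)(a1 + b1 n1.sigma)] = 2 a0 a1 + 2 b0 b1 cos(theta)
    return 2 * (a0 * a1 + b0 * b1 * math.cos(theta))

def minimize_s(r0, r1, theta, steps=2000):
    """Grid search over the open interval 0 < s < 1 (endpoints excluded on purpose)."""
    grid = [k / steps for k in range(1, steps)]
    return min(grid, key=lambda s: q_s(s, r0, r1, theta))

s_star = minimize_s(0.7, 0.7, math.pi / 2)
d_qc = -math.log(q_s(s_star, 0.7, 0.7, math.pi / 2))
```

For unequal purities, for instance `minimize_s(0.9, 0.3, math.pi / 2)`, the same search can be used to see how the minimizer moves away from 1/2.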
In contrast, near the completely mixed state $\mathbb{1}/2$, for low $r$, the optimal local strategy consists in choosing the measurement $\{E_0, E_1\}$ such that $p = p_0(0) = \mathrm{tr}(\rho_0 E_0) = \mathrm{tr}(\rho_1 E_1) = p_1(1) = q$, with $p > 1/2$. In this case, the acceptance of either $H_0$ or $H_1$ is done on the basis of a majority vote protocol: $H_0$ is accepted if the outcome 0 occurs more times than the outcome 1 does, i.e., $N_0 = N/2$ [see also Eq. (7)]. It follows from (4) that $s^* = 1/2$. Therefore, the lower bound provided by the fidelity, Eq. (28), is saturated [$s = s^* = 1/2$ saturates the second inequality in (33) and thus it also saturates (34)]. This protocol is optimal up to a given value of the purity, i.e., for $r \le r^*(\theta)$. For larger values of $r$ the 'voting rule' (given by ξ) starts changing and so does $s^*$. Accordingly, $D_{CC}(\rho_0, \rho_1)$ moves away from its lower bound to end up saturating its upper bound at $r = 1$. We next consider the metrics induced by local and by joint measures. The former, in particular, requires special attention because of the abrupt behavior of $D_{CC}(\rho_0, \rho_1)$ near the set of pure states. Indeed, the critical value $r^*(\theta)$, beyond which majority vote is no longer optimal, goes to one as the relative angle θ between the Bloch vectors of the states becomes smaller; $r^*(\theta) \to 1$ as $\theta \to 0$. As a result, the sudden increase of $D_{CC}(\rho_0, \rho_1)$ develops into a jump discontinuity at $r = 1$. For this reason, when defining the corresponding metric we have to distinguish these two regions: the set of strictly mixed states ($r < 1$) and the set of pure states ($r = 1$).
5 Qubit states are an example for which the doubly stochastic matrix $D_{ij} = |\langle i|U|j\rangle|^2$ is symmetric ($D_{ij} = D_{ji}$). Therefore, for isospectral states, $Q_s(\rho, U\rho U^\dagger) = \sum_{ij} \lambda_i^s \lambda_j^{1-s} D_{ij} = \frac{1}{2}\sum_{ij} (\lambda_i^s \lambda_j^{1-s} + \lambda_j^s \lambda_i^{1-s}) D_{ij}$, which has its minimum at $s^* = 1/2$.
In the region $r < 1$ the outcome probabilities will never be degenerate and the metric reduces to the Fisher metric, which upon optimization over local measurements coincides with one-half the Bures metric:
$$ds^2_{CC} = \frac{1}{2}\, ds^2_{BU} = \frac{1}{8}\left( \frac{dr^2}{1 - r^2} + r^2\, d\Omega^2 \right), \qquad (74)$$
where $d\Omega^2 = d\theta^2 + \sin^2\theta\, d\phi^2$ is the usual metric on the 2-sphere. In the region $r = 1$ (pure states), the aforementioned unanimity vote protocol is optimal and the resulting metric is
$$ds^2_{CC} = ds^2_{FS} = \frac{1}{4}\, d\Omega^2, \qquad (75)$$
where $ds^2_{FS}$ is the well-known Fubini-Study metric, which, as mentioned above, also coincides with the Bures metric $ds^2_{BU}$ in the limiting case $r \to 1$. We notice again that $ds^2_{CC}$ in Eq. (75) is a factor 2 larger than $\lim_{r\to 1} ds^2_{CC}$ in Eq. (74), where the limit is taken along the lines $dr = 0$. The local distinguishability measure thus induces a discontinuous metric or, phrased in a different way, two different metrics for pure states or for strictly mixed states.
This can be visualized using the Uhlmann representation, that is, by embedding the Bloch sphere $r \le 1$ in $\mathbb{R}^4$. To this end, one simply needs to define the new coordinate as $t = \cos\tau$, where $\sin\tau \equiv r$. In spherical coordinates one has
$$ds^2_{CC} = \begin{cases} \frac{1}{8}\left( d\tau^2 + \sin^2\tau\, d\Omega^2 \right), & r < 1, \\ \frac{1}{4}\, d\Omega^2, & r = 1, \end{cases}$$
where the first line corresponds to strictly mixed states and the second to pure states. We note that in the second (first) line $ds^2_{CC}$ is nothing but the standard metric on a 2-sphere (the top half of a 3-sphere) of radius $2^{-1}$ ($2^{-3/2}$). In Fig. 3, A and B represent (the slice $z = 0$ of) these two manifolds. One readily sees that the radius of B (pure states) is a factor $\sqrt{2}$ larger than that of the limiting circle of A (for $r \to 1 \Leftrightarrow t \to 0$, i.e., $\tau \to \pi/2$).
The quantum Chernoff (collective-measurement based) metric can be readily obtained from (27) [or (52) particularized to qubit mixed states]:
$$ds^2_{QC} = \frac{1}{8}\, \frac{dr^2}{1 - r^2} + \frac{1}{4}\left( 1 - \sqrt{1 - r^2} \right) d\Omega^2.$$
This metric quantifies distinguishability of qubit states in a precise and operational way, and encapsulates the full power of quantum mechanics. It approaches the Fubini-Study metric $ds^2_{FS}$ for pure states and also $ds^2_{CC}$ for very mixed states, i.e., for small $r$. The metric smoothly interpolates between the two regimes. By defining $r \equiv \sin 2\tau$ with $0 \le \tau \le \pi/4$ we obtain again the standard metric on a 3-sphere but this time of radius $1/\sqrt{2}$:
$$ds^2_{QC} = \frac{1}{2}\left( d\tau^2 + \sin^2\tau\, d\Omega^2 \right).$$
The corresponding manifold is denoted by C in Fig. 3. Geometrically, the space of states endowed with the quantum Chernoff metric $ds^2_{QC}$ is a spherical cap defined by $0 \le \tau \le \pi/4$ whose radius is twice that of the Bures-like hemisphere A. In order to emphasize that the two metrics, $ds^2_{CC}$ and $ds^2_{QC}$, are equal up to order $r^3$ at $\tau \approx 0$, i.e., $r \approx 0$ (near $\mathbb{1}/2$), in the figure we have shifted the center of the larger sphere so as to make the two manifolds tangent at $\tau = 0$. The fact that $ds^2_{CC} = \frac{1}{2} ds^2_{BU} = ds^2_{QC} + O(r^4)$ is a particular example of a general relation that we discussed at the end of Sec. V B.
From the quantum Chernoff metric one can obtain a proper finite distance (satisfying the triangle inequality) by, for example, computing the geodesic distance,
$$d_{QC}(\rho_0, \rho_1) = \frac{1}{\sqrt{2}} \arccos\left( \cos\tau_0 \cos\tau_1 + \sin\tau_0 \sin\tau_1 \cos\theta \right),$$
where $r_i \equiv \sin 2\tau_i$ and θ is the relative angle between the respective Bloch vectors. The volume element and the prior distribution of density matrices for qubit mixed states, which we here denote as $P[\rho(\vec{r})]$, can be easily obtained from the above metrics. According to the local and quantum Chernoff metrics we have respectively:
$$P_{CC}[\rho(\vec{r})] = \frac{r^2 \sin\theta}{\pi^2 \sqrt{1 - r^2}}, \qquad P_{QC}[\rho(\vec{r})] = \frac{\left( 1 - \sqrt{1 - r^2} \right) \sin\theta}{2\pi(\pi - 2)\sqrt{1 - r^2}},$$
where it is understood that $r$ and θ are the length and the polar angle of the Bloch vector of ρ. Since the Haar volume density on the 2-sphere is $\sin\theta/(4\pi)$, we see that the eigenvalues of ρ, $\lambda_\pm = (1 \pm r)/2$, are distributed according to
$$P_{CC}(r) = \frac{4 r^2}{\pi \sqrt{1 - r^2}}, \qquad P_{QC}(r) = \frac{2\left( 1 - \sqrt{1 - r^2} \right)}{(\pi - 2)\sqrt{1 - r^2}}.$$
(One can check that the latter agrees with our results in Sec. V.) These priors have recently been used in [47] to assess the accuracy of different quantum tomographic measurements.

VII. GAUSSIAN STATES
We now illustrate our results with infinite-dimensional systems. In particular we will focus on the family of single-mode Gaussian states. This is a very significant class of quantum states mainly for two reasons. First, it has a very simple mathematical characterization that allows for the derivation of otherwise highly non-trivial results, and, second, it describes accurately states of light that are realized with current technology. In the following we show that the quantum Chernoff information, besides being the natural distinguishability measure, has the advantage of being relatively easy to compute. The calculation of the fidelity, for instance, is much more involved, as is apparent from [48,49,50,51,52], where one can find such calculations for different classes of Gaussian states.
Gaussian states are by definition those that have a gaussian characteristic function. The (symmetrically ordered) characteristic function of one such state, ρ, is: where t denotes transposition, σ is the symplectic matrix and D(u) = exp[i(u 2q − u 1p )] is the displacement operator, with u = (u 1 , u 2 ) t and with position and momentum operators satisfying [q,p] = i. The annihilation and creation operators, defined as a = (q + ip)/ √ 2 and a † = (q − ip)/ √ 2, fulfil the canonical commutation relations. The positivity of ρ implies that the 2 × 2 covariance matrix Γ is real-symmetric and satisfies Γ + iσ ≥ 0. A symplectic transformation is a linear transformation S t (q,p) that preserves the commutation relations, or more succinctly SσS t = σ. Under such a transformation the displacement vector ξ = (q, p) t and the covariance matrix transform asξ = Sξ asΓ = S ΓS t respectively.
An equivalent, more physical, definition can be given by the action of the squeezing operator S(r, φ) = exp[ r 2 (e −i2φ a 2 − e i2φ (a † ) 2 )] and the displacement operator D(u) defined above, on a thermal state ρ β = (1−e −β ) n e −βn |n n|, where the Fock states |n satisfy a † a|n = n|n : The covariance matrix of a thermal state is simply Γ β = γ β 1 1, with γ −1 β = tanh(β/2). The squeezing operator S(r, φ) induces the symplectic transformation and the latter corresponds to a rotation in phase-space, i.e. to the unitary operation O(φ) = exp[iφ a † a]. One thus finds that the covariance matrix can be written as Γ = γ β S r,φ S t r,φ . In order to calculate the Chernoff bound it is sufficient to realize that any power ρ s of any Gaussian state ρ is also a Gaussian (unnormalized) state with a rescaled temperature: where we have used the relation with N β,s = (1 − e −β ) s /(1 − e −βs ). Recall now that given any two gaussian states ρ A and ρ B , one can write the inner product trρ A ρ B in terms of their displacement vectors and covariance matrices as: where δ = ξ A − ξ B . Using this equation we find that the quantum Chernoff bound (14) is Q = min s Q s with whereΓ i = γ sβi S ri,φi S t ri,φi , i = 0, 1, and δ = ξ 0 − ξ 1 . To simplify the notation we will denote the covariance matrix of the Gaussian state with β = 0 as A = S r,φ S t r,φ .
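The rescaled-temperature relation for powers of a thermal state, $\rho_\beta^s = N_{\beta,s}\, \rho_{s\beta}$ with $N_{\beta,s} = (1-e^{-\beta})^s/(1-e^{-\beta s})$, can be checked on the Fock-basis populations. The sketch below is only illustrative: it truncates the spectrum at a finite photon number (`nmax`, an assumption of the example) and the function names are hypothetical.

```python
import math

def thermal_probs(beta, nmax=200):
    """Populations (1 - e^{-beta}) e^{-beta n} of a thermal state, truncated at nmax levels."""
    return [(1 - math.exp(-beta)) * math.exp(-beta * n) for n in range(nmax)]

def n_factor(beta, s):
    """Normalization N_{beta,s} = (1 - e^{-beta})^s / (1 - e^{-beta s})."""
    return (1 - math.exp(-beta)) ** s / (1 - math.exp(-beta * s))

beta, s = 0.8, 0.3
lhs = [p ** s for p in thermal_probs(beta)]                       # spectrum of rho_beta^s
rhs = [n_factor(beta, s) * p for p in thermal_probs(s * beta)]    # N_{beta,s} * rho_{s beta}
max_diff = max(abs(a - b) for a, b in zip(lhs, rhs))
trace_of_power = sum(lhs)                                         # approximately N_{beta,s}
```

Here `max_diff` should vanish up to rounding, and `trace_of_power` should approach `n_factor(beta, s)` as the truncation is lifted.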

A. States with equal covariance matrices
If two general Gaussian states ρ 0 and ρ 1 are identical modulo a relative displacement δ, i.e. ρ 1 = D(δ)ρ 0 D(δ) † we find that where in the first equality we used the fact that the factor multiplying the exponential in (92) must be equal to one, since it is independent of δ and for δ = 0 one must have ρ 0 = ρ 1 , which implies that Q s = 1. That is, where we have used that symplectic transformations have unit determinant, i.e., det A = det(SS t ) = 1. One readily sees that Q s , Eq. (92), attains its minimum at s * = 1/2, hence we find that in this case the Chernoff measure is: where θ is the relative angle between the squeezing axis and the displacement vector, i.e., if δ = O ϕ (|δ|, 0) t then θ = ϕ − φ.

B. States with the same temperature
We can generalize the previous result to states that have the same spectra, i.e., the same temperature (β 0 = β 1 = β). In this case we can use (93) to find The determinant can be explicitly written in a compact form as where we have defined With this generality s * , the optimal value of s, is a complicated function of the states' parameters 6 . In the case of δ = 0, i.e., states with no relative displacement and the same temperature, the minimization over s can be done analytically, and one finds s * = 1/2. The quantum Chernoff measure becomes: = cosh 2 (r 0 − r 1 ) + sin 2 (φ 0 − φ 1 ) sinh 2r 0 sinh 2r 1 −1/2 .
Notice that this expression is independent of the temperature (or purity) of the states. That is, the distinguishability of two arbitrary Gaussian states with no relative displacement and equal temperature is independent of the degree of mixedness of the states.
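A small sketch can make the structure of the closed-form expression above more tangible. It reads the bracket raised to the power $-1/2$ as the value of $Q = \min_s Q_s$ for equal temperatures and no relative displacement, which is an assumption of this example, as is the function name `q_squeezed`. Note that the temperature does not appear among the arguments at all.

```python
import math

def q_squeezed(r0, phi0, r1, phi1):
    """[cosh^2(r0 - r1) + sin^2(phi0 - phi1) sinh(2 r0) sinh(2 r1)]^(-1/2)."""
    bracket = (math.cosh(r0 - r1) ** 2
               + math.sin(phi0 - phi1) ** 2 * math.sinh(2 * r0) * math.sinh(2 * r1))
    return bracket ** -0.5

same_state = q_squeezed(0.7, 0.2, 0.7, 0.2)             # identical squeezing parameters
aligned = q_squeezed(0.9, 0.0, 0.4, 0.0)                # same squeezing axis
orthogonal = q_squeezed(0.9, 0.0, 0.4, math.pi / 2)     # squeezing axes at a right angle
```

With aligned axes the expression reduces to $1/\cosh(r_0 - r_1)$, while for $\phi_0 - \phi_1 = \pi/2$ the identity $\cosh^2(a-b) + \sinh 2a\, \sinh 2b = \cosh^2(a+b)$ turns it into $1/\cosh(r_0 + r_1)$.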

C. Chernoff metric for Gaussian states
Following the definition (40) and using the previous results we find that the Chernoff metric is
$$ds^2_{QC} = \frac{d\beta^2}{32 \sinh^2(\beta/2)} + \frac{dr^2 + \sinh^2(2r)\, d\phi^2}{2} + \frac{\tanh(\beta/4)}{2}\left( e^{-2r}\, dq_\phi^2 + e^{2r}\, dp_\phi^2 \right), \qquad (100)$$
where we have defined the rotated displacement variables $(q_\phi, p_\phi) = (q, p)\, O_\phi$ and we have used that for infinitesimal changes $s^* = 1/2$. We find again that the metric is independent of the temperature under variations of the squeezing parameters $r$ and φ. The (unnormalized) quantum Jeffreys prior can be obtained from the metric tensor:
$$P_{QC}(\rho) \propto \sqrt{|\det g|} = \frac{1}{16\sqrt{2}}\, \frac{\tanh(\beta/4)}{\sinh(\beta/2)}\, \sinh 2r. \qquad (101)$$
6 In contrast to the claims in Exercise 3.9, page 77 of [17], it is not generally the case that for states with equal spectra the minimum of $Q_s$ is reached for $s^* = 1/2$.
We note that $ds^2_{CC} \to \frac{1}{2} ds^2_{QC}$ as ρ approaches the set of pure states ($\beta \to \infty$) along the lines $d\beta = 0$, in agreement with the general statement at the end of Sec. V B. In the limit of very mixed states ($\beta \approx 0$) the quantum Chernoff and local metric coincide up to first order in β. In this limit of high temperatures ($\beta \approx 0$, highly mixed states) the quantum Chernoff metric and Jeffreys prior agree with those derived from the Bures distance (modulo the omnipresent factor 1/2). In particular this implies that the analysis in [54] of the Bures volume element in this high temperature regime also applies here.

VIII. SUMMARY AND CONCLUSIONS
We have analyzed quantum state discrimination (symmetric hypothesis testing) and the classical and quantum Chernoff bound focussing on the link between them and the concept of measures (distances) and metrics on the space of quantum states. More precisely, we have been concerned with defining measures and metrics that have a clear operational meaning, so that they can as a matter of principle be obtained from experiments. The error probability in state discrimination, or rather its asymptotic rate exponent (error exponent), has been shown to provide the natural link. Thus, the concept of distinguishability measure has emerged and has been analyzed in depth throughout the central part of this work. Before doing so, we have reviewed the methods and the main results of classical and quantum hypothesis testing in the first three sections of the paper. Qubit and Gaussian states have provided two excellent, very relevant examples to illustrate our results in the last sections.
Our main points and results are summarized as follows: The quantum Chernoff bound gives an upper bound to the error probability in state discrimination. When the unknown state (which we are asked to identify as either one or the other of two known states) is a tensor product, corresponding to many identical copies, the quantum Chernoff information (which is essentially the log of the quantum Chernoff bound) gives the error exponent of the optimal discrimination protocol. We propose this quantity as a distinguishability measure for general mixed states. We show that the quantum Chernoff measure is, in general, not attainable by protocols that use local fixed measurements (those for which the same measurement is performed on each of the individual copies). Given the practical relevance of these types of protocols (they can be realized with current technology), we define a local distinguishability measure as the error exponent of the best such protocol and present its main features. We derive the metrics induced by these measures and their corresponding volume elements. The latter provide a means to define operational prior probability distributions of density matrices. We derive them for general matrices of arbitrary dimension.
Examples of all the above are given in the last part of the paper. For qubit and Gaussian states, we give explicit formulas for the distinguishability measures and their corresponding metrics and volume elements. We give a geometrical picture of the space of qubit states based on those metrics. This space can be viewed as a spherical cap, similar to the Uhlmann hemisphere, with the pure states sitting on the rim. These examples also illustrate the fact that the quantum Chernoff measure, besides being the most natural distance between general states, is conveniently easy to compute relative to other distances, such as the widely used fidelity.

IX. ACKNOWLEDGMENTS
We are grateful to Montserrat Casas, Juli Céspedes, Alex Monràs, Sandu Popescu and Andreas Winter for discussions. We are especially grateful to Koenraad Audenaert and Frank Verstraete for their collaboration at the early stages of this work. We acknowledge financial support from the Spanish MEC, through the Ramón y Cajal program (JC), the travel grant PR2007-0204 (EB), contracts FIS2005-01369, FIS2004-05639 (AC) and project QOIT (Consolider-Ingenio 2010), from the Generalitat de Catalunya, contract CIRIT SGR-00185 and from EU QAP project (AC).