The $\eta^\prime$ transition form factor from space- and time-like experimental data

The $\eta^\prime$ transition form factor is reanalyzed in view of the recent BESIII first observation of the Dalitz decay $\eta^\prime\to\gamma e^+e^-$ in both space- and time-like regions at low and intermediate energies using the Pad\'e approximants method. The present analysis provides a suitable parameterization for reproducing the measured form factor in the whole energy region and allows to extract the corresponding low-energy parameters together with a prediction of its values at the origin, related to $\Gamma_{\eta^\prime\to\gamma\gamma}$, and the asymptotic limit. The $\eta$-$\eta^\prime$ mixing is reassessed within a mixing scheme compatible with the large-$N_c$ chiral perturbation theory at next-to-leading order, with particular attention to the OZI-rule--violating parameters. The $J/\psi$, $Z\to\eta^{(\prime)}\gamma$ decays are also considered and predictions reported.


I. INTRODUCTION
Padé approximants (PAs) have been shown recently to be very useful for describing meson transition form factors from the analysis of space-like (SL) experimental data [1][2][3][4][5]  1 .Such parameterizations based on the measurement of SL data have been used to extrapolate our knowledge of the form factors down to the low-energy limit (Q 2 → 0), thus extracting the low-energy parameters (LEPs), and up to the highenergy limit (Q 2 → ∞), then predicting the asymptotic behavior.Moreover, they have been employed to reconstruct the double-virtual transition form factor [8,10].PAs are now regarded as a valuable tool for incorporating available data into problems requiring a precise error estimation.They conform a data-driven approach that can be considered as simple, systematic and model independent, the latter because one can provide a systematic error which can be reduced as soon as more experimental data is included.These PAs applied to the pseudoscalar transition form factors (TFFs) are utilized in the evaluation of the lightest pseudoscalar mesons contributions to the hadronic light-by-light piece of the anomalous magnetic moment of the muon [1,2,4,5,7], the calculation of the π 0 → e + e − [8] and η, η → + − rare decays [9], the extraction of the η-η mixing parameters [2,11], the analysis of π 0 , η and η single and double Dalitz decays [10], and in the quest for dark photons [12].In all cases, they provide an excellent laboratory for synergic studies between theory and experiment.
The PAs P L M (Q 2 ) to a given function f (Q 2 ) are ratios of two polynomials (with degree L and M, respectively), constructed such that their Taylor expansion around the origin exactly co-incides with that of the function up to the highest possible order, i.e., f [13,14].They often provide a means of obtaining information about the function outside its circle of convergence, and more rapidly evaluating the function within it.However, in spite of being flexible and user friendly, PAs reconstructed from their power series at the origin are rational functions with a simple analytical structure given by a set of poles.Therefore, they do not possess branch cuts and cannot be used to predict the position of resonance poles, which are hidden in the second Riemann sheet of the complex energy plane.Similarly, PAs reconstructed using information on the branch cut, which allow for a precise determination of the resonance pole parameters [15][16][17], are not suitable for the extraction of the LEPs, i.e., PAs in their simplest form considered here cannot access different Riemann sheets.Nonetheless, for special kind of functions, such as Stieltjes [18,19] or non-Stieltjes but meromorphic functions [20], convergence theorems for PAs are known.To apply these theorems, an understanding of the analytical properties of the functions is required in advanced.When this knowledge is missing, the practitioner would explore a sequence of PAs and expect a pattern of convergence.Even when convergence is guaranteed in advanced, this will be restricted by the limits of the theorem's applicability.The question is whether observing convergence beyond these limits one could infer, within some uncertainties, the approximate analytical structure of the function under consideration.
In this work, we will explore this last insight taking the η TFF as a proof of concept.Within certain approximations, the authors of Ref. [21] proved the TFF to be a Stieltjes function for which the PA convergence is guaranteed in the SL and the time like (TL) below the production threshold [13].Not only that, but its rate of convergence is also known [13,18,19].Nevertheless, convergence along the branch cut is not a priori ensured by the mathematical theorem.As we will see later, the particularities of the TFF along the branch cut will decide on the success of our approximation.
We will try to learn and extract from the employed sequence of PAs details on the analytical properties of this TFF in the arXiv:1512.07520v2[hep-ph] 15 Sep 2016 energy regime covered by experimental data.In our previous analyses of the TFFs from SL data, we have always carefully expressed the limits on the range of applicability of PAs [1,2].Initially, PAs could be analytically continued from the SL region to the TL one, but only up to the first singularity, usually a branch cut in the form of a production threshold.For instance, in the case of the single Dalitz decay π 0 → e + e − γ PAs can be safely extended into the TL region up to the pion mass since no branch cuts are present.On the contrary, for the η → + − γ decays, with = e, µ, the presence of the ππ branch cut could in principle limit the application of PAs in the TL region.However, the η → e + e − γ decay and its associated TFF in the TL region was recently measured with great accuracy by the A2 Collaboration [22].The authors compared their measurement with several theoretical predictions, among them ours, based on SL parameterizations of the TFF in terms of PAs [2], and found that these PAs show the best agreement with data for the full range of e + e − invariant masses reached in the experiment.This nice result challenged our understanding of the PAs method and triggered, for the first time, a combined analysis of the η TFF from both SL and TL experimental data [11].The reason for that agreement can be understood by the fact that the branch cut in this decay (ππ unitary cut) is not resonant inside the available phase-space region since the ρ resonance is well beyond the η mass.The PAs will fail for sure at the first pole encountered on the real axis, or, to be more precise, will start failing at some point near the pole 2 .In any case, for the η TFF, this pole on the real axis is found to be at √ s 720 MeV for the single-pole parameterization used frequently by the experimental collaborations [11].Therefore, for the η → + − γ decays the PAs can also be extended into the TL region up to the η mass with excellent accuracy.
The case of the η → + − γ Dalitz decays is more cumbersome since the available phase space this time includes the resonant region.However, the analysis performed in [2] on the η TFF using only SL data revealed that the pole on the real axis for the single-pole parameterization is located at √ s 830 MeV.In order to estimate the region of influence of this pole one can make use of the half-width rule [23].In this case, the ρ and ω resonances are within the phase-space region and the φ is not far from its end point.Taking the values of their masses and widths from the PDG [24], the application of this rule gives M eff ± Γ eff /2 = 822 ± 58 MeV [2].This value of the effective pole is compatible with the result obtained before from the single-pole parameterization, thus showing that the pole found at 830 MeV is somewhat a kind of weighted average of the three existing resonance poles.The range given by the half-width rule above implies that the region of influence of the former pole is from 764 MeV to 880 MeV.Consequently, for the η TFF the PAs can also be used in a safe manner up to around 760 MeV in the TL region 3 . 2The question is how close the PAs can approach the pole without failing.A detailed discussion on this issue for the case of η Dalitz decays can be obtained from [10]. 3 In [10], we were more conservative and the half-width rule was applied taking only into account the ρ resonance.As a result, we obtained that the lowest value of the region of influence in that case was around 700 MeV.√ s) fit to space-like data performed in [2].Experimental points are from the BESIII measurement in [25].
Recently, the BESIII Collaboration reported the first measurement of the e + e − invariant-mass distribution for the η → e + e − γ decay up to 750 MeV [25].As discussed, our prediction for the TL region of the η TFF based solely on SL data should be able to describe this new measurement.In Figure 1, the BESIII experimental extraction of the modulus square of the η TFF as a function of the e + e − invariant-mass ( √ s) is compared with our theoretical prediction.It is worth remarking that this is not a fit but a prediction and the agreement is seen to be excellent.
The main purpose of the present work is therefore to further improve our determination of the η TFF taking into consideration not only the existing SL experimental data but also the new set of TL data from the recent BESIII measurement.This combined analysis will allow us to better determine the LEPs of the TFF, its normalization and the asymptotic limit.Such an enhancement permits to reconsider the η-η mixing, with special emphasis on the OZI-rule-violating parameters and the J/ψ(Z) → η ( ) γ decays.In Section II, we comment on the reasons we believe justify the success of PAs when applied to the TL region.In Section III, we include the TL data in the analysis, present the new results and comment the improvements achieved.Section IV is devoted to the reassessment of the η-η mixing parameters within a mixing scheme compatible with the large-N c chiral perturbation theory at nextto-leading order, with particular attention to the OZI-ruleviolating parameters.The consequences of these results for the J/ψ and Z radiative decays are also investigated.Finally, in Section V, we conclude and mention the future prospects.

II. PADÉ APPROXIMANTS IN THE TIME-LIKE REGION
Within certain approximations, the authors of Ref. [21] proved the isovector contribution to the TFF to be a Stieltjes function, which is, that this contribution to the TFF can be  represented by an integral form defined as [13] f (q 2 ) = where φ (u) is any bounded and nondecreasing function [13].
, and making the change of variables u = 1/s, Eq. ( 1) returns the once-subracted dispersive representation of the isovector contribution discussed in Ref. [21], and also exploited in Refs.[26,27].Since ImF(s) = σ 3 (s)P(s)|F V (s)| 2 [21] where σ (s) = 1 − 4m 2 π /s, P(s) a linear polynomial with positive slope and F V (s) the pion vector FF, then ImF(s) is a positive function, the requirement of φ (u) to be nondecreasing is fulfilled and the convergence of PAs to the TFF is guaranteed. 4adé Theory not only provides a convergence theorem for a sequence of PAs to Stieltjes functions, i.e., lim N,M→∞ P N M (s) − f (s) = 0, but also its rate of convergence [13,18,19], given by the difference of two consecutive elements in the PA sequence.As we will see later, this error prescription will return very small theoretical uncertainties.To be more conservative, we designed in Refs.[1,2,6,11] a different method to extract such uncertainty which yields errors at the level of the statistical ones.
Still, even though the ππ unitary cut driving the decay is of Stieltjes nature, there is no a priori reason why the PA should work above the branch cut.The cumbersome situation is, however, that at least the P L 1 (s) sequence does work well above the cut (cf.Figure 1).And the unanswered question is, then, whether one could have anticipated this success and how general is for any arbitrary situation.A fair statement would be to say that, approximately, the TFF is a meromorphic function which has nothing but a set of single and isolated poles within the data range.In this scenario, PA are an excellent approximation tool [20].Moreover, they tell us about the underlying physical phenomena driving the decay without the need to assume any model.
To better understand this situation from a qualitative point of view, let us discuss the following.As we have said, in the zero-width approximation, the TFF becomes a meromorphic function.If the TFF contains a single and isolated pole, the P L 1 (s) sequence reproduces the pole of the TFF with infinite precision.As soon as the width is again switched on, the ππ threshold opens a branch cut responsible for that width.Then, at first, no mathematical theorem will guarantee convergence on this scenario.On the contrary, if the convergence theorem is to be satisfied, one would expect the single pole of the P L 1 (s) to be located closer and closer to the threshold point as soon as L → ∞, since this is the first singular point the PA is going to find.
A closer inspection at the threshold expansion of the TFF reveals a different pattern for real and imaginary parts in terms of the variable q 2 , the center-of-mass momentum in the ππ rest frame.At low energies, one can model the P → γ * γ TFF as a convolution between the P → ππγ amplitude with the ππ → γ * pion electromagnetic form factor F V (s) [26].
The behavior of the ππ branch cut at threshold has then, two different components, both of them well known.The knowledge of the ππ threshold for F V (s) comes from the Pwave ππ scattering amplitude (the opened cut yields vector states) together with the Fermi-Watson theorem that relates the phase of the scattering amplitude with the phase of the form factor below the first inelastic threshold.The ππ P-wave scattering amplitude t 1 1 (s) at threshold behaves like [28]: with 4q 2 = s − 4m 2 π , and where for the imaginary part we used the unitary relation Im[t 1  1 (s) −1 ] = −σ (s).The absolute value of the threshold expansion of the amplitude t 1 1 (s) is basically a polynomial in (s − 4m 2 π ) with the influence of its imaginary part starting only at (s − 4m 2 π ) 4 .Following the previous equation, if the threshold parameters a and b are of order 1 (with the appropriate units) [28], then the real part dominates near threshold and the absolute value is given basically by the real part.By virtue of the unitary relation for the F V (s), ImF V (s) = σ (s)F V (s)t 1 1 (s) * , and the expansion in Eq. ( 2), one concludes that while the real part of the threshold expansion of the F V (s) starts at order (s − 4m 2 π ) 0 , its imaginary part coming from σ (s)Re[t 1 1 (s)] only starts showing up at order (s − 4m 2 π ) 3/2 .In summary, since Im[TFF] ∼ σ (s) 3 |F V (s)| 2 and F V (s) at threshold is basically real, near the ππ threshold the imaginary part of the TFF, and thereby the expected discontinuity, will behave as (s − 4m 2 π ) 3/2 , while its real part as (s − 4m 2 π ) 0 .If this is the case and the offset of the threshold is that smooth, the P L 1 (s) sequence will be an excellent tool to reproduce the TFF near and above the threshold.Actually, taking the definition of a P L 1 (s) given by the polynomial part will reproduce the modulus of the ππ discontinuity, and the PA pole part will account in an effective manner for the pole of the TFF far away from the threshold.The last question is, then, up to what energy one can go above the threshold before failing.The threshold expansion itself must fail at some point because it breaks unitarity by powers of (s − 4m 2 π ).A quantitative answer to this question would demand to study this problem using a particular model.To make a general statement, model independent, and qualitative, we notice that the threshold expansion should break down when the presence of the resonance pole is large enough and cannot be approximated by a polynomial in (s − 4m 2 π ).This happens basically at a distance of the pole given by the half-width rule [23] which, as argued in the Introduction, provides a simple estimate of the PA range.
The previous discussion already excludes the generalization of our results for any arbitrary Stieltjes function since the clue feature is the behavior around the threshold point.While for vector and tensor form factors we foresee good performance of our PA method, for a scalar form factor with an abrupt threshold offset, the range of applicability within the time-like region will be more limited.Parameterizations existent in literature which would be suitable to be compared our method with can be found in [26,27,[29][30][31][32][33][34][35][36][37][38].

III. INCORPORATION OF THE LOW-ENERGY TIME-LIKE DATA
Since our goal is to provide a parameterization of the TFF as accurate as possible and we have shown in the previous section that the TL experimental data up to 0.75 GeV can be well described with our old parameterization based on SL data, in this section we will include the TL data as a new data set to be fitted, following [11].At low-momentum transfer, the TFF Constraining F η γγ (0) Predicting F η γγ (0)  I. Low-energy parameters as obtained after a joint fit to both space-and time-like data with and without including the measured two-photon partial width as a restriction in the χ 2 function of ( 5), second and third multicolumn, respectively.The leading coefficient of the TFF asymptotic limit, the pole of the PA and the χ 2 dof are also shown.can be described by the expansion where F η γγ (0) is the normalization (the TFF at zero momentum transfer) while the LEPs parameters b η , c η and d η are, respectively, the slope, the curvature and the third derivative of the TFF.By reassessing our SL fits [2] through including TL data, we will update the results for the LEPs of the η TFF.The χ 2 function minimized in our fit is given by ( where the first and second terms correspond to SL [39][40][41][42] and TL [25] data, respectively, while the last term encodes information from the TFF at zero momentum transfer and introduces an additional restriction.For the experimental value we use F exp η γγ (0) = 0.3437(55) GeV −1 , inferred from the partial width to two photons, Γ η →γγ = 4.35 (14) keV [24], through the relation The value Γ η →γγ = 4.35 (14) keV cited in [24] is not a measured quantity, rather a fit inferred from the branching ratio and using the current η total width.The average experimental determination for such decay reads 4.28 (19) keV.It will be interesting to see whether this 0.3σ difference would affect our results at the precision we are working.We start fitting with a Padé approximants' sequence of the type P L 1 (Q 2 ), and current data allow us to reach L = 7.We provide a graphical account of our fits as compared to both SL and TL in Figure 2, from where one can see that the one sigma error band associated to the time-like η TFF has considerably decreased as compared to Figure 1.The LEPs obtained from the fit are collected in Table I and their corresponding convergence pattern in Figures 3 and 4 (red circles) reflect the impact of the inclusion of TL compared with the old results.In the table we also provide the pole of the PA.The coefficients of our best fit are gathered in Appendix A.
Comments on these results are in order: 1.The precision gained on the LEPs determination is remarkable as compared to our previous results (blue triangles) when only SL were fitted [2]; 2. We enlarge our PA sequence by one element (reducing then the systematic uncertainty); 3. The new LEPs sequence reaches faster the stability value manifesting the excellent performance of the method as new experimental data is included; 4. Including F exp η γγ (0) as an additional datum in the fit reduces significantly the uncertainty associated to this quantity.Regarding to this constraint, it is noticed that while LEPs obtained from the P L 1 (Q 2 ) sequence are basically insensitive to this effect, the LEPs obtained from the P 1 1 (Q 2 ) element are not and suffer small distortions.
After the first combined analysis of both SL and TL data, our central value results for F η γγ (0) and LEPs are where the first error is statistic and the second systematic, the latter being 0% for the value at the origin, and 1%, 2%, and 9% for the slope, curvature, and third derivative, respectively [2].The systematic error can be evaluated as well from the difference between two consecutive elements in the PA sequence [13,19].However, as illustrated in Figs. 3 and 4, this difference is basically negligible for the value at the origin and all the derivatives shown, and we prefer to consider the larger systematic errors reported in Ref. [2] which were obtained after applying the PA method to a set of selected models.The results above can be compared with the ones obtained by the P N N (Q 2 ) sequence in Table I.Since this second sequence stops at its first element (which is actually the first element on the P L 1 (Q 2 ) sequence as well), we do not consider its results for a combined weighted average.The systematic error is at the level of the statistical one.To reduce it, we would need more precise high-energy data on the one hand, and enlarge, on the other hand, the sequence which is limited in this analysis to its first element.Notice that the P N N (Q 2 ) has systematic errors dramatically smaller than the ones considered here (see the Appendix in [11] for details).It turns out that the η TFF is very much dominated by a single hadronic scale that gives to the TFF its characteristic vector meson dominance-like shape (VMD).A P 2 2 (Q 2 ) fit cannot be accommodated at the current level, and we hope that more data from BESIII, MAMI, and Belle-II will help to improve the present values (see the discussion at the end of this section).These results can be compared with F η γγ (0) = 0.344(4)(0) GeV −1 , b η = 1.30(15) (7) and c η = 1.72 (47) (34), obtained using SL data only [2].Clearly, the statistical uncertainty of the LEPs has considerably diminished as a consequence of including TL data to the analysis, being that one of the main results of this work.Our slope, b η = 1.31 (4), can be compared with the values 1.46 (23), 1.24(8) and 1.6(4), quoted by the CELLO [39], CLEO [40] and Lepton-G (cited in [36]), respectively.One should notice that all the previous collaborations used a single-pole model, VMD, to extract the slope, which is nothing but the simplest P 1 1 (Q 2 ) element from our approach (which we neglected).Other theoretical predictions existent in the literature are b η = 1.47 predicted by chiral perturbation theory for sin θ P = −1/3, being θ P the η-η mixing angle, b η = 1.30 from constituent-quark loops, both values taken from [37], b η = 1.33 from VMD [43], and b η = 2.11 from the Brodsky-Lepage interpolation formula [44].More recently, one can find b η = 1.323(4) from resonance chiral theory [38], b η = 1.45 +0.17 −0.12 using dispersive techniques [26], and b η = 1.06 or 1.16 from anomaly sum rules [33].
The main difference between Figure 1 and Figure 2, left panel, is the width of the uncertainty band, specially at large √ s, which is the region where we expect the PA to eventually fail.To control on the quality of the fits at this large √ s, we have repeated the fits by first enlarging artificially the errors of the last energy points and secondly eliminating subsequently the last data points.We have observed a completely stable fit even under these manipulations which only slightly enlarge the slope error but always keeping the same χ 2 dof (degrees of freedom).We conclude, then, that our final results in (7) are robust enough and independent of an eventual failure of the PA method at the highest TL energy point.
We benefit from our results F η γγ (0) = 0.344(5) GeV −1 and F η γγ (0) = 0.342(13) GeV −1 (constrained and unconstrained cases, respectively) to predict the η partial decay width to two photons.For the constrained fit, i.e. including the value at the origin in our data set, the fit returns Γ η →γγ = 4.35 (13) keV, slightly better than the PDG fitted value and at 0.3 standard deviations off its averaged result.For the unconstrained case, we find Γ η →γγ = 4.30 (33) keV, which lies 0.1 standard deviations off the experimental value.Regarding the asymptotic behavior of the TFF, we have considered the P N N (Q 2 ) sequence since they have the right asymptotic fall-off 1/Q 2 built-in.We reached N = 1 and then predicted the leading coefficient lim which is in very good agreement with the value 0.254( 21) GeV 6 measured at q 2 = −112 GeV 2 by the BABAR collaboration [45] and assuming that the space-and time-like asymptotic duality already holds at that q 2 .This prediction is basically the same one obtained in [2] when only the SL data were considered.Therefore, the effect of including the TL data is negligible in this respect.Ideally, it would be desirable to extract such value from the N = 2 element, which allows for diminishing the intrinsic systematic error (not yet evaluated) as well as for checking convergence.This should be possible in the future if new precise Belle-II data becomes available.
The result in Eq. ( 8) needs to be upgraded to include a theoretical error, then.Such error can be obtained in the same way as the systematic error for the TFF's LEPs [1,2,6].This is, by comparing the asymptotic value of the model used in Refs.[1,11] (the holographic confining model defined in Appendix B of [11], rescaled to yield the same asymptotic value as in Eq. ( 8)), with the expansion at Q 2 → ∞ of the PA fits to pseudodata, we extract the theoretical uncertainty in percentage collected in Table II.
This Table II shows that the theoretical uncertainty associated to the extraction of the TFF's asymptotic value using the simplest P 1 1 (Q 2 ) is of about 25%.This result seems to 6 Such value is obtained from the BABAR result 0.251(19)(8) GeV after taking into account kinematical corrections [2].
TABLE II.Theoretical error for the first asymptotic coefficient of the P N N (Q 2 ) sequence.First line corresponds to the actual coefficient lim Q 2 →∞ P N N (Q 2 ).Second line collects its relative error with respect to lim Q 2 →∞ Q 2 F η γγ (Q 2 ).See the text for details.disagree with the fit shown in Fig. 2, left panel, where the interpolation of the P 1 1 (Q 2 ) at around Q 2 = 20-35 GeV 2 is very good, much better than a 25% (the error at Q 2 → ∞ and at Q 2 = 20-35 GeV 2 is basically the same).This disagreement comes from a peculiarity of the η -TFF.We have already discussed our impossibility of fitting with a P 2 2 (Q 2 ).This is not an anecdote since it is telling us that a P 1 1 (Q 2 ) is already an excellent description of the TFF.
Since the difference is basically zero, their difference at Q 2 → ∞ is also zero, which implies that the systematic error we have by extracting the asymptotic value with the P 1 1 (Q 2 ) is basically the same as if we would use the P 2 2 (Q 2 ).To be a bit more specific, if we consider the asymptotic value obtained with our fits using the P 1 1 (Q 2 ) and recon- struct the P 2 2 (Q 2 ) the LEPs obtained with the fit of the P 7  1 (Q 2 ), the relative error between both predictions amounts to a 4%.From Table II, the P 2 2 (Q 2 ) induces a theoretical error for the asymptotic coefficient of a 3%.We can then conclude that in our case of study, the theoretical error induced by using the P 1 1 (Q 2 ) to extract that asymptotic coefficient is the combination in quadrature of both sources of error, i.e., a final 5%, much smaller than the generic 25% quoted in Table II.Let us remark that this 5% is valid only for the present case of the η TFF.
We should also include a theoretical error to the asymptotic value of the η TFF of about 3%, which combined with the statistical error obtained in Ref. [11] turns out to read 0.177( 16)GeV.

IV. A REASSESSMENT OF THE η-η MIXING
In this section we reanalyze the η-η mixing as we did in [2,11], and consider the so-called octet-singlet basis, where the η and η pseudoscalar decay constants are defined in terms of the axial currents J a 5µ = qγ µ γ 5 λ a √ 2 q as 0|J a 5µ |P = i √ 2F a P p µ , where a = (0, 8) refers to its singlet and octet components, respectively.The decay constants in terms of the two angles θ 0 and θ 8 read In this basis, large-N c ChPT at NLO predicts [46,47] where F K 1.20F π is the kaon decay constant.At this point we call the attention that F 0 is renormalization group (RG) dependent (F 0 = F 0 (µ)).This is connected to the J 0 5µ anomalous dimension implying [35,48] where N F is the number of active flavors at the scale µ.Solving this equation up to order α s (µ), the singlet decay constant at a different scale can be expressed as with β 0 = 11 − 2N F /3.The parameters δ and Λ 1 are interrelated since [48] µ d dµ at NLO in large-N c ChPT.This equation also implies that if Λ 1 = 0, then δ = 0.
The error quoted for the δ ∞ parameter comes from the uncertainty of α s (M z ) = 0.1182( 16) [24].
In addition, we want to include in the previous set of equations the OZI-rule-violating parameter Λ 3 , neglected in our previous studies, since it is the first correction to the F Pγγ in large-N c ChPT, even though belongs formally to the NNLO order.To be consistent with the counting, if we include OZIrule-violating parameters, we should also take into account mass corrections to the pseudoscalar-into-two-photons decay widths.Such corrections can be directly calculated from the corresponding Lagrangian [48]: electric charge of the quarks and φ the 3 × 3 matrix representing the nine pseudoscalar fields φ 0 (x), . . ., φ 8 (x).For π 0 → γγ: The value Γ π 0 →γγ = (7.63 ± 0.16) • 10 −9 GeV [24] translates into K 2 = −0.45± 0.57, compatible with zero but with a large central value.This suggests that a better experimental resolution for π 0 → γγ will have an impact in the η-η mixing even neglecting isospin corrections since Eq. ( 19) implies The set (15,16,17,18) form a system of 4 equations with 5 unknowns (F (8,0) η ( ) , Λ 3 ).Then it could seem that, at least taking Λ 3 = 0, we may solve the system.However, as explained in [2], such a system is underdetermined as the relation is free of mixing parameters.Indeed, (20) fixes Λ 3 once its left-hand side is (experimentally) known.However, we still have to face the fact that our system is underdetermined.In order to overcome this problem, we notice that at NLO in large-N c ChPT, Eqs.(10,11) provide a clean prediction for both F 8 and (θ 8 − θ 0 ) in terms of the well-known value for F K /F π [24].Taking either Eq. ( 10) or Eq. ( 11) as a constraint, one would add an additional equation to the previous system, which would provide a unique solution.Taking both, would lead to an overdetermined system, which in general has no solution.For this reason, we adopt a democratic procedure in which we perform a χ 2 fit including both, Eq. ( 10) and Eq. ( 11) constraints, together with (15,16,17,18,20).Using the following parameters as inputs for the χ 2 function we would obtain the mixing parameters collected in Table III, first column.The total χ 2 = 3.45 has a p-value=0.18which is acceptable with two degrees of freedom.By observing the different terms contributing to the χ 2 function, we realize that the piece related with the determination of the η TFF at Q 2 → ∞ is the one most disfavored (contributing with 1.97 to the χ 2 ), with a theoretical prediction of 0.155 (19)GeV to be compared with the experimental counterpart of 0.177 (16)GeV.Such a discrepancy of about 0.9σ can be attributed to a rather small value of θ 0 .The smallness of θ 0 comes ultimately from the experimental value of η exp ∞ .This is so because in this observable, the angle θ 0 comes with cos θ 0 ∼ 1 and then, η exp ∞ determines the value of F 0 which, in turn, and through Eq. ( 20) determines θ 0 .To test this correspondence, we could imagine a η exp ∞ 10% higher than what we found from the fits.This new value would be in agreement with the BABAR measurement at q 2 = −112 GeV 2 , transferred to the space-like region including a 10% effect on the violation of the space-and time-like duality as suggested in Ref. [11].With such enhancement, the value of F 0 will be as well proportionally enhanced.Then, the fit will return θ 0 a 10% larger, a χ 2 result of about 1.9, and a prediction for η ∞ 5% larger.As we argued before, experimental data for the η TFF at large momentum transfer comes exclusively from BABAR collaboration and a second set of experimental data would be highly welcome to settle this issues.Alternatively, we can ascribe the smallness of the η ∞ to an underestimation of a theoretical error without the need to touch any number coming from experimental determination.Up to now, we have assumed that the expressions for F 0,8 and θ 0,8 calculated at NLO in large-N c ChPT contain no theoretical error coming from neglecting higher orders in that expansion.This is an hypothesis that can be tested by allowing higher order effects in the definition of F K /F π .We assumed that the departure of that ratio from 1 came from the NLO alone.We can extend this argument by saying that F K /F π − 1 = ε + ε 2 = 0.198 which implies ε = 0.17 and ε 2 = 0.03.ε is the small quantity representing the NLO and ε 2 the NNLO of about 2.4% of F K /F π .This hypothesis is confirmed both from SU(3) ChPT [49] and from SU(3) ChPT in the large-N c limit [50].Whit such new error for F K /F π we can repeat the fit to obtain the mixing parameters.The new χ 2 = 2.01 represents a p-value = 0.37 and χ 2 dof = 1.00.The results of the mixing are collected in the second row of Table III and is the main result of this section.The net effect of including an extra source of theoretical error on F K /F π turns out to be a larger θ 0 by a 25% which in turn comes from a better prediction of η ∞ , which now reads 0.164 (33)GeV.The χ 2 dof value is excellent, which indicates a good agreement with large-N c ChPT but with non-negligible NNLO corrections.In addition, we can use (10) to predict the value Λ 1 = 0.01 (13).
The outcome of our fit is teaching us that Λ 1,3 are compatible with zero and one is tempted to eliminate them from our system of equations.This is actually the avenue followed by the FKS scheme [51] (i.e., Λ 1,3 = 0, K 2 = 0, δ ∞ = 0) and at first sight it may seem we are confirming their findings.Notice, however, that Λ 1 and δ ∞ are interrelated (14) and by repeating the fit imposing δ ∞ = 0, K 2 = 0 but Λ 1,3 = 0, our fit returns non-negligible values for Λ 1,3 and a set of mixing parameters, collected in the last row of Table III, still compatible with our previous findings.Our results represent then an update of the FKS scheme including all NLO corrections together with an estimate of NNLO effects, which cannot be neglected in order to obtain a good χ 2 (cf.first against second rows in Table III).
In Figure 5 we collect our main result (orange crosses) from the second row of Table III and compare them to different predictions in the literature [46,48,51,52] (red dots), as well with our previous results [11] in blue-empty squares.We see that the main difference with respect to our previous work [11], where we did not use the η TFF asymptotic value appears in θ 0 .This is to be expected as the inclusion of Λ 1 and Λ 3 affects the singlet part exclusively.In addition, we have reduced our errors thanks to the constraints from large-N c ChPT.Our prediction for Λ 3 may be compared with the one in [52], Λ 3 = −0.03(2).Both of them point towards a small value for this parameter, and agree with its sign.We find that Λ 3 actually plays an important role not only in fulfilling the degeneracy condition (20), but in the η(η ) → γγ decays as well.In addition, the Λ 1 term is rather important and affects specially the η results, where deviations of order 10% appear if Λ 1 is omitted.Finally, we stress that the use of the RG equation for F 0 is fundamental, whereas most of the theoretical and experimental analysis do not account for this effect, which -to our best knowledge-was included for the first time in [35].This effect increases η ∞ and diminishes η ∞ , bringing in agreement experiment and theory.
Our results may be translated to the quark-flavor basis where the decay constants are defined as Using the rotation matrix [51] U From the above equation, together with the values from the second row of Table III, we obtain In addition, we can predict the ratio R J/ψ ≡ Γ J/ψ→η γ /Γ J/ψ→ηγ , which is given in terms of φ q alone [46] as With ( 22), R J/ψ = 5.2( 9), just at 0.5σ from the experimental value R J/ψ = 4.7(2) [24].It may be that, as precision improves, the deviation grows, which would be a hint of novel phenomena in the η-η system, as gluonium component, which has long been debated, but not found so far [53].We recall in this sense that large-N c ChPT implicitly assumes that such component is not present in the η .Moreover, the 3-gluon annihilation amplitude, not included in our framework, may need to be included to account for this 10% discrepancy [54].Alternatively, we could use the experimental R J/ψ together with Eq. ( 23) to obtain φ q = 38.2(6)• in agreement with our fit determination.With respect to the V Pγ couplings calculated in our previous work [11], the new results yield more precise errors and very similar central values, with the exception of the φ cases, which get slightly closer to the experimental results.
With the set of parameters in Table III, together with (9), we can also predict the ratio R Z ≡ Γ Z→η γ /Γ Z→ηγ , which is given by [55] where, assuming the asymptotic behavior, [55], an observable quite sensitive, then, to the singlet angle.However, since F 8 η F 0 η , the denominator of (24) should not be approximated and all the terms should be retained.In this respect, we find R Z = 11(3), indicating still a large singlet component in R Z .
To close this section, we comment on possible venues to improve our errors.On the one hand, it would be desirable to improve not only on η ∞ , which now is the input with the largest error, but also to obtain η ∞ from a P 2 2 (Q 2 ), which would reassess both the central value and the error of this parameter.This would be possible from future Belle-II data.Curiously enough, the Γ π 0 →γγ measurement is important to study NLO effects and the role of SU(3) breaking in the mixing scheme.On the other hand, it would be interesting to have a more precise O(α s ) 2 calculation for δ ∞ although its impact may be marginal.A NNLO predictions for the mixing parameters and η(η ) → γγ decays will allow to check the stability and accuracy of the results.This calculation will involve new low-energy constants which knowledge is scarce, though.In addition, future lattice analysis may play an important role in this field [56] and a combined analysis using the PA method will be highly desirable.

V. CONCLUSIONS
In this work we have shown the excellent performance of the Padé approximants' method developed in [1][2][3]11] for the description of the recently reported first observation of the Dalitz decay η → γe + e − by the BESIII Collaboration [25].This experimental analysis studies the time-like region of the η transition form factor up to the resonance region.
Unlike our previous works, we have explored in the present one the limits of application of PAs in the TL region finding that, beyond expectations, PAs can be extended to energies very close to the location of poles.We have nicely described the behavior of the modulus square of the η TFF thus showing that this form factor has a simple analytical structure in the complex plane made of an isolated branch cut due to the ππ production threshold and a set of single poles.
The careful analysis of the PA sequence P L 1 (Q 2 ) reveals, however, more effects than those of the ρ resonance emerging here from ππ rescattering.Subleading effects caused by additional branch cuts or the influence of higher resonances' tails are also captured by PAs and are indeed responsible for the shift of the PA-pole location with respect to the naive projection of the ρ resonant pole onto the real axis.Since this shift is not known with precision it is difficult to extract from the PA pole the exact position of the resonance pole.This limitation of the method, already mentioned at the beginning of this work, does not prevent the PAs from guiding us about the underlying analytical structure of the TFF.One can take advantage of this highly non-trivial knowledge to further use the PAs method in other scenarios such as B → π semileptonic form factors or combine it with dispersion relations to include resonance poles.
A last remark concerns the role of VMD in experimental analyses, now that the meaning of the PA pole on the real axis is understood.As pointed out in [6], VMD should be interpreted as a first step in a systematic approximation, that is, the P 1 1 element belonging to a more general and exhaustive P N  1 sequence.Although it is common to report on such fit for ease of comparison, the range of application of VMD in the TL region is much shorter than the P 71 we used here.
In summary, PAs are not only useful for fitting and extrapolating data within the SL region but also give us information about the analytical structure of the TFF in the TL low-energy region.On the one hand, dispersion relations with a single ππ elastic cut for the isovector part of the TFF and a Breit-Wigner model for the isoscalar one [26,27] proves the TFF to be a Stieltjes function for which the PA convergence is guaranteed in the SL and in the TL below the cut.This already ensures an optimal extraction of the LEPs from experimental data with tiny systematic errors.On the other hand, the modulus of the TFF along the branch cut is also well reproduced thanks to the smoothness of the opening of the ππ cut, even though the convergence theorems do not guide about the performance in this region.PAs are also capable of accommodating the SL region high-energy QCD constraints while still providing accurate predictions of the Γ η(η )→γγ decay widths.Moreover, they allowed us to report the most up-to-date results for slope, curvature, and third derivative of the η TFF, and to update the η-η mixing parameters in a mixing scheme compatible with the most general large-N c ChPT scenario at NLO, thus superseding the values obtained in our previous works and those from FKS [51], BDO [52], and EF [46] schemes.With such results we predicted the J/ψ and Z → η ( ) γ decays.

FIG. 1 .
FIG.1.Our prediction for the η transition form factor in the timelike region obtained from the P 6 1 (

TABLE III .
Predictions for the mixing parameters.θ 8,0 are expressed in degrees.First row is our basic scenario while second row (our preferred scenario) includes, on top, a theoretical error for F K /F π .Third row corresponds to setting K 2 = 0 and δ ∞ = 0 in the fit function.See the main text for details.F 8 /F π F 0 /F π θ