Extrapolation and sharp norm estimates for classical operators on weighted Lebesgue spaces

We obtain sharp weighted $L^p$ estimates in the Rubio de Francia extrapolation theorem in terms of the $A_p$ characteristic constant of the weight. Precisely, if for a given $1 < r < \infty$ the norm of a sublinear operator on $L^r(w)$ is bounded by a function of the $A_r$ characteristic constant of the weight $w$, then for $p > r$ it is bounded on $L^p(v)$ by the same increasing function of the $A_p$ characteristic constant of $v$, and for $p < r$ it is bounded on $L^p(v)$ by the same increasing function of the $\frac{r-1}{p-1}$ power of the $A_p$ characteristic constant of $v$. For some operators these bounds are sharp, but not always. In particular, we show that they are sharp for the Hilbert, Beurling, and martingale transforms.


Extrapolation.
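A positive locally integrable function on $\mathbb{R}^n$ is called a weight. A weight $w$ is said to be of class $A_p$, for $1 < p < \infty$, if
\[
\sup_Q \left(\frac{1}{|Q|}\int_Q w(x)\,dx\right)\left(\frac{1}{|Q|}\int_Q w(x)^{-\frac{1}{p-1}}\,dx\right)^{p-1} < \infty,
\]
where the supremum is taken over all cubes $Q$ in $\mathbb{R}^n$ with sides parallel to the axes ($Q$ will always denote such cubes). The quantity above is called the $A_p$-characteristic constant of the weight $w$ and will be denoted by $\|w\|_{A_p}$.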
A weight $w$ is said to be of class $A_1$ if there is a constant $C > 0$ such that $Mw \le Cw$ a.e., where $M$ is the (uncentered) Hardy-Littlewood maximal function, i.e.
\[
Mf(x) = \sup_{Q \ni x} \frac{1}{|Q|}\int_Q |f(y)|\,dy.
\]
The smallest possible $C$ is denoted by $\|w\|_{A_1}$.
For an operator $T$ bounded from a Banach space $X$ into itself ($T \in B(X)$) we will denote by $\|T\|_X$ its operator norm. When $1 < q < \infty$, $q'$ shall stand for the dual exponent of $q$, i.e. $\frac{1}{q} + \frac{1}{q'} = 1$. Given a weight $v$ on $\mathbb{R}^n$, $L^p(v)$ denotes the space of complex functions $f$ on $\mathbb{R}^n$ such that
\[
\|f\|_{L^p(v)} = \left(\int_{\mathbb{R}^n} |f(x)|^p\, v(x)\,dx\right)^{1/p} < \infty.
\]
The following result is the celebrated extrapolation theorem of Rubio de Francia.
Theorem (E). Suppose there is $1 \le r < \infty$ such that $T \in B(L^r(u))$ for all weights $u \in A_r$, with bounds depending only on $\|u\|_{A_r}$. Then $T \in B(L^p(w))$ for all $1 < p < \infty$ and all weights $w \in A_p$, with bounds depending only on $\|w\|_{A_p}$. More precisely, suppose for each $B > 1$ there is a constant $N_r(B) > 0$ such that we have
\[
\|T\|_{L^r(u)} \le N_r(B) \quad \text{for all } u \in A_r \text{ with } \|u\|_{A_r} \le B. \tag{1}
\]
Then for any $1 < p < \infty$ and $B > 1$ there is $N_p(B) > 0$ such that for all weights $w \in A_p$ with $\|w\|_{A_p} \le B$,
\[
\|T\|_{L^p(w)} \le N_p(B). \tag{2}
\]
This result first appeared in [R]. Different proofs can be found in the books [GC-RF] and [Gr].
Muckenhoupt proved in [M] that for $1 < p < \infty$ the maximal function is bounded on $L^p(w)$ if and only if the weight $w$ belongs to the class $A_p$. Hunt, Muckenhoupt and Wheeden proved in [HMW] that the $A_p$ condition also characterizes the boundedness of the Hilbert transform
\[
Hf(x) = \mathrm{p.v.}\ \frac{1}{\pi}\int \frac{f(y)}{x-y}\,dy
\]
in $L^p(w)$. Coifman and Fefferman [CF] extended the theory to general Calderón-Zygmund operators.
In 1993, Buckley [Buc] obtained the following result concerning the Hardy-Littlewood maximal function ($1 < p < \infty$):
\[
\|M\|_{L^p(w)} \le C(p)\,\|w\|_{A_p}^{p'/p}, \tag{3}
\]
where the constant $C(p)$ depends only on $p$ (and the underlying dimension $n$). These bounds are sharp, i.e. $\|w\|_{A_p}^{p'/p}$ cannot be replaced by $\varphi(\|w\|_{A_p})$ for any function $\varphi : \mathbb{R}_+ \to \mathbb{R}_+$ that grows slower than the $p'/p$-th power. This can be easily seen by using power functions and power weights. Taking $w \equiv 1$ we see that the constants $C(p)$ must blow up as $p \to 1$.
In this note we use Buckley's estimate (3) to improve Theorem (E) as follows.
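Theorem 1. With the notation and hypotheses as in Theorem (E), assume that $N_r(B)$ denotes the smallest constant that satisfies inequality (1). Then for any $1 < p < \infty$ and all $B > 1$ there is a constant $N_p(B)$ such that (2) holds for all weights $w$ in $A_p$ satisfying $\|w\|_{A_p} \le B$. Moreover, for $p > r$,
\[
N_p(B) \le 2^{1/r}\, N_r\!\left(2\,C(p')^{\frac{p-r}{p-1}}\,B\right),
\]
while for $p < r$, $N_p(B)$ is bounded by a constant multiple of $N_r$ evaluated at a constant multiple of $B^{\frac{r-1}{p-1}}$, with constants depending only on $p$, $r$ and $C(p)$. Here $C(p)$ is the constant appearing in (3).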
This result, applied to the Beurling and martingale transforms for $r = 2$, $N_2(B) = CB$ and $p > 2$, was first observed in [PetV]. In this case, a careful extrapolation for $p > 2$ yields $N_p(B) \le C_p B$. That is, the linear dependence on the constant when $p = 2$ is preserved also for $p > 2$. However, this is not the case when $p < 2$, which motivates a more careful examination of the problem.

Sharp bounds.
The linear bounds for the Beurling transform in $L^p(w)$ in terms of $\|w\|_{A_p}$ for $p \ge 2$ have important consequences in the theory of quasiconformal mappings. The connection is very well explained in the paper by Astala, Iwaniec and Saksman [AIS], who were interested in finding the minimal $q < 2$ for which all solutions to the Beltrami equation $\bar\partial f = \mu \cdot \partial f$ that belong to the Sobolev space $W^{1,q}_{loc}$ still self-improve to belonging to $W^{1,2}_{loc}$, i.e. are quasiregular. Here $\mu$ is a bounded function with $\|\mu\|_\infty = k < 1$. A deep result of Astala [A] says that $q > 1 + k$ suffices. On the other hand, Iwaniec and Martin [IM] found examples showing that the result could in general not be true for $q < 1 + k$. In [AIS] the borderline case $q = 1 + k$ was addressed; it was pointed out by the authors that quasiregularity would be a consequence of the linear dependence of the norm of the Beurling transform on weighted spaces $L^p(w)$ for $p \ge 2$ in terms of the $A_p$ characteristic of the weight $w$. This linear dependence was settled in [PetV] and later in [DV] for $p \ge 2$, the only range for which it is true.
For the maximal function, the bound for $\|M\|_{L^2(w)}$ is also linear in $\|w\|_{A_2}$, see (3). If $1 < p < 2$, extrapolation yields sharp dependence of $\|M\|_{L^p(w)}$ on $\|w\|_{A_p}$. However, for $p > 2$, extrapolation only gives linear growth on $\|w\|_{A_p}$, when the sharp growth is $\|w\|_{A_p}^{p'/p}$. In [Buc], Buckley considers two more examples where the same phenomena occur. He shows that a parametric class of Marcinkiewicz integral operators is uniformly bounded on $L^p(w)$ by $\|w\|_{A_p}$ for all $1 < p < \infty$ and these linear estimates are sharp [Buc, Theorem 2.15]. In particular, extrapolating from the sharp linear estimate at $p = 2$ yields the right sharp linear estimate for $p > 2$, but for $p < 2$ it yields a worse estimate. Buckley also shows that a parametric class of averaging operators is uniformly bounded on $L^p(w)$ by $\|w\|_{A_p}^{1/p}$ for all $1 < p < \infty$ [Buc, Lemma 2.18]. In this case, starting from the estimates on $L^r(w)$ for any $1 < r < \infty$, extrapolation yields an estimate that is worse than the sharp estimate for all $p \ne r$. Therefore the estimates of Theorem 1 may not be sharp for some operators even when the initial estimate is sharp. However, the theorem itself is sharp, as we will show that for a variety of classical operators that have a sharp linear norm estimate in $L^2(w)$, the extrapolated bounds are also sharp for all $1 < p < \infty$.
Buckley [Buc] also showed that the Hilbert transform (and for that matter convolution singular integral operators with Calderón-Zygmund kernels) is bounded on $L^p(w)$ with an operator norm which is at most a multiple of $\|w\|_{A_p}^{\alpha}$, where $\max\{1, p'/p\} \le \alpha \le p'$. In particular, for $p = 2$ he showed that the dependence on $\|w\|_{A_2}$ was at least linear, and at most quadratic.
Recently there has been renewed interest in computing the exact dependence of the operator norms on the $A_p$ characteristic constant of the weight. Sharp linear dependence on $\|w\|_{A_2}$ was obtained by Hukovic, Treil, and Volberg [Huk], [HukTV] for the dyadic square function on $L^2(w)$ and for the martingale transform, a dyadic model for singular integral operators, by Wittwer [W1], [W2]. As we already mentioned, analogous results were recently obtained for the Beurling transform by Petermichl and Volberg [PetV], and later by Dragičević and Volberg [DV]. Petermichl and Pott [PetPot] very elegantly showed that $\alpha \le 3/2$ for the Hilbert transform. Petermichl [Pet] improved this estimate to $\alpha = 1$ when $p \ge 2$. The difficulty in [Pet] was to obtain linear dependence of $\|H\|_{L^2(w)}$ on $\|w\|_{A_2}$; extrapolation then gave the same dependence for $p > 2$. Using Theorem 1 we obtain that the norms of these operators on $L^p(w)$ are bounded by at most a multiple of $\|w\|_{A_p}^{\alpha}$, $\alpha = \max\{1, p'/p\}$, for all $1 < p < \infty$.
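As a worked illustration of how the exponent $\alpha$ arises, the following sketch specializes Theorem 1 to $r = 2$ with a linear initial bound $N_2(B) = CB$. The constant for $p > 2$ is the one that appears in Case 1 of the proof of Theorem 1; the constant $c(p)$ for $p < 2$ is a placeholder introduced here for illustration and is not taken from the text.

```latex
\begin{align*}
p > 2:&\quad \|T\|_{L^p(w)} \le 2^{1/2}\, N_2\!\Big(2\,C(p')^{\frac{p-2}{p-1}}\,\|w\|_{A_p}\Big)
        = 2^{3/2}\, C\, C(p')^{\frac{p-2}{p-1}}\,\|w\|_{A_p},\\[2pt]
p < 2:&\quad \|T\|_{L^p(w)} \le c(p)\, C\, \|w\|_{A_p}^{\frac{2-1}{p-1}}
        = c(p)\, C\, \|w\|_{A_p}^{p'/p},
        \qquad\text{since } \frac{p'}{p} = \frac{1}{p-1}.
\end{align*}
```

Since $p'/p < 1$ exactly when $p > 2$, the two cases combine into the single exponent $\alpha = \max\{1, p'/p\}$.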
As mentioned earlier, using power weights and power functions, Buckley [Buc] showed that for convolution operators with Calderón-Zygmund kernels the power is at least $\max\{1, p'/p\}$. Hence in the cases of the Hilbert and Beurling transforms, Theorem 1 provides the sharp bounds. If we could prove linear bounds in $L^2(w)$ for all convolution operators with CZ-kernels, then by extrapolation we would obtain the same sharp bounds in $L^p(w)$ as for the Hilbert and the Beurling transforms. Obtaining the linear bounds in $L^2(w)$ can be very difficult. For instance, it is not yet known to the authors whether there is a bound for the first-order Riesz transforms on $L^2(w)$ depending linearly on $\|w\|_{A_2}$.
We will show that the bounds obtained by extrapolation for the martingale transform are also sharp for all $1 < p < \infty$. We can show that the extrapolated bounds for the square function are sharp for $p < 2$. It is not clear yet that the linear bound obtained by extrapolation for $p > 2$ is sharp; so far we can show that it must be at least of the order $\|w\|_{A_p}^{p'/p}$. We can summarize all these results in the following theorem.
Theorem 2. Let $T$ be any of the Hilbert transform, the Beurling transform, the martingale transform, or the dyadic square function. Then for any $1 < p < \infty$ there exist positive constants $C_p$ such that for all weights $w$ in $A_p$ we have
\[
\|T\|_{L^p(w)} \le C_p\,\|w\|_{A_p}^{\alpha}, \tag{4}
\]
where $\alpha = \max\{1, p'/p\}$. The exponent $\alpha$ in this estimate is sharp for the Hilbert, Beurling and martingale transforms for all $1 < p < \infty$. For the dyadic square function the exponent is sharp for $1 < p \le 2$.
All results establishing the linear bounds for the above operators on $L^2(w)$ have been obtained using the technique of Bellman functions introduced by Nazarov, Treil and Volberg [NTV] in the harmonic analysis context; see [NT] for an extensive introduction to this technique. The linear upper bound in Theorem 2 for $p > 2$ was previously known for the martingale, Hilbert and Beurling transforms.
Unfortunately, extrapolation does not preserve the nature of the initial estimate on $L^r(w)$ for all $1 < p < \infty$, only for $p > r$. Therefore sharpness at the given $r$ does not automatically transfer to all other $p \in (1, \infty)$. One has to check sharpness by other means for each $p \ne r$. In all the examples discussed we search for a function and a weight (or a family of functions and weights) that will provide a lower bound estimate of the same order as the upper bound, therefore showing that the estimate is indeed sharp.
Acknowledgement. The authors would like to thank the referee for some very useful suggestions that improved the presentation.
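Lemma 1. Take $p, s > 1$, $w \in A_p$ and $u \in L^s(w)$. Let
\[
S(u) = \left(\frac{M(|u|^{s/p'}\,w)}{w}\right)^{p'/s}, \qquad r = \frac{p}{s'},
\]
so that $r \ge 1$ precisely when $s \ge p'$. Then:
(a) $S$ is bounded on $L^s(w)$, with $\|S\|_{L^s(w)} \le C(p')^{p'/s}\,\|w\|_{A_p}^{p'/s}$.
(b) If $r > 1$, then the pair $(uw, S(u)w)$ belongs to the class $A_r$. Furthermore,
\[
\sup_Q \left(\frac{1}{|Q|}\int_Q uw\right)\left(\frac{1}{|Q|}\int_Q \big(S(u)w\big)^{-\frac{1}{r-1}}\right)^{r-1} \le \|w\|_{A_p}^{1-\frac{p'}{s}}.
\]
If $r = 1$, the $A_1$ condition on the pair $(uw, S(u)w)$ also holds and translates into $M(uw) \le S(u)w$.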
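Proof: (a) Estimating directly the norm we obtain
\[
\|S(u)\|_{L^s(w)}^s = \int M(|u|^{s/p'}w)^{p'}\,w^{1-p'} \le \|M\|_{L^{p'}(w^{1-p'})}^{p'} \int |u|^s\,w = \|M\|_{L^{p'}(w^{1-p'})}^{p'}\,\|u\|_{L^s(w)}^s.
\]
It only remains to insert Buckley's sharp estimate (3) and to recall that
\[
\|w^{1-p'}\|_{A_{p'}} = \|w\|_{A_p}^{p'-1}.
\]
All together, these facts imply
\[
\|M\|_{L^{p'}(w^{1-p'})} \le C(p')\,\|w^{1-p'}\|_{A_{p'}}^{p/p'} = C(p')\,\|w\|_{A_p}.
\]
Here $f_Q$ denotes the mean of the function $f$ over the cube $Q$. Consequently,
\[
(uw)_Q\,\Big(\big((S(u)w)^{-\frac{1}{r-1}}\big)_Q\Big)^{r-1} \le \|w\|_{A_p}^{1-\frac{p'}{s}}.
\]
Taking the supremum on the left-hand side, over all cubes $Q$ with sides parallel to the axes, we obtain the desired inequality.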
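Lemma 2. Let $p$, $s$, $r$ and $w$ be as in the previous lemma. Then for each $u \ge 0$, $u \in L^s(w)$, there exists $v \in L^s(w)$ such that
(a) $u(x) \le v(x)$ a.e. and $\|v\|_{L^s(w)} \le 2\,\|u\|_{L^s(w)}$;
(b) $vw \in A_r$ and $\|vw\|_{A_r} \le 2\,C(p')^{p'/s}\,\|w\|_{A_p}$.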
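Proof: Define $v$ via the following convergent Neumann series:
\[
v = \sum_{k=0}^{\infty} \frac{S^k(u)}{(2\|S\|)^k}, \qquad \|S\| = \|S\|_{L^s(w)}.
\]
Taking the supremum on the left-hand side, over all cubes $Q$ with sides parallel to the axes, we obtain the desired estimate for $\|vw\|_{A_r}$, $r > 1$.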
When $r = 1$, then $s = p'$ and $\|S\| \le C(p')\,\|w\|_{A_p}$; furthermore
\[
M(vw) = S(v)\,w \le 2\|S\|\,vw.
\]
We conclude that $\|vw\|_{A_1} \le 2\,C(p')\,\|w\|_{A_p}$, as claimed.
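The role of the Neumann series can be sketched as follows. The sketch assumes, as the surrounding estimates suggest, that $v = \sum_{k \ge 0} S^k(u)/(2\|S\|)^k$ with $\|S\| = \|S\|_{L^s(w)}$ and that $S$ is positive and (countably) sublinear; this normalization is our reading of the construction rather than a quotation.

```latex
\begin{align*}
\|v\|_{L^s(w)} &\le \sum_{k\ge 0} \frac{\|S^k(u)\|_{L^s(w)}}{(2\|S\|)^k}
               \le \sum_{k\ge 0} 2^{-k}\,\|u\|_{L^s(w)} = 2\,\|u\|_{L^s(w)},\\[2pt]
S(v) &\le \sum_{k\ge 0} \frac{S^{k+1}(u)}{(2\|S\|)^k}
      = 2\|S\| \sum_{k\ge 1} \frac{S^{k}(u)}{(2\|S\|)^k}
      = 2\|S\|\,(v-u) \le 2\|S\|\,v .
\end{align*}
```

The second line is what allows one to pass from the two-weight $A_r$ condition for the pair $(vw, S(v)w)$ to a one-weight condition for $vw$, at the cost of the factor $2\|S\|$.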
The next lemma appears as IV.5.18 in [GC-RF]; see also Lemma 9.5.4 in [Gr] for a slightly different method for part (b) which yields the same bounds as here. Attention was paid to the constants in [Gr] but Buckley's sharp estimate for $\|M\|_{L^p(w)}$ was missing; with this additional information, the constants in [Gr] would be of the same order as the ones obtained here.
Moreover, $vw \in A_r$ and $\|vw\|_{A_r} \le 2\,C(p')^{\frac{p-r}{p-1}}\,\|w\|_{A_p}$. Here $C(p)$ denotes the constant in (3).
Proof: (a) Clearly $r \ge 1$ implies $s' \le p$, and we can now use Lemma 2 after observing that $\frac{p'}{s} = \frac{p-r}{p-1}$. (b) Take $p$, $r$ and $s$ as in the formulation of the lemma. (Notice that everything that is being said still holds if $0 < s < 1$.) Now the dual exponents satisfy the opposite inequality, $r' < p'$, and if we define $s^* := (p'/r')' > 1$, then $s^* = s(r-1)$.
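The two exponent identities used in this proof can be checked directly. The sketch below assumes $s' = p/r$ in part (a) and $s = p/(r-p)$ in part (b); this is how we read the statement of the lemma from the exponents used in the proof of Theorem 1.

```latex
\begin{align*}
\text{(a)}\quad & \frac{p'}{s} = p'\Big(1-\frac{r}{p}\Big)
                 = \frac{p}{p-1}\cdot\frac{p-r}{p} = \frac{p-r}{p-1},\\[2pt]
\text{(b)}\quad & p'-r' = \frac{p}{p-1}-\frac{r}{r-1} = \frac{r-p}{(p-1)(r-1)},\qquad
                  s^* = \Big(\frac{p'}{r'}\Big)' = \frac{p'}{p'-r'}
                      = \frac{p\,(r-1)}{r-p} = s\,(r-1).
\end{align*}
```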
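We apply the previous case with $p'$, $r'$ and $w^{1-p'} \in A_{p'}$ instead of $p$, $r$ and $w \in A_p$, respectively. If $u \ge 0$, $u \in L^s(w)$, then $u_0 = u^{s/s^*} w^{p'/s^*} \in L^{s^*}(w^{1-p'})$ and by (a) there exists $v_0 \in L^{s^*}(w^{1-p'})$ such that
\[
u_0 \le v_0 \ \text{a.e.}, \qquad \|v_0\|_{L^{s^*}(w^{1-p'})} \le 2\,\|u_0\|_{L^{s^*}(w^{1-p'})},
\]
and furthermore, with $v$ defined by $v_0 = v^{s/s^*} w^{p'/s^*}$, we have $v^{-1}w = (v_0 w^{1-p'})^{1-r} \in A_r$, and this time,
\[
\|v^{-1}w\|_{A_r} = \|v_0 w^{1-p'}\|_{A_{r'}}^{\,r-1} \le \left(2\,C(p)^{\frac{p'-r'}{p'-1}}\,\|w^{1-p'}\|_{A_{p'}}\right)^{r-1}.
\]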

Proof of the extrapolation Theorem 1
As in [GC-RF], Theorem 1 is a consequence of Lemma 3.
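The power $\frac{r-1}{p-1}$ that appears when $p < r$ comes from two elementary identities for the characteristic constant, both immediate from the definition. The sketch below records them; $\sigma$ denotes an arbitrary weight in $A_{r'}$, a symbol introduced only for this illustration.

```latex
\begin{align*}
\|w^{1-p'}\|_{A_{p'}}
  &= \sup_Q \Big(\frac{1}{|Q|}\int_Q w^{-\frac{1}{p-1}}\Big)\Big(\frac{1}{|Q|}\int_Q w\Big)^{\frac{1}{p-1}}
   = \|w\|_{A_p}^{\frac{1}{p-1}} = \|w\|_{A_p}^{\,p'-1},\\[2pt]
\|\sigma^{1-r}\|_{A_r} &= \|\sigma\|_{A_{r'}}^{\,r-1},\\[2pt]
\|\sigma\|_{A_{r'}} \lesssim \|w^{1-p'}\|_{A_{p'}}
  \ &\Longrightarrow\ \|\sigma^{1-r}\|_{A_r} \lesssim \|w\|_{A_p}^{\frac{r-1}{p-1}} .
\end{align*}
```

Applied with $\sigma = v_0 w^{1-p'}$, so that $\sigma^{1-r} = v^{-1}w$, this is the mechanism behind the exponent in the case $p < r$.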

Proof of Theorem 1:
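Case 1: Assume $1 \le r < p$, $w \in A_p$, and $\frac{1}{s} = 1 - \frac{r}{p}$, i.e. $s' = p/r$. Then
\[
\|Tf\|_{L^p(w)}^r = \big\|\,|Tf|^r\big\|_{L^{s'}(w)} = \sup_u \int |Tf|^r\,u\,w,
\]
where the supremum is taken over all $u \ge 0$ with $\|u\|_{L^s(w)} = 1$. For each such $u$ let $v$ be the function provided by Lemma 3. Now we use the hypothesis, $\|T\|_{L^r(vw)} \le N_r(\|vw\|_{A_r})$, and the fact that $N_r$ is an increasing function (footnote 3) and $\|vw\|_{A_r} \le 2\,C(p')^{\frac{p-r}{p-1}}\|w\|_{A_p}$:
\[
\int |Tf|^r\,u\,w \le \int |Tf|^r\,v\,w \le N_r\Big(2\,C(p')^{\frac{p-r}{p-1}}\|w\|_{A_p}\Big)^r \int |f|^r\,v\,w \le 2\,N_r\Big(2\,C(p')^{\frac{p-r}{p-1}}\|w\|_{A_p}\Big)^r\,\|f\|_{L^p(w)}^r .
\]
Taking the supremum over all admissible $u$ we obtain the desired inequality, $\|T\|_{L^p(w)} \le 2^{1/r}\,N_r\big(2\,C(p')^{\frac{p-r}{p-1}}\|w\|_{A_p}\big)$. In particular, if $\|T\|_{L^r(vw)} \le C\,\|vw\|_{A_r}$, then for $p > r$ we get
\[
\|T\|_{L^p(w)} \le 2^{1+1/r}\,C\,C(p')^{\frac{p-r}{p-1}}\,\|w\|_{A_p}.
\]

Case 2: Assume $1 < p < r$, $w \in A_p$, and $s = \frac{p}{r-p}$. In this case the weight $v^{-1}w$ provided by Lemma 3 lies in $A_r$, with $\|v^{-1}w\|_{A_r}$ bounded by a constant multiple of $\|w\|_{A_p}^{\frac{r-1}{p-1}}$. Now, using Hölder's inequality in the first line with $q = r/p > 1$, $q' = \frac{r}{r-p}$ and $q'/q = s$, we obtain
\[
\|Tf\|_{L^p(w)}^p = \int |Tf|^p\,v^{-p/r}\,v^{p/r}\,w \le \left(\int |Tf|^r\,v^{-1}w\right)^{p/r}\left(\int v^{s}\,w\right)^{\frac{r-p}{r}} .
\]

Footnote 3: If $N_r(B)$ denotes the smallest constant with the property that $\|w\|_{A_r} \le B \implies \|Tf\|_{L^r(w)} \le N_r(B)\,\|f\|_{L^r(w)}$, then $N_r(B)$ is increasing in $B$. Indeed, suppose that $B \le B'$. Take $\|w\|_{A_r} \le B$. Then $\|w\|_{A_r} \le B'$ and the above norm inequality holds with $N_r(B')$ in place of $N_r(B)$. Since $N_r(B)$ is the smallest constant with this property, it follows that $N_r(B) \le N_r(B')$. Note that if it is known that $N_r$ is an increasing function, the argument goes through without requiring $N_r(B)$ to be the smallest constant.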
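In particular, if we know that $\|T\|_{L^r(u)} \le C\,\|u\|_{A_r}$ for all $u \in A_r$, then for $1 < p < r$ we have $\|T\|_{L^p(w)} \le C'\,\|w\|_{A_p}^{\frac{r-1}{p-1}}$, with a constant $C'$ depending only on $p$, $r$, $C$ and the constants in (3). Specializing further, when $r = 2$ and $\|T\|_{L^2(u)} \le C\,\|u\|_{A_2}$ for all $u \in A_2$, we get $\|T\|_{L^p(w)} \le C'(p)\,\|w\|_{A_p}^{\alpha}$, where $\alpha = \max\{1, p'/p\}$ and $C'(p)$ depends only on $p$, $C$ and the constants in (3).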

Sharp weighted L p bounds
Proof of Theorem 2: It has been proven in [Pet], [PetV], [D], [HukTV], [W1] and [W2] that the Hilbert transform, the Beurling transform, the dyadic square function, the martingale transforms, and the continuous square function are bounded in $L^2(v)$ with bounds linearly depending on the $A_2$-characteristic constant of the weight $v$. That is, if $T$ denotes any of the above operators, then there exists a constant $C > 0$ such that
\[
\|T\|_{L^2(v)} \le C\,\|v\|_{A_2}
\]
for all $v \in A_2$. These results are known to be sharp for all these operators. The line (8) says that
\[
\|T\|_{L^p(w)} \le C(p)\,\|w\|_{A_p}^{\max\{1,\,p'/p\}}. \tag{9}
\]
Buckley [Buc] showed, using power weights, that for the Hilbert transform and $p < 2$ the power of $\|w\|_{A_p}$ must be at least $p'/p$. This shows the estimate (9) is sharp for $p < 2$. An argument by duality (using the fact that the Hilbert transform is essentially self-adjoint) shows that so is the estimate for $p > 2$.
For the sake of completeness, here is the duality argument. Suppose we can show that for a given operator $T$ and some $1 < p \le 2$ there exists a constant $C_p$ such that $\|T\|_{L^p(w)} \le C_p\,\|w\|_{A_p}^{p'/p}$ for all weights $w \in A_p$. The adjoint operator $T^*$ is bounded on the dual space $(L^p(w))^* = L^{p'}(w^{1-p'})$ with the same bound, i.e. $\|T\|_{L^p(w)} = \|T^*\|_{L^{p'}(w^{1-p'})}$. We can combine these estimates with (6) to arrive at
\[
\|T^*\|_{L^{p'}(u)} \le C_p\,\|u\|_{A_{p'}}
\]
for all $u \in A_{p'}$. The consideration above also shows that if $T^* = e^{i\varphi}T$, it suffices to prove sharpness of the estimates for $T$ either for $1 < p \le 2$ or for $p \ge 2$. For example, the Hilbert, Beurling and martingale transforms are essentially self-adjoint operators, i.e. $T^* = e^{i\varphi}T$; therefore it is sufficient to consider the case $p < 2$.
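The exponent bookkeeping in the duality argument can be made explicit. The sketch assumes the identity $\|u^{1-p}\|_{A_p} = \|u\|_{A_{p'}}^{p-1}$, which follows from the definition of the characteristic constant; we take this to be what (6) records, although the display itself is not reproduced here.

```latex
\begin{align*}
w = u^{1-p} \iff u = w^{1-p'}, \qquad
\|T^*\|_{L^{p'}(u)} = \|T\|_{L^p(u^{1-p})}
 \le C_p\,\|u^{1-p}\|_{A_p}^{p'/p}
 = C_p\,\|u\|_{A_{p'}}^{(p-1)\,p'/p}
 = C_p\,\|u\|_{A_{p'}},
\end{align*}
```

since $(p-1)\,p' = p$. In other words, a bound with power $p'/p$ at some $p \le 2$ is equivalent to a linear bound for the adjoint at the dual exponent $p' \ge 2$.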
For the Beurling and the martingale transforms an example similar to the one given by Buckley for the Hilbert transform will work; hence the bounds given by extrapolation from the sharp linear bound in $L^2(v)$ to $L^p(w)$ give the correct rate in terms of the $A_p$ characteristic of the weight $w$ for these operators as well. For $p = 2$, the details of the example for the Beurling transform are in Dragičević's PhD Thesis [D], where he shows that if $w(z) = |z|^{\alpha}$, $|\alpha| < 2$, and $f(z) = |z|^{-\alpha}\chi_E$, where $E = \{(r, \theta) : 0 < r < 1,\ 0 < \theta < \pi/2\}$, then the growth must be linear.
The martingale transform is defined below and we demonstrate the estimate of its norm from below. The martingale transform is self-adjoint, hence it suffices to prove sharpness for $p < 2$. The same example will also work for the dyadic square function and $p < 2$, but this time we cannot use the duality argument to guarantee sharpness of the linear estimate for $p > 2$. We do not know yet if the linear bound for $p > 2$ is indeed the sharp bound for the dyadic square function; the best we can say is that the sharp exponent is between $p'/p$ and $1$.

The dyadic square function.
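The dyadic square function is defined formally by
\[
S_d f(x) = \Big(\sum_{I \in D} |\langle f, h_I\rangle|^2\,\frac{\chi_I(x)}{|I|}\Big)^{1/2},
\]
where $D$ denotes the family of all dyadic intervals, $\chi_I$ is the characteristic function of the interval $I$, $h_I = |I|^{-1/2}(\chi_{I_r} - \chi_{I_l})$ is the Haar function associated to the interval $I$, and $I_r$, $I_l$ denote the right and the left halves of $I$, respectively. Take $0 < \delta < 1$ and let $w(x) = |x|^{(1-\delta)(p-1)}$, $f(x) = x^{\delta-1}\chi_{[0,1]}(x)$. Then $\|w\|_{A_p} \sim \delta^{1-p}$ and $\|f\|_{L^p(w)} = \delta^{-1/p}$. We will show that for $x > 2$,
\[
S_d f(x) \ge \frac{c}{\delta\,x} \tag{10}
\]
with an absolute constant $c > 0$. Integrating the $p$-th power against $w$ over $(2, \infty)$ gives
\[
\|S_d f\|_{L^p(w)}^p \ge \frac{c^p}{\delta^p}\int_2^\infty x^{-1-\delta(p-1)}\,dx = \frac{c^p\,2^{-\delta(p-1)}}{(p-1)\,\delta^{p+1}} .
\]
Taking $p$-th roots we get
\[
\|S_d f\|_{L^p(w)} \ge C(p)\,\delta^{-1}\,\|f\|_{L^p(w)} \sim C(p)\,\|w\|_{A_p}^{p'/p}\,\|f\|_{L^p(w)},
\]
where $C(p) \sim \frac{1}{p-1}$ when $p$ is near 1. This proves that $\varphi(x) = x^{p'/p}$ is the best function for the estimate $\|S_d\|_{L^p(w)} \le C(p)\,\varphi(\|w\|_{A_p})$ when $1 < p \le 2$. However, for $p > 2$ it only shows that $x^{p'/p} \prec \varphi$ (asymptotically when $x \to \infty$) and $\varphi \prec x$ by extrapolation.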
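The two quantities quoted for the test pair $w(x) = |x|^{(1-\delta)(p-1)}$, $f(x) = x^{\delta-1}\chi_{[0,1]}(x)$ can be checked by hand. The sketch below computes the norm of $f$ exactly and evaluates the $A_p$ ratio on intervals $[0, a]$ only, which already exhibits the order $\delta^{1-p}$; that these intervals capture the supremum up to a constant depending on $p$ is assumed here, not proved.

```latex
\begin{align*}
\|f\|_{L^p(w)}^p &= \int_0^1 x^{p(\delta-1)}\,x^{(1-\delta)(p-1)}\,dx
                  = \int_0^1 x^{\delta-1}\,dx = \frac{1}{\delta},\\[2pt]
\frac{1}{a}\int_0^a x^{(1-\delta)(p-1)}\,dx &= \frac{a^{(1-\delta)(p-1)}}{(1-\delta)(p-1)+1},
\qquad
\frac{1}{a}\int_0^a x^{-(1-\delta)}\,dx = \frac{a^{\delta-1}}{\delta},\\[2pt]
\Big(\frac{1}{a}\int_0^a w\Big)\Big(\frac{1}{a}\int_0^a w^{-\frac{1}{p-1}}\Big)^{p-1}
  &= \frac{\delta^{1-p}}{(1-\delta)(p-1)+1}.
\end{align*}
```

The ratio is independent of $a$, as expected from the homogeneity of power weights, and it blows up like $\delta^{1-p}$ as $\delta \to 0$.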
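We obtain (10) by a direct calculation. Notice that $\langle f, h_I\rangle \ne 0$ implies $I \cap [0, 1] \ne \emptyset$. If, in addition, we require that $I \cap (2, \infty) \ne \emptyset$, i.e. if our dyadic $I$ is to contain some $x > 2$, then it must be $I = I_k = [0, 2^k)$ where $x < 2^k$. For each $x > 2$ there is a unique $n(x) \in \mathbb{N}$ such that $2^{n(x)} \le x < 2^{n(x)+1}$. That means that the only contributions to $S_d f(x)$ come from the intervals $I_k$ with $k > n(x)$. For those intervals, $[0, 1]$ lies in the left half of $I_k$, so
\[
\langle f, h_{I_k}\rangle = -2^{-k/2}\int_0^1 y^{\delta-1}\,dy = -\frac{2^{-k/2}}{\delta}.
\]
That is, for $x > 2$ we have
\[
S_d f(x) = \frac{1}{\delta}\Big(\sum_{k > n(x)} 4^{-k}\Big)^{1/2} = \frac{2^{-n(x)}}{\sqrt{3}\,\delta} \ge \frac{1}{\sqrt{3}\,\delta\,x}.
\]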

Martingale transforms.
The martingale transform is formally given by
\[
T_\sigma f = \sum_{I \in D} \sigma_I\,\langle f, h_I\rangle\,h_I,
\]
where $\sigma = \{\sigma_I = \pm 1 : I \in D\}$ is the symbol of the operator. Wittwer [W1] showed that
\[
\sup_\sigma \|T_\sigma\|_{L^2(w)} \le C\,\|w\|_{A_2}.
\]
This result is sharp. One way to see that is to resort to [DV]. There the Ahlfors-Beurling operator $T$ was represented as the result of a certain averaging process for (planar) martingale transforms associated to the Haar basis in $L^2(\mathbb{C})$. The same reasoning works for arbitrary $p \in (1, \infty)$. Indeed, without any change we obtain operators $T_n$ as in [DV, p. 431]. Since $L^p(w)$ is a reflexive space, the closed unit ball in $B(L^p(w))$ is compact in the weak operator topology. This justifies the existence of a weak limit $T$ of a subsequence of the operators $T_n$ for arbitrary $p$. As in [DV] we show that $T$ is (a multiple of) the Ahlfors-Beurling operator. Now it is clear that the sharpness of the estimates for $T$ on $L^p(w)$ implies the same for $\sup_\sigma \|T_\sigma\|$. Moreover, one can show by examining [W1] that this extends to martingale transforms on the line.
One can also prove sharpness directly. The same example that works for the Hilbert transform and $p < 2$ will work in this case, and duality then implies the case $p > 2$.
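Thus take $x > 2$ and let $\delta$, $w$, $f$, $I_k$ and $n(x)$ be as in the previous section. We have $\langle f, h_{I_k}\rangle = -2^{-k/2}/\delta$ for $k \ge 1$. Since $x$ lies in the right half of $I_{n(x)+1}$ and in the left half of $I_k$ for $k > n(x)+1$, we get
\[
T_\sigma f(x) = -\frac{1}{\delta}\Big(\frac{\sigma_{I_{n(x)+1}}}{2^{n(x)+1}} - \sum_{k > n(x)+1} \frac{\sigma_{I_k}}{2^k}\Big).
\]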
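We can now estimate $\|T_\sigma f\|_{L^p(w)}$ from below:
\[
\|T_\sigma f\|_{L^p(w)}^p \ge \int_2^\infty |T_\sigma f(x)|^p\,w(x)\,dx = \frac{1}{\delta^p}\int_2^\infty \Big|\frac{\sigma_{I_{n(x)+1}}}{2^{n(x)+1}} - \sum_{k > n(x)+1} \frac{\sigma_{I_k}}{2^k}\Big|^p x^{(1-\delta)(p-1)}\,dx.
\]
So far this was true for any $\sigma$. Now choose $\sigma_{I_k} = (-1)^k$. Then the absolute value inside the integral equals $\frac{2}{3}\,2^{-n(x)} \ge \frac{2}{3x}$, and we get
\[
\|T_\sigma f\|_{L^p(w)}^p \ge \Big(\frac{2}{3\delta}\Big)^p \int_2^\infty x^{-1-\delta(p-1)}\,dx = \Big(\frac{2}{3}\Big)^p\,\frac{2^{-\delta(p-1)}}{(p-1)\,\delta^{p+1}},
\]
which is of the order of $\delta^{-p}\,\|f\|_{L^p(w)}^p \sim \|w\|_{A_p}^{p'}\,\|f\|_{L^p(w)}^p$. Thus $\varphi(x) = x^{p'/p}$ is sharp for $p < 2$, and by duality $\varphi(x) = x$ is sharp for $p > 2$.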
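The effect of the alternating choice $\sigma_{I_k} = (-1)^k$ can be seen from a geometric series. The sketch writes $n = n(x)$ and evaluates the combination of signs that appears in $T_\sigma f(x)$; the closed form is our own computation under that choice of signs.

```latex
\begin{align*}
\frac{\sigma_{I_{n+1}}}{2^{n+1}} - \sum_{k>n+1}\frac{\sigma_{I_k}}{2^k}
 &= \frac{(-1)^{n+1}}{2^{n+1}}\Big(1 - \sum_{j\ge 1}\frac{(-1)^j}{2^j}\Big)
  = \frac{(-1)^{n+1}}{2^{n+1}}\Big(1+\frac{1}{3}\Big)
  = \frac{4}{3}\cdot\frac{(-1)^{n+1}}{2^{n+1}},\\[2pt]
\Big|\frac{\sigma_{I_{n+1}}}{2^{n+1}} - \sum_{k>n+1}\frac{\sigma_{I_k}}{2^k}\Big|
 &= \frac{2}{3}\,2^{-n} \ \ge\ \frac{2}{3x}\qquad (2^n \le x).
\end{align*}
```

By contrast, constant signs $\sigma_I \equiv 1$ make the same combination vanish, since $\sum_{k > n+1} 2^{-k} = 2^{-(n+1)}$; the alternating signs are chosen precisely to avoid this cancellation.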
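A locally integrable function $b$ is said to be in dyadic $BMO^d$ if the average oscillation of $b$ is uniformly bounded on dyadic intervals. More precisely, if
\[
\|b\|_{BMO^d} := \sup_{I \in D} \frac{1}{|I|}\int_I |b(x) - b_I|\,dx < \infty,
\]
where $b_I$ denotes the mean of $b$ over $I$. For each function $b \in BMO^d$ the dyadic paraproduct $\pi_b$ is defined by
\[
\pi_b f = \sum_{I \in D} \langle b, h_I\rangle\,f_I\,h_I .
\]
It is known that the dyadic paraproduct is bounded in $L^p(w)$ whenever $w \in A_p$; see [KPer]. The following quadratic estimate can be shown to hold [PerPet]:
\[
\|\pi_b f\|_{L^2(w)} \le K(\|b\|_{BMO})\,\|w\|_{A_2}^2\,\|f\|_{L^2(w)} .
\]
(We do not think this is sharp; we believe the sharp estimate should be linear as for all other operators studied in this paper.) Theorem 1 then gives the upper bound
\[
\|\pi_b f\|_{L^p(w)} \le K_p(\|b\|_{BMO})\,\|w\|_{A_p}^{2\alpha}\,\|f\|_{L^p(w)},
\]
where $\alpha = \max\{1, p'/p\}$.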
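$\|M\|_{L^{p'}(w^{1-p'})} \le C(p')\,\big(\|w\|_{A_p}^{p'-1}\big)^{p/p'} = C(p')\,\|w\|_{A_p}$. Thus, $\|S\|_{L^s(w)} \le C(p')^{p'/s}\,\|w\|_{A_p}^{p'/s}$, as claimed. (b) If $s = p'$, we have $r = 1$, and then $S(u)w = M(uw)$. Automatically the two-weight $A_1$ condition, $M(uw) \le S(u)w$, holds. If $s > p' > 1$, then $p > s' > 1$ and $r > 1$. Note that $(r-1) = (p-1)\left(1 - \frac{p'}{s}\right)$ and, by definition of the maximal function,
\[
S(u)\,w \ge \Big(\big(|u|^{s/p'}w\big)_Q\Big)^{p'/s}\,w^{1-\frac{p'}{s}} \quad \text{on } Q.
\]

where $\|S\| = \|S\|_{L^s(w)}$. Then (a) is clearly satisfied. (b) It follows from the definition of $v$ and the sublinearity of $S$ that $S(v) \le 2\|S\|\,(v-u) \le 2\|S\|\,v$. Suppose $r > 1$. By the previous lemma, the pair $(vw, S(v)w)$ lies in $A_r$ with its $A_r$-constant bounded by $\|w\|_{A_p}^{1-\frac{p'}{s}}$. Also recall that $\|S\| \le C(p')^{p'/s}\,\|w\|_{A_p}^{p'/s}$. We can now estimate $\|vw\|_{A_r}$:
\[
\|vw\|_{A_r} \le 2\|S\|\,\|w\|_{A_p}^{1-\frac{p'}{s}} \le 2\,C(p')^{p'/s}\,\|w\|_{A_p}.
\]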