Weighted two-parameter Bergman space inequalities



Introduction
In an earlier paper [WhWi], Richard L. Wheeden and the author studied a weighted norm inequality, referred to below as (1.1), for the Poisson integral $u(x, y)$ ($x \in \mathbb{R}^d$, $y > 0$) of a function $f$. In this inequality, $\nabla$ denotes the full gradient in $\mathbb{R}^{d+1}_+$: $\nabla = (\partial/\partial x_1, \ldots, \partial/\partial x_d, \partial/\partial y)$; $\mathbb{R}^{d+1}_+$ is the usual upper half space $\mathbb{R}^d \times (0, \infty)$; $\mu$ is a positive Borel measure defined on $\mathbb{R}^{d+1}_+$; and $v$ is a non-negative function in $L^1_{\mathrm{loc}}(\mathbb{R}^d)$. We studied this inequality primarily for $p$ and $q$ in the range $1 < p \le q < \infty$. For the case in which $q \ge 2$, we proved sufficient conditions on $\mu$ and $v$ (depending on $p$, $q$, and $d$) for the inequality (1.1) to hold for all $f \in \bigcup_{1 \le r < \infty} L^r(\mathbb{R}^d, dx)$.
The argument in [WhWi] began with the observation that (1.1) is a special case of a more general inequality. Let $h$ be a smooth function with decay at infinity (precisely how much decay will be specified later), defined on $\mathbb{R}^d$. For $y > 0$, set $h_y(x) = y^{-d} h(x/y)$, the usual $L^1$-dilation. If we set $u(x, y) = f * h_y(x)$, then any component of $\nabla u(x, y)$ can be written as $f * (y^{-1}\varphi_y)(x)$, where $\varphi$ is smooth, has some decay, and in addition satisfies
$$\int_{\mathbb{R}^d} \varphi(x)\,dx = 0. \tag{1.2}$$
This said, we may now shift our attention to an arbitrary smooth $\varphi$ with decay (how much, again, to be specified presently) which satisfies (1.2), and we may ask: what conditions on $\mu$ and $v$ ensure that the corresponding inequality, (1.3), holds for all $f$ in our test class?
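The claim that each component of $\nabla u$ has the form $f * (y^{-1}\varphi_y)$ with a mean-zero $\varphi$ can be checked directly. The following is a sketch, assuming $h$ decays fast enough (say $h(x) = o(|x|^{-d})$ with integrable first derivatives) to differentiate under the integral sign and to discard boundary terms; the explicit formulas for $\varphi$ are our own illustration and are not quoted from [WhWi].

```latex
% x-derivatives: differentiate the dilation h_y(x) = y^{-d} h(x/y) in x_j.
\[
\frac{\partial}{\partial x_j} h_y(x)
  = y^{-d}\, y^{-1}\, (\partial_j h)(x/y)
  = y^{-1} (\partial_j h)_y(x),
\qquad\text{so}\quad \varphi = \partial_j h, \quad
\int_{\mathbb{R}^d} \partial_j h \, dx = 0 .
\]
% y-derivative: product rule on y^{-d}, chain rule on h(x/y).
\[
\frac{\partial}{\partial y} h_y(x)
  = -d\, y^{-d-1} h(x/y) - y^{-d-1}\, \frac{x}{y} \cdot (\nabla h)(x/y)
  = y^{-1} \varphi_y(x),
\]
\[
\varphi(x) = -d\, h(x) - x \cdot \nabla h(x)
           = -\operatorname{div}\bigl( x\, h(x) \bigr),
\qquad
\int_{\mathbb{R}^d} \varphi \, dx = 0
\quad\text{(divergence theorem)} .
\]
```

In both cases the cancellation comes from $\varphi$ being a derivative (or a divergence) of a decaying function, which is why the condition $\int \varphi = 0$ is automatic for kernels arising this way.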
In this paper, we are concerned with two-parameter generalizations of (1.1) and (1.3), and especially the latter. What does "two-parameter" mean? Let $\mathbb{R}^d = \mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$. For $i = 1, 2$, let $\varphi_i$ be smooth functions with good decay, defined on $\mathbb{R}^{d_i}$, which satisfy $\int_{\mathbb{R}^{d_i}} \varphi_i\,dx_i = 0$. In our two-parameter problem, we look for conditions on measures $\mu$, defined on $\mathbb{R}^{d_1+1}_+ \times \mathbb{R}^{d_2+1}_+$, and non-negative weights $v \in L^1_{\mathrm{loc}}(\mathbb{R}^{d_1} \times \mathbb{R}^{d_2})$, which are sufficient for the inequality (1.4) to hold for all $f$. (Here we are using '$(x, y)$' to stand for '$(x_1, x_2, y_1, y_2)$'.) When we write $(\varphi_i)_{y_i}(x_i)$, we mean, of course, $y_i^{-d_i}\varphi_i(x_i/y_i)$. In the case where the $\varphi_i$'s are the kernels that "generate" the components of the Poisson kernel (in their respective upper half spaces!), such a result would yield a sufficient condition for the inequality (1.5), where $u$ is $f$'s biharmonic extension and $\nabla_i$ denotes the full gradient in the $(x_i, y_i)$ variables. Thus, $\nabla_1\nabla_2 u$ is a $(d_1+1) \times (d_2+1)$ matrix of functions, and $|\nabla_1\nabla_2 u|$ can be taken to be the square root of the sum of the squares of its entries.
In this paper we prove sufficient conditions for inequality (1.4), valid for $1 < p \le 2 \le q < \infty$ and for a certain class of $\varphi_i$'s. This class includes the kernels that generate the $x$-derivatives of the Poisson kernels, but not, alas, the $y$-derivatives. The reason for this troubling gap is that, while the convolution kernels for the $x$-derivatives of the $d$-dimensional Poisson kernel decay to order $(1+|x|)^{-d-2}$, the corresponding $y$-derivative kernel only decays like $(1+|x|)^{-d-1}$. Unfortunately, our general one-parameter result (Theorem 1.1 below) requires decay like $(1+|x|)^{-d-2}$.
In [WhWi], the authors treated the $y$-derivative by means of a trick combining harmonicity and the Poisson kernel's semigroup property. The whole trick is given in [WhWi], but in a nutshell it's this. Our duality argument (which works so well with the $x$-derivatives) requires that we obtain good Littlewood-Paley estimates for a certain function $Tg(x)$, expressed as a weighted integral of $\partial P_y(x-t)/\partial y$ over $(t, y) \in \mathbb{R}^{d+1}_+$. For the $x$-derivatives, the corresponding integrals involved $\partial P_y(x-t)/\partial x_i$, and we got our Littlewood-Paley estimates by convolving with $\psi_\eta(\cdot)$, where $\psi$ was a smooth, compactly supported function with cancellation. The extra decay in the $\partial P_y(x-t)/\partial x_i$'s let us bound the resulting integrals nicely. Lacking that decay for the $y$-derivative, R. L. Wheeden and the author convolved $Tg$ with $\partial P_y(\cdot)/\partial y$. By harmonicity and the semigroup property, the resulting integral could be expressed in terms of second partials in the $x$ variables, for whose kernels we do have good bounds. In our final section we drag this additional argument in to obtain a sufficient (but not so good) condition for the $y$-derivatives in the bi-space setting as well.
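The two decay rates, and the identity behind the trick, can be read off from the explicit Poisson kernel. The computation below is a sketch under the usual normalization $P_y(x) = c_d\, y\,(|x|^2+y^2)^{-(d+1)/2}$, where $c_d$ is the constant making $\int P_y = 1$; this normalization is our assumption, since the text does not display the kernel.

```latex
% x-derivative at height y = 1: one extra power of decay.
\[
\frac{\partial P_1}{\partial x_j}(x)
  = -(d+1)\, c_d\, \frac{x_j}{(1+|x|^2)^{(d+3)/2}}
  = O\bigl( (1+|x|)^{-d-2} \bigr).
\]
% y-derivative at height y = 1: the numerator grows like |x|^2.
\[
\frac{\partial P_y}{\partial y}(x)\Big|_{y=1}
  = c_d\, \frac{|x|^2 - d}{(1+|x|^2)^{(d+3)/2}}
  \sim c_d\, |x|^{-d-1}
  \quad (|x| \to \infty).
\]
% Semigroup property plus harmonicity of P_y(x) in (x, y):
\[
P_{y_1} * P_{y_2} = P_{y_1+y_2},
\qquad
\frac{\partial^2 P_y}{\partial y^2} = -\Delta_x P_y ,
\]
\[
\frac{\partial P_{y_1}}{\partial y_1} * \frac{\partial P_{y_2}}{\partial y_2}
  = \frac{\partial^2 P_y}{\partial y^2}\Big|_{y = y_1+y_2}
  = -\Delta_x P_{y_1+y_2} .
\]
```

The last line shows why convolving with a second $y$-derivative helps: the product of two slowly decaying kernels is rewritten through second $x$-partials, which decay faster.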
Before stating our main theorem, we should state the one-parameter result from [WhWi] that motivated it. Even this earlier result is fairly technical, and the two-parameter result is, in our opinion, liable to be completely indigestible to a reader who has not seen the one-parameter version first.

The one-parameter result.
As is traditional in this business, we begin with cubes $Q \subset \mathbb{R}^d$. We use $\ell(Q)$ to denote the sidelength of $Q$, and $|Q|$ is its Lebesgue measure. We denote the Euclidean center of $Q$ by $x_Q$. By $\widehat{Q}$ we mean the so-called "Carleson box" sitting above $Q$. We use $T(Q)$ to denote the "top half" of $\widehat{Q}$. Equation (1.6) defines an Orlicz-type norm that shows up in weighted Littlewood-Paley theory [W1], [W2], and whose properties underlie the results in [WhWi] as well as those of the present paper.
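For the reader who wants these sets in symbols, here is how they are usually written, with $\widehat{Q}$ denoting the Carleson box. This is a sketch of the standard convention, which we assume rather than quote; in particular the choice of open or closed endpoints in the $y$-interval may differ from the one in [WhWi].

```latex
\[
\widehat{Q} = Q \times \bigl( 0, \ell(Q) \bigr),
\qquad
T(Q) = Q \times \bigl[ \ell(Q)/2,\ \ell(Q) \bigr).
\]
```

With half-open dyadic cubes and this choice of endpoints, the sets $T(Q)$, taken over all dyadic $Q \subset \mathbb{R}^d$, are pairwise disjoint and cover $\mathbb{R}^{d+1}_+$, which is what allows integrals over the half space to be cut into sums over cubes.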
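Theorem 1.1. Let $v \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ be a non-negative weight and let $\mu$ be a positive Borel measure on $\mathbb{R}^{d+1}_+$. Set $\sigma = v^{1-p'}$, where $p'$ is the dual exponent to $p$. Let $\eta > p'/2$. There is a positive constant $C = C(\eta, p, q, d, m)$ such that the following is true: If there exists a weight $w$ satisfying (1.7) and (1.8), then (1.3) holds.
Remark. The reader can see what we mean by indigestibility.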
Remark. The theorem, as stated in [WhWi], actually gives a sufficient condition for the range $1 < p \le q < \infty$, with $q \ge 2$. We have stated this limited form of the theorem to make it more closely resemble Theorem 1.3 below. The restriction in Theorem 1.3 comes about because our method of proof, in two parameters, requires $p' \ge 2$. This is related to another difference between Theorem 1.1 and the corresponding result in [WhWi]. The theorem in [WhWi] does not contain the hypothesis (1.7). Rather, it speaks of pairs of weights ("$p'$-pairs") $(\sigma, w)$ for which $w$ also satisfies (1.8). However, as is pointed out in [WhWi, p. 949] and in [W1], a pair that satisfies (1.7) is a $p'$-pair. Unfortunately, we have no good characterization of $p'$-pairs (for $p' \ne 2$) in the two-parameter setting. We express Theorem 1.1 in this fashion in order to make its statement look more like those of Theorem 1.3 and Theorem 5.3 (see below).
Remark. If $\sigma$ belongs to the Muckenhoupt $A_\infty$ class, then (1.7) holds for $w = c\sigma$, where $c$ depends on $\eta$, $d$, and the $A_\infty$ "box specs" of $\sigma$. In that case, (1.8) amounts to saying that $\mu$ and $\sigma$ cannot put too much mass too near any cube $Q$. Since $\sigma$ is big when $v$ is small, this is a quantitative way of saying that $v$ cannot be too small near points where $\mu$ is "large". Theorem 1.1 is a restatement of this fact for $v$'s whose corresponding $\sigma$'s are not in $A_\infty$.
The two-parameter result.
We begin here with rectangles $R = Q_1 \times Q_2$, where the $Q_i$ are cubes in $\mathbb{R}^{d_i}$. We use $|R|$ to mean the Lebesgue measure of $R$. We set $T(R) = T(Q_1) \times T(Q_2)$ and $\widehat{R} = \widehat{Q}_1 \times \widehat{Q}_2$, where $T(Q_i)$ and $\widehat{Q}_i$ are as defined above.
We will be using the next definition so often that it merits its own formal statement:
Remark. The only difference between Definition 1.2 and the one given earlier is that Definition 1.2 applies to rectangles.
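Our main result, which we prove in Section 4, is:
Theorem 1.3. Let $1 < p \le 2 \le q < \infty$. Let $v \in L^1_{\mathrm{loc}}(\mathbb{R}^{d_1} \times \mathbb{R}^{d_2})$ be a non-negative weight and let $\mu$ be a positive Borel measure on $\mathbb{R}^{d_1+1}_+ \times \mathbb{R}^{d_2+1}_+$. Set $\sigma = v^{1-p'}$, where $p'$ is the dual exponent to $p$. Let $\eta > p'$ and let $\epsilon > 0$. There is a positive constant $C$ such that the following is true: If there exists a weight $w$ satisfying (1.9) and (1.10) for all rectangles $R = Q_1 \times Q_2$, then (1.4) holds for all $f \in \bigcup_{1 \le r < \infty} L^r(\mathbb{R}^d, dx)$.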
Remark. Note the absence of log's in the numerator and the extra $\epsilon$'s in the denominator of the two-parameter condition (1.10).
The rest of the paper is laid out as follows. In Section 2 we state and prove certain results from weighted Littlewood-Paley theory which we will need in the proof of Theorem 1.3. In Section 3 we state a technical result from [WhWi], concerning convolutions of smooth functions with specified amounts of decay and cancellation, and we apply this result to prove a lemma (Lemma 3.2). Lemma 3.2 is a pointwise substitute for a series of integral inequalities used in [WhWi] to prove Theorem 1.1. This pointwise result is part of what lets us prove our two-parameter theorem without having a full-blooded, two-parameter weighted-norm theory of the Littlewood-Paley square function; it is also where the extra $\epsilon$'s in (1.10) will come from. In Section 4 we prove Theorem 1.3. In Section 5 we state and prove a sufficiency result for the biharmonic Poisson kernel.

Littlewood-Paley theory
The basis of all of our arguments is the Calderón-Torchinsky decomposition lemma. Let $\psi_i$ ($i = 1, 2$) be real, radial, $C^\infty$ functions defined on $\mathbb{R}^{d_i}$, that satisfy: For If $y_1$ and $y_2$ are positive numbers and as a distribution [CF].
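For orientation, the one-parameter identity behind a decomposition of this type can be verified on the Fourier side. The following is a sketch of the standard Calderón reproducing formula for a single real, radial $\psi$ with $\int \psi = 0$; the normalization in the middle line is an assumption we impose for the illustration, and the two-parameter formula (2.1) is obtained by applying the same idea in each variable separately.

```latex
% Convolution becomes multiplication; \hat\psi is real because \psi is real and radial.
\[
\widehat{ f * \psi_y * \psi_y }(\xi) = \hat f(\xi)\, \hat\psi(y\xi)^2 .
\]
% Assumed normalization (the integral is independent of \xi \neq 0
% by radial symmetry and the dilation invariance of dy/y):
\[
\int_0^\infty \hat\psi(y\xi)^2 \, \frac{dy}{y} = 1
\qquad (\xi \neq 0).
\]
% Integrating the first line in dy/y then gives the reproducing formula.
\[
f = \int_0^\infty f * \psi_y * \psi_y \, \frac{dy}{y} .
\]
```

The cancellation $\int \psi = 0$ is what makes the middle integral converge near $y = 0$, and smoothness of $\psi$ is what makes it converge near $y = \infty$.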
It is easy to show that, for $f \in L^2$, the (vector-valued) integral (2.1) actually converges to $f$ in the $L^2$ norm. If $f$ is smooth and decays rapidly at infinity, then the integral (2.1) converges uniformly and pointwise, and can be cut up and rearranged at will. We will use this freedom in the following way.
With suitable (and quite weak) hypotheses on $f$, we may re-write the integral formula (2.1) as a sum: Each of these functions $b_R$ has support contained in $\tilde{R}$ (the concentric triple of $R$), is smooth (it inherits this from the $\psi_i$'s), and has cancellation in the $x_1$ and $x_2$ directions; that is to say, for each fixed and analogously for each fixed $x_2^* \in \mathbb{R}^{d_2}$; this cancellation property is also inherited from the $\psi_i$'s.
The meaning of the Calderón-Torchinsky lemma is that any (essentially arbitrary) function can be written as a sum of smooth, compactly supported functions that have cancellation. We can go further. Let us say that a function where each $a_R$ is adapted to $R$, and the $\lambda_R$'s are complex numbers satisfying: , for some constant $C$ that depends on $\Psi$ (which, recall, depends on $d_1$ and $d_2$) but not on $f$. Let us say that a function $f$ is in standard form if there is a finite family, $\mathcal{G}$, of triples of double-dyadic rectangles, such that where the $\lambda_R$'s are real numbers and each $a_R$ is adapted to $R$. (Notice that the 'tildes' have been "absorbed" into the $R$'s.)
We will use Littlewood-Paley theory to bound certain functions in standard form on weighted spaces. We will measure the "badness" of our weights via the Orlicz-type norm $\sigma(R, \eta)$ given in Definition 1.2. When $\eta > 0$, the ratio $\sigma(R, \eta)/\int_R \sigma$ measures the extent to which $\sigma$'s mass gets concentrated on a small part of $R$ (that is, a subset with small Lebesgue measure compared to $|R|$). In particular, the ratio is uniformly bounded (for any $\eta > 0$) if and only if $\sigma$ is a two-parameter $A_\infty$ weight.
The proof of Theorem 1.3 depends on Theorem 2.1, a result from [W2].
(This is one of many variants of the Lusin square function.) The next corollary follows by rearranging sums.
Corollary 2.2. Let $\sigma$ and $w$ be weights such that, for some
In the one-parameter setting, both Theorem 2.1 and Corollary 2.2 have $L^p$ analogues for $p \ne 2$ [W1]. Precisely, by applying the one-parameter version of the Calderón-Torchinsky lemma, we can write an essentially arbitrary $f$ as a sum $f = \sum_Q \lambda_Q a_{(Q)}$, indexed over the dyadic cubes $Q \subset \mathbb{R}^d$, where the $\lambda_Q$'s are numbers and the $a_{(Q)}$'s are smooth functions satisfying: We define an analogous one-parameter square function: , and suppose that $\sigma$ and $w$ are two weights in
In the two-parameter setting, the appropriate theorem would be that if $\eta > p$, and $\sigma$ and $w$ are two weights such that $\sigma(R, \eta) \le w(R)$ for all rectangles $R$, then Unfortunately, this is not known to be the case (yet) in the two-parameter context.
Now, we obviously need some $L^p$ estimates to prove our main result. However, the $L^p$ estimates we need do not have to be the precise analogues of the one given in Corollary 2.2. This saving fact lets us get around the hole in our theory by means of a trick. The author introduced this device in the context of two-parameter martingales in [W3], and we apply it with essentially no change here. The only difference is that, in [W3], we used it to estimate linear sums of two-parameter Haar functions, whereas here we are applying it to linear sums of two-parameter (i.e., rectangle) adapted functions, as defined above.
The trick yields Theorem 2.3.
Proof of Theorem 2.3: Let $s = (r/2)'$, the dual exponent to $r/2$ (which, recall, is $\ge 1$). Let $h$ be a non-negative, measurable function defined on $\mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$, such that $\|h\|_{L^s(\sigma)} = 1$ and Let $\alpha = 2\eta/r > 2$. Define $w \equiv h\sigma$. According to Theorem 2.1, the right-hand side of (2.2) is less than or equal to Let us now consider one of the terms $w(R, \alpha)$. By following the argument from [W3] (which is essentially Young's Inequality) we see that $w(R, \alpha)$ is bounded above by a positive constant times Now let's apply Hölder's Inequality to (2.4). We get: (What we just did is the beginning of the trick we mentioned before the statement of the theorem.) Define But now, a second application of our Young's Inequality argument implies that (2.6) This finishes the trick.
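The role of the auxiliary function $h$ at the start of this proof is the usual one in a duality argument. As a sketch, for a generic non-negative measurable function $F$ (our notation, standing in for whatever is being estimated in $L^r(\sigma)$) and $r \ge 2$:

```latex
\[
\|F\|_{L^r(\sigma)}^2
  = \bigl\| F^2 \bigr\|_{L^{r/2}(\sigma)}
  = \sup \Bigl\{ \int F^2\, h\, \sigma \, dx \;:\;
        h \ge 0,\ \|h\|_{L^s(\sigma)} = 1 \Bigr\},
\qquad
\frac{2}{r} + \frac{1}{s} = 1 .
\]
```

This converts an $L^r(\sigma)$ bound into an $L^2$ bound with respect to the new weight $w = h\sigma$, which is the setting where an $L^2$ result such as Theorem 2.1 applies; the remaining work is to control the Orlicz-type quantities of $h\sigma$ uniformly in $h$.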
It might be helpful here if we explain how we will use Theorem 2.3. The reader of [WhWi] will recall how, in that paper, inequality (1.3) was treated by writing the kernel $\varphi$ as a sum of a $\psi_1$ and a $\psi_2$, where $\psi_1$ had compact support and integral equal to 0; and $\psi_2$, while not compactly supported, had many moments of cancellation. The analogous inequalities (1.3) for $\psi_1$ and $\psi_2$ were treated by different arguments. In the two-parameter setting, we get four inequalities like (1.4). One of these, that in which both kernels are compactly supported, will be handled as a direct consequence of Theorem 2.3. The other three will require more subtlety, but their treatment will follow the basic idea of Theorem 2.3.

Two technical estimates
The proof of Theorem 1.3 depends on certain precise estimates on the convolutions of smooth kernels that have cancellation. These estimates are stated in the following (highly technical) lemma, whose proof can be found in [WhWi].
Lemma 3.1. Assume that each $\psi_i$ has support contained in $\{|x| \le 1\}$ and satisfies $\int \psi_i = 0$. Furthermore, suppose that, for some non-negative integer $m_i$, and for all and that $\int_{\mathbb{R}^{d_i}} \varphi_i(x_i) P(x_i)\,dx_i = 0$ for all polynomials of degree $\le m_i + 1$. Then the following estimates hold for the convolutions $(\psi_i)_y * (\varphi_i)_\eta(x_i)$, for all $x_i \in \mathbb{R}^{d_i}$ and positive numbers $y$ and $\eta$: for constants $C = C_i$ only depending on the $\varphi_i$'s, $\psi_i$'s, $m_i$'s, and $d_i$'s.
We will be applying Lemma 3.1 in the following, very specific way.
If $\psi_i$ and $\varphi_i$ satisfy the hypotheses of Lemma 3.1 for some $m_i$, then Inequality (3.1) follows from statement c) in Lemma 3.1 and inequality (3.2) follows from a) and b). We will be seeing a lot of (3.1) and (3.2). Therefore, let us define, for dyadic cubes $Q_i$ and $Q_i'$ in $\mathbb{R}^{d_i}$, and non-negative integers $m_i$: The next lemma is important:
Lemma 3.2. Let $\gamma > 0$, $0 < \epsilon < 1$, and let $k$ be an integer. There is a constant and all $k$, (3.3)
We consider two cases: $k \le 0$ and $k > 0$.
and we have that It is obvious that On the other hand, if The next corollary follows by iterating Lemma 3.2:

Proof of Theorem 1.3
We rephrase our weighted norm inequality in a dual form. Set $\sigma = v^{1-p'}$. Let $\varphi_1$ and $\varphi_2$ satisfy the respective hypotheses of Theorem 1.3. If $g : \mathbb{R}^{d_1+1}$ and compactly supported, we define: This integral converges absolutely for all $x \in \mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$ because of our special assumptions on $g$ (note that the support of $g$ stays away from . The operator $T$ is the adjoint of the operator that takes $f$ into for all these $g$. We will prove Theorem 1.3 by showing that this is what happens (given hypotheses (1.9) and (1.10)).
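The passage to the dual form follows a standard pattern, which we sketch here for a generic linear operator $S$ with adjoint $T$, meaning $\int (Sf)\, g \, d\mu = \int f\, (Tg)\, dx$. The names $S$ and the constant $C$ are ours; we do not reproduce the precise operator appearing in (1.4).

```latex
% Claimed equivalence (same constant C on both sides):
\[
\Bigl( \int |Sf|^q \, d\mu \Bigr)^{1/q}
   \le C \Bigl( \int |f|^p\, v \, dx \Bigr)^{1/p}
\ \ \forall f
\quad\Longleftrightarrow\quad
\Bigl( \int |Tg|^{p'} \sigma \, dx \Bigr)^{1/p'}
   \le C \Bigl( \int |g|^{q'} \, d\mu \Bigr)^{1/q'}
\ \ \forall g .
\]
% Why sigma = v^{1-p'}: split the weight and apply Hölder's inequality.
\[
\Bigl| \int f\, (Tg) \, dx \Bigr|
  = \Bigl| \int \bigl( f\, v^{1/p} \bigr) \bigl( Tg\; v^{-1/p} \bigr) dx \Bigr|
  \le \Bigl( \int |f|^p v \Bigr)^{1/p}
      \Bigl( \int |Tg|^{p'} v^{-p'/p} \Bigr)^{1/p'},
\qquad
-\frac{p'}{p} = 1 - p' .
\]
```

The identity $p'/p = p' - 1$ is just $1/p + 1/p' = 1$ multiplied through by $p'$, and it is the reason the dual weight is $\sigma = v^{1-p'}$.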
For $i = 1, 2$, we can write $\varphi_i = \rho_i^{(1)} + \rho_i^{(2)}$ (4.1), where $\rho_i^{(1)}$ has compact support and $\int \rho_i^{(2)} P_i(x)\,dx = 0$ for all polynomials $P_i$ (in the $x_i$ variables) of degree $\le m_i + 1$; we do this by, essentially, throwing $m_i + 1$ of $\varphi_i$'s moments "onto" $\rho_i^{(1)}$. When we do this, the functions $\rho_i^{(1)}$ get one good property (compact support), while the non-compactly supported $\rho_i^{(2)}$'s get lots of cancellation. Using our decomposition (4.1), we may write $Tg$ as a sum of four terms: Now, the piece $T^{(1,1)}g$, from its very formulation, is equal to a function in standard form. We can dispose of it quickly. We write: The sum is over all double-dyadic rectangles $R$, but only finitely many terms are not identically zero, because $g$ and the $\rho_i^{(1)}$'s have compact supports. It is clear that each summand in (4.2), as a function of $x$, has support contained in its respective $R$. These functions also inherit smoothness and cancellation from the $\rho_i^{(1)}$'s. Thus we may write the sum as where each $b_R$ is adapted to $R$ and the $\lambda_R$'s satisfy $|g|^{q'}\,d\mu(t, y)$ with a constant $C$ that depends on the $d_i$'s and the $\rho_i^{(1)}$'s. Take $\eta > p'$, as in the hypotheses of Theorem 1.3, and suppose that $w$ is a weight satisfying (1.9) for all rectangles $R$. By Theorem 2.3,
Since $q' \le 2$, the last quantity is less than or equal to The hypothesis (1.10) on $w$ implies (after an elementary estimate) Therefore, our bound on $\lambda_R$ implies that (4.3) is less than or equal to $|g|^{q'}\,d\mu(t, y)$ which is exactly what we want. Thus, the $T^{(1,1)}g$ term is okay.
The terms $T^{(1,2)}g$, $T^{(2,1)}g$, and $T^{(2,2)}g$ involve non-compactly-supported kernels, and require different arguments. This is where we will use Lemma 3.1. It is obvious that $T^{(1,2)}g$ and $T^{(2,1)}g$ are the same kind of animal, and so we need only treat one of them. It will turn out that the argument that handles $T^{(2,2)}g$ can also be used, with minor modifications, on $T^{(1,2)}g$. Therefore we shall deal with $T^{(2,2)}g$ first.
Our argument is modeled closely on that of [WhWi]. Let $\kappa$ be the dual exponent to $p'/2$ (which, recall, is $\ge 1$), and let $h \in L^\kappa(\sigma)$ be non-negative, satisfy $\|h\|_{L^\kappa(\sigma)} = 1$, and be chosen so that We seek a good a priori bound, independent of $h$, for the right-hand side of (4.4).
The function $T^{(2,2)}g$ is bounded, smooth, and has good decay at infinity. If we let $\Psi_y$ be as defined at the beginning of Section 2, then by a standard approximation argument (essentially Fatou's Lemma), combined with Theorem 2.1, we may write: where $\eta$ is any number larger than 2, and As in the proof of Theorem 2.3, we can dominate $(h\sigma)(\tilde{R}, \eta)$ by a constant times With the $\varphi_{(\tilde{R})}$'s now fixed, let us define and it is this last object which we must bound. We need to know how big $\Lambda(R)$ can get (or doesn't get). Let us make the convention that "(x, y) ∈ R d1+1 y 1 , y 2 ); y i > 0"; and analogously, when we write "(x, y) ∈ The "$\rho^{(2)}$" functions satisfy the cancellation and decay hypotheses required of the $\varphi_i$'s in the statement of Lemma 3.1. The discussion following the lemma shows that if we at once get that where we have set (We refer the reader to [WhWi] for a detailed discussion of this argument in the one-parameter setting.) On the other hand, the preceding discussion implies that Our goal now is to show that, under the hypotheses of Theorem 1.3, the inequality obtains for all non-negative, finitely-supported sequences $\{G(R)\}_R$.
In other words, we have reduced our problem to showing that the "kernel" $\beta(R', R)$ maps boundedly from the sequence space $\ell^{q'}(\Gamma(R))$ into the sequence space $\ell^2(\nu(R))$. We shall prove this boundedness in the same way as in [WhWi], i.e., by means of the Riesz-Thorin Interpolation Theorem.
We shall need two endpoint estimates, $\ell^\infty \to \ell^\infty$ and $\ell^1 \to \ell^{2/q'}$ (recall that $2/q' \ge 1$). In order to make these estimates (particularly the first) go through smoothly, let us redefine our problem, by setting $G(R) = Y(R)|\tilde{R}|$, and having $\{Y(R)\}$ be the sequence that is acted on. This change of variable requires that we replace the kernel $\beta(R', R)$ with $B(R', R)$. In addition, we must replace the "weight" $\Gamma(R)$ by $|\tilde{R}|^{q'}\Gamma(R)$. This done, we now need to show that the kernel $B(R', R)$ maps boundedly $\ell^\infty \to \ell^\infty$ and $\ell^1(|\tilde{R}|^{q'}\Gamma(R)) \to \ell^{2/q'}(\nu(R))$.
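It is worth checking that these two endpoints do interpolate to the estimate we are after. The arithmetic below is a sketch; $\theta$ is the interpolation parameter, $c_R$ is our shorthand for the volume factor in the change of variable $G(R) = Y(R)\,c_R$, and we assume that the weighted sequence spaces are handled by the version of the interpolation theorem that allows a change of measure on the domain side.

```latex
% Domain exponent: interpolate between \ell^\infty and \ell^1.
\[
\frac{1}{q'} = \frac{1-\theta}{\infty} + \frac{\theta}{1}
\quad\Longrightarrow\quad \theta = \frac{1}{q'} .
\]
% Target exponent: interpolate between \ell^\infty and \ell^{2/q'}.
\[
\frac{1-\theta}{\infty} + \frac{\theta}{2/q'}
  = \frac{1}{q'} \cdot \frac{q'}{2} = \frac{1}{2}
\quad\Longrightarrow\quad \text{target space } \ell^2 .
\]
% The change of variable is consistent with the original weight \Gamma(R):
\[
\sum_R |Y(R)|^{q'}\, c_R^{\,q'}\, \Gamma(R)
  = \sum_R |G(R)|^{q'}\, \Gamma(R) .
\]
```

So the interpolated bound is exactly $\ell^{q'} \to \ell^2$, with the weight on the domain side returning to $\Gamma(R)$ once $Y$ is rewritten in terms of $G$.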
$\ell^\infty \to \ell^\infty$. This is equivalent to having $\sum_R B(R', R) \le C$ for all $R'$, and this inequality will follow if we have, for $i = 1, 2$, and all dyadic cubes This is proved in [WhWi], though with slightly different notation from what we have here. For the sake of completeness (and ease of reading), we shall give a proof that uses our present notation.
Let us write the sum as $(I)_i + (II)_i + (III)_i$, where for some $\delta > 0$, since $d_i + m_i + 2 > d_i$. But it is easy to see [WhWi] that this last sum is $\le C_{\delta, d_i}$. So much for $(I)_i$.
$(II)_i$: The $\ell^\infty \to \ell^\infty$ bound has been proved. Now for the $\ell^1 \to \ell^{2/q'}$ bound. By Minkowski's inequality for double integrals, Therefore, the $\ell^1 \to \ell^{2/q'}$ bound will follow if $\le C|\tilde{R}|^{q'}\Gamma(R)$, (4.6) holds for all $R$, for some constant $C$.
Inequality (4.6) will turn out to be an easy consequence of Lemma 3.2 and the hypotheses of Theorem 1.3.

Proof of Inequality (4.6)
which, by taking $\eta$ sufficiently close to 2, we may assume is $\le C\,w(\tilde{R}')$. Thus, we may dominate (4.7) by Because of Corollary 3.4, this is less than or equal to: , which (see again the hypotheses of Theorem 1.3) is assumed to be less than or equal to When we raise this to the power $q'/2$, the result is less than or equal to which is what we wanted. Therefore, the $T^{(2,2)}$ term is okay.
We can handle the term $T^{(1,2)}g$ by modifying the preceding argument just a little. First, observe that $\frac{dt_2\,dy_2}{y_2}$ (4.8) in $L^2$. The meaning of (4.8) is that we take the convolution of $f$ with $(\psi_2)_{y_2}$ "in the $x_2$ variable" (leaving $x_1$ fixed), and then convolve that with $(\psi_2)_{y_2}$ again, much as we do in the original Calderón-Torchinsky formula (2.1). The proof of (4.8) comes by Fourier inversion, where we take the Fourier transform only with respect to the $x_2$ variable. If we let $f = T^{(1,2)}g$ in (4.8), we get: 2 ) η2 (t 2 − s 2 )]. If we plug (4.11) into (4.10), and then substitute that into (4.9), we get where the $Q_i$ are, as usual, dyadic cubes.
It is important to note that the integration over $T(Q_2)$ is done in the $(t_2, y_2)$ variables and that the integration over $T(Q_1) \times \mathbb{R}^{d_2+1}_+$ is done in the $(s, \eta)$ variables: failure to observe this hung the author up for some time.
It is easy to see that, if $(s_1, \eta_1) \in T(Q_1)$, then $F(x_1, s, \eta, t_2, y_2)$, considered as a function of The function $F$ inherits smoothness and cancellation (in $x_1$) from $\rho_1^{(1)}$, and therefore so does $b_R$. In the same fashion, $b_R$ inherits smoothness and cancellation (in $x_2$) from $(\psi_2)_{y_2}(x_2 - t_2)$. Thus, we may write $b_R(x_1, x_2) = \lambda_R a_R(x_1, x_2)$, where $\lambda_R$ is a number and $a_R(x_1, x_2)$ is adapted to $R$.
We need a good bound on $|\lambda_R|$, which we get, as usual, by controlling $\|b_R\|_\infty$. Let $x = (x_1, x_2) \in R \subset \mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$. Note that, for any (5.1) | q dµ(x, y) holds for all mixed partials such that neither $j_i = 0$, and for all $f \in \bigcup_{1 \le r < \infty} L^r(\mathbb{R}^d, dx)$.
Unfortunately, the kernels $\varphi_i^{(0)}$ only decay like $(1+|x_i|)^{-d_i-1}$, which is not quite good enough for Theorem 1.3. In [WhWi], the authors circumvented this by a trick that exploited harmonicity and the Poisson kernel's semigroup property. We refer the reader to [WhWi] for the details of this argument. Its upshot is that, in obtaining our sequence space estimates when $j_i = 0$, it is sufficient to replace $a_i(Q_i, Q_i')$ by (see the top of p. 959):

We analogously define
The proof of the following lemma is like that of Lemma 3.2, and we omit it.
Lemma 5.2. Let $\gamma > 0$, $0 < \epsilon < 1$, and let $k$ be a positive integer. There is a constant $C = C(\gamma, \epsilon, d_i)$ such that, for all $x_i \in \mathbb{R}^{d_i}$, all cubes $Q_i \subset \mathbb{R}^{d_i}$, and all $k$, (5.2)
The proof of the following theorem is essentially identical to that of Theorem 5.1, and we omit it.
Theorem 5.3. Let $1 < p \le 2 \le q < \infty$ and let $v$, $\mu$, and $\sigma$ be as in the hypotheses of Theorem 1.3. Let $\eta > p'$ and $\epsilon > 0$. There is a positive constant such that the following is true: If there exists a weight $w$ satisfying (5.3) | q dµ(x, y)