The boundedness of multilinear Calderón-Zygmund operators on weighted and variable Hardy spaces

We establish the boundedness of the multilinear Calderón-Zygmund operators from a product of weighted Hardy spaces into a weighted Hardy or Lebesgue space. Our results generalize to the weighted setting results obtained by Grafakos and Kalton (Collect. Math. 2001) and recent work by the third author, Grafakos, Nakamura, and Sawano. As part of our proof we provide a finite atomic decomposition theorem for weighted Hardy spaces, which is interesting in its own right. As a consequence of our weighted results, we prove the corresponding estimates on variable Hardy spaces. Our main tool is a multilinear extrapolation theorem that generalizes a result of the first author and Naibo (Differential Integral Equations 2016).


Introduction
In this paper we study the boundedness of multilinear Calderón-Zygmund operators (m-CZOs) on products of weighted and variable Hardy spaces. More precisely, we are interested in the following operators. Let $K(y_0, y_1, \ldots, y_m)$ be a kernel that is defined away from the diagonal $y_0 = y_1 = \cdots = y_m$ in $(\mathbb{R}^n)^{m+1}$ and satisfies the smoothness condition (1.1) for all $\alpha = (\alpha_0, \ldots, \alpha_m)$ such that $|\alpha| = |\alpha_0| + \cdots + |\alpha_m| \le N$, where $N$ is a sufficiently large integer. An m-CZO is a multilinear operator $T$ that satisfies $T : L^{q_1}(\mathbb{R}^n) \times \cdots \times L^{q_m}(\mathbb{R}^n) \to L^q(\mathbb{R}^n)$ for some $1 < q_1, \ldots, q_m < \infty$ and $\frac{1}{q} = \frac{1}{q_1} + \cdots + \frac{1}{q_m}$, and $T$ has the integral representation
$$T(f_1, \ldots, f_m)(x) = \int_{(\mathbb{R}^n)^m} K(x, y_1, \ldots, y_m) f_1(y_1) \cdots f_m(y_m)\, dy_1 \cdots dy_m$$
whenever $f_i \in L^\infty_c(\mathbb{R}^n)$ and $x \notin \bigcap_i \operatorname{supp}(f_i)$.
Multilinear CZOs were introduced by Coifman and Meyer [3,4] in the 1970s and were systematically studied by Grafakos and Torres [24]. They showed that m-CZOs are bounded from $L^{p_1}(\mathbb{R}^n) \times \cdots \times L^{p_m}(\mathbb{R}^n)$ into $L^p(\mathbb{R}^n)$ for any $1 < p_1, \ldots, p_m < \infty$ and $p$ defined by $\frac{1}{p} = \frac{1}{p_1} + \cdots + \frac{1}{p_m}$. Further, m-CZOs satisfy weak endpoint bounds when $p_i = 1$ for some $i$. For Lebesgue space bounds, it is sufficient to take $N = 1$ in (1.1), and in fact weaker regularity conditions are sufficient. Bounds for m-CZOs from products of Hardy spaces into Lebesgue spaces were proved by Grafakos and Kalton [20] (see also Grafakos and He [19]). As in the linear case, more regularity is required on the operators: in this case, $N \ge s = \lfloor n(\frac{1}{p} - 1) \rfloor_+$, where $x_+ = \max(0, x)$. Very recently, bounds into Hardy spaces were proved by the third author, Grafakos, Nakamura and Sawano [22]. To map into Hardy spaces the kernel $K$ must satisfy (1.1) for suitably large $N$. Moreover, in the multilinear case the operator $T$ must satisfy an additional cancellation condition:
(1.2) $\int_{\mathbb{R}^n} x^\alpha\, T(a_1, \ldots, a_m)(x)\, dx = 0$, for $|\alpha| \le s$ and all $(p_k, \infty, N)$ atoms $a_k$.
For linear CZOs of convolution type, this condition holds automatically: see [23, Lemma 2.1]. An example of a bilinear CZO that satisfies this cancellation condition is $T = R_1 + R_2$, where $R_i$ is the bilinear Riesz transform
$$R_i(f, g)(x) = \iint \frac{x - y_i}{|(x - y_1, x - y_2)|^3}\, f(y_1) g(y_2)\, dy_1\, dy_2 .$$
Somewhat surprisingly, neither Riesz transform itself has sufficient cancellation. For more examples of convolution-type multilinear operators that do and do not satisfy this cancellation condition, see [22,23].¹
Weighted norm inequalities for multilinear operators were first considered by Grafakos and Torres [25]. Later, Lerner et al. [27] characterized the weighted inequalities for m-CZOs using a multilinear generalization of the Muckenhoupt $A_p$ condition. Weighted Hardy spaces were introduced by García-Cuerva [16]. A complete treatment of weighted Hardy spaces is due to Strömberg and Torchinsky [33]; they proved that (linear) Calderón-Zygmund operators whose kernels have enough regularity map $H^p(w)$ into $L^p(w)$ or $H^p(w)$, for $0 < p < \infty$ and for weights $w \in A_\infty$.
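The exponent bookkeeping used throughout the Introduction, namely the relation $\frac{1}{p} = \frac{1}{p_1} + \cdots + \frac{1}{p_m}$ and the regularity index $s = \lfloor n(\frac{1}{p} - 1) \rfloor_+$, can be made concrete with a small computation. The following is only an illustrative sketch; the function names `target_exponent` and `regularity_index` are hypothetical and do not come from the paper.

```python
from fractions import Fraction
from math import floor


def target_exponent(exponents):
    """Return p defined by 1/p = 1/p_1 + ... + 1/p_m (exact arithmetic)."""
    return 1 / sum(1 / Fraction(p_k) for p_k in exponents)


def regularity_index(n, p):
    """Return s = floor(n(1/p - 1))_+ , where x_+ = max(0, x)."""
    return max(0, floor(n * (1 / Fraction(p) - 1)))


# Bilinear example in dimension n = 2 with p_1 = 1/2 and p_2 = 1.
p = target_exponent([Fraction(1, 2), 1])
s = regularity_index(2, p)
print(p, s)
```

In this example $p = 1/3$, so the target space lies well below $L^1$ and the index $s$ is positive; for exponents with $p \ge 1$ the index is $0$.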
Our goal is to generalize the results of Strömberg and Torchinsky to m-CZOs. To state our results, we first introduce some notation, relying on some (hopefully) well-known concepts; complete definitions will be given below. Given $w \in A_\infty$, we define $r_w = \inf\{r \in (1, \infty) : w \in A_r\}$, and for $0 < p < \infty$ we define the critical index $s_w$ of $w$ by
$$s_w = \Big\lfloor n\Big(\frac{r_w}{p} - 1\Big) \Big\rfloor_+ .$$
Our first result gives the boundedness of m-CZOs into weighted Lebesgue spaces.
Theorem 1.1. Given an integer $m \ge 1$, $0 < p_1, \ldots, p_m < \infty$, and $w_k \in A_\infty$, $1 \le k \le m$, let $T$ be an m-CZO associated to a kernel $K$ that satisfies (1.1) for $N$ such that (1.3) holds. Then
$$T : H^{p_1}(w_1) \times \cdots \times H^{p_m}(w_m) \to L^p(w),$$
where $w = \prod_{k=1}^m w_k^{p/p_k}$ and $\frac{1}{p} = \frac{1}{p_1} + \cdots + \frac{1}{p_m}$.
¹ We note in passing that the results for m-CZOs in [22] are stated for convolution type operators, but as the authors note (see Remark 3.4), their results extend to non-convolution type m-CZOs.
Our second result gives boundedness of m-CZOs into weighted Hardy spaces.
Suppose further that $T$ satisfies the cancellation condition (1.2) for all $|\alpha| \le s_w$, where for $1 \le k \le m$, $a_k$ is an $(N, \infty)$ atom: i.e., $a_k$ is supported on a cube $Q_k$, $\|a_k\|_\infty \le 1$, and $\int x^\beta a_k(x)\, dx = 0$ for all $|\beta| \le N$.
Remark 1.3. In Theorems 1.1 and 1.2, if all the weights $w_k = 1$, then $r_{w_k} = 1$, so we recapture the unweighted results in [20,22].
Remark 1.4. If $p > 1$ and $w \in A_p$, then $H^p(w) = L^p(w)$ (see [33]). Therefore, in Theorems 1.1 and 1.2, if $w_k \in A_{p_k}$, then we can replace $H^{p_k}(w_k)$ by $L^{p_k}(w_k)$ in the conclusion.
Remark 1.5. Implicit in the statement of Theorem 1.2 is the assumption that w ∈ A ∞ . However, this is always the case: see Lemma 2.1 below.
Remark 1.6. Earlier, Xue and Yan [35] proved a version of Theorem 1.1 with the additional restriction that 0 < p k ≤ 1 for all 1 ≤ k ≤ m. We want to thank the authors for calling our attention to their paper, which we had overlooked.
Our next pair of results are the analogs of Theorems 1.1 and 1.2 for the variable Lebesgue spaces. The variable Lebesgue spaces are a generalization of the classical $L^p$ spaces with the exponent $p$ replaced by a measurable exponent function $p(\cdot) : \mathbb{R}^n \to (0, \infty)$. The space $L^{p(\cdot)}$ consists of all measurable functions $f$ such that
$$\int_{\mathbb{R}^n} \Big(\frac{|f(x)|}{\lambda}\Big)^{p(x)} dx < \infty$$
for some $\lambda > 0$.
This becomes a quasi-Banach space with quasi-norm
$$\|f\|_{L^{p(\cdot)}} = \inf\Big\{\lambda > 0 : \int_{\mathbb{R}^n} \Big(\frac{|f(x)|}{\lambda}\Big)^{p(x)} dx \le 1\Big\}.$$
If $p(x) \ge 1$ a.e., then this is a norm and $L^{p(\cdot)}$ is a Banach space. These spaces were introduced by Orlicz [32] in 1931, and have been extensively studied by a number of authors in the past 25 years. For complete details and references, see [7]. Variable Hardy spaces were introduced by the first author and Wang [13] and independently by Nakai and Sawano [31]. Harmonic analysis in variable exponent spaces requires some assumption of regularity on the exponent function $p(\cdot)$. A common assumption that is sufficient for almost all applications is that the exponent function is log-Hölder continuous both locally and at infinity. More precisely, there exist constants $C_0$, $C_\infty$ and $p_\infty$ such that
(1.6) $|p(x) - p(y)| \le \dfrac{C_0}{-\log(|x - y|)}$, $\quad |x - y| < \tfrac{1}{2}$,
(1.7) $|p(x) - p_\infty| \le \dfrac{C_\infty}{\log(e + |x|)}$.
Finally, given an exponent function $p(\cdot)$, we define $p_- = \operatorname{ess\,inf}_{x \in \mathbb{R}^n} p(x)$ and $p_+ = \operatorname{ess\,sup}_{x \in \mathbb{R}^n} p(x)$. As an immediate application of Theorems 1.1 and 1.2, and multilinear Rubio de Francia extrapolation in the scale of variable Lebesgue spaces, we get the following two results.
Theorem 1.7. Given an integer $m \ge 1$, let $p_1, \ldots, p_m$ be real numbers, and let $q_1(\cdot), \ldots, q_m(\cdot)$ be log-Hölder continuous exponent functions such that Let $T$ be an m-CZO as in Theorem 1.1 satisfying (1.1) for all $|\alpha| \le N$ with Then $T : H^{q_1(\cdot)} \times \cdots \times H^{q_m(\cdot)} \to L^{q(\cdot)}$.
Theorem 1.8. Given $q(\cdot), q_1(\cdot), \ldots, q_m(\cdot)$, $p, p_1, \ldots, p_m$ as in Theorem 1.7, let $T$ be an m-CZO as in Theorem 1.1 satisfying (1.1) for all $|\alpha| \le N$ with Suppose further that $T$ satisfies (1.2) for all $|\alpha| \le \lfloor n(1/p - 1) \rfloor_+$. Then $T : H^{q_1(\cdot)} \times \cdots \times H^{q_m(\cdot)} \to H^{q(\cdot)}$.
Remark 1.9. As we were completing this paper we learned that a version of Theorems 1.7 and 1.8, with the additional hypothesis that $(q_k)_+ \le 1$ for all $1 \le k \le m$, was independently proved by Tan [34]. We want to thank the author for sharing with us a preprint of his work.
The remainder of this paper is organized as follows. In Section 2 we give some basic definitions and theorems about weights that we will use in subsequent sections. In particular, we prove a finite atomic decomposition for weighted Hardy spaces that extends the results in [13]. In Section 3 we gather together a number of technical lemmas that we need for the proofs of Theorems 1.1 and 1.2. Then in Sections 4 and 5 we prove these results. Finally, in Section 6 we give some basic facts about variable exponent spaces and prove Theorems 1.7 and 1.8. In fact, we prove more general results which include these theorems as special cases. Their statements, however, require additional facts about variable exponent spaces, and so we delay their statement until the final section.
Throughout this paper, we will use $n$ to denote the dimension of the underlying space, $\mathbb{R}^n$, and will use $m$ to denote the "dimension" of our multilinear operators. By a cube $Q$ we will always mean a cube whose sides are parallel to the coordinate axes, and for $\tau > 1$ let $\tau Q$ denote the cube with the same center such that $\ell(\tau Q) = \tau \ell(Q)$. We define the average of a function $f$ on a cube $Q$ by $f_Q = \fint_Q f\, dx = |Q|^{-1} \int_Q f\, dx$. By $C$, $c$, etc. we will mean constants that may depend on the underlying parameters in the problem. Sometimes, to emphasize that they (only) depend on certain parameters, we will write $C(X, Y, Z, \ldots)$. The values of these constants may change from line to line. If we write $A \lesssim B$, we mean that $A \le cB$ for some constant $c$.

Weights and weighted Hardy spaces
Weights and weighted norm inequalities. In this section we give some basic definitions and results about $A_p$ weights. For complete information, we refer the reader to [14,17,18]. By a weight $w$ we always mean a non-negative, locally integrable function such that $0 < w(x) < \infty$ a.e. For $1 < p < \infty$, we say that $w$ is in the Muckenhoupt class $A_p$, denoted by $w \in A_p$, if
$$[w]_{A_p} = \sup_Q \Big(\fint_Q w\, dx\Big) \Big(\fint_Q w^{1-p'}\, dx\Big)^{p-1} < \infty,$$
where the supremum is taken over all cubes $Q$. When $p = 1$ we say that $w \in A_1$ if there is a constant $C$ such that for every cube $Q$ and a.e. $x \in Q$,
$$\fint_Q w(y)\, dy \le C w(x).$$
The infimum over all such constants will be denoted by $[w]_{A_1}$. The $A_p$ classes are nested: for $1 < p < q < \infty$, $A_1 \subset A_p \subset A_q$. Let $A_\infty$ denote the union of all the $A_p$ classes, $p \ge 1$. If $w \in A_\infty$, then $w$ is a doubling measure. More precisely, if $w \in A_p$ for some $p \ge 1$, then it follows from the definition that given any cube $Q$ and $\tau > 1$,
$$w(\tau Q) \le C \tau^{np} w(Q).$$
In the study of multilinear weighted norm inequalities, we often need the fact that the convex hull of $A_\infty$ weights is again in $A_\infty$. The following result can be found, for instance, in [35] or in [21, Lemma 5]. For completeness we sketch a short proof, using a multilinear reverse Hölder inequality: if $w_1, \ldots, w_m \in A_\infty$, $1 < p_1, \ldots, p_m < \infty$, and $\frac{1}{p} = \frac{1}{p_1} + \cdots + \frac{1}{p_m}$, then for every cube $Q$,
$$\prod_{k=1}^m \Big(\fint_Q w_k\, dx\Big)^{p/p_k} \le C \fint_Q \prod_{k=1}^m w_k^{p/p_k}\, dx .$$
This was originally proved in the bilinear case by the first author and Neugebauer [12]; for simpler proofs in the multilinear case, see [10,35].
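As a concrete illustration of the $A_p$ classes and of the index $r_w$ from the Introduction, one can keep in mind the power weights. The following is a standard textbook fact, recorded here only as an example; it is not a statement taken from this paper, and the notation $w_a$ is introduced just for this illustration.

```latex
% Power weights w_a(x) = |x|^a on R^n (illustrative example, standard fact).
\[
  w_a \in A_p \ (1 < p < \infty) \iff -n < a < n(p-1),
  \qquad
  w_a \in A_1 \iff -n < a \le 0 .
\]
% Consequently w_a \in A_\infty exactly when a > -n, and in that case
\[
  r_{w_a} = \inf\{ r \in (1,\infty) : w_a \in A_r \}
          = \max\Bigl( 1,\; 1 + \frac{a}{n} \Bigr).
\]
```

In particular, for $a > 0$ the infimum defining $r_{w_a}$ is not attained, since $w_a \in A_r$ requires the strict inequality $r > 1 + a/n$.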
Since each w k ∈ A ∞ , by choosing C sufficiently large and δ < 1 sufficiently close to 1, we have that for every cube Q and E ⊂ Q, But then, if we apply Hölder's inequality and the multilinear reverse Hölder's inequality, we have that There is a close connection between Muckenhoupt weights and the Hardy-Littlewood maximal operator, defined by where the supremum is taken over all cubes Q. We have that if 1 < p < ∞, then the maximal operator is bounded L p (w) if and only if w ∈ A p . Moreover, we have a weighted vector-valued inequality that generalizes the Fefferman-Stein inequality. This was first proved by Anderson and John [1]; for an elementary proof via extrapolation, see [8].
. Remark 2.3. Below we will repeatedly apply Lemma 2.2 in the following way. Fix 0 < p < ∞ and w ∈ A ∞ . Then w ∈ A q and without loss of generality we may assume p < q. Let r = q p > 1. Given , and the implicit constant depends only on n and τ . But then by Lemma 2.2, we have that for any non-negative λ k , Below we will need to prove a weighted norm inequality for an m-CZO. To do so, we will make use of some recent developments in the theory of harmonic analysis on the domination of singular integrals by sparse operators. Here we sketch the basic definitions; for further information, see, for instance, [6].
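A collection of cubes $\mathcal{S}$ is called a sparse family if each cube $Q \in \mathcal{S}$ contains a measurable subset $E_Q \subset Q$ such that $|E_Q| \ge \frac{1}{2}|Q|$ and the family $\{E_Q\}_{Q \in \mathcal{S}}$ is pairwise disjoint. Given a sparse family $\mathcal{S}$ we define a linear sparse operator
$$T_{\mathcal{S}} f(x) = \sum_{Q \in \mathcal{S}} f_Q\, \chi_Q(x).$$
The following estimate is proved in [9,30].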
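Before turning to that estimate, the definition of a sparse family can be checked on a toy example. The sketch below works in dimension one with the nested intervals $Q_k = [0, 2^{-k})$ and the subsets $E_{Q_k} = [2^{-(k+1)}, 2^{-k})$. It assumes the usual form of the linear sparse operator, $T_{\mathcal{S}} f = \sum_{Q \in \mathcal{S}} f_Q \chi_Q$, with the averages $f_Q$ supplied by the caller. All function names are hypothetical and serve only this illustration.

```python
from fractions import Fraction


def nested_dyadic_family(depth):
    """Intervals Q_k = [0, 2^-k), k = 0..depth, as (left, right) pairs."""
    return [(Fraction(0), Fraction(1, 2**k)) for k in range(depth + 1)]


def is_sparse(family, subsets):
    """Check E_Q inside Q, |E_Q| >= |Q|/2, and pairwise disjointness of the E_Q."""
    for (a, b), (c, d) in zip(family, subsets):
        if not (a <= c and d <= b and 2 * (d - c) >= b - a):
            return False
    ordered = sorted(subsets)
    # Half-open intervals are disjoint when each ends before the next begins.
    return all(prev[1] <= nxt[0] for prev, nxt in zip(ordered, ordered[1:]))


def sparse_operator(family, averages, x):
    """Evaluate the sum over Q of (average of f on Q) * chi_Q(x)."""
    return sum(avg for (a, b), avg in zip(family, averages) if a <= x < b)


family = nested_dyadic_family(3)
subsets = [(Fraction(1, 2**(k + 1)), Fraction(1, 2**k)) for k in range(4)]
print(is_sparse(family, subsets))
# For f identically 1 every average is 1, so the operator counts the
# intervals of the family that contain x.
print(sparse_operator(family, [1] * 4, Fraction(3, 8)))
```

The family is heavily overlapping, yet sparse: each interval keeps its right half as $E_Q$, and those halves are disjoint.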
Proposition 2.4. If 1 < q < ∞ and w ∈ A q , then given any sparse linear operator T S , In a similar way, given a sparse family S we define the multilinear sparse operator The following pointwise domination theorem was proved in [26, Theorem 13.2] (see also [5]). Proposition 2.5. Let T be an m-CZO whose kernel satisfies (1.1) for any N ≥ 1. Then given any collection f 1 , . . . , f m of bounded functions of compact support, there exists 3 n sparse families S j such that Weighted Hardy spaces. In this section we define the weighted Hardy spaces and prove a finite atomic decomposition theorem. In defining them we follow Strömberg and Torchinsky [33] and we refer the reader there for more information.
Let S (R n ) denote the Schwartz class of smooth functions. For N 0 ∈ N to be a large value determined later, define where the grand maximal function M N 0 (f ) is defined by Note that in this definition, N 0 is taken to be a large positive integer, depending on n, p and w, whose value is chosen so that the usual definitions of unweighted Hardy spaces remain equivalent in the weighted setting. Its exact value does not matter for us. Given an integer N > 0, an (N, ∞) atom is a function a such that there exists a cube Q with supp(a) ⊂ Q, a ∞ ≤ 1, and for |β| ≤ N , In [33,Chapter VIII] it was shown that every f ∈ H p (w) has an atomic decomposition: for every N ≥ s w there exist a sequence of non-negative numbers {λ k } and a sequence of smooth (N, ∞) and the sum converges in the sense of distributions and in the H p (w) quasi-norm. Moreover, we have that Below, we want to use the atomic decomposition to estimate the norm of an m-CZO. One technical obstacle, however, is that this atomic decomposition may be an infinite sum, and therefore it is not immediate that we can exchange sum and integral in the definition of an m-CZO. For the argument to overcome this problem in the unweighted setting, see [19]. Our approach here is different: we show that for a dense subset of H p (w), we can form the atomic decomposition using a finite sequence of atoms. Our result generalizes a result in the unweighted case from [29]; in the weighted case it generalizes results proved in [13,31].
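To make the notion of an $(N, \infty)$ atom concrete, the following sketch checks the three defining properties for a one-dimensional polynomial example. It assumes the standard moment condition $\int x^\beta a(x)\, dx = 0$ for $|\beta| \le N$. The example $a(x) = (3x^2 - 1)/2$ on $Q = [-1, 1]$ (the Legendre polynomial $P_2$, extended by zero outside $Q$) and the helper names are chosen only for illustration.

```python
from fractions import Fraction


def moment(coeffs, k, left, right):
    """Exact integral of x^k * a(x) over [left, right], a(x) = sum coeffs[j] x^j."""
    total = Fraction(0)
    for j, c in enumerate(coeffs):
        power = j + k + 1
        total += Fraction(c) * (Fraction(right)**power - Fraction(left)**power) / power
    return total


def has_vanishing_moments(coeffs, order, left, right):
    """True if the moments of order 0..order all vanish on [left, right]."""
    return all(moment(coeffs, k, left, right) == 0 for k in range(order + 1))


# a(x) = (3x^2 - 1)/2 on Q = [-1, 1]; coefficients listed from degree 0 upward.
# On Q this polynomial takes values in [-1/2, 1], so its sup norm is 1.
legendre_p2 = [Fraction(-1, 2), 0, Fraction(3, 2)]

print(has_vanishing_moments(legendre_p2, 1, -1, 1))
print(moment(legendre_p2, 2, -1, 1))
```

Under these assumptions the function is a $(1, \infty)$ atom but not a $(2, \infty)$ atom, because its second moment does not vanish.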
To state our result, note that for N ≥ s w , if we define there exists a finite sequence of non-negative numbers {λ k } k and a sequence {a k } of (N, ∞) atoms, supp(a k ) ⊂ Q k , such that f = k λ k a k and The proof of Theorem 2.6 is gotten by a close analysis of the atomic decomposition given above. To prove it, we use the following technical result. It is adapted from the corresponding result from [33, Chapter VIII] (in the weighted case) and from the proof of the unweighted version of Theorem 2.6 in [29]. (See also the construction of the atomic decomposition in [13].) Indeed weights play almost no role in the result except in (4).
Then there exists a sequence {β k,i } of smooth functions with compact support and a family of cubes {Q k,i } with finite overlap that such that the following hold: Proof of Theorem 2.6. Fix f ∈ O N ∩ H p (w); by homogeneity we may assume without loss of generality that f H p (w) = 1. Then there exists R > 1 such that supp(f ) ⊂ B(0, R) = B. Let B * = B(0, 4R). We claim that for all x / ∈ B * , To prove this, we argue as in [13, Lemma 7.11] (cf. inequality (7.7)). There they showed a pointwise inequality: given any ϕ ∈ F N 0 and t > 0, 2) follows if we take the supremum over all ϕ ∈ F N 0 and t > 0, and note that since w ∈ A ∞ , w(B * ) w(B * ). Now let k 0 be the smallest integer such that for all k > k 0 , Ω k ⊂ B * . More precisely, by (2.2) we can take k 0 to be the largest integer such that 2 k 0 ≤ Cw(B * ) −1 p .
By Lemma 2.7 we can decompose f as where the $\beta_{k,i}$ are $(N, \infty)$ atoms. We will show that this sum can be rewritten as a finite sum of atoms. Set Since the $\beta_{k,i}$ are supported in $\Omega_k \subset B^*$ for all $k > k_0$, the function $F_1$ is also supported in $B^*$. Moreover Further, $F_1$ has vanishing moments up to order $N$. To see this, fix $|\alpha| \le N$ and $q > 1$ such that $w \in A_q$. Then, since $\operatorname{supp}(\beta_{k,i}) \subset B^*$, Therefore, the series on the left-hand side converges absolutely, so we can exchange the sum and integral; since each $\beta_{k,i}$ has vanishing moments, so does $F_1$. Therefore, if we set a 0 = C −1 To estimate the remaining terms, note that $f$ is a bounded function and so there exists an integer $k_\infty > k_0$ such that $\Omega_k = \emptyset$ for all $k \ge k_\infty$. Thus the sum over the index $k$ has only finitely many terms. Further, since the sum $\sum_{k,i} \lambda_{k,i} \chi_{Q_{k,i}} \lesssim M_{N_0} f$, it converges everywhere. Therefore, for each $k_0 < k < k_\infty$ there exists an integer $\rho_k$ such that Moreover, arguing as we did above for $F_1$, we have that $F_2$ has vanishing moments for $|\alpha| \le N$. an $(N, \infty)$ atom. Therefore, we have shown that we can decompose $f$ as a finite sum of $(N, \infty)$ atoms: It remains to prove that (2.1) holds. But by our choice of $k_0$, we have that $\|C_1 2^{k_0} \chi_{B^*}\|_{L^p(w)} \le C$, and clearly $\|w(B^*)^{-1/p} \chi_{B^*}\|_{L^p(w)} \le C$. Finally, by the weighted Fefferman-Stein inequality (see Remark 2.3), we have that Since $\|f\|_{H^p(w)} = 1$, we get the desired inequality, and this completes the proof of Theorem 2.6.

Auxiliary results
In this section we state and prove several lemmas on averaging operators and m-CZOs needed for the proofs of Theorems 1.1 and 1.2.
Averaging operators. We begin with a well-known result on the maximal operator M µ defined with respect to a measure µ: For a proof, see [17, Chapter II].
Proposition 3.1. Let µ be a doubling measure on R n . Then the maximal operator M µ satisfies the weak (1, 1) inequality and for 1 < p < ∞ the strong (p, p) inequality The next three lemmas on averaging operators are weighted extensions of results from [20]. Our proofs, however, are different and are motivated by ideas from [33].
Proof. Let F = Q∈J f Q and G = Q∈J a µ 1 (Q)χ Q and for each $t > 0$ let We can now estimate as follows: and so $\mu(Q) \le \frac{4}{3} \mu(L_t^c \cap Q)$ for all $Q \in J$. Thus we have that Given this estimate, if we multiply by $p t^{p-1}$ and integrate, by Fubini's theorem we get
Proof. First suppose that $p > 1$; we estimate by duality. Then there exists a non-negative $g \in L^{p'}(\mu)$, ; the first and third inequalities follow from Hölder's inequality, and the last from (3.2) (since $p' > q'$) and the fact that $\|g\|_{L^{p'}(\mu)} = 1$. Finally, when $p = 1$ the proof is essentially the same except that we use the fact that $M_\mu$ is bounded on $L^\infty$. This completes the proof.
Lemma 3.4. Let w ∈ A ∞ , and fix 0 < p < ∞ and max(1, p) < q < ∞. Then given any collection of cubes {Q k } ∞ k=1 and nonnegative integrable functions Proof. Since w ∈ A ∞ , the measure µ = w(x) dx is doubling. If p ≥ 1, then if we fix an arbitrary integer K and apply Lemma 3.3 to the functions {g k } K k=1 , we immediately get The desired inequality now follows from Fatou's lemma. When 0 < p < 1, we can apply Lemma 3.2 to get the same conclusion, using the fact that Estimates for m-CZOs. In this section we prove three estimates on m-CZOs.
Proof. By the domination estimate in Proposition 2.5 it will suffice to prove this estimate for any multilinear sparse operator $T_{\mathcal{S}}$ and non-negative functions $f_1, \ldots, f_m$. By the definition of the sparse operator we have where on the right-hand side we now have a linear sparse operator. But then by Proposition 2.4 we have that The following lemma was first proved in [22]. For completeness we include its short proof.
By the smoothness condition of the kernel and the fact that |y − y k | ≈ |y − c k | for all k ∈ Λ and y k ∈ Q k we have that K(y, y 1 , . . . , y m ) − P N (y, c 1 , y 2 , . . . , y m ) which implies (3.4). To prove (3.5), fix y ∈ (Q * 1 ∩ . . . ∩ Q * m ) c ; then there exists a non-empty subset Λ of {1, . . . , m} such that y / ∈ Q * k for all k ∈ Λ and y ∈ Q * l for l / ∈ Λ. Then by (3.4) we have that Inequality (3.5) follows from the definition of the maximal operator.
Lemma 3.7. Given $w \in A_q$, $1 \le q < \infty$, for $1 \le k \le m$ let $a_k$ be an $(N, \infty)$ atom supported in $Q_k$ and let $c_k$ be the center of $Q_k$. Suppose $Q_1$ is the cube such that $\ell(Q_1) = \min\{\ell(Q_k) : 1 \le k \le m\}$. Then
Proof. Since the $A_p$ classes are nested, we may assume without loss of generality that $q > 1$. To prove (3.8) we consider two cases: $Q_1^* \cap Q_k^* \ne \emptyset$ for all $2 \le k \le m$, or this intersection is empty for at least one value of $k$. In the first case, since ℓ( for all $1 \le k \le m$, and so Lemma 3.5 yields In the second case, since $Q_1^* \cap Q_k^* = \emptyset$ for some $k$, the set $\Lambda = \{2 \le k \le m : Q_1^* \cap Q_k^* = \emptyset\}$ is non-empty. Fix any point $y \in \mathbb{R}^n$. Then arguing as in the previous proof we have that
(3.10) $T(a_1, \ldots, a_m)(y) = \int_{\mathbb{R}^{mn}} K_1(y, y_1, y_2, \ldots, y_m)\, a_1(y_1) \cdots a_m(y_m)\, d\vec{y}$,
where $K_1(y, y_1, \ldots, y_m)$ is defined by (3.7). For $y_1 \in Q_1$ we have that for some $\xi_1 \in Q_1$ and for all $y_l \in Q_l$, $1 \le l \le m$, Therefore, for all $y_1 \in Q_1^*$ and $y_k \in Q_k$, $k \in \Lambda$, If we combine this inequality with (3.10), we get |T (a 1 , . . . , a m )(y)| ℓ(Q 1 ) n+N +1 Since $Q_1^* \subset 3Q_l^*$ for all $l \notin \Lambda$, the last inequality gives us since $w \in A_q$ is doubling, this implies that This completes the proof.

Proof of Theorem 1.1
By Theorem 2.6, we have the finite atomic decompositions where $\lambda_{k,j_k} \ge 0$ and $a_{k,j_k}$ are $(N, \infty)$-atoms that satisfy Set $w = \prod_{k=1}^m w_k^{p/p_k}$. Again by Theorem 2.6, it will suffice to prove that Since $T$ is m-linear, we have that for a.e. $x \in \mathbb{R}^n$,
$$T(f_1, \ldots, f_m)(x) = \sum_{j_1} \cdots \sum_{j_m} \lambda_{1,j_1} \cdots \lambda_{m,j_m}\, T(a_{1,j_1}, \ldots, a_{m,j_m})(x).$$
We first estimate $\|G_2\|_{L^p(w)}$. By (3.5) we have that By condition (1.3), Hölder's inequality and the weighted Fefferman-Stein vector-valued inequality (see Remark 2.3), we get We now estimate the norm of $G_1$. Since $w \in A_\infty$ by Lemma 2.1, we can choose $q > \max(1, p)$ such that $w \in A_q$. Then by Lemma 3.5 we have that If we combine this inequality with Lemma 3.4, Hölder's inequality and the Fefferman-Stein vector-valued inequality (again see Remark 2.3), we get the following estimate: If we combine the estimates for $G_1$ and $G_2$, we get the desired inequality.

Proof of Theorem 1.2
The proof of Theorem 1.2 is very similar to the proof of Theorem 1.1. Instead of estimating the norm of $T$, we will estimate the norm of $M_\phi \circ T$, where $M_\phi$ is the non-tangential maximal operator where $\phi \in C_0^\infty$ and $\operatorname{supp}(\phi) \subset B(0, 1)$. We will use the fact that the Hardy space can be characterized by using the non-tangential maximal function $M_\phi$ with the norm See [33]; this equivalence is guaranteed by our choice of $N_0$ sufficiently large. Throughout this section we fix a choice of $\phi$.
If we split the integral on the left-hand side of (5.4) over Q * 1 and (Q * 1 ) c , we can estimate as follows: |T (a 1 , . . . , a m )(z)|dz.
By (3.8), we can estimate the first integral in the last inequality by To estimate the second integral, we need to exploit carefully the smoothness of the kernel. Recall the representation of $T(a_1, \ldots, a_m)(z)$ in (3.10). Denote We now estimate $K_1(z, z_1, \ldots, z_m)$ in (3.11) to get for all $z \in (Q_1^*)^c$. Thus, Therefore, Combining (5.5) and (5.6), we get (5.4), which completes the proof of Case 1.
Since $M_\phi \circ T$ is multi-sublinear, we can write and Here $R_{j_1, \ldots, j_m}$ is the smallest cube among $Q^{**}_{1,j_1}, \ldots, Q^{**}_{m,j_m}$. An argument similar to the one in the proof of Theorem 1.1, with Lemma 5.2 in place of Lemma 3.7, gives We now estimate the norm of $G_2$. By Lemma 5.1 we get that The function $G_{21}$ can be estimated by essentially the same argument used for $G_1$ to get (5.14) To estimate $G_{22}$, since $\frac{(n + s_w + 1)p}{n} > 1$, we use (1.4) and the Fefferman-Stein vector-valued inequality (cf. Remark 2.3) to get If we combine (5.13), (5.14) and (5.15), we get (5.12) and this completes the proof.

Variable Hardy spaces: proof of Theorems 1.7 and 1.8
In this section we prove Theorems 1.7 and 1.8. In fact, we will prove two more general results that include these theorems as special cases. To do so, we first recall some basic facts about the variable Lebesgue spaces. For complete information we refer the reader to [7].
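Let $\mathcal{P}_0(\mathbb{R}^n)$ be the set of all measurable functions $p(\cdot) : \mathbb{R}^n \to (0, \infty)$. Define $p_- = \operatorname{ess\,inf}_{x \in \mathbb{R}^n} p(x)$ and $p_+ = \operatorname{ess\,sup}_{x \in \mathbb{R}^n} p(x)$. Given $p(\cdot) \in \mathcal{P}_0(\mathbb{R}^n)$ define $L^{p(\cdot)} = L^{p(\cdot)}(\mathbb{R}^n)$ to be the set of all measurable functions $f$ such that for some $\lambda > 0$,
$$\int_{\mathbb{R}^n} \Big(\frac{|f(x)|}{\lambda}\Big)^{p(x)} dx < \infty.$$
This becomes a quasi-Banach space with the "norm"
$$\|f\|_{L^{p(\cdot)}} = \inf\Big\{\lambda > 0 : \int_{\mathbb{R}^n} \Big(\frac{|f(x)|}{\lambda}\Big)^{p(x)} dx \le 1\Big\}.$$
If $p_- \ge 1$, $L^{p(\cdot)}$ is a Banach space; if $p(\cdot) = p$ is a constant, then $L^{p(\cdot)} = L^p$ with equality of norms.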
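The infimum in the definition of the $L^{p(\cdot)}$ "norm" can be approximated numerically, which gives some feeling for how the exponent function enters. The sketch below assumes the standard Luxemburg-type formula $\|f\|_{L^{p(\cdot)}} = \inf\{\lambda > 0 : \int (|f(x)|/\lambda)^{p(x)}\, dx \le 1\}$ from [7], and replaces the integral by a Riemann sum over cells of equal volume. The discretization and the function names are assumptions made for this illustration only.

```python
def modular(values, exponents, cell_volume, lam):
    """Riemann-sum approximation of the integral of (|f(x)|/lam)^{p(x)} dx."""
    return sum((abs(v) / lam) ** p for v, p in zip(values, exponents)) * cell_volume


def luxemburg_norm(values, exponents, cell_volume, tol=1e-12):
    """Approximate inf{lam > 0 : modular(f/lam) <= 1} by bisection."""
    if all(v == 0 for v in values):
        return 0.0
    lo, hi = 0.0, 1.0
    # Enlarge hi until it is admissible; the modular decreases as lam grows.
    while modular(values, exponents, cell_volume, hi) > 1:
        hi *= 2
    while hi - lo > tol * hi:
        mid = (lo + hi) / 2
        if modular(values, exponents, cell_volume, mid) <= 1:
            hi = mid
        else:
            lo = mid
    return hi


# Constant exponent p = 2 on [0, 1): the result is the usual L^2 norm of f = 3.
print(luxemburg_norm([3.0] * 4, [2.0] * 4, 0.25))
# Exponent 1 on half of [0, 1) and 2 on the other half, with f = 1.
print(luxemburg_norm([1.0] * 4, [1.0, 1.0, 2.0, 2.0], 0.25))
```

With a constant exponent the computation reproduces the classical $L^p$ norm, in line with the remark that $L^{p(\cdot)} = L^p$ with equality of norms in that case.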
If the maximal operator is bounded on $L^{p(\cdot)}$ we write that $p(\cdot) \in \mathcal{B}$. A necessary condition for this to be the case is that $p_- > 1$. A sufficient condition is that $1 < p_- \le p_+ < \infty$ and $p(\cdot)$ is log-Hölder continuous: i.e., (1.6) and (1.7) hold. However, this continuity condition is not necessary: see [7] for a detailed discussion of this problem.
Given $p(\cdot) \in \mathcal{P}_0(\mathbb{R}^n)$, the variable Hardy space $H^{p(\cdot)}$ is defined to be the set of all distributions $f$ such that $M_{N_0} f \in L^{p(\cdot)}$. Again, we here assume $N_0 > 0$ is a sufficiently large constant so that all the standard definitions of the classical Hardy spaces are equivalent. These spaces were examined in detail in [13] (see also [31]).
A very important tool for proving norm inequalities in spaces of variable exponents is the extension of the Rubio de Francia theory of extrapolation to the scale of variable Lebesgue spaces. For the history and application of this approach for linear operators, see [7,8]. To prove Theorems 1.7 and 1.8 we will use a multilinear version due to the first author and Naibo [11]. They only stated their proof for the bilinear case, but the same proof immediately extends to the general multilinear setting.
Remark 6.2. In [11], the hypothesis on the exponents $q_k(\cdot)$ was stated as $(q_k(\cdot)/p_k)' \in \mathcal{B}$, where this exponent is the conjugate exponent, defined pointwise by $\frac{1}{p(x)} + \frac{1}{p'(x)} = 1$. It was stated in this way for technical reasons related to the proof. However, these two hypotheses are equivalent: see [7, Corollary 4.64].
The one technical obstacle in applying Theorem 6.1 is constructing the family $\mathcal{F}$ to satisfy the hypotheses that the left-hand sides of (6.1) and (6.2) are finite and that the resulting family is large enough that the desired result can be proved via a density argument. In our case we will use the atomic decomposition in the weighted and variable Hardy spaces. As we noted in Section 2, given $w \in A_\infty$ and $0 < p < \infty$, every $f \in H^p(w)$ can be written as the sum
(6.3) $f = \sum_k \lambda_k a_k$,
where $\lambda_k \ge 0$ and the $a_k$ are $(N, \infty)$ atoms, provided $N \ge s_w$. Moreover, this series converges both in the sense of distributions and in $H^p(w)$. (See [33, Chapter VIII].) The same is true in the variable Hardy spaces. More precisely: suppose $p(\cdot) \in \mathcal{P}_0$ is such that there exists $0 < p_0 < p_-$ with $p(\cdot)/p_0 \in \mathcal{B}$. Then given $N > n(p_0^{-1} - 1)$, if $f \in H^{p(\cdot)}$, there exists a sequence of $(N, \infty)$ atoms $a_k$ and constants $\lambda_k$ such that (6.3) holds, and the series converges both in the sense of distributions and in $H^{p(\cdot)}$. (See [13, Theorem 6.3]; here we have slightly modified the definition of atoms, but the change is immediate.) It follows immediately from these two results that finite sums of $(N, \infty)$ atoms, for $N$ sufficiently large, are dense in $H^p(w)$ and $H^{p(\cdot)}$.
Remark 6.3. In applying the density of finite sums of atoms, we are not making use of the finite atomic decomposition norm (as in Theorem 2.6 for weighted spaces or in the corresponding result for variable Hardy spaces in [13]). We will only use that these sums are dense with respect to the given Hardy space norm.
Theorem 6.4. Let $q_1(\cdot), \ldots, q_m(\cdot), q(\cdot) \in \mathcal{P}_0$ be such that $\frac{1}{q(\cdot)} = \frac{1}{q_1(\cdot)} + \cdots + \frac{1}{q_m(\cdot)}$ and $0 < (q_k)_- \le (q_k)_+ < \infty$, $1 \le k \le m$. Suppose further that there exist $0 < p_1, \ldots, p_m, p < \infty$ with $\frac{1}{p} = \frac{1}{p_1} + \cdots + \frac{1}{p_m}$, $0 < p_k < (q_k)_-$, and $q_k(\cdot)/p_k \in \mathcal{B}$. If $T$ is an m-CZO as in Theorem 1.1 satisfying (1.1) for all $|\alpha| \le N$, where
$$N \ge \max\Big\{ \Big\lfloor mn\Big(\frac{1}{p_k} - 1\Big) \Big\rfloor_+ ,\ 1 \le k \le m \Big\} + (m - 1)n,$$
then $T : H^{q_1(\cdot)} \times \cdots \times H^{q_m(\cdot)} \to L^{q(\cdot)}$.
Proof. Fix an integer K 0 such that where 0 < R < ∞.
Since finite sums of (K 0 , ∞) atoms are dense in H q k (·) , 1 ≤ k ≤ m, a standard density argument shows that this inequality holds for all f k ∈ H q k (·) , 1 ≤ k ≤ m. This completes the proof.
The proof of the following result is identical to the proof of Theorem 6.4, except that in the definition of the family $\mathcal{F}$ we replace $T$ by $M_{N_0} T$ (for $N_0$ sufficiently large) and use Theorem 1.2 instead of Theorem 1.1. Theorem 1.8 again follows as an immediate corollary.
Theorem 6.6. Given $q_1(\cdot), \ldots, q_m(\cdot), q(\cdot)$ and $p_1, \ldots, p_m, p$ as in Theorem 6.4, let $T$ be an m-CZO as in Theorem 1.1 satisfying (1.1) for all $|\alpha| \le N$, where Suppose further that $T$ satisfies (1.2) for all $|\alpha| \le \lfloor n(1/p - 1) \rfloor_+$. Then $T : H^{q_1(\cdot)} \times \cdots \times H^{q_m(\cdot)} \to H^{q(\cdot)}$.