Logarithmic bump conditions for Calderón-Zygmund operators on spaces of homogeneous type

We establish two-weight norm inequalities for singular integral operators defined on spaces of homogeneous type. We do so first when the weights satisfy a double bump condition and then when the weights satisfy separated logarithmic bump conditions. Our results generalize recent work on the Euclidean case, but our proofs are simpler even in this setting. The other interesting feature of our approach is that we are able to prove the separated bump results (which always imply the corresponding double bump results) as a consequence of the double bump theorem.


Introduction and main results
Given a Calderón-Zygmund singular integral $T$, the problem of finding sufficient conditions on a pair of weights $(u, \sigma)$ such that the two-weight norm inequality
$$\|T(f\sigma)\|_{L^p(u)} \le C \|f\|_{L^p(\sigma)} \tag{1.1}$$
holds dates back to the 1970s. Significant progress has only been made in the past twenty years: for a brief history, see [9, Chapter 1]. One approach to this problem is to use the so-called $A_p$-bump conditions introduced by Pérez [28, 30]. It was conjectured that a sufficient condition for (1.1) to hold is that the pair $(u, \sigma)$ satisfies
$$\sup_Q \|u^{1/p}\|_{A,Q}\, \|\sigma^{1/p'}\|_{B,Q} < \infty, \tag{1.2}$$
where the supremum is taken over all cubes in $\mathbb{R}^n$, $A$ and $B$ are Young functions that satisfy the growth conditions $\bar{A} \in B_{p'}$ and $\bar{B} \in B_p$, and $\|\cdot\|$ is a normalized Orlicz norm. (For precise definitions, see below.) This problem proved quite difficult, and a number of partial results were obtained [7, 8, 11, 12] before the full result was proved by Lerner [22] and by Nazarov, Reznikov and Volberg [26] (when $p = 2$). Much of the recent progress on this problem was due to the close connection with the $A_2$ conjecture on sharp one-weight norm inequalities for singular integrals; see [8] for details.
Recently, it was noted [13] that while the conjecture was originally stated in terms of the "double bump" condition (1.2), it was motivated by the so-called Muckenhoupt-Wheeden conjectures (see [13] and [9, Section 9.2]), and that implicit in this motivation was a weaker conjecture in terms of a pair of "separated bump" conditions: $T$ satisfies (1.1) if the pair $(u, \sigma)$ satisfies
$$\sup_Q \|u^{1/p}\|_{A,Q}\, \|\sigma^{1/p'}\|_{p',Q} < \infty, \qquad \sup_Q \|u^{1/p}\|_{p,Q}\, \|\sigma^{1/p'}\|_{B,Q} < \infty. \tag{1.3}$$
In [13] this conjecture was proved in the special case when $A$ and $B$ are "log bumps": i.e., $A(t) = t^p \log(e + t)^{p-1+\delta}$ and $B(t) = t^{p'} \log(e + t)^{p'-1+\delta}$, $\delta > 0$. A simpler proof, one which also gives quantitative estimates on the constants for separated bumps, was found by Hytönen and Pérez [19]. The exact value of the constants is important, since Hytönen [15] has shown that if the sharp constants for the separated bump condition are as conjectured, then as an immediate corollary this result yields a new proof of the sharp $A_p$-$A_\infty$ bounds for singular integrals [18, 20].
Remark 1.1. It has generally been accepted that the separated bump condition is weaker than the double bump condition, but no explicit pair (u, v) that satisfies (1.3) but not (1.2) for a given pair of Young functions A, B has appeared in the literature. We rectify this by constructing an example in Section 7 below.
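To make the comparison in Remark 1.1 concrete, the following short computation sketches why, for log bumps, the double bump condition controls each separated bump condition. It is only an illustration: it assumes the usual Luxemburg normalization of the Orlicz norm on a cube $Q$ and the forms of the double and separated bump conditions that are standard in the Euclidean literature; the auxiliary symbol $\lambda$ is introduced here and is not part of the original text.

```latex
% Assumption: \|f\|_{B,Q} = \inf\{\lambda > 0 : |Q|^{-1}\int_Q B(|f|/\lambda)\,dx \le 1\}.
% For the log bump B(t) = t^{p'}\log(e+t)^{p'-1+\delta} we have \log(e+t) \ge 1,
% hence t^{p'} \le B(t) for all t \ge 0.
\[
\frac{1}{|Q|}\int_Q \Big(\frac{|f|}{\lambda}\Big)^{p'} dx
\;\le\; \frac{1}{|Q|}\int_Q B\Big(\frac{|f|}{\lambda}\Big)\, dx \;\le\; 1
\quad\Longrightarrow\quad
\|f\|_{p',Q} \le \lambda .
\]
% Taking the infimum over admissible \lambda gives \|f\|_{p',Q} \le \|f\|_{B,Q}.
% With f = \sigma^{1/p'}:
\[
\|u^{1/p}\|_{A,Q}\,\|\sigma^{1/p'}\|_{p',Q}
\;\le\; \|u^{1/p}\|_{A,Q}\,\|\sigma^{1/p'}\|_{B,Q},
\]
% and the same argument with A in place of B bounds
% \|u^{1/p}\|_{p,Q}\,\|\sigma^{1/p'}\|_{B,Q} by the double bump product.
```

The converse direction is the nontrivial one, which is why an explicit pair of weights is needed to separate the two conditions.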
The goal of this paper is to extend the double bump and separated bump results discussed above to the case of singular integrals on spaces of homogeneous type. These spaces are of interest since they often arise in applications: see for example [4, 5, 6, 25, 37]. Many of the tools of classical harmonic analysis on Euclidean spaces generalize to this setting; nevertheless there are substantive differences and some care must be taken to ensure that proofs still hold. Our arguments differ extensively from those in [13]: they have more in common with the approach taken in [19]. Our proof, when restricted to the Euclidean case, is somewhat simpler than theirs, but we do not prove the same quantitative estimates on the constants. A very interesting feature of our proof is that we are able to prove the separated bump results as a consequence of the double bump estimates.
Before we can state our main results we need to make a number of definitions. By a space of homogeneous type (hereafter, SHT) we mean an ordered triple $(X, \rho, \mu)$ where $X$ is a set, $\rho$ is a quasimetric on $X$, and $\mu$ is a non-negative Borel measure on $X$ that is doubling: there exists a constant $C_d$ such that for all $x_0 \in X$ and $r > 0$,
$$\mu(B_\rho(x_0, 2r)) \le C_d\, \mu(B_\rho(x_0, r)).$$
The smallest such constant $C_d$ is called the doubling constant of $\mu$. We also assume that $\mu$ is non-trivial, i.e., for every ball, $0 < \mu(B_\rho(x_0, r)) < \infty$. For further details, see Christ [4] or Coifman and Weiss [5].
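As a quick sanity check on the doubling condition, here is the simplest example, sketched under the assumption that doubling means $\mu(B(x, 2r)) \le C_d\, \mu(B(x, r))$ for all centers and radii. The choice of $\mathbb{R}^n$ with the Euclidean distance and Lebesgue measure is ours, for illustration only.

```latex
% Illustration (assumed setting): X = \mathbb{R}^n, \rho(x,y) = |x-y|, \mu = Lebesgue measure.
% With \omega_n the volume of the unit ball:
\[
\mu(B(x,2r)) = \omega_n (2r)^n = 2^n\, \omega_n r^n = 2^n\, \mu(B(x,r)),
\]
% so the doubling condition holds with equality and the doubling constant is C_d = 2^n.
% Non-triviality is also clear: 0 < \omega_n r^n < \infty for every r > 0.
```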
Remark 1.2. For brevity, hereafter we will say that a constant depends on X and write C(X, . . .) if the constant depends on the triple (X, ρ, µ).
If T is bounded on L 2 (X, µ), then T is referred to as a Calderón-Zygmund operator.
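The bump conditions discussed above are given in terms of Orlicz norms. Here we summarize some of the basic properties we need; for the general theory of Orlicz spaces, see Rao and Ren [33] or [9, Chapter 5]. A Young function is a continuous, convex, increasing function $A : [0, \infty) \to [0, \infty)$ such that $A(0) = 0$ and $A(t)/t \to \infty$ as $t \to \infty$. It is often convenient to assume that $A(1) = 1$, but this is not strictly necessary. Note that $A(t) = t$ is not a Young function, though $t^p$ is for $p > 1$. However, in many cases results for Young functions hold in this limiting case. The Young functions we are interested in are referred to as log bumps:
$$A(t) = t^p \log(e + t)^{p-1+\delta}, \qquad \delta > 0.$$
Given two Young functions $A$ and $B$, we write $A \lesssim B$ if there exist constants $c, t_0 > 0$ such that $A(t) \le B(ct)$ for all $t \ge t_0$. Note that given any Young function $A$, $t \lesssim A(t)$.
Given a Young function $A$ and a set $E$ such that $0 < \mu(E) < \infty$, define the Orlicz space norm
$$\|f\|_{A,E} = \inf\Big\{ \lambda > 0 : \frac{1}{\mu(E)} \int_E A\Big(\frac{|f(x)|}{\lambda}\Big)\, d\mu(x) \le 1 \Big\}.$$
Given a Young function $A$, define $\bar{A}$, the complementary function, by
$$\bar{A}(t) = \sup_{s > 0} \{ st - A(s) \}.$$
It can be shown that $\bar{A}$ is also a Young function. Given $A$, we have the generalized Hölder's inequality: for any set $E$, $0 < \mu(E) < \infty$,
$$\frac{1}{\mu(E)} \int_E |f(x) g(x)|\, d\mu(x) \le 2 \|f\|_{A,E} \|g\|_{\bar{A},E}.$$
More generally, given three Young functions $A$, $B$, $C$ such that for all $t \ge t_0 > 0$,
$$A^{-1}(t) B^{-1}(t) \le c\, C^{-1}(t),$$
then there exists a constant $K$ such that
$$\|fg\|_{C,E} \le K \|f\|_{A,E} \|g\|_{B,E}.$$
In the special case of log bumps, if $A(t) = t^p \log(e + t)^{p-1+\delta}$ then $\bar{A}(t) \approx t^{p'} \log(e + t)^{-1-(p'-1)\delta}$, and so $\bar{A} \in B_{p'}$.
We can now define our bump conditions. Given Young functions $A$ and $B$, and a pair of weights $(u, \sigma)$, define
$$[u, \sigma]_{A,B,p} = \sup_{B_\rho} \|u^{1/p}\|_{A,B_\rho} \|\sigma^{1/p'}\|_{B,B_\rho}, \qquad [u, \sigma]_{A,p} = \sup_{B_\rho} \|u^{1/p}\|_{A,B_\rho} \|\sigma^{1/p'}\|_{p',B_\rho},$$
where the suprema are taken over all balls $B_\rho$ in $X$.
By weights $u$ and $\sigma$ we always mean non-negative measurable functions on $X$ that are finite almost everywhere and positive on sets of positive measure. Many authors assume that weights are locally integrable; however, when working with bump conditions this assumption can be avoided by an approximation argument.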
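The claim that the complementary function of a log bump satisfies the $B_{p'}$ condition can be checked by a one-line integral. The sketch below assumes the usual formulation of the $B_p$ condition due to Pérez (a Young function $\Phi$ belongs to $B_p$ if $\int_c^\infty \Phi(t)\, t^{-p}\, dt/t < \infty$ for some $c > 0$); this definition is not reproduced in the text above, so it should be read as an assumption. The abbreviation $\varepsilon = (p'-1)\delta$ is ours.

```latex
% Assumption: \Phi \in B_p  iff  \int_c^\infty \frac{\Phi(t)}{t^p}\,\frac{dt}{t} < \infty for some c > 0.
% Take \Phi = \bar{A} with \bar{A}(t) \approx t^{p'} \log(e+t)^{-1-\varepsilon}, \varepsilon = (p'-1)\delta > 0.
\[
\int_e^\infty \frac{\bar{A}(t)}{t^{p'}}\,\frac{dt}{t}
\;\approx\; \int_e^\infty \log(e+t)^{-1-\varepsilon}\,\frac{dt}{t}
\;\le\; \int_e^\infty (\log t)^{-1-\varepsilon}\,\frac{dt}{t}
\;=\; \int_1^\infty s^{-1-\varepsilon}\, ds
\;=\; \frac{1}{\varepsilon} \;<\; \infty .
\]
% The substitution is s = \log t; the middle inequality uses \log(e+t) \ge \log t.
% If \delta = 0 the exponent would be exactly -1 and the integral would diverge,
% which is why the bump needs the extra \delta > 0.
```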
As was shown in [9, Section 7.2], we can always assume that $u$ and $\sigma$ are bounded and bounded away from 0 on $X$, provided that in the norm inequality being proved we are working with a function $f \in \bigcap_{p>1} L^p(X, \mu)$: for example, $f$ is a bounded function of compact support.
Remark 1.3. Since bounded functions of compact support are dense in any weighted space $L^p(X, u)$, we will hereafter assume that $u$, $\sigma$ and $f$ satisfy these conditions. Moreover, since $T$ is linear we will also assume without loss of generality that $f$ is non-negative.
We can now state our main results. The first generalizes the double bump condition to SHT.
Theorem 1.4. Given an SHT $(X, \rho, \mu)$, suppose the pair of weights $(u, \sigma)$ satisfies $[u, \sigma]_{A,B,p} < \infty$, where $\bar{A} \in B_{p'}$ and $\bar{B} \in B_p$. Then a Calderón-Zygmund operator $T$ satisfies the strong type inequality
$$\|T(f\sigma)\|_{L^p(u)} \le C \|f\|_{L^p(\sigma)}.$$
The next two results give separated bump conditions for weak and strong type inequalities.
Theorem 1.5. Given an SHT $(X, \rho, \mu)$, suppose the pair of weights $(u, \sigma)$ is such that $[u, \sigma]_{A,p} < \infty$, where $A(t) = t^p \log(e + t)^{p-1+\delta}$. Then a Calderón-Zygmund operator $T$ satisfies the weak type inequality
$$\|T(f\sigma)\|_{L^{p,\infty}(u)} \le C \|f\|_{L^p(\sigma)}.$$
Theorem 1.6. Given an SHT $(X, \rho, \mu)$, suppose the pair of weights $(u, \sigma)$ is such that $[u, \sigma]_{A,p} < \infty$ and $[\sigma, u]_{B,p'} < \infty$, where $A(t) = t^p \log(e + t)^{p-1+\delta}$ and $B(t) = t^{p'} \log(e + t)^{p'-1+\delta}$. Then a Calderón-Zygmund operator $T$ satisfies the strong type inequality
$$\|T(f\sigma)\|_{L^p(u)} \le C \|f\|_{L^p(\sigma)}.$$
The remainder of this paper is organized as follows. In Section 2 we introduce the powerful notion of dyadic grids on spaces of homogeneous type. These were first constructed by Christ [4], but we will follow the more recent work of Hytönen and Kairema [16]. These grids let us naturally extend the Calderón-Zygmund decomposition and the techniques of the so-called sparse operators to an SHT. In Section 3 we will reduce the proof of our main theorems to proving estimates for sparse operators. The proof depends on results that in the Euclidean case are due to Lerner [22] and Lacey, Sawyer and Uriarte-Tuero [21]. We give the corresponding results for an SHT. In Section 4 we prove Theorem 1.4 by proving the corresponding result for sparse operators. The proof is nearly identical to the proof given in [8] in the Euclidean case, so we only sketch the details. In Section 5 we prove a weak (1, 1) inequality for sparse operators that we need for our proof of Theorem 1.5. Our proof follows the broad outline of the analogous result for singular integrals in Euclidean spaces due to Pérez [29]; however, it is simpler because of the localized behavior of sparse operators. In Section 6 we prove Theorems 1.5 and 1.6. Finally, in Section 7 we construct a pair of weights on the real line that satisfies a separated logarithmic bump condition but not the corresponding double bump condition.
In our proofs of Theorems 1.5 and 1.6 the only place we use that $A$ and $B$ are log bumps is in the final argument in Section 6. However, despite repeated efforts we are unable to eliminate this assumption. Nevertheless, we conjecture that both results are true with the weaker assumption that $\bar{A} \in B_{p'}$ and $\bar{B} \in B_p$, but we believe that new techniques will be required to prove this. On the other hand, very recently Nazarov, Reznikov and Volberg [27] have given a proof of the separated bump result in Euclidean spaces using Bellman functions. Certain aspects of their proof lead them to suggest that the full conjecture may be false.

Dyadic cubes in spaces of homogeneous type
An important tool for our proofs is the concept of a dyadic grid $D$ on an SHT and the concept of a sparse family $S$ in $D$. These generalize the classical Calderón-Zygmund decomposition (cf. [9, Appendix A]). The following result is due to Hytönen and Kairema [16] (see also Christ [4]).
Theorem 2.1. Given an SHT $(X, \rho, \mu)$, there exist constants $C > 0$, $0 < \eta, \epsilon < 1$, depending on $X$, a family of sets $D = \bigcup_{k \in \mathbb{Z}} D_k$ (called a dyadic decomposition of $X$) and a corresponding family of points $\{x_c(Q)\}_{Q \in D}$ that satisfy the following properties:
The sets $Q \in D$ are referred to as dyadic cubes with center $x_c(Q)$ and sidelength $\eta^k$, but we must emphasize that these are not cubes in any standard sense even if the underlying space is $\mathbb{R}^n$, and care must be taken when visualizing them. An exact characterization of the kinds of sets which can be dyadic cubes is given in [17]. Below we will need the dilations $\lambda Q$, $\lambda > 1$, of dyadic cubes. However, these will actually be balls containing $Q$: given a cube $Q$, we define
Families of dyadic grids can be constructed that have additional useful properties: see [16]. We apply one such family to show that our bump conditions can be restated in terms of dyadic cubes. Given a dyadic grid $D$, a pair of weights $(u, \sigma)$, and a Young function $A$, define
$$[u, \sigma]^D_{A,p} = \sup_{Q \in D} \|u^{1/p}\|_{A,Q} \|\sigma^{1/p'}\|_{p',Q}.$$
We define $[u, \sigma]^D_{A,B,p}$ similarly.
Lemma 2.2. Given a pair of weights $(u, \sigma)$, and Young functions $A$ and $B$,
$$[u, \sigma]_{A,p} \approx \sup_D\, [u, \sigma]^D_{A,p}, \qquad [u, \sigma]_{A,B,p} \approx \sup_D\, [u, \sigma]^D_{A,B,p}.$$
In both cases, the constants in the equivalence depend only on $X$.
Proof. We prove the first equivalence; the proof of the second is identical. Given a dyadic grid $D$ and $Q \in D$, by Theorem 2.1 there exists a ball $B_\rho$ such that $Q \subset B_\rho$ and $\mu(B_\rho) \approx \mu(Q)$. Therefore, there exists $C(X) > 1$ such that for any $\lambda > 0$,
$$\frac{1}{\mu(Q)} \int_Q A\Big(\frac{u(x)}{C(X)\lambda}\Big)\, d\mu(x) \le \frac{C(X)}{\mu(B_\rho)} \int_{B_\rho} A\Big(\frac{u(x)}{C(X)\lambda}\Big)\, d\mu(x) \le \frac{1}{\mu(B_\rho)} \int_{B_\rho} A\Big(\frac{u(x)}{\lambda}\Big)\, d\mu(x);$$
the last inequality holds since Young functions are convex. Hence, by the definition of the Orlicz norm, $\|u\|_{A,Q} \le C(X) \|u\|_{A,B_\rho}$. The same estimate holds for the norm of $\sigma$. We thus have that
$$\sup_D\, [u, \sigma]^D_{A,p} \le C(X)\, [u, \sigma]_{A,p}.$$
To prove the reverse inequality, we use the fact that there exists a family of dyadic grids $D_1, \ldots, D_J$, $J$ depending only on $X$, that satisfy the properties of Theorem 2.1 with the additional property that given any ball $B_\rho$, there exists $j$ and $Q \in D_j$ such that $B_\rho \subset Q$ and $\mu(B_\rho) \approx \mu(Q)$. (See [16, Theorem 4.1].) Therefore, we can repeat the above argument, reversing the roles of $B_\rho$ and $Q$, to get
$$[u, \sigma]_{A,p} \le C(X) \sup_D\, [u, \sigma]^D_{A,p}.$$
Given a collection of dyadic cubes $D$, a sparse family $S \subset D$ is a collection of dyadic cubes for which there exists a collection of sets $\{E(Q) : Q \in S\}$ such that the sets $E(Q)$ are pairwise disjoint, $E(Q) \subset Q$, and $\mu(Q) \le 2\mu(E(Q))$. Sparse families of cubes are a generalization of the Calderón-Zygmund decomposition in the Euclidean case. Using Theorem 2.1 we can form this decomposition in an SHT. In order to do this we need the Lebesgue differentiation theorem, which holds in any SHT. This fact seems to be new, though the proof only consists of assembling pieces already present in the literature: in particular, it is implicit in Toledano [35].
Lemma 2.3. Given an SHT $(X, \rho, \mu)$, the Lebesgue differentiation theorem holds: for $\mu$-almost every $x \in X$,
Proof.
Macías and Segovia, building on their earlier work in [23], showed in [24] that given any SHT $(X, \rho, \mu)$, there exists an equivalent quasidistance $\delta$ (i.e., there exist constants $c_1, c_2$ depending on $X$ such that for all $x, y \in X$, $c_1 \rho(x, y) \le \delta(x, y) \le c_2 \rho(x, y)$), such that given any ball $B_\delta$ with respect to $\delta$, $(B_\delta, \delta, \mu)$ is again a space of homogeneous type, and the constants are independent of the ball $B_\delta$. Toledano [35] proved that since $\mu(B_\delta) < \infty$, the measure $\mu$ when restricted to $B_\delta$ is regular. The Lebesgue differentiation theorem holds for regular measures: this follows from the standard argument (cf. Rudin [34, Chapter 7]) using the fact that the maximal operator is weak (1, 1) on $L^1(B_\delta, \mu)$ (Christ [4]) and that smooth functions of compact support are dense in $L^1(B_\delta, \mu)$ ([34, Chapter 3]). Therefore, we have that for $\mu$-almost every
Since $\rho$ and $\delta$ are equivalent and $\mu$ is doubling, it follows immediately that (2.1) holds in $B_\rho$. Since $X$ can be covered by a countable collection of $\rho$-balls, it holds for $\mu$-almost every $x \in X$.
Remark 2.4. As a corollary to this proof we also have that $C^\infty_c(X)$ is dense in $L^1(X, \mu)$. This fact, together with Lemma 2.3, can be used to simplify the hypotheses for results in a number of papers: see, for example, [2, 3].
Corollary 2.5. Given an SHT $(X, \rho, \mu)$ and a dyadic grid $D$ that satisfies the hypotheses of Theorem 2.1, then for $\mu$-almost every $x \in X$, if $\{Q_k\}$ is the sequence of dyadic cubes in $D$ such that $\bigcap_k Q_k = \{x\}$, then
Proof. First note that since $\rho$ is a quasi-distance and $\mu$ is doubling, if $x \in B(x_0, r)$, then $B(x_0, r) \subset B(x, 2Kr)$ and $\mu(B(x, 2Kr)) \approx \mu(B(x_0, r))$. Hence,
Therefore, if $B_k$ is a sequence of balls such that $\bigcap_k B_k = \{x\}$, then it follows from Lemma 2.3 that
Now for any $k$, by Theorem 2.1 there exists a ball $B_k$ such that $x \in Q_k \subset B_k$ and $\mu(B_k) \le C\mu(Q_k)$. Then (2.2) follows at once from (2.3).
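The ball inclusion used at the start of the proof of Corollary 2.5 follows directly from the quasi-triangle inequality. The sketch below assumes that $K \ge 1$ is the quasimetric constant, i.e. $\rho(x, y) \le K(\rho(x, z) + \rho(z, y))$, that $\rho$ is symmetric, and that $C_d$ denotes the doubling constant; the integer $m$ is introduced here only for illustration.

```latex
% Assumptions: \rho(x,y) \le K(\rho(x,z) + \rho(z,y)), \rho symmetric, x \in B(x_0,r).
% Inclusion: for any y \in B(x_0,r),
\[
\rho(x,y) \le K\big(\rho(x,x_0) + \rho(x_0,y)\big) < K(r + r) = 2Kr,
\qquad\text{so}\qquad B(x_0,r) \subset B(x,2Kr).
\]
% Comparable measures: for any y \in B(x,2Kr),
\[
\rho(x_0,y) \le K\big(\rho(x_0,x) + \rho(x,y)\big) < K(1+2K)\,r,
\qquad\text{so}\qquad B(x,2Kr) \subset B\big(x_0, K(1+2K)r\big).
\]
% Choosing an integer m with 2^m \ge K(1+2K) and applying doubling m times,
\[
\mu(B(x_0,r)) \le \mu(B(x,2Kr)) \le \mu\big(B(x_0, 2^m r)\big) \le C_d^{\,m}\, \mu(B(x_0,r)).
\]
```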
Remark 2.6. Corollary 2.5 was stated in [1] without proof and with a reference to [35]. However, as we noted, this result was only implicit there.
We now extend the Calderón-Zygmund decomposition to an SHT. We give a version that holds for Orlicz norms and not just for $L^1$ averages. We begin by defining a dyadic Orlicz maximal operator. Given a dyadic grid $D$ and a Young function $\Phi$, define
$$M^D_\Phi f(x) = \sup_{x \in Q \in D} \|f\|_{\Phi,Q}.$$
The standard dyadic maximal operator is obtained by taking $\Phi(t) = t$; in this case we simply write $M^D$.
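Theorem 2.7. Given an SHT $(X, \rho, \mu)$ such that $\mu(X) = \infty$, a dyadic grid $D$, and a Young function $\Phi$, suppose that $f$ is a measurable function such that $\|f\|_{\Phi,Q} \to 0$ as $\mu(Q) \to \infty$. Then the following are true:
(1) For each $\lambda > 0$, there exists a collection $\{Q_j\} \subset D$ that is pairwise disjoint, maximal with respect to inclusion, and such that
$$\Omega_\lambda = \{x \in X : M^D_\Phi f(x) > \lambda\} = \bigcup_j Q_j.$$
Moreover, there exists a constant $C(X)$ such that for every $j$,
$$\lambda < \|f\|_{\Phi,Q_j} \le C(X)\lambda.$$
(2) Given $a > 2/\epsilon$, where $\epsilon$ is as in Theorem 2.1, for each $k \in \mathbb{Z}$ let $\{Q^k_j\}_j$ be the collection of maximal dyadic cubes in (1) with
$$\Omega_k = \{x \in X : M^D_\Phi f(x) > a^k\} = \bigcup_j Q^k_j.$$
Then the set of cubes $S = \{Q^k_j\}$ is sparse, and $E(Q^k_j) = Q^k_j \setminus \Omega_{k+1}$.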
The proof of Theorem 2.7 is essentially identical to that in the Euclidean case: see, for example, [9, Appendix A.1]. The constant $C(X)$ in (1) depends on the doubling constant of $\mu$. When $\mu(X) < \infty$ some minor modifications to the proof are necessary; these correspond to what is often referred to as a "local" Calderón-Zygmund decomposition. To make them it suffices to note that in this case $X$ is bounded (see [14]). Therefore, by Theorem 2.1, for all dyadic cubes $Q$ sufficiently large, $X = Q$, and so the argument for (1)
Theorem 2.8. Given an SHT $(X, \rho, \mu)$ such that $\mu(X) = \infty$, and a dyadic grid $D$, suppose $f$ is a function such that $\|f\|_{1,Q} \to 0$ as $\mu(Q) \to \infty$. Then for any $\lambda > 0$ there exists a family $\{Q_j\} \subset D$ and functions $b$ and $g$ such that:
If $\mu(X) < \infty$, then this decomposition still exists if we take $\lambda > \frac{1}{\mu(X)} \int_X |f(x)|\, d\mu(x)$.
Theorem 2.8 is proved exactly as in the Euclidean case, taking {Q j } to be the cubes from (1) in Theorem 2.7. The proof that g ∈ L ∞ requires the Lebesgue differentiation theorem, Corollary 2.5.

Reduction to estimates for sparse operators
Given a dyadic grid $D$ and a sparse family $S$ in $D$, define the sparse operator $T_S = T_{S,D}$ by
$$T_S f(x) = \sum_{Q \in S} \Big(\frac{1}{\mu(Q)} \int_Q f\, d\mu\Big) \chi_Q(x).$$
The operator $T_S$ is a positive, dyadic Calderón-Zygmund operator. It follows from the definition of sparseness that $T_S$ is bounded on $L^2(\mu)$ and satisfies a weak (1, 1) inequality: see [1, Lemmas 6.4, 6.5].
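The $L^2$ bound for sparse operators can be sketched in a few lines. The computation below is a standard argument, not taken from [1]; it assumes the usual form of the sparse operator, $T_S f = \sum_{Q \in S} (\mu(Q)^{-1} \int_Q f\, d\mu)\, \chi_Q$, writes $\langle f \rangle_Q$ for the average of $f$ over $Q$ (our notation), and uses the $L^2$ boundedness of the dyadic maximal operator, which we denote $M^D$.

```latex
% Assumptions: f, g \ge 0; \langle f\rangle_Q = \mu(Q)^{-1}\int_Q f\,d\mu;
% sparseness gives pairwise disjoint E(Q) \subset Q with \mu(Q) \le 2\mu(E(Q)).
\begin{align*}
\int_X T_S f \cdot g \, d\mu
  &= \sum_{Q \in S} \langle f\rangle_Q \int_Q g\, d\mu
   = \sum_{Q \in S} \langle f\rangle_Q \langle g\rangle_Q\, \mu(Q) \\
  &\le 2 \sum_{Q \in S} \langle f\rangle_Q \langle g\rangle_Q\, \mu(E(Q))
   \le 2 \sum_{Q \in S} \int_{E(Q)} M^D f \cdot M^D g \, d\mu \\
  &\le 2 \int_X M^D f \cdot M^D g \, d\mu
   \le 2\, \|M^D f\|_{L^2(\mu)} \|M^D g\|_{L^2(\mu)}
   \le C\, \|f\|_{L^2(\mu)} \|g\|_{L^2(\mu)} .
\end{align*}
% Second line: \langle f\rangle_Q \le M^D f(x) for every x \in Q \supset E(Q).
% Third line: the sets E(Q) are pairwise disjoint, then Cauchy-Schwarz.
% Taking the supremum over g with \|g\|_{L^2(\mu)} = 1 gives \|T_S f\|_{L^2(\mu)} \le C \|f\|_{L^2(\mu)}.
```

The constant depends only on the $L^2$ norm of the dyadic maximal operator, so it is independent of the particular sparse family $S$.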
A key feature of our proofs is that we reduce the problem for Calderón-Zygmund operators to proving the same estimates for sparse operators. To do so we need to extend two results from the Euclidean setting to spaces of homogeneous type. The first result is due to Lerner [22] in the Euclidean setting; it was central to his greatly simplified proof of the $A_2$ conjecture. We defer the proof until the end of this section.
Theorem 3.1. Given an SHT $(X, \rho, \mu)$ and a Calderón-Zygmund operator $T$, then for any Banach function space $Y$,
$$\|T(f\sigma)\|_Y \le C(X, T) \sup_{D, S} \|T_S(f\sigma)\|_Y,$$
where the supremum is taken over every dyadic grid $D$ in $(X, \rho, \mu)$ and over every sparse family $S$ in $D$.
By taking $Y$ equal to $L^{p,\infty}(u)$ or $L^p(u)$, it follows immediately from Theorem 3.1 that to prove estimates for Calderón-Zygmund operators, it suffices to prove them for all sparse operators $T_S$ with constants independent of $S$ and $D$. Below, we will prove Theorems 1.4 and 1.5 by establishing such estimates.
To prove Theorem 1.6 we need to argue indirectly using a result which connects the weak and strong type norm inequalities of sparse operators. In the Euclidean case this theorem is due to Lacey, Sawyer and Uriarte-Tuero [21].
Theorem 3.2. Given an SHT (X, ρ, µ), let D be a dyadic grid and S a sparse family in D. Then The constants in the equivalence depend only on X, T and p; in particular they are independent of D and S.
The proof of Theorem 3.2 passes through the equivalence of the weak and strong type inequalities to certain testing conditions. The proof of this equivalence for weak type inequalities in an SHT is the same as in [21, Section 2.2] in the Euclidean setting; it is straightforward to see that the only properties of dyadic cubes used in the proof are those given in Theorem 2.1. The proof of this equivalence for strong type inequalities in [21] is much more involved; however, a simpler proof was recently given by Treil [36] and, as he notes (see Section 5 of his paper), this proof also extends to an SHT with essentially no change.
Given Theorem 3.2, Theorem 1.6 follows from the characterization of the weak type inequality in Theorem 1.5. We will make this precise in Section 6 below.
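To see why both separated bump conditions enter Theorem 1.6, it may help to write out the shape of the equivalence. The display below is an assumption: it is the form of the Lacey, Sawyer and Uriarte-Tuero characterization familiar from the Euclidean setting, and the operator-norm notation is ours, not the statement of Theorem 3.2 itself.

```latex
% Assumed form of the weak/strong equivalence for a sparse operator T_S:
\[
\|T_S(\cdot\,\sigma)\|_{L^p(\sigma) \to L^p(u)}
\;\approx\;
\|T_S(\cdot\,\sigma)\|_{L^p(\sigma) \to L^{p,\infty}(u)}
\;+\;
\|T_S(\cdot\,u)\|_{L^{p'}(u) \to L^{p',\infty}(\sigma)} .
\]
% First term: a weak type (p,p) bound for the pair (u,\sigma),
%   which is what the weak type theorem gives under [u,\sigma]_{A,p} < \infty.
% Second term: the same weak type bound with the roles of u and \sigma exchanged
%   and p replaced by p', which is what it gives under [\sigma,u]_{B,p'} < \infty.
% Hence the two separated bump hypotheses together control the strong type norm.
```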
Proof of Theorem 3.1. Our proof draws heavily on the results proved in [1] and we refer the reader there for complete details.
By our assumptions on f and σ we can, for clarity, replace f σ by f . As was proved in [16,Proposition 4.3], if we fix a point x 0 ∈ X, we can construct a dyadic grid D * satisfying Theorem 2.1 that contains a sequence of nested dyadic cubes {Q N } such that x 0 is the center of each cube Q N and such that N Q N = X. Therefore, by duality and Fatou's lemma, there exists g in the associate space Y ′ , g Y ′ = 1, such that Fix N > 0; we will prove that the final integral is bounded by C sup T S f Y , where the supremum is taken over some collection of S and D, but the constant is independent of these and also independent of N.
As was proved in [1,Section 5], there exist C 1 , C 2 , η > 0 such that for µ-a.e. x ∈ Q N , where and S N is a sparse subset of D * that consists of dyadic sub-cubes of Q N . The constants depend only on X and T ; in particular C 1 depends on the fact (see [4]) that T : L 1 (X, µ) → L 1,∞ (X, µ). Therefore, by Hölder's inequality (with respect to Y and Y ′ ), To estimate I 1 we give a pointwise estimate for Mf (x). By [16,Theorem 7.9] there exists a constant K = K(X) and a collection D 1 , . . . , D K of dyadic grids such that for every x ∈ X, where M k = M D k is the dyadic maximal operator defined with respect to D k . We claim that for each k there exists a sparse subset S k (depending on f ) such that This follows from (2) in Theorem 2.7. With the notation of this result, let S k = {Q i j } ∈ D k be the sparse family. Then given x ∈ Ω i \ Ω i+1 , there exists Q i j such that hence, for µ-a.e. x, If we now combine inequalities (3.4) and (3.5), we have that To estimate I 2 we will decompose each A i f and apply duality. By [16, Theorem 4.1] there exists a family of dyadic grids D 1 , . . . , D J , satisfying the properties of Theorem 2.1 with the additional property that given any ball B ρ , there exists j and Q * ⊂ D j such that B ρ ⊂ Q * and µ(B ρ ) ≈ µ(Q * ), with constants depending only on X. Recall (see the discussion after Theorem 2.1) that 2 i Q is defined to be a ball. Therefore, if we define Arguing as in [1, Section 6] (see especially Lemmas 6.5 and 6.13) we apply the same argument used to prove (3.3) for T to the adjoint operators B * i,j . Key to this is the fact that adjoint operators are weak (1, 1) with a constant that is linear in i. This yields the following pointwise estimate: where S j * ⊂ D j is sparse. Therefore, repeating the above argument for bounding the maximal operator, we have that B * i,j is bounded pointwise by a finite sum of sparse operators T S l , 1 ≤ l ≤ L (defined with respect to a finite collection of dyadic grids D). 
We can now estimate I 2 by duality using the fact that the operators T S l are self-adjoint: there exists a collection of g i ∈ Y ′ , g i Y ′ = 1, such that This completes the proof of Theorem 3.1.

Proof of Theorem 1.4
We will prove this result for sparse operators, with the $[u, \sigma]_{A,B,p}$ condition replaced by the $[u, \sigma]^D_{A,B,p}$ condition. Theorem 1.4 then follows immediately by Theorem 3.1 and Lemma 2.2. The proof for sparse operators is essentially identical to the proof in the Euclidean case in [8]; for the convenience of the reader we sketch the details.
We need one preliminary result. In the Euclidean case this is due to Pérez [30], and in an SHT to Pérez and Wheeden [31] and Pradolini and Salinas [32]. In the latter papers the proofs are for maximal operators defined with respect to balls instead of dyadic cubes, but the proofs rely on a version of Theorem 2.7 for balls and so immediately adapt to this setting. Lemma 4.1. Given an SHT (X, ρ, µ) and a Young function Φ such that Φ ∈ B p , then

Remark 4.2. In [30, 32] it is assumed that $\Phi$ satisfies the doubling condition $\Phi(2t) \le C\Phi(t)$. However, as noted in [9, p. 102], this assumption is only used to prove an equivalent formulation of the $B_p$ condition.
Proof of Theorem 1.4. By duality and the definition of $T_S$, there exists $g \in L^{p'}(u)$, $\|g\|_{L^{p'}(u)} = 1$, such that
In the next section we will need an equivalent version of this result for sparse operators, and so we state it here. The equivalence is easily seen by letting $\sigma = v^{1-p'}$.
Theorem 4.3. Given an SHT $(X, \rho, \mu)$, a dyadic grid $D$ and a sparse family $S \subset D$, and Young functions $A$, $B$ with $\bar{A} \in B_{p'}$ and $\bar{B} \in B_p$, suppose the pair of weights $(u, v)$ satisfies
$$\sup_{Q \in D} \|u^{1/p}\|_{A,Q} \|v^{-1/p}\|_{B,Q} < \infty.$$
Then
$$\|T_S f\|_{L^p(u)} \le C \|f\|_{L^p(v)}.$$
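The chain of estimates behind the proof of Theorem 1.4 for sparse operators can be sketched as follows. This is a reconstruction of the standard argument from the Euclidean case, not a transcription of the authors' display: it assumes the sparse operator is $T_S f = \sum_{Q \in S} \langle f \rangle_Q \chi_Q$ with $\langle f \rangle_Q$ the average over $Q$ (our notation), that Lemma 4.1 gives $\|M^D_\Phi h\|_{L^p(\mu)} \le C \|h\|_{L^p(\mu)}$ whenever $\Phi \in B_p$, and it writes $[u,\sigma]$ for the dyadic double bump constant.

```latex
% Assumptions: f, g \ge 0; \|g\|_{L^{p'}(u)} = 1; E(Q) pairwise disjoint with \mu(Q) \le 2\mu(E(Q)).
\begin{align*}
\int_X T_S(f\sigma)\, g\, u \, d\mu
  &= \sum_{Q \in S} \langle f\sigma\rangle_Q\, \langle g u\rangle_Q\, \mu(Q) \\
  &\le 4 \sum_{Q \in S}
     \|f\sigma^{1/p}\|_{\bar{B},Q}\, \|\sigma^{1/p'}\|_{B,Q}\,
     \|g u^{1/p'}\|_{\bar{A},Q}\, \|u^{1/p}\|_{A,Q}\, \mu(Q) \\
  &\le 8\, [u,\sigma] \sum_{Q \in S}
     \|f\sigma^{1/p}\|_{\bar{B},Q}\, \|g u^{1/p'}\|_{\bar{A},Q}\, \mu(E(Q)) \\
  &\le 8\, [u,\sigma] \int_X M^D_{\bar{B}}(f\sigma^{1/p})\, M^D_{\bar{A}}(g u^{1/p'}) \, d\mu \\
  &\le 8\, [u,\sigma]\, \|M^D_{\bar{B}}(f\sigma^{1/p})\|_{L^p(\mu)}\,
       \|M^D_{\bar{A}}(g u^{1/p'})\|_{L^{p'}(\mu)} \\
  &\le C\, [u,\sigma]\, \|f\sigma^{1/p}\|_{L^p(\mu)}\, \|g u^{1/p'}\|_{L^{p'}(\mu)}
   = C\, [u,\sigma]\, \|f\|_{L^p(\sigma)} .
\end{align*}
% Line 2: generalized Hoelder, applied to f\sigma = (f\sigma^{1/p})(\sigma^{1/p'})
%         and to g u = (g u^{1/p'})(u^{1/p}).
% Line 3: the bump condition and sparseness.
% Line 6: the maximal operator bounds, using \bar{B} \in B_p and \bar{A} \in B_{p'}.
```

Each growth condition is used exactly once, on the maximal operator attached to the complementary function of the corresponding bump.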

A weak (1, 1) inequality
In this section we prove a two-weight, weak (1, 1) inequality for sparse operators. A version of this result for general Calderón-Zygmund operators in the Euclidean case was due to Pérez [29] and our proof closely follows his. However, it is simplified because we are working with sparse operators: instead of appealing to duality and the Coifman-Fefferman inequality relating singular integrals and the maximal operator, we use two-weight theory via Theorem 4.3.
Theorem 5.1. Given an SHT (X, ρ, µ), let D be a dyadic grid satisfying the hypotheses of Theorem 2.1, and let S ⊂ D be sparse. Let Φ be a Young function such that for some 1 < q < ∞, A Φ (t) = Φ(t q ) satisfiesĀ Φ ∈ B q ′ . Then for all λ > 0, Proof. We first consider the case when µ(X) = ∞; at the end of the proof we will sketch the changes needed when µ(X) < ∞. Fix λ > 0 and let the disjoint cubes {Q j } and functions g and b = j b j be as given by Theorem 2.8. Since f = g + b, we have that where Ω = j Q j .
The estimate for I 1 is immediate: since µ is doubling, by the properties of the cubes {Q j }, the last inequality follows from the fact since t Φ(t), u 1,Q ≤ C(Φ) u Φ,Q .
To estimate I 2 , fix x ∈ Ω c ; then x ∈ Q c j for all j. By linearity, T S b(x) = j T S b j (x), and for each j, Hence, T S b j (x) = 0 and so I 2 = 0.
To estimate I 3 we want to apply Theorem 4.3 with the pair (u, M Φ u). Let B(t) = t (rq) ′ with 1/q < r < 1; thenB ∈ B q and [B] Bq ≤ C(q). We claim that To see this, fix Q ∈ D. Since B(1) = 1, it follows that χ Q B,Q = 1. Moreover, for any x ∈ Q, by a change of variables in the definition of the Orlicz norm, Hence, by Theorem 4.3 and since g(x) ≤ C(X)λ, Clearly, as desired. To estimate J 2 , assume for the moment that for each j and x ∈ Q j , . Given this, we have that It remains to prove (5.2). But if x ∈ Q j , then The norm on the right hand side is non-zero only if x ∈ Q and Q∩Q c j = ∅. Therefore, by the properties of dyadic cubes we must have that Q ⊂ Q j . Hence, and since this quantity is independent of x ∈ Q j , we get (5.2).
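The two facts about the power function $B(t) = t^{(rq)'}$ used in the estimate for $I_3$ can be verified directly. The sketch assumes the integral form of the $B_q$ condition ($\int_1^\infty \Phi(t)\, t^{-q}\, dt/t < \infty$) and the Luxemburg form of the Orlicz norm; both are assumptions in the sense that neither definition is displayed in the text above.

```latex
% (a) \bar{B} \in B_q.  Since B(t) = t^{(rq)'} with rq > 1 (this is where r > 1/q is used),
%     its complementary function is comparable to t^{rq}.  Then
\[
\int_1^\infty \frac{t^{rq}}{t^{q}}\,\frac{dt}{t}
= \int_1^\infty t^{-q(1-r)-1}\, dt
= \frac{1}{q(1-r)} < \infty ,
\]
%     which is finite precisely because r < 1.
% (b) \|\chi_Q\|_{B,Q} = 1.  For \lambda > 0,
\[
\frac{1}{\mu(Q)}\int_Q B\Big(\frac{\chi_Q(x)}{\lambda}\Big)\, d\mu(x) = B(1/\lambda) \le 1
\iff \lambda \ge 1 ,
\]
%     since B is increasing and B(1) = 1; the infimum of admissible \lambda is 1.
```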
If $\mu(X) < \infty$, then we can repeat the above proof for all $\lambda > \frac{1}{\mu(X)} \int_X f(x)\, d\mu(x)$. If the opposite inequality holds, then for some dyadic cube $Q$ sufficiently large, $Q = X$, and so

Proof of Theorems 1.5 and 1.6
We first show that Theorem 1.6 is a consequence of Theorem 1.5. Given both separated bump conditions, the latter result implies that
Therefore, by Theorem 3.2 we get the desired strong type inequality.
To prove Theorem 1.5 it will again suffice to prove it for sparse operators. In order to do this we need a weighted norm inequality for an Orlicz maximal operator. The following result was proved in [10] for the Hardy-Littlewood maximal operator in the Euclidean case; the proof in an SHT is nearly the same and we sketch the details.
Lemma 6.1. Given $1 < p < \infty$, let $A$, $C$ and $\Phi$ be Young functions such that
Proof. We first consider the case when $\mu(X) = \infty$. By Theorem 2.7, fix $a > 1$ sufficiently large and form the cubes $\{Q^k_j\}$ such that
If $\mu(X) < \infty$, then let $k_0$ be the largest integer such that
Then $\Omega_{k_0} = X$. We can repeat the above argument summing over $k \ge k_0$, and for $k > k_0$ we can still form the cubes $\{Q^k_j\}$ and argue as before. When $k = k_0$, there exists a large dyadic cube $Q = X$. Hence, $a^{k_0} < \|f\|_{\Phi,Q}$ and the argument proceeds as before, replacing the collection $\{Q^k_j\}$ with the single cube $Q$.
We can now prove Theorem 1.5. Note that this is the only part of the proof in which we use the fact that $A$ is a log bump. The proof uses an extrapolation argument from [10]; see also [9, Chapter 8].

Separated and double bump conditions
We construct our example on the real line with p = 2. Our example can be readily modified to work for other values of p. Define the Young functions A(t) = B(t) = t 2 log(e + t) 2 .
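Then $\bar{A}, \bar{B} \in B_2$. By rescaling, if we let $\Phi(t) = t \log(e + t)^2$, then for any pair $(u, \sigma)$, $\|u^{1/2}\|_{A,Q} \approx \|u\|^{1/2}_{\Phi,Q}$ and $\|\sigma^{1/2}\|_{B,Q} \approx \|\sigma\|^{1/2}_{\Phi,Q}$. Therefore, it will suffice to estimate the norms of $u$ and $\sigma$ with respect to $\Phi$. Similarly, we can replace the localized $L^2$ norms of $u^{1/2}$ and $\sigma^{1/2}$ with the $L^1$ norms of $u$ and $\sigma$.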
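The rescaling step can be made explicit. The sketch below assumes the Luxemburg form of the Orlicz norm and $A(t) = t^2 \log(e+t)^2$, $\Phi(t) = t \log(e+t)^2$ as in the example; the substitution $s = t^2$ is ours.

```latex
% Step 1: compare A(s^{1/2}) with \Phi(s).  For all s \ge 0,
\[
\tfrac12 \log(e+s) \;\le\; \log(e+s^{1/2}) \;\le\; \log(e+1)\,\log(e+s),
\]
% (left: (e+s^{1/2})^2 \ge e+s; right: s^{1/2} \le s if s \ge 1, and
%  \log(e+s^{1/2}) \le \log(e+1) \le \log(e+1)\log(e+s) if s < 1).  Hence
\[
A(s^{1/2}) = s\,\log(e+s^{1/2})^2 \;\approx\; s\,\log(e+s)^2 = \Phi(s).
\]
% Step 2: insert s = u/\lambda^2 in the definition of the norm:
\[
\frac{1}{|Q|}\int_Q A\Big(\frac{u^{1/2}}{\lambda}\Big)\,dx
\;\approx\;
\frac{1}{|Q|}\int_Q \Phi\Big(\frac{u}{\lambda^2}\Big)\,dx ,
\qquad\text{so}\qquad
\|u^{1/2}\|_{A,Q}^2 \;\approx\; \|u\|_{\Phi,Q}.
\]
% The multiplicative constants from Step 1 are absorbed using convexity of \Phi
% (c\,\Phi(t) \le \Phi(ct) for c \ge 1), at the cost of constants in the norm equivalence.
% The same computation with t^2 in place of A gives \|u^{1/2}\|_{2,Q}^2 = \|u\|_{1,Q}.
```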
We now define u and σ as follows: where I n = (e n + n − 1, e n + n) and J n = (e n , e n + 1). Since the above computations are translation invariant, we immediately get that if Q n = (e n , e n + n), then u Φ,Qn σ Φ,Qn ≈ log(e + n), and so [u, σ] A,B,2 = ∞. It remains, therefore, to show that [u, σ] A,2 and [σ, u] B,2 are both finite. We will consider [u, σ] A,2 ; the argument for the second is essentially the same. Fix an interval Q; we will show that u Φ,Q σ 1,Q is uniformly bounded. Fix an integer N such that N − 1 ≤ |Q| ≤ N. We need to consider those values of n such that Q intersects either I n or J n .
Suppose that for some $n \ge N + 2$, $Q$ intersects $I_n$. But in this case it cannot intersect $J_k$ for any $k$ and so $\|\sigma\|_{1,Q} = 0$. Similarly, if $Q$ intersects $J_n$, then $\|u\|_{\Phi,Q} = 0$. Now suppose that for some $n < N + 2$, $Q$ intersects one of $I_n$ or $J_n$. If $\log(N) \lesssim n$ (more precisely, if $N < e^n - e^{n-1} - 1$), then for any $k \ne n$, $Q$ cannot intersect $I_k$ or $J_k$. In this case $\|u\|_{\Phi,Q} \|\sigma\|_{1,Q} \ne 0$ only if $Q$ intersects both $I_n$ and $J_n$, and it will reach its maximum when $N \approx n$. But in this case we can replace $Q$ by $(e^n, e^n + n)$ and the above computation shows that $\|u\|_{\Phi,Q} \|\sigma\|_{1,Q} \lesssim 1$.
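Finally, suppose $Q$ intersects one or more pairs $I_n$ and $J_n$ with $n \lesssim \log(N)$. Then $|\operatorname{supp}(u) \cap Q| \lesssim \log(N)$ and $\|u\|_{L^\infty(Q)} \approx K_{\lfloor \log(N) \rfloor} \lesssim \log(N)^2$. Therefore,
hence, we again have that $\|u\|_{\Phi,Q} \|\sigma\|_{1,Q} \lesssim 1$. It follows that $[u, \sigma]_{A,2} < \infty$ and our proof is complete.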
Remark 7.1. If we modify our example by defining $K_n = n^2 \log(e + n)^{-2}$, then the same argument shows that $(u, \sigma)$ satisfy the separated bump condition when $A(t) = B(t) = t^2 \log(e + t)^{1+\delta}$, $0 < \delta < 2$, but do not satisfy the double bump condition for any $\delta > 0$. It would be of interest to construct a pair that satisfies a separated bump condition for some pair of log bumps but fails to satisfy the double bump condition for any pair of Young functions for which the appropriate $B_p$ conditions hold.