A numerical estimate of the regularity of a family of Strange Non-Chaotic Attractors

We estimate numerically the regularities of a family of Strange Non-Chaotic Attractors related to one of the models studied in C. Grebogi et al. (1984) (see also G. Keller (1996)). To estimate these regularities we use wavelet analysis in the spirit of R. de la Llave et al. (2002), together with some ad-hoc techniques that we develop to overcome the theoretical difficulties that arise when applying the method to the particular family that we consider. These difficulties are mainly due to two facts: we do not have an explicit formula for the attractor, and the attractor is discontinuous almost everywhere for some values of the parameters. Concretely, we propose an algorithm based on the Fast Wavelet Transform. We also perform a quality check of the wavelet coefficients and of the regularity estimates.


Introduction
The aim of this paper is to develop techniques and algorithms to compute approximations of (geometrically) extremely complicated dynamical invariant objects by means of wavelet expansions. Moreover, from the wavelet coefficients we want to derive an estimate of the regularity of these invariant objects. When the theoretical regularity is known, the comparison between both values gives a natural and good quality test of the algorithms and approximations.
In this paper the invariant objects that we study and consider when developing our algorithms are Strange Non-Chaotic Attractors. They appear in a natural way in families of quasiperiodically forced skew products on the cylinder of the form

(θ, x) ↦ (R_ω(θ), F_{σ,ε}(θ, x)),

where F_{σ,ε} : S^1 × R → R is continuous and C^1 with respect to the second variable, R_ω(θ) = θ + ω (mod 1) with ω ∈ R \ Q, S^1 = R/Z = [0, 1) denotes the circle and ε, σ ∈ R^+. These systems have the important property that any fibre, {θ} × R, is mapped into another fibre, {R_ω(θ)} × R.
Our main goal will be to derive approximations in terms of wavelets of the invariant maps ϕ : S^1 → R satisfying

ϕ(R_ω(θ)) = F_{σ,ε}(θ, ϕ(θ)).

Under certain conditions the graphs of these invariant maps have a very complicated geometry where, roughly speaking, complicated means not piecewise continuous. In that case, we will say that the graph of ϕ is a Strange Non-chaotic Attractor (SNA). A common particular case of SNA occurs when the invariant function is positive on a set of full Lebesgue measure and vanishes on a residual set.
The standard approach when approximating dynamical invariant objects is to use Fourier expansions (rather than wavelet ones). In the SNA framework this approach has a serious drawback: an accurate approximation of ϕ demands a high number of Fourier modes due to the appearance of strong oscillations (see e.g. [Jor01]). One natural way to overcome this problem is to use other orthonormal bases, such as wavelets, and the associated multi-scale methods (see e.g. [Coh03,Mal98]). One of the advantages of this approach is that wavelets also characterize certain regularity spaces B^s_{∞,∞} (see e.g. [HW96,Coh03,Mey01,Tri06]) that provide a natural framework for the approximations that one gets.
Precisely, the regularity can be regarded as a measure of how ϕ becomes strange in terms of functional spaces. For example, in [dlLP02], the authors implement wavelet analysis numerically to estimate the "positive" regularity of invariant objects which are graphs of functions in appropriate spaces. However, due to the complexity of the SNAs described above, we need to consider the possibility that these objects have zero or even negative regularity (see [Coh03]). Hence, the techniques of [dlLP02] need to be extended to this case. To this end, we develop ad-hoc techniques, in the same spirit as [dlLP02], to overcome the theoretical difficulties that the objects we study pose when performing a wavelet analysis to estimate the regularity of such ϕ. Our wavelet analysis will be based on the Fast Wavelet Transform (see e.g. [Mal98]).
The computation of the regularity (depending on parameters) can give some insight into the study of the fractalization, or other routes of creation of SNAs, and help in detecting this bifurcation.
We apply the above program to a slight modification of the system considered in [GOPY84]. Indeed, the attractor obtained in [GOPY84] (as shown by Keller in [Kel96]) is the graph of an upper semi-continuous function from the circle to R in the pinched case (that is, when there exists a fibre whose image is degenerate to a point), whereas in the non-pinched one the attractor is the graph of a map with the same regularity as the skew product (see also [Sta97] and [Sta99]). As we will see, the wavelet coefficients together with the computed regularity clearly detect the functional-space jump associated to the creation of the SNA.
This paper is organized in two parts. The first one is a survey on wavelets and regularity, whereas the second one applies these techniques to the SNA case. More concretely, in Section 2 we recall some topics from the theory of wavelet bases. Section 3 reviews the notion of regularity through Besov functional spaces and discusses it by means of a particularly simple example. In Besov spaces the regularity can be any real number (in contrast to Hölder regularity, which is defined only for positive values). In Section 4 we review the relation between the regularity and the wavelet coefficients of a function. Section 4.1 presents and tests a methodology to numerically estimate regularities based on the previous sections.
Finally, in the second part, in Section 5 we survey the family of Strange Non-Chaotic Attractors that we will study. In particular, we state Keller's Theorem and we remark crucial aspects of its proof that will be used in devising the algorithm that we propose. In Section 6, we present how one can overcome some of the theoretical difficulties related to the strangeness of the SNA. In Section 6.3, we describe the algorithm to compute the regularity of the attractors and, in Section 7, the results of this computation, for a particular instance of SNAs, are presented and discussed.

A survey on wavelets
We aim at approximating by means of wavelets a certain class of functions from the circle R/Z to an interval of the real line. Recall that a standard approach used in the literature to compute and work with invariant objects of systems exhibiting periodic or quasi-periodic behaviour is to use finite Fourier approximations (trigonometric polynomials), namely functions of the form

Σ_{n=0}^{N} (a_n cos(2πnθ) + b_n sin(2πnθ)).
As mentioned above, in this paper we aim instead at using finite wavelet expansions of the form

Σ_j Σ_n d_j[n] ψ_{j,n}(θ),

where ψ_{j,n}(θ) is obtained by translation and dilation of a mother wavelet ψ(x). To be explicit, let us start by introducing the orthonormal wavelet bases of L^2(R). A natural way to do it is via the notion of Multiresolution Analysis. We refer the reader to [Mal98,HW96] for more detailed and comprehensive expositions.
Recall that a Multiresolution Analysis (MRA) consists of a decreasing chain of closed subspaces {V_j}_{j∈Z} of L^2(R), with ∩_{j∈Z} V_j = {0} and ∪_{j∈Z} V_j dense in L^2(R), such that f(x) ∈ V_j if and only if f(2x) ∈ V_{j−1}. Moreover, there exists a function φ whose integer translates, φ(x − n), form an orthonormal basis of V_0. Such a function is called the scaling function.
Before continuing the explanation, let us recall that, for f ∈ L^2(R), f̂(ξ) = ∫_R f(x) e^{−iξx} dx denotes its Fourier transform and f^∨ stands for the inverse Fourier transform. If we fix an MRA, it follows that V_j has an orthonormal basis {φ_{j,n}}_{n∈Z}, for every j, where

φ_{j,n}(x) = 2^{−j/2} φ(2^{−j}x − n).

Now, define the subspace W_j as the orthogonal complement of V_j in V_{j−1}. That is,

V_{j−1} = V_j ⊕ W_j.

Therefore, by the inclusion of the spaces V_j we have

L^2(R) = ⊕_{j∈Z} W_j.

The mother wavelet ψ ∈ W_0 is defined to be the function whose Fourier transform is

(4)  ψ̂(2ξ) = 2^{−1/2} e^{−iξ} h^*(ξ + π) φ̂(ξ),

where h^*(ξ) is the complex conjugate of h(ξ) = Σ_{n∈Z} h[n] e^{−inξ}, and h[n] = ⟨2^{−1/2} φ(x/2), φ(x − n)⟩ is called the scaling filter (or the low pass filter) of the Multiresolution Analysis. We define the support of h[n], denoted by supp(h), as the minimum subset I of Z such that I is a set of consecutive integers and h[n] = 0 for every n ∈ Z \ I.
The following result (see [Mal98, Theorem 7.3]) allows to obtain the wavelet basis from the scaling function:

Theorem 2.2 (Mallat, Meyer). The mother wavelet given by Equation (4) verifies that, for each integer j, the family {ψ_{j,n}}_{n∈Z} is an orthonormal basis of W_j, where

ψ_{j,n}(x) = 2^{−j/2} ψ(2^{−j}x − n).

As a consequence, the family {ψ_{j,n}}_{(j,n)∈Z×Z} is an orthonormal basis of L^2(R).
Recall that we want to approximate maps f ∈ L^2(R) by linear combinations of wavelets. By (3) and the theorem above, the projection of f to V_{−J} ⊂ L^2(R),

Σ_{n∈Z} ⟨f, φ_{−J,n}⟩ φ_{−J,n},

is a good approximation of f provided that J > 0 is large enough. We want to rewrite such an approximation as a linear combination of wavelets of the form Σ_j Σ_n d_j[n] ψ_{j,n}. To do it, as usual, we define the coefficients a_j[n] := ⟨f, φ_{j,n}⟩ and d_j[n] := ⟨f, ψ_{j,n}⟩ for j, n ∈ Z. With this notation, the initial approximation of f becomes Σ_{n∈Z} a_{−J}[n] φ_{−J,n}. By (2) we have

(6)  a_j[n] = Σ_{p∈Z} h[p − 2n] a_{j−1}[p]   and   (7)  d_j[n] = Σ_{p∈Z} g[p − 2n] a_{j−1}[p],

where g[n] := (−1)^{1−n} h[1 − n], for every j, n ∈ Z. Hence, from the iterative use of (6) and (7) starting with the approximation Σ_{n∈Z} a_{−J}[n] φ_{−J,n} we obtain the approximation of f that we are looking for:

f ≈ Σ_{n∈Z} a_0[n] φ_{0,n} + Σ_{j=−J+1}^{0} Σ_{n∈Z} d_j[n] ψ_{j,n}.

For (numerical) applications such infinite approximations are usually not available, since we often work with finite information about our function. For this we need a similar theory for finite-dimensional subspaces of V_j and W_j. For j ≥ 0 we define

V*_{−j} := ⟨φ_{−j,n} : n = 0, 1, …, 2^j − 1⟩   and   W*_{−j} := ⟨ψ_{−j,n} : n = 0, 1, …, 2^j − 1⟩,

where ⟨f_1, f_2, …, f_n⟩ denotes the subspace of L^2(R) generated by the linear combinations of f_1, f_2, …, f_n. From the comment at the end of Section 7.3.1 of [Mal98] (see also [Fra99, Lemma 3.26] for a more detailed account), it follows that

(8)  V*_{−j} = V*_{−j+1} ⊕ W*_{−j+1}

for every j > 0. Hence, given a function f ∈ L^2(R), we can take a good finite approximation of the map given by its projection to V*_{−J}:

(9)  f ≈ Σ_{n=0}^{2^J−1} a_{−J}[n] φ_{−J,n},

provided that J is large enough. Again, we are interested in writing such an approximation as a linear combination of wavelets, but in this case this expansion must be finite:

(12)  f ≈ a_0[0] φ_{0,0} + Σ_{j=−J+1}^{0} Σ_{n=0}^{2^{−j}−1} d_j[n] ψ_{j,n}.

To obtain this expression observe that (8) implies the finite (periodized) analogues of (6) and (7):

(10)  a_{−j+1}[n] = Σ_{p=0}^{2^j−1} h[p − 2n] a_{−j}[p]   and   (11)  d_{−j+1}[n] = Σ_{p=0}^{2^j−1} g[p − 2n] a_{−j}[p]

for every j > 0 and n ∈ {0, 1, …, 2^{j−1} − 1}. Hence, with the iterative use of (10) and (11) starting with the approximation (9) we obtain (12), as we wanted.
Remark 2.3. It is important to point out that the "finiteness" of the FWT turns the orthonormal bases of the spaces W*_{−j}, j > 0, into an orthonormal basis adapted to S^1. Therefore, we do not need anything more when we have to deal with maps which naturally live on S^1.
To effectively compute an approximation of the type given in Equation (12) one remaining problem is left: to find a good estimate of the initial coefficients a_{−J}[n] = ⟨f, φ_{−J,n}⟩. In the literature there is a lot of discussion on how to compute these coefficients, but a simple customary approach is to use the following estimate (see, for instance, [Fra99, Lemma 5.54] and its proof):

Lemma 2.4. Assume that f verifies |⟨f, φ_{j,n}⟩| < ∞ for every (j, n) ∈ Z × Z and

|f(x) − f(y)| ≤ C_1 |x − y|

for all real numbers x, y and a constant C_1 < ∞. Suppose that the scaling function φ from an MRA {V_j}_{j∈Z} is such that ∫_R φ(x) dx = 1 and C_2 := ∫_R |x| |φ(x)| dx < ∞. Then, for every (j, n) ∈ Z × Z,

|⟨f, φ_{j,n}⟩ − 2^{j/2} f(2^j n)| ≤ C_1 C_2 2^{3j/2}.

As a corollary of this lemma we see that, if f is Lipschitz, then a_{−J}[n] ≈ 2^{−J/2} f(2^{−J} n) up to an error of order 2^{−3J/2}. Summarizing, Lemma 2.4 gives us a method to initialize the FWT. This gives a complete algorithm to compute the wavelet coefficients of a certain function f ∈ L^2(R).
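The initialization given by Lemma 2.4 together with the cascade (6)-(7) (in their finite, periodized form (10)-(11)) can be sketched in a few lines. The sketch below is ours and uses the Haar filter h = (2^{−1/2}, 2^{−1/2}) for concreteness; the computations in this paper use Daubechies filters instead.

```python
import numpy as np

def fwt_periodic(samples):
    """Periodized fast wavelet transform with the Haar filter.

    samples: 2**J equispaced samples of f on [0, 1).
    Returns the scaling coefficient a_0[0] and the list of detail arrays
    at levels -J+1 (finest) through 0 (coarsest), mimicking Eq. (12).
    """
    a = np.asarray(samples, dtype=float)
    J = int(np.log2(len(a)))
    assert len(a) == 2 ** J, "need 2**J samples"
    # Lemma 2.4 initialization: a_{-J}[n] ~ 2^{-J/2} f(n 2^{-J})
    a = a * 2.0 ** (-J / 2.0)
    details = []
    for _ in range(J):
        # Haar analysis step: averages play the role of (10),
        # differences the role of (11)
        d = (a[0::2] - a[1::2]) / np.sqrt(2.0)
        a = (a[0::2] + a[1::2]) / np.sqrt(2.0)
        details.append(d)
    return a[0], details
```

For instance, a constant function has all its detail coefficients equal to zero, and the orthogonality of each analysis step preserves the total energy of the coefficients.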

Defining regularity through Besov spaces
In this section we will make precise the notion of regularity that we will use. To do so, we will describe, in two steps, the functional spaces that define this notion. Roughly speaking, these spaces collect the functions that verify an α-Hölder condition. However, as we will see, we will have to deal with functions of regularity zero (that do not verify any Hölder condition). The notion of non-positive regularity is formalized through Besov spaces (see [Tri83,BL76]). In what follows we recall the definition of the Besov spaces on the real line [Tri83, Section 2.3]. Next we will recall the extension of such definition to S^1.
3.1. The spaces B^s_{∞,∞}(R). The space of all real valued, rapidly decreasing, infinitely differentiable functions is called the (real) Schwartz space and it is denoted by S(R). The topological dual of S(R) is the space of tempered distributions, which is denoted by S'(R). For f ∈ S'(R), f̂(ξ) denotes the Fourier transform of f and f^∨(x) stands for the inverse Fourier transform in the sense of distributions (see e.g. [Tri83]). Recall, also, that the essential supremum is defined as

ess sup f = inf { C ∈ R : µ({x : f(x) > C}) = 0 },

where µ is a measure (in our case the usual Lebesgue measure).
Let ϕ_0 ∈ S(R) be such that ϕ_0(x) = 1 if |x| ≤ 1 and ϕ_0(x) = 0 if |x| ≥ 3/2, and set

ϕ_j(x) := ϕ_0(2^{−j}x) − ϕ_0(2^{−j+1}x)

for j ∈ N. It is not difficult to show that, independently of the choice of ϕ_0, Σ_{j=0}^{∞} ϕ_j(x) = 1 for all x ∈ R. Each of the families {ϕ_j}_{j=0}^{∞} is called a Dyadic Resolution of Unity in R. Then, we define the Besov spaces by

B^s_{∞,∞}(R) := { f ∈ S'(R) : ‖f‖_{∞,∞,ϕ,s} := sup_{j≥0} 2^{js} ess sup_{x∈R} |(ϕ_j f̂)^∨(x)| < ∞ }.

As it can be seen in [Tri83, Remark 2 of Section 2.3], the spaces B^s_{∞,∞}(R) are, in fact, independent of the chosen dyadic resolution of unity ϕ. Therefore, we can remove the subscript ϕ and, in what follows, we will write ‖f‖_{∞,∞,s} instead of ‖f‖_{∞,∞,ϕ,s}. The spaces B^s_{∞,∞}(R) are a particular case of the generalized Besov spaces B^s_{p,q}(R), defined also, for example, in [Tri83], and one has the inclusion property: if s < s', then B^{s'}_{p,q}(R) ⊂ B^s_{p,q}(R). For s > 0, the spaces B^s_{∞,∞}(R) coincide with the Hölder-Zygmund spaces, and it is natural to extend the notion of regularity to s ≤ 0 through B^s_{∞,∞}(R) in the following way (we refer to [Tri83,Ste70] for a more complete explanation).

Definition 3.2. Given s ∈ R, we say that a function (or tempered distribution) f has regularity s if f ∈ B^s_{∞,∞}(R).
The following examples help to clarify this regularity notion.
(i) Consider the Weierstraß function defined by

W_{A,B}(x) = Σ_{k=0}^{∞} A^k cos(2π B^k x),

where A, B ∈ R are such that B^{−1} < A < 1 < B. It is Hölder continuous and nowhere differentiable (but it has a distributional derivative, as does every locally integrable function). Moreover, it has regularity −log_B(A); that is, W_{A,B} ∈ B^{−log_B(A)}_{∞,∞}(R).

(ii) There exist continuous functions that do not verify any Hölder condition and, hence, have regularity 0 while still being continuous. As a matter of fact, [RS96, Section 2.3, Example 1] is devoted to giving examples of such functions (in terms of belonging to certain functional spaces). The prototypical example is obtained by considering α, β with α^2 + β^2 > 0 and β > 0, and defining a function, supported near the origin, whose only singularity is located at the origin.

(iii) Negative exponents in the Besov scale must be understood in a distributional sense. For example, it is known that δ(x) ∈ B^{−1}_{∞,∞}(R), where δ(x) stands for Dirac's delta. In view of that, δ(x) can be considered as the second (distributional) derivative of the continuous function x ↦ max{x, 0}. Indeed, this function is Lipschitz (because it is the anti-derivative of a bounded function) and the derivative operator reduces s by 1 (leaving p = q = ∞ unchanged).

We conclude this section with some remarks on the spaces B^s_{∞,∞} with s ∈ [0, 1). The fact that f belongs to a concrete Hölder space is equivalent to specifying its degree of differentiability and how "wild" its last derivative is. Indeed, if a function f is Hölder of exponent n + s, where n ∈ N and s ∈ (0, 1), then around each point it can be approximated by a polynomial of degree n (the Taylor polynomial, for example), and the difference between the polynomial and the function at distance |y| is uniformly bounded by a constant times |y|^{n+s}. Moreover, if n ≥ 1 then f is continuously differentiable whereas, if n = 0, then f "may be taken" to be continuous (but not differentiable) in the spirit of the following proposition.
Finally, we want to mention that in [Coh03, Section 3.8] it is shown that the spaces with negative s are the duals of those with positive s. That is, for s < 0 the spaces B^s_{∞,∞}(S^1) are "purely" distributional spaces (see Example 3.3 (iii)). Thus, we can state the following lemma.
Proof. If f is an upper semi-continuous function which is not continuous, then it cannot verify an s-Hölder condition (for any s > 0). Thus, the regularity s is non-positive. On the other hand, since f is a bounded function and, hence, not a purely distributional object, the parameter s cannot be negative.
3.2. The Besov spaces on S^1. In this section we will extend the Besov spaces to S^1. Recall that we consider S^1 = R/Z and, hence, identify it with the interval [0, 1). To do it we follow [BL76,Tri92]. Indeed, given f ∈ S'(S^1) (the space of tempered distributions on S^1), it is known that f is determined by its Fourier coefficients f̂(n), which grow at most polynomially in n.

Definition 3.6. Let ϕ = {ϕ_j}_{j=0}^{∞} be a dyadic resolution of unity (on R). We define the Besov spaces on S^1 by

B^s_{∞,∞}(S^1) := { f ∈ S'(S^1) : sup_{j≥0} 2^{js} ess sup_{x∈S^1} | Σ_{n∈Z} ϕ_j(n) f̂(n) e^{2πinx} | < ∞ }.

As in Definition 3.2, we say that a circle map f has regularity s ∈ R if the map f belongs to B^s_{∞,∞}(S^1). The following lemma shows that the regularity of a circle map coincides with the regularity of its real extension, which we define as follows. Given f ∈ S'(S^1) there exists a unique f_PER ∈ S'(R) such that f_PER is 1-periodic and the restriction of f_PER to [0, 1) coincides with f (such an f_PER can be defined as f({·}), where {·} denotes the fractional part function). The following lemma is usually omitted and used implicitly, but we include it here for completeness.

Lemma 3.7. Let f ∈ S'(S^1) and s ∈ R. Then f ∈ B^s_{∞,∞}(S^1) if and only if f_PER ∈ B^s_{∞,∞}(R). Hence, the regularities of f and f_PER coincide.

Wavelets and regularity
In Section 3 we have recalled the notion of the regularity of a function through the spaces B^s_{∞,∞}(R) and B^s_{∞,∞}(S^1). Also, we have introduced the wavelet expansions of a given function in L^2(R). Next, we want to show the relationship between this notion of regularity and the wavelet coefficients. Such relationship will be the main tool of the forthcoming Algorithm 6.5. The main tool for this will be the Daubechies wavelets, because they are orthonormal bases of L^2(R) (see [Mal98] for a definition and construction) and, depending on the number of vanishing moments, they are well adapted to the functional spaces B^s_{∞,∞}(R) (see [HW96,Tri06]).

Definition 4.1. Let ψ(x) be a wavelet from an MRA {V_j}_{j∈Z}. We say that ψ has p vanishing moments if p is the minimum non-negative integer such that ∫_R x^k ψ(x) dx = 0 for every integer 0 ≤ k ≤ p − 1 and ∫_R x^p ψ(x) dx ≠ 0.

Daubechies wavelets are a family of wavelets with compact support that have an element with p vanishing moments for each p ≥ 1. From [HW96, Theorem 7.16], [Coh03, Theorem 3.8.1] and [Tri06, Theorem 1.64] we will state the following theorem, in the spirit of [dlLP02, Theorem 5.10] and [Mey01, Chapter 3], which will be useful for our purposes. To this end, for t ∈ R we set R(t) := t − 1/2.

Theorem 4.2. Let f ∈ L^2(R) and let ψ be a mother Daubechies wavelet with more than max(R(τ), 5/2 − R(τ)) vanishing moments for some τ ∈ R \ [−1/2, 1/2]. Then f ∈ B^{R(τ)}_{∞,∞}(R) if and only if there exists C > 0 such that

(13)  sup_{n∈Z} |⟨f, ψ_{j,n}⟩| ≤ C 2^{τj}

for all j ≤ 0. Furthermore, if ψ has more than 2 vanishing moments, then either {sup_{n∈Z} |⟨f, ψ_{j,n}⟩| 2^{−τj}}_{j≤0} is unbounded for every τ ∈ R, or there exist C > 0 and τ ∈ [−1/2, 1/2] such that sup_{n∈Z} |⟨f, ψ_{j,n}⟩| ≤ C 2^{τj} for all j ≤ 0.

Remark 4.3. In view of Theorem 4.2, if the coefficients sup_{n∈Z} |⟨f, ψ_{j,n}⟩| decay approximately exponentially with respect to j, that is, if

s_j := log_2 ( sup_{n∈Z} |⟨f, ψ_{j,n}⟩| ) ≈ log_2(C) + τj,

then f has regularity s = R(τ). This tells us that, in this case, to compute the value of the regularity s we can make a linear regression to estimate the slope τ of the graph of the pairs (j, s_j) and get the correct value of s from this slope. The Pearson correlation coefficient controls the degree of linear correlation between the variables j and s_j.
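The linear fit described in Remark 4.3 is elementary; a minimal version (the naming is ours) that returns the slope τ, the intercept log_2 C and the Pearson coefficient could be:

```python
import numpy as np

def slope_and_pearson(j_values, s_values):
    """Least-squares fit s_j ~ log2(C) + tau * j; returns (tau, log2_C, pearson_r)."""
    j = np.asarray(j_values, dtype=float)
    s = np.asarray(s_values, dtype=float)
    tau, log2_C = np.polyfit(j, s, 1)   # slope and intercept of the regression line
    r = np.corrcoef(j, s)[0, 1]         # Pearson correlation coefficient
    return tau, log2_C, r
```

For synthetic coefficients obeying sup_n |⟨f, ψ_{j,n}⟩| = C 2^{τj} exactly, the fit recovers τ and the intercept log_2 C, and gives |r| = 1.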
Moreover, in Theorem 4.2 the constant C > 0 must be understood as the norm of f. Therefore, Equation (13) can be used to see the "jumps" between different Besov spaces. In other words, the "pass" from a concrete Besov space to another one can be recovered by inspection of the values of C, because such a constant is comparable to the norm of f in the corresponding space. In view of the above remark, we can devise a strategy to estimate the regularity of a function using its wavelet coefficients. This is the main topic of the following subsection.
4.1. A method to estimate regularities on L ∞ . As we have said, we want to compute wavelet approximations of certain dynamical objects while controlling the precision of these approximations (in fact, the quality of the computed wavelet coefficients). This quality test will be done by comparing the theoretical regularity of such functions with the estimated one (given by Theorem 4.2 and Remark 4.3). More concretely, in [dlLP02], numerical implementations of wavelet analysis to estimate the "positive" regularity of conjugacies between critical circle maps are done. Due to Theorem 4.2, we can generalize such techniques to any value (positive or not) of the regularity measured in terms of the Besov Spaces B s ∞,∞ (R). The steps described below explain how to apply Theorem 4.2 in a general framework.
Among the many methods described in the literature to compute wavelet approximations, in this paper we will use (and test) the Fast Wavelet Transform. Alternatively, in a forthcoming paper we will explore the technique of solving numerically the Invariance Equation given in Remark 6.6, which may be better adapted to the dynamical complexity of the object.
Remark 4.4. In view of Lemma 3.7, to estimate the regularity of a map f ∈ S'(S^1) it is enough to apply Theorem 4.2 to its periodic extension f_PER.
In view of this remark, the verbatim application of Theorem 4.2 is the following:

Step 1. Fix a Daubechies wavelet ψ with k vanishing moments and use Lemma 2.4 to estimate the initial coefficients a_PER_{−J}[n] = ⟨f_PER, φ_{−J,n}⟩ ≈ 2^{−J/2} f_PER(n2^{−J}) for 0 ≤ n ≤ 2^J − 1.

Step 2. Use Equation (7) to calculate the coefficients d_PER_{−j}[n] = ⟨f_PER, ψ_{−j,n}⟩ for j = 0, …, J − 1 and 0 ≤ n ≤ 2^j − 1.

Step 3. For j = 0, …, J − 1, compute s_{−j} := log_2 ( sup_{0 ≤ n ≤ 2^j − 1} |d_PER_{−j}[n]| ).
Step 4. In view of Equation (13), make a linear regression to estimate the slope τ of the graph of the pairs (−j, s −j ) with j = 0, . . . , J − 1. Then, when there is evidence of linear correlation between the variables −j and s −j , we set s = R(τ ).
Step 5. If k > max(s, 5/2 − s) then, by Theorem 4.2, f ∈ B^s_{∞,∞}(R) and, hence, f has regularity s. Otherwise we need to repeat Steps 1-4 with a Daubechies wavelet having a larger value of k, until k > max(s, 5/2 − s).
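As an illustration, the steps above can be run end-to-end on the dyadic Weierstraß function W_{A,2}. The toy version below is ours: it uses the Haar wavelet (one vanishing moment) instead of the Daubechies wavelets with k ≥ 3 moments required by Step 5, so the guarantees of Theorem 4.2 do not strictly apply, and it assumes the relation s = τ − 1/2 between the fitted slope and the regularity. Even so, for W_{A,2} the estimate lands close to the theoretical value −log_2(A).

```python
import numpy as np

def weierstrass(x, A, K=40):
    """Truncated Weierstrass function W_{A,2}(x) = sum_k A^k cos(2 pi 2^k x)."""
    return sum(A ** k * np.cos(2 * np.pi * 2 ** k * x) for k in range(K))

def estimate_regularity(f, J=14, skip=3):
    """Steps 1-4 with the Haar filter: returns the estimated regularity s."""
    x = np.arange(2 ** J) / 2 ** J
    a = f(x) * 2.0 ** (-J / 2.0)        # Step 1: a_{-J}[n] ~ 2^{-J/2} f(n 2^{-J})
    js, sj = [], []
    for j in range(-J + 1, 1):          # Steps 2-3: cascade, then s_j
        d = (a[0::2] - a[1::2]) / np.sqrt(2.0)
        a = (a[0::2] + a[1::2]) / np.sqrt(2.0)
        js.append(j)
        sj.append(np.log2(np.abs(d).max()))
    # Step 4: linear regression on (j, s_j), discarding the extreme levels
    tau = np.polyfit(js[skip:-skip], sj[skip:-skip], 1)[0]
    return tau - 0.5                    # assumed relation s = R(tau) = tau - 1/2
```

For A = 0.6 the theoretical regularity is −log_2(0.6) ≈ 0.737, and the Haar-based estimate is close to this value; the estimate also decreases with A, as it should.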
To test the precision of this implementation of Theorem 4.2 we will use the Weierstraß function. The reason for such an experiment is twofold. On the one hand, there exists an analytic formula for the regularity of the Weierstraß function in terms of its parameters (which allows us to test the quality of the computed coefficients) and, at the same time, the graph of the Weierstraß function is "strange" enough. This idea is borrowed from [dlLP02] but, since we use more data than [dlLP02], we reproduce the example.
Since the regularity of W_{A,B} lies in (0, 1), we have max(s, 5/2 − s) < 5/2 and, therefore, the above algorithm is valid in this case only for Daubechies wavelets with k ≥ 3 vanishing moments.
To perform the above algorithm we take J = 30 (that is, we use a sample of the graph of W_{A,2} of 2^30 points). To carry out Step 1, by Lemma 2.4, we can estimate a_{−30}[n] ≈ 2^{−15} W_{A,2}(n2^{−30}). Then, after executing Steps 1-4 of the above algorithm, we obtain the results depicted in Figure 1.

Figure 1. On the left picture the theoretical and estimated regularity of W_{A,2} with A ∈ [0.56 · · · , 0.86 · · · ] are shown. The theoretical curve is plotted in blue and the numerical one in red. The estimated regularity is computed with a Daubechies wavelet with 10 vanishing moments. On the right picture the error function |−log_2(A) − s_A| is plotted (here s_A denotes the estimated regularity of W_{A,2}). Notice that the error decreases as the regularity gets closer to zero.

We want to remark that the best numerical estimate of the regularity of W_{A,2}(x) with A ∈ [0.56 · · · , 0.86 · · · ], computed with a Daubechies wavelet of 10 vanishing moments, is obtained for A = 0.86 · · · (that is, when the regularity is closest to zero). The fact that we have to work with the Daubechies wavelet of 10 vanishing moments can be explained as follows. Daubechies wavelets with more vanishing moments have bigger support and regularity (see [Mal98]) and, hence, they are less adapted to approximate the Weierstraß function, which has highly concentrated oscillations. It turns out that the value of 10 vanishing moments is the best adapted (in the sense that it minimizes the error) to the Weierstraß function for the range of parameters considered.
We also want to remark that all the computed Pearson correlation coefficients are bigger than 0.999. This agrees with the fact that the Weierstraß function is self-similar: the pairs (j, s_j) obtained from the coefficients d_j[n] (as pointed out in Remark 4.3) must lie approximately on a straight line. This is what Figure 2 shows for a particular case. It turns out that the Daubechies wavelet with 10 vanishing moments also maximizes globally the computed Pearson correlation coefficients in the range of parameters that we consider. Hence, despite the error inherited from the FWT's seed, the Daubechies wavelet with 10 vanishing moments is the best one to approximate and to explain the regularity of the Weierstraß function. The error of such estimates is represented on the right-hand side of Figure 1.
In view of the good results obtained in Example 4.5, in the following sections we will apply the same methodology to a family of Strange Non-Chaotic Attractors.

An upper semi-continuous SNA
In [GOPY84], a quasi-periodically forced skew product on the cylinder was studied. The attractor obtained there (as shown by Keller in [Kel96]) is the graph of an upper semi-continuous function. Precisely, this kind of system will be our testing ground for the algorithms that we are going to develop.
Recall that the vertical Lyapunov exponent at a point (θ_0, x_0) is defined by

κ(θ_0, x_0) = lim_{n→∞} (1/n) Σ_{k=0}^{n−1} log |∂_x F_{σ,ε}(θ_k, x_k)|.

Therefore, by using Birkhoff's Ergodic Theorem, it can be shown that the vertical Lyapunov exponent at x ≡ 0 is

κ(f_σ, g_ε) = ∫_0^1 log |∂_x F_{σ,ε}(θ, 0)| dθ.

When κ(f_σ, g_ε) is positive, x ≡ 0 is a repellor for System (14). Moreover, since f_σ and g_ε are bounded, infinity is also a repellor and the system must have an attractor different from x ≡ 0. These attractors, which are the objects that we want to study, are typically (although not in full generality) geometrically very complicated. As we will see later, we are going to restrict ourselves to the study of a particular subfamily of Model (14), namely

(15)  (θ, x) ↦ (θ + ω (mod 1), 2σ tanh(x)(ε + |cos(2πθ)|)),

with ω = (1+√5)/2, σ > 0 and ε ≥ 0. Apart from the parameter ε, it is the natural restriction to R^+ of the system considered in [GOPY84] (see Figure 3, where a graph of the attractor of this system with σ = 1.5 and ε = 0 is shown). In this case, the vertical exponent κ(f_σ, g_ε) at x ≡ 0 is precisely log(σ). Hence, the interesting case (for us) occurs when σ > 1.
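The value κ(f_σ, g_ε) = log(σ) can be checked numerically by a Birkhoff average of log |∂_x F_{σ,ε}(θ, 0)| = log(2σ(ε + |cos(2πθ)|)) along the rotation; for ε = 0 this uses ∫_0^1 log |cos(2πθ)| dθ = −log 2. A quick sketch (the naming is ours):

```python
import math

def vertical_exponent_at_zero(sigma, eps=0.0, n=200_000, theta0=0.1234):
    """Birkhoff average of log|d/dx F(theta, 0)| = log(2*sigma*(eps + |cos(2 pi theta)|))."""
    omega = (1 + math.sqrt(5)) / 2   # as in the text; only its fractional part matters
    theta, acc = theta0, 0.0
    for _ in range(n):
        acc += math.log(2 * sigma * (eps + abs(math.cos(2 * math.pi * theta))))
        theta = (theta + omega) % 1.0
    return acc / n
```

For σ = 1.5 and ε = 0 the average is ≈ log 1.5 > 0, so x ≡ 0 is a repellor, while for σ < 1 the exponent is negative.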
The attractor of System (14) and its dynamics is described by the following theorem (see also Figure 3 for an illustration of the graph of this attractor):

Theorem 5.1 (G. Keller, [Kel96]). There exists an upper semi-continuous function ϕ : S^1 → R^+ whose graph is invariant under System (14) and satisfies:
(a) the Lebesgue measure on the circle, lifted to the graph of ϕ, is a Sinai-Ruelle-Bowen measure (that is, for every f ∈ C^0(S^1 × R^+, R) and Lebesgue almost every (θ, x), the Birkhoff time averages of f along the orbit of (θ, x) converge to the integral of f with respect to this lifted measure);
(d) if κ(f_σ, g_ε) > 0 and g vanishes at some point, then the set {θ ∈ S^1 : ϕ(θ) > 0} is meager and ϕ is almost everywhere discontinuous;
(e) if κ(f_σ, g_ε) > 0 and g > 0, then ϕ is positive and continuous; if g ∈ C^1 then so is ϕ;
(f) if κ(f_σ, g_ε) > 0, then |x_n − ϕ(θ_n)| → 0 exponentially fast for almost every θ and every x > 0.
Observe that, when κ(f_σ, g_ε) > 0 and g vanishes at some point (i.e. there exists a fibre whose image is degenerate to a point), it follows from statements (c,d) that ϕ is discontinuous almost everywhere. If there exists θ_0 ∈ S^1 such that g_ε(θ_0) = 0, we will say that the system is pinched. In the particular case of System (15), the pinching condition implies that ε = 0 and, since |cos(2πθ)| vanishes exactly at θ ∈ {1/4, 3/4}, it follows that the set

(16)  { (i/4 + kω (mod 1), 0) : k ∈ N, i ∈ {1, 3} }

is a subset of the attractor which is dense (and invariant) in x ≡ 0. On the other hand, if ε > 0 we cannot have a dense set of pinched points.
The proof of the above theorem is based on the iteration of the Transfer Operator of the system, and many of the properties of ϕ can be derived from such iteration. Since we will strongly use this construction, let us briefly explain it. Let P be the space of all functions (not necessarily continuous) from S^1 to R. Looking for a functional version of System (14) in the space P, one can define the Transfer Operator T : P → P as

T(ϕ)(θ) = f_σ(ϕ(R_ω^{−1}(θ))) g_ε(R_ω^{−1}(θ)).

Remark 5.2. From the above definition we obtain

T(ϕ)(θ) = π_x ( F_{σ,ε}(R_ω^{−1}(θ), ϕ(R_ω^{−1}(θ))) ),

where π_x : S^1 × R^+ → R^+ denotes the projection with respect to the second component.
Notice that the graph of a function ϕ : S^1 → R is invariant under System (14) if and only if T(ϕ) = ϕ.
To obtain the map ϕ from Theorem 5.1, Keller takes a sufficiently large constant function ϕ_0 = c (with c > sup_{x∈R} f_σ(x) · max_{θ∈[0,1]} g_ε(θ)) and iterates it under the transfer operator T (see Figure 4). In such a way he gets, since the map f_σ is monotone, a non-increasing sequence of continuous maps given by

ϕ_k := T^k(ϕ_0),

where T^k stands for the k-th iterate of the Transfer Operator. Then, following Keller's proof, one has that ϕ := lim_{k→∞} ϕ_k = inf_{k≥0} ϕ_k, and the point-wise convergence of the above sequence is exponentially fast. Notice that the shape of ϕ depends on the parameters σ and ε. Moreover, when proving the upper semi-continuity of ϕ, it is shown that all the sets {θ ∈ S^1 : ϕ(θ) < η}, η > 0, are open. This means that ϕ is continuous at each point where it vanishes and, in particular, at the points of R_ω(Z_{g_ε}), where Z_{g_ε} ⊂ S^1 is the finite and discrete set where the function g_ε vanishes (whether the system is pinched or not). Moreover, ϕ ∈ L^1(S^1).
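Keller's construction can be reproduced pointwise: to evaluate ϕ_k = T^k(ϕ_0) at a given θ, pull θ back k rotations and push the constant c forward through the fibre maps of System (15). The following sketch (naming and constants are ours) illustrates the monotone convergence of the sequence ϕ_k:

```python
import math

def phi_k(theta, k, sigma=1.5, eps=0.0):
    """Evaluate the k-th transfer-operator iterate of a large constant at theta."""
    omega = (1 + math.sqrt(5)) / 2
    c = 2 * sigma * (1 + eps) + 1.0   # c > sup f_sigma * max g_eps
    th = (theta - k * omega) % 1.0    # start k rotations in the past
    x = c
    for _ in range(k):
        x = 2 * sigma * math.tanh(x) * (eps + abs(math.cos(2 * math.pi * th)))
        th = (th + omega) % 1.0
    return x
```

Since ϕ_1 ≤ ϕ_0 and T is monotone, the values phi_k(θ, k) are non-increasing in k at each fixed θ, and converge (exponentially fast) to ϕ(θ).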
5.1. On the regularity of the attractor. The regularity of the attractor of System (14), in terms of ε and the regularity of g_ε, is given by the following proposition.

Proposition 5.3. Let ϕ : S^1 → R^+ be the upper semi-continuous function whose graph is invariant under System (14). If κ(f_σ, g_ε) > 0 and the system is pinched, then ϕ has regularity 0. If the system is non-pinched and g_ε is Lipschitz, then ϕ is Lipschitz and, hence, it has regularity 1.

Remark 5.4. For ε > 0 we can also have ϕ ∈ B^0_{∞,∞}(S^1). Indeed, let g_ε (in System (14)) be continuous but such that it does not verify any Hölder condition (see Example 3.3(ii)). By Keller's Theorem, the invariant function ϕ is continuous when ε > 0. However, by the choice of g_ε, it cannot verify any Hölder condition. Therefore, ϕ ∈ B^0_{∞,∞}(S^1) (even in the non-pinched case).
6. An algorithm to compute the wavelet coefficients and regularities of the attractors of System (14)

As it has been already said, we want to approximate the invariant curve ϕ of System (15) in terms of wavelets, with quality control. Since g_ε(θ) = ε + |cos(2πθ)| is Lipschitz then, by Proposition 5.3, we know that the regularity of the attractor is 0 when ε = 0 and 1 when ε > 0. The quality control is implemented by comparing this theoretical value with the estimate of the regularity obtained from the wavelet coefficients.
Since the attractor of System (15) is the graph of a map ϕ : S^1 → R^+, we will use the methodology described in the previous section applied to the function ϕ_PER (see Lemma 3.7 and Remark 4.4). However, in the pinched case, ϕ_PER is discontinuous almost everywhere (and the corresponding attractor is called strange). Therefore, we cannot apply the algorithm from the previous section verbatim. In the rest of this section we will describe how to solve this problem in the implementation of the strategy from the previous section.
More concretely, to compute an approximation of the type (12) for ϕ_PER, since we do not have an explicit formula for ϕ, we will use Theorem 5.1(f) and the transfer operator to get a sufficiently good numerical approximation of this function. Indeed, by Theorem 5.1(f), for almost every θ_0 ∈ S^1, any x_0 > 0 and any η > 0 there exists N_0 such that for every n ≥ N_0 we have |x_n − ϕ(θ_n)| < η, where (θ_n, x_n) denotes the n-th iterate of (θ_0, x_0) under System (15). Moreover, the points (θ_n, x_n) with n ∈ {N_0, N_0 + 1, …, N_0 + 2^J − 1} approximate exponentially fast the points (θ_n, ϕ(θ_n)) from graph(ϕ). Therefore,

(18)  { (θ_n, x_n) : n = N_0, N_0 + 1, …, N_0 + 2^J − 1 }

is an approximate mesh of graph(ϕ) provided that J is large enough. To fix the mesh we choose a random point θ_0 and we fix some x_0 > sup_{x∈R^+} 2σ tanh(x) = 2σ. However, this approximate mesh has two problems to be used in our computations:

Problem (1): as we will see, we need a mesh of the graph of ϕ at dyadic points of the form i2^{−J} for i = 0, 1, …, 2^J − 1;

Problem (2): we cannot use Lemma 2.4 to estimate the initial coefficients a_{−J}[n], since our map ϕ is discontinuous almost everywhere (and, hence, not Lipschitz).
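Generating the approximate mesh (18) is a direct iteration after discarding a transient; a sketch (the transient length and seed below are arbitrary choices of ours):

```python
import math

def attractor_mesh(J=12, sigma=1.5, eps=0.0, N0=1000, theta0=0.2345):
    """Return the 2**J points (theta_n, x_n), n = N0, ..., N0 + 2**J - 1, of (18)."""
    omega = (1 + math.sqrt(5)) / 2
    theta, x = theta0, 2 * sigma + 1.0   # x_0 > sup 2 sigma tanh(x) = 2 sigma
    mesh = []
    for n in range(N0 + 2 ** J):
        x = 2 * sigma * math.tanh(x) * (eps + abs(math.cos(2 * math.pi * theta)))
        theta = (theta + omega) % 1.0
        if n >= N0:
            mesh.append((theta, x))
    return mesh
```

After the first iterate every point satisfies 0 < x_n ≤ 2σ, so the whole mesh lies in the invariant cylinder S^1 × (0, 2σ].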
In the following two subsections, we will explain how one can solve the above two problems.
6.1. A solution to Problem (1): a C^1 homeomorphism. As we have said, we need a mesh of the graph of ϕ at dyadic points of the form θ_i = i2^{−J} for i = 0, 1, …, 2^J − 1 but, clearly, if we obtain the points (θ_n, x_n) just as iterates of a single point by F_{σ,ε}, this condition is not satisfied. The natural approach, which would be to compute approximately the points of the graph of ϕ based at the dyadic points by interpolating the obtained values, is not feasible since, by Theorem 5.1, we know that ϕ is upper semi-continuous and discontinuous almost everywhere. Then we propose the following solution, which consists in moving to a conjugate system with the desired properties. To do this, first we relabel the points {(θ_n, x_n)}_{n=N_0}^{N_0+2^J−1} to a sequence {(θ̃_i, z_i)}_{i=0}^{2^J−1} so that θ̃_0 < θ̃_1 < … < θ̃_{2^J−1} (we do this simply by sorting the data (18) with respect to the first coordinate; see Remark 6.3). In particular, if n ∈ {N_0, N_0 + 1, …, N_0 + 2^J − 1} and i = i(n) ∈ {0, 1, …, 2^J − 1} is such that θ̃_i = θ_n, then z_i = x_n. Now, we consider a C^1 homeomorphism h : S^1 → S^1 such that h(i2^{−J}) = θ̃_i for i = 0, …, 2^J − 1 and h(1) = θ̃_0 + 1. Such a map h can be obtained by taking h to be, for instance, a cubic spline on the intervals [i2^{−J}, (i + 1)2^{−J}] for i = 0, …, 2^J − 1.
The list {(i2^{−J}, z_i)}_{i=0}^{2^J−1} is now an approximate mesh of graph(ϕ̃) with ϕ̃ = ϕ ∘ h, based at the dyadic points. Thus, we will use the list of pairs {(i2^{−J}, z_i)}_{i=0}^{2^J−1} to estimate the regularity of ϕ̃.
Remark 6.1. The map ϕ̃ has the following dynamical interpretation. Consider the homeomorphism H : S^1 × R^+ −→ S^1 × R^+ defined by H(θ, x) = (h(θ), x). Then, one can check that graph(ϕ̃) is the attractor of the dynamical system H^{−1} ∘ F_{σ,ε} ∘ H, which is conjugate to System (15).
To obtain the regularity of ϕ we need to relate the regularities of ϕ̃ and ϕ in terms of the Besov spaces B^s_{∞,∞}(S^1). To do this we use the fact that h is a C^1 homeomorphism and that composition with such a homeomorphism is a bijective mapping of B^s_{∞,∞}(R) onto itself (we refer the reader to Section 4.3 of [Tri92] for a more detailed explanation). More precisely, ϕ ∈ B^s_{∞,∞}(S^1) if and only if ϕ̃ = ϕ ∘ h ∈ B^s_{∞,∞}(S^1). Thus, by Proposition 6.2 and Proposition 5.3, the regularities of ϕ and ϕ̃ coincide and we can estimate the regularity of ϕ̃ by using the mesh {(i2^{−J}, z_i)}_{i=0}^{2^J−1}. We remark that the exact formula for h is irrelevant for our algorithm. We only use the facts that such a map h exists and that it can be taken C^1. To obtain the data mesh {(i2^{−J}, z_i)}_{i=0}^{2^J−1} that approximates ϕ̃ (the attractor of the conjugate system) we simply have to sort the obtained mesh {(θ_n, x_n)}_{n=N_0}^{N_0+2^J−1} for ϕ with respect to the first coordinate θ and replace θ̃_i by i2^{−J}. Of course, this does not add any further computational error to the mesh other than the errors coming from the iteration of the system and the truncation errors derived from the choice of J. Furthermore, the exponential contraction of the system to the attractor (see Theorem 5.1(f)) still holds for the conjugate system, thus ensuring that there is no loss of precision when replacing ϕ by ϕ̃ (see Subsection 6.2).
Remark 6.3. The process of sorting an array of 2^30 points from S^1 × R^+ (stored as pairs of double variables in C) turns out to be the bottleneck of the whole algorithm (and the most time-consuming task of the whole program). Moreover, even the process of computing and filling the array with the initial mesh of the function ϕ already takes a "visible" amount of CPU time. Indeed, the iteration, storing and sorting process (with a standard sorting algorithm like Heapsort) takes about 2200 CPU seconds, with a remarkable variability depending on the initial ordering of the data, on a computer with a Xeon processor at 3 GHz and 32 GB of RAM. In order to reduce the time spent in the sorting process we use the following trick, based on the fact that the dynamical system generating the θ_i data is the irrational rotation R_ω. In this case we know that the Lebesgue measure is the unique ergodic measure of R_ω and, hence, its averaged spatial distribution is uniform; this is controlled approximately by Birkhoff's Ergodic Theorem applied to the Lebesgue measure. Indeed, we have
(1/k) #{0 ≤ n < k : θ_n ∈ [i/N, (i + 1)/N)} ≈ 1/N
for k large enough and for every i ∈ {0, 1, . . . , N − 1}. The interpretation of this equation is that the statement
(19) #({θ_n, θ_{n+1}, . . . , θ_{n+2^J−1}} ∩ [i2^{−J}, (i + 1)2^{−J})) = 1
holds with high frequency for J large enough (observe that in this case we have N = 2^J). Moreover, when (19) holds, we have i = ⌊2^J θ_l⌋, where θ_l is the unique element from the set {θ_n, . . . , θ_{n+2^J−1}} ∩ [i2^{−J}, (i + 1)2^{−J}) and ⌊·⌋ denotes the integer part function. This observation gives a good "hash function" and the following efficient algorithm to store and sort the data {(θ_n, x_n)}_{n=N_0}^{N_0+2^J−1}. First, for n = N_0, N_0 + 1, . . . , N_0 + 2^J − 1 we compute the point (θ_n, x_n) = F_{σ,ε}(θ_{n−1}, x_{n−1}). Then, we store it in the position i = ⌊2^J θ_n⌋ of the array data, if this slot is free. Otherwise, we store the point (θ_n, x_n) in a free position j = j(i) of the array data such that |j − i| is minimal.
According to the above observations this will happen with low frequency and the array data will be almost sorted. Moreover, the positions of the array data which are not sorted are close to the place where they should be when the array is sorted. This is exactly the situation in which the straight insertion sorting algorithm gives very good results. This means that we are using a method of order O(2^J + d), where d is the number of insertions (which is very low due to the way we have stored the data), instead of a method of order O(J · 2^J) such as the Heapsort algorithm.
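The storing-and-sorting trick can be sketched as follows. This is a minimal illustration, not the paper's program: the names `point`, `nearest_free`, `place` and `insertion_sort` are ours. Each point is hashed to slot ⌊2^J θ⌋, a collision goes to the nearest free slot, and a final straight insertion sort fixes the few misplaced entries.

```c
#include <assert.h>

typedef struct { double theta, x; } point;

/* Find the free slot j minimizing |j - i| (linear probing outward from i). */
static long nearest_free(const int *used, long M, long i) {
    for (long off = 0; off < M; off++) {
        if (i - off >= 0 && !used[i - off]) return i - off;
        if (i + off <  M && !used[i + off]) return i + off;
    }
    return -1; /* unreachable if fewer than M points are inserted */
}

/* Store (theta, x) at (or near) the hash slot floor(theta * M), M = 2^J. */
static void place(point *a, int *used, long M, double theta, double x) {
    long i = nearest_free(used, M, (long)(theta * (double)M));
    a[i].theta = theta; a[i].x = x; used[i] = 1;
}

/* Straight insertion sort: O(M + d) on the almost-sorted array produced above. */
static void insertion_sort(point *a, long M) {
    for (long n = 1; n < M; n++) {
        point key = a[n];
        long m = n - 1;
        while (m >= 0 && a[m].theta > key.theta) { a[m + 1] = a[m]; m--; }
        a[m + 1] = key;
    }
}
```

Because the rotation distributes the θ_n almost uniformly, most points land directly in their final slot and the insertion sort performs very few moves.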
With this trick, the iteration, storing and sorting process takes about 300 CPU seconds, with almost no variability, which clearly improves the efficiency of the program.
6.2. A solution to Problem (2): the initial coefficients. If ϕ^PER were Lipschitz, Lemma 2.4 would yield the estimate a^PER_{−J}[n] ≈ 2^{−J/2} ϕ^PER(n2^{−J}). But, as we have pointed out, ϕ (and hence ϕ^PER) is discontinuous almost everywhere and the above estimate of a^PER_{−J}[n] is, a priori, not valid. However, as we will see, the elements z_n of our data, based at the dyadic points n2^{−J}, do give a good estimate for a^PER_{−J}[n].
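Once the initial coefficients a_{−J}[n] are fixed from the data, the detail coefficients d_{−j}[n] are obtained by the usual Fast Wavelet Transform cascade, one level at a time. The paper uses Daubechies filters with k vanishing moments; the sketch below uses the Haar filter (k = 1) only to keep the code short, and the name `fwt_level_haar` is ours.

```c
#include <assert.h>
#include <math.h>

/* One level of the periodized Fast Wavelet Transform with the Haar filter
 * pair h = {1/sqrt2, 1/sqrt2}, g = {1/sqrt2, -1/sqrt2}.  On input a[] holds
 * 2m approximation coefficients at the finer level; on output ca[] and cd[]
 * hold the m approximation and detail coefficients at the next coarser level.
 * Daubechies filters with more vanishing moments would replace the two-tap
 * convolutions below. */
static void fwt_level_haar(const double *a, long m, double *ca, double *cd) {
    const double s = 1.0 / sqrt(2.0);
    for (long n = 0; n < m; n++) {
        ca[n] = s * (a[2*n] + a[2*n + 1]);   /* low-pass  (approximation) */
        cd[n] = s * (a[2*n] - a[2*n + 1]);   /* high-pass (detail)        */
    }
}
```

Applying this level map J times to the initial vector a_{−J} produces all the detail coefficients d_{−j}[n], j = 0, . . . , J − 1, in O(2^J) operations.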
As already said in Section 5, ϕ is the pointwise limit of a nonincreasing sequence of continuous (and, hence, uniformly continuous) functions ϕ_k, obtained by iterating the transfer operator starting from the constant function c, for every θ ∈ S^1 and c > sup_{x ∈ R^+} 2σ tanh(x) = 2σ. Consequently, ϕ̃(θ) = lim_{k→∞} ϕ_k(h(θ)) for every θ.
6.3. The algorithm for System (14). In view of the previous sections and Section 4.1, we present the algorithm to estimate regularities, in terms of the Besov spaces B^s_{∞,∞}, for System (14). Algorithm 6.5. Let F_{σ,ε} be a skew product under the assumptions of System (14). Fix J > 0 and a transient N_0 > 0 large enough. To estimate the regularity of the invariant function ϕ of System (14) perform the following steps: Step 1. Generation of a mesh of points exponentially close to ϕ. By Theorem 5.1(f), fix σ > 1, choose a random θ_0 ∈ [0, 1) and x_0 > 1 and, by using the recurrence (θ_n, x_n) = F_{σ,ε}(θ_{n−1}, x_{n−1}), generate the data {(θ_n, x_n) : n = N_0, N_0 + 1, . . . , N_0 + 2^J − 1}. Step 2. Mesh at dyadic points. Sort the data with respect to the first coordinate as in Subsection 6.1 (using the trick of Remark 6.3) to obtain the mesh {(i2^{−J}, z_i)}_{i=0}^{2^J−1} of graph(ϕ̃). Step 3. Initial coefficients. Following Subsection 6.2, set a^PER_{−J}[n] = 2^{−J/2} z_n for n = 0, . . . , 2^J − 1. Step 4. Fast Wavelet Transform. Choose a Daubechies wavelet with k vanishing moments and apply the Fast Wavelet Transform to obtain the detail coefficients d^PER_{−j}[n] for j = 0, . . . , J − 1.
Step 5. Application of Theorem 4.2: data to compute the regularity. By using the coefficients d^PER_{−j}[n] from Step 4, calculate s_{−j} = log_2 ( sup_n |d^PER_{−j}[n]| ) for j = 0, . . . , J − 1.
Step 6. Application of Remark 4.3: compute the regularity. Perform a linear regression to estimate the slope τ of the graph of the pairs (−j, s_{−j}) with j = 0, . . . , J − 1.
Then, when there is evidence of linear correlation between the variables −j and s_{−j}, we set s = R(τ).
Step 7. Final test of the assumptions of Theorem 4.2 on the number of vanishing moments. If k > max(s, 5/2 − s), then f ∈ B^s_{∞,∞}(R) and, hence, f has regularity s. Otherwise, repeat Steps 4-7 with a Daubechies wavelet having a larger value of k, until k > max(s, 5/2 − s).
Remark 6.6. We want to underline that the above algorithm can be performed for other systems. The reason is that the "regularity steps", Steps 4-7, are valid for any f ∈ L^∞(S^1) due to Theorem 4.2. Also notice that Steps 1-3 can be skipped if the wavelet coefficients are obtained by other methods, such as solving the Invariance Equation. Indeed, recall that the graph of a function ϕ : S^1 −→ R is invariant for System (14) if and only if ϕ is a fixed point of T. That is,
ϕ(R_ω(θ)) = F_{σ,ε}(θ, ϕ(θ)) for every θ ∈ S^1.
The above equation is called the Invariance Equation.
To give an approximation of ϕ on a certain mesh of points θ_i ∈ S^1, one can solve a non-linear system of equations by imposing ϕ(θ_i) = a_0 + Σ_{j=0}^{N} Σ_{n=0}^{N_j} d_{j,n} ψ_{j,n}(θ_i) in the Invariance Equation. The unknowns, as in the Fast Wavelet Transform, are the wavelet coefficients.
As a result, we can get an estimate of the regularity of the (strange) attractor of System (15) for the chosen values of σ and ε. This will be the main topic of the following section.

Results and conclusions
We have performed an exercise which is divided into two steps. Our testing ground is System (15). The range of values for σ is [1, 2], whereas ε is given by the function
ε(σ) = (σ − 1.5)^2 if 1.5 < σ ≤ 2, and ε(σ) = 0 if 1 ≤ σ ≤ 1.5.
That is, we will use the system
θ̄ = θ + ω (mod 1), x̄ = 2σ tanh(x) (ε(σ) + |cos(2πθ)|).
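The parametrization of ε along the σ-sweep is straightforward to code; the name `eps_of_sigma` is ours.

```c
#include <assert.h>
#include <math.h>

/* eps(sigma) for the experiment of Section 7: the system is pinched
 * (eps = 0) for sigma in [1, 1.5] and unpinched, with eps = (sigma - 1.5)^2,
 * for sigma in (1.5, 2]. */
static double eps_of_sigma(double sigma) {
    return (sigma > 1.5) ? (sigma - 1.5) * (sigma - 1.5) : 0.0;
}
```

Note that ε(σ) is continuous (and C^1) at σ = 1.5, so the jump observed in the estimated regularity is a property of the attractor, not of the parametrization.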
Notice that with this parametrization the above system is pinched if and only if σ ∈ [1, 1.5]. Roughly speaking, the parameters (σ, ε(σ)) control the vertical Lyapunov exponent of x ≡ 0 and the pinching condition at the same time. Hence, in view of Proposition 5.3 we can test the quality of the wavelet coefficients given by Algorithm 6.5. Indeed, we know that for σ ∈ (1.5, 2.0] the estimated value of s must be close to 1, whereas for σ ∈ [1.0, 1.5] the regularity parameter s must be zero. Moreover, when σ crosses 1.5 in a decreasing way, the regularity parameter s has to jump from 1 to 0. Hence, the results obtained by Algorithm 6.5 applied to our problem, and the fact that we can observe the jump of s at σ = 1.5, certify the quality of the algorithm that we have developed and give us a degree of confidence in the accuracy of the wavelet coefficients.
To perform this test, we first applied Algorithm 6.5 with N_0 = 10^5 and J = 30. In Figure 5 we plot the estimated regularities of System (15) as a function of σ with the above parametrization. Notice that the estimated regularity detects the jump at σ = 1.5 correctly and agrees with the values described above, despite the fact that the concrete values of the regularity are not correct for σ close to 1.5. The choice of the Daubechies wavelet must be made carefully. Indeed, recall that the Daubechies wavelet must have more than max(s, 5/2 − s) vanishing moments in order to satisfy the assumptions of Theorem 4.2. However, increasing the number of vanishing moments, k, causes an increase in the support size of the wavelet (see [Mal98, Theorem 7.3]). On the other hand, the support of the map ϕ is S^1 = R/Z = [0, 1). Observe that the ratio between the size of the support of ϕ and that of ψ_{−j,n} is 1 to (2k − 1)2^{−j}. Therefore, there exists j_0 such that for j > j_0 the support of ψ_{−j,n} is contained in [0, 1). In view of that, the first coefficients of the wavelet approximation will be affected by an error induced by the size of the support of ψ. To avoid such "bad performance", a good strategy is to choose different wavelets for different parameters, as can be guessed from Figure 6. Having said that, Figure 5 (and its forthcoming discussion) is done with the Daubechies wavelet with 16 vanishing moments. This choice is based on two reasons. The first one is that all the Daubechies wavelets used generate a picture similar to Figure 5. The second is that this wavelet explains "better" the whole range of ε (even the case of ε close to zero, as can be seen in Figure 6). Figure 6. The dot at level n above an ε-value means that the Pearson correlation coefficient obtained by applying Algorithm 6.5 with the Daubechies wavelet of 2n vanishing moments is the biggest one (always greater than 0.99, and biggest in comparison with the Daubechies wavelets with k ≠ 2n vanishing moments).
Remark 7.1. Figure 6 has another interpretation. Indeed, the Daubechies wavelets with 14 and 16 vanishing moments explain (in terms of Equation (13)) almost the whole range of values of ε. However, there are regions of ε where other wavelets perform better than these two. This means that a good strategy to perform Algorithm 6.5 more accurately is to have a dictionary of wavelets. Notice that such a dictionary cannot exist in the Fourier setting.
The second step of the exercise, in view of the results displayed in Figure 5, is the explanation of the three regions that appear. Indeed, in Figure 5 one can clearly distinguish three regions with different qualitative behaviour. One of them corresponds to the pinched case (i.e. σ ∈ [1, 1.5]) and the other two to the non-pinched one: σ ∈ (1.5, σ̂) and σ ∈ [σ̂, 2] with σ̂ ≈ 1.527. In what follows we discuss these three regions in detail.
Non-pinched case: σ ∈ [σ̂, 2]. In this region we have ε = (σ − 1.5)^2 ≥ 7.29 × 10^{−4}. That is, we are "far" from the pinched case. As we already know (see Proposition 5.3), the function ϕ whose graph is the attractor is continuous but not differentiable. Moreover, since we are far from the pinched case, ϕ is rather well behaved: it lacks differentiability only at a few points (see the left picture of Figure 7). This is confirmed by the estimated regularities which, not surprisingly (see Proposition 5.3), lie in the interval (0, 1) and "far" from zero: R(ŝ) ∈ [0.6822, 0.9669] (see the right part of Figure 5).
Observe that (see the right picture of Figure 7), in agreement with the computed Pearson correlation coefficient, the model given by Equation (13) is approximately linear (as we expect). Also, a few of the first coefficients are not well fitted (by the linear model) because of the aforesaid problem with the support of ψ.
Figure 7. The left-hand side shows the attractor of System (15) for σ = 1.699219 (and ε = 0.039688); the right-hand side shows its pairs (j, s_j) with −29 ≤ j ≤ 0. In this case R(ŝ) = 0.91431.
The pinched case: σ ∈ [1, 1.5]. In this case ε = 0. Therefore, according to Theorem 5.1 and Proposition 5.3, the attractor is pinched and, hence, discontinuous almost everywhere. That is, the function ϕ whose graph is the attractor is an upper semi-continuous function (see the left picture of Figure 8). Thus, in agreement with Proposition 5.3, the estimated regularity is equal to zero for the whole range of parameters, as can be seen in the left part of Figure 5.
Figure 8. The left-hand side shows the attractor of System (15) for σ = 1.425781 (and ε = 0.0); the right-hand side shows its pairs (j, s_j) with −29 ≤ j ≤ 0. In this case R(ŝ) = 0.0149.
Moreover, even though we are outside the hypotheses of Lemma 2.4, the coefficients obtained are almost linear. Indeed, by Theorem 4.2, the model given by Equation (13) has more freedom because there is a gap in [−1/2, 1/2] (see Theorem 4.2). However, as the right picture of Figure 8 shows, the model is linear (except for the few first values, as before). This means that the proposed solutions to Problems (1) and (2) do not add error to our computations (as in the case of the left picture of Figure 8).
Approaching the pinched case: σ ∈ (1.5, σ̂). In this region we have ε = (σ − 1.5)^2 < 7.29 × 10^{−4}. That is, we are "close" to the pinched case. Therefore, since g_ε(θ) = ε + |cos(2πθ)| is Lipschitz, by Proposition 5.3 the regularity must jump from 0 to 1, and such a jump must be perceived in the estimated regularities. Having said that, the estimated regularities in this region are not so good (see the caption of Figure 9). However, looking at Figures 5 and 9, the fast transition from 0 to 1 is still observed.
Figure 9. The left-hand side shows the attractor of System (15) for two instances of σ ∈ (1.5, σ̂), among them σ = 1.507812 (ε = 0.000061); the right-hand side shows the corresponding pairs (j, s_j) with −29 ≤ j ≤ 0. The estimated regularities of the attractors are R(ŝ) = 0.6266 and R(ŝ) = 0.4951, respectively.
Observe that, as in the other cases, the first few values of the suprema s_j are the worst fitted (see the right pictures in Figure 9). But, in contrast with the previous regions, the rest of the values of s_j are not so "well behaved". Nevertheless, they have a large Pearson correlation coefficient. This "bad behaviour" is probably due to a large value of the constant C > 0 of Theorem 4.2. Indeed, for this range of values we are close to the change of space from positive regularity to zero regularity. That is, by Remark 4.3, the constant C > 0 tends to infinity as ϕ approaches the pinched case. This is precisely what is shown in Figure 10.
Finally, recall that the Fast Wavelet Transform is not the only way to obtain the wavelet coefficients. In a forthcoming paper we will explore the technique of numerically solving the Invariance Equation given in Remark 6.6, which may be better adapted to the dynamical complexity of the object and better suited to recover the large set of zeros of the SNA. Moreover, it is also interesting to explore other models and to understand other "routes to complexity" for SNAs. In particular, the study of the arc length of the curve (see [JT08]) or the Hausdorff dimension (see [GJ13]) by means of wavelets can help to understand some of these routes to strangeness.