Difference equations everywhere: some motivating examples

This work collects several situations where discrete dynamical systems or difference equations appear. Most of them differ from the examples used in textbooks and from the usual mathematical models of Biology or Economics. The examples are presented in detail, together with appropriate references. Although most of them are known, gathering them in one place aims to provide a source of motivation for studying DDS and difference equations and to facilitate the teaching of these subjects.


Introduction
The main goal of the theory of discrete dynamical systems (DDS) is to study the limit behavior of the sequence {x_n}_n, defined iteratively as x_{n+1} = F(x_n), in terms of the initial condition x_0, where F is an invertible map defined on a given space. When F is not invertible it is sometimes said that it defines a semi-DDS. In particular, many difference equations and recurrences can be interpreted as semi-DDS or DDS.
They also appear frequently in problems of other branches of Mathematics. Without aiming to be exhaustive, some examples are: the iterative methods proposed to approximate the solutions of linear or non-linear systems, the Bernoulli iterative method to find the dominant root of a polynomial, numerical schemes like the Euler, Taylor or Runge-Kutta methods designed to approximate the solutions of ordinary differential equations, the difference schemes used to approximate the solutions of partial differential equations, the study of discrete Markov chains, complex dynamics, etc.

Harmonic series and the Fibonacci numbers
It is well-known that the so called harmonic numbers h_n = 1 + 1/2 + 1/3 + · · · + 1/n tend to infinity and so the harmonic series diverges. There are many proofs of this fact; in the papers [43,44,45] the authors collect more than forty of them.
The oldest one (around 1350) is attributed to the French philosopher Nicole Oresme (1323-1382) and consists of grouping the terms as

1 + 1/2 + (1/3 + 1/4) + (1/5 + · · · + 1/8) + · · · ≥ 1 + 1/2 + 1/2 + 1/2 + · · ·

In a more formal way, h_{2^k} ≥ 1 + k/2, and so lim_{k→∞} h_{2^k} = ∞. Here we include a more sophisticated proof ([43]) based on the knowledge of the Fibonacci numbers F_n. These famous numbers were introduced in 1202 by the Italian mathematician Leonardo of Pisa, known as Fibonacci, to model a rabbit population. They also appear in many biological settings and are related with plant patterns, see for instance [68] and its references. These numbers are the solutions of the linear difference equation

F_{n+2} = F_{n+1} + F_n, F_0 = 0, F_1 = 1.

The first ones are 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, . . . and it is well-known that F_n = (φ^n − (1 − φ)^n)/√5, where φ = (1 + √5)/2 is the golden mean, which satisfies φ² − φ − 1 = 0. We refer to [26,30] for the methods used to obtain the explicit solutions of the difference equations solved in this paper. Hence, grouping the terms of the harmonic series between consecutive Fibonacci numbers,

Σ_{n≥1} 1/n ≥ Σ_{n≥2} (F_{n+1} − F_n)/F_{n+1} = Σ_{n≥2} F_{n−1}/F_{n+1},

and the harmonic series is divergent because the general term of the latter series does not tend to zero: F_{n−1}/F_{n+1} → 1/φ² ≈ 0.38.
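Both facts used above can be checked directly; a minimal sketch (the cut-off values in the loops are arbitrary choices for this illustration):

```python
# Quick check of the two facts used above: h_{2^k} >= 1 + k/2 and
# F_{n-1}/F_{n+1} -> 1/phi^2 != 0.

def harmonic(n):
    return sum(1 / k for k in range(1, n + 1))

def fib(n):
    a, b = 0, 1  # F_0, F_1
    for _ in range(n):
        a, b = b, a + b
    return a

# Oresme's bound h_{2^k} >= 1 + k/2.
for k in range(1, 15):
    assert harmonic(2**k) >= 1 + k / 2

# The general term of the Fibonacci lower-bound series tends to 1/phi^2.
phi = (1 + 5 ** 0.5) / 2
assert abs(fib(29) / fib(31) - 1 / phi**2) < 1e-10
```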

A couple of puzzles
Édouard Lucas presented in 1883 one of the most famous and amusing puzzles, known as the tower of Hanoi puzzle, see for instance [53]. It consists of three rods and n disks of different sizes, which can slide onto any rod. The puzzle starts with the disks as in Figure 1 (there n = 8) and the objective is to translate this tower to the right rod, moving only one disk at a time and following these two rules: • Each move consists of taking the upper disk from one of the stacks and placing it on top of a different stack. • Any disk must be placed either on top of bigger disks or on the base of a rod.

Call T_n the (minimum) number of moves to solve the puzzle. It is clear that before solving it we must pass through the position where the n − 1 smaller disks are stacked on the middle rod (this needs T_{n−1} moves). Then we only need one move to carry the biggest disk from the left to the right, plus again T_{n−1} more moves to transfer the middle stack (that has n − 1 disks) to the right rod. In short, T_n = 2T_{n−1} + 1, T_1 = 1. Hence T_n = 2^n − 1 increases exponentially. In [26, Ch. 9] it can be seen how the similar difference equation T_n = 2T_{n−1} + g_n, where {g_n}_n is a given sequence, like for instance g_n = c log(n), can be used to study the computational complexity of implementing the described algorithm on a computer.

Another similar puzzle is the so called Chinese rings (also known as Baguenaudier, which means "time-waster" in French). It seems that it goes back to the Chinese Sung dynasty (960-1279), see Figure 1. It can be seen that if S_n denotes the minimum number of moves to take all the rings off (we give here neither its description nor the study of S_n; see for instance [5, Ch. 10] or [54] for the details), it holds that S_n = S_{n−1} + 2S_{n−2} + 1, S_0 = 0, S_1 = 1. Its solution, again with exponential growth, is S_n = (2^{n+1} − 2)/3 for n even and S_n = (2^{n+1} − 1)/3 for n odd.
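Both recurrences can be iterated directly; a minimal sketch (the initial values T_0 = 0 and S_0 = 0, S_1 = 1 are the natural ones), checked against the closed forms:

```python
# Iterating the two puzzle recurrences and comparing with their closed forms.

def hanoi_moves(n):
    t = 0  # T_0 = 0: no disks, no moves
    for _ in range(n):
        t = 2 * t + 1  # T_n = 2*T_{n-1} + 1
    return t

def rings_moves(n):
    if n == 0:
        return 0
    s_prev, s = 0, 1  # S_0 = 0, S_1 = 1
    for _ in range(n - 1):
        s_prev, s = s, s + 2 * s_prev + 1  # S_n = S_{n-1} + 2*S_{n-2} + 1
    return s

for n in range(1, 20):
    assert hanoi_moves(n) == 2**n - 1
    # S_n = (2^(n+1) - 2)/3 for n even, (2^(n+1) - 1)/3 for n odd
    assert rings_moves(n) == (2**(n + 1) - (2 if n % 2 == 0 else 1)) // 3
```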

Tossing n coins
Take n coins, not necessarily fair, and assume that for each of them the probability of getting a head is p and that of getting a tail is q = 1 − p.
We toss all of them together and want to know the probability P n of getting an even number of heads, where 0 counts as an even number.
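Before any derivation, P_n can be computed by brute-force enumeration of the 2^n outcomes; a minimal sketch (the closed form P_n = (1 + (q − p)^n)/2 obtained below is used as a cross-check):

```python
from itertools import product

# Enumerate all 2^n outcomes (1 = head, 0 = tail) and add the probabilities
# of those with an even number of heads; compare with the closed form.

def p_even_enum(n, p):
    q = 1 - p
    return sum(p**sum(o) * q**(n - sum(o))
               for o in product([0, 1], repeat=n) if sum(o) % 2 == 0)

def p_even_formula(n, p):
    return (1 + (1 - 2 * p)**n) / 2  # note q - p = 1 - 2p

for n in range(1, 11):
    assert abs(p_even_enum(n, 0.3) - p_even_formula(n, 0.3)) < 1e-12
```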
It is well-known that the number of heads can be modeled by a random variable with Binomial distribution X_n = B(n, p). Moreover this X_n can be seen as the sum of n independent random variables Y_j ∼ Y, each one with Bernoulli distribution with parameter p, that is defined as P(Y = 1) = p, P(Y = 0) = q. Thus X_n = Σ_{k=1}^n Y_k. Hence the probability of getting exactly k heads is P(X_n = k) = C(n, k) p^k q^{n−k} and in consequence P_n = Σ_{k even} C(n, k) p^k q^{n−k}. An easy alternative way of getting a compact expression for P_n comes from a difference equation. Notice that

P_{n+1} = P(X_{n+1} is even) = P(X_{n+1} is even and Y_{n+1} = 0) + P(X_{n+1} is even and Y_{n+1} = 1) = P(X_n is even) P(Y_{n+1} = 0) + P(X_n is odd) P(Y_{n+1} = 1) = P_n q + (1 − P_n) p,

where we have used twice that P(A ∩ B) = P(A / B)P(B). Hence P_{n+1} = (q − p)P_n + p, P_1 = q. Solving it we obtain the nice compact expression P_n = (1 + (q − p)^n)/2.

The Titius-Bode law

The Titius-Bode law, popularized in the XVIII century by Johann D. Titius and Johann E. Bode, asserts that the distances of the planets to the Sun approximately follow the sequence {d_n}_n with d_0 = 0.4, d_1 = 0.7 and d_{n+1} = 2d_n − 0.4 for n ≥ 1, that is 0.4, 0.7, 1, 1.6, 2.8, 5.2, 10, 19.6, . . . The actual list of distances, in astronomical units (AU), remarkably coincides with the first terms of the sequence {d_n}_n defined above, except for one gap, the one corresponding to 2.8. When in 1781 William Herschel discovered Uranus, and the corresponding distance was found to be 19.22 AU (also near the mathematical prediction), people started to believe that this law holds in the whole Solar system. For this reason, in 1800 many astronomers began the search for the "lost" planet. In 1801, G. Piazzi found a minor planet, named Ceres (today classified as a dwarf planet), which precisely was at a distance 2.76 AU, confirming once more the Titius-Bode law. In fact, already in 1807 four more minor planets were found at a similar distance. Nowadays, thousands of bodies are localized in that region forming the so called asteroid belt. When Neptune was discovered in 1846, thanks to the mathematical predictions of Urbain J. J. Le Verrier, the Titius-Bode law was broken because its distance to the Sun was 30.11 AU. The former planet Pluto, today a dwarf planet, is at distance 39.54 AU, again in good agreement with the mathematical law.
Although the Titius-Bode law was only empirical, without physical or mathematical explanations, some people have tried to find reasons for its good predictions of the actual positions of the planets in the Solar system. For instance, in the nice paper [59] the authors propose a four body problem (center of masses of the Galaxy, Sun and two consecutive planets, the farthest planet with mass much smaller than the closest one). In that situation they prove that if the closest planet follows a circular orbit of radius R, one of the "more stable" solutions of this four body problem (a possible trajectory for the other planet) is at distance 3^{2/3} R ≈ 2.08 R, quite similar to the ratio 2 given by the law for n big. That paper also proposes an explanation of the failure of the mathematical law for Neptune. Normalizing the masses so that the mass of the Earth is 1, the masses of the planets Jupiter, Saturn, Uranus, Neptune and Pluto

Factorial and subfactorial functions
A very classical problem of recreational mathematics (see for instance [5, App. G]), which goes back to the French mathematician Pierre de Montmort (1678-1719), is the following: consider n persons, each one wearing a different hat. Which is the probability that if the hats and the persons are coupled randomly, no person wears their own hat?
To solve it, people introduced the so called subfactorial function !n. In fact, both the factorial and the subfactorial functions can be defined as solutions of similar difference equations:

n!: y_{n+1} = (n + 1) y_n, y_0 = 1; !n: y_{n+1} = (n + 1) y_n + (−1)^{n+1}, y_0 = 1.

By using induction it is not difficult to prove that !n = n! Σ_{k=0}^{n} (−1)^k/k!. Let us relate this function with our problem. Given n persons and n hats it is clear that the total number of different ways of assigning the hats is n!. On the other hand, let us prove that the number z_n of assignments in which no person wears their own hat is !n.
First we will prove that z_n satisfies the following non-autonomous second order difference equation:

z_{n+1} = n (z_n + z_{n−1}).    (1)

To prove it, fix one of the n + 1 persons, say person 1, and assume that this person wears a hat that is not their own, say hat k, k ≠ 1. This hat can be chosen in n different ways. Then consider person k. There are two possibilities: • Person k wears hat 1. Then there are z_{n−1} possibilities for the remaining n − 1 people to wear the n − 1 remaining hats without coincidences. • Person k wears any hat but hat 1. Then, there are n remaining people and n remaining hats, and each person has exactly one forbidden hat. There are z_n ways of wearing them. Then equation (1) holds. Finally, let us prove by induction that y_n = z_n. It is clear that y_0 = z_0 = 1. Assume that y_j = z_j for j = 0, 1, . . . , n − 1 and let us prove it for j = n. Notice first that y_{n−2} = (y_{n−1} + (−1)^n)/(n − 1). Thus

z_n = (n − 1)(z_{n−1} + z_{n−2}) = (n − 1)(y_{n−1} + y_{n−2}) = (n − 1) y_{n−1} + y_{n−1} + (−1)^n = n y_{n−1} + (−1)^n = y_n.
Hence the probability of having no coincidences is !n/n! = Σ_{k=0}^{n} (−1)^k/k!, which clearly tends to 1/e when n goes to infinity. Similarly, the probability that at least one person wears their own hat tends to 1 − 1/e ≈ 0.63. In fact, the value 0.63 is already attained for n = 6 and does not essentially vary when increasing n.
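The recurrence gives the subfactorial and the limit probability directly; a minimal sketch:

```python
import math

# The subfactorial via y_n = n*y_{n-1} + (-1)^n, y_0 = 1, and the limit
# probability 1/e of no coincidences.

def subfactorial(n):
    y = 1  # !0 = 1
    for k in range(1, n + 1):
        y = k * y + (-1)**k
    return y

assert [subfactorial(n) for n in range(6)] == [1, 0, 1, 2, 9, 44]

# Already for n = 6 the probability of no coincidence is close to 1/e.
for n in range(6, 15):
    assert abs(subfactorial(n) / math.factorial(n) - 1 / math.e) < 1e-3
```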

Computation of definite integrals
When studying the propagation of errors, several textbooks illustrate a surprising fact that happens when computing recursively some definite integrals. Using the same algorithm, either forwards or backwards, one direction is well-conditioned while the other one is not, see for instance [22, Ch. 1]. Consider

I_n = ∫_0^1 x^n e^{x−1} dx.

By using integration by parts we get I_n = 1 − n I_{n−1}, I_0 = 1 − 1/e. The above difference equation is useful for obtaining exact expressions of I_n. To apply it numerically, take as initial condition an approximation of I_0, say 0.6321, that is I_0 = 0.6321 ± ε, with 0 < ε < 3 × 10^{−5}. Then, from the algorithm we get I_n with the quite big absolute error n! ε. On the other hand, take it backwards, I_{n−1} = (1 − I_n)/n, with initial condition I_100 = 0. This makes sense because I_n tends to zero when n goes to infinity, and I_100 = δ, with 0 < δ < 1/101, because 0 < I_n < ∫_0^1 x^n dx = 1/(n + 1). Then all I_k, for k from 90 to 0, can be obtained with the reasonably small absolute error δ/(100 × 99 × 98 × · · · × (k + 1)).
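The contrast between the two directions is easy to see numerically; a sketch, assuming I_n = ∫_0^1 x^n e^{x−1} dx, so that I_n = 1 − n I_{n−1} and I_0 = 1 − 1/e:

```python
import math

# Forward iteration amplifies the initial error by n!; the backward
# (inverted) recurrence divides it by a product of integers instead.

def forward(n, i0=0.6321):
    i = i0  # approximation of I_0 with error < 3e-5
    for k in range(1, n + 1):
        i = 1 - k * i  # I_k = 1 - k*I_{k-1}
    return i

def backward(n, start=100):
    i = 0.0  # take I_100 ~ 0; the initial error is below 1/101
    for k in range(start, n, -1):
        i = (1 - i) / k  # inverted recurrence: I_{k-1} = (1 - I_k)/k
    return i

assert abs(forward(20)) > 1.0                       # hopeless: |I_20| < 1/21
assert abs(backward(0) - (1 - 1 / math.e)) < 1e-12  # recovers I_0 accurately
```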

The 3x + 1 problem

Consider the simple difference equation, defined on the positive integers,

x_{n+1} = (3x_n + 1)/2, when x_n is odd, x_{n+1} = x_n/2, when x_n is even.

The sequence {x_n}_n is called the orbit of x_0. For instance, the orbit corresponding to x_0 = 11 is 11, 17, 26, 13, 20, 10, 5, 8, 4, 2, 1, 2, 1, . . . Until now, it has been checked that for any x_0 smaller than 87 × 2^60, after finitely many steps the orbit arrives at the 2-periodic behavior 2, 1, 2, 1, . . . The so called 3x + 1 problem asks whether the above situation happens for every positive integer initial condition. See [50,51] for more information about it. According to J. Lagarias, "this is an extraordinarily difficult problem, completely out of reach of present day mathematics." This problem is one of the simplest open problems to state in mathematics and it is frequently used in talks addressed to young people to motivate them towards science. It seems that it was studied for the first time by the German mathematician Lothar Collatz, around 1930.
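Orbits are immediate to compute; a minimal sketch using the (3x + 1)/2 form of the map, which matches the orbit of 11 given above:

```python
# Orbit of the map x -> (3x+1)/2 if x is odd, x/2 if x is even,
# stopped when it first reaches 1 (or after a safety cap of steps).

def orbit(x0, max_steps=10000):
    xs = [x0]
    while xs[-1] != 1 and len(xs) <= max_steps:
        x = xs[-1]
        xs.append((3 * x + 1) // 2 if x % 2 else x // 2)
    return xs

# The orbit of 11 from the text, arriving at the 2-periodic regime 2, 1, ...
assert orbit(11) == [11, 17, 26, 13, 20, 10, 5, 8, 4, 2, 1]
```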
The map can also be extended to the real or to the complex numbers, giving rise to a complicated dynamical system, see [16,55].

Proofs without words
Famous sequences of numbers like the Fibonacci numbers F_n, or the triangular numbers T_n = 1 + 2 + · · · + n, satisfy nice identities coming from their difference equations. For instance, F_1² + F_2² + · · · + F_n² = F_n F_{n+1} and T_{n−1} + T_n = n². Although it is not difficult to prove them analytically, we show in the next figure their proofs without words, borrowed from the nice books [

Newton and Chebyshev methods
To find a simple solution s of a non-linear equation f(x) = 0, with f smooth, one can use iterative methods x_{n+1} = g(x_n), where

g_N(x) = x − f(x)/f′(x) or g_C(x) = x − (1 + f(x)f″(x)/(2f′(x)²)) f(x)/f′(x),

and x_0 is an approximation of s. They correspond to the Newton and Chebyshev methods, respectively. Recall that an iterative method is said to have order p towards s if lim_{n→∞} x_n = s and lim_{n→∞} |x_{n+1} − s|/|x_n − s|^p = C ≠ 0. It is well-known that the Newton method is quadratic (p = 2) and the Chebyshev method is cubic (p = 3) towards simple solutions. In fact, for some specific functions f, either of them can be even of higher order. Sometimes one reads that the Chebyshev method is faster than the Newton method. This assertion could lead to some misunderstandings. Let us clarify why. Consider the sequence {x_n}_n given by the Newton method and define a new sequence z_n = x_{2n}, n ∈ N. It is clear that z_{n+1} = g_N(g_N(z_n)), z_0 = x_0; we will call this "new" method bi-Newton. Thus the Newton method has p = 2 and the bi-Newton method has p = 4, but clearly the one with higher order is not better than the other one. Both are equal.
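Both methods are easy to compare numerically; a sketch (the test function f(x) = x³ − 2 and the starting point 1.5 are our own choices, not from the text):

```python
# Newton g_N(x) = x - f/f' and Chebyshev g_C(x) = x - (1 + f*f''/(2*f'^2))*f/f',
# applied to the arbitrary test function f(x) = x^3 - 2.

def f(x):   return x**3 - 2
def fp(x):  return 3 * x**2
def fpp(x): return 6 * x

def newton(x):
    return x - f(x) / fp(x)

def chebyshev(x):
    return x - (1 + f(x) * fpp(x) / (2 * fp(x)**2)) * f(x) / fp(x)

s = 2 ** (1 / 3)  # the simple root of f
x = y = 1.5
for _ in range(4):
    x, y = newton(x), chebyshev(y)
# After the same number of steps the cubic method is at least as accurate.
assert abs(y - s) <= abs(x - s) < 1e-10
```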
The above situation leads to the definition of efficiency of a method, associated to a solution s of a given equation f(x) = 0. This efficiency E should take into account not only the order of convergence p of the method towards s but also the time t needed for computing x_{n+1} from x_n. Hence E = E(p, t). The reasoning of the above paragraph implies that this function must satisfy E(p², 2t) = E(p, t), since applying g ∘ g one uses twice the time of applying g. In fact, in general,

E(p^k, kt) = E(p, t), k ∈ N.    (2)

We will prove that the "simplest" smooth solution of the above functional equation is E(p, t) = p^{1/t}, which precisely is the usual definition of the efficiency of a method towards s, see also [73].
To prove this assertion, assume first that (2) holds for all k ∈ R⁺. Taking derivatives with respect to k it holds that

p^k ln(p) E_p(p^k, kt) + t E_t(p^k, kt) = 0.

Replacing k = 1 in the above equation we get that E must be a solution of the linear partial differential equation

p ln(p) E_p(p, t) + t E_t(p, t) = 0.    (3)

To find all its solutions we will use the method of characteristics. That is, first we have to find two functionally independent first integrals of the system of ordinary differential equations

dp/(p ln(p)) = dt/t = dE/0.

From the first two equations we get the new ordinary differential equation dp/dt = p ln(p)/t, which has the first integral H_1(p, t, E) = ln(p)/t. From the third one we get the first integral H_2(p, t, E) = E. Hence the general implicit solution of (3) is Φ(ln(p)/t, E) = 0 and its general explicit solution is E = φ(ln(p)/t), for arbitrary smooth functions Φ and φ. Hence the "simplest" solution comes from taking φ = exp, and then E = exp(ln(p)/t) = p^{1/t}. Another natural and equivalent definition would be possible: simply take φ = Id, and then the efficiency would be ln(p)/t.
Then the most efficient method among a list of methods, in the sense that it is the one that requires the least computational time to get s with a desired accuracy, is the method with the biggest efficiency.
As an illustration, let us compare several methods, with increasing orders, to compute √a. Applying the Newton and Chebyshev methods to f(x) = x² − a and writing W = a/x², we obtain the first members of a family of methods g_m of order m + 1,

g_1(x) = x(1 + W)/2, g_2(x) = x(3 + 6W − W²)/8.

In particular, g_1 = g_N, g_2 = g_C and, for instance, g_3(x) = x(5 + 15W − 5W² + W³)/16. To know the efficiency of these three methods we must know the respective times t_m, m = 1, 2, 3, needed to compute an iteration. They only perform additions, subtractions, multiplications and divisions. Since the most time consuming ones are the last two, to simplify the problem we will only take into account the number of multiplications and divisions at each step. We will call τ the time used for each of these operations. For the Newton method, written as g_1(x) = (x + a/x)/2, at each step we only need two divisions: a/x and the division by 2, so t_1 = 2τ.
For the Chebyshev method and the one associated to g_3 we use the following procedures based on Horner's evaluation of polynomials. First we need two divisions for computing W := a/x², and then

g_2(x) = x(3 + (6 − W)W)/8, g_3(x) = x(5 + (15 + (−5 + W)W)W)/16.

Hence t_2 = (2 + 3)τ = 5τ and t_3 = (2 + 4)τ = 6τ. Therefore the efficiencies of g_1, g_2 and g_3 are

E_1 = 2^{1/(2τ)}, E_2 = 3^{1/(5τ)}, E_3 = 4^{1/(6τ)},

and it holds that E_1 > E_3 > E_2. In general it can be seen that E_1 > E_m, m ≥ 2, and in consequence, although the methods that we have introduced have increasing orders, the most efficient one is in fact the one with lowest order, the celebrated Newton method.
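A tiny numerical check of this comparison, with the times measured in units of τ as above:

```python
# Efficiency E = p^(1/t) for the three methods: orders p = 2, 3, 4 and
# times t = 2, 5, 6 (in units of tau), as computed above.

methods = {"Newton": (2, 2), "Chebyshev": (3, 5), "g3": (4, 6)}
eff = {name: p ** (1 / t) for name, (p, t) in methods.items()}

# Equivalently, compare ln(p)/t: 0.3466 > 0.2310 > 0.2197.
assert eff["Newton"] > eff["g3"] > eff["Chebyshev"]
```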
It is also worth commenting that the bi-Newton method can also be written, using the value W introduced above, as

g_N(g_N(x)) = x(1 + 6W + W²)/(4(1 + W)),

but it is more efficient to apply the Newton expression twice.
We end this section with some historical comments about the Newton method for computing √a, and also about the bi-Newton method. Most probably, this Newton method is the first recurrence developed by humanity. The Babylonians (around 2000-1700 BC) already proposed it as a method for computing (with enough precision for their interests) the square root of a number, see [15, Ch. 7]. This method was also described by Heron of Alexandria in the first century AD. They only performed the first steps of the method and they deduced it with a beautiful geometrical reasoning that I cannot resist including here: Let x_0 be a good approximation of √a. Then construct the rectangle with one side x_0 and the other one such that its area is a. If it is a square we are done and x_0 is the searched square root. Otherwise it is a rectangle with sides of lengths x_0 and a/x_0. One length is bigger than the square root and the other one is smaller. Therefore it is natural to take as a better approximation for √a their average, that is x_1 = (x_0 + a/x_0)/2, and the (Newton) method appears naturally.
The bi-Newton method, presented in a similar manner as in (4), was found in an ancient Indian mathematical manuscript called the Bakhshali manuscript (dated around 200-400) and so, nowadays, it is known as the Bakhshali method.

Approximating π
This section includes three different algorithms to approximate π, all of them based on recurrent procedures.

Archimedes approximations
The method devised by Archimedes to approximate π consists in computing the perimeters of the regular polygons with 6 × 2 n sides circumscribing and inscribed in a circle of diameter one, denoted as q n and p n , respectively. Then, p n < π < q n and lim n→∞ q n = lim n→∞ p n = π.
We will first deduce a simple recurrence for p_n. For the inscribed hexagon (n = 0) it is clear that p_0 = 3. Let us call x the length of a side of a given inscribed regular polygon, and let us compute the length y of the side of the inscribed regular polygon with double the number of sides. By using twice the Pythagorean Theorem we get that, with z = √(1/4 − x²/4),

y² = x²/4 + (1/2 − z)² = x²/4 + 1/4 + z² − z,

and, substituting z in the right-hand expression,

y² = (1 − √(1 − x²))/2.

Hence, if u_n is the length of a side of the inscribed regular polygon with 6 × 2^n sides, it holds that

u_{n+1} = √((1 − √(1 − u_n²))/2), u_0 = 1/2, p_n = 6 × 2^n u_n.

Notice that, by definition, lim_{n→∞} p_n = π. This algorithm produces round-off errors due to cancellations when computing 1 − √(1 − u), for u = u_n smaller and smaller. It is convenient to transform it using that 1 − √(1 − u²) = u²/(1 + √(1 − u²)), and the final algorithm is

u_{n+1} = u_n/√(2(1 + √(1 − u_n²))), u_0 = 1/2.

We get that p_n = 6 × 2^n sin(π/(6 × 2^n)). There is a different presentation of the Archimedes approach, see [32, Chp. 1] for the details, that computes both q_n and p_n simultaneously. It holds that

q_{n+1} = 2q_n p_n/(q_n + p_n), p_{n+1} = √(q_{n+1} p_n) = p_n √(2q_n/(q_n + p_n)).

Moreover q_{n+1} − p_{n+1} < (q_n − p_n)/3. For instance, taking the polygon with 96 sides (n = 4) we get p_4 = 3.1410 . . . < π < q_4 = 3.1427 . . . and we recover the classical Archimedes bounds 3 + 10/71 < p_4 < π < q_4 < 3 + 10/70.
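The stabilized recurrence is a few lines of code; a sketch, assuming u_{n+1} = u_n/√(2(1 + √(1 − u_n²))) with u_0 = 1/2, the side of the inscribed hexagon for a circle of diameter one:

```python
import math

# Doubling recurrence for the side of the inscribed 6*2^n-gon, in the
# cancellation-free form, and the perimeter p_n = 6*2^n*u_n -> pi.

def archimedes(n):
    u = 0.5  # inscribed hexagon: side = radius = 1/2, so p_0 = 3
    for _ in range(n):
        u = u / math.sqrt(2 * (1 + math.sqrt(1 - u * u)))
    return 6 * 2**n * u

assert abs(archimedes(4) - 3.1410) < 1e-3    # Archimedes' 96-gon
assert abs(archimedes(20) - math.pi) < 1e-10
```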

A new simple algorithm
The starting point of this algorithm is the simple equality

∫_0^1 2x(1 − x)²/(1 + x²) dx = π − 3.

Another similar and very nice equality, given by Dalzell ([27]) in 1944, is

∫_0^1 x⁴(1 − x)⁴/(1 + x²) dx = 22/7 − π.

Since both integrands are positive, together they easily prove that 3 < π < 22/7. Notice that 3 and 22/7 are precisely the first two convergents of the continued fraction expansion of π, see (7). In [3,8,28,60,61,69] there appear many other similar relations and some of them have already been used to get algorithms for approximating π, faster than the one we deduce here.
Let us start the deduction of the algorithm. It holds that So, for 0 ≤ x ≤ 1, Since for |u| ≤ 1/2 the convergence is uniform, Hence, if we define We will prove below that J n = I n − I n+1 , where I n := 3 Therefore, Moreover, Hence, By taking as approximations for π the nth partial sums of this alternating sum we get alternatively upper and lower bounds for π. Finally, let us prove (5) and (6). The first formula follows because To prove the second one, notice first that integrating by parts, Equivalently, On the other hand Joining both expressions, canceling K n , we get (2n + 1)I n = 3 2 n − n 2 I n−1 , which clearly implies (6).

Brent-Salamin algorithm
In 1973, independently, Salamin ([74]) and Brent ([12]) found a quadratic method to approximate π. It is based on some classical equalities involving the so called arithmetic-geometric mean, which already appears in the works of Gauss and Legendre, complemented with efficient algorithms for the computation of multiplications and square roots. The proofs of these equalities are based on the theory of elliptic integrals, see [9,70]. In fact, it is not difficult to prove that if we consider a_0 > 0 and b_0 > 0 and we construct the sequences a_{k+1} = (a_k + b_k)/2, b_{k+1} = √(a_k b_k), then lim_{k→∞} a_k and lim_{k→∞} b_k exist and coincide. This value is the arithmetic-geometric mean of a_0 and b_0 and it is denoted by AGM(a_0, b_0). The difficult and remarkable equality is, taking a_0 = 1 and b_0 = 1/√2,

π = 4 AGM(1, 1/√2)²/(1 − Σ_{k=1}^∞ 2^{k+1} c_k²), where c_k² = a_k² − b_k².

The Brent-Salamin algorithm computes

z_n = 4 a_{n+1}²/(1 − Σ_{k=1}^n 2^{k+1} c_k²),

where a_k and b_k are given above, and it can be proved that this algorithm converges quadratically to π. Notice that to get it, the series is truncated and moreover it is used that for n big enough AGM(1, 1/√2) ≈ a_{n+1} = (a_n + b_n)/2. Hence z_1 = 3.140 . . . , z_2 = 3.14159264 . . . , |z_3 − π| < 2 × 10^{−19}, |z_4 − π| < 6 × 10^{−41}, |z_5 − π| < 3 × 10^{−84}. Nowadays there are other similar, and even faster, algorithms, see for instance [2,10,11,29,38].
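In double precision the iteration reaches machine accuracy in three steps; a sketch, assuming the formula z_n = 4a_{n+1}²/(1 − Σ_{k=1}^n 2^{k+1}(a_k² − b_k²)) stated above:

```python
import math

# Brent-Salamin iteration: AGM steps plus the running correction sum.

def brent_salamin(n):
    a, b = 1.0, 1 / math.sqrt(2)
    s = 0.0
    for k in range(1, n + 1):
        a, b = (a + b) / 2, math.sqrt(a * b)   # one AGM step
        s += 2 ** (k + 1) * (a * a - b * b)    # 2^(k+1) * c_k^2
    return (a + b) ** 2 / (1 - s)              # (a_n + b_n)^2 = 4*a_{n+1}^2

assert abs(brent_salamin(1) - 3.1405) < 1e-3
assert abs(brent_salamin(3) - math.pi) < 1e-13  # machine precision already
```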

The GCD as a dynamical system
In his beautiful paper Cooking the Classics ([76]), Ian Stewart presents a way of computing the greatest common divisor (GCD) of two integer numbers by using a continuous DDS, which we reproduce here. As he already commented, this way of obtaining the GCD is a dynamical reformulation of the classical method, already used by the ancient Greeks, called anthyphaeresis. This method consists of removing the biggest possible squares from a rectangle until a new rectangle is reached from which squares of this size cannot be further removed, then continuing the same procedure with the new rectangle and smaller squares, and so on, until arriving at the empty set.
Proof. We start by proving that, if we define the two functions V, W : Ω → N as V(x, y) = x + y and W(x, y) = gcd(x, y), where Ω = N² \ {(x, y) ∈ N² : xy = 0}, they are a strict Lyapunov function and a first integral, respectively, for the semi-DDS generated by F(x, y) = (max(x, y) − min(x, y), min(x, y)) on Ω. In fact, for (x, y) ∈ Ω,

V(F(x, y)) = max(x, y) = V(x, y) − min(x, y) < V(x, y).

Observe also that V(F(x, y)) = V(x, y) when xy = 0.
To prove that W (F(x, y)) = W (x, y) notice that if z divides both, x and y, it also divides max(x, y) and min(x, y) and so does with the two components of F. Conversely, if z divides max(x, y) − min(x, y) and min(x, y) it also divides their sum, max(x, y), and so it divides x and y.
This result gives the following recurrence: a n+1 = max(a n , b n ) − min(a n , b n ), b n+1 = min(a n , b n ), (a 0 , b 0 ) ∈ N 2 , a 0 b 0 = 0, for which there exists m = m(a 0 , b 0 ) ∈ N such that a n = gcd(a 0 , b 0 ) and b n = 0 for all n ≥ m.
It can be seen that this procedure is also equivalent to Euclid's algorithm, with the advantage that there is no need to introduce divisions. Moreover, it allows to extend the definition of GCD to positive rational numbers. Set Q⁺ = Q ∩ {x ∈ R : x ≥ 0}; then:

Corollary 1. For each a = p/q, b = r/s with (a, b) ∈ (Q⁺)² and gcd(p, q) = gcd(r, s) = 1 there exist m = m(a, b) and a rational number c such that F^m(a, b) = (c, 0). This c will be called the greatest common divisor of a and b, gcd(a, b). Moreover

c = gcd(a, b) = gcd(p, r)/lcm(q, s) and lcm(a, b) := ab/gcd(a, b).
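The recurrence works verbatim for rationals, which is precisely the content of the corollary; a minimal sketch:

```python
from fractions import Fraction

# F(x, y) = (max(x,y) - min(x,y), min(x,y)), iterated until the second
# component vanishes; the first one is then the GCD.

def gcd_dds(a, b):
    while b != 0:
        a, b = max(a, b) - min(a, b), min(a, b)
    return a

assert gcd_dds(12, 18) == 6
# Matching the corollary: gcd(p/q, r/s) = gcd(p, r)/lcm(q, s).
assert gcd_dds(Fraction(3, 4), Fraction(9, 10)) == Fraction(3, 20)
```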

Landen maps
Loosely speaking, when, given a family of definite integrals depending on parameters, there is a map among these parameters such that the integrals remain invariant, this map is called a Landen map. In other words, the integrals are first integrals of the semi-DDS generated by this Landen map. The paradigmatic example goes back to the works of Gauss and Landen (1719-1790, a British amateur mathematician) on elliptic integrals. For a > 0, b > 0, consider the elliptic integral

I(a, b) = ∫_0^{π/2} dθ/√(a² cos²θ + b² sin²θ).

They proved that

∫_0^{π/2} dθ/√(a² cos²θ + b² sin²θ) = ∫_0^{π/2} dθ/√(((a + b)/2)² cos²θ + ab sin²θ),    (9)

or, in other words, that if F(a, b) = ((a + b)/2, √(ab)), then I(a, b) = I(F(a, b)) and so I is a first integral or an invariant of the semi-DDS generated by F, see [9,23,36,52]. In Section 6.1 we will prove an extension of this result due to Carlson ([14]).
As we have already explained in Section 4.3, for a > 0, b > 0 it holds that lim_{n→∞} a_n = lim_{n→∞} b_n = AGM(a, b), where AGM(a, b) is the arithmetic-geometric mean of a and b. Let us prove that

I(a, b) = π/(2 AGM(a, b)).    (10)

This holds because I(a, b) is invariant by F and so, for all n ∈ N,

I(a, b) = I(a_n, b_n) → I(AGM(a, b), AGM(a, b)) = π/(2 AGM(a, b)).

Hence, we have the following algorithm for computing elliptic integrals:

a_{n+1} = (a_n + b_n)/2, b_{n+1} = √(a_n b_n), I(a, b) ≈ π/(2 a_n) for n big.

Let us prove that it converges quadratically. Without loss of generality, we can consider a > b > 0. Then, for all n ≥ 1, b < b_n < a_n < a and a²_{n+1} − b²_{n+1} = (1/4)(a_n − b_n)².
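A numerical sanity check of (10) and of the Landen invariance (the midpoint quadrature rule and the sample values a = 3, b = 1 are our own choices):

```python
import math

# AGM iteration and a direct quadrature of I(a,b), compared with pi/(2*AGM).

def agm(a, b, tol=1e-12):
    while abs(a - b) > tol:
        a, b = (a + b) / 2, math.sqrt(a * b)
    return a

def elliptic_I(a, b, n=20000):
    h = (math.pi / 2) / n  # midpoint rule on [0, pi/2]
    return sum(h / math.sqrt((a * math.cos((k + 0.5) * h))**2
                             + (b * math.sin((k + 0.5) * h))**2)
               for k in range(n))

a, b = 3.0, 1.0
assert abs(elliptic_I(a, b) - math.pi / (2 * agm(a, b))) < 1e-6
# Landen invariance: I(a, b) = I((a+b)/2, sqrt(a*b)).
assert abs(elliptic_I(a, b) - elliptic_I((a + b) / 2, math.sqrt(a * b))) < 1e-6
```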

Hence

a_{n+1} − b_{n+1} = (a_n − b_n)²/(4(a_{n+1} + b_{n+1})) < (a_n − b_n)²/(8b),

and the convergence is quadratic. Similar procedures can be used to compute, also quadratically, other elementary functions like log(x), e^x, . . ., see [9,12].
Other interesting examples are given by some rational improper integrals. The simplest one is If we consider it is not difficult to prove that I(a, b, c) = I (F(a, b, c)). Hence Notice that in this case there is no need to reach the limit to compute the integral because all points reach fixed points in two steps. Other rational examples, much more involved, are studied in [7,17,62,63].

Computation of other means
In [14], Carlson extends (9) to other types of means. Given a and b positive, consider, for each couple (i, j) with i, j ∈ {1, 2, 3, 4}, the sequences

a_{n+1} := f_i(a_n, b_n), b_{n+1} := f_j(a_n, b_n), n ≥ 0.

It is not difficult to see that, given i and j, the sequences {a_n}_n and {b_n}_n converge to a common limit, which is some kind of mean of a_0 and b_0. In the next result the Beta function will appear, defined for positive m and n as

B(m, n) = ∫_0^1 x^{m−1}(1 − x)^{n−1} dx.

It satisfies B(m, n) = B(n, m) and B(1/2, 1/2) = π, see for instance [1].
with r′ = s + s′ − r. By taking the parameters (r, s, s′) according to the next table, it holds that R(r; s, s′; a², b²) is a first integral of the DDS associated to F_{i,j}(a, b) := (f_i(a, b), f_j(a, b)), i < j. That is, for the (r, s, s′) corresponding to the i < j considered,

R(r; s, s′; a², b²) = R(r; s, s′; (f_i(a, b))², (f_j(a, b))²).    (12)

Proof. Given a and b, by using the above equalities the function (11) can be rewritten as (13). To prove that (11) is a first integral of F_{i,j}, both expressions (11) and (13) are evaluated at ((f_i^n(a, b))², (f_j^n(a, b))²) for all n ∈ N. Hence (12) holds.
Notice that for (i, j) = (1, 2) the corresponding first integral coincides, modulo a multiplicative constant, with the first integral given by Gauss and Landen. We end this section with the computation of the harmonic-geometric mean of a > 0 and b > 0, HGM(a, b). It is natural to introduce it as the common limit of the two components of (a_{n+1}, b_{n+1}) = G(a_n, b_n), where

G(a, b) = (2ab/(a + b), √(ab)).

It is easy to see that if I is a first integral of the semi-DDS given by f, that is, if I ∘ f = I, then given any bijection ϕ, it holds that J = I ∘ ϕ is a first integral of the DDS generated by g = ϕ^{−1} ∘ f ∘ ϕ. Indeed,

J ∘ g = I ∘ ϕ ∘ ϕ^{−1} ∘ f ∘ ϕ = I ∘ f ∘ ϕ = I ∘ ϕ = J.

By taking f = F, with F given in (9), and ϕ(a, b) = (1/a, 1/b), we get g = G. Arguing as in the proof of (10) it holds that J(a, b) = HGM(a, b) π/2. By using this last equality, (10) and (14) we get that AGM(a, b) · HGM(a, b) = ab.

Gambler's ruin
Assume that a gambler starts playing a game with a capital N ∈ N. He wins a match with probability p, and then his capital increases by 1, and loses a match with probability q = 1 − p, in which case his capital becomes N − 1. The game consists of successive matches and ends either when he arrives at an a priori fixed capital A ∈ N, A > N, or when he loses all his capital, arriving at 0, see for instance [33, Ch. XIV]. We want to know the probability R_N that the player gets ruined starting with the capital N. It seems that this problem goes back to a letter from Blaise Pascal to Pierre Fermat in 1656.
Clearly the above game is equivalent to another one: two players, one against the other, following similar rules, one with initial capital N ≥ 0 and the other one with initial capital A − N ≥ 0. This equivalent game stops when one of the two loses all his money.
The first game can be modeled by using the random variable M m = N +S m , where S m is a simple random walk. More concretely, S m = ∑ m j=1 X j , where all X j are independent, identically distributed random variables, with X j ∼ X and X a Bernoulli type random variable such that P(X = 1) = p, P(X = −1) = q.
Call B_n the event "getting ruined if we start the game with a capital N = n". This happens if M_m = 0 for some m ∈ N and M_i < A for all i < m. It holds that

R_n = P(B_n) = P(B_n / X_1 = 1)P(X_1 = 1) + P(B_n / X_1 = −1)P(X_1 = −1) = P(B_n / X_1 = 1)p + P(B_n / X_1 = −1)q (∗)= P(B_{n+1})p + P(B_{n−1})q = R_{n+1} p + R_{n−1} q,

where the equality (∗) is intuitively clear and can be proved without major difficulties. The case p = 0 is trivial and can be treated apart. For p ≠ 0, we get that

R_n = p R_{n+1} + q R_{n−1}, 0 < n < A, R_0 = 1, R_A = 0.

Notice that the boundary conditions come naturally from the rules of the game. The general solution of this difference equation is α + β(q/p)^n when p ≠ q and α + βn when p = q = 1/2, for some real numbers α and β. Imposing the boundary conditions we get, for p ≠ q and n = N,

R_N = ((q/p)^N − (q/p)^A)/(1 − (q/p)^A),

and for p = q = 1/2, R_N = 1 − N/A. The above result also proves that the probability that the game does not finish is zero. In fact, the game ends if either the player gets ruined, or he arrives at the capital A. The probability of the first event is R_N, while the other one, S_N, corresponds to another player that starts with capital A − N, instead of N, and a new p equal to q = 1 − p. Then

S_N = ((p/q)^{A−N} − (p/q)^A)/(1 − (p/q)^A),

which clearly gives R_N + S_N = 1. Obviously, both maps N → A − N and p → 1 − p are involutions. Similarly, it can be seen that if we take the random variable D_N, "duration of the game," and we call E_N its expectation, E_N = E(D_N), which can be proved to be finite, then for p ≠ 0,

E_N = p E_{N+1} + q E_{N−1} + 1, E_0 = E_A = 0,

and its explicit solution can be found similarly.
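A numerical check of the closed form (a sketch; the parameters A = 10, p = 0.4 are arbitrary choices), assuming R_N = ((q/p)^N − (q/p)^A)/(1 − (q/p)^A) for p ≠ q:

```python
# Verify that the closed form satisfies the boundary-value recurrence
# R_n = p*R_{n+1} + q*R_{n-1}, R_0 = 1, R_A = 0, and that R_N + S_N = 1.

def ruin_probability(N, A, p):
    q = 1 - p
    if p == q:
        return 1 - N / A
    r = q / p
    return (r**N - r**A) / (1 - r**A)

A, p = 10, 0.4
q = 1 - p
R = [ruin_probability(n, A, p) for n in range(A + 1)]
assert R[0] == 1 and R[A] == 0
for n in range(1, A):
    assert abs(R[n] - (p * R[n + 1] + q * R[n - 1])) < 1e-12

# The complementary probability S_N: other player, capital A - N, p and q swapped.
for N in range(A + 1):
    assert abs(ruin_probability(N, A, p) + ruin_probability(A - N, A, q) - 1) < 1e-12
```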
Let us explain how we have found this map F. In fact, this result goes back to the 1621 studies of the French mathematician Claude Gaspard Bachet de Méziriac (1581-1638) about the rational solutions of the now called Bachet equation y² − x³ = k. Nowadays, this equation is also known as the Mordell curve, in honor of the American-born British mathematician Louis J. Mordell (1888-1972), who proved that for each 0 ≠ k ∈ Z it contains finitely many points with integer coordinates. This finiteness result was also proved in 1908 by the Norwegian mathematician Axel Thue. Bachet already showed that if (x, y), with y ≠ 0, is a solution of this equation, then

((x⁴ − 8kx)/(4y²), (−x⁶ − 20kx³ + 8k²)/(8y³))

is also a solution. Although it is not clear how he found this result, today the most common explanation is a geometric one, related with the group structure on elliptic curves. After some computations it can be seen that if we consider the curve C := {(x, y) ∈ R² : y² − x³ − k = 0} and its tangent line L at P = (x, y) ∈ C, then the second intersection point of L with C gives precisely the above solution. Then, G_k|_{k=y²−x³} = F. Another simpler and famous integrable rational map is the Lyness map L(x, y) = (y, (a + y)/x).

It has the elliptic first integral
H(x, y) = (x + 1)(y + 1)(x + y + a) / (xy),

because H(L(x, y)) = H(x, y). This map L generates the DDS associated to the second order difference equation x_{n+2} = (a + x_{n+1})/x_n. Assume for instance that a ∈ Q. Then it is known that for any a ∉ {0, 1} the above recurrence generates (real) periodic sequences with infinitely many different prime periods, see [6,79]. On the other hand, it only generates periodic sequences of rational numbers with prime periods 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or 12, and all these periods are possible for some x_0, x_1 and a. Moreover, if we restrict our attention to positive rational values of a and positive rational initial conditions, the only possible periods are 1, 5 and 9. Taking a = n^2 − n and x_0 = x_1 = n ∈ N we obtain trivially 1-periodic integer sequences. The existence of positive rational periodic points of period 5 is well-known and simple: they only exist when a = 1 and in this case all rational initial conditions give rise to them, because the recurrence is globally 5-periodic. Proving the existence of the 9-periodic sequences was the goal of [35], disproving a conjecture of Zeeman about this question. See also that paper and its references for more details about the subject.
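Both the invariance of H and the global 5-periodicity for a = 1 can be verified with exact rational arithmetic. A minimal sketch (function names are ours):

```python
from fractions import Fraction

def lyness(x, y, a):
    # The Lyness map L(x, y) = (y, (a + y)/x)
    return y, (a + y) / x

def H(x, y, a):
    # First integral of the Lyness map: H(L(x, y)) = H(x, y)
    return (x + 1) * (y + 1) * (x + y + a) / (x * y)

a = Fraction(1)
start = (Fraction(2, 3), Fraction(5, 7))   # arbitrary positive rational seed
p, h0 = start, H(*start, a)
for _ in range(5):
    p = lyness(*p, a)
    assert H(*p, a) == h0                  # H is constant along the orbit
assert p == start                          # for a = 1 the map is globally 5-periodic
```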
These strong differences between real and rational dynamics for integrable birational maps are explained by Mazur's Torsion Theorem (see for example [75]) that provides a complete and short list of possible torsion subgroups for rational elliptic curves.

A divide-and-conquer algorithm: QUICKSORT
The so called QUICKSORT algorithm was proposed by the British computer scientist C. A. R. Hoare in 1959 and it is, on average, one of the best ones for sorting n objects when n is big. Let us describe it and prove that although sometimes it needs of the order of n^2/2 comparisons, on average it only needs about 2n ln(n). This section is based on [40, Ch. 13.8]; see also [26, Ch. 9].
To order n different objects the algorithm follows these three steps:
1. We randomly choose one of the objects. Call r its position in the ordered list, 1 ≤ r ≤ n. A priori we do not know this value r.
2. We compare each of the remaining n − 1 objects with the chosen one. The smaller ones are placed on its left-hand side (there are r − 1 of them) and the bigger ones on its right-hand side (there are n − r). We get two piles with r − 1 and n − r objects, which in general are not ordered.
3. We take each of the piles (with r − 1 and n − r objects) and proceed in the same way, starting from step 1, until a pile has at most one object. In that case, the corresponding branch of the algorithm stops.
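As a sketch, the three steps can be written in a few lines of Python (random pivot choice, strict comparisons; as in the text, we assume the objects are pairwise distinct):

```python
import random

def quicksort(items):
    """Hoare's divide-and-conquer sorting scheme, following the three steps above."""
    if len(items) <= 1:                # step 3: piles of size <= 1 stop the branch
        return items
    pivot = random.choice(items)       # step 1: random choice of one object
    left = [x for x in items if x < pivot]    # step 2: the smaller pile (r - 1 objects)
    right = [x for x in items if x > pivot]   # step 2: the bigger pile (n - r objects)
    return quicksort(left) + [pivot] + quicksort(right)

data = random.sample(range(1000), 100)        # 100 distinct objects
assert quicksort(data) == sorted(data)
```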
Let us compute the average time needed for the algorithm to sort all the n objects. We normalize to 1 the time needed to compare two objects and put the bigger (or the smaller) one in the right (or the left) pile. We also assign time 1 to the random choice of one of the objects from a list.
To formalize the algorithm we introduce two random variables. The first one, X_n, measures "the time needed to sort n objects following the proposed algorithm". Notice that its value depends on each of the random selections of the value r done at the successive steps of the algorithm. The second one, Y_n, gives "the place of the chosen object in the ordered list formed by the n objects". The random variable X_n takes values in R, while the second one takes values in {1, 2, . . . , n}, each of them with probability 1/n. Set T_n for the expectation of X_n, that is, T_n = E(X_n). It holds that

T_n = E(X_n) = Σ_{k=1}^{n} E(X_n | Y_n = k) P(Y_n = k) = (1/n) Σ_{k=1}^{n} (n + T_{k−1} + T_{n−k}) = n + (2/n) Σ_{k=0}^{n−1} T_k,

with T_0 = 0. The key point in the above chain of equalities is step 2 of the algorithm, which says that, once we know that we have chosen the k-th object of the list, the expected time needed to order the whole list is the sum of the expected times needed to order the piles with k − 1 and n − k objects, which are T_{k−1} and T_{n−k} respectively, plus n, which precisely corresponds to the n − 1 comparisons needed to form the two piles in step 2 plus 1 for the first selection. Hence, E(X_n | Y_n = k) = n + T_{k−1} + T_{n−k}.
The obtained relation and the one corresponding to n − 1 write as

nT_n = n^2 + 2(T_0 + T_1 + · · · + T_{n−1})  and  (n − 1)T_{n−1} = (n − 1)^2 + 2(T_0 + T_1 + · · · + T_{n−2}).

Subtracting them we get that nT_n − (n − 1)T_{n−1} = 2n − 1 + 2T_{n−1}, or equivalently,

T_n / (n + 1) = T_{n−1} / n + (2n − 1) / (n(n + 1)).

To compute T_n we introduce the change of variables S_k = T_k / (k + 1). Then the above difference equation writes as S_n = S_{n−1} + c_n, with c_n = (2n − 1)/(n(n + 1)) = 3/(n + 1) − 1/n. It is easy to solve because

S_n = S_{n−1} + c_n = S_{n−2} + c_{n−1} + c_n = S_{n−3} + c_{n−2} + c_{n−1} + c_n = · · · = S_0 + c_1 + c_2 + · · · + c_n.

Hence, since S_0 = T_0 = 0,

S_n = Σ_{k=1}^{n} (3/(k + 1) − 1/k) = 2h_n + 3/(n + 1) − 3,

where h_n = 1 + 1/2 + 1/3 + · · · + 1/n, and so T_n = (n + 1)S_n = 2(n + 1)h_n − 3n. Finally, recall the result of Euler (1731), h_n = ln(n) + γ + R(n) with lim_{n→∞} R(n) = 0, where γ ≈ 0.5772 is the Euler constant, see for instance [40,49]. Hence,

T_n = (n + 1)S_n = 2(n + 1)h_n − 3n = 2(n + 1)(γ + ln(n) + R(n)) − 3n = 2(n + 1) ln(n) + (2γ − 3)n + 2(n + 1)R(n) + 2γ ≈ 2n ln(n),

as we wanted to prove. It is clear that if we have very bad luck choosing the r-th objects during the algorithm, it takes longer to order the n objects. The worst situation occurs when, at each step, the chosen element is always the first of the list. Then we would need to perform

n + (n − 1) + (n − 2) + · · · + 2 = (n + 2)(n − 1)/2 ≈ n^2/2

comparisons, that is, X_n ≈ n^2/2. Taking as unit of time 10^{−6} seconds and n = 10^6, n^2/2 × 10^{−6} seconds ≈ 5.8 days, while 2n ln(n) × 10^{−6} seconds ≈ 0.46 minutes, and so the expected time is much smaller than in the worst possible situation.
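The closed form T_n = 2(n + 1)h_n − 3n can be checked against the original recurrence with exact arithmetic:

```python
from fractions import Fraction

# T_n from the recurrence T_n = n + (2/n) * sum_{k<n} T_k, with T_0 = 0
T = [Fraction(0)]
for n in range(1, 50):
    T.append(n + Fraction(2, n) * sum(T))

# Closed form T_n = 2(n+1)*h_n - 3n, with h_n the n-th harmonic number
h = Fraction(0)
for n in range(1, 50):
    h += Fraction(1, n)
    assert T[n] == 2 * (n + 1) * h - 3 * n
```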
To reduce the chances that this worst situation happens, there is a usual modification of the algorithm, which almost does not increase the number of needed comparisons, and consists in replacing step 1 by the similar step: 1'. We randomly choose 3 objects among all the n objects of the list. The one that lies in the middle of these three gives the value of r.
In the "Appendix II: Solving recurrences" of [31] there is a systematic treatment of divide-and-conquer algorithms. The corresponding running-time functional equations are of the form T(n) = a T(n/b) + f(n), where a and b are constant parameters and f is a given function. For some values of b they give rise to difference equations.
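As an illustration (our own example, not taken from [31]), for a = b = 2 and f(n) = n, the equation T(n) = 2T(n/2) + n with T(1) = 0 has the exact solution n log_2(n) on powers of 2:

```python
def T(n, a=2, b=2, f=lambda n: n):
    """Running time T(n) = a*T(n/b) + f(n) for n a power of b, with T(1) = 0."""
    return 0 if n <= 1 else a * T(n // b) + f(n)

# For a = b = 2 and f(n) = n (the merge-sort shape) T(2^k) = 2^k * k exactly
for k in range(1, 15):
    n = 2 ** k
    assert T(n) == n * k
```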

Lower bounds for Hilbert numbers
Consider real planar polynomial systems of ordinary differential equations (ODE), x′ = P(x, y), y′ = Q(x, y), with P and Q polynomials of degree at most n. The Hilbert number H(n) is the maximum number of limit cycles that these families of ODE can have, or infinity if there is no upper bound for the number of limit cycles. Recall that limit cycles are the periodic orbits that are isolated in the set of all periodic orbits of the ODE. It is well-known that H(0) = H(1) = 0, but even for n = 2 it is not known whether H(2) is finite. It is known that H(2) ≥ 4.
The knowledge of H (n) is one of the most elusive problems of the famous Hilbert's list and constitutes the second part of Hilbert's sixteenth problem.
Based on the ideas of [18], and using only very simple background on ODEs and difference equations, we will provide quadratic lower bounds for H(n).

Proposition 1.
There exists a sequence of values n_k tending to infinity and a constant K > 0 such that H(n_k) > K n_k^2.

Proof. The construction of the ODE that gives this lower bound is a recurrent process. Let X_0 = (P_0, Q_0) be a given polynomial vector field of degree n_0 with c_0 > 0 limit cycles. Since this number of limit cycles is finite, there exists a compact set containing all of them. Therefore, doing a translation if necessary, we can assume that all of them are in the first quadrant. For simplicity we continue calling X_0 this new translated vector field. From it, we construct a new vector field by using the (non-bijective) transformation (x, y) = (u^2, v^2). The differential equation associated to X_0 is ẋ = P_0(x, y), ẏ = Q_0(x, y), and it writes in these new variables as 2u u̇ = P_0(u^2, v^2), 2v v̇ = Q_0(u^2, v^2). By introducing a new time s, defined as dt/ds = 2uv, we get that this ODE is transformed into the polynomial one

u′ = v P_0(u^2, v^2),  v′ = u Q_0(u^2, v^2),

of degree n_1 = 2n_0 + 1, which we call X_1. Since each point (x, y) in the first quadrant has four preimages (±√x, ±√y), the new ODE has in each quadrant a diffeomorphic copy of the positive quadrant of the phase portrait of X_0, and hence X_1 has at least c_1 = 4c_0 limit cycles. Repeating the process we get vector fields X_k of degree n_k with at least c_k limit cycles, where n_{k+1} = 2n_k + 1 and c_{k+1} = 4c_k. Hence n_k = 2^k(n_0 + 1) − 1 and c_k = 4^k c_0, and therefore

H(n_k) ≥ c_k = c_0 ((n_k + 1)/(n_0 + 1))^2 > K n_k^2, with K = c_0/(n_0 + 1)^2.

As a seed X_0 we can take, for instance, the cubic vector field ẋ = −y + x(1 − x^2 − y^2), ẏ = x + y(1 − x^2 − y^2). Clearly it has only the limit cycle r = 1. Thus c_0 = 1 and n_0 = 3. Hence K = 1/16 and n_k = 2^{k+2} − 1. On the other hand, if we take as X_0 any quadratic system that reveals that H(2) ≥ 4, then c_0 = 4 and n_0 = 2. By using this seed, K = 4/9 and n_k = 3 · 2^k − 1. By using the best known lower bounds for H(n) for small n, given in Table 1, see [39,41,42,56,58,71,78], we get that the best K obtained is 6/5. In fact, it is known that H(n) ≥ K n^2 log(n), and for the moment this is the best result on lower bounds for H(n); see again [18]. Our proof is totally inspired by that paper. In their study, the authors add at each step some limit cycles that appear by perturbing the centers created by the method on the axes uv = 0. They obtain (see also [56,57]) that, instead of (15), it holds that n_{k+1} = 2n_k + 1, c_{k+1} = 4c_k + d_k,
for a given sequence {d_k}_k. Studying these recurrences, they got the better lower bounds.
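The recurrences n_{k+1} = 2n_k + 1, c_{k+1} = 4c_k, their explicit solutions and the quadratic bound can be checked programmatically for the two seeds of the proof (a sketch; the helper name `grow` is ours):

```python
def grow(n0, c0, steps=20):
    """Iterate n_{k+1} = 2*n_k + 1 (degree) and c_{k+1} = 4*c_k (limit cycles)."""
    n, c = n0, c0
    seq = [(n, c)]
    for _ in range(steps):
        n, c = 2 * n + 1, 4 * c
        seq.append((n, c))
    return seq

for n0, c0 in [(3, 1), (2, 4)]:          # the two seeds: cubic and quadratic
    K = c0 / (n0 + 1) ** 2               # K = 1/16 and K = 4/9, respectively
    for k, (n, c) in enumerate(grow(n0, c0)):
        assert n == 2 ** k * (n0 + 1) - 1    # explicit solution for n_k
        assert c == 4 ** k * c0              # explicit solution for c_k
        assert c > K * n ** 2                # the quadratic lower bound
```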

Coxeter difference equations
Globally periodic recurrences have recently attracted the interest of many researchers, see for instance [4,13,19,25,47]. Here we recall one of the less known families of rational examples, the one introduced by Coxeter in 1971, see [24].
For each natural number m ≥ 2, Coxeter proved that the recurrences

x_{n+m} = f_m(x_n, x_{n+1}, . . . , x_{n+m−1}) := 1 − x_{n+m−1}/(1 − x_{n+m−2}/(1 − · · · /(1 − x_n)))

are globally (m + 3)-periodic, that is, for any admissible set of initial conditions, x_{n+m+3} = x_n for all n ≥ 0. For instance, for m = 2, 3, the recurrences are

x_{n+2} = 1 − x_{n+1}/(1 − x_n)  and  x_{n+3} = 1 − x_{n+2}/(1 − x_{n+1}/(1 − x_n)),

respectively. It is easy to see that for m = 2, in the new variables u_n = x_n − 1, it corresponds to the well-known globally 5-periodic Lyness difference equation u_{n+2} = (1 + u_{n+1})/u_n, which has already appeared in Section 8. As usual, the study of the above recurrences can be reduced to the study of the discrete dynamical system given by the map F_m(x_1, x_2, . . . , x_m) = (x_2, x_3, . . . , x_m, f_m(x_1, x_2, . . . , x_m)).
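The global (m + 3)-periodicity can be tested with exact rational arithmetic. In the sketch below, `coxeter` is our reading of f_m as a nested continued fraction, and the seeds are arbitrary admissible rational initial conditions:

```python
from fractions import Fraction

def coxeter(xs):
    """f_m(x_1, ..., x_m) = 1 - x_m/(1 - x_{m-1}/( ... /(1 - x_1)))."""
    acc = 1 - xs[0]
    for x in xs[1:]:
        acc = 1 - x / acc
    return acc

def orbit_step(state):
    # One step of F_m: shift the window and append f_m of the current state
    return state[1:] + (coxeter(state),)

for m in (2, 3, 4):
    state = tuple(Fraction(k + 2, 10) for k in range(m))   # admissible rational seed
    s = state
    for _ in range(m + 3):
        s = orbit_step(s)
    assert s == state          # globally (m+3)-periodic
```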
In his paper, Coxeter gives a proof that these recurrences are globally (m + 3)-periodic, based on the properties of some cross-ratios. In [20] the authors gave a new algebraic proof of this result, showing that F_m^{m+3} = Id. They also prove the surprising fact that "all" the possible geometrical behaviors that linear real globally periodic recurrences can have are present in the Coxeter map. We state their results next, where, as usual, [s] denotes the integer part of s. Proof. Let L : R^m → R^m be the globally periodic linear map L(x_1, . . . , x_m) = (x_2, . . . , x_m, a_1x_1 + a_2x_2 + · · · + a_mx_m) associated to the periodic recurrence. It is known that the characteristic polynomial of L has no multiple roots, see [25,48]. On the other hand, it is a real polynomial of degree m and all its roots must also be (m + 3)-roots of unity, so it divides λ^{m+3} − 1. Thus the proof follows by counting the different number of possibilities for removing a degree 3 real factor from λ^{m+3} − 1. The map F_m has fixed points, all of them with positive coordinates. At each of these fixed points F_m is locally conjugated to a linear (m + 3)-periodic recurrence which has no line of fixed points. Moreover, all these m + 3 linear maps are pairwise non-conjugate. As a consequence, all linear globally (m + 3)-periodic recurrences having no line of fixed points are present in the Coxeter difference equation.
On the other hand, the map F_5 has three fixed points x_i := (x_i, x_i, x_i, x_i, x_i), i = 1, 2, 3, with x_1 = 1/2, x_2 = 1 − √2/2 and x_3 = 1 + √2/2. It is easy to prove that, for each i, the characteristic polynomial of the differential matrix d(F_5) at the point x_i is exactly the polynomial p_i(λ), as predicted by Theorem 3.