Mortality by cause of death in Colombia: a local analysis using spatial econometrics

Colombia is undergoing major changes in mortality patterns. National- and department-level cause-specific analyses have previously been carried out, but very little is known about municipal-level trends, despite their epidemiological interest. We first analyze standardized mortality rates for seven cause-of-death groups to obtain high and low mortality clusters based on the spatial autocorrelation indicators Global Moran’s I and Local Moran’s I. The Mann–Whitney nonparametric test is then used to ascertain statistical associations between the high and low mortality clusters and known health determinants. We subsequently apply spatial lag and Durbin (when spatial autocorrelation was present) and OLS models (when not) to explain overall spatial patterns in cause-specific mortality. Age- and sex-specific cause-of-death mortality and population data were obtained from the National Administrative Department of Statistics (DANE). Deaths were corrected for each municipality due to under-registration. Results show that spatial autocorrelation declined over time for all cause-of-death categories, except male circulatory system diseases and perinatal mortality. It is highest in external causes, especially among men, with mortality hotspots moving from the central Andean area to Orinoquia and the Amazon rainforest. Male mortality is also more spatially clustered than female mortality and especially neoplasms, and external-cause mortality is also indirectly affected by the conditions of neighboring municipalities. Municipal surface area, ethnicity and public expenditure on health and education are the most frequent contextual variables explaining territorial differences in mortality. The identification of geographical mortality clusters in Colombia will allow decision makers to prioritize those regions with higher mortality.


Background
Mortality in Colombia is still transitioning from the second to the third phase of the epidemiological transition (ET), i.e., from a pattern dominated by communicable diseases to one where chronic diseases have become most common (Omran 1971; Otero 2013; Cristancho 2017). Likewise, Frenk et al. (1991) stated that in Latin American countries the ET is characterized by a protracted and heterogeneous pattern that blends infectious with chronic degenerative and social pathological diseases. In the case of Colombia, some of the country's interior and poorer departments still have significant rates of both emerging and re-emerging communicable diseases, which accords with the unequal regional socioeconomic development in the country. For instance, in the Amazon Region acute respiratory infections are even today among the top three causes of death, as communities there have suffered a systematic lack of targeted public policies to improve their quality of life and also lag behind the rest of the country in terms of economic and social development (Otero 2013). Conversely, the wealthiest parts of the country (including the capital district of Bogotá, Cundinamarca, Quindío, Risaralda and Valle del Cauca) are clearly situated in the third phase of the ET, as infant mortality is low, life expectancy is high and cancer has (almost) displaced ischemic heart disease as the primary cause of death (ibid.; Cristancho 2017).
Mortality studies are fundamental to understanding the health status of a population, in which analysis of mortality by cause of death allows one to gain initial insights into possible reasons for mortality differences (Spijker 2004). By also analyzing their spatial distribution and its etiology (i.e., direct and indirect factors that cause diseases), decisions can then be made in accordance with the needs of each region, whether in terms of health or other health-related aspects (López and Arce 2008). The level of mortality could therefore be considered as an indicator of the health and living condition of a population, which are, in turn, related to demographic, economic, social, biological, cultural and political factors.
Interest in the study of mortality in Colombia has increased in recent years due to the changes in the cause-of-death pattern from communicable to chronic diseases. The most studied themes have been mortality rates and life expectancy, for which updated information is publicly available at the departmental level. Considerable emphasis has also been given to external causes, particularly homicide, because of the country's historic specificity and the convergence of problems of a political, economic, legal, psychological and individual nature (e.g., Franco-Agudelo 1997; Arroyave et al. 2014), to non-communicable diseases such as cancer (e.g., Piñeros et al. 2013) and to certain communicable diseases typical of tropical countries (e.g., dengue; Misnaza et al. 2016). On the other hand, information on the impact of specific causes of death on life expectancy is limited (Cristancho 2017), as are analyses based on geographical models that allow the identification of mortality clusters from municipal data. Indeed, geographical analysis of mortality in small areas has been a neglected topic despite its epidemiological interest, not only in Colombia but in Latin American demography in general, perhaps with the exception of Brazil. However, the known regional diversity in cause-specific mortality in Colombia justifies analyzing such patterns at a small scale. The same applies to the use of spatial econometrics, a tool that permits the identification of high and low mortality clusters and potential explanations for the observed spatial variation.
Conversely, the international literature on the analysis of spatial autocorrelation in issues related to health and mortality is abundant and varied. Such studies have generally sought to provide evidence on causal relationships related to the environment and the use of health services, which can be used to facilitate public health decision making. Examples based on US county-level data include adherence to breast and colorectal cancer screening (Feng et al. 2016), colorectal cancer mortality in North Carolina (Kuo et al. 2019) and stroke hospitalization and mortality risk in Florida (Roberson et al. 2016). Recent spatial analysis studies from other countries include those by Rodrigues et al. (2013) on maternal and child health in the Brazilian state of Pernambuco, Santos et al. (2019) on child leprosy in Northeast Brazil, Piuvezam et al. (2015) on mortality from cardiovascular diseases among the Brazilian elderly, Kim et al. (2016) on cancer mortality from different sites in Korea, Lin et al. (2019) on suicide mortality in Taipei City, Taiwan, and Thompson and Gartner (2014) on homicide mortality in Toronto. Nevertheless, there are perhaps only a handful of spatial studies that analyzed more than one cause-of-death category (two exceptions include Diez Roux et al. 2007; Windenberger et al. 2012) or that examined the geographical variation in mortality rates by testing the effect of potentially explanatory features of neighboring areas after accounting for the characteristics of the specific area (one exception is Yang et al. 2015).
Although different contributions have been made to the analysis of cause-specific mortality in Colombia, spatial autocorrelation studies in general, and studies of (cause-specific) mortality at the municipal level in particular, are still scarce. Acevedo Bohórquez and Velásquez Ceballos (2008) analyzed municipal information on homicides recorded in 2001 in a single department of the country. Their results showed that mortality from this cause was not randomly distributed. The Colombian National Health Institute and Health Observatory (INS/ONS 2014a) estimated the spatial autocorrelation for avoidable causes of mortality between 1998 and 2011. Later that year, a geographical analysis of homicides was included (INS/ONS 2014b), which showed the presence of positive spatial autocorrelation of the standardized mortality rates for homicides (Global Moran's I exceeded 0.5) as well as the presence of high and low mortality clusters. However, despite their spatial analyses, these studies were rather descriptive as they did not aim to explain those spatial patterns.
The current study therefore builds on previous research investigating the spatial distribution of major causes of death at the municipal level in Colombia. Mortality and population data came from the National Administrative Department of Statistics (DANE), although we correct for under-registration of cause-specific deaths, as is explained in detail in the ensuing section. Our paper has five main contributions:
(1) to produce better spatial data on mortality by cause of death, age and sex at the municipal level in Colombia than are officially available;
(2) to identify the presence of spatial clusters in mortality for major causes of death (infectious diseases, neoplasms, circulatory system disease, perinatal mortality, external causes, remaining causes and ill-defined causes);
(3) to show that departments are too heterogeneous for data analysis, so that analysis at the municipal level is more meaningful;
(4) to statistically test whether several known health determinants help to explain the observed spatial variation, i.e., the existence of mortality clusters, in Colombia during the period 2004-2006;
(5) to construct spatial regression models that overcome the shortcomings of classical analytical approaches used in ecological mortality research, including ordinary least squares (OLS), which ignores spatial effects.


Data and methods
Currently, Colombia has 33 departments (see Fig. 1) and 1122 local administrative entities, [1] comprising 1103 municipalities, 18 "non-municipalized areas" [2] and San Andrés and Providencia. For the spatial analysis, San Andrés and Providencia were excluded because they are two groups of islands situated too far from the Colombian mainland (about 775 km) to be considered contiguous to other municipalities. Standardized Mortality Rates (SMR) for seven major causes of death (see Table 1) were estimated from 1998 to 2014. We obtained the required mortality and population data by single ages and sex at the municipal level for the studied period directly from the National Administrative Department of Statistics (DANE). Mortality data were derived from the national data file and contain non-fetal deaths at the municipal level, and the population data came from DANE (2014). It should be mentioned that DANE and other researchers (Flórez and Méndez 1997;, 2007; Vargas and Schmalbach 2013; Piñeros and Murillo 2004; Piñeros et al. 2006) identified important shortcomings in the vital statistics in recent years, related to alterations to the mortality register and to the turbulent political history and violence Colombia suffered during the last decades, and accordingly employed methods similar to ours to correct for this at the departmental level. It has been estimated that the nationwide coverage of death records is 74%, although in departments such as Amazonas, Chocó, Vaupés, Vichada and Nariño it does not exceed 50%, including for specific groups of causes. In view of this situation, DANE decided to construct mortality tables by sex and department that incorporated significant corrections in infant mortality and at other ages, based on different internal studies on the coverage of the registry that showed important regional discrepancies. However, these corrections were never passed on to the municipal scale or to the causes of death (DANE 2007).

Table 1 Descriptive statistics of sex- and cause-specific standardized mortality rates of Colombian municipalities per population of 100,000 for the period 1998-2014 before and after data correction. Columns: cause of death (Pan American Health Organization's 6/67 ICD-10 code), indicator, data correction, and the periods 1998-2000, 2001-2003, 2004-2006, 2007-2009, 2010-2012 and 2013-2014 (Men). Rows include all-cause mortality, neoplasms (C00-D48) and ill-defined causes (R00-R99).

Rodríguez-García et al. (2017) underscored that for infectious diseases, the departments that exhibit the highest level of mortality are, in general, the most rural ones in the country, such as Guainía, Amazonas, Vaupés, Cauca, Vichada, Putumayo, Nariño, Chocó, Córdoba, La Guajira, Guaviare and Caquetá, departments which are also among those with the lowest mortality coverage. For this reason, the data we use in this study come from corrections applied to the obtained aggregate number of deaths in Colombia according to sex, age group, municipality of residence and cause of death. The procedure for estimating the vital statistics is as follows: (a) For each year (t) we estimate the corrected total number of deaths ($\hat{D}$) by sex (s), age group (a) and department of residence (g) by applying the mortality rates by sex, age group and department of residence as published in the official DANE mortality tables (DANE 2007). DANE prepared these mortality tables for the period 1985-2020 for each sex separately based on the information from the last three censuses, the birth records according to the residence of the mother and the deaths according to the residence of the deceased, information on the degree of omission, as well as special estimates for infants and children from 1 to 4 years.
The population data (P) by sex, age and department of residence for the same period were also obtained from DANE. Life table deaths are thus estimated in the following way:

$$\hat{D}_{t,s,a,g} = \mu_{t,s,a,g} \times P_{t,s,a,g} \qquad (1)$$

where $\mu_{t,s,a,g}$ is the mortality rate taken from the DANE mortality tables.

(b) The obtained deaths ($\hat{D}_{t,s,a,g}$) are then distributed pro rata over the cause-specific (C) distribution by age and sex at the departmental level, as yielded by the Colombian vital statistics microdata. Accordingly, we obtain age- and sex-specific deaths for the seven large groups of causes at the departmental level, i.e.:

$$\hat{D}_{t,s,a,g,C} = \hat{D}_{t,s,a,g} \times \frac{D_{t,s,a,g,C}}{D_{t,s,a,g}} \qquad (2)$$

where $D$ without a hat denotes registered deaths. In other words, we assume that there is no bias in the registration of causes of death, suggesting that the essential problem in the Colombian death registries is overall coverage. Digging deeper into this issue, however, is beyond the scope of this paper.

(c) Lastly, we redistribute the corrected death data at the departmental level to the municipal level (m). This was done by applying them pro rata to the share of deaths in each municipality:

$$\hat{D}_{t,s,a,m,C} = \hat{D}_{t,s,a,g,C} \times \frac{D_{t,s,a,m,C}}{D_{t,s,a,g,C}} \qquad (3)$$

Having obtained our small-area dataset of deaths according to sex, age group and cause-of-death category across a 17-year period, our next step consists in calculating the age-, sex- and cause-specific death rates for the major cause-of-death categories for each municipality. In order to minimize yearly random fluctuations, we aggregate the data for the following periods: 1998-2000, 2001-2003, 2004-2006, 2007-2009, 2010-2012 and 2013-2014. [3] To allow for comparison over time and across municipalities, we use the direct standardization method to calculate Standardized Mortality Rates (SMR). This is done by first multiplying the age- and sex-specific mortality rates of each municipality, $M_{t,s,a,m,C} = \hat{D}_{t,s,a,m,C} / P_{t,s,a,m}$, by a standard population $P^{COL}_{a}$, taken here as the national population age structure of both sexes combined according to the 2005 census. The population is distributed as per the following age structure: 0, 1-4, 5-9, …, 80+. The open group is chosen to coincide with the life tables and populations published by DANE. The sum of the products is then divided by the total population of Colombia $P^{COL}$ to obtain the cause-specific SMR for each municipality:

$$SMR_{t,s,m,C} = \frac{\sum_{a} M_{t,s,a,m,C} \times P^{COL}_{a}}{P^{COL}} \qquad (4)$$

Before embarking on the spatial analysis, however, we first analyze whether the effect on the SMR of the adjustments made to account for under-registration of mortality changed over time and differed between spatial units. Given that the adjustments are only sex- and age-specific and under-registration is worst among the youngest ages, those causes of death with a young age profile (perinatal and infectious disease mortality) are most affected by the corrections. However, neither the absolute and relative differences between the two rates nor the geographical pattern change much over time (as an example, see Fig. 2 for total mortality). We therefore decided to correct the death statistics by taking into account the different regional context. The level of coverage of death registries is related to a region's infrastructure and available (financial) resources. These include the presence of hospitals and the availability of qualified personnel to carry out the registration properly and on time, which is not always the case in rural areas and small towns, which encounter difficulties in accessing the internet that limit completing mortality records at all or on time. Furthermore, regions with a large indigenous population do not record all deaths due to cultural factors, while there are also instances where deaths are registered but not entered into the DANE database. One study estimated that 90% of maternal and perinatal deaths in Colombia in 2002 were registered, with Chocó (64%), Sucre and La Guajira (around 75%) being the departments with the highest under-registration of such deaths (DANE 2006). Conversely, the more developed departments of the Andean mountain range, the coffee-growing region and the large urban centers of Bogotá, Cali, Medellín and Barranquilla do not require major data corrections. The more accurate registration of deaths in these regions is linked to a greater presence of the central government.

Fig. 2 […] (1998-2014). Source: Own calculations based on the microdata obtained from Vital Statistics and the denominators supplied by DANE. *See main text for how the municipal deaths were corrected.

In Table 1, we provide descriptive statistics of the cause-specific SMR at the national level before and after the data correction we implemented, and in Appendix 1 the results from the linear regressions between the two. Briefly, results show that the municipalities located in the departments of Chocó (on the Pacific coast), La Guajira (on the Atlantic coast bordering Venezuela) and Amazonas (in the southeast) were most affected by the corrections. As a consequence, a number of high mortality clusters emerged that the epidemiological literature on Colombia had previously identified using morbidity registers (Ochoa and Osorio 2006), but which had gone unnoticed in the analysis of vital records. Rodríguez-García (2007) identified levels of under-reporting at the departmental level for the year 2000 similar to those we obtained.
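The correction steps (a) to (c) and the direct standardization described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`correct_deaths`, `direct_smr`), the dictionary layout and all numbers are hypothetical, and the example handles a single year, sex and age group within one department.

```python
def correct_deaths(rate_dept, pop_dept, reg_dept_by_cause, reg_muni_by_cause):
    """Corrected deaths per (municipality, cause) for one year, sex and age
    group within one department, following steps (a)-(c)."""
    # Step (a): corrected departmental deaths = life-table rate x population.
    d_hat_dept = rate_dept * pop_dept
    reg_dept_total = sum(reg_dept_by_cause.values())
    corrected = {}
    for cause, reg_cause in reg_dept_by_cause.items():
        if reg_cause == 0:
            continue  # no registered deaths to use as a distribution key
        # Step (b): split over causes pro rata to registered deaths.
        d_hat_cause = d_hat_dept * reg_cause / reg_dept_total
        # Step (c): split over municipalities pro rata to registered deaths.
        for muni, reg_muni in reg_muni_by_cause[cause].items():
            corrected[(muni, cause)] = d_hat_cause * reg_muni / reg_cause
    return corrected


def direct_smr(deaths_by_age, pop_by_age, std_pop_by_age, per=100_000):
    """Directly standardized rate: age-specific rates weighted by a standard
    population, divided by the total standard population."""
    total_std = sum(std_pop_by_age.values())
    weighted = sum(
        deaths_by_age[a] / pop_by_age[a] * std_pop_by_age[a]
        for a in std_pop_by_age
    )
    return per * weighted / total_std


# Hypothetical department: life-table rate 0.01, population 10,000,
# but only 80 deaths registered across two causes and two municipalities.
corrected = correct_deaths(
    rate_dept=0.01,
    pop_dept=10_000,
    reg_dept_by_cause={"neoplasms": 30, "external": 50},
    reg_muni_by_cause={
        "neoplasms": {"A": 20, "B": 10},
        "external": {"A": 10, "B": 40},
    },
)

# Hypothetical municipality with two age groups and a toy standard population.
smr = direct_smr(
    deaths_by_age={"0-39": 2, "40+": 8},
    pop_by_age={"0-39": 1000, "40+": 500},
    std_pop_by_age={"0-39": 600, "40+": 400},
)
```

By construction the corrected municipal deaths add up to the corrected departmental total, which is the property the pro-rata redistribution is meant to preserve.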
In order to explain the observed sub-national mortality patterns, information was obtained for 1087 municipalities [4] from the 2005 census and administrative records on the following known health determinants (see also Appendixes 2 and 3 for more detail on their measurement, data sources and descriptive statistics for the high and low mortality clusters): population density and degree of urbanization (see Cyril et al. 2013 for references), temperature (Patz and Olson 2006), immigration and ethnicity [5] (Nazroo 1998), poverty (Fiscella and Franks 1997; Rodríguez-Acosta 2016), level of education (Holmes and Zajacova 2014), public investment in health (Jaba et al. 2014) and in education (Holmes and Zajacova 2014), water treatment (Pérez-Vidal et al. 2016) and health insurance coverage (Lloyd-Sherlock et al. 2012). Lastly, we also include municipal surface area as a proxy for geographical isolation (as most large municipalities are located in the Amazon area), which, in turn, is associated with fewer structural resources like social services and health care. These are known determinants of intimate partner violence (Lanier and Maume 2009) and infectious diseases (Walker et al. 2016).

[4] Although we corrected the mortality data for 1115 local administrative entities in Colombia for the 2004-2006 period (see Appendix 1), data on the selected explanatory variables were not available for the 18 non-municipalized areas and 3 municipalities. In addition, another 7 municipalities were excluded because they did not have any neighboring entities.

[5] For the purpose of the study, the variable is broken down into Afrodescendants (mainly located on the Atlantic and Pacific coasts), the indigenous population (distributed throughout the country but particularly in jungle areas in enclaves -resguardos- and generally populated by members of the same ethnic group) and mestizos/whites (the country's economically and politically dominant group and overrepresented in the country's urban areas).

Spatial Cluster Analysis
To study the possible existence of spatial dependence in the sex- and cause-specific SMR, we use the spatial autocorrelation indicators Global Moran's I and Local Moran's I. [6] Spatial statistics is an analytical tool that treats the data of the municipalities as parts of a whole, a territorial structure where neighborhood relations are established and where it is possible to analyze to what extent statistical association exists between the values of a variable that is distributed in the territory. That is why it is necessary, before calculating the indicators, to establish a criterion based on spatial distance that clearly determines which municipalities are neighbors. Based on this criterion, a matrix of weights is constructed that relates each municipality to all others, which then serves to calculate the value of the spatial indicator. We experimented with Rook's, Queen's and nearest neighbor (NN) case contiguity, as well as with first-order and second-order (with and without the lower order) neighbors.
The Global Moran's I indicator summarizes the overall spatial associations within a territory (here Colombia) according to an analyzed variable (Anselin 1995). With its calculation, a global autocorrelation test can be performed, with the null hypothesis that the variable shows spatial independence (i.e., the values of a variable do not depend on those of its neighbors). There are several alternatives to estimate the probability that the distribution of the data is random, but here we use an approximation to the distribution of Global Moran's I obtained from random permutations (specifically 999 permutations, which yields a minimum attainable pseudo p-value of 0.001). However, one weakness of the Global Moran's I is that it averages out local variations in the level of spatial association. The Local Moran's I statistic was therefore conceived to identify local patterns of association (hot spots) (ibid.), whereby the obtained value provides evidence of either:
• Local positive spatial autocorrelation, i.e., regions with high values of a variable surrounded by neighbors with similarly high values (HH clusters, known as hotspots) and regions with low values surrounded by regions that also have low values (LL clusters, known as coldspots); or
• Local negative spatial autocorrelation, i.e., regions with high values of a variable surrounded by neighbors with low values (HL clusters) and vice versa (LH clusters).
Thus, positive spatial autocorrelation indicates the presence of clusters of similar values in the territory and is undoubtedly information that will help us to locate the mortality patterns due to similar characteristics throughout the Colombian territory.
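As an illustration of the two indicators, the following minimal Python sketch computes Global Moran's I with a 999-permutation pseudo p-value and assigns each unit to an HH, LL, HL or LH quadrant. It assumes row-standardized first-order contiguity weights; the toy chain of six units, the values and the function names are hypothetical, and the quadrant labels are not tested for local significance as a full Local Moran analysis would do.

```python
import random


def row_standardize(neighbors):
    """neighbors: dict unit -> list of neighboring units.
    Returns dict unit -> {neighbor: weight}, each row summing to 1."""
    return {i: {j: 1.0 / len(js) for j in js} for i, js in neighbors.items()}


def global_moran(values, weights):
    """Global Moran's I for values (dict unit -> value)."""
    n = len(values)
    mean = sum(values.values()) / n
    z = {i: v - mean for i, v in values.items()}
    s0 = sum(w for row in weights.values() for w in row.values())
    num = sum(w * z[i] * z[j] for i, row in weights.items() for j, w in row.items())
    den = sum(d * d for d in z.values())
    return (n / s0) * num / den


def permutation_test(values, weights, n_perm=999, seed=0):
    """One-sided pseudo p-value: share of random relabelings whose I is at
    least as large as the observed one (the observed one counts as a draw)."""
    rng = random.Random(seed)
    observed = global_moran(values, weights)
    ids, vals = list(values), list(values.values())
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(vals)
        if global_moran(dict(zip(ids, vals)), weights) >= observed:
            extreme += 1
    return observed, (extreme + 1) / (n_perm + 1)


def local_quadrants(values, weights):
    """Label each unit by the sign of its own deviation and of its spatial lag."""
    mean = sum(values.values()) / len(values)
    z = {i: v - mean for i, v in values.items()}
    labels = {}
    for i, row in weights.items():
        lag = sum(w * z[j] for j, w in row.items())
        labels[i] = ("H" if z[i] > 0 else "L") + ("H" if lag > 0 else "L")
    return labels


# Toy example: six units on a line, high rates at one end and low at the other.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
smr = {0: 10.0, 1: 9.0, 2: 8.0, 3: 2.0, 4: 1.0, 5: 0.0}
w = row_standardize(neighbors)
moran_i, pseudo_p = permutation_test(smr, w)
quadrants = local_quadrants(smr, w)
```

With 999 permutations the smallest attainable pseudo p-value is 1/1000, which is the 0.001 mentioned in the text.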

The Mann-Whitney nonparametric test for analyzing the exogenous characteristics of the mortality hotspots and coldspots
To gain an understanding of the nature of the spatial clusters with positive high and low spatial autocorrelation (HH and LL), we apply several additional statistical procedures. For the analyses, we only consider the 2004-2006 period, as a significant number of contextual factors are available locally for that period because they can be derived from the 2005 census. As our initial objective is to assess whether significant differences exist in the distribution of independent, potentially explanatory, variables between both spatial groupings, these variables are first subjected to a normality test using the Kolmogorov-Smirnov test with the Lilliefors correction. Since they all present a level of significance below 0.001, the hypothesis of a normal distribution is rejected and, instead of the usual parametric tests (ANOVA or t test), we use the Mann-Whitney U test. [7] This is a nonparametric test comparing two independent samples that has the advantage of not needing a specific a priori distribution but uses the ordinal distribution of the tested variable (in this case each of the selected regressors). It is used to compare the medians of two groups and determines whether or not the differences between the two are due to chance. Specifically, we aim to show whether the distribution of the explanatory variables is significantly different between the positively correlated clusters of high and low mortality for total mortality and the first five cause-of-death categories (i.e., excluding ill-defined and remaining natural causes due to the lack of a clear etiology) for each sex. From the U tests we retain the level of significance of the test and the median of each variable. Subsequently, we apply Spearman's rank correlation coefficient, which is not subject to the restriction of normality, to obtain the strength and direction of the association at the municipal level between the independent variable and the SMRs in each of the HH and LL clusters.
The results of this analysis are described in Sect. 3.3.
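The two rank-based procedures described in this section can be sketched in plain Python as follows. This is an illustrative sketch, not the authors' code: the U test uses the normal approximation without tie or continuity correction (a simplifying assumption), the function names are hypothetical, and the covariate values for the HH and LL clusters are invented.

```python
import math


def average_ranks(values):
    """1-based ranks, with ties receiving the average of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """U statistic and two-sided p-value (normal approximation, no tie
    correction) for two independent samples."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    return u, math.erfc(abs(z) / math.sqrt(2))


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(list(x)), average_ranks(list(y))
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical poverty index in municipalities of an HH and an LL cluster.
poverty_hh = [62, 70, 58, 75, 66, 80]
poverty_ll = [30, 41, 35, 28, 45, 38]
u_stat, p_value = mann_whitney_u(poverty_hh, poverty_ll)

# Hypothetical covariate and SMR values within one cluster.
rho = spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```

In practice a statistical package with exact or tie-corrected p-values would be preferable for small clusters; the sketch only makes the logic of the comparison explicit.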

Using Spatial Lag and Spatial Durbin models to explain spatial patterns in mortality
One disadvantage of analyzing HH and LL mortality clusters is that only a small proportion of the municipalities are analyzed (see Appendix 3). Moreover, no consideration is made regarding any possible spatial correlation of the covariates. One way to get around this is by first applying a spatial lag model and subsequently a spatial Durbin model. [8] A spatial lag model can be expressed as:

$$y = \alpha \iota_n + \rho W y + X\beta + \varepsilon, \quad \text{where } \varepsilon \sim N(0, \sigma^2 I_n) \qquad (5)$$

Here $y$ is an n × 1 vector denoting the dependent variable for the n = 1087 municipalities, in this case cause-specific mortality; $\iota_n$ is the n × 1 vector of ones associated with the intercept $\alpha$; $\rho$ is the spatial autoregressive coefficient associated with the lagged dependent variable $Wy$, with $W$ being the n × n spatial weight matrix.

In turn, the spatial Durbin model is applied to consider both spatial autocorrelation and spatial dependence in a set of explanatory variables, which are added as spatial lags (but usually only when significant):

$$y = \alpha \iota_n + \rho W y + X\beta + WX\theta + \varepsilon, \quad \text{where } \varepsilon \sim N(0, \sigma^2 I_n) \qquad (6)$$

$X$ is an n × k matrix of k explanatory variables related to the k × 1 parameter vector $\beta$; $\theta$ is a k × 1 vector of effects of the spatially lagged covariates, $WX$. The error term $\varepsilon$ is normally distributed with mean 0 and variance $\sigma^2 I_n$, with $I_n$ an n × n identity matrix, following LeSage and Pace (2009).

The main difference between the spatial lag and Durbin models is that the latter includes not only endogenous interaction relationships ($Wy$), but also the exogenous interaction relationships that enter as spatially lagged explanatory variables ($WX$). This way it is possible to assess if and how the specific mortality cause measured in a specific unit i is related to the features of its neighbors. Initial model covariate selection was conducted through a best subset selection approach (Heinze et al. 2018), where covariates are progressively deleted by evaluating each model's performance (both spatial lag and spatial Durbin models) through Akaike's information criterion (AIC) (Akaike 1973), and the final set of variables is checked for collinearity through the variance inflation factor (VIF), which has to score below 4.0 (Fox and Weisberg 2018). The appropriateness of the spatial lag and Durbin models against spatial error alternatives was assessed using the Lagrange multiplier tests of Anselin (1988). We only apply these models to those causes of death where spatial autocorrelation is present in the dependent variable for the period 2004-2006 (see Table 2). When this is not the case, an OLS regression is conducted.

Table 2 Global Moran's I according to period and cause of death. Colombian municipalities (1998-2014). Source: Own calculations based on the microdata obtained from Vital Statistics and the denominators supplied by DANE. *Significant at p < 0.01. In italics: time period and causes of death studied in explanatory analyses. Columns: 1998-2000, 2001-2003, 2004-2006, 2007-2009, 2010-2012.
Evaluation metrics are included for each model and comprise the Lagrange multiplier test (LM test) for residual autocorrelation, which determines whether the model residuals are spatially autocorrelated; the AIC for the selected model and the OLS counterpart.
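To make the role of the weight matrix in the two models concrete, the sketch below builds the spatially lagged dependent variable (Wy) and the regressor matrix of a spatial Durbin specification (intercept, X and WX) from a neighbor list, assuming row-standardized weights. The function names and toy data are hypothetical, and the sketch deliberately stops short of estimation: because Wy is endogenous, the models discussed here are normally fitted by maximum likelihood in dedicated software rather than by regressing y on these columns with OLS.

```python
def spatial_lag(values, neighbors):
    """Row-standardized spatial lag: for each unit, the mean of the values
    observed in its neighbors (neighbors[i] lists the neighbors of unit i)."""
    return [sum(values[j] for j in nbrs) / len(nbrs) for nbrs in neighbors]


def durbin_regressors(X, neighbors):
    """Rows [1, x_1, ..., x_k, Wx_1, ..., Wx_k] for a spatial Durbin model."""
    k = len(X[0])
    lagged_cols = [
        spatial_lag([row[c] for row in X], neighbors) for c in range(k)
    ]
    return [
        [1.0] + list(X[i]) + [col[i] for col in lagged_cols]
        for i in range(len(X))
    ]


# Toy example: three municipalities on a line (0 - 1 - 2), one covariate.
neighbors = [[1], [0, 2], [1]]
y = [10.0, 20.0, 40.0]   # hypothetical SMR values
X = [[1.0], [2.0], [3.0]]  # hypothetical covariate

wy = spatial_lag(y, neighbors)
design = durbin_regressors(X, neighbors)
```

Dropping the WX columns from `design` gives the regressors of the spatial lag model, which is why the Durbin model can be read as its extension.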

The interpretation of the spatial Durbin model
The idea behind the spatial Durbin approach is that the effects of the explanatory variables are not only felt in the area where they are measured (as is the case for the spatial lag model), but that there is a feedback process through which their effects can extend to and from neighboring areas. A simple example would be the presence of a large health care facility whose intake capacity exceeds that of the area where it is located, thus benefiting all its neighbors. Thus, the spatial Durbin model allows separating the direct effect of an explanatory variable on the dependent variable from the indirect effect (LeSage and Pace 2009). Isolating the dependent variable, Eq. (6) can be rewritten as:

$$y = (I_n - \rho W)^{-1}(\alpha \iota_n + X\beta + WX\theta + \varepsilon) \qquad (7)$$

where the partial derivative with respect to the i-th explanatory variable can be expressed as:

$$\frac{\partial y}{\partial X_i} = (I_n - \rho W)^{-1}(I_n \beta_i + W \theta_i) \qquad (8)$$

with $\partial y / \partial X_i$ an n × n matrix, $\rho$ the spatial autoregressive coefficient, and $\beta_i$ and $\theta_i$ the coefficients of the i-th explanatory variable and of its spatial lag. The direct effects are the average of the diagonal elements of $\partial y / \partial X_i$, whereas the average of the off-diagonal elements of $\partial y / \partial X_i$ represents the indirect effects. The direct effect represents the expected average change, across all observations, in the dependent variable of a particular area due to an increase of one unit in a specific explanatory variable in that same area (including feedback effects), and the indirect effects represent how changes in the independent variable arising in a particular area influence the dependent variable in all other neighboring areas (spillover effect). It should be noted that some independent variables may not have an indirect effect, as we only separate the indirect effect from an independent variable's total effect when the additional lagged explanatory variable improves the model fit (i.e., the AIC value is the same or lower). In these instances, the direct effect equals the total effect.
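The decomposition into direct and indirect effects can be illustrated numerically. The sketch below assumes the standard form of the effects matrix in LeSage and Pace (2009), i.e. the inverse of (I − ρW) multiplied by (Iβ_i + Wθ_i), and summarizes the indirect effect as the average row sum of the off-diagonal elements, which is one common convention. The helper names, the three-unit weight matrix and the coefficient values are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inverse(m):
    """Matrix inverse by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * c for v, c in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def sdm_effects(W, rho, beta_i, theta_i):
    """Average direct, indirect and total effects of one covariate."""
    n = len(W)
    a = [[float(i == j) - rho * W[i][j] for j in range(n)] for i in range(n)]
    b = [[beta_i * float(i == j) + theta_i * W[i][j] for j in range(n)]
         for i in range(n)]
    s = matmul(inverse(a), b)  # n x n matrix of partial derivatives
    direct = sum(s[i][i] for i in range(n)) / n
    total = sum(sum(row) for row in s) / n
    return direct, total - direct, total


# Row-standardized weights for three units on a line (0 - 1 - 2).
W = [[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]
direct, indirect, total = sdm_effects(W, rho=0.5, beta_i=1.0, theta_i=0.5)
no_spillover = sdm_effects(W, rho=0.0, beta_i=2.0, theta_i=0.0)
```

With row-standardized weights the average total effect reduces to (β_i + θ_i)/(1 − ρ), and when both ρ and θ_i are zero the direct effect equals β_i and the indirect effect vanishes, matching the statement that the direct effect then equals the total effect.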

Results
Colombia is currently characterized by a decline in mortality levels at regional and municipal levels and shows certain stability in the territorial pattern in this reduction (see the coefficient of variation (CV) in Table 1). We do not rule out that the latter result is due to improvements that have been made in the mortality register since 2010 as variance increased during the latter two periods for all causes of death. This is an aspect worthy of a more in-depth study. However, strong territorial disparities still exist in terms of life expectancy at birth between the different departments (a maximum of 10 years in the case of males and 7 in the case of females; see Appendix 5). The spatial hierarchy of life expectancy at the departmental level remained at constant levels since the late 1980s (Cristancho 2017). This territorial stability makes it possible to identify a series of departments in the northern Andean region, Bogotá, the Atlantic coast and more urbanized departments as those that experienced higher life expectancies over the last decades. In contrast, all the departments to the east of the Andean mountain range located in the Amazonian area, those bordering Ecuador and the Pacific coast are the areas that are lagging behind in life expectancy.
The department-level results become more nuanced when we descend to the municipal scale and address the causes of death. The correction of deaths implies an increase in mortality disparities between municipalities in the categories infectious diseases, perinatal mortality, other causes and ill-defined diseases. This is verified through the time trend of the coefficient of variation (CV; Table 1). Conversely, neoplasms, diseases of the circulatory system and external causes show marked stability and even a slight reduction in the same indicator. Between 1998 and 2014, a generalized reduction in SMR can be observed in both sexes, affecting most causes, with the exception of neoplasms among men and the remaining causes in both sexes.
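As a reminder of how the indicator is read, the coefficient of variation is the standard deviation of the municipal rates divided by their mean, so a falling CV signals territorial convergence. A minimal sketch follows. The rates are invented for illustration, and the use of the population standard deviation is an assumption, since the text does not state which variant underlies Table 1.

```python
from statistics import mean, pstdev

def coefficient_of_variation(rates):
    """Return the standard deviation of the rates relative to their mean."""
    return pstdev(rates) / mean(rates)

# Hypothetical standardized mortality rates for five municipalities in two periods
early_period = [420.0, 510.0, 610.0, 380.0, 700.0]
late_period = [400.0, 450.0, 520.0, 390.0, 560.0]

print(round(coefficient_of_variation(early_period), 3))
print(round(coefficient_of_variation(late_period), 3))
```

Because the CV is scale-free, it can be compared across causes of death whose rates differ greatly in level.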

Spatial autocorrelation in Colombian mortality
Based on the Global Moran's I indicator, results provide a clear indication of spatial autocorrelation in Colombia during the period 1998-2014 for both all-cause and cause-specific mortality (Table 2). For neoplasms, external causes and ill-defined causes there is statistically significant spatial autocorrelation for both men and women in all of the periods studied. Among men, this also applies to infectious diseases, diseases of the circulatory system and total mortality. This is despite the overall reduction in mortality and some territorial convergence, as reflected in the decrease in the CV for total mortality. In other words, there is a clear continuity over time in the spatial patterns observed. External causes showed the highest level of spatial association.
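For readers less familiar with the indicator, the following minimal sketch computes the Global Moran's I from a vector of rates and a spatial weight matrix. It is an illustration only: the function name, the four-area layout and the values are hypothetical, and the significance test (usually based on random permutations) is not included.

```python
def global_morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weight matrix."""
    n = len(values)
    mean_value = sum(values) / n
    deviations = [v - mean_value for v in values]
    # Cross-products of deviations between every pair of areas, weighted by w_ij
    cross_products = sum(
        weights[i][j] * deviations[i] * deviations[j]
        for i in range(n)
        for j in range(n)
    )
    total_weight = sum(sum(row) for row in weights)
    sum_of_squares = sum(d * d for d in deviations)
    return (n / total_weight) * cross_products / sum_of_squares

# Hypothetical example: four areas along a line, binary contiguity weights
line_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = [10.0, 9.0, 2.0, 1.0]     # similar values sit next to each other
alternating = [10.0, 1.0, 10.0, 1.0]  # high and low values alternate

print(global_morans_i(clustered, line_weights))
print(global_morans_i(alternating, line_weights))
```

The clustered layout yields a positive value and the alternating layout a negative one, which is the contrast the indicator is designed to capture.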
Mapping the municipal results allows the identification of mortality hotspots as well as contiguous areas with low levels of mortality (see Fig. 3 for all-cause mortality and the Supplementary Figures S1-S7 in the OSF Preprints repository (https://osf.io/n5pk8/) for the specific causes of death). In the first place, the general trend in total mortality among both sexes is a decrease in areas with autocorrelation. The slight break in the trend since 2010 may be related to spatial variations in the Colombian Vital Statistics registry, but also to the effects associated with the increase in infectious diseases in specific local areas of the country. The fact that the Global Moran's I is higher among men can be easily explained by the leading role that men have regarding external causes. As the general level of mortality is the product of different cause-specific trends, the overall pattern conceals many interesting panoramas, including territorial contrasts in the specific causes that we analyzed. These are succinctly described below.
Infectious diseases show several clusters: the Colombian Pacific zone, the south of the Cauca Valley, the north of Nariño, Guajira, Orinoquia (in the east of the country) and Amazonia all present high levels of clustering. Some of these areas, as is well known to Colombian epidemiologists, are jungle areas in which diseases such as malaria and dengue are transmitted, access to drinking water is scarce and health intervention is not always optimal (Padilla et al. 2017; Quintero et al. 2014). During the period 2010-2012, there was a rise in malaria and dengue mortality in the eastern part of the Amazon region, which is reflected in a very clear increase in the Global Moran's I value for infectious diseases (for men it goes from 0.0650 in 2007-2009 to 0.2058 in 2010-2012). The change is also observed in women, but is less pronounced. Some authors have highlighted a These mining communities, basically composed of men, are mainly located in difficult to access jungle regions.
As to cancer, the Andean zones located in the center of the country have the most aged population. As cancer is a degenerative disease related to having survived to older ages, this area in particular shows the highest levels of clustering of cancer mortality. Moreover, the spatial structure is maintained throughout the study period. The municipalities in blue correspond to areas with low life expectancy where cancer mortality is much lower, in part also due to a lack of diagnosis and the near absence of smoking.
The geographical pattern of circulatory system diseases is very similar between time periods, although the degree of clustering is lower in the more recent periods. The spatial distribution of these causes is associated with areas that still lag behind in the ET. Men have a higher Global Moran's I than women.
Regarding perinatal mortality, there is clustering in Orinoquia, Amazonas and certain areas in the Pacific region and Nariño, where health services are not accessible to all communities (Cristancho 2017). The municipalities in this part of the country are extensive in territory, which makes access to health care difficult, especially for infants living in isolated communities in the Amazon rainforest and along the Pacific coast. The spatial autocorrelation among males is higher. A more detailed analysis of the spatial structure showed some recurrence of incomplete coverage in some areas, even after the corrections were applied. These problems appear to be related to the health care that municipalities with hospitals provide for outlying (often rural) areas, which inadvertently inflates the death registration of certain urban areas. This assertion should, however, be verified empirically in the future.
Perhaps the most interesting maps are those for external causes, as they maintain the highest degree of clustering and continuity in the areas involved. Initially located in Antioquia and Córdoba, where armed conflict at different levels (guerrillas, paramilitaries, drug trafficking and illegal mining) overlaps throughout the studied period, these areas are maintained during the period and extended to the departments east of the mountain range, to the plain (Meta) and to the southern guerrilla area (Caquetá and Putumayo), and later consolidated in Orinoquia and Amazonas. Nor is the mortality linked to urban delinquency negligible, which is predominantly male in character. Also worth highlighting is the slightly different geography of female mortality from external causes: the Antioquia and Córdoba areas are no longer significant, while municipalities in Meta, Caquetá and Putumayo are also characterized by high levels of mortality from non-natural causes (although no longer during the last period studied, 2013-2014, except for Meta). The peculiarity of the female trend is explained by the masculinization of the areas where the violence is concentrated. In any case, the gradual reduction of the Global Moran's I associated with external causes is good news for Colombia.
Concerning the remaining natural causes and ill-defined causes, results indicate a clustering of under-registration (in red color) located in departments such as Chocó, Cauca, La Guajira and the municipalities in Orinoquia and Amazonia. Nevertheless, the decrease in the Global Moran's I indicator is a sign of progress as this is in line with the improvement that has been made in the quality of vital statistics in Colombia in recent years (Rodríguez-García 2007).
Finally, to evaluate the effect of using uncorrected vital records in the calculation of SMR, we repeated the analyses for the first and last period using the Rook criterion of order 1 for the uncorrected data and compared the results. Results show that uncorrected data provide a biased and inaccurate view of the spatial structure of mortality in Colombia. Indeed, a high degree of under-registration in municipalities may yield contiguous areas with similar mortality rates and thus produce an inflated (and misleading) Global Moran's I (see also Appendix 6).
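The high-high (HH) and low-low (LL) clusters mapped in this section can be illustrated with a small sketch of the Local Moran's I. This is a simplified illustration rather than the procedure used to produce the maps: the function name and data are hypothetical, every area is assumed to have at least one neighbor, and the permutation-based significance test that normally decides which areas are retained as clusters is omitted.

```python
def local_moran_quadrants(values, weights):
    """Return (local I, quadrant label) per area, row-standardizing the weights."""
    n = len(values)
    mean_value = sum(values) / n
    std_dev = (sum((v - mean_value) ** 2 for v in values) / n) ** 0.5
    z = [(v - mean_value) / std_dev for v in values]
    results = []
    for i in range(n):
        row_total = sum(weights[i])  # assumed > 0: every area has a neighbor
        # Spatial lag: weighted average of the neighbors' standardized values
        lag = sum(weights[i][j] * z[j] for j in range(n)) / row_total
        local_i = z[i] * lag
        # First letter: the area's own level; second letter: its neighbors' level
        label = ("H" if z[i] > 0 else "L") + ("H" if lag > 0 else "L")
        results.append((local_i, label))
    return results

# Hypothetical example: six areas along a line, binary contiguity weights
line_weights = [
    [0, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 0],
]
rates = [9.0, 8.0, 7.0, 2.0, 1.0, 1.5]
for local_i, label in local_moran_quadrants(rates, line_weights):
    print(f"{label} {local_i:.2f}")
```

Areas labeled HH correspond to candidate hotspots and those labeled LL to candidate coldspots; HL and LH areas are spatial outliers.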

Explanations of the mortality hotspots and coldspots in Colombia in 2004-2006
Colombia has a heterogeneous geography that hosts very diverse ecosystems, a climate marked by extreme differences in altitude in populated spaces that extend from sea level to the tropical moorland at altitudes above 3000 m. At the sociocultural level, Colombia also has a very polarized social structure with strong inequalities that are spatially accentuated due to territorial-specific complex ethnic structures and the history of armed conflict that has plagued the country for more than sixty years, joined later by drug trafficking and illegal mining. This has generated a context of latent violence that, despite the recent peace process, has implied a disproportionate weight of deaths linked to external causes that are unequally distributed throughout the country. This heterogeneity implies the existence of multiple factors that can explain the links that emerge between the existence of spatial autocorrelation of different types of causes of death in certain territorial spaces. To better understand this complex geographical scheme, we use the Mann-Whitney U test to see whether the medians of the explanatory variables are significantly different between the high and low mortality clusters for total mortality and the first five cause-of-death categories in the 2004-2006 period (Appendix 3). Subsequently, for the significant variables (p ≤ 0.05) we obtain the Spearman's correlation coefficients to analyze the strength and direction of the association between the independent variables and the cause-specific SMRs among the municipalities in the high and low mortality clusters (Table 3).
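The two-step procedure described above (a Mann-Whitney U test between clusters, followed by Spearman's rank correlation) can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names and the data are hypothetical, only the test statistics are computed, and p-values are left out. In practice, library routines such as scipy.stats.mannwhitneyu and scipy.stats.spearmanr would normally be used to obtain them.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """U statistic of group_a: count of (a, b) pairs with a > b, ties counting 0.5."""
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[: len(group_a)])
    return rank_sum_a - len(group_a) * (len(group_a) + 1) / 2

def spearman_rho(x, y):
    """Spearman's rank correlation: the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return covariance / (var_x * var_y) ** 0.5

# Hypothetical surface areas (km2) of municipalities in HH and LL clusters
hh_area = [1200.0, 3400.0, 2800.0, 5100.0]
ll_area = [300.0, 450.0, 900.0, 1500.0]
print(mann_whitney_u(hh_area, ll_area))

# Hypothetical surface areas and mortality rates across cluster municipalities
area = [300.0, 450.0, 900.0, 1200.0, 2800.0]
smr = [40.0, 55.0, 50.0, 90.0, 120.0]
print(round(spearman_rho(area, smr), 3))
```

A U statistic close to its maximum (the product of the two group sizes) indicates that values in the first group are almost always larger than those in the second.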
Results are very similar for both sexes. Starting with infectious diseases, there is a clear geographical pattern, with surface area (used as a proxy of isolation) being positively associated, as the HH cluster municipalities are on average 3-4 times larger than the LL cluster ones. The HH cluster municipalities also tend to be more populated and less advanced in the context of the Demographic Transition (i.e., higher fertility and proportionally fewer elderly). The association with low education is in the expected direction but relatively weak (−0.260 for men and −0.472 for women), only just statistically significant, and does not yield clear conclusions, as the cluster difference in the median equals only 4 percentage points. The ethnic variable is also significantly associated with mortality from infectious diseases. Results show that mortality hotspots are linked to a greater presence of Afro-descendants in the case of men (+1 percentage point) and of the indigenous population in both sexes (median of 15% for men and 11% for women vs. < 0.1% in LL cluster municipalities). Not surprisingly, the Spearman's correlation coefficient is high (0.676 for men and 0.643 for women). Results also show that greater public investment per capita in health and water supply is made in areas with high levels of infectious diseases, while HH mortality cluster municipalities also have on average higher investment in education.
Table 3 Spearman's rank correlation coefficient between the causes of death and the explanatory variables among Colombian municipalities in the HH and LL mortality clusters. C0 = total mortality; C1 = infectious diseases; C2 = neoplasms; C3 = circulatory system diseases; C4 = perinatal mortality; C5 = external causes. *p < 0.05, ** p
Between the HH and LL cancer mortality clusters, significant differences are observed in all variables. Although their contributions differ in intensity and sign, they follow the general scheme of demographic and socioeconomic development adapted to the geographical and socioeconomic peculiarities of Colombia, where, due to competing causes of death, cancer is linked to survival to older ages and therefore to low mortality. As the U test shows, HH clusters are made up of mainly Andean municipalities, with a lower average temperature and surface area, a more urban context (higher proportion residing in the main municipal town), more elderly people and internal migrants, fewer low-educated people, more people with a mestizo/white background and more access to health care through social security system contributions. Conversely, municipalities in LL clusters have a higher degree of poverty, greater per capita public investment in health, education and water supply, and a younger population. Finally, there are no important gender differences in the results.
The contextual factors associated with circulatory system disease mortality follow a similar pattern to that of cancer, although with several exceptions (the two settlement variables and public investment in health and water supply show no statistically significant association). In other words, variables related to greater urban, demographic and socioeconomic development are again positively associated with this cause. Of all variables, the proportion of people over 50 has the highest Spearman's correlation coefficient with the SMR of circulatory system diseases (0.694 for men and 0.759 for women).
Factors affecting perinatal mortality repeat to some extent what was observed for infectious diseases. Regarding the geography variables, the municipal surface area is positively associated and density negatively associated with SMR in both sexes. This can be explained by the indirect association with isolated areas, which have less government presence and more public health issues. According to the U test, the median surface area is approximately three times greater in HH clusters than in LL clusters. Other variables associated with high perinatal mortality rates are higher fertility, fewer older people, ethnicity (a similar pattern to that observed for infectious diseases), poverty and public investment in education. The association with the latter is positive because per capita investment is higher in less economically developed areas, particularly indigenous reservations, where perinatal mortality is high. Results for women are similar, except that the proportion of elderly is the only significant population variable. The highest Spearman's coefficient is between perinatal mortality and the proportion of indigenous people in the municipality (males 0.763; females 0.721).
Mortality from external causes presents very interesting associations, especially with respect to the geography and settlement variables. The U test shows that male mortality hotspots are linked to larger municipalities located at mid-mountain heights, with low densities and a rural character. This corresponds to the geographical space of the Andean foothills, places that are characterized by a predominance of jungle with poor government control and the presence of armed groups. Municipalities with a greater presence of immigrants are positively associated with external mortality in both sexes, showing in fact the highest Spearman's correlation coefficients (0.537 and 0.631 for men and women, respectively). A greater indigenous presence is also positively associated with external-cause mortality, although its explanatory power is low.
Lastly, we describe the results for total mortality. Its pattern is not very different from that of neoplasms and circulatory system diseases, as these two causes dominate the overall mortality structure. For instance, total mortality is negatively associated with temperature, which may be considered a proxy for altitude (see Appendix 2 note h): particularly in the case of men, most municipalities in HH clusters are located in urban areas of the Andean mountain range and the LL cluster municipalities in the lower lying coastal areas (Fig. 3). Internal migration, which also has a strong urban component, is also positively associated with total mortality. The same applies to being of mixed/white race and to health insurance. On the other hand, household poverty (measured as unsatisfied basic needs, particularly in housing) is negatively associated with mortality, but this is also more common in (healthier) urban areas. Public investment in education (negatively associated) and having an older population (positively associated) are also statistically significant. The latter association is stronger for women than for men, which is consistent with the fact that they have a higher life expectancy.

Explanations for the overall geographical mortality patterns in Colombia in 2004-2006
To better capture spatial dependence, causes of death were first tested for different neighborhood definitions. The first-order Rook criterion of contiguity, i.e., a stricter definition of neighbors in which neighboring units share more than a single point, was considered the best for establishing the neighborhood relations when Moran's I was high and comparable with that of a more complex weight matrix (men: all causes, cancer and circulatory system diseases; women: external causes). NN k = 4 was chosen instead as the spatial weight matrix for male external cause-of-death mortality and NN k = 3 for female cancer mortality (Appendix 4). We also found little evidence of a Modifiable Areal Unit Problem (see, e.g., Meliker and Sloan 2011), given the small differences in the results between the different spatial weight matrices. Table 4a, b presents the results for the spatial lag and Durbin models and their AIC measurements, and Table 5 those of the OLS approach when little to no spatial autocorrelation was present in the cause of death (only women: total mortality and circulatory system diseases; both sexes: infectious diseases and perinatal mortality). The fact that there are fewer spatial regression models for women is a clear indication of the lower importance of geography as an explanatory factor for small-area differences in mortality. According to the AIC, the spatial Durbin model is the best approach wherever spatial autocorrelation appears to be positive and significant (Moran's I > 0 and p-value < 0.05).
Table notes: Public investment is in thousands of Colombian pesos per capita. See also Appendix 2 for exact definitions. In the spatial Durbin models, the indirect effect of an explanatory variable is only separated from the total effect when the model fit improves (i.e., the AIC value decreases) as a result. If not, the variable only has a direct effect that is equal to the total effect.
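The two families of neighborhood definitions compared above, contiguity and k nearest neighbors (NN), can be sketched as follows. This is a hypothetical illustration: real municipalities are irregular polygons, so a regular grid stands in for the Rook criterion here, and the function names, grid size and centroid coordinates are all assumptions.

```python
from math import dist

def rook_neighbors(n_rows, n_cols):
    """First-order Rook contiguity on a regular grid: cells that share an edge."""
    neighbors = {}
    for r in range(n_rows):
        for c in range(n_cols):
            candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            neighbors[(r, c)] = [
                (i, j) for i, j in candidates if 0 <= i < n_rows and 0 <= j < n_cols
            ]
    return neighbors

def knn_neighbors(centroids, k):
    """The k nearest neighbors of each unit by Euclidean distance between centroids."""
    neighbors = {}
    for name, point in centroids.items():
        others = sorted(
            (other for other in centroids if other != name),
            key=lambda other: dist(point, centroids[other]),
        )
        neighbors[name] = others[:k]
    return neighbors

# Rook contiguity on a 3 x 3 grid: cells touching only at a corner are not neighbors
rook = rook_neighbors(3, 3)
print(rook[(0, 0)])
print(rook[(1, 1)])

# k nearest neighbors (k = 2) for five hypothetical municipal centroids
centroids = {"A": (0, 0), "B": (1, 0), "C": (0, 2), "D": (5, 5), "E": (6, 5)}
knn = knn_neighbors(centroids, k=2)
print(knn)
```

Under the Rook criterion the number of neighbors varies from unit to unit, whereas NN weights give every unit exactly k neighbors, which matters for large, isolated units with few shared borders.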
Male total mortality shows a clear positive spatial association with the common characteristics of Colombia's urban population (mestizo/white background and health insured) and a negative association with rural areas. The municipalities with the largest surface area are characterized by a positive association. The direct negative effect of rural population on general mortality is more than offset by a large positive indirect effect of neighboring rural municipalities. This points to an effect not measured in our model, namely the location of health care facilities in urban areas. The coefficients linked to migration are negative as a direct effect (lower mortality in centers with little immigration) and positive as an indirect effect, which suggests a higher mortality in the high-migration municipalities that surround the reference municipality.
The importance of the characteristics of neighboring municipalities is also observed for cancer, for both sexes. The spatial Durbin model shows a greater concentration in urban areas (direct negative effect of rurality and positive effect of the availability of health insurance). By contrast, the indirect effects associated with rurality are positive, which seems to indicate, as with general mortality, that urban enclaves surrounded by areas with a larger rural population tend to have a higher incidence of this cause of mortality, and that these rural areas may contribute to some of the cancer mortality in the urban areas where hospitals are located. In the case of women, cancer mortality is also associated with population density and mixed/white race. These variables are associated with advanced stages of the epidemiological transition in the more developed areas of Colombia. On the other hand, in the case of men the spatial lag model for this cause is characterized by a negative association with the Afro-Colombian population, which has a much younger structure than the country as a whole, and with high-migration areas (these municipalities tend to be the most advanced in terms of economic and social development). In contrast, and unlike for the other causes, the surface area of the municipality works in the opposite direction, although its significance is much lower.
Patterns of spatial dependence in male circulatory system disease are similar to those of neoplasms. The most significant direct effects are found in two characteristics associated with greater urban development, a greater proportion of the population being of mixed race/white race and with health insurance. The direct effect of rural population and immigration is in the same line as for cancer, but its indirect (neighborhood) effects are larger. Likewise, among the significant indirect effects, urban areas surrounded by more rural areas are characterized by higher circulatory system disease mortality.
With respect to the deaths linked to the external causes, there is a direct and indirect effect of temperature, which is highly negatively associated with altitude. There is of course no causal effect here, but results suggest that mortality is much higher in municipalities in the foothills (direct effect) surrounded by other municipalities at a higher altitude (indirect effect with a high negative association). This scheme corresponds to the classic model of mortality due to violent guerrilla warfare, which throughout the conflict in Colombia has been characterized by attacks on more populated municipalities in lower altitudes while maintaining their basecamps higher up in the Andean mountains.
The effect of the low educational levels of some populations that have historically constituted a local guerrilla and paramilitary recruitment sector, very close to the military actions carried out by these violent groups, goes along the same lines. Another important component of external-cause mortality is traffic accidents, and municipalities characterized by a low average socioeconomic status (of which education is a marker) tend to have higher rates of lethal motorcycle accidents (Martínez-Cabezas 2020). Effects are also stronger for men than for women. The exception is large municipalities in terms of area surrounded by likewise large municipalities (both direct and indirect effect, suggesting the extreme vulnerability of women in isolated areas).
Regarding the OLS results of the causes of death without statistically significant spatial autocorrelation, the explanatory variables explained between 15% (perinatal mortality, women) and 26% (total mortality, women) of the municipal variation in mortality (Table 5), although the AIC is relatively high in the latter model. This may suggest some overfitting. Turning to the specific results, there is some overlap with the earlier Mann-Whitney U tests that were performed on the HH and LL clusters. For instance, considering infectious diseases (for both sexes), the greater the surface area of the municipality, the higher the mortality. This can be explained by the fact that infectious diseases remain an important health issue in the large and isolated municipalities of the Amazon area. These areas also have large concentrations of indigenous people but at the same time fewer internal migrants (women only), who are more likely to live in healthier, more developed and economically dynamic areas. After controlling for all other significant factors, mortality levels are also higher in more aged municipalities (men only) and in those with less investment in health, but also where investment in education is higher. The latter association can be explained by the fact that per capita investment is higher in less economically developed areas where perinatal mortality is high. The association with health and education investment is also found for perinatal mortality, and the same geographical and settlement variables are significant, with the addition of temperature (although not with the expected positive association, possibly because respiratory diseases are located in the most populated areas of the Andean mountain range with lower temperatures). In the case of males, mortality rates are also positively associated with the proportion of Afro-descendants and indigenous people.
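To make the two fit measures mentioned above concrete, the following sketch fits an ordinary least squares model and reports the share of variance explained (read here as R-squared, which is an assumption about how the 15%-26% figures were obtained) together with the AIC. The function name and the data are hypothetical, and statistical packages may define the AIC with different additive constants, so only differences between models fitted to the same data are meaningful.

```python
import numpy as np

def ols_fit(X, y):
    """Ordinary least squares with an intercept; returns coefficients, R-squared and AIC."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    coefficients, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coefficients
    rss = float(residuals @ residuals)          # residual sum of squares
    tss = float(((y - y.mean()) ** 2).sum())    # total sum of squares
    r_squared = 1.0 - rss / tss
    n, k = design.shape
    # Gaussian log-likelihood at the maximum-likelihood variance estimate rss / n
    log_likelihood = -0.5 * n * (np.log(2 * np.pi * rss / n) + 1)
    aic = 2 * (k + 1) - 2 * log_likelihood      # k coefficients plus the error variance
    return coefficients, r_squared, aic

# Hypothetical data: one explanatory variable and a mortality rate for six areas
surface_area = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
mortality = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
coefficients, r_squared, aic = ols_fit(surface_area, mortality)
print(coefficients, round(r_squared, 3), round(aic, 2))
```

Because the AIC penalizes each additional parameter, a model with more covariates is preferred only if the gain in likelihood outweighs that penalty.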
Finally, turning to female total and circulatory system diseases, mortality is positively associated with living in a more urban context (higher proportion residing in the main municipal town), more elderly, but also an indigenous background (total mortality) and a mestizos/white background (circulatory system diseases). The latter is likely because among mestizos/white women a large proportion of circulatory system disease mortality is due to stroke, which is related to having survived to older ages. Finally, total and circulatory system mortality is also lower in municipalities where more is invested in health care and water treatment.

Discussion
The objective of this study was to analyze the possible existence of geographical clusters of mortality in Colombia, their evolution over time and the local factors that may explain the geography of cause-specific mortality in Colombia between 1998 and 2014. Our results, based on spatial indicators and cartography, show that the geographical distribution of causes of death in Colombia has a significant degree of spatial autocorrelation, in which male mortality from external causes stands out. Results also show that the applied data corrections considerably improved the quality of the municipal data, leading to the identification of clusters that had previously been identified in epidemiological studies which used morbidity registers. In this sense, the correction we applied is a valuable instrument for health policy.
Colombia experienced a general decrease in all-cause mortality over the 16-year period that was studied. This occurs in a context of some stability in the geographical patterns of mortality. From 2004-2006 among men, but already earlier among women as they are further ahead in the ET, cancer mortality began to contribute more significantly to the overall mortality pattern. At the same time, spatial autocorrelation between municipalities declined over time for all cause-of-death categories except male circulatory system diseases and perinatal mortality, and was highest in external causes. The latter was especially the case among men, with hotspots moving from the central Andean area to Orinoquia and the Amazon rainforest. The identification of geographical clusters of mortality in Colombia and their evolution over time will allow decision makers to prioritize those regions with higher mortality. On the other hand, particularly in LL cluster municipalities, non-communicable diseases now clearly dominate, indicating that these municipalities are in the third stage of the ET.
We tested the two most important ones, cancer and circulatory system diseases, and according to our explanatory analysis both appeared to be significantly associated with the mestizo/white population who have the highest life expectancy. However, we believe that this link tells us more about the competition between causes than about a real statistical association. Municipalities with a younger population structure correspond to those with a significant presence of indigenous and Afrodescendants, where lower life expectancies are associated with mortality from other causes, particularly infectious diseases and perinatal mortality. On the other hand, HH external mortality clusters are more associated with geographical factors linked to isolation and the presence of poorly accessible jungle areas, where the low presence of the Colombian public administration instigated violence by different actors of armed conflicts in Colombia. Likewise, the negative association between total mortality, cancer, circulatory system disease and external causes with poverty reinforces the idea that urban Andean populations are on the one hand the most developed and richest, but on the other also the geographical spaces in Colombia where extreme inequality is manifested.
While the Mann-Whitney nonparametric test provides interesting and useful insights into mortality extremes, it does so for a small subset of municipalities and does not consider spatial correlation of the covariates. We therefore also estimated spatial lag and spatial Durbin models. In particular, the interpretations of the latter results are richer than those of other conventional analytical approaches. To summarize the key findings, the spatial Durbin model is superior to the spatial lag model for all cause-of-death categories explored. Moreover, it allows the estimation of direct and indirect effects of covariates, which improves the overall spatial model. Indeed, results showed the existence of significant spillover effects, thus providing strong evidence to support our argument that the characteristics of surrounding municipalities are important determinants of mortality for different types of causes. That said, this effect is less discernible for women than for men, as well as for certain causes of death (infectious diseases and perinatal mortality). On the other hand, the prominent role of different variables of a geographical nature (isolation, temperature/altitude, urbanization/rural population) and their spatial lags as determining variables in mortality levels confirms the strong impact of the geographical context in explaining mortality patterns in Colombia.
In addition, the negative association between public investment in health and most causes of death except external causes in both the cluster analysis and the OLS models clearly shows the importance of efficiently providing widespread coverage and equitable access to health care. As summarized by Dávila-Cervantes and Agudelo-Botero (2018), this is to ensure that the most vulnerable populations (such as children, women, indigenous communities and persons of African descent) will have their basic needs covered. This would help to reduce inequalities that are mediated by variables such as gender, age, social status, education level and geographical area of residence, which have a differential and unjust effect on the health status of communities. But besides coverage, quality should also be the goal of the public services, with distinct goals for urban and rural areas. Although the decentralized transfers' system is complex in terms of structure and criteria of resource allocation, this would be facilitated by strengthening of the management capacity of the departments (Bonet et al. 2014).
A suggestion for future research is to examine more specific causes of death such as types of cancer, diabetes, homicides and traffic accidents. For instance, it is possible that a higher degree of spatial autocorrelation will be observed for some of these causes due to similar territorial patterns related to their etiology, such as nutritional factors (e.g., low fruit and vegetable and high salt intake are considered risk factors of stomach cancer) and smoking (the most important risk factor of lung cancer). Moreover, as the new political conditions after the 2012-2016 Colombian peace process also led to a considerable reduction in homicides, this might have spatial repercussions that we have not been able to observe. Given the obvious importance of disease risk factors (both proximate and indirect), spatial econometric models of an explanatory nature can be applied to test the effect on these specific causes of death with additional local indicators to those used here (e.g., related to homicide, the sex ratio, alcohol consumption, unemployment, housing conditions and income inequality, or including information on diet, hygiene, alcohol, smoking and access to health care when analyzing different types of cancer). Regarding the explanatory analysis, we do need to acknowledge its ecological nature, although the interpretation of results of sub-national studies (in our case, municipalities) is less subject to ecological bias because they are more homogeneous than countries. According to Higgs et al. (1998), the identification of statistical associations between mortality and exogenous factors at the small area level may even be interpreted in conjunction with and suggestive of epidemiological investigations at the microlevel.
The high heterogeneity of the 1118 Colombian local administrative entities analyzed suggests another possible route of investigation, namely grouping them into "sub-regions" situated between local authorities and departments. This new territorial division would absorb at least part of the registration errors linked to the regional hospital infrastructure and would also raise the statistical and spatial significance of the calculated indicators. Along the same lines, eliminating the large and sparsely populated municipalities of the Amazon region may imply significant changes in subsequent analyses because of the "noise" they contribute to the calculation of the Global and Local Moran's I. We did evaluate the choice of contiguity criterion for establishing the neighborhood relations: after trying different methods, the Rook criterion of order 1 was the one that offered the most robust and consistent results (Appendix 3). The fact that all high-value loadings were positive also implies that the potential issue of "competing causes" (i.e., a decrease in one cause of death leading to an increase in another) does not play a role in our spatial analysis. This is, however, no surprise, as most causes of death that we analyzed have a specific age and sex structure. For instance, perinatal mortality only occurs around the time of birth and most external-cause deaths occur during late adolescence and early adulthood (ages 15-34). Even the possibly competing causes, circulatory system diseases and neoplasms, did not show a significant negative association. On the contrary, the association was clearly positive. This also has a plausible explanation: regions with low mortality tend to have low levels of both circulatory system diseases and neoplasms, as their etiologies are similar (e.g., dietary and behavioral factors and medical technology).
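To make the indicators discussed above concrete, the following is a minimal sketch of the Global and Local Moran's I under a row-standardized first-order Rook criterion. It is an illustration only and not the procedure used in this study: the regular 4 x 4 grid, the toy values and the function names (`rook_weights`, `global_morans_i`, `local_morans_i`) are assumptions introduced here, whereas real municipalities are irregular polygons whose neighbors are defined by shared borders.

```python
import numpy as np


def rook_weights(n_rows, n_cols):
    """Row-standardized first-order Rook contiguity matrix for a regular grid.

    Hypothetical helper: a grid stands in for the municipal polygons.
    """
    n = n_rows * n_cols
    w = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            # Rook contiguity: cells that share an edge (up, down, left, right).
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    w[i, rr * n_cols + cc] = 1.0
    # Each row sums to 1, so the spatial lag is the mean of the neighbors.
    return w / w.sum(axis=1, keepdims=True)


def global_morans_i(x, w):
    """Global Moran's I: (n / S0) * (z' W z) / (z' z), with z the deviations."""
    z = x - x.mean()
    return (len(x) / w.sum()) * (z @ w @ z) / (z @ z)


def local_morans_i(x, w):
    """Local Moran's I per unit: z_i * (spatial lag of z)_i / m2."""
    z = x - x.mean()
    m2 = (z @ z) / len(x)
    return z * (w @ z) / m2


w = rook_weights(4, 4)

# Toy "rates": high values on the left half, low values on the right half.
clustered = np.array([[1.0, 1.0, 0.0, 0.0]] * 4).ravel()
# Toy "rates": alternating high and low values (perfect dispersion).
checkerboard = (np.indices((4, 4)).sum(axis=0) % 2).astype(float).ravel()

i_clustered = global_morans_i(clustered, w)
i_checker = global_morans_i(checkerboard, w)
local_clustered = local_morans_i(clustered, w)

print("Global I, clustered pattern:", round(i_clustered, 3))
print("Global I, checkerboard pattern:", round(i_checker, 3))
print("Local I, clustered pattern:")
print(local_clustered.reshape(4, 4).round(2))
```

The clustered pattern yields a positive Global Moran's I and the checkerboard a negative one. With row-standardized weights the mean of the local statistics equals the global statistic, which is why the two indicators are usually read together. The sketch omits the permutation-based significance tests that an applied analysis would require.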
Despite our interesting findings, some limitations of our study should be highlighted. Perhaps the most important is the under-registration of deaths in the Colombian mortality data, for which specific adjustments were applied. As a result, our adjusted mortality rates are quite consistent with those published by the INS/ONS (2013) and by the Ministry of Health and Social Protection (2016), which reported a decline in the adjusted death rate from 524 per 100,000 in 2005 to 444 per 100,000 in 2014. By comparison, our adjusted rates for men and women were, respectively, 566 and 334 for the period 2004-2006 and 540 and 335 for the period 2013-2014.
Finally, the application of spatial regression models allows the introduction of a new conceptual element in the analysis of epidemiological transitions, namely geographical space. If we can operationalize this concept, we would have an indicator of "system convergence" that represents changes in cause-of-death patterns. To illustrate, using the results we obtained for Colombia: although there is a gradual reduction of mortality, this occurs without territorial convergence. That is to say, we could have epidemiological transition models with or without territorial convergence, which would add a geographical dimension to the analysis and description of mortality patterns, but we leave this for future research.

Table 6
Results of the linear regression between the original and corrected* age-standardized mortality rates per population of 100,000 of Colombian municipalities for the period 1998-2014 by sex and cause of death

Appendix 4
See Table 9.

Table 9
Moran's I of SMR in Colombian municipalities (2004-2006) according to different contiguity criteria
Source: Own calculations based on the microdata obtained from Vital Statistics and the denominators supplied by DANE
† p < 0.1; *p < 0.05; **p < 0.01; ***p < 0.001. Highest values in bold. The underlined value signals the weight matrix chosen for the spatial lag and spatial Durbin models. Tests were also performed for second-order Rook and Queen contiguity (with and without including the lower order), but results were either similar or worse, so they are not shown here (they can be obtained from the authors upon request). Although nearest neighbor may give a higher Moran's I value than Rook 1, Rook 1 is more reliable when spatial polygons differ greatly in size, which is the case in the Colombian context (e.g., compare the vast Amazonian region with the many small municipalities of the Andean region). Rook 1 was therefore chosen when its Moran's I was high and reasonably comparable with that of a more complex spatial weight matrix. On two occasions a different weight matrix was chosen (external causes for males and cancer for women). Spatial autocorrelation was not considered to be present in the remaining causes of death.
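As an illustration of why the contiguity criterion matters, the sketch below computes Moran's I for the same values under first-order Rook and Queen weights. It is a hedged toy example and does not reproduce Table 9: the regular grid, the checkerboard values and the names `grid_weights` and `morans_i` are assumptions introduced here.

```python
import numpy as np

# Rook: neighbors share an edge. Queen: neighbors share an edge or a corner.
ROOK = ((-1, 0), (1, 0), (0, -1), (0, 1))
QUEEN = ROOK + ((-1, -1), (-1, 1), (1, -1), (1, 1))


def grid_weights(n_rows, n_cols, offsets):
    """Row-standardized contiguity matrix on a regular grid (hypothetical helper)."""
    n = n_rows * n_cols
    w = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    w[r * n_cols + c, rr * n_cols + cc] = 1.0
    return w / w.sum(axis=1, keepdims=True)


def morans_i(x, w):
    """Global Moran's I for values x and spatial weight matrix w."""
    z = x - x.mean()
    return (len(x) / w.sum()) * (z @ w @ z) / (z @ z)


# Checkerboard of alternating high and low values on a 4 x 4 grid.
values = (np.indices((4, 4)).sum(axis=0) % 2).astype(float).ravel()

results = {
    name: morans_i(values, grid_weights(4, 4, offsets))
    for name, offsets in (("rook", ROOK), ("queen", QUEEN))
}
for name, value in results.items():
    print(name, round(value, 3))
```

Under the Rook criterion every neighbor of a cell has the opposite value, so the statistic indicates perfect negative autocorrelation. Under the Queen criterion the diagonal neighbors share the cell's value and pull the statistic toward zero. The same data can therefore yield quite different Moran's I values depending on the weight matrix, which is the reason for comparing criteria before fitting the spatial models.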