Measuring poverty in multidimensional contexts

When measuring multidimensional poverty it is reasonable to expect that the trade-offs between variable pairs can differ depending on whether the concerned pairs are complements or substitutes. Yet, currently existing approaches based on deprivation count distributions unrealistically assume that all pairs of variables are related in the same way—an unfortunate circumstance that undermines the possibilities of identifying the poor, aggregating their poverty levels and modeling non-trivial interactions between variables in highly flexible ways. This paper, which aims at modeling non-trivial relational structures across variables both in the identification and aggregation steps, is a first contribution towards addressing these inadequacies. The approach has been axiomatically characterized to flesh out the normative foundations upon which it is based and has a vast potential for application.


Introduction
Who is poor and who is not? How poor are the poor? These are the fundamental 'identification' and 'aggregation' questions suggested by Amartya Sen that must be addressed before any poverty eradication program can be implemented (Sen 1976). While these questions have been answered quite satisfactorily when poverty is measured in the space of income distributions (after the seminal contribution by Sen in 1976 the literature on income poverty measurement is huge and rests on a very solid footing; see, for instance, Chakravarty (2009) for a recent survey on the topic), matters become more complicated when the poverty status and its levels are determined using several dimensions at the same time. After the influential writings of Sen (1985, 1987, 1992, 1993), it is nowadays acknowledged that poverty is a multidimensional phenomenon and many scholars have insisted on the necessity of defining poverty measures that go beyond the distribution of income or consumption expenditures alone (see, for instance, Anand and Sen 1997; Atkinson 2003; Alkire and Foster 2011; Aaberge and Brandolini 2015).
While several contributions have identified different classes of multidimensional poverty measures (e.g. Chakravarty et al. 1998; Tsui 2002; Bourguignon and Chakravarty 2003; Chakravarty and D'Ambrosio 2006; Alkire and Foster 2011; Silber and Yalonetzky 2014; Aaberge and Brandolini 2015; Aaberge et al. 2015; Datt 2017; Pattanaik and Xu 2018), one of the most fundamental issues in this literature still needs to be addressed: the proper modeling of the relational structure across variables. When combining several variables of potentially different nature into a single measure it is reasonable to expect that the trade-offs between them could differ depending, for instance, on whether the concerned pairs are complements or substitutes (see Ravallion 2011, 2012 for a conceptually related discussion). Yet, virtually all current approaches to multidimensional poverty measurement rely one way or another on the so-called 'deprivation count distributions' (an approach that takes the number of variables in which individuals are deprived as its informational basis), implicitly assuming that all pairs of variables are related in the same way. This is unfortunate because the possibilities of identifying the poor, aggregating their poverty levels and modeling non-trivial interactions between variables in more realistic ways are severely undermined (see Ravallion 2011). Even if the variability of trade-offs across alternative pairs of variables was readily identified as a central issue as soon as multidimensional poverty or welfare assessments were proposed (e.g. Atkinson and Bourguignon 1982; Bourguignon and Chakravarty 2003), as of now there are no measures that are able to capture such variability in a satisfactory way.
When poverty is assessed via pecuniary and non-pecuniary attributes simultaneously it is customary to partition the variables composing such measures in mutually exclusive dimensions, with several variables within each dimension. Such a partition, which is exogenously given, aims at imposing certain coherence and structure on the variables one is dealing with by clustering them in conceptually related areas (e.g. the dimensions of 'Health', 'Education' or 'Standard of Living'). Here we argue that the partition of variables across dimensions lends itself to a natural relational structure, with the variables belonging to the same (resp. alternative) dimensions being more 'similar' (resp. 'dissimilar') among themselves, that has been systematically ignored in current approaches to multidimensional poverty measurement. Consider the following examples.
Example 1 Assume multidimensional poverty is assessed using the variables $V_1$ = 'Income', $V_2$ = 'Years of Schooling', $V_3$ = 'Self-assessed Health' and $V_4$ = 'Health insurance' (example taken from Alkire and Foster 2011, p. 483). These variables can be naturally partitioned in two dimensions: 'Capacity to make a living' (denoted as $D_1$, including $V_1$ and $V_2$) and 'Health' (denoted as $D_2$, including $V_3$ and $V_4$). When deciding how to identify the poor, one might argue that the lack of deprivation in one variable could eventually compensate for the deprivation experienced in the other variable within the same dimension (i.e. having health insurance might somehow compensate for a low health status in a 'health' dimension, or having a high-quality education could compensate for temporarily low income levels). Therefore, individuals could be labeled as 'poor' when they experience simultaneous deprivations at least in $V_1$ and $V_2$ (something which would severely hinder that individual's capacity to make a decent living) or in $V_3$ and $V_4$ (an alarming circumstance for that individual's health), but not when they experience deprivation in one variable within $D_1$ and in one variable within $D_2$.
Example 2 In the same 4-variable 2-dimensional setting, one could alternatively argue that each variable is essential to enjoy a decent living in the corresponding dimension (so that there is no possibility of compensation within dimensions) but that only individuals that are deprived in both dimensions have good reasons to be identified as 'poor'. Under this alternative specification, individuals experiencing deprivations in one dimension only would not be identified as 'poor'.
In these examples, what counts to be identified as poor is not only the quantity of variables in which individuals are deprived but also the qualitative relationships that might exist between them. Presently, there is no multidimensional poverty methodology that is able to identify the poor in the ways described in Examples 1 and 2 (see Sect. 3.1). Existing identification methods basically count the number of existing deprivations irrespective of the dimensions they belong to (see Atkinson 2003;Alkire and Foster 2011;Silber and Yalonetzky 2014;Aaberge and Brandolini 2015), thus ignoring the non-trivial compensation patterns that might exist within and between dimensions. 1 Such disregard for the relational structure across variables also has crucial implications for the aggregation step. When measuring how poor are the poor, current approaches assume that all pairs of variables are either complements or substitutes and that the elasticity of substitution is constant across them-an unduly restrictive assumption that is very unlikely to be satisfied in practice. Using an axiomatically characterized approach, this paper is a first step towards addressing the aforementioned inadequacies both in the 'identification' and 'aggregation' steps.
The rest of the paper is organized as follows. After introducing some basic notations in Sect. 2, in Sect. 3 we present the axiomatic characterization of different families of 'identification functions'. We start with the basic one-dimensional case (i.e. there is no partition of the underlying variables-this includes all approaches currently used in the literature 2 ) and then proceed to the multidimensional case (i.e. the variables are partitioned across multiple dimensions). In Sect. 4 we propose and axiomatically characterize new aggregation methods that allow introducing non-trivial relational structures across variables. Like in the previous section we start with the single dimensional case and then proceed to the multidimensional one. Section 5 concludes. The proofs are relegated to the "Appendix".

Notation and definitions
Let $N$ be the set of individuals and $D$ the set of variables$^3$ under consideration, with $n := |N| \geq 1$ and $d := |D| \geq 2$. For any natural number $d \geq 2$, let $X_d := \{0, 1\}^d$. For $x, y \in X_d$, we write $x \geq y$ whenever $x_j \geq y_j$ for all $j \in \{1, \ldots, d\}$ and say that $x$ vector-dominates $y$. Analogously, we write $x > y$ whenever $x_j \geq y_j$ for all $j \in \{1, \ldots, d\}$ with at least one strict inequality, and say that $x$ strictly vector-dominates $y$. $\mathbb{R}_+$ is the set of non-negative real numbers and $\mathbb{N}_+$ the set of strictly positive natural numbers. Let $a = (a_1, \ldots, a_d)$ be a vector of positive numbers summing up to 1, whose $j$th coordinate $a_j$ is interpreted as the normalized weight associated with variable $j$. Let $\Delta_d := \{(a_1, \ldots, a_d) \in \mathbb{R}^d_+ \mid \sum_i a_i = 1\}$ be the $d$-dimensional simplex. Within this set, we denote by $1/d := (1/d, \ldots, 1/d)$ the equal weights vector.
The achievement of individual $i$ in attribute $j$ is assumed to be measurable in a cardinal scale and will be denoted by $y_{ij} \in \mathbb{R}_+$. For each attribute $j$ we consider a poverty threshold $z_j \in \mathbb{R}_{++}$ indicating the minimum quantity necessary for a subsistence level, which in this paper we consider as exogenously given. Whenever $y_{ij} \leq z_j$, we say that individual $i$ is deprived in attribute $j$. Very often, poverty measures are defined in the space of deprivations rather than achievements.$^4$ One common measure of the deprivation experienced by individual $i$ in variable $j$ is the following$^5$

$$\gamma_{ij} := \begin{cases} \left( \dfrac{z_j - y_{ij}}{z_j} \right)^{c} & \text{if } y_{ij} \leq z_j, \\ 0 & \text{otherwise,} \end{cases}$$

where $c \geq 0$. Whenever $y_{ij}$ is measured in a cardinal scale, $\gamma_{ij}$ is well-defined for any $c \geq 0$,$^6$ and $\gamma_{ij} \in [0, 1]$. In particular, when $c = 1$, $\gamma_{ij}$ is the so-called normalized deprivation gap (which measures in a $[0, 1]$-scale the distance between a given achievement $y_{ij}$ and the corresponding poverty line $z_j$) and when $c = 2$, we obtain the squared deprivation gap. For an individual $i$, we define the corresponding vector of deprivation gaps as $\gamma_i := (\gamma_{i1}, \ldots, \gamma_{id}) \in [0, 1]^d$ (when no confusion arises, we might omit the individual's label $i$). A deprivation matrix is an $n \times d$ matrix with entries in $[0, 1]$ containing the deprivation gap vectors of $n$ individuals in the different rows. The set of all $n \times d$ deprivation matrices is denoted as $G_{n \times d}$, and we define $G := \bigcup_{n \in \mathbb{N}} \bigcup_{d \in \mathbb{N}} G_{n \times d}$. Following Bourguignon and Chakravarty (2003), an identification function $\zeta : G_{1 \times d} \to \{0, 1\}$ is a mapping from individual $i$'s deprivation gap vector $\gamma_i$ to an indicator variable in such a way that $\zeta(\gamma_i) = 1$ if person $i$ is multidimensionally poor and $\zeta(\gamma_i) = 0$ otherwise. For analytical clarity, we write the identification function $\zeta$ as the composite $\zeta = \rho \circ \omega$, with $\omega : G_{1 \times d} \to X_d$ and $\rho : X_d \to \{0, 1\}$. The function $\omega$ converts the deprivation gap vector $\gamma_i$ into a vector of 0s and 1s of length $d$ indicating whether individual $i$ is deprived or not in the different variables taken into account (where 1 denotes deprivation and 0 non-deprivation). The set $X_d$ contains all possible combinations of deprivations/non-deprivations across $d$ variables. Its generic members, referred to as deprivation profiles, are denoted as $x = (x_1, \ldots, x_d)$.

Definition 1 Let $Z$ be any non-empty subset of $X_d$. The up-set of $Z$ is defined as $Z^{\uparrow} := \{x \in X_d \mid x \geq z \text{ for some } z \in Z\}$. That is, $Z^{\uparrow}$ is the set of deprivation profiles vector-dominating at least one member of $Z$. On the other hand, $L(Z)$ is the set of elements in $Z$ that do not vector-dominate any other element in $Z$.

3 The terms 'variable', 'indicator' or 'attribute' will be used interchangeably in this paper.
4 The alternative approach advocated by Ravallion (2011) of working in the space of attainments is not followed in this paper because (1) it might be possible for a poor person to be lifted out of poverty as a result of an increment in a non-deprived dimension, and (2) it does not allow keeping track of the dimension-specific deprivations simultaneously.
5 There exist other definitions of deprivation gaps (see Table 1 in Permanyer (2014: 4) for other examples). Since alternative definitions do not alter the findings of the paper, we have chosen the one that is more commonly used in the literature.
6 The value $c = 0$ is typically chosen when the variables that are used to assess multidimensional poverty are measured in an ordinal scale. However, in this paper we will focus on the cardinal case (see Remark 3 in Sect. 4.2).
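As a purely illustrative sketch of the deprivation gap and of the indicator step performed by ω, the following Python fragment assumes the common power form of the gap, ((z − y)/z)^c for deprived achievements and 0 otherwise, and reads deprivation status directly from achievements (y ≤ z). Both choices, as well as all function names, are assumptions made for illustration and are not taken from the paper.

```python
def deprivation_gap(y, z, c=1.0):
    """Deprivation gap of achievement y against a poverty threshold z > 0.

    Assumed form: ((z - y) / z) ** c when y <= z, and 0 otherwise.
    """
    if z <= 0:
        raise ValueError("the poverty threshold z must be strictly positive")
    if y < 0:
        raise ValueError("achievements are assumed to be non-negative")
    if y > z:
        return 0.0  # not deprived in this attribute
    return ((z - y) / z) ** c


def gap_vector(achievements, thresholds, c=1.0):
    """Vector of deprivation gaps of one individual across all attributes."""
    return [deprivation_gap(y, z, c) for y, z in zip(achievements, thresholds)]


def deprivation_profile(achievements, thresholds):
    """Indicator vector: 1 where the individual is deprived (y <= z), else 0."""
    return tuple(1 if y <= z else 0 for y, z in zip(achievements, thresholds))


# One individual, three attributes, all with threshold 10.
print(gap_vector([5, 12, 10], [10, 10, 10], c=1))       # normalized gaps
print(gap_vector([5, 12, 10], [10, 10, 10], c=2))       # squared gaps
print(deprivation_profile([5, 12, 10], [10, 10, 10]))   # 0/1 profile
```

With c = 1 the first attribute yields a gap of 0.5 and with c = 2 a gap of 0.25, which illustrates how larger values of c put relatively less weight on shallow deprivations.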
Since ω is simply an indicator function specifying whether individuals are deprived or not in the different variables, here we focus our attention on different ways in which ρ can be defined. With a slight abuse of notation, the functions ρ will also be referred to as 'identification functions'. Let d := {ρ : X d → {0, 1}} be the set of all possible identification functions for d variables. Since each identification function ρ is uniquely characterized by the set of elements ρ −1 (1) ⊆ X d (or its complement ρ −1 (0)) and X d has 2 d elements, it follows that d has 2 2 d elements. To simplify notation we will write P ρ := ρ −1 (1) and R ρ := ρ −1 (0) (i.e. P ρ and R ρ are the set of deprivation profiles that ρ identifies as 'poor' and 'non-poor' respectively). The set := { d } d∈N + \{1} contains all identification functions for all possible d ≥ 2. In Sect. 3 we impose several conditions on the elements of d to pin down several classes of identification functions S d ⊆ d . For clarification purposes, it is sometimes useful to graph the Hasse diagram corresponding to the set X d (whose elements are the nodes of the diagram) and the partial order generated by vector dominance ≤ (represented by the edges between nodes) to represent identification functions ρ ∈ d . In Fig. 1 we show two examples of identification functions (ρ 1 , ρ 2 ) for the case d = 4 that will be useful to illustrate other sections of the paper.
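For small d, the objects of Definition 1 and the size of the space of identification functions can be checked by brute force. The sketch below enumerates the deprivation profiles, builds the up-set of a set Z and extracts its least deprived elements L(Z). The helper names are hypothetical and the set Z is chosen only for illustration.

```python
from itertools import product


def profiles(d):
    """All 2**d deprivation profiles, i.e. all 0/1 vectors of length d."""
    return list(product((0, 1), repeat=d))


def dominates(x, y):
    """True when x vector-dominates y (x_j >= y_j for every j)."""
    return all(xj >= yj for xj, yj in zip(x, y))


def up_set(Z, d):
    """Profiles that vector-dominate at least one member of Z."""
    return {x for x in profiles(d) if any(dominates(x, z) for z in Z)}


def least_deprived(Z):
    """Members of Z that do not vector-dominate any other member of Z."""
    Z = set(Z)
    return {z for z in Z if not any(z != w and dominates(z, w) for w in Z)}


d = 4
Z = {(1, 1, 0, 0), (0, 0, 1, 1)}
P = up_set(Z, d)

print(len(profiles(d)))        # number of profiles: 2**d
print(2 ** (2 ** d))           # number of functions from profiles to {0, 1}
print(sorted(P))               # the up-set of Z
print(least_deprived(P) == Z)  # the least deprived elements recover Z here
```

In this example the up-set contains seven profiles, and taking its least deprived elements returns the two generating profiles, which is the sense in which such minimal sets act as a multidimensional poverty boundary.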
Let $Q_\rho := \{i \in N \mid \rho(\omega(\gamma_i)) = 1\}$ be the set of individuals considered to be poor according to the identification function $\rho$. After completing the identification step, Sen (1976) (and all the ensuing literature on poverty measurement after him) suggests proceeding to the aggregation step, i.e. summarizing the information on the extent of poverty among the poor into a single number. While the identification functions analyzed in Sect. 3 are based on the vectors of 0s and 1s obtained after applying the $\omega$ function to individuals' deprivation gap vectors, the aggregation step takes $\rho$ as given and associates with a deprivation matrix an overall level of multidimensional poverty. The chosen aggregation method will be denoted as $A$ (Sect. 4 presents and axiomatically characterizes several such methods). Borrowing notation from Alkire and Foster (2011: 477), we define a multidimensional poverty methodology as the tuple $(\rho, A)$.

Fig. 1 The shaded circles in the top and bottom panels are the members of $P_{\rho_1}$ and $P_{\rho_2}$ respectively. The first identification function comes from the 'unweighted counting approach' using $k = 1/2$ as deprivation threshold (see Eq. (8)). The second one comes from the 'weighted counting approach', using $a_1 = 1/2$, $a_2 = 1/4$, $a_3 = 1/8$, $a_4 = 1/8$ as weights and $k = 3/4$ as deprivation threshold (see Eq. (7)). The least deprived profiles for $\rho_1$ and $\rho_2$ (i.e. $L(P_{\rho_1})$ and $L(P_{\rho_2})$) are highlighted in bold

The counting approach
The 'counting approach identification functions' can be written as the composite $\iota_k \circ c_a$, with $c_a : X_d \to [0, 1]$ and $\iota_k : [0, 1] \to \{0, 1\}$. For any $x \in X_d$ and any $a \in \Delta_d$ (the $d$-dimensional simplex of weights), the function $c_a$ is defined as $c_a(x) = \sum_{j=1}^{d} a_j x_j$, that is: $c_a$ computes the weighted proportion of deprivations experienced by someone with deprivation profile $x$. Lastly, for any $s \in [0, 1]$ and for any $k \in (0, 1]$, $\iota_k$ is defined as

$$\iota_k(s) := \begin{cases} 1 & \text{if } s \geq k, \\ 0 & \text{otherwise.} \end{cases}$$

The $\iota_k \circ c_a$ function takes a value of 1 whenever the weighted proportion of deprivations attains a certain deprivation threshold $k$ (which is exogenously given) and a value of 0 otherwise. With this notation, if one fixes any $k \in (0, 1]$ we can define the following class of identification functions:

$$W_d(k) := \{\iota_k \circ c_a \mid a \in \Delta_d\}. \qquad (7)$$

This is the set of identification functions belonging to the so-called 'weighted counting approach' with deprivation threshold $k$. The higher the value of $k$, the more difficult it is for an individual to end up being classified as poor. When $k \leq \min_j a_j$, a function $\rho \in W_d(k)$ corresponds to the so-called 'union approach', and when $k = 1$, $\rho \in W_d(1)$ is equivalent to the 'intersection approach'. The Hasse diagrams shown in Fig. 1a, b illustrate examples of identification functions $\rho \in W_d(k)$ for certain combinations of $a$ and $k$ when $d = 4$. In Fig. 1a, we have chosen $a_1 = a_2 = a_3 = a_4 = 1/4$ and $k = 1/2$ (representing $\rho_1$) and in Fig. 1b, $a_1 = 1/2$, $a_2 = 1/4$, $a_3 = 1/8$, $a_4 = 1/8$ and $k = 3/4$ (representing $\rho_2$). When the weighting vector $a$ turns out to weight all variables equally we obtain the class

$$C_d(k) := \{\iota_k \circ c_{1/d}\}, \qquad (8)$$

which will be referred to as 'unweighted counting approach' with deprivation threshold $k$. Since both approaches use deprivation thresholds within variables ($z_j$) and an overall threshold $k$ across them, they are generally known as the 'dual cutoff' identification method or, simply, the 'counting approach' (see Alkire and Foster 2011). The sets
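The two identification functions used as running examples (equal weights with k = 1/2, and weights 1/2, 1/4, 1/8, 1/8 with k = 3/4) can be reproduced with a few lines of Python. This is a minimal sketch with hypothetical function names; it reads "attains the threshold" as a weak inequality and uses exact fractions so that threshold comparisons are not affected by floating-point rounding.

```python
from fractions import Fraction as F
from itertools import product


def c_a(x, a):
    """Weighted proportion of deprivations in profile x under weights a."""
    return sum(aj * xj for aj, xj in zip(a, x))


def counting_rho(a, k):
    """Counting-approach identification function with weights a and cutoff k."""
    def rho(x):
        return 1 if c_a(x, a) >= k else 0
    return rho


def poor_profiles(rho, d):
    """Set of deprivation profiles that rho identifies as poor."""
    return {x for x in product((0, 1), repeat=d) if rho(x) == 1}


def least_deprived(P):
    """Members of P that do not vector-dominate any other member of P."""
    def dominates(x, y):
        return all(xj >= yj for xj, yj in zip(x, y))
    return {z for z in P if not any(z != w and dominates(z, w) for w in P)}


rho1 = counting_rho([F(1, 4)] * 4, F(1, 2))                        # unweighted
rho2 = counting_rho([F(1, 2), F(1, 4), F(1, 8), F(1, 8)], F(3, 4))  # weighted

P1, P2 = poor_profiles(rho1, 4), poor_profiles(rho2, 4)
print(len(P1), sorted(least_deprived(P1)))
print(len(P2), sorted(least_deprived(P2)))

# Union (k no larger than the smallest weight) and intersection (k = 1).
union = poor_profiles(counting_rho([F(1, 4)] * 4, F(1, 4)), 4)
intersection = poor_profiles(counting_rho([F(1, 4)] * 4, F(1)), 4)
print(len(union), len(intersection))
```

Under these assumptions the unweighted rule labels as poor every profile with at least two deprivations, whereas the weighted rule requires deprivation in the first variable together with either the second variable or both of the remaining ones.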

Identification of the poor
In this section we present different families of identification functions obtained after imposing increasingly demanding axioms on the elements of d . We start assuming that all variables belong to the same dimension (the following subsection deals with the more general case where variables are partitioned across several dimensions). Let S d ⊆ d be a set of identification functions. Non-triviality (NTR): ρ is a non-constant function for all ρ ∈ S d .
NTR prevents the identification function being constant across all deprivation profiles. MON ensures that if individual A experiences deprivations at least in the same variables as those where another individual B experiences deprivations, and possibly in others, then A qualifies at least as much as B to be identified as multidimensionally poor. Because of their uncontroversial nature, we posit that the set of identification functions satisfying these two axioms should be the universe of reference from which identification functions should be drawn; it will be denoted as I d and referred to as the set of consistent identification functions. 7 The set I := {I d } d∈N + \{1} is the collection of consistent identification functions for all possible sets of variables. The following result uniquely characterizes the elements of I.
Proof See the "Appendix".
According to Proposition 1, the sets of 'poor profiles' P ρ derived from consistent identification functions are uniquely characterized and represented by the corresponding subsets of 'least deprived elements' L P ρ . When choosing a sensible set of poor profiles P ρ , the subsets L P ρ are particularly important because their elements determine the least deprived conditions that individuals should experience in order to be considered as poor. Indeed, the sets L P ρ can be thought as a generalization of the concept of a poverty line to the multidimensional context (i.e. they determine the boundary separating the poor from the non-poor: when x ∈ L(P ρ ) and y, z ∈ X d are such that y < x ≤ z, then y ∈ R ρ and z ∈ P ρ ).
The minimal structure imposed on consistent identification functions makes ample room to incorporate different criteria-many of which can be qualitative in naturewhen deciding who should be considered as multidimensionally poor. The downside of such flexibility is that the set of consistent identification functions can perhaps be too unwieldy for certain practical purposes. Indeed, current approaches to multidimensional poverty measurement have considerably reduced the class of admissible identification functions by imposing a set of axioms that, as we will now see, are quite restrictive.
Variable Anonymity (VAN): Independence (IND): Let x, y ∈ X d be two deprivation profiles such that for some variable i ∈ {1, . . . , d}, x i = y i . Let x , y ∈ X d be two other deprivation profiles such that x j = x j and y j = y j for j = i and VAN requires all variables to be treated symmetrically. IND is a classical separability assumption ensuring that the removal or addition of the same deprivation from two deprivation profiles should preserve the weak ordering among them. With these two additional axioms we can present the following result. VAN and IND if and only Theorem 1 is inspired by the seminal work of Pattanaik and Xu (1990) in the field of freedom of choice measurement. Whenever one is willing to accept the four aforementioned axioms simultaneously, the unweighted counting approach obtains. When it comes to characterize the weighted counting approach W d one would be tempted to simply drop the VAN axiom from the list. Yet, it turns out that IND is not powerful enough, so it needs to be strengthened. For our next axiom, we need the following definition.
Definition 2 Consider two hypothetical societies, each with m > 1 individuals, with deprivation profiles (x 1 , . . . , x m ), ( y 1 , . . . , y m ). We say that these two societies are equivalent if for each variable j ∈ {1, . . . , d} the number of individuals that are deprived in that variable is the same in both societies, that is: Compensation (COM): Consider two equivalent societies with deprivation profiles (x 1 , . . . , x m ) and ( y 1 , . . . , y m ). Assume that ρ( COM states that in two equivalent societies, it is not possible that all individuals in one of them qualify at least as much as the others to be identified as multidimensionally poor. That is: it is not possible to match the individuals of these equivalent societies in such a way that each individual of the former is ranked at least as high as the corresponding individual of the latter with some of these rankings being strict. It is easy to show that COM imposes separability across variables-indeed, COM can be seen as a stronger version of classical independence axioms usually employed in welfare analysis (see, for instance, Blackorby et al. 1978): if COM holds for a given S d ⊆ d , then IND holds as well for all ρ ∈ S d ; however, the opposite is not necessarily true (the proof of this statement is shown in the "Appendix" before the proof of Theorem 2). 8
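The notion of equivalent societies in Definition 2 amounts to comparing, variable by variable, the number of deprived individuals. The sketch below checks this condition and then evaluates the identification rule described verbally in Example 1 of the Introduction (poor when deprived in both of the first two variables or in both of the last two) on two equivalent societies. The function names are hypothetical, and the reading of COM applied here relies only on its verbal description.

```python
def are_equivalent(society_x, society_y):
    """True when both societies have the same size and, for every variable,
    the same number of deprived individuals."""
    if len(society_x) != len(society_y):
        return False
    counts_x = [sum(column) for column in zip(*society_x)]
    counts_y = [sum(column) for column in zip(*society_y)]
    return counts_x == counts_y


def rho_example_1(x):
    """Example 1 rule: poor if deprived in both V1 and V2, or in both V3 and V4."""
    return 1 if (x[0] and x[1]) or (x[2] and x[3]) else 0


society_x = [(1, 1, 0, 0), (0, 0, 1, 1)]
society_y = [(1, 0, 1, 0), (0, 1, 0, 1)]

print(are_equivalent(society_x, society_y))
print([rho_example_1(x) for x in society_x])
print([rho_example_1(y) for y in society_y])
```

Both societies have exactly one deprived individual per variable, yet everyone in the first society is identified as poor and nobody in the second one is. This is the kind of pattern that COM, as described above, rules out, which suggests why rules such as the one in Example 1 fall outside the counting approach.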

Theorem 2 Let S d ⊆ d . The identification functions ρ ∈ S d satisfy MON, COM and NTR if and only if
With the last axioms the class of consistent identification functions has been narrowed down considerably to obtain $W_d$ and $C_d$, the state-of-the-art methodologies that are widely used in practice to identify the multidimensionally poor. However, such simplification comes at a cost: while NTR and MON are indisputable, VAN, IND and COM are contentious. VAN is not very meaningful when some variables are much more relevant than others, while IND and COM impose full separability across all variables and treat them as if they were, so to speak, mutually orthogonal. These axioms are responsible for the exceedingly uniform way in which all pairs of variables are treated, irrespective of whether they might be complements or substitutes. Such a quantitatively driven approach can be highly unsatisfactory as it might fail to identify the poor in those contexts where the different variables we are dealing with are partitioned across several dimensions and where there might be non-trivial compensation patterns between and within them (see Examples 1 and 2 in the Introduction). In the following section we address these issues by proposing another subclass of consistent identification functions that does not satisfy the overly restrictive IND and COM axioms.

Identification in multiple dimensions
Suppose now that the set of variables D is partitioned in G dimensions (G being a natural number strictly greater than one). Such exogenously given partition in thematic areas is naturally derived from the index's design and is unlikely to generate much controversy. Let G denote the set of partitions of D into G dimensions D 1 , . . . , D G where at least two dimensions contain at least two variables 9 and let d g := D g . Clearly, d = g d g . Given any such partition there is a one-to-one correspondence between X d and Definition 3 Let S d ⊆ d be a set of identification functions belonging to a collection {S j } j∈N + \{1} and let ψ ∈ G . We define the two-stage identification functions associated with the pair (S d , The members of S ψ d are identification functions constructed in two steps. Initially, the functions ρ w g : X d g → {0, 1} decide about the deprivation status in each dimension and secondly, overall deprivation across dimensions is assessed via These functions-all of which belong to the collection {S j } j∈N + \{1} -are referred to as within-and across-dimension identification functions respectively. The superscript ψ is used to indicate the dependence of the two-stage identification functions on the choice of the partition of is the set of two-stage identification functions for d variables whereby both the within-and across-dimension identification functions belong to the unweighted counting approach C (resp. the weighted counting approach W, consistent identification functions I).
The identification functions verbally described in Examples 1 and 2 (see Introduction) are members of C ψ d . In those examples, we have four variables partitioned in two dimensions (i.e. d = 4, G = 2) and ψ ∈ 2 is the partition that groups the first two variables (V 1 , V 2 ) in the first dimension (D 1 ) and the last two (V 3 , V 4 ) in the second one (D 2 ). In Example 1, both within-dimension identification functions are based on the intersection approach, that is: individuals have to be deprived in both variables within the same dimension to be considered as deprived in that dimension. Afterwards, the across-dimension identification function is based on the union approach, i.e. whenever individuals are deprived in any of the two dimensions, they are considered to be multidimensionally poor. In Example 2, both within-dimension identification functions are based on the union approach and the across-dimension identification function is based on the intersection approach. In general, if there are good reasons to believe that the relationships between pairs of variables differ within and across dimensions, two-stage identification functions can a priori be more appropriate than counting approaches. The following result highlights the key difference between them. Since In addition, when the partition ψ is trivial (G = 1) the multiple dimensions approach reduces to the classical single-dimensional case. This way, we have generated a new set of consistent identification functions that does not comply with the COM axiom and makes room to model non-trivial compensation patterns that might exist between and within dimensions-thus considerably enlarging the toolkit available to those practitioners aiming at measuring multidimensional poverty.
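The two-stage construction behind Examples 1 and 2 can be sketched as follows: a within-dimension rule is applied to each block of the partition, and an across-dimension rule is then applied to the resulting vector of dimension-level deprivation indicators. The helper names and the 0-based indexing of variables are illustrative assumptions.

```python
from itertools import product


def union(block):
    """Deprived when deprived in at least one component of the block."""
    return 1 if any(block) else 0


def intersection(block):
    """Deprived only when deprived in every component of the block."""
    return 1 if all(block) else 0


def two_stage(partition, within, across):
    """Build a two-stage identification function.

    partition: list of lists of variable indices, one list per dimension
    within:    one identification rule per dimension
    across:    rule applied to the vector of dimension-level indicators
    """
    def rho(x):
        dims = [rule([x[j] for j in block])
                for block, rule in zip(partition, within)]
        return across(dims)
    return rho


def least_deprived(P):
    """Members of P that do not vector-dominate any other member of P."""
    def dominates(x, y):
        return all(xj >= yj for xj, yj in zip(x, y))
    return {z for z in P if not any(z != w and dominates(z, w) for w in P)}


PARTITION = [[0, 1], [2, 3]]  # D1 = {V1, V2}, D2 = {V3, V4}
rho_ex1 = two_stage(PARTITION, [intersection, intersection], union)
rho_ex2 = two_stage(PARTITION, [union, union], intersection)

X4 = list(product((0, 1), repeat=4))
P_ex1 = {x for x in X4 if rho_ex1(x)}
P_ex2 = {x for x in X4 if rho_ex2(x)}
print(sorted(least_deprived(P_ex1)))
print(sorted(least_deprived(P_ex2)))
```

In Example 1 the profiles (1, 1, 0, 0) and (1, 0, 1, 0) both contain two deprivations, but only the first one is identified as poor, so the outcome depends on where the deprivations fall and not only on how many there are.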

Aggregation of the poor
So far we have been discussing how the partition of variables in different dimensions affects the identification of the poor. We are now going to explore its implications for the 'aggregation step'. As is standard in the literature, we take a certain identification function $\rho$ as exogenously given and propose different ways to summarize the extent of poverty among the poor with a single real number (i.e. we suggest different aggregation methods $A$). For that purpose we introduce the following notation. Let $G^S_{n \times d}$ denote the set of $n \times d$ deprivation matrices whose rows are the same. For a given deprivation gap vector $\gamma \in G_{1 \times d}$, let $[\gamma] \in G^S_{n \times d}$ denote the $n \times d$ deprivation matrix whose rows are equal to $\gamma$. A family of multivariate poverty indices $f : G \to \mathbb{R}$ is a set of non-trivial functions $\{f_d\}_{d \in \mathbb{N}}$ where each function $f_d$ converts a deprivation matrix $\Gamma \in G_{n \times d}$ into a real number $f_d(\Gamma)$ indicating the extent of poverty in the corresponding distribution. In this section we present some basic properties one might want to impose on a family of multivariate poverty indices $f : G \to \mathbb{R}$ to characterize it axiomatically. We begin with the one-dimensional case (i.e. all variables belong to the same dimension) and then proceed to the multidimensional case in the next subsection. Our axioms and results are designed for the cardinal case (i.e. all variables are assumed to be measurable in a cardinal scale).

Separability (SEP): Let $\gamma, \delta \in [0, 1]^d$ be two deprivation gap vectors such that for some variable $j \in \{1, \ldots, d\}$, $\gamma_j = \delta_j$. Let $\gamma', \delta' \in [0, 1]^d$ be two other deprivation gap vectors such that $\gamma'_l = \gamma_l$ and $\delta'_l = \delta_l$ for $l \neq j$ and $\gamma'_j = \delta'_j$. Then $f_d([\gamma]) \geq f_d([\delta]) \Leftrightarrow f_d([\gamma']) \geq f_d([\delta'])$.

Homogeneity (HMG): For any $\Gamma \in G_{n \times d}$ and any $\lambda \in (0, 1]$ one has that $f_d(\lambda \Gamma) = \lambda f_d(\Gamma)$.

Homotheticity (HMT): For any $\Gamma_1, \Gamma_2 \in G_{n \times d}$ and any $\lambda \in (0, 1]$ one has that $f_d(\Gamma_1) \geq f_d(\Gamma_2) \Leftrightarrow f_d(\lambda \Gamma_1) \geq f_d(\lambda \Gamma_2)$, where $\lambda \Gamma_1, \lambda \Gamma_2$ are the deprivation matrices $\Gamma_1, \Gamma_2$ with all their elements scaled by $\lambda$.
Continuity (CON): For all d ∈ N, f d is a continuous function in its arguments. SDC allows identifying subgroups where poverty is particularly high and evaluating their contribution to overall poverty levels. Indeed, it is such an intuitive and useful property that it has been imposed on all multivariate poverty indices presented in the literature so far. One of the consequences of SDC is that overall poverty can be written as the arithmetic mean of individual poverty levels, thus ensuring that our aggregation method is consistent with the principle of anonymity (i.e. overall poverty does not depend on the identity of the individuals experiencing deprivations). MOA ensures that if the deprivation felt by any individual in any attribute increases, assuming the rest of deprivations do not decrease, then overall deprivation should increase. Formally, this axiom is the equivalent of MON adapted to the 'aggregation framework'. NRM establishes that when nobody is deprived in any variable, then poverty should be equal to zero. In addition, if all individuals are fully deprived in all variables, NRM stipulates that poverty should reach its maximal value of one. According to SEP the poverty ranking between two deprivation gap vectors only depends on the set of variables where their values do not coincide, irrespective of what happens in the ones where they coincide. Following Blackorby et al. (1978), SEP stipulates that each variable is separable from its complement. HMG ensures that when all deprivation gaps are scaled by a proportionality factor, overall deprivation is scaled by the same factor. HMT is a weaker version of HMG and ensures that the weak poverty ordering between two societies does not change when all deprivations are scaled by the same proportionality factor. Clearly, Homogeneity implies Homotheticity, but not the other way around. 
Since both HMG and HMT are very standard in the literature, we have used them both to show the difference it makes to impose the one or the other to our poverty measures. Lastly, CON requires that small changes in the deprivations of individuals produce small changes in the corresponding poverty measures (i.e. poverty levels do not change abruptly when individuals' deprivations are slightly altered). This property ensures that poverty levels will not be dramatically affected by small measurement errors in the data.

Theorem 3 Assume we identify the set of poor individuals (Q ρ ) via the identification function
for all $d \in \mathbb{N}$, where $\theta > 0$, $a_j > 0$ for all $j \in \{1, \ldots, d\}$ and $\sum_j a_j = 1$.
Proof See the "Appendix".
The family of multivariate poverty indices shown in (12) measures individuals' poverty levels by averaging the corresponding deprivation gap vector $\gamma_i$ using a weighted generalized mean of order $\theta$.$^{10}$ Clearly, when $\theta = 1$, the index in (12) corresponds to the class of poverty measures $M_\alpha$ suggested by Alkire and Foster (2011) when $\alpha = c$. In addition, (12) can also be seen as a member of some of the families of multidimensional poverty indices proposed by Bourguignon and Chakravarty (2003). While the previous two measures were originally defined under the assumption that the poor are identified via the counting and the union approach, respectively, the new measure shown in (12) broadens the class of admissible poverty measures by incorporating the more general identification functions embodied in $\rho \in I_d$. The choice of $\theta$ allows modelling different elasticities of substitution between pairs of deprivation gaps: when $\theta = 1$ there is perfect substitutability and when $\theta \to \infty$ there is perfect complementarity. As highlighted by Bourguignon and Chakravarty (2003: 40), such elasticity of substitution is the same across all pairs of deprivations, a restriction that often might not be very realistic. Inspecting the axiomatic characterization shown in Theorem 3, it is clear that SEP, which treats all dimensions as if they were mutually orthogonal, is responsible for this state of affairs. In the following subsection, we introduce other axioms allowing more flexibility in this regard.
When homogeneity (HMG) is substituted by homotheticity (HMT), it is easy to show that the functional form of the multivariate poverty indices characterized in Theorem 3 can be written as where β > 0 (result not shown here but available upon request). This functional form coincides with some of the measures proposed by Bourguignon and Chakravarty (2003). In addition, when θ = 1 and β > 1, the aggregation function shown in (13) coincides with the one used in the "distribution-sensitive" measures proposed by Datt (2017) and by Pattanaik and Xu (2018). 11 As discussed in those papers, Eq. (13) can be sensitive to the extent of inequality in the distribution of deprivations depending on the values of β. The use of HMT enlarges the class of admissible indices characterized by Theorem 3 at the cost of introducing some extra complexity in the interpretation of our poverty measures.

Aggregation in multiple dimensions
We now assume that the set of variables $D$ is partitioned into $G \geq 1$ dimensions (i.e. $D = \bigcup_g D_g$), with each dimension $D_g$ containing $d_g$ variables ($\sum_g d_g = d$).
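We relabel individual $i$'s deprivation gaps vector $\gamma_i = (\gamma_{i1}, \ldots, \gamma_{id})$ to identify the specific dimensions to which the different deprivations belong. Given the one-to-one correspondence between $[0,1]^d$ and $[0,1]^{d_1} \times \cdots \times [0,1]^{d_G}$, we write $\gamma_i = (\boldsymbol{\gamma}_{i1}, \ldots, \boldsymbol{\gamma}_{iG})$, where $\boldsymbol{\gamma}_{ig} = (\gamma_{ig1}, \ldots, \gamma_{igd_g})$ is the vector of deprivation gaps in dimension $D_g$ for each $g \in \{1, \ldots, G\}$. Hence $\gamma_{igv}$ is individual $i$'s deprivation gap in variable $v$ belonging to dimension $g$. When no confusion arises, we omit the label $i$. For any $g \in \{1, \ldots, G\}$, let
$[0,1]^d_g := \{\gamma \in [0,1]^d \mid \gamma_{hv} = 0 \ \forall h \neq g,\ \forall v \in \{1, \ldots, d_h\}\}$.
This is the set of deprivation gap vectors for those individuals who only experience deprivations within dimension $g$. Given any partition $(D_1, \ldots, D_G)$, let
$[0,1]^d_E := \{\gamma \in [0,1]^d \mid \gamma_{gu} = \gamma_{gv} \ \forall u, v \in \{1, \ldots, d_g\},\ \forall g \in \{1, \ldots, G\}\}$.
This is the set of deprivation gap vectors whose components are constant within each dimension. Lastly, observe that any deprivation gap matrix $\Gamma \in \mathcal{G}^{n \times d}$ can be obtained by appending the different $\Gamma_g \in \mathcal{G}^{n \times d_g}$ (one per dimension), so that $\Gamma = (\Gamma_1 | \ldots | \Gamma_G)$.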
In order to generalize the family of multivariate poverty indices shown in Theorem 3 to the multidimensional context, we introduce the following axioms.
Within Dimension Separability (WDS): Let $(D_1, \ldots, D_G)$ be any partition of $D$ in $G \geq 1$ dimensions. Consider any dimension $g \in \{1, \ldots, G\}$ and let $\gamma, \delta \in [0,1]^d_g$ be two deprivation gap vectors such that for some variable $v \in \{1, \ldots, d_g\}$, $\gamma_{gv} = \delta_{gv}$. Let $\gamma', \delta' \in [0,1]^d_g$ be two other deprivation gap vectors such that $\gamma'_{gu} = \gamma_{gu}$ and $\delta'_{gu} = \delta_{gu}$ for $u \neq v$ and $\gamma'_{gv} = \delta'_{gv}$. Then
Between Dimension Separability (BDS): Let $(D_1, \ldots, D_G)$ be any partition of $D$ in $G \geq 1$ dimensions. Let $r, s \in [0,1]^d_E$ be two deprivation gap vectors such that $r_g = s_g$ for some dimension $g \in \{1, \ldots, G\}$. Let $r', s' \in [0,1]^d_E$ be two other deprivation gap vectors such that $r'_j = r_j$ and $s'_j = s_j$ for $j \neq g$ and $r'_g = s'_g$. Then f d ([r]
WDS stipulates that the poverty ranking between two individuals who are only deprived in dimension $g$ depends only on the set of variables where their values differ, irrespective of what happens with the ones where they coincide. In other words, it requires that any variable should be separable from its complement within the same dimension only. In this respect, WDS is much weaker than SEP (the latter requiring any variable to be separable from all other variables irrespective of the dimension they belong to). Indeed, SEP can be seen as a particular case of WDS when $G = 1$. Likewise, BDS requires the different dimensions to be separable from each other. Interestingly, SEP can also be seen as a particular case of BDS when $G = d$ (each variable constitutes one dimension).
Proof See the "Appendix".

Implications and remarks
Remark 1 The family of multidimensional poverty indices characterized in Theorem 4 satisfies the following identity:  (γ i11 , . . . , γ i1d 1 ), . . . , ϕ G (γ i G1 , . . . , γ i Gd G ) . For obvious reasons, we call this the dimension-first two-stage aggregation method, and leave its more complete exploration for future research.
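The dimension-first two-stage aggregation can be sketched as follows, under the assumption that both stages are weighted generalized means: one of order $\theta_g$ within each dimension $D_g$ and one of order $\theta$ across dimensions, which is how the parameters of (16) are described in the text. The function names, weights and gap values are hypothetical.

```python
def gmean(values, weights, theta):
    """Weighted generalized mean of order theta > 0."""
    return sum(w * v ** theta for w, v in zip(weights, values)) ** (1.0 / theta)


def two_stage(gaps_by_dim, within_weights, within_thetas, between_weights, theta):
    """Aggregate within each dimension first, then across dimensions."""
    dim_scores = [
        gmean(g, w, t)
        for g, w, t in zip(gaps_by_dim, within_weights, within_thetas)
    ]
    return gmean(dim_scores, between_weights, theta)


# Hypothetical individual: two dimensions with two variables each.
gaps_by_dim = [[0.2, 0.6], [0.5, 0.9]]
within_weights = [[0.5, 0.5], [0.3, 0.7]]
between_weights = [0.4, 0.6]

# Different orders within and across dimensions.
flexible = two_stage(gaps_by_dim, within_weights, [0.5, 3.0], between_weights, theta=2.0)

# With a common order, the nested mean collapses to a single-stage mean whose
# weights are the products of between- and within-dimension weights.
nested = two_stage(gaps_by_dim, within_weights, [2.0, 2.0], between_weights, theta=2.0)
flat_gaps = [g for dim in gaps_by_dim for g in dim]
flat_weights = [b * w for b, ws in zip(between_weights, within_weights) for w in ws]
flat = gmean(flat_gaps, flat_weights, 2.0)
```

In this sketch `nested` and `flat` agree up to rounding, which mirrors the observation that the multidimensional measure reduces to the single-stage one when all orders coincide.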

Remark 2 When homogeneity (HMG) is substituted by homotheticity (HMT), it is easy to show that the functional form of the multivariate poverty indices characterized in Theorem 4 can be written as where $\beta_1, \beta_2 > 0$ (result not shown here but available upon request). Hence, by replacing the somewhat restrictive HMG with HMT, we enlarge the class of admissible dimension-first two-stage aggregation methods we can use to measure multidimensional poverty. Yet, such increased flexibility is gained at the cost of complicating the interpretation of these measures. Since the more complicated functional forms shown in (17) are not strictly necessary for the purposes of this paper, in the rest of this section we stick to their simpler version shown in (16).

Remark 3 What happens in the ordinal setting where the deprivation gaps are dichotomous [i.e. c = 0 in Eq. (1)]? In that case, CON, HMG and HMT are not well-defined. In addition, the proofs of Theorems 3 and 4 rely on the fact that the deprivation gaps $\gamma_{ij}$ can take any real value in [0, 1] (see "Appendix"). Hence, these two theorems do not apply and their ordinal scale versions await further research. 12

Remark 4 An important characteristic of multidimensional poverty measures is their sensitivity to correlation increasing switches. 13 As discussed in Bourguignon and Chakravarty (2003: 35), these measures might increase or decrease after correlation increasing switches depending on whether the attributes under consideration are complements or substitutes. 14 Unfortunately, currently existing measures assume that all pairs of variables are either complements or substitutes, an assumption that might not necessarily hold. The poverty measure G θ shown in (16) makes room for the possibility of having pairs of variables that are complements or substitutes depending on whether they belong to the same or to different dimensions (see Proposition 3 below). Clearly, when $G = 1$, Eq. (16) reduces to Eq. (12). The new poverty measure depends on the parameter $\theta$ (governing the complementarity or substitutability across dimensions) and on the different $\theta_g$ (governing the complementarity or substitutability between variables within dimension $D_g$). Whenever $\theta = \theta_1 = \cdots = \theta_G$, the 'G-dimensional measure' G θ is equivalent to the '1-dimensional measure' shown in Eq. (12), so all pairs of deprivations have the same elasticity of substitution. However, when one departs from that trivial case, the levels of complementarity/substitutability between deprivations in the poverty measure G θ can vary across dimensions. Following Bourguignon and Chakravarty (2003: 35), it is trivial to check that when using G θ, poverty does not decrease (resp. increase) after a correlation increasing switch whenever the concerned variables satisfy the conditions of substitutability (resp. complementarity) stated in the following proposition.
Proof See the "Appendix".
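The proof in the Appendix states that two variables in the same dimension are complements (positive cross-partial derivative of the individual poverty function) whenever $\theta_g < \min\{1, \theta\}$ and substitutes (negative cross-partial) whenever $\theta_g > \max\{1, \theta\}$. The following sketch checks these signs numerically with central finite differences. It assumes a two-stage generalized-mean form for (16) with equal weights at both stages; the function names, the gap values and the step size `h` are illustrative choices, not part of the paper.

```python
def gmean(values, weights, theta):
    """Weighted generalized mean of order theta > 0."""
    return sum(w * v ** theta for w, v in zip(weights, values)) ** (1.0 / theta)


def two_stage(gaps_by_dim, within_thetas, theta):
    """Dimension-first aggregation with equal weights (an illustrative choice)."""
    scores = []
    for gaps, t in zip(gaps_by_dim, within_thetas):
        w = [1.0 / len(gaps)] * len(gaps)
        scores.append(gmean(gaps, w, t))
    return gmean(scores, [1.0 / len(scores)] * len(scores), theta)


def cross_partial(gaps_by_dim, within_thetas, theta, first, second, h=1e-4):
    """Central finite-difference estimate of the mixed second derivative.

    `first` and `second` are (dimension, variable) index pairs.
    """
    def shifted(s1, s2):
        g = [list(dim) for dim in gaps_by_dim]
        g[first[0]][first[1]] += s1 * h
        g[second[0]][second[1]] += s2 * h
        return two_stage(g, within_thetas, theta)

    return (shifted(1, 1) - shifted(1, -1) - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)


gaps = [[0.3, 0.6], [0.4, 0.8]]   # hypothetical gaps, two dimensions of two variables

# theta_g below min{1, theta}: the two variables of dimension 0 behave as complements.
low = cross_partial(gaps, [0.5, 0.5], 2.0, (0, 0), (0, 1))
# theta_g above max{1, theta}: the same pair behaves as substitutes.
high = cross_partial(gaps, [3.0, 3.0], 2.0, (0, 0), (0, 1))
```

A numerical check at one point is of course no substitute for the analytical proof; it only illustrates how the within-dimension order $\theta_g$ flips the sign of the interaction.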
Remark 5 Following Sen (1976) and the ensuing literature on poverty measurement, the multidimensional poverty methodologies proposed in this paper have two separate parts: the identification step (ρ) and the aggregation step (A). It is only after identifying who is poor and who is not that one proceeds to summarize the extent of poverty among the poor into a single number. While this sequential approach does not generate any problem for 'traditional' income poverty measures, in the multidimensional context it can create different kinds of inconsistencies. One of them might arise because of a lack of coherence between the identification and aggregation steps, which in this paper have been characterized independently. While this independence gives ample room to generate identification and aggregation functions in highly flexible ways, it could eventually lead to some kind of mismatch between them. 15 The other undesirable consequence of defining ρ and A separately is that whenever the identification of the poor ρ is not performed via the union approach, the multidimensional poverty methodology (ρ, A) will fail to be continuous, even if the aggregation methods A proposed in (12), (13), (16) and (17) use continuous functions. Technically, such discontinuities arise because when ρ does not correspond to the union approach, the deprivations of the non-poor are censored. This issue currently affects virtually all the multidimensional poverty methodologies using the dual cutoff method proposed by Alkire and Foster (2011), as well as the consistent identification functions proposed in this paper. The problem and its implications have been discussed at length by Datt (2017) and Pattanaik and Xu (2018). While Datt (2017) suggests using the union approach to avoid such discontinuities and other distribution-related problems, Pattanaik and Xu (2018) suggest a novel method in which the identification of the poor is based on the weighted sum of deprivation gaps (rather than the weighted count of dichotomous deprivations that is customary in the counting approach; see footnote #12). This identification method, which seems to blur the hitherto crisp boundaries between the identification and aggregation steps, offers an interesting avenue of research that should be explored in the near future.

Discussion and conclusion
This paper addresses one of the most fundamental challenges still pending in the measurement of multidimensional poverty: the modeling of non-trivial relational structures across variables. As opposed to currently existing methods, which assume that all pairs of variables are either complements or substitutes and that the elasticity of substitution is constant across them, our approach makes it possible to model flexibly the complexities and subtleties of the trade-offs involved in poverty measurement. The fact that some pairs of poverty indicators are complements while others are substitutes can have crucial implications both when identifying the poor and when aggregating their poverty levels. The techniques presented in this paper make it possible, for the first time, to take these considerations into account both in the identification and aggregation steps. This is accomplished by generalizing and going beyond the 'deprivation count distributions' or 'counting methods' that pervade current approaches to multidimensional poverty measurement, whereby individuals' multidimensional poverty status and depth of poverty are assessed on the basis of the number of variables in which these individuals are deprived (see Atkinson 2003; Bourguignon and Chakravarty 2003; Alkire and Foster 2011; Silber and Yalonetzky 2014; Aaberge and Brandolini 2015). The new identification and aggregation methods proposed here have been axiomatically characterized to flesh out the normative foundations upon which they are based.
Our approach is very general and includes most of the currently existing multidimensional poverty methodologies proposed in the literature as particular cases. Inter alia, it includes the counting approach suggested by Alkire and Foster (2011) that is massively used in empirical applications (like, for instance, in the United Nations' Multidimensional Poverty Index). Among the multidimensional poverty methodologies that are not covered by the techniques proposed in this paper, it is worth highlighting the 'dual approach' proposed by Aaberge et al. (2015). Using a social evaluation function that applies a certain distortion to the CDF of the deprivation count across the population, the dual approach is a flexible method that explicitly takes into consideration the complementarity/substitutability among attributes in the aggregation step. Yet, like all other aggregation methodologies proposed in the literature so far, the dual approach assumes that all pairs of attributes are either complements or substitutes.
All modeling exercises face inescapable trade-offs between parsimony and realism. In this regard, the two-stage identification and aggregation methods introduced in this paper lie between two extremes: (i) extreme parsimony, in which the relational structure between variables is simply ignored, and (ii) extreme realism, in which one attempts to model the association between all $d(d-1)/2$ variable pairs. While the former approach (which represents the current state of affairs) is excessively crude, the latter quickly becomes statistically intractable as the number of variables increases. The two-stage approach suggested here is flexible enough to capture important aspects of the relational structure between variables without falling prey to statistical over-sophistication.
The tools proposed here have vast potential for empirical applications in a wide variety of settings, ranging from microeconomic theory to development economics. They allow analysts to model multidimensional poverty more realistically, but they also confront them with questions that are more difficult to answer (e.g. how should the identification functions ρ ∈ I d be chosen? How should the degree of complementarity/substitutability across and within dimensions be determined?). The answers to these questions are highly context-specific and are likely to require econometric models to calibrate the values of the parameters governing the trade-offs within and across dimensions.
To the extent that the success of micro level anti-poverty programs depends on targeting the right individuals and properly assessing their deprivation levels, and that current international cooperation, development and aid programs are guided by the macro level results derived from the corresponding measures, the issues analyzed in this paper have practical and financial implications for the design of effective poverty eradication strategies. Having recently reached the Millennium Development Goals (MDGs) target year, many scholars and policy-makers are currently engaged in an intense debate about what kind of headline poverty indicator should be the most appropriate to guide poverty eradication strategies in the post-2015 global development agenda. Like its predecessor, the first of the so-called Sustainable Development Goals (the SDGs) aims to 'End Poverty in all its forms everywhere'. This is a good moment to take stock and reflect before uncritically extending use of the counting approach. Other procedures, such as the ones suggested here, exist to identify recipients and assess their poverty levels under one of the greatest international endeavours of our time to eradicate poverty.
y ≤ x. Now, if $y \in L_{P_\rho} \subset P_\rho$ then $x \in y^{\uparrow}$. Since $\rho \in I_d$, one can conclude that $x \in L_{P_\rho}^{\uparrow}$. Otherwise, if $y \notin L_{P_\rho}$ then we can proceed iteratively until reaching an element belonging to $L_{P_\rho}$. That is: since $X_d$ is finite ($|X_d| = 2^d$) there must exist a finite sequence of vector dominations $z_i \leq z_{i+1}$ from some element $z_1 \in L_{P_\rho}$ up to $x$ (i.e. $z_1 \leq z_2 \leq \cdots \leq z_n \leq x$), so that $x \in z_1^{\uparrow}$. Since $\rho \in I_d$, one can conclude that $x \in L_{P_\rho}^{\uparrow}$.
This proves the 'if' part of the proposition. The 'only if' part of the proof goes as follows. Assume $P_\rho$ is a subset of $X_d$ such that $L_{P_\rho}^{\uparrow} = P_\rho$. We have to prove that $\rho \in I_d$. Take any $x \in P_\rho$. Since $L_{P_\rho}^{\uparrow} = P_\rho$ we can say that $x \in z^{\uparrow}$ for some $z \in L_{P_\rho}$. Consider now any $y \in x^{\uparrow}$. By the transitivity of $\leq$ one has that $y \in z^{\uparrow}$. Since $L_{P_\rho}^{\uparrow} = P_\rho$, we can conclude that $y \in P_\rho$.
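The equality between the set of poor profiles and the upward closure of its least deprived members can be checked by brute force on a small example. The sketch below assumes that $L_{P_\rho}$ denotes the minimal elements of $P_\rho$ under componentwise domination and that $\uparrow$ denotes the up-set they generate; the identification rule `rho` and all function names are hypothetical.

```python
from itertools import product


def dominates(x, y):
    """True when x >= y componentwise (x is deprived wherever y is)."""
    return all(a >= b for a, b in zip(x, y))


def minimal_elements(poor):
    """Profiles in `poor` that dominate no other poor profile."""
    return {x for x in poor if not any(y != x and dominates(x, y) for y in poor)}


def upward_closure(generators, d):
    """All profiles in {0,1}^d that dominate at least one generator."""
    return {x for x in product((0, 1), repeat=d)
            if any(dominates(x, z) for z in generators)}


d = 3
profiles = list(product((0, 1), repeat=d))


def rho(x):
    """Hypothetical monotone rule: poor if deprived in variable 0, or in both 1 and 2."""
    return int(x[0] == 1 or (x[1] == 1 and x[2] == 1))


poor = {x for x in profiles if rho(x) == 1}
least = minimal_elements(poor)

# A set that is not closed upwards cannot be recovered from its minimal elements.
not_monotone = {(1, 0, 0)}
```

For the monotone rule the up-set generated by `least` coincides with `poor`, whereas for `not_monotone` the closure adds profiles such as (1, 1, 0) that the rule does not classify as poor.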

Proof of Theorem 1
It is easy to verify that when S d = C d , then any ρ ∈ S satisfies MON, IND, VAN and NTR. Therefore, we will prove that when a group of identification functions S d ⊆ d satisfies these four axioms then it must be equal to C d .
We will first prove the following auxiliary lemma: Auxiliary Lemma 1 Whenever IND and VAN hold for some ρ ∈ S d ⊆ d , then, one has that |x| = | y| ⇒ ρ(x) = ρ( y) for all x, y ∈ X d and all ρ ∈ S d .
Let $\rho \in S_d$ and let $m > 1$ be an integer for which $|x| = |y|$ entails $\rho(x) = \rho(y)$ for all $x, y \in X_d$ such that $|x| = |y| < m$. We will now prove that the result also holds whenever $|x| = |y| = m$. Let $x, y \in X_d$ be two vectors with $|x| = |y| = m$. There are two mutually exclusive cases.
Case 1 Assume there exists a variable $j \in \{1, \ldots, d\}$ such that $x_j = y_j = 1$. Consider the vectors $x - e_j$, $y - e_j$. Since $|x - e_j| = |y - e_j| = m - 1 < m$, the induction hypothesis holds, so $\rho(x - e_j) = \rho(y - e_j)$. Now, by IND one has that $\rho(x) = \rho(y)$, as desired.
Case 2 Assume there does not exist any variable $j \in \{1, \ldots, d\}$ such that $x_j = y_j = 1$. Consider two different variables $j, l \in \{1, \ldots, d\}$ such that $x_j = 0$, $y_j = 1$ and $x_l = 1$, $y_l = 0$. Since $|x - e_l| = |y - e_j| = m - 1 < m$, the induction hypothesis holds, so $\rho(x - e_l) = \rho(y - e_j)$. Now, by IND one has that Observe that Comparing (A1) with (A5), we can conclude that $\rho(x) = \rho(y)$, as desired. This proves Auxiliary Lemma 1.
Take now any two vectors x, y ∈ X d such that |x| ≥ | y|. Define now w ∈ X d in such a way that |w| = | y| and δ(w) ⊆ δ(x). By auxiliary lemma 1, one has ρ(w) = ρ( y). On the other hand, by MON ρ(x) ≥ ρ(w), so one can conclude that ρ(x) ≥ ρ( y). This ensures that ρ is a counting measure for all ρ ∈ S d .
Statement: If COM holds for a given S d ⊆ d , then IND holds as well for all ρ ∈ S d ; however, the opposite is not necessarily true.
Proof of the statement To verify this claim, let us start by assuming that COM applies for a certain set of identification functions S d ⊆ d . Consider now the deprivation vectors $x, y, x', y' \in X_d$ as in the statement of IND. Then, it is trivial to verify that $(x, y')$ and $(y, x')$ are equivalent societies (with $m = 2$). Imposing COM, one has that $\rho(x) \geq \rho(y)$ implies $\rho(x') \geq \rho(y')$ for all $\rho \in S_d$: this is precisely what IND states. On the other hand, one can find infinitely many examples of sets of identification functions satisfying IND but failing to satisfy COM. A very simple example for the case $d = 3$ can be $S_d = \{\rho_0\}$, where It is trivial to check that $S_d = \{\rho_0\}$ satisfies IND. However, it does not satisfy COM. To verify this, consider the following pair of three-person societies $(x_1, x_2, x_3)$ and $(y_1, y_2, y_3)$ with $x_1 = (111)$, $x_2 = (101)$, $x_3 = (001)$, $y_1 = (101)$, $y_2 = (011)$, $y_3 = (101)$. Clearly, both societies are equivalent (the number of individuals experiencing deprivation in each variable coincides). However, we have that $\rho_0(x_1) \leq \rho_0(y_1)$ and $\rho_0(x_2) \leq \rho_0(y_2)$ and yet $\rho_0(x_3) < \rho_0(y_3)$, thus contradicting COM.
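The equivalence of the two three-person societies used in this counterexample is easy to verify mechanically. The sketch below only checks that the number of deprived individuals coincides variable by variable; it does not evaluate $\rho_0$. The function names are hypothetical.

```python
def deprivation_counts(society):
    """Number of individuals deprived in each variable (column sums)."""
    return [sum(column) for column in zip(*society)]


def equivalent(society_a, society_b):
    """Two societies are equivalent when their per-variable counts coincide."""
    return deprivation_counts(society_a) == deprivation_counts(society_b)


# Deprivation vectors taken from the counterexample in the text.
x_society = [(1, 1, 1), (1, 0, 1), (0, 0, 1)]
y_society = [(1, 0, 1), (0, 1, 1), (1, 0, 1)]
```

In both societies two individuals are deprived in the first variable, one in the second and three in the third.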

Proof of Theorem 2
It is easy to verify that when $S_d = W_d$, then any $\rho \in S_d$ satisfies MON, COM and NTR. Therefore, we will prove that when a group of identification functions S d ⊆ d satisfies these three axioms then it must be equal to $W_d(k)$ for some $k \in (0, 1]$. Since (i) $X_d$ is finite, (ii) each $\rho \in S_d$ induces a complete ordering in $X_d \times X_d$, and (iii) COM holds for all $\rho \in S_d$, the hypotheses of Theorem 4.1.B in Fishburn (1970) are satisfied. Therefore, for all $x, y \in X_d$ and for all $\rho \in S_d$ one has that for some real-valued functions $u_1, \ldots, u_d$ on $\{0, 1\}$. One can rewrite the last expression as follows In turn, this expression can be rewritten as can be written as By MON, one must have that $w_j \geq 0\ \forall j$. This ensures that ρ is a counting measure for all $\rho \in S_d$. By MON and NTR, $\rho(\mathbf{0}) = 0$ and $\rho(\mathbf{1}) = 1$. 16 Lastly, by MON and NTR there must exist a real number $q \in (0, \sum_j w_j]$ such that $\rho(x) = 0$ whenever $\sum_{j=1}^{d} w_j x_j < q$ and $\rho(x) = 1$ whenever $\sum_{j=1}^{d} w_j x_j \geq q$. Defining $a_i := w_i / \sum_j w_j$ and $k := q / \sum_j w_j$ we have found a vector of weights $a$ and a deprivation threshold $k \in (0, 1]$ such that $\rho \in W_d$, as desired. This proves Theorem 2.
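The last step of the proof, in which the raw weights $w_j$ and the cutoff $q$ are rescaled into normalized weights $a_j$ and a threshold $k \in (0, 1]$, can be sketched as follows. The function names and the numerical values are hypothetical; exact rational arithmetic is used only so that comparisons against the threshold are not affected by rounding.

```python
from fractions import Fraction


def weighted_counting(x, weights, k):
    """Return 1 (poor) when the weighted deprivation count reaches k, else 0."""
    score = sum(a * xi for a, xi in zip(weights, x))
    return int(score >= k)


def normalise(raw_weights, q):
    """Rescale raw weights w_j and cutoff q so that the weights sum to one."""
    total = sum(raw_weights)
    return [w / total for w in raw_weights], q / total


raw = [Fraction(2), Fraction(1), Fraction(1)]   # hypothetical raw weights w_j
a, k = normalise(raw, Fraction(2))              # hypothetical cutoff q = 2
```

Rescaling leaves the classification unchanged: a profile reaches the cutoff $q$ with the raw weights exactly when it reaches $k$ with the normalized ones.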

Proof of Proposition 2
It is straightforward to prove that if $S_d \in \{I_d, W_d, C_d\}$ then $S^\psi_d$ satisfies NTR and MON. We will only show that $S^\psi_d$ does not satisfy COM. For that purpose we need to prove an auxiliary lemma (see below). According to (11), $C^\psi_d$ is the set of identification functions for $d$ variables where the unweighted counting approach is used within and between dimensions (given ψ ∈ G ). To allow proper labelling, this set will be rewritten as $C^\psi_d(k_1, \ldots, k_G; k_b)$, where $k_1, \ldots, k_G$ denote the domain specific deprivation thresholds and $k_b$ the between domain deprivation threshold. Within this set, define the two subsets
$\{\rho \in C^\psi_d(k_1, \ldots, k_G; k_b) \mid k_b < 1 \text{ and } k_{g_j} \geq 2/d_{g_j} \text{ for at least two } g_j \in \{1, \ldots, G\}\}$, (A11)
$\{\rho \in C^\psi_d(k_1, \ldots, k_G; k_b) \mid k_b = 1 \text{ and } k_{g_j} < 1 \text{ for at least two } g_j \in \{1, \ldots, G\}\}$.
The first set contains identification functions where poor individuals do not have to experience deprivation in all dimensions simultaneously and in some of them they must be deprived in at least two variables. The second set contains identification functions where poor individuals have to experience deprivation in all dimensions simultaneously but where deprivation need not be universal within at least two of these dimensions. These two sets are generalizations of Examples 1 and 2 to the multiple dimension context.
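To illustrate why such two-stage rules fall outside the weighted counting family, the sketch below takes one hypothetical member of the second kind of set: two dimensions with two variables each, the union approach within each dimension and the intersection approach between them. It then searches a grid of weights and thresholds for a weighted counting rule that reproduces it. The grid search is only an illustration on a finite grid, not a proof, and all names are assumptions.

```python
from itertools import product


def two_stage_poor(x):
    """Hypothetical rule with G = 2 dimensions of two variables each:
    union within each dimension, intersection between dimensions."""
    dim1 = x[0] or x[1]
    dim2 = x[2] or x[3]
    return int(dim1 and dim2)


def matches_weighted_count(weights, k):
    """True if 'weighted count >= k' reproduces two_stage_poor on all profiles."""
    return all(
        int(sum(w * xi for w, xi in zip(weights, x)) >= k) == two_stage_poor(x)
        for x in product((0, 1), repeat=4)
    )


# Integer weights summing to 20 and integer cutoffs, i.e. a grid of step 0.05.
STEPS = 20
found = [
    (w, k)
    for w in product(range(STEPS + 1), repeat=4)
    if sum(w) == STEPS
    for k in range(1, STEPS + 1)
    if matches_weighted_count(w, k)
]
```

The search returns no candidate, which is what one would expect: the rule requires every cross-dimension pair of weights to reach the threshold while both within-dimension pairs stay below it, and adding up these inequalities gives a contradiction.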

Proof of Auxiliary Lemma 2
In both cases we follow the same strategy: if ρ belongs to either of the two sets defined above, we start by assuming that there is a weighting scheme $a$ and a deprivation threshold $k$ such that $\rho \in W_d$, in order to arrive at a contradiction. Given the partition of $D$ in $G$ dimensions $(D_1, \ldots, D_G)$, we will denote the elements of the weighting vector $a$ as $a_{gv}$, where $g \in \{1, \ldots, G\}$ indexes the member of the partition $D_g$ to which the weight belongs and $v \in \{1, \ldots, d_g\}$ indexes the members within domain $D_g$. We can assume without loss of generality that within each domain $D_g$ the weights are sorted in non-ascending order, i.e. $a_{gv} \geq a_{g,v+1}$ for all $g \in \{1, \ldots, G\}$ and all $v \in \{1, \ldots, d_g - 1\}$.
It is trivial to show that the system of inequalities shown in (A20) does not have feasible solutions. In the first inequality of the system, either $a_{11}$ or $a_{12}$ must be greater than or equal to $k/2$. The same goes for $a_{21}, a_{22}$ in the second inequality of the system: at least one of them must be greater than or equal to $k/2$. Picking the largest elements between $a_{11}, a_{12}$ and $a_{21}, a_{22}$ and adding them up results in a number that is greater than or equal to $k$, therefore contradicting at least one of the last four inequalities of the system. We have reached the contradiction we were looking for.
Let us now consider the second case. Without loss of generality, we can assume that the two dimensions $g_1, g_2 \in \{1, \ldots, G\}$ with $k_{g_1} < 1$ and $k_{g_2} < 1$ are $g_1 = 1$ and $g_2 = 2$. Since ρ belongs to the second set, there exist $\rho^b \in C_G(1)$ and $\rho^w_g \in C_{d_g}(k_g)$ such that $\rho(x) = \rho^b(\rho^w_1(x_1), \ldots, \rho^w_G(x_G))$. By definition, the following inequalities must hold: Again, it is trivial to prove that the system of inequalities shown in (A28) does not have feasible solutions. In the second to last inequality of the system, either $a_{11}$ or $a_{12}$ must be smaller than $k/2$. The same goes for $a_{21}, a_{22}$ in the last inequality of the system: at least one of them must be smaller than $k/2$. Picking the smallest elements between $a_{11}, a_{12}$ and $a_{21}, a_{22}$ and adding them up results in a number that is smaller than $k$, therefore contradicting at least one of the first four inequalities of the system. We have reached the contradiction we were looking for. This proves Auxiliary Lemma 2.
The proof of Proposition 1 is now almost immediate. According to Auxiliary Lemma 2, $C^\psi_d \setminus W_d \neq \emptyset$ (essentially, it is only when the union or intersection
Since $p$ is linearly homogeneous on $\gamma$, one must have that $(b(d - 1) + \varsigma)/a = 0$. Therefore, Eq. (A30) can be rewritten as
for some real constants $A, B > 0$. Hence, whenever $\theta_g < \min\{1, \theta\}$, $\partial^2 \pi_\theta / \partial\gamma_{gv}\partial\gamma_{gu} > 0$, so the attributes $u, v$ belonging to the same dimension are complements. On the other hand, whenever $\theta_g > \max\{1, \theta\}$, $\partial^2 \pi_\theta / \partial\gamma_{gv}\partial\gamma_{gu} < 0$, so the attributes $u, v$ belonging to the same dimension are substitutes. This proves part (i). For part (ii), we need to compute $\partial^2 \pi_\theta / \partial\gamma_{gv}\partial\gamma_{hu}$. After algebraic manipulations it can be shown that