How to Render Species Comparable Taxonomic Units Through Deep Time: a Case Study on 2 Intraspecific Osteological Variability in Extant and Extinct Lacertid Lizards

Generally, the species is considered to be the only naturally occurring taxon. However, species recognised and defined using different species delimitation criteria cannot readily be compared, impacting studies of biodiversity through Deep Time. This comparability issue is particularly marked when comparing extant with extinct species, because the only available data for species delimitation in fossils is derived from their preserved morphology, which is generally restricted to osteology in vertebrates. Here, we quantify intraspecific, intrageneric, and intergeneric osteological variability in extant species of lacertid lizards using pairwise dissimilarity scores based on a dataset of 253 discrete osteological characters for 99 specimens referred to 24 species. Variability is always significantly lower intraspecifically than between individuals belonging to distinct species of a single genus, which is in turn significantly lower than intergeneric variability. Average values of intraspecific variability and associated standard deviations are consistent (with few exceptions), with an overall average within a species of 0.208 changes per character scored. Application of the same methods to six extinct lacertid species (represented by 40 fossil specimens) revealed that intraspecific osteological variability is inconsistent, which can at least in part be attributed to different researchers having unequal expectations of the skeletal dissimilarity within species units. Such a divergent interpretation of intraspecific and interspecific variability among extant and extinct species reinforces the incomparability of the species unit. Lacertidae is an example where extant species recognised and defined based on a number of delimitation criteria show comparable and consistent intraspecific osteological variability. Here, as well as in equivalent cases, application of those skeletal dissimilarity values to palaeontological species delimitation potentially provides a way to ameliorate inconsistencies created by the use of morphology to define species.

(Sneath and Sokal 1973), which was mostly applied at higher taxonomic levels than the species. 155 Although numerical taxonomy as a field has since been abandoned in favour of phylogenetic 156 approaches, these methodologies continue to be used to quantify morphological disparity, application of extant variability scores, to delimit extinct species or assess their validity, been 163 relatively widespread (e.g., Simpson 1941;Gingerich 1981;Kay 1982;Kelley 1986;Roth 1992). 164 In lacertid lizards -the focus of our study -knowledge of morphological intraspecific

176
Our study comprises three analytical steps. First, we characterised intraspecific (comparing two 177 specimens assigned to one species), intrageneric (comparing two specimens assigned to two the utility of their inclusion is limited. "Lacerta" siculimelitensis is also solely known from between specimens on disparity analyses. Pseudeumeces cadurcensis is here represented by eight 277 specimens: an articulated lower jaw (the most complete individual specimen to our knowledge), 278 and seven disarticulated cranial bones from a number of localities in France. 279 Specimen Identification.-Correct species identifications of the sampled specimens is 280 paramount to studies of intraspecific variability. Here, 28 of the 99 specimens of extant species 281 were collected, identified based on external morphological features and locality data, and then 282 prepared by one of us (MD). The other identifications were mostly adopted from the collection 283 catalogues, which were assumed to have been compiled by other expert herpetological 284 taxonomists. Exceptions to this were made when we encountered identifications that appeared 285 highly dubious based on the associated collection data and/or strongly aberrant size or 286 morphology of the specimen, and where responsible collection staff urged caution. All 287 specimens with dubious identification were excluded from scoring. Many of the studied 288 specimens were referred to a species and accessioned in collections before important revisions of 289 those respective species or genera were published, and the ID associated with the specimens we 290 studied has not been updated since. These include specimens identified as "Lacerta ocellata" and 291 Lacerta viridis. The populations formerly ascribed to the first taxon are now referred to several 292 different species included in the genus Timon. The species L. viridis is still a valid species within 293 the genus Lacerta, but populations previously referred to the subspecies L. viridis bilineata were 294 raised to species rank in the 1990s (see, among others, Arnold et al. 2007). All the species 295 currently recognized as valid have distinct geographical distributions, and therefore museum 296 specimens, skeletal preparations included, catalogued as "Lacerta ocellata" and Lacerta viridis 297 with associated locality information could still be attributed to their respective species.
The identification of the fossil specimens was taken entirely from literature and museum 299 catalogues for analytical reasons. Because we wanted to test if extinct species as recognized by 300 palaeontologists had disparate intraspecific variability compared to extant species, we had to 301 resort to those earlier referrals by default. from performing an independent analysis based on our own dataset. However, the main 308 importance for this study is that all included species belong to Lacertidae, so we can assess if 309 osteological intraspecific variability is consistent among the extant species in this particular 310 clade, and could reasonably be used as a guideline to delimit extinct lacertid species, as well. for which more complete scorings were available. We used hierarchical clustering analysis 418 implemented through the R package 'pvclust' (https://github.com/shimo-lab/pvclust) to 419 determine whether PCoA clusters were able to discriminate between extant genera, and between 420 species within the well-represented genera Lacerta, Podarcis, and Timon. We used a 421 modification of Ward's clustering method, with a significance threshold of 0.05.  Table 4).

429
In addition to quantifying missing data per se, we explored the dataset using the All     with numbers of specimens ranging from two to the maximum sampled, with each sample size 460 replicated 100 times, and the maximum, minimum, and mean pairwise dissimilarity recorded.

462
Pairwise Dissimilarity 463 Among extant taxa, intergeneric dissimilarity was consistently significantly greater than 464 intrageneric/interspecific dissimilarity, which in turn was consistently significantly greater than 465 intraspecific dissimilarity (Fig. 1). Taxa sampled by two or fewer specimens, such as the three  postcranial characters, and qualitative dental characters, were all significantly more dissimilar intraspecifically than the pooled variation. Quantitative dental characters were significantly less 504 intraspecifically dissimilar than the pooled variation.

505
The weighted intraspecific pairwise dissimilarities of Dracaenosaurus croizeti and 506 Pseudeumeces cadurcensis were significantly lower than the pooled intraspecific dissimilarities 507 of the extant taxa, while Mediolacerta roceki and Plesiolacerta lydekkeri were significantly more 508 dissimilar than the extant taxa (Fig. 2b). "Lacerta" filholi and "L." siculimelitensis did not differ  Under an intermediate fossilization simulation, a single extant taxa approximated the 516 patterns seen in its extinct partner (Psammodromus algirus, which was already similar to its 517 extinct partner species "Lacerta" filholi when scored completely). In this simulation, only 518 Podarcis muralis differed significantly in intraspecific dissimilarity from the average of the 519 remaining "extant" species used in the simulations, being significantly less variable than extant 520 taxa. On the contrary, its extinct partner species, Mediolacerta roceki, was significantly more 521 variable than extant taxa, suggesting that the high variability observed in this species is not solely 522 due to low anatomical overlap and/or sample size.

523
In the extreme fossilization simulation, no simulated "fossilized" taxon differed  together, the only Lacerta species to cluster together significantly was Lacerta pamphylica, and 550 several Lacerta specimens cluster with specimens of Podarcis rather than congeners.

551
Even with only specimens of Lacerta included, there is significant overlap between 552 species, and there is no significant tendency for hierarchical clustering analysis to group 553 specimens of a single species to the exclusion of those referred to others (Fig. 4b). An exception 554 is Lacerta pamphylica, of which all three specimens cluster together when all taxa are analysed, 555 but this cluster does not include one of the specimens when only Lacerta is included in the 556 analysis. This is probably a consequence of lacking more disparate species, which make L.  (Table 1), 566 indicating that the absence of character scores in both extant and extinct species is non-random.

567
Such a non-random distribution of missing entries is fairly typical for morphological datasets,  (Table 1). However, as mentioned above, these 582 absolute values of missing data are not necessarily correlated with the utility of the dataset for 583 analyses of pairwise dissimilarity, which requires at least two individuals scored for the same 584 character.

585
Quantification of anatomical overlap shows that extant lacertids have AOIs ranging from 586 34% (Lacerta media) to 75% (L. trilineata) when analysing the entire dataset, whereas extinct 587 species have values between 1% (Plesiolacerta lydekkeri) and 12% ("L." siculimelitensis). COIs 588 covering the entire dataset range from 56% (L. media) to 83% (L. pamphylica) in extant species, 589 and from 14% (P. lydekkeri) to 90% (Mediolacerta roceki) in extinct taxa (Table 1). The AOI 590 and COI are slightly correlated with completeness values in extant taxa (the AOI slightly more 591 so than the COI). Whereas the COI of extant species does not seem to correlate with the number 592 of OTUs in a particular species, the AOI does so, even more than regular completeness values 593 (Fig. 5). Overlap indices in the extinct taxa, however, show the opposite trends, with the COI Minimum observed pairwise dissimilarity, on the other hand, did not show any correlation with niche, and phylogenetic history, the species were represented by divergent sample sizes, and 642 specimens showed different degrees of anatomical overlap. Notwithstanding these differences in 643 their biology, sampling procedure, and available data, pairwise dissimilarity remains consistent.

644
The three partial exceptions are Lacerta media, L. pamphylica, and Ophisops elegans. 645 Lacerta media was found to have a significantly higher intraspecific variability than eight 646 other species within our dataset, whereas no significant difference was found with five other 647 species. It was sampled by four specimens in our dataset and has the lowest anatomical overlap 648 scores over all characters as well as within the cranial, dental, and postcranial subsets (Table 1).

649
The high overall dissimilarity is driven by high variability in qualitative cranial and postcranial 650 characters (Supplementary Table 5). Lacerta media is less, or similarly, variable than many other   The results from our studies corroborate that current species delimitation is generally 703 robust in the extant species we analysed, and that these taxa do not suffer considerably from the 704 species comparability problem. This stability suggests that osteological intraspecific variability 705 can be used as a proxy for other secondary defining properties and may be suitable for species   with other Lacerta specimens was found to be significantly larger than normal intrageneric 865 variability within extant taxa, and even exceeded most of the recovered dissimilarity scores "Lacerta" siculimelitensis.-As for "L." filholi, also "L." siculimelitensis is comparable 870 to extant taxa in its intraspecific osteological variability, although it has a relatively low 871 dissimilarity score (0.1364 ± 0.0641; Supplementary Table 5 unambiguously identify lacertid tooth-bearing bones, so some of the referred specimens may the species (Fig. 2b). Additionally, the high variability in the dentition of P. lydekkeri could be a 2016a) can be assigned to the species, and possibly even material currently referred to 938 Pseudeumeces sp.

940
The differences in intraspecific dissimilarity seen in the extinct lacertids indicate that 941 species delimitation approaches are not always consistent between neontology and 942 palaeontology, even though most specimen identifications were probably based on morphology.

943
In at least two of the six extinct species we sampled, low anatomical overlap did not significantly 944 skew the recovered dissimilarity values. Pseudeumeces cadurcensis is significantly less disparate 945 than any sampled extant taxon, indicating that palaeontologists have been overly strict when 946 referring specimens to this species, whereas Mediolacerta roceki is significantly more variable, 947 suggesting that some specimens referred to this species should be assigned to other taxa.      Table 1: Completeness (C), All Characters Overlap Index (AOI), and Comparable  . Intergeneric (between two specimens of two different genera), intrageneric (between two specimens of a single genus but different species), and interspecific dissimilarity for extant lacertid taxa (left, middle, and right columns, respectively). The horizontal black line in the boxplots represents the median. NS indicates statistically non-significant differences. Increasing number of stars refers to decreasing significance cutoff (***=0.001, **=0.01, *=0.05). Generally, intraspecific dissimilarity is significantly lower than intrageneric dissimilarity, which is significantly lower than intergeneric dissimilarity. The exceptions are species with low sample size (1 specimen per species of Gallotia; 2 specimens of Iberolacerta monticola; 4 of Lacerta media; 3 of Podarcis siculus).
202x202mm (300 x 300 DPI)  : Simulation of missing data in extant species, following patterns observed in extinct species. Intraspecific, weighted pairwise dissimilarity scores (y-axis) are given for the whole dataset, the simulated dataset with intermediate values of missing data, the simulated dataset with the same characters missing from the comparison as in the extinct partner species, and the extinct partner species. The extinct partner species are Plesiolacerta lydekkeri (for Lacerta agilis), Dracaenosaurus croizeti (for L. bilineata), Pseudeumeces cadurcensis (for L. trilineata), Mediolacerta roceki (for Podarcis muralis), "L." filholi (for Psammodromus algirus), "L." siculimelitensis (for Timon lepidus). NS indicates non-significant differences.
Increasing number of stars refers to decreasing significance cutoff (***=0.001, **=0.01, *=0.05). The black line in the box plots represents the median, diamonds represent the mean.