The Mendeleev-Meyer Force Project

Abstract
Here we present the Mendeleev-Meyer Force Project, which aims to tabulate all materials and substances in a fashion similar to the periodic table. The goal is to group and tabulate substances using nanoscale force footprints rather than the atomic number or electronic configuration used in the periodic table. The process is divided into three steps: 1) acquiring nanoscale force data from materials, 2) parameterizing the raw data into standardized input features to generate a library, and 3) feeding the standardized library into an algorithm to generate, enhance or exploit a model to identify a material or property. We propose producing databases modelled on the Materials Genome Initiative, the Medical Literature Analysis and Retrieval System Online (MEDLARS) or the PRoteomics IDEntifications database (PRIDE), and making these searchable online via search engines modelled on PubMed or the PRIDE web interface. A prototype exploiting deep learning algorithms, i.e. multilayer neural networks, is presented.
The starting point in this work for the standardization or tabulation of materials via nanoscale forces consists of force versus distance curves (FDCs) as acquired in force spectroscopy. These curves contain the nanoscale footprint of the substance or material and are typically acquired with an atomic force microscope (AFM) or with a surface force apparatus (SFA). The force between a nanostructure, i.e. the tip of an AFM, and a surface is monitored as a function of separation or distance. In principle, the sensitivity of the AFM should provide information from all the relevant nanoscale force footprints or force contributions exerted between materials in the form of FDCs, as concluded when the AFM was first introduced as a method [1].
On the other hand, the quest to identify and recognize atoms or materials from atomic footprints or FDC data remains an active field of research [2,3]. It is still challenging and proves elusive, particularly when considering the generalization and standardization of measurements and procedures [4,5]. Furthermore, experiments are typically sophisticated [6] and are reported through extensive analysis based on complex models or fundamental theory [2] rather than via automated processes. Approximately a decade ago a significant advance was reported by invoking a particular form of normalization of the raw FDC data [4], and single atoms were identified via specific atomic footprints. A similar normalization of FDC data was more recently employed to identify more complex heterogeneous systems [2]. Even more recently, the AFM tip and the chemical bonding configuration between materials have been reported to influence the force footprint curve to the point of preventing atom identification [7]. Other forms of sample recognition and identification consist of modelling the FDC and parameterizing it with physically relevant parameters such as stiffness [8], adhesion, viscoelasticity [9] or other parametric models [10], and even model-free parameters [11-13]. Parameterization typically involves an intermediate step after acquiring the raw data, which consists of quantification and comparison.
In this way, differences detected in the interaction are exploited as parameter contrast maps that might be employed to discriminate between materials [11]. Standardization and tabulation, however, are still lagging far behind [14] other fields of research such as proteomics, metabolomics and genomics, which are heavily assisted by computer science, large databases, powerful search engines and submission protocols [15] that allow rapid access to the databases. On the other hand, Kalinin et al. have been early proponents of the exploitation of computer science techniques in probe microscopy [13,16].
Here we propose a full transition in the field towards an integration of force spectroscopy and advanced computer science techniques, as done in bioinformatics-assisted biology. The objective is to parameterize force curves, or more generally nanoscale footprints arising from force microscopy, by turning the raw FDC data into features with the abstract meaning typically given to the features employed in machine learning algorithms. In this way, we do not impose any restriction on the number of input features used to identify a given material or family of materials or substances. We employ the term substance purposefully to emphasize that we would like to deal with chemistry, mechanics or even phases, e.g. whether a material's surface is hydrated.
Features are then employed to construct feature libraries for groups of families or for specific families. Finally, an algorithm is exploited to generate a model from a given feature library that, as in the periodic table, groups materials according to similarity. We further employ the concept of classification from standard machine learning, where the output of the algorithm is zero when a non-match is predicted and one when the algorithm predicts a match. In the prototype that we report here a deep learning algorithm, i.e. a multilayer neural network, is trained with the backpropagation method (see supplementary) in Matlab [17]. We use the F-score as a figure of merit to quantify Precision and Recall for the models or classifiers. Precision and Recall are defined as in machine learning: Precision is the ratio between true positives and predicted positives, and Recall is the ratio between true positives and actual positives. The F-score parameter combines Precision and Recall as

\[ \text{F-score} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \]

The advantage of employing the F-score rather than Precision or Recall alone is that high values of the F-score are obtained if and only if both Recall and Precision are high simultaneously. More intuitively, Precision can loosely be read as specificity (strictly, it is the positive predictive value) and Recall as sensitivity, so that high F-score values imply both high specificity and high sensitivity. More detail on these figures of merit is given when discussing a practical example below. We then show that raw contrast images can be processed with the learned models to turn them into images that predict, for every pixel, the likelihood that a given material has been identified.
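As a minimal sketch of these figures of merit, the following Python snippet computes Precision, Recall and the F-score from binary classifier outputs (1 for a match, 0 for a non-match). It assumes the standard form of the F-score, i.e. the harmonic mean 2·Precision·Recall/(Precision+Recall). The function name and the example labels are hypothetical and are not taken from the prototype.

```python
def precision_recall_fscore(predicted, actual):
    """Return (precision, recall, f_score) for binary labels (1 = match, 0 = non-match)."""
    true_pos = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    predicted_pos = sum(1 for p in predicted if p == 1)
    actual_pos = sum(1 for a in actual if a == 1)

    # Guard against empty denominators (no predicted or no actual positives).
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0

    # Harmonic mean: high only when Precision and Recall are both high.
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical classifier outputs compared against known labels.
predicted = [1, 1, 0, 1, 0, 0]
actual = [1, 0, 0, 1, 1, 0]
precision, recall, f_score = precision_recall_fscore(predicted, actual)
```

In this toy case two of the three predicted positives are correct and two of the three actual positives are recovered, so Precision, Recall and the F-score all equal 2/3.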

Raw data acquisition
The initial step for parameterization and tabulation involves acquiring the nanoscale force footprint in the form of an FDC. An example of a raw experimental force curve is shown in Fig. 1a(i), where the tip-sample force F is plotted on the y-axis and the distance d on the x-axis. This force arises from the interactions between the atoms on the tip and the atoms on the sample. We note that both net attractive (F<0 nN) and net repulsive (F>0 nN) forces are shown in the figure. It is also typical to associate the minimum in force with mechanical contact between the AFM tip and the surface [18]. Only points in the force curves satisfying F<0 nN are considered next, since we find that these provide enough information to classify materials. Our method of parameterization is also suited to this range, as detailed below.

Parameterizing raw data and transforming it into input features
The second step consists of parameterizing the raw FDC. Here we choose to measure the distances in the well of the FDC in a fashion similar to that recently proposed elsewhere by the authors [19,20]. The steps are as follows:
1) The adhesion force FAD (the minimum in force, see Fig. 1a(ii)) is taken as the force reference for a given curve.
2) This reference allows every other force-distance pair to be expressed through a factor β as F=βFAD. We note that by varying β from 0 to 1 any arbitrary force curve can be fully parameterized and quantified [19] for F<0 nN.
3) Without loss of generality we limit β to 0.85, 0.75, …, 0.05 and normalize the distances in the well of the curves with the reference β=0.85.
4) This is done by first computing the absolute distances dFβ=dF0.85, …, dF0.05, where β=0.85, …, 0.05, as illustrated in Fig. 1a(ii). This produces 9 distances for each single curve.
5) We next normalize the distances dFβ by computing the ratios dFi=dFβ/dF0.85, where i=1 to 8 runs over β=0.75, …, 0.05, resulting in 8 normalized distances as illustrated in Fig. 1a(iii).
6) The distances dF1 to dF8 can now be employed as a table of input features for a machine learning algorithm to generate a model.
In order to remove noise we averaged the distances for a given substance or family of substances over 40-100 samples. An example of the tabulation of input features to generate a feature library is shown in Table I. In Table I two polymers, high-density polyethylene (PEHD) and polycaprolactone (PCL), and two materials from the silica family, i.e. glass and silicon, have been employed to generate three sets of input features per family. The three sets for each family form a feature library for polymers and silica respectively, which concludes the second step of the procedure, illustrated in Fig. 1a(iv).
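The parameterization steps above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the distance dFβ is taken here to be the width of the attractive well at the force level F=βFAD, located by linear interpolation on each side of the force minimum; the nine β values are assumed to be evenly spaced by 0.1; and the force curve is a synthetic Lennard-Jones-like shape in arbitrary units. All function names are hypothetical.

```python
# Assumed beta grid: 0.85 down to 0.05 in steps of 0.1 (nine values).
BETAS = [0.85, 0.75, 0.65, 0.55, 0.45, 0.35, 0.25, 0.15, 0.05]


def _interpolate(d0, f0, d1, f1, level):
    """Distance at which the straight line through two samples reaches `level`."""
    return d0 + (level - f0) * (d1 - d0) / (f1 - f0)


def well_width(d, F, level):
    """Width of the attractive well at a negative force `level` (assumed meaning of dF_beta)."""
    i_min = min(range(len(F)), key=lambda i: F[i])  # index of the adhesion force
    left = right = None
    for i in range(i_min, 0, -1):  # walk towards smaller distances
        if F[i - 1] >= level:
            left = _interpolate(d[i - 1], F[i - 1], d[i], F[i], level)
            break
    for i in range(i_min, len(F) - 1):  # walk towards larger distances
        if F[i + 1] >= level:
            right = _interpolate(d[i], F[i], d[i + 1], F[i + 1], level)
            break
    if left is None or right is None:
        raise ValueError("curve does not reach the requested force level on both sides")
    return right - left


def fdc_features(d, F):
    """Return the 8 normalized distances dF1..dF8 for one force-distance curve."""
    f_ad = min(F)  # adhesion force, the force reference
    if f_ad >= 0:
        raise ValueError("curve has no attractive (F < 0) region")
    widths = [well_width(d, F, beta * f_ad) for beta in BETAS]  # 9 absolute distances
    return [w / widths[0] for w in widths[1:]]  # normalize by the beta = 0.85 distance


def average_features(feature_sets):
    """Average feature vectors from repeated curves of one substance to reduce noise."""
    return [sum(column) / len(feature_sets) for column in zip(*feature_sets)]


# Synthetic Lennard-Jones-like curve (arbitrary units), for illustration only.
d = [0.9 + 0.001 * i for i in range(2101)]
F = [(1 / x) ** 12 - (1 / x) ** 6 for x in d]
features = fdc_features(d, F)
```

Because every force level βFAD is negative, all crossings lie in the F<0 part of the curve, in line with the restriction to attractive forces. The well widens as β decreases, so the eight ratios are larger than one and increase from dF1 to dF8.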

Feeding the standardized input feature library into a learning algorithm
The third step consists of generating a model from a feature library, as shown in the diagram in Fig. 1a(v). In the case of Table I, this model should be able to identify whether input features belong to the polymer family or the silica family. In order to generate models we implemented a standard multilayer artificial neural network in Matlab [17] that included a regularization term λ to avoid overfitting. The steps are as follows:
1) An input feature library, such as that shown in Table I and illustrated in Fig. 1a(iv), is fed into a machine learning algorithm, as illustrated in Fig. 1a(v). Here we have chosen an artificial neural network composed of L layers with U units per layer. Units in these algorithms stand for unit cells or neurons. Each unit is modelled with a sigmoid function: the inputs are processed by the function and the output is fed into the units of the next layer, as illustrated in Fig. 1b. The very last layer produces the predicted outputs, as shown in the figure. For example, in the case of Table I, the two unit cells in the last layer produce the predictive outcomes for the polymer family (one unit) and the silica family (the other unit).
2) The model is first trained with a set of input features from a given library for which the output is known. For example, in the case of Table I, the last unit cell for polymers should produce ones if and only if data from polymers is fed into the system, and similarly for the unit cell of the silica family.
3) The model is then tested by feeding it data that were not used for training and comparing the output to the known values. This is typical of supervised algorithms, where the algorithm learns from data for which the outcome is known by the user in advance. Errors in the outcomes are quantified via Recall and Precision, and jointly via the F-score parameter as discussed above. In the experimental section we report errors on testing sets of data via Precision, Recall and F-score. The testing step is discussed in more detail below.
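To make the training and testing steps concrete, the following Python sketch trains a small network of sigmoid units with backpropagation and an L2 regularization term λ. It is not the authors' Matlab code. The single hidden layer with two units, the cross-entropy cost, the learning rate, the number of epochs and the three-feature toy data are all assumptions chosen for illustration; the feature values are synthetic and are not taken from Table I.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W1, b1, W2, b2):
    """Propagate one feature vector through the hidden layer and the output layer."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    output = [sigmoid(sum(w * h for w, h in zip(row, hidden)) + b) for row, b in zip(W2, b2)]
    return hidden, output


def train(X, Y, n_hidden=2, lam=0.01, rate=0.5, epochs=5000, seed=0):
    """Batch gradient descent with backpropagation; lam is the regularization term."""
    rng = random.Random(seed)
    n_in, n_out, m = len(X[0]), len(Y[0]), len(X)
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
    b2 = [0.0] * n_out
    for _ in range(epochs):
        gW1 = [[0.0] * n_in for _ in range(n_hidden)]
        gb1 = [0.0] * n_hidden
        gW2 = [[0.0] * n_hidden for _ in range(n_out)]
        gb2 = [0.0] * n_out
        for x, y in zip(X, Y):
            hidden, output = forward(x, W1, b1, W2, b2)
            # Output error for a sigmoid unit with cross-entropy cost.
            d_out = [o - t for o, t in zip(output, y)]
            # Error propagated back to the hidden layer.
            d_hid = [h * (1 - h) * sum(W2[k][j] * d_out[k] for k in range(n_out))
                     for j, h in enumerate(hidden)]
            for k in range(n_out):
                gb2[k] += d_out[k]
                for j in range(n_hidden):
                    gW2[k][j] += d_out[k] * hidden[j]
            for j in range(n_hidden):
                gb1[j] += d_hid[j]
                for i in range(n_in):
                    gW1[j][i] += d_hid[j] * x[i]
        # Gradient step; the regularization acts on weights, not on biases.
        for k in range(n_out):
            b2[k] -= rate * gb2[k] / m
            for j in range(n_hidden):
                W2[k][j] -= rate * (gW2[k][j] + lam * W2[k][j]) / m
        for j in range(n_hidden):
            b1[j] -= rate * gb1[j] / m
            for i in range(n_in):
                W1[j][i] -= rate * (gW1[j][i] + lam * W1[j][i]) / m
    return W1, b1, W2, b2


# Synthetic toy library: three normalized distances per set, two families.
X = [[1.10, 1.25, 1.45], [1.12, 1.28, 1.50], [1.08, 1.22, 1.42],   # "polymer"
     [1.30, 1.70, 2.30], [1.28, 1.65, 2.25], [1.33, 1.75, 2.40]]   # "silica"
Y = [[1, 0]] * 3 + [[0, 1]] * 3  # one output unit per family

model = train(X, Y)

# Testing: feed feature sets that were not used for training.
X_test = [[1.11, 1.26, 1.47], [1.31, 1.72, 2.35]]
predictions = [[1 if o > 0.5 else 0 for o in forward(x, *model)[1]] for x in X_test]
```

Each output unit is thresholded at 0.5 to give the zero or one classification described earlier, so the predictions can be passed directly to a Precision, Recall and F-score calculation. A deeper model would repeat the hidden-layer step once per additional layer.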
This concludes the procedure of training and testing a model from feature libraries for substance identification. An illustration of the full process is shown as a diagram in Fig. 1b.
In order to test the performance of the models, raw data obtained by different users and with different cantilever-sample systems are acquired and fed into the trained neural networks (models) produced from the feature libraries. Here we illustrate this procedure with the help of the data from Table I, where the models identify and discriminate between the silica and polymer families. F-score values of 0.5 or below imply that the models lack sufficient predictive power. We have added data for three models in Table I to exemplify that arbitrarily increasing L or U might not result in a better model. That is, relatively simple models, consisting of a small number of layers L and units U, might have enough predictive power and might not be improved by arbitrarily increasing the complexity of the model. This is typical behavior in machine learning algorithms [21].
In Table II we show a training feature library produced to discriminate between the PEHD and PCL polymers. The data was employed to train and produce models, and the models were then tested. In the table we show the results of these tests, in terms of figures of merit, after feeding data (approximately 200 data points) from a PEHD sample into the generated model. Again we see that a single layer and two unit cells suffice to produce a model with enough predictive power to discriminate between the two samples, i.e. F-score > 0.5. In summary, these two examples illustrate how specific samples can be identified by using multiple steps.
That is, a given model can be employed first to discriminate between families, as in the case of the silica and polymer families described above (Table I). Then, in a second step, a more specific model can be exploited to discriminate between samples in the same family (Table II). A scheme of this flow is illustrated in Fig. 1c.
For completeness we also report a feature library in Table III that includes an otherwise disparate collection of samples: barium fluoride (BaF2), calcium fluoride (CaF2), silicon, PCL, graphite and glass. In this case we report the F-score in Fig. 2 as a function of the regularization parameter λ.
Three different models are also shown: 1L-2U, 3L-5U and 4L-5U. The F-score is 1 for the lowest values of λ for the 4L-5U models. Low values of λ imply more overfitting and therefore less capacity to generalize to data other than that of the training and testing sets. In this respect, the largest values of λ that still give high F-score values should be preferred. For the 1L-2U model the F-score is never 1, implying that the model is never 100% sensitive (Recall) and specific (Precision), whatever the value of λ. Since F-score=1 in Fig. 2 for some λ values, this figure illustrates that it is possible to discriminate directly between substances or materials belonging to different groups or families, that is, without first employing a family discrimination model. This holds only if the model is complex enough, however: in this case we never obtained F-score=1 with models consisting of fewer than 3 layers.
Table III. Example of libraries employed as input data to generate models for a set of samples. The features employed to parameterize the raw data consist of the normalized dF values at the β points given in the first column. Each value in each set is an average over 40-100 data points.
[...] Fig. 3b relative to 3a. The patch of calcite P2 in Fig. 3a has been circled in order to relate it to the displaced patch predicted by the model in Fig. 3b [...]
[...] Fig. 4a and the prediction of the model against force data for the same spot in Fig. 4b. The central pattern in Fig. 4a is relatively well reconstructed by the model even though the model was generated with data from a different cantilever and calcite sample.
[...] b, Guess of the model produced from another data set. The blue pixels refer to calcite P1, green pixels refer to calcite P2 and black pixels refer to pixels where the model could not predict any output unambiguously.
[...] [14]. These databases should be searchable via dedicated search engines modelled on, for example, the PubMed or the PRIDE web interface [23].
The generation of models should not be restricted to artificial neural networks either, but could be enhanced, or even replaced, by other methodologies. We suggest implementing well-known machine learning methodologies such as support vector machines or Bayesian networks and exploiting them in parallel to improve predictive power. These methods are standard in the machine learning field and packages can be found for Matlab, Python and R. Finally, we would like to point out that the ordering or fabrication of libraries does not need to be limited to the air environment, but can be expanded to liquid and vacuum environments and to the use of probes other than silicon. Furthermore, the classification into families does not have to be restricted to material properties but can be extended to, for example, identifying the presence or absence of atomic irregularities or dislocations, or identifying biological patterns or the behavior of systems for which distinct features might be produced [13,24]. The number and type of input features might be further increased to include linear or non-linear combinations of the features presented here, or any other input feature, such as temperature, relative humidity, tip radius, geometry or chemistry, that might enhance identification or recognition. Arguably, the intuition of researchers working in a particular field will suggest the number and type of features that make a given feature library preferable for producing a given model. In summary, the application of models and the massive testing of data should ultimately tell us what the limit of the proposed method of standardization is in the field of force spectroscopy. In this sense, we anticipate that the tip radius, the tip geometry and the relative humidity might have to be treated as input features to improve predictability, and this process might be very challenging.