Over the past forty years, scientists have conducted extensive research and proposed many methods to predict the occurrence of liquefaction. Initially, undrained cyclic loading laboratory tests were used to evaluate the liquefaction potential of a soil [72], but because of the difficulty of obtaining undisturbed samples of loose sandy soils, many researchers have preferred in situ tests [73]. In a semi-empirical approach, theoretical considerations and experimental findings make it possible to make sense of the field observations and tie them together. This gives more confidence in the validity of the approach when it is used to interpolate or extrapolate to areas with insufficient field data to constrain a purely empirical solution. Empirical field-based procedures for determining liquefaction potential have two critical constituents: i) the analytical framework to organize past experience, and ii) an appropriate in situ index to represent soil liquefaction characteristics. The original simplified procedure [74] for estimating earthquake-induced cyclic shear stresses continues to be an essential component of the analysis framework. Refinements to the various elements of this framework include improvements in the in situ index tests (e.g., SPT, CPT, BPT, Vs) and the compilation of liquefaction/no-liquefaction cases.

The objective of the present study is to produce an empirical machine learning (ML) method for evaluating liquefaction potential. ML is a scientific discipline concerned with the design and development of algorithms that allow computers to evolve behaviours based on empirical data, such as sensor data or databases. Data can be seen as examples that illustrate relations between observed variables. A major focus of ML research is to learn automatically to recognize complex patterns and make intelligent decisions based on data. The difficulty is that the set of all possible behaviours given all possible inputs is too large to be covered by the set of observed examples (training data). Hence the learner must generalize from the given examples so as to produce a useful output in new cases. In the following, two ML tools, Neural Networks (NNs) and Classification Trees (CTs), are used to evaluate liquefaction potential and to identify the parameters that control liquefaction, including earthquake and soil conditions. For each of these parameters, the emphasis has been on developing relations that capture the essential physics while being as simple as possible. The proposed cognitive environment permits an improved definition of i) the seismic loading or cyclic stress ratio (CSR), and ii) the resistance of the soil to triggering of liquefaction or cyclic resistance ratio (CRR).

The factor of safety FS against the initiation of liquefaction of a soil under a given seismic loading is commonly described as the ratio of the cyclic resistance ratio (CRR), which is a measure of liquefaction resistance, to the cyclic stress ratio (CSR), which is a representation of the seismic loading that causes liquefaction; symbolically, $FS = CRR / CSR$. The reader is referred to Seed and Idriss [74], Youd et al. [75], and Idriss and Boulanger [76] for a historical perspective of this approach. The term CSR, $CSR = f(0.65, \sigma_{vo}, a_{max}, \sigma'_{vo}, r_d, MSF)$, is a function of the vertical total stress of the soil $\sigma_{vo}$ at the depth considered, the vertical effective stress $\sigma'_{vo}$, the peak horizontal ground surface acceleration $a_{max}$, a depth-dependent shear stress reduction factor $r_d$ (dimensionless), and a magnitude scaling factor MSF (dimensionless). For CRR, different in situ resistance measurements and overburden correction factors are included in its determination; both terms depend on the geotechnical conditions. Details of the theory behind this topic can be found in Idriss and Boulanger [76] and Youd et al. [75].
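As an illustration of how these terms combine, the following minimal Python sketch evaluates CSR and FS in the usual form of the simplified procedure, CSR = 0.65 (σvo/σ'vo)(amax/g) rd / MSF. The text above does not say which expressions for rd and MSF are adopted, so the ones used here (a piecewise-linear rd in depth and MSF = 10^2.24 / Mw^2.56, as commonly quoted in summaries of the simplified procedure) are assumptions, as are all function names and the numerical inputs.

```python
def stress_reduction_factor(depth_m):
    """Shear stress reduction factor rd (assumed piecewise-linear approximation)."""
    if depth_m < 0.0:
        raise ValueError("depth must be non-negative")
    if depth_m <= 9.15:
        return 1.0 - 0.00765 * depth_m
    if depth_m <= 23.0:
        return 1.174 - 0.0267 * depth_m
    raise ValueError("approximation assumed valid only down to 23 m")


def magnitude_scaling_factor(mw):
    """Magnitude scaling factor MSF (assumed form, about 1.0 at Mw = 7.5)."""
    return 10.0 ** 2.24 / mw ** 2.56


def cyclic_stress_ratio(sigma_vo, sigma_vo_eff, a_max_g, depth_m, mw):
    """CSR adjusted to Mw = 7.5; stresses in the same units, a_max in g."""
    rd = stress_reduction_factor(depth_m)
    msf = magnitude_scaling_factor(mw)
    return 0.65 * (sigma_vo / sigma_vo_eff) * a_max_g * rd / msf


def factor_of_safety(crr, csr):
    """FS = CRR / CSR; FS < 1 indicates that triggering is predicted."""
    return crr / csr


# Hypothetical layer: 5 m deep, 100 kPa total and 60 kPa effective stress,
# a_max = 0.25 g, Mw = 7.5, and an assumed CRR of 0.20.
csr = cyclic_stress_ratio(100.0, 60.0, 0.25, 5.0, 7.5)
fs = factor_of_safety(0.20, csr)
print(f"CSR = {csr:.3f}, FS = {fs:.2f}")
```

Every factor in this chain is a fixed formula that has to be chosen before the field cases are interpreted.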

Many correction/adjustment factors have been included in the conventional analytical frameworks to organize and interpret the historical data. The correction factors improve the consistency between the geotechnical/seismological parameters and the observed liquefaction behavior, but they are a consequence of a constrained analysis space: a 2D plot [CSR vs. CRR] in which regression formulas (simple equations) are meant to relate complicated nonlinear, multidimensional information. In this investigation the ML methods are applied to discover unknown, valid patterns and relationships between geotechnical, seismological and engineering descriptions, using the relevant available information on liquefaction phenomena (expressed as empirical prior knowledge and/or input-output data). These ML techniques produce accurate predictions based on a few logical conditions and are not restricted to the mathematical/analytical environment. They establish a natural connection between experimental and theoretical findings.

Following the format of the simplified method pioneered by Seed and Idriss [74], this investigation proposes a nonlinear and adaptive limit state (a fuzzy boundary that separates liquefied cases from nonliquefied cases) (Figure 17). The database used in the present study was constructed from the information included in Table 3, compiled by Agdha et al. [77], Juang et al. [78], Juang [79], Baziar [80] and Chern and Lee [81]. The cases are derived from cone penetration tests (CPT) and shear wave velocity (Vs) measurements, and cover seismic conditions in different parts of the world (U.S., China, Taiwan, Romania, Canada and Japan). The soil types range from clean sand and silty sand to silt mixtures (sandy and clayey silt). Diverse geological and geomorphological characteristics are included. The reader is referred to the citations in Table 3 for details.

The ML formulation uses Geotechnical (qc, Vs, Unit weight, Soil Type, Total vertical stresses, Effective vertical stresses), Geometrical (Layer thickness, Water Level Depth, Top Layer Depth) and Seismological (Magnitude, PGA) input parameters. The output variable is Liquefaction?, which can take the values YES/NO (Figure 17). After training, the NN evaluated 100% of the training cases correctly, and when it was applied to unseen cases (set aside for testing) fewer than 10% of these examples were not fitted. The CT is less efficient during training, with 85% of cases correctly predicted, but when it runs on the unseen patterns its capability is not diminished and it attains the same proportion. From these findings it is concluded that the neural system is capable of predicting the in situ observations with a high degree of accuracy, but if improvement of knowledge is necessary or there are missing, vague or even contradictory values in the analyzed case, the CT is a better option.
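The training/testing comparison described above can be sketched in a few lines of plain Python. The example below is only an illustration under strong assumptions: it fits a depth-one classification tree (a single Gini-based split) rather than the pruned trees of the study, the two features (qc in MPa and PGA in g) are a small subset of the inputs listed, and every numerical record is an invented placeholder, not a case from Table 3.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (1 = liquefaction YES)."""
    if not labels:
        return 0.0
    p_yes = sum(labels) / len(labels)
    return 2.0 * p_yes * (1.0 - p_yes)


def best_split(rows, labels):
    """Return (feature index, threshold, weighted Gini) of the best binary split."""
    best = None
    for j in range(len(rows[0])):
        values = sorted({r[j] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0  # candidate threshold between two observed values
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[2]:
                best = (j, t, score)
    if best is None:
        raise ValueError("no feature takes more than one value")
    return best


def fit_stump(rows, labels):
    """Fit a one-split tree; each leaf predicts its majority class (ties -> YES)."""
    j, t, _ = best_split(rows, labels)
    left = [y for r, y in zip(rows, labels) if r[j] <= t]
    right = [y for r, y in zip(rows, labels) if r[j] > t]
    vote = lambda ys: int(2 * sum(ys) >= len(ys))
    return {"feature": j, "threshold": t, "left": vote(left), "right": vote(right)}


def predict(stump, row):
    side = "left" if row[stump["feature"]] <= stump["threshold"] else "right"
    return stump[side]


def accuracy(stump, rows, labels):
    hits = sum(predict(stump, r) == y for r, y in zip(rows, labels))
    return hits / len(rows)


# Invented placeholder records: (qc [MPa], PGA [g]) -> 1 = YES, 0 = NO.
train_x = [(3.0, 0.30), (4.5, 0.25), (5.0, 0.40), (6.0, 0.20),
           (12.0, 0.30), (14.0, 0.15), (15.5, 0.35), (18.0, 0.20)]
train_y = [1, 1, 1, 1, 0, 0, 0, 0]
test_x = [(4.0, 0.35), (16.0, 0.25), (10.0, 0.45)]
test_y = [1, 0, 1]

stump = fit_stump(train_x, train_y)
print(stump)
print("training accuracy:", accuracy(stump, train_x, train_y))
print("testing accuracy:", accuracy(stump, test_x, test_y))
```

The cases set aside for testing are never seen during fitting, so the second figure is the one that indicates how well a model generalizes. In this toy data the stump fits the training records perfectly but misses one held-out case.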

Figure 18 shows the two pruned liquefaction trees (one runs on qc values and the other on Vs measurements), with YES/NO as terminal nodes. Figure 19 presents some examples of how the trees are read. The trees incorporate soil type dependence through the resistance values (qc and Vs) and the fines content, so it is not necessary to label the material as sand or silt. The most general geometrical branches that split the behaviors are the Water table depth and the Layer thickness, but only when the soil description is based on Vs; when qc serves as the rigidity parameter, these geometrical inputs are not explicitly exploited. This finding can be related to the nature of the measurement: the cone penetration value already contains the effect of the saturated material, while the shear wave velocities need this condition to be included explicitly. Because they avoid potentially confusing regression strategies, the results of the liquefaction trees can be seen as an indication of how effectively the ML model maps the assigned predictor variables to the response parameter. Using data from all regions and wide parameter ranges, the prediction capabilities of the neural network and classification trees are superior to those of many other approximations used in common practice. The most important outcome, however, is the generation of meaningful clues about the reliability of physical parameters, measurement and calculation processes, and practice recommendations.

The intricacy and nonlinearity of the phenomena, an inconsistent and contradictory database, and many subjective interpretations of the observed behavior make SC an attractive alternative for the estimation of liquefaction-induced lateral spread. NEFLAS [82], NEuroFuzzy estimation of liquefaction induced LAteral Spread, profits from the fuzzy and neural paradigms through an architecture that uses a fuzzy system to represent knowledge in an interpretable manner and draws on the learning ability of a neural network to optimize its parameters. This blending can constitute an interpretable model that is capable of learning the problem-specific prior knowledge.

NEFLAS is based on the Takagi-Sugeno model structure and was constructed according to the information compiled by Bartlett and Youd [83] and later extended by Youd et al. [84].

The output considered in NEFLAS is the horizontal displacement due to liquefaction. It depends on the moment magnitude; the PGA; the nearest distance from the source in kilometers; the free face ratio; the gradient of the surface topography or the slope of the base of the liquefied layer; the cumulative thickness of saturated cohesionless sediments, defined in terms of the number of blows (corrected for overburden and for the energy delivered to the standard penetration probe, in this case 60%); the average fines content; and the mean grain size.

One of the most important advantages of NEFLAS is its capability of dealing with the imprecision inherent in geoseismic engineering when evaluating concepts and deriving conclusions. It is well known that engineers use words to classify qualities (strong earthquake, poorly graded soil or soft clay, for example), to predict and validate first-principle theories, to enumerate phenomena, to suggest new hypotheses and to point out the limits of knowledge. NEFLAS mimics this method. Consider the technical quantity magnitude (earthquake input) depicted in Figure 20. The degree to which a crisp magnitude belongs to the LOW, MEDIUM or HIGH linguistic label is called the degree of membership. Based on the figure, the expression "the magnitude is LOW" would be true to the degree of 0.5 for an Mw of 5.7. Here, the degree of membership in a set becomes the degree of truth of a statement.
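The idea of a degree of membership can be made concrete with a short Python sketch. The shapes and breakpoints of the membership functions in Figure 20 are not given in the text, so the piecewise-linear functions and the values 5.2, 6.2 and 7.2 below are hypothetical, chosen only so that "the magnitude is LOW" is true to the degree 0.5 at Mw = 5.7, as stated above.

```python
def falling_shoulder(x, full, zero):
    """Membership 1 up to `full`, decreasing linearly to 0 at `zero`."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def rising_shoulder(x, zero, full):
    """Membership 0 up to `zero`, increasing linearly to 1 at `full`."""
    if x <= zero:
        return 0.0
    if x >= full:
        return 1.0
    return (x - zero) / (full - zero)


def triangle(x, left, peak, right):
    """Triangular membership with value 1 at `peak` and 0 outside (left, right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def magnitude_memberships(mw):
    """Degrees of membership of a crisp Mw in the three linguistic labels.

    Breakpoints are assumed for illustration; they are not read from Figure 20.
    """
    return {
        "LOW": falling_shoulder(mw, 5.2, 6.2),
        "MEDIUM": triangle(mw, 5.2, 6.2, 7.2),
        "HIGH": rising_shoulder(mw, 6.2, 7.2),
    }


degrees = magnitude_memberships(5.7)
print({label: round(mu, 2) for label, mu in degrees.items()})
```

With these assumed breakpoints a magnitude of 5.7 is LOW and MEDIUM to about the same degree and not HIGH at all. A crisp threshold would force it into a single class.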

On the other hand, the human logic in engineering solutions generates sets of behavior rules defined for particular cases (parametric conditions) and supported by numerical analysis. In neurofuzzy methods the human concepts are redefined through a flexible computational process (training) that puts (empirical or analytical) knowledge into simple if-then relations (Figure 20). The fuzzy system uses 1) the variables composing the antecedents (premises) of the implications, 2) the membership functions of the fuzzy sets in the premises, and 3) the parameters in the consequents, in order to find simpler solutions with less design time.
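The three ingredients listed above map directly onto a first-order Takagi-Sugeno inference step, sketched below in Python. This is a generic illustration, not the NEFLAS rule base: only two inputs are used (moment magnitude and source distance), the membership breakpoints and the consequent coefficients are invented placeholders, and the product is assumed as the AND operator. In a neurofuzzy system these parameters would be the quantities adjusted during training.

```python
def ramp_up(x, zero, full):
    """Membership rising linearly from 0 at `zero` to 1 at `full`."""
    if x <= zero:
        return 0.0
    if x >= full:
        return 1.0
    return (x - zero) / (full - zero)


def ramp_down(x, full, zero):
    """Complement of ramp_up: 1 at `full`, falling linearly to 0 at `zero`."""
    return 1.0 - ramp_up(x, full, zero)


# 2) Membership functions of the fuzzy sets in the premises (hypothetical breakpoints).
MAGNITUDE = {"LOW": lambda m: ramp_down(m, 5.5, 7.0),
             "HIGH": lambda m: ramp_up(m, 5.5, 7.0)}
DISTANCE = {"NEAR": lambda r: ramp_down(r, 5.0, 50.0),
            "FAR": lambda r: ramp_up(r, 5.0, 50.0)}

# 1) Premise labels and 3) consequent parameters (p0, p_mw, p_r), all invented:
# IF magnitude is A AND distance is B THEN D = p0 + p_mw * Mw + p_r * R
RULES = [
    ("LOW", "NEAR", (-0.5, 0.1, 0.0)),
    ("LOW", "FAR", (0.0, 0.0, 0.0)),
    ("HIGH", "NEAR", (-2.0, 0.5, -0.01)),
    ("HIGH", "FAR", (-1.0, 0.2, -0.005)),
]


def ts_predict(mw, r_km):
    """Weighted average of the rule outputs, weighted by firing strength."""
    num = 0.0
    den = 0.0
    for mag_label, dist_label, (p0, p_mw, p_r) in RULES:
        w = MAGNITUDE[mag_label](mw) * DISTANCE[dist_label](r_km)  # product AND
        num += w * (p0 + p_mw * mw + p_r * r_km)
        den += w
    if den == 0.0:
        raise ValueError("no rule fires for this input")
    return num / den


print(round(ts_predict(6.25, 27.5), 4))
```

Because each rule's consequent is a linear expression, the blended output changes smoothly as the input moves from one linguistic region to another, and every rule can still be read as a plain if-then statement.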

NEFLAS considers the character of the earthquake and the topographical, regional and geological components that influence lateral spreading, and works through three modules: Reg-NEFLAS, appropriate for predicting horizontal displacements in geographic regions where seismic hazard surveys have been identified; Site-NEFLAS, suitable for predicting horizontal displacements in site-specific studies with minimal data on geotechnical conditions; and Geotech-NEFLAS, which allows more refined predictions of horizontal displacements when additional data are available from geotechnical soil borings. The performance of NEFLAS on cases not included in the database (Figure 21.b and Figure 21.c), and its higher correlation values compared with evaluations obtained from empirical procedures, support the assertion that NEFLAS is a powerful tool, capable of predicting lateral spreads with a high degree of confidence.