UNIVERSIDADE ESTADUAL PAULISTA “JÚLIO DE MESQUITA FILHO”
CÂMPUS DE ILHA SOLTEIRA
DEPARTAMENTO DE ENGENHARIA MECÂNICA

LUCCAS PEREIRA MIGUEL

STRUCTURAL HEALTH MONITORING FOR LOOSENING DETECTION IN BOLTED JOINTS

Thesis presented to the São Paulo State University (UNESP), School of Engineering, Ilha Solteira, as a part of the requirements for obtaining the Ph.D. degree in Mechanical Engineering. Knowledge Area: Solid Mechanics.

Prof. Dr. Samuel da Silva
Supervisor

Ilha Solteira
2025

CATALOG RECORD
Prepared by the Technical Library and Documentation Service (Serviço Técnico de Biblioteca e Documentação)

Miguel, Luccas Pereira.
Structural health monitoring for loosening detection in bolted joints / Luccas Pereira Miguel. -- Ilha Solteira: [s.n.], 2025
84 leaves : ill.
Thesis (doctorate) - Universidade Estadual Paulista. Faculdade de Engenharia de Ilha Solteira. Knowledge area: Solid Mechanics, 2025
Advisor: Samuel da Silva
Includes bibliography
1. Bolted joints. 2. Tightening torque. 3. Structural health monitoring. 4. Gaussian mixture model. 5. Gaussian process regression. 6. Global sensitivity analysis.
M636s

ACKNOWLEDGEMENTS

First, I would like to express my gratitude to my family, especially to my parents, Lucy and Sebastião, and to my brother, Douglas, for consistently providing the essential support that enabled me to navigate this academic journey successfully.

I also express my deep gratitude to my advisor, Professor Samuel, for his support, confidence, dedication, friendship, and patience over the past eight years. Completing this work would not have been possible without his guidance and motivation.
I extend my heartfelt gratitude to the friends I made throughout these years, who directly or indirectly influenced my development. In particular, I am grateful to the "LK" group, comprising Bruno and Leonardo, who have been like brothers to me, not just during my undergraduate years when this journey started but up to this point. The countless hours spent studying, discussing, and moving between labs became much more enjoyable with their companionship.

A special mention goes to the invaluable friends I made during my research, who inspired my progress. Notably, I wish to express my profound thanks to my friend Rafael Teloli. His genuine friendship, endless discussions, collaboration, and perseverance against the challenges we faced during the experimental campaigns were crucial to building the scientific understanding that I have today. He will forever hold the honorary title of my informal co-advisor for imparting such a wealth of knowledge. My gratitude is unending.

Most importantly, I express my deepest gratitude to my fiancée, Amanda, with whom I finally have the pleasure of closely sharing my entire future. Her unwavering support, patience, and love have been a constant throughout the last nine years. Heartfelt thanks!

Finally, I am grateful for the financial support provided by FAPESP, grant number 2020/07449-7.

ABSTRACT

Bolted joints are a common solution for building industrial structures. Despite their effectiveness, they require regular checks to prevent failures, such as bolt loosening. However, conventional inspections are costly and time-consuming, requiring operational downtime to check each bolt manually. As a result, indirectly detecting loosening is a significant industrial challenge, compounded by variability and nonlinear effects.
In this context, this thesis proposes new contributions to tightening torque monitoring through structural health monitoring (SHM), specifically focusing on vibration-based SHM. Three key challenges are addressed. First, damage detection is conducted under conditions of low repeatability, which include repeatedly assembling and disassembling the joint. A Gaussian Mixture Model (GMM) is trained using estimated natural frequencies, assuming a healthy condition. Then, this probabilistic model assesses the safety of an unknown state through indirect vibration measurements by calculating a damage index. A Gaussian Process Regression (GPR) is also trained on a set of torque and damage index pairs under several conditions. The GPR model estimates the tightening torque by interpolating a curve for conditions not considered in the learning stage. The second challenge is to propose a damage detection approach that uses no feature extraction procedure but only the acquired frequency response signal. To perform this task, a GPR is trained over the transmissibility functions to identify a baseline metamodel in the frequency domain. Then, a GPR-based damage index is formulated to detect outliers from the baseline model. The work also proposes a Global Sensitivity Analysis (GSA) via Sobol' indices as a feature selection method. This procedure assesses whether variations in the calculated damage index in a frequency range are correlated with alterations in structural conditions, indicating sensitivity to loosening. Lastly, in the third application, the GPR metamodel is extended to a three-dimensional methodology designed to capture how nonlinear behavior evolves as the input level increases. In this setting, the GPR-based damage index has demonstrated significant efficacy in learning and distinguishing different motion regimes. Moreover, it effectively differentiates changes induced by the transition between them from those caused by loosening.
Therefore, with an adequate feature selection, both proposed probabilistic frameworks effectively identify loosening with minimal false diagnoses.

Keywords: bolted joints; tightening torque; structural health monitoring; Gaussian mixture model; Gaussian process regression; global sensitivity analysis.

RESUMO (English translation)

Bolted joints are a common solution in the construction of industrial structures. Despite their effectiveness, they require regular checks to prevent failures, such as those caused by bolt loosening. However, conventional inspections are costly and time-consuming, requiring operational shutdowns for manual verification. Consequently, the indirect detection of loosening is a significant challenge, made harder by variability and nonlinearity effects. In this context, this thesis proposes new contributions to the detection of torque loss through structural health monitoring (SHM). Three main challenges are addressed. First, detection is conducted under conditions of low repeatability, including the assembly and disassembly of the joint. A Gaussian Mixture Model (GMM) is trained using natural frequencies in a healthy condition. This model then assesses the safety of a new data point in an unknown state. A Gaussian Process Regression (GPR) is also performed considering a set of torque and damage index pairs under several conditions. The GPR model then estimates the tightening torque by interpolating a curve for data not considered in the learning stage. The second challenge is to propose a damage detection approach that uses no feature extraction procedure, but rather the acquired frequency response signal. A GPR is trained over the transmissibility functions to identify a reference metamodel in the frequency domain. A GPR-based damage index is then formulated for outlier detection. The work also proposes a Global Sensitivity Analysis (GSA) via Sobol' indices as a feature selection method. It thus assesses whether variations in the damage index within a frequency range are correlated with structural changes, indicating sensitivity to damage. Finally, in the third application, the GPR is extended to a three-dimensional regression able to capture how the nonlinearity evolves as the excitation level increases. The damage index demonstrated significant efficacy in learning and distinguishing different motion regimes. Moreover, it effectively differentiates the changes induced by the transition between them from those caused by loosening of the joint. Therefore, with an adequate feature selection, both proposed methodologies were able to identify torque loss with a low number of false alarms.

Keywords: bolted joints; tightening torque; structural health monitoring; Gaussian mixtures; Gaussian processes; global sensitivity analysis.

LIST OF FIGURES

Figure 1 – Experimental setup - The Orion beam
Figure 2 – (a) Transmissibility plot with emphasis on the structural conditions. Dashed lines (−−) indicate the frequency ranges of the bending modes. Different lines of the same color indicate different assemblies; (b) exemplifies the experimentally measured modes
Figure 3 – Boxplot of the resonance frequencies referring to the 1st (a), 2nd (b), 3rd (c), 4th (d), 5th (e) and 6th (f) bending modes on different tightening torques in each labeled structural condition
Figure 4 – Resonance frequencies, changing the tightening torque, with 960 observations (60 in each structural condition equally distributed in four different assemblies) to show the variability of the natural frequencies
Figure 5 – Boxplot of the resonance frequencies on different tightening torques using three realizations in each labeled structural condition and a different set of assemblies indicated by markers: △, ×, ◦ and ♢
Figure 6 – Algorithm for damage detection via Gaussian Mixture Model
Figure 7 – Learned GMM and features clustering
Figure 8 – DI computed by GMM using frequencies of the 5th and 6th modes. Type I errors are ≈ 2.8%, and Type II errors are ≈ 0%
Figure 9 – Algorithm for quantification of the tightening torque through the GP-based model
Figure 10 – Torque versus DI with a GPR model
Figure 11 – Estimated torque versus actual torque for all index (•) (grey) and the mean of estimated torque (×) (red) in each test condition
Figure 12 – GPR model identified for the whole transmissibility
Figure 13 – DI computed by GPR-based MSD using the whole transmissibility. Type I errors are ≈ 17.5%, Type II errors are ≈ 44.2%, and MacroF1 ≈ 0.64
Figure 14 – GPR model identified using only the frequency range around the vibration modes
Figure 15 – DI computed by GPR-based MSD using only the frequency range around the vibration modes. Type I errors are ≈ 16.67%, Type II errors are ≈ 25.00%, and MacroF1 ≈ 0.78
Figure 16 – Sobol' indices computed for all damage levels considering each vibration mode
Figure 17 – MacroF1 score computed for all damage levels considering each vibration mode
Figure 18 – DI computed by GPR-based MSD using 1st mode. Type I errors are ≈ 6.67%, Type II errors are ≈ 75.0%, and MacroF1 ≈ 0.38
Figure 19 – DI computed by GPR-based MSD using 5th mode. Type I errors are ≈ 15.0%, Type II errors are ≈ 10.0%, and MacroF1 ≈ 0.88
Figure 20 – DI computed by GPR-based MSD using 6th mode. Type I errors are ≈ 13.33%, Type II errors are ≈ 3.33%, and MacroF1 ≈ 0.92
Figure 21 – DI computed by GPR-based MSD using 5th and 6th modes. Type I errors are ≈ 7.5%, Type II errors are ≈ 3.33%, and MacroF1 ≈ 0.95
Figure 22 – Transmissibility of the Orion Beam under different excitation levels in healthy reference condition
Figure 23 – Boxplot of the fifth and sixth resonance frequencies estimated for the healthy (60 cNm) and damaged (30 cNm) condition for different excitation levels
Figure 24 – DI computed by MSD using frequencies of the 5th and 6th modes of vibration
Figure 25 – Transmissibility surface estimated by GPR with the experimental transmissibility functions
Figure 26 – Experimental and GPR estimated transmissibility functions for different acceleration levels
Figure 27 – DI computed by three-dimensional GPR-based MSD using 6th mode. Type I errors are ≈ 0.56%, Type II errors are ≈ 1.67%, and MacroF1 ≈ 0.98

LIST OF TABLES

Table 1 – Data set considered in Section 3
Table 2 – Data set considered in Section 4
Table 3 – Data set considered in Section 5
Table 4 – Parameters for learning and validation

LIST OF SYMBOLS

A - Truncation set of multi-indices polynomials
α - Weight of a mixture model
D - Total variance
D_i - Partial variance
D - Dataset
DI - Damage index
DI_GPR - GPR-based damage index
ε_S - Zero-mean Gaussian error
ϵ_P - Truncation error of a polynomial expansion
δ - The Kronecker delta
f - Nonlinear regressor
f_X - Probability density function of a parameter set
k - Covariance function
M - Components from the Hoeffding-Sobol decomposition
N_p - Number of parameters in a Hoeffding-Sobol decomposition
N_f - Number of spectral lines of a transmissibility function
N_t - Number of transmissibility functions in a dataset
µ - Mean vector of a probability distribution
ω - Single frequency value of a frequency vector
p - Probability density function
Ψ - Polynomial basis of multivariate orthonormal polynomials
Φ - Multivariate orthonormal polynomial
r - RMS value of the acceleration signal
S - Sobol' index
Σ - Covariance matrix of a probability distribution
T - Tightening torque
Θ - Hyperparameters from a Gaussian Mixture
X - Frequency vector
X - General parameter set
Y - Transmissibility function
Y - General system response
y_α - Coefficient of a Polynomial Chaos Expansion
ŷ - Vector of optimal coefficients of a Polynomial Chaos Expansion
Z - Test matrix
z - New unlabeled observation from test matrix

CONTENTS

1 INTRODUCTION
1.1 CONTEXT AND MOTIVATION
1.2 LITERATURE REVIEW
1.3 METHODOLOGY
1.4 OBJECTIVE
1.5 MAIN CONTRIBUTIONS
1.6 OUTLINE
2 EXPERIMENTAL SETUP
2.1 THE ORION BEAM
2.2 DATA SETS
3 PROBABILISTIC MACHINE LEARNING FOR LOOSENING DETECTION IN A LOW REPEATABILITY SCENARIO
3.1 MODAL ANALYSIS
3.2 GAUSSIAN MIXTURE MODEL FOR DAMAGE DETECTION
3.3 DETECTING TORQUE LOSS BY GMM
3.4 GAUSSIAN PROCESS REGRESSION FOR TIGHTENING TORQUE QUANTIFICATION
3.5 TIGHTENING TORQUE ESTIMATION VIA GPR
3.6 CONCLUSIONS
4 ON THE GAUSSIAN PROCESS REGRESSION OF THE TRANSMISSIBILITY CURVE FOR LOOSENING DETECTION IN BOLTED STRUCTURES
4.1 DAMAGE DETECTION BASED ON THE GPR OF TRANSMISSIBILITY
4.2 GLOBAL SENSITIVITY ANALYSIS
4.2.1 Variance-based global sensitivity analysis
4.2.2 Computing Sobol' Indices via Polynomial Chaos Expansion
4.3 THE USE OF SOBOL' INDICES TO FEATURE SELECTION
4.4 CONCLUSIONS
5 THREE-DIMENSIONAL GAUSSIAN PROCESS REGRESSION OF A TRANSMISSIBILITY SURFACE FOR LOOSENING DETECTION IN DIFFERENT EXCITATION LEVELS
5.1 THE INFLUENCE OF THE EXCITATION LEVEL IN THE STRUCTURAL BEHAVIOR
5.2 THREE-DIMENSIONAL GAUSSIAN PROCESS REGRESSION
5.3 DAMAGE DETECTION
5.4 CONCLUSIONS
6 FINAL REMARKS
6.1 CONTRIBUTIONS AND CONCLUSIONS
6.2 SUGGESTIONS FOR FUTURE WORKS
REFERENCES

1 INTRODUCTION

This chapter introduces the thesis, outlining its motivation, literature review, and main contributions. Additionally, it provides the structure of the subsequent sections.

1.1 CONTEXT AND MOTIVATION

Bolted joints are one of the primary means of attaching components in assembled mechanical systems, ensuring structural stability and proper functioning in applications ranging from aero and wind turbines (Siewert et al., 2010; Chou et al., 2018) to bolted-flange joints in oil and gas plants (Reza et al., 2014). An important reason why this kind of connection stands out among other engineering solutions is that it provides an easy and low-cost option for building large structures from stiffly and securely joined sub-components. In addition, the ease of disassembly and reassembly by simply loosening and tightening bolts gives the overall structure a modular character, allowing the replacement, maintenance, and improvement of designed components (Daadbin; Chow, 1992).

Unfortunately, locations containing bolted joints are susceptible to structural damage, e.g., small cracks can propagate (Qiu et al., 2014). Besides that, bolt loosening is commonly seen due to stress relaxation (Daadbin; Chow, 1992) or when the structure is subject to external vibration sources that induce relative motion between connected surfaces (Doyle et al., 2010; Oregui et al., 2017; Li; Jing, 2017). In this context, ensuring that the bolted joint is consistently and adequately tightened is crucial for safe operation. Furthermore, it ensures that structural integrity and the aforementioned strengths are maintained, thereby preventing structural collapses or catastrophic failures.
For that, periodic inspection is required, which can become costly due to maintenance stops, crew mobilization, and field operating costs. Specifically for the bolt loosening problem, the main impediment is the standard inspection procedure, which consists of manual measurements on each bolt with torque wrenches. In this context, detecting the variation of the tightening torque during equipment operation through indirect measures is a strong motivation to develop safer, faster, and cheaper Structural Health Monitoring (SHM) methods, as proposed in Chevallier, Ramasso and Butaud (2019). However, this is still a technological challenge due to the complex nonlinear effects of frictional interactions on the connection interface, such as hysteretic behavior. Furthermore, much uncertainty is related to environmental and operational variations (Brake; Schwingshackl; Reuß, 2019).

The literature still needs to investigate which features can indicate damage conditions for decision-making on the level of joint tightening torque. These metrics should be easily extracted and analyzed through feature classifiers sensitive to the tightening torque variation, and they should preferably be obtained with cheap and standard tools already used in monitoring these structures. Moreover, the ability to handle various operational conditions and address other sources of uncertainty is of considerable significance in practical applications.

Among the different indirect torque monitoring methods, this Ph.D. work focuses on contributing to methods for torque loss detection based on vibration signals (Milanese et al., 2008). Such techniques are gaining ground in the literature because they require less instrumentation and capture global effects rather than individually inferring the state of each bolt.
In addition, it is known that most structural damage manifests itself by affecting the dynamic behavior of the global structure (Farrar; Worden, 2012), and this is also observed when the damage is related to the loosening of connections (Huda et al., 2013; Teloli et al., 2022; Miguel et al., 2022; Wall; Allen; Kuether, 2022).

1.2 LITERATURE REVIEW

Structural health monitoring (SHM) is a strategy to detect damage in systems from different engineering areas, such as aerospace, mechanical, or civil structures (Farrar; Worden, 2007). It has gained ground by allowing maintenance strategies to evolve from more classical but less cost-efficient philosophies, such as run-to-failure or time-based approaches (Farrar; Worden, 2012). These are the two strategies most commonly adopted for addressing structural damage. The first, run-to-failure, involves operating a component or an entire system until hidden, unmonitored damage inevitably leads to failure (Farrar; Worden, 2012). However, this strategy carries the risk of catastrophic outcomes, where interconnected components can suffer severe damage or human lives can be lost. Consequently, this approach is generally unacceptable for most applications, particularly those involving critical components.

To find a balanced choice that is still simple but more conservative, the second, time-based approach tries to estimate the lifetime of a component, relying mainly on empirical knowledge acquired from previous failure data or experimental tests. After the specified time, the component is replaced or repaired. Although this philosophy seems to solve the problems of run-to-failure strategies, it has some practical issues: first, the data required to estimate the lifetime are not commonly available; besides, the operational and environmental conditions are often not constant, which changes the damage progression.
Consequently, the lifetime can be misestimated, which either allows failures to occur or, when safety is prioritized, leads to maintenance of components that are still in perfect condition. Solving this issue is the main contribution of SHM to maintenance decisions, since it enables a condition-based philosophy in which the actual state of the structure or component is assessed during operation and an intervention is planned only if necessary. In this way, it is possible to optimize the operating time in a safe condition and, when damage appears, to evaluate its size and progression so that a scheduled stop can be made at an opportune moment. A complete overview and comparison between the last two approaches can be found in Ahmad and Kamaruddin (2012), in which the authors highlight the advantages of condition-based methods and the need for further research to make them more realistic.

Farrar and Worden (2012) conceptualize SHM as a task fundamentally grounded in pattern recognition, as determining whether a structure is damaged presumes that one has access to prior knowledge of a pattern that describes a healthy condition. Moreover, it is well known that all engineering applications are characterized by variability and uncertainties, which makes the Statistical Pattern Recognition (SPR) paradigm particularly suitable for addressing issues related to SHM. The SPR framework can be divided into four distinct stages, as outlined by Farrar, Doebling and Nix (2001):

1. An operational evaluation is undertaken to get a comprehensive understanding of the problem, including the type of damage the structure is exposed to and the limitations involved in extracting structural information;

2. Subsequently, data acquisition is conducted. During this phase, the data acquisition strategy must be defined, including specifying the type of data required, its volume, and the periodicity of acquisition. Additionally, some preliminary data processing is necessary at this stage;

3. With an extensive data set in hand, it becomes imperative to ascertain which data is sensitive to damage through a feature selection process. The choice of a characteristic feature will play a vital role in the effectiveness of damage detection;

4. In the concluding step, a statistical model is formulated to assess whether the structure remains in the baseline healthy condition or, if deviations are detected, provide knowledge regarding the new structural state.

Although this work follows all the steps outlined in the SPR framework, the main contributions are in developing statistical models applicable to pattern recognition (step 4) and in discussions and procedures for feature selection (step 3).

Although the most general definition of SHM refers to a damage detection strategy, it is hierarchically divided into four levels by Rytter (1993), whereby damage detection, i.e., actually answering the question "Is there any damage in the structure?", is only the first level. The following steps are to determine the location of the damage (level 2) and its severity (level 3) and to assess the safety of the structure given the observed state of health, as well as to evaluate the remaining life based on the estimate of damage evolution (level 4). Farrar and Worden (2012) also highlight one more issue: the specification of the type of damage detected. This step is essential for scenarios where different types of damage may be found (or combined). In this work, as in most SHM benchmarks, the structure has a specific kind of damage of interest: bolt loosening in mechanical connections.

Extensive reviews of what has been done to detect loosening in bolted joints can be found in Wang et al. (2013), Nikravesh and Goudarzi (2017), and Miao, Zhang and Xue (2020). The literature suggests organizing these approaches into two categories based on the instrumentation used (Nikravesh; Goudarzi, 2017). The first one includes the direct methods.
Each bolt or set of joints is monitored using specific sensors, typically strain gauges or load cells, to estimate information about the stress state and then compare it with the design requirements. The key idea is to establish an empirical relationship for bolt loosening from the stress state in each bolted joint. The most popular algorithm is the torque wrench method (Motosh, 1976). However, significant fluctuation is expected with this approach, induced by numerous uncertainties (Nazarko; Ziemianski, 2017). These discrepancies affect the accuracy of loosening detection, limiting its widespread use. Another limitation is that many structures with bolted joints, such as pipes, are submerged or located in regions that are difficult for humans to access for regular inspection (Razi; Esmaeel; Taheri, 2013). A recent trend in this category involves automated computer vision methods combined with machine learning algorithms to estimate the tightness level of the bolted joints (Cha; You; Choi, 2016; Ramana; Choi; Cha, 2019).

The second category comprises the indirect methods. Active sensing approaches that use piezoelectric sensors/actuators are attractive and are regularly used to detect torque loss in this class, mainly through electromechanical impedance (Park; Cudney; Inman, 2001; Wang et al., 2013). Machine learning algorithms have used electromechanical signatures to propose new features for computing the damage index and performing classification, for example, by adopting a support vector machine (Wang; Chen; Song, 2020). However, vibration-based data are still the most prevalent approach (Azhar et al., 2024). In this class of approach, the vibration can be induced using a piezoceramic actuator (Wang; Song, 2019), impact modulations (Meyer; Adams, 2019), or shakers with harmonic excitations (Li; Jing, 2020).
In Razi, Esmaeel and Taheri (2013), the authors suggest using empirical mode decomposition to build an energy-based damage index, with tests performed on the bolted flange joint of a pipeline with progressive torque loss. To the best of the author's knowledge, the most reliable detection of loosening in the literature occurs only when the torque has dropped to roughly one-tenth of the healthy level. For instance, Luo and Yu (2017) proposed a method for identifying damaged structures with bolted joints based on the residual error of an Auto-Regressive model in time series analysis. In that work, the authors could only observe torque changes when the torque was reduced from 20 Nm in the undamaged condition to 2.5 Nm in the damaged one. Although significant changes in torque values do not imply changes of equal proportions in the dynamics of jointed structures, at this level the assembled components are practically without any connection and may pose significant safety risks.

Using a different approach, Qin et al. (2022) applied guided-wave vibroacoustic modulation to a spreading device of high-speed rail-borne trains fixed by a central bolted connection. This work analyzed loosening detection capability over a wide range of torques, emphasizing the difficulty traditional methods have in detecting initial losses. The authors also highlighted that this is because the loosening process occurs in different stages, and the first is an initial loosening effect on the bolt thread due to the relaxation of accumulated stresses and strains (Daadbin; Chow, 1992), which in principle has little influence on structural dynamics, making detection difficult.

Features extracted from the transmissibility are also commonly used to diagnose the level of tightening torque (Li; Jing, 2020). Modal parameters are standard features extracted from these spectral signatures due to consolidated identification methods in commercial software.
On the other hand, these parameters are essentially linear features. Thus, robustly detecting small changes in the tightening torque of lap joints remains difficult, since diverse nonlinear mechanisms may appear due to interactions between assembled components, such as friction-induced hysteretic damping, dynamic clearance, and other effects (Segalman, 2006). Nevertheless, when a certain amount of loosening is present, it has been demonstrated that these parameters are indicative metrics for identifying damage, particularly for high-frequency vibration modes (Huda et al., 2013).

It is essential to note, however, that even though loosening amplifies the nonlinearity level, the bolted joint inherently introduces nonlinear properties even when optimally tightened. This intrinsic nonlinearity arises from friction and the occurrence of partial slip between the contact surfaces (Teloli et al., 2022). This implies that many approaches that treat damage as a nonlinear phenomenon appearing in an otherwise linear structure may not be feasible for loosening detection. In those cases, the SHM strategy relies on monitoring the system with classical approaches to detect the presence of nonlinear behavior (Worden et al., 2008); in the problem stated here, the task is not to detect nonlinear behavior but to distinguish between the nonlinearity of the healthy system and that of the damaged one (Bornn; Farrar; Park, 2010).

However, even in works such as Bornn, Farrar and Park (2010), where a healthy nonlinear structure is considered, there may be limitations, as a single high excitation level is often taken to be representative of the nonlinear regime of motion. Thus, there is a lack of evidence that these methodologies can appropriately identify healthy nonlinear states when the nonlinearity varies with the input amplitude.
Another example of an effective way to treat inherent nonlinearities seems to be the application of a higher-order spectrum, such as the Volterra series approach presented by Villani et al. (2020). Although Villani et al. (2020) also explored the concept using a single input level to assess both the nonlinear healthy and damaged states of systems, there is evidence that this approach could be viable for certain applications characterized by varying harmonic excitation amplitudes, in particular in scenarios where it is established that harmonic components differ between healthy and damaged states, as exemplified by the authors' experimental study of breathing cracks. In such instances, the Volterra series proves advantageous because it efficiently represents the harmonic components present in the signal, enabling an understanding of the phenomenological differences between the states.

A critical point that requires investigation is how to define the severity of damage due to loss of tightening without a direct torque measurement embedded in each bolted joint. A simple and clear indicator to quantify this value is of interest to the industry. In this scenario, the use of modal properties in bolted joints to characterize and quantify damage with stochastic SHM strategies is still an open topic in the literature. However, although simple, modal analysis requires pre-processing and an accurate identification method (Maia; Silva, 2001). Unfortunately, the estimation of damping ratios is more sensitive to noise, sampling parameters, and signal processing techniques than that of natural frequencies (Cao et al., 2017). For this reason, identification errors are often more significant than the variability generated by the loosening. Generally, this issue restricts the use of modal parameters to the resonance frequencies (Miguel et al., 2022). Although Miguel et al.
(2022) showed that these parameters could detect loss of tightening torque, it is believed that damping features could improve damage detection, since they are directly related to the quantification of nonlinear dissipation in friction-damped systems, including lap joints (Festjens; Chevallier; Dion, 2013; Zare; Allen, 2021). Lower preload values, where the structure is less stiff, are expected to maximize damping due to micro-slip motion.

An alternative that considers more information about the vibration modes, including damping effects, is to conduct the analysis directly on the frequency response through transmissibility functions in an output-only approach, without extracting any dynamic parameter. This strategy bypasses the modal identification step and all related issues. Classical applications of this approach in the literature include its combination with a Principal Component Analysis (PCA) decomposition to compress the relevant information from the frequency response into a low-dimensional feature space (Manson; Worden; Allman, 2003). In addition, Bull et al. (2021) used the same idea combined with a domain adaptation framework through transfer component analysis (TCA) to detect damage across different populations of aircraft tailplanes. Due to the lack of experimental samples, a coherence-based uncertainty quantification (Worden, 1998) was also performed for resampling purposes. However, more general and sophisticated ways exist to quantify uncertainty, such as metamodeling through Gaussian Process Regression (GPR) (Rasmussen; Williams, 2006). Concerning bolted joints, Coelho et al. (2024) have demonstrated that correlation metrics in the frequency domain, particularly the Frequency Response Assurance Criterion (FRAC), when applied between a baseline transmissibility and a new measurement, exhibit considerable potential for effectively identifying loosening.
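As a rough illustration of the FRAC metric mentioned above, the sketch below uses its usual definition as the normalized squared inner product of two complex spectra. The single-mode transmissibility model, the frequency band, and the 20 Hz resonance shift are hypothetical values chosen only for the example; they are not taken from Coelho et al. (2024) or from the Orion beam data.

```python
import numpy as np

def frac(h_ref, h_new):
    """Frequency Response Assurance Criterion between two complex spectra."""
    h_ref = np.asarray(h_ref, dtype=complex)
    h_new = np.asarray(h_new, dtype=complex)
    # np.vdot conjugates its first argument, giving the Hermitian inner product
    num = np.abs(np.vdot(h_ref, h_new)) ** 2
    den = np.vdot(h_ref, h_ref).real * np.vdot(h_new, h_new).real
    return float(num / den)

def single_mode(f, fn, zeta):
    """Base-excitation transmissibility of a one-degree-of-freedom oscillator."""
    r = f / fn
    return (1 + 2j * zeta * r) / (1 - r**2 + 2j * zeta * r)

# Hypothetical frequency band and resonances (illustrative values only)
freqs = np.linspace(1000.0, 1400.0, 401)
baseline = single_mode(freqs, 1200.0, 0.01)
shifted = single_mode(freqs, 1180.0, 0.01)  # resonance lowered by 20 Hz

frac_same = frac(baseline, baseline)          # identical curves
frac_scaled = frac(baseline, 2.5 * baseline)  # same shape, different gain
frac_shifted = frac(baseline, shifted)        # shifted resonance
```

FRAC is insensitive to a pure gain change and drops below one when the shape of the curve changes, which is why it can serve as a correlation metric between a baseline and a new measurement.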
GPR is a stochastic regression model that predicts behavior by interpolating data while quantifying the associated uncertainty. In SHM, GP models are mainly used to correlate features and conditions in a quantification step. In Miguel et al. (2022), a GP model of a set of undamped natural frequencies was related to the level of torque applied to a bolted joint, producing a curve from which an instantaneous torque can be estimated using only the measured modal parameter. Worden and Cross (2018) performed a similar correlation on a different structure, a concrete bridge, where a natural frequency was correlated and interpolated with temperature changes. In this second example, the GP model can remove the influence of temperature to give a precise diagnosis of the bridge's state, that is, of the modifications effectively caused by structural damage rather than by environmental changes.

1.3 METHODOLOGY

The methodologies presented in this work are illustrated by applications to the Orion Beam, an academic test structure designed to represent vibration effects common to systems assembled by bolted joints (Teloli et al., 2022; Miguel et al., 2022). It consists of a lap-joint structure composed of two assembled duraluminium beams. The beams are connected by a bolted joint with three M4 bolts under a controlled torque condition (for details, see Section 2).

It is important to point out that this work deals with an output-only data set. Therefore, it usually does not allow for model identification by classical experimental modal analysis (EMA), which would require input measurements. To identify modal parameters that describe the structure in a modal model, operational modal analysis (OMA) techniques need to be performed.
Several OMA approaches have already been proposed, such as frequency domain decomposition (FDD) (Brincker; Zhang; Andersen, 2001), stochastic subspace identification (SSI) (Peeters; Roeck, 2001), and poly-reference least squares complex frequency-domain (p-LSCF) (Auweraer et al., 2001). The main frequency-domain OMA methods are the transmissibility-based ones (Yan et al., 2019). However, they often have particular requirements, such as analyzing transmissibilities obtained under different unknown input conditions, as seen in Devriendt and Guillaume (2007). An overview of transmissibility-based OMA that also covers its application to SHM can be found in Yan et al. (2019).

However, the Orion Beam is a specific case in which a white-noise base excitation is applied and the acceleration of the base is acquired and taken as the reference signal. In this way, the acceleration signal is roughly proportional to the input. The transmissibility function therefore correlates strongly with the system's frequency response function (FRF), allowing it to be handled through classical EMA. Notwithstanding, since this work is not intended to identify a complete modal model but only to extract dynamic features that allow a comparison of how motion is transferred along the joint under different structural conditions (with excitation and instrumentation conditions held constant), the terms modal analysis and modal parameters will be used in a generalized way throughout the text.

In this work, two different methodologies were applied to loosening detection on the structure, each dealing with different issues. The first proposes an integrated approach based on machine learning techniques to detect damage using resonance frequencies as features. The main challenge here is that the damage detection is performed in a low repeatability scenario that involves assembling and disassembling the beam.
The resonance frequencies of the vibration modes more sensitive to damage were selected based on observation and physical interpretation of the mode shapes. As the features of different reassemblies are differently distributed, a Gaussian Mixture Model (GMM) (Figueiredo et al., 2019) is proposed to describe their dispersion, so that distinct healthy regions can be identified in the feature space. Then, a damage index (DI) is proposed for loosening detection, taken as the shortest Mahalanobis Squared Distance (MSD) (Worden; Manson; Fieller, 2000) between all the healthy clusters and an unlabeled testing sample. Lastly, a damage quantification step is carried out using a GPR that correlates the DI with a torque condition.

The second methodology, in turn, addresses other challenges. Firstly, a more robust way to select the vibration modes that are sensitive to damage is proposed through Global Sensitivity Analysis (GSA) using Sobol's index. This analysis quantifies how each mode is correlated with changes in the structural state. After selecting the modes, damage detection is performed directly on the transmissibility function over the frequency ranges of interest, without extracting any parameter or other feature. In other words, the transmissibility itself is considered a feature. The strategy creates a surrogate model that describes the stochastic baseline condition using a GPR throughout the frequency range of interest. A damage index is then computed from the distance between an unlabeled testing sample and the metamodel in a GPR-based MSD. Subsequently, this methodological approach is extended and applied to a three-dimensional GPR model. This framework yields a transmissibility surface, which allows an understanding of the nonlinear variations occurring within the baseline condition. These variations arise from different motion regimes, which are controlled by the root mean square (RMS) of the acceleration of the structure base.
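To make the GPR building block shared by both methodologies more concrete, the following minimal sketch computes the posterior mean and variance of a zero-mean Gaussian process with a squared-exponential kernel, following the standard Cholesky-based formulation. The kernel choice, the hyperparameter values, and the sine-shaped training data are assumptions made for illustration; they are not the settings or the data used in this thesis.

```python
import numpy as np

def rbf_kernel(xa, xb, length_scale, signal_var):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    diff = xa[:, None] - xb[None, :]
    return signal_var * np.exp(-0.5 * (diff / length_scale) ** 2)

def gpr_predict(x_train, y_train, x_test,
                length_scale=1.0, signal_var=1.0, noise_var=1e-2):
    """Posterior mean and variance of a zero-mean GP at the test inputs."""
    k_tt = rbf_kernel(x_train, x_train, length_scale, signal_var)
    k_ts = rbf_kernel(x_train, x_test, length_scale, signal_var)
    # Cholesky factor of the noisy training covariance
    chol = np.linalg.cholesky(k_tt + noise_var * np.eye(len(x_train)))
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_ts.T @ alpha
    v = np.linalg.solve(chol, k_ts)
    var = signal_var - np.sum(v ** 2, axis=0)
    return mean, var

# Placeholder data (hypothetical): noisy samples of a smooth curve
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 10.0, 15)
y_train = np.sin(x_train) + 0.05 * rng.standard_normal(15)

# One test point inside the training range and one far outside it
x_test = np.array([5.0, 30.0])
mean, var = gpr_predict(x_train, y_train, x_test)
```

Inside the training range the predictive variance is small, whereas far from the data it returns to the prior variance. This is the property that lets a GPR metamodel carry uncertainty information along with its prediction.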
1.4 OBJECTIVE

Based on the issues presented, this work aims to propose new SHM frameworks for loosening detection in bolted structures. It focuses on vibration-based approaches aided by probabilistic machine learning techniques, such as stochastic regression using Gaussian processes. Three central problems are addressed on the Orion Beam benchmark. First, a damage detection strategy based on modal analysis is proposed for a low repeatability scenario generated by reassembly. Second, a framework is proposed for loosening detection directly from the transmissibility curve, without further signal processing, modal analysis, or other feature extraction procedures. Third, this second methodology is extended to a three-dimensional transmissibility metamodel to detect damage in scenarios characterized by diverse nonlinear motion regimes. This approach enables differentiation between alterations in structural dynamics due to input variations and those resulting from structural damage, allowing precise detection.
1.5 MAIN CONTRIBUTIONS

The main contributions of the present work on loosening detection are to:

• Investigate the contributions of different aspects to the response of bolted joint systems (operational variability, nonlinearity, and loosening), and propose a probabilistic approach to detect torque loosening based on modal analysis through a GMM (Miguel et al., 2022);
• Propose a torque quantification procedure based on a stochastic regression using GPR (Miguel et al., 2022);
• Propose the use of GSA through Sobol's index as a feature selection tool, so that one can evaluate which features are more sensitive to damage;
• Propose a damage identification strategy using a GPR metamodel of the baseline condition applied directly to the transmissibility function, without further feature extraction;
• Extend the GPR strategy to a three-dimensional metamodel that considers the evolution of the nonlinear regime as the input level increases, for damage detection under different operational conditions.

1.6 OUTLINE

The work is organized as follows:

Chapter 1 - Introduction: Presents the motivation of this work, a literature review, the problem statement, and the main contributions;

Chapter 2 - Experimental Setup: Presents the experimental test bench considered in the analysis throughout this thesis: the Orion Beam. The different data sets obtained from it are also detailed;

Chapter 3 - Probabilistic Machine Learning for Loosening Detection in a Low Repeatability Scenario: Presents a robust framework for damage detection in a low repeatability scenario generated by assembling and disassembling the bolted structure (Orion beam). For this, a Gaussian mixture model (GMM) of the system's modal parameters is fitted to identify different clusters of undamaged data.
Also, this chapter presents a torque quantification methodology based on the integration between the GMM damage index and a Gaussian process regression (GPR);

Chapter 4 - On the Gaussian Process Regression of the Transmissibility for Looseness Detection in Bolted Structures: Presents a methodology for loosening detection using the transmissibility function directly. The selection of only the relevant information from the frequency function is discussed as a way to improve the classifier's performance. To aid this selection, a global sensitivity analysis (GSA) via Sobol's index is performed to assess which vibration modes in the transmissibility are more sensitive to damage;

Chapter 5 - Three-dimensional Gaussian Process Regression Of a Transmissibility Surface For Loosening Detection In Different Excitation Levels: The methodology for loosening detection using the transmissibility function metamodel is extended to a three-dimensional regressor. This regressor incorporates the acceleration RMS to create a transmissibility surface capable of detecting damage across the different nonlinear regimes determined by the input level;

Chapter 6 - Final Remarks: Presents the main conclusions and a discussion of future works.

2 EXPERIMENTAL SETUP

This chapter introduces the experimental setup and the data sets considered throughout the thesis.

2.1 THE ORION BEAM

The Orion beam benchmark (Teloli et al., 2022; Miguel et al., 2022) is the structure used to demonstrate the approaches proposed in this work. This academic test structure is representative of the vibration effects typically found in systems assembled by bolted joints (Brake, 2018). The damage imposed on the structure aims to emulate a gradual torque loss.

Figure 1 presents the experimental setup. The lap-joint structure consists of two assembled duraluminium beams with dimensions of 200×30×2 [mm], connected by three M4 bolts spaced along a length of 30 mm.
The beam is in a cantilever configuration under base excitation. To minimize uncertainties related to the stiffness of the clamped boundary condition and to avoid exciting torsional modes, a 40 mm length of the beam with the contact patches is screwed to a solid aluminum block, which is the base of the structure (see Fig. 1(c)). The base motion is driven by a TIRA permanent magnetic shaker (TV Model 50303 − 120), which excites the structure with a Gaussian white-noise input at an amplitude level of 4 m/s² RMS of base acceleration. A Polytec OFV-5000 vibrometer measures the velocity at the free end of the Orion beam, while the acceleration at the base is monitored by a PCB triaxial accelerometer (Model 356A4). The base acceleration and velocity measurements are used to estimate the transmissibility function. The setup also includes a National Instruments acquisition system composed of a CompactDAQ chassis (NI cDAQ-9134), a C-Series Sound and Vibration Input Module (NI-9234), and a C-Series Voltage Output Module (NI-9263).

Figure 1 – Experimental setup: the Orion beam. (a) Experimental setup. (b) Top view. (c) CAD drawings (dimensions in mm). Labels in the figure: lap joint, laser, shaker, accelerometer, base excitation, Orion beam, measurement point. Source: Miguel et al. (2022).

Concerning the bolted joint, there are contact patches at each bolt connection to restrict the contact between both beams to a small area. Each patch is a 12 × 12 mm square with an extra thickness of 1 mm. After each experimental run, the torque of each bolt was checked with a Lindstrom MA500-1 torque wrench. Additionally, to avoid undesired uncertainties in the measured data set, the following assembly protocol was adopted for all experimental realizations:

• To guarantee the alignment between both beams, two alignment pins (axes) are inserted in the external holes.
• The central bolt is fully tightened.
• Both alignment pins (axes) are removed, and the external bolts are then tightened.

The damage imposed on the structure aims to emulate a gradual torque loss. Preliminary tests indicate that variations in the preload applied to the central bolt considerably alter the structural stiffness of the assembly. Thus, to avoid abrupt changes in the system dynamics and ensure that the feature variation is gradual, the central bolt is kept fully tightened at the healthy condition of 80 cNm in all experimental measurements.

2.2 DATA SETS

Throughout this thesis, three different data sets sourced from the Orion beam are considered. The first is used for the results discussed in Section 3. It is specifically aimed at providing measurements across a larger variety of torque levels, with finer intervals between torque conditions. This is relevant because one of the main results presented in Section 3 is a regression along the torque range for damage quantification, for which a data set encompassing an extensive range of torque values is more suitable. Also, the variability effects of disassembling and reassembling the bolted structure are examined, which increased the number of experimental realizations. On the other hand, a reduced number of time series is available for signal processing and feature extraction in each condition.

This first data set includes data for 4 reassemblies of the Orion Beam. For each tightening torque of an assembly, the shaker imposes the same broadband random excitation of 4 m/s² RMS acceleration on the base, producing an experimental realization of 10 seconds that was acquired by the instrumentation setup presented in Fig. 1 with a sampling rate of 25600 Hz. A single realization is performed for each of the 16 tightening torques considered in each of the 4 assemblies. To increase the number of samples, each time series is segmented into 4 subsignals of equal length.
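The segmentation and the resulting sample count of this first data set can be checked with a few lines. The sketch below uses a random placeholder signal instead of measured data and assumes non-overlapping segments of equal length; the sampling rate, duration, and counts are those stated in the text.

```python
import numpy as np

fs = 25600     # sampling rate in Hz (from the text)
duration = 10  # seconds per time series (from the text)
n_sub = 4      # subsignals per time series (from the text)

# Placeholder standing in for one measured time series (not real data)
rng = np.random.default_rng(0)
time_series = rng.standard_normal(fs * duration)

# Equal-length, non-overlapping segmentation (assumed)
subsignals = time_series.reshape(n_sub, -1)
samples_per_sub = subsignals.shape[1]
seconds_per_sub = samples_per_sub / fs

# First data set: assemblies x torques x time series x subsignals
total_samples = 4 * 16 * 1 * n_sub
```

Each subsignal therefore keeps the original sampling rate but covers a quarter of the record, which reduces the frequency resolution available to any spectral estimate computed from it.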
The resulting data set is presented in Table 1.

Table 1 – Data set considered in Section 3
Assemblies | Number of Torques | Time Series per Condition* | Subsignals per Time Series | Total Samples | Torques (cNm)
4 | 16 | 1 | 4 | 256 | 80, 75, 70, 65, 60, 55, 50, 45, 40, 35, 30, 25, 20, 15, 10, 5
* Here, a "condition" refers to a particular tightening torque for a given assembly.
Source: Prepared by the author.

A second data set is considered for the applications presented in Section 4. In this case, the objectives do not include the evaluation of reassembly effects or the quantification of damage. Consequently, numerous tightening torques and varied reassemblies are not necessary, allowing for more experimental trials per condition. The experiment design follows that of the first data set, but with 3 time series of 10 seconds acquired for each condition at a sampling rate of 25600 Hz. The base excitation was again a white-noise input with 4 m/s² RMS acceleration, but with a new excitation signal generated for each test. In addition, the time series were again segmented into 4 subsignals. The resulting data set is presented in Table 2.

Table 2 – Data set considered in Section 4
Assemblies | Number of Torques | Time Series per Condition* | Subsignals per Time Series | Total Samples | Torques (cNm)
1 | 4 | 3 | 4 | 48 | 60, 30, 20, 10
* Here, a "condition" refers to a particular tightening torque for a given assembly.
Source: Prepared by the author.

Lastly, for Section 5, a third data set is taken. The data premises are the same as those presented in Table 2, but now considering 3 different excitation levels: low (4 m/s² RMS acceleration), medium (8 m/s² RMS acceleration) and high (12 m/s² RMS acceleration). Also, a reduced number of torques was considered, using only the 60 and 30 cNm conditions. The data set is presented in Table 3.
Table 3 – Data set considered in Section 5
Excitation Levels | Assemblies | Number of Torques | Time Series per Condition* | Subsignals per Time Series | Total Samples | Torques (cNm)
3 | 1 | 2 | 3 | 2 | 36 | 60, 30
* Here, a "condition" refers to a particular tightening torque for a given assembly and excitation level.
Source: Prepared by the author.

3 PROBABILISTIC MACHINE LEARNING FOR LOOSENING DETECTION IN A LOW REPEATABILITY SCENARIO

One of the advantages of assembled structures over monolithic structures is that they can be assembled and disassembled according to operational needs. Unfortunately, these structures have high measurement-to-measurement variability (low repeatability between tests) when subject to operational intervention. This arises because the structural stiffness and damping properties are sensitive to changes in the contact area, which is influenced by several aspects, including contact pressure, residual stresses, roughness, surface alignment, dynamic clearance, friction, wear and third-body effects (see Brake, Schwingshackl and Reuß (2019), Jalali et al. (2019), for instance).

As expected, the Orion Beam benchmark (Teloli et al., 2022) exhibits random variations and a high degree of unrepeatability between measurements after complete assembly and disassembly of the lap joint, which hampers the use of a single damage classifier for all the assembling conditions. So, the challenge in this chapter is to propose an approach that differentiates the influences of uncertainties, inherent nonlinear behavior, and effective torque loss. A damage index is computed using undamped natural frequencies, and a GMM is implemented to perform a binary classification of the bolted connection's state. The advantage of the GMM is its ability to detect populations with different probability densities (Figueiredo et al., 2019). This approach is widely used for SHM under the assumption of environmental variations.
Also, it may be applied in the present context to gain a comprehensive understanding of torque fluctuations when disassembly and assembly processes are considered. These processes induce significant alterations in structural dynamics and influence the modal characteristics, which are used here as an indication of structural damage.

The second method utilized is a stochastic interpolation obtained by GPR. The key idea is to learn the nonlinear correlation between the tightening torque and the changes in the natural frequencies through the damage index previously computed from them, while accounting for the uncertainties so that the use of these estimates can be validated. Although this correlation is empirical, as are several well-known phenomenological models in the literature (Visintin, 2013; Mathis et al., 2020), it is plausible to associate it with a reduced-order model that emulates the bolted joint's behavior. This is done here to demonstrate the validity of the GPR-based model.

This chapter is organized as follows: in the first section, the modal parameters are extracted directly from the frequency response curves considering different torque levels. The resonance frequencies are visually analyzed, aided by a physical interpretation, to select the ones more sensitive to damage. Then, the problems faced when identifying damage in this low repeatability scenario are highlighted. Afterward, the methodology for detecting damage is covered in the GMM section. To validate the algorithm's effectiveness in detecting variations in tightening torque under conditions involving variability in experimental realizations, frequency responses are measured considering three sets of complete assembly and disassembly of the Orion beam. Then, the procedure to construct a GPR model is presented, followed by its application to quantifying the torque condition. Finally, the chapter concludes with some remarks.
3.1 MODAL ANALYSIS

Although reduced-order models are usually capable of describing hysteresis effects and their influence on modal parameters, obtaining them quickly enough for a practical SHM method is difficult. Thus, the proposal of a probabilistic methodology for tightening torque detection through simple features extracted from the modal parameters is justified.

Figure 2(a) presents the transmissibility, defined as the ratio between the absolute velocity measured at the free end of the beam and the absolute acceleration of the base, for some tightening torque conditions and different assemblies of the same beam. The data set comprises the following samples and states: 16 different torque levels ranging from 80 cNm to 5 cNm with a decrement of 5 cNm, as presented in Table 1. The torque range from 80 cNm to 60 cNm is assumed to be a safe/healthy condition, whereas torques from 55 cNm to 5 cNm are taken as damaged conditions due to loss of connecting properties. Also, the central bolt is kept fully tightened with a torque of 80 cNm in all experimental measurements to avoid abrupt changes in the system dynamics and ensure a gradual variation of the features. Each torque condition yielded 16 subsignals, 4 from each of the four assemblies of the connection listed in Table 1. All experiments were performed with the same white-noise base excitation signal, which generated a 4 m/s² RMS acceleration at the point of coupling with the shaker for 10 seconds, with a sampling rate of 25600 Hz. Each 10-second signal was split into four 2.5-second signals for the modal analysis.

Figure 2 – (a) Transmissibility plot with emphasis on the structural conditions. Dashed lines (−−) indicate the frequency ranges of the bending modes. Different lines of the same color indicate different assemblies; (b) exemplifies the experimentally measured modes. Source: Prepared by the author.
The transmissibilities throughout this work were estimated using the H1 estimator and a Hann window (Ewins, 2000).

Figure 2(b) illustrates the mode shapes obtained experimentally considering 195 sensing points on the Orion beam's surface. Note that the lower modes have only a slight sensitivity to changes in torque values and to different assembly and disassembly realizations of the structure, as shown in Fig. 2(a) around 10 and 400 Hz. In contrast, the higher bending modes (between approximately 750 and 1900 Hz) stress the lap-joint area more distinctly, ensuring greater observability of variations in the resonance peaks. In addition, as highlighted by Bull et al. (2021), it is common for high-frequency responses to be more sensitive to minor imperfections such as those generated by reassembling. So, it can be supposed that higher bending modes increase the friction effects in the connection. In that case, they are also more sensitive to the variability between the assemblies, which hampers the use of classical classifiers.

It is also important to acknowledge that physically modeling contact is very challenging. Specifically, accurately representing the impact of minor variations in the contact region, the distribution of pressure, the tightening torque, and the hysteretic dynamics of micro-slip remains a complex problem, even for advanced commercial finite element analysis tools. Consequently, model-based techniques to detect damage involving these phenomena in practical field applications have been overshadowed by data-driven strategies, which will be thoroughly explored in this thesis.
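The H1 estimate with a Hann window mentioned at the start of this discussion can be sketched as follows: the averaged cross-spectrum between reference and response is divided by the averaged auto-spectrum of the reference. The segment length, the absence of overlap, and the synthetic test signals are assumptions for illustration only; the processing parameters actually used in the thesis are not stated here.

```python
import numpy as np

def h1_transmissibility(x, y, fs, nperseg=1024):
    """H1 estimate of the transfer from reference x to response y.

    Non-overlapping Hann-windowed segments are used, and the cross- and
    auto-spectra are averaged over segments before the division.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    nseg = len(x) // nperseg
    window = np.hanning(nperseg)
    # Reshape into (nseg, nperseg) blocks and window each block
    xs = x[: nseg * nperseg].reshape(nseg, nperseg) * window
    ys = y[: nseg * nperseg].reshape(nseg, nperseg) * window
    x_f = np.fft.rfft(xs, axis=1)
    y_f = np.fft.rfft(ys, axis=1)
    s_xy = np.mean(np.conj(x_f) * y_f, axis=0)  # averaged cross-spectrum
    s_xx = np.mean(np.abs(x_f) ** 2, axis=0)    # averaged auto-spectrum
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    return freqs, s_xy / s_xx

# Synthetic check (hypothetical signals): the response is the reference
# scaled by 2 plus uncorrelated measurement noise
rng = np.random.default_rng(0)
fs = 25600
reference = rng.standard_normal(64000)  # stands in for the base acceleration
response = 2.0 * reference + 0.1 * rng.standard_normal(64000)
freqs, trans = h1_transmissibility(reference, response, fs)
```

Because the noise on the response is uncorrelated with the reference, it averages out of the cross-spectrum, so the H1 estimate stays close to the true gain across the band.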
Although bolted structures are known to have inherent nonlinear behavior due to hysteresis effects induced by local frictional interactions at the joint, which indicates that a nonlinear identification might be more suitable for this class of problem (Teloli et al., 2021b), this thesis adopts a more straightforward and traditional approach, based on a classical modal analysis algorithm, to extract the modal parameters: the so-called Complex Exponential Method (Maia; Silva, 2001). The key idea is to show that, after a learning procedure involving combined linear parameters in different torque conditions, it is possible to associate them with the loosening of the bolted connection through a damage index.

It is known that damping ratios are directly related to the quantification of nonlinear dissipation in friction-damped systems, including lap joints (Festjens; Chevallier; Dion, 2013; Zare; Allen, 2021). Lower preload values, where the structure is less stiff, are expected to maximize damping due to micro-slip motion. However, the estimation of damping ratios is more sensitive to noise and signal processing techniques than that of the natural frequencies (Cao et al., 2017), possibly making identification errors more significant than the variability generated by loosening. For this reason, this work considers only the natural frequencies as damage features.

For each condition, 4 subsignals are available, as shown in Table 1. To augment the data, after estimating their modal parameters, 11 synthetic samples were generated using the mean and standard deviation of the undamped natural frequencies under each condition, resulting in a total of 15 samples per condition.

To illustrate the sensitivity of the natural frequencies to damage, Fig. 3 presents their boxplots for the first six bending modes of a single assembly. Note that in the 5th (Fig. 3(e)) and 6th (Fig. 3(f)) modes, the resonance frequencies show a clear trend of decreasing values when the torque value is reduced.
This trend is not seen in the 1st, 2nd, 3rd and 4th modes (Figs. 3(a)-(d)). So, considering them for loosening detection would not only fail to improve the algorithm's effectiveness but could also substantially hamper the decision-making process. Additionally, since a two-dimensional feature space makes it easier to visualize the clustering and the GMM, the analysis proceeds with emphasis on the 5th and 6th modes.

Figure 3 – Boxplot of the resonance frequencies referring to the 1st (a), 2nd (b), 3rd (c), 4th (d), 5th (e) and 6th (f) bending modes on different tightening torques in each labeled structural condition. Source: Prepared by the author.

Figure 4 shows the undamped natural frequencies of the 5th and 6th bending modes, including observations of the 960 tests performed for the 16 torque levels, with 60 samples for each level equally divided among four different assemblies. Each torque condition comprises various tests, which introduces variability in the feature maps (healthy and damaged conditions). The tightening torque is assumed to be the same in each realization, as verified by measurements with a torque wrench.

Figure 4 – Resonance frequencies, changing the tightening torque, with 960 observations (60 in each structural condition equally distributed in four different assemblies) to show the variability of the natural frequencies. Source: Prepared by the author.

Figure 5 illustrates potential pitfalls in classifying torque states for different assemblies by individually interrogating natural frequencies as damage indices, since the mean and variance values in each condition overlap for most torque values in the initial loosening (when the label is assumed known in order to compare situations). Generally speaking, it is difficult to distinguish whether the changes are produced by damage, i.e., torque loosening, or by variability caused by fluctuating operational or environmental influences.
Incorporating information from a set of different damage-sensitive features (in this case, both undamped natural frequencies) in a combined way is a usual and very effective way to overcome the limitations of interrogating each of them independently. This concept is also supported by Figure 7.

Figure 5 – Boxplot of the resonance frequencies on different tightening torques using three realizations in each labeled structural condition and a different set of assemblies indicated by markers: △, ×, ◦ and ♢. Source: Prepared by the author.

3.2 GAUSSIAN MIXTURE MODEL FOR DAMAGE DETECTION

A Gaussian mixture model (GMM) is a practical machine-learning algorithm for clustering and outlier detection. Its components are defined using a learning process to estimate the main clusters in the undamaged condition. Consider a learning data matrix, $\mathbf{Y} \in \mathbb{R}^{n \times d}$, with $d$-dimensional feature vectors from $n$ different operational and environmental conditions when the structure is undamaged, and a testing data matrix, $\mathbf{Z} \in \mathbb{R}^{l \times d}$, where $l$ is the number of experimental acquisitions from the unlabeled conditions. Here, the operational condition is the torque in the bolted joint, assumed to be in the safe case (undamaged).

Figure 6 shows the typical steps to implement a GMM for damage detection. First, the statistical model of the healthy condition is learned to define the central clusters by fitting the Gaussian distributions to the features of $\mathbf{Y}$. Next, in the validation step, each new feature vector of $\mathbf{Z}$ is transformed into a damage index DI, which is given by a GMM score that computes its distance from the nearest cluster.

Figure 6 – Algorithm for damage detection via Gaussian Mixture Model. Diagram labels: Modal Features, Learning, Validation, Detection by GMM; axes: Features 1, Features 2, GMM Score, Observation; legend: Healthy Cluster. Source: Prepared by the author.
A finite mixture model, $p(y|\Theta)$, is the weighted sum of $K \geq 1$ components, $p(y|\theta_k)$, in $\mathbb{R}^d$ (McLachlan; Rathnayake, 2014):

$$p(y|\Theta) = \sum_{k=1}^{K} \alpha_k \, p(y|\theta_k), \tag{3.1}$$

where $y$ is the $d$-dimensional data vector and $\alpha_k$ is the weight of each component. These weights are constrained as $\alpha_k > 0$ with $\sum_{k=1}^{K} \alpha_k = 1$. Each component, $p(y|\theta_k)$, is described by a Gaussian distribution:

$$p(y|\theta_k) = \frac{\exp\left[-\frac{1}{2}(y - \mu_k)^T \Sigma_k^{-1} (y - \mu_k)\right]}{(2\pi)^{d/2}\sqrt{\det(\Sigma_k)}}, \tag{3.2}$$

where the parameters of each component, $\theta_k = \{\mu_k, \Sigma_k\}$, are the mean vector, $\mu_k$, and the covariance matrix, $\Sigma_k$. The parameters $\Theta = \{\alpha_1, \alpha_2, \ldots, \alpha_K, \theta_1, \theta_2, \ldots, \theta_K\}$ completely define a GMM.

The expectation-maximization (EM) algorithm is the most widely used way to determine the parameters of a GMM (Ganjavi et al., 2017). This strategy alternates expectation and maximization steps until the log-likelihood, $\log p(Y|\Theta) = \log \prod_{i=1}^{n} p(y_i|\Theta)$, converges to a local optimum (McLachlan; Rathnayake, 2014). The Bayesian information criterion (BIC) can also be applied to define the appropriate number of components for the best-fitting GMM.

For each component $k$, a damage index is computed using the MSD between each observation $z$ in the test matrix $Z$ and the distribution $k$:

$$DI_k(z) = (z - \mu_k)^T \Sigma_k^{-1} (z - \mu_k). \tag{3.3}$$

If a new observation, $z$, belongs to the distribution of a healthy component $k$, $DI_k$ will be Chi-square distributed with $d$ degrees of freedom, $\chi^2_d$. Otherwise, $z$ will be considered an outlier of $\chi^2_d$, i.e., it can be either a damaged observation or a healthy observation clustered in another component. Thus, for each observation, the DI is given by the smallest $DI_k$, so that one can analyze whether $z$ belongs to any component:

$$DI(z) = \min\left[DI_1(z), \ldots, DI_K(z)\right]. \tag{3.4}$$

Then, one can compare the new sample with a threshold established using the learning data to determine the structural state, which in this case is the bolted joint's safety.
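The learning and scoring steps of Eqs. (3.1)-(3.4) can be sketched as follows. This is only a minimal illustration on synthetic two-dimensional features, not the implementation used in this work: the data, the number of components, the helper names `fit_healthy_gmm` and `damage_index`, and the 99% Chi-square level used for the threshold are all assumptions made for the example.

```python
import numpy as np
from scipy.stats import chi2
from sklearn.mixture import GaussianMixture


def fit_healthy_gmm(Y, n_components, seed=0):
    """Fit a full-covariance GMM to the healthy learning matrix Y (n x d) via EM."""
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type="full",
                          random_state=seed)
    return gmm.fit(Y)


def damage_index(gmm, Z):
    """Eqs. (3.3)-(3.4): smallest Mahalanobis squared distance to any component."""
    Z = np.atleast_2d(np.asarray(Z, dtype=float))
    di_k = np.empty((Z.shape[0], gmm.n_components))
    for k in range(gmm.n_components):
        diff = Z - gmm.means_[k]
        # Solve Sigma_k x = diff^T instead of forming the explicit inverse.
        solved = np.linalg.solve(gmm.covariances_[k], diff.T).T
        di_k[:, k] = np.einsum("ij,ij->i", diff, solved)
    return di_k.min(axis=1)


# Synthetic healthy data with two clusters (hypothetical values, illustration only).
rng = np.random.default_rng(0)
Y = np.vstack([rng.normal([0.0, 0.0], 1.0, size=(200, 2)),
               rng.normal([10.0, 10.0], 1.0, size=(200, 2))])

gmm = fit_healthy_gmm(Y, n_components=2)

# Assumed threshold: 99% quantile of the Chi-square distribution with d degrees of freedom.
threshold = chi2.ppf(0.99, df=Y.shape[1])

# Two observations at the cluster centres and one far from both clusters.
Z = np.array([[0.0, 0.0], [10.0, 10.0], [5.0, -5.0]])
di = damage_index(gmm, Z)
is_outlier = di > threshold
```

Taking the minimum over components means an observation is only flagged when it is far from every healthy cluster, which is what allows several healthy assemblies to coexist in the same model.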
The practical implementation can be performed using readily available packages, such as scikit-learn (https://scikit-learn.org/) in Python.

3.3 DETECTING TORQUE LOSS BY GMM

As pointed out before, due to the overlap between damaged and undamaged conditions in Fig. 5, classical classifiers that allocate the undamaged conditions to a single cluster would fit the probabilistic distribution poorly and are therefore unsuitable for classifying states from features of several torque conditions when the assembly and disassembly procedure is considered. In such classifiers, the DIs are calculated, for instance, simply using the MSD between a single learning data distribution, obtained from the features of a baseline condition in a healthy state, and the testing data under unknown torque conditions. However, since the features of different assemblies may be distant from each other in the feature space (see the dispersion of the healthy state in Fig. 3), the resulting healthy cluster has a high concentration of observations far from the mean of its distribution. Thus, a classical classifier would indicate that an unknown condition close to this non-representative mean is healthy, even though no baseline data show a healthy state there. This classical classifier can be seen as a GMM truncated to a single distribution for all the undamaged data, i.e., K = 1 cluster. When it is applied to a problem that inherently has multiple healthy clusters, the reference state may overlap a damaged one, so that situations where damage exists are indicated as healthy, yielding high type II error rates (in this work, Fig. 3 shows that most tests between 55 and 45 cNm would be incorrectly classified as healthy). For this reason, a GMM is used in this work to form K > 1 cluster components representing the baseline conditions (undamaged states).
For the GMM learning, half of the data set referring to safe torque conditions (from 80 to 60 cNm) is randomly selected within the sample space of features for each assembly. The remaining healthy data are used to assess false-positive errors and observe the clustering performance of the GMM-based algorithm. Table 4 summarizes the feature samples used in this work. Considering the data augmentation and the four assemblies, there were 60 feature samples for each torque condition.

Table 4 – Parameters for learning and validation

Data set     Condition   Number of samples   Torque range (cNm)
Learning     Healthy     150                 80, 75, 70, 65, 60
Validation   Healthy     150                 80, 75, 70, 65, 60
Validation   Damaged     660                 55, 50, 45, 40, 35, 30, 25, 20, 15, 10, 5

Source: Prepared by the author.

In the GMM learning procedure, K = 9 components represent the data in a healthy state. This number is chosen based on the Bayesian information criterion (BIC). The clearest way to understand how the GMM algorithm handles the data set to classify the structural states as healthy is to visualize the learned model and the dispersion of the feature clusters. Figure 7 illustrates the multivariate probability density functions (PDFs) learned by the GMM. The clusters also demonstrate the potential effectiveness of the model in detecting outliers for damage classification: all test data in healthy conditions are concentrated in the learned PDFs, and those in damaged conditions are far away. Furthermore, damaged data with a single feature value closely matching that of a healthy condition illustrate a scenario where relying on this single feature alone fails to classify accurately. For instance, this is evident in data where the 5th resonance frequency is approximately 1460 Hz, which are distinguished from the healthy clusters only by differences in the 6th resonance frequency.

Figure 7 – Learned GMM and features clustering. Source: Prepared by the author.
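The BIC-based choice of the number of components mentioned above can be sketched as a simple scan over candidate values of K. This is a hedged illustration only: the synthetic data, the candidate range, and the helper name `select_components_by_bic` are assumptions, and the sketch is not expected to reproduce the K = 9 obtained for the experimental data.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def select_components_by_bic(Y, k_max, seed=0):
    """Fit GMMs with 1..k_max components and return the K with the lowest BIC."""
    bics = []
    for k in range(1, k_max + 1):
        gmm = GaussianMixture(n_components=k,
                              covariance_type="full",
                              random_state=seed).fit(Y)
        bics.append(gmm.bic(Y))  # lower BIC indicates a better trade-off
    best_k = int(np.argmin(bics)) + 1
    return best_k, bics


# Synthetic healthy features with three well-separated clusters (hypothetical).
rng = np.random.default_rng(1)
Y = np.vstack([rng.normal([0.0, 0.0], 1.0, size=(100, 2)),
               rng.normal([10.0, 0.0], 1.0, size=(100, 2)),
               rng.normal([0.0, 10.0], 1.0, size=(100, 2))])

best_k, bics = select_components_by_bic(Y, k_max=6)
```

The BIC penalizes the likelihood by the number of free parameters, so adding components is only rewarded while they explain genuinely distinct groups of healthy observations.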
The MSD is used to compute the damage indices as the distance to the nearest healthy cluster. Note that some limitations to early detection may appear when many assemblies are considered in the GMM. Although the GMM prevents most false-negative occurrences by introducing multiple healthy clusters, it is still possible for a healthy cluster from one assembly to overlap a damaged condition from another. In this case, the error in condition detection is difficult to avoid, as there would be no distinction between samples in the feature space. However, this would generally occur only for some cases of early detection (in this work, for the 55 cNm and 50 cNm torque values).

Figure 8 shows the classification results, where the type I errors (false positives) are ≈ 3.6% and the type II errors (false negatives) are ≈ 0% for all validation data. This indicates that the identified GMM can detect the changes caused by fluctuations in tightening torque with good performance and is suitable for use even when different assembly sets are considered. To establish the threshold value, the Chi-square test was applied.

Figure 8 – DI computed by GMM using frequencies of the 5th and 6th modes. Type I errors are ≈ 2.8%, and type II errors are ≈ 0%. Source: Prepared by the author.

The analysis of the results indicates that it was possible to recognize a reduction in the torque values from the 80-60 cNm conditions to the 55-5 cNm conditions based on the proposed DI. In comparable methods in the literature using modal parameters, only a considerable loss of tightening torque is detected with statistical confidence, as in Luo and Yu (2017), where changes were only observed from 20 Nm in the undamaged condition to 2.5 Nm in the damaged one. Furthermore, it is also important to stress that the structure tested by these authors consisted of a single bolted joint, which is a setup quite sensitive to variations in the applied preload.
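For reference, the type I and type II error rates quoted for Fig. 8 follow from comparing each DI with the threshold and counting misclassifications per class. A minimal sketch is given below; the DI values, labels, threshold, and the function name `error_rates` are hypothetical and serve only to show the bookkeeping.

```python
import numpy as np


def error_rates(di, is_damaged, threshold):
    """Return (type I, type II) error rates for a thresholded damage index.

    Type I: healthy samples flagged as damaged (false positives).
    Type II: damaged samples accepted as healthy (false negatives).
    """
    di = np.asarray(di, dtype=float)
    is_damaged = np.asarray(is_damaged, dtype=bool)
    flagged = di > threshold
    type_i = float(np.mean(flagged[~is_damaged]))
    type_ii = float(np.mean(~flagged[is_damaged]))
    return type_i, type_ii


# Hypothetical damage indices: three healthy and two damaged samples.
di = [0.5, 2.0, 12.0, 30.0, 8.0]
is_damaged = [False, False, False, True, True]
type_i, type_ii = error_rates(di, is_damaged, threshold=9.21)
```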
At this point, additional efforts were made to quantify the torque based on the GMM damage index. For this, a stochastic regression between them was performed using a GPR. The theoretical basis and its application are presented in the following sections.

3.4 GAUSSIAN PROCESS REGRESSION FOR TIGHTENING TORQUE QUANTIFICATION

A Gaussian Process Regression (GPR) is a suitable procedure for torque quantification: through a calibration procedure, it yields a learning model capable of associating the change in features with the tightening torque variation. It should be emphasized that these features can be damage indices or a data set of modal parameters estimated for the learning and validation phases. However, since this work proposes an algorithm that integrates the detection and quantification methodologies, the GPR was trained on the damage index. Figure 9 depicts the proposed algorithm to estimate the GPR-based model to quantify the torque state. The key idea is to estimate the parameters controlling a GPR model in order to infer the regression's mean and variance within a Bayesian paradigm, assuming only limited information is available. The confidence interval indicates the degree to which the model is expected to successfully identify variations in tightening torque or, conversely, offers insight into the risks and uncertainties associated with such predictions.

Figure 9 – Algorithm for quantification of the tightening torque through the GP-based model. Source: Prepared by the author.

The tightening torque $T^{(i)} \in \mathbb{R}$ can be described as the output of a nonlinear regression (Rasmussen; Williams, 2006):

$$T^{(i)} = f(DI^{(i)}) + \varepsilon_S^{(i)}, \quad i = 1, 2, \ldots, N \text{ samples}, \tag{3.5}$$

where $f(\cdot)$ is a nonlinear function, $DI^{(i)} \in \mathbb{R}$ is the input value, in our case the damage index assumed known in the learning data, and $\varepsilon_S^{(i)}$ is a stochastic variable representing inherent randomness in the observations, which is assumed to follow a Gaussian distribution with zero mean:

$$\varepsilon_S^{(i)} \sim \mathcal{N}\left(\varepsilon_S^{(i)} \,|\, 0, \sigma_S^2\right), \tag{3.6}$$

where $\sigma_S^2$ is the variance of the Gaussian observation noise. For $N$ tests, the learning data set assumes the following simplified notation:

$$\mathcal{D} = \left(T^{(i)}, DI^{(i)}\right)_{i=1}^{N} \equiv (\mathbf{T}, \mathbf{DI}), \tag{3.7}$$

where $\mathbf{T} \in \mathbb{R}^{N \times 1}$ is the output vector (torques) and $\mathbf{DI} \in \mathbb{R}^{N \times 1}$ is the input vector (damage indices). Since the regression equation (3.5) represents a Gaussian Process, the function $f(\cdot)$ is assumed to follow a multivariate Gaussian prior distribution of zero mean (Rasmussen; Williams, 2006):

$$\mathbf{f} = f(\mathbf{DI}) \sim \mathcal{N}(\mathbf{f} \,|\, \mathbf{0}, \mathbf{K}), \tag{3.8}$$

where $\mathbf{K} \in \mathbb{R}^{N \times N}$ is the covariance (kernel) matrix with $K_{ij} = k(DI_i, DI_j)$, and $k(\cdot, \cdot)$ is the kernel function, also called the covariance function (Paixão et al., 2021; Teloli et al., 2021a).

The kernel function can take several forms depending on the type of application. This function represents a degree of agreement between two sample observations. In this work, considering the available data, the best results were achieved with a squared exponential kernel as the covariance function:

$$k(DI_i, DI_j) = \sigma^2 \exp\left[-\frac{1}{2}\left(\frac{DI_i - DI_j}{\ell}\right)^2\right], \tag{3.9}$$

where $\sigma^2$ is the hyperparameter that controls the amplitude of the model's covariance and $\ell$ is the length scale. The hyperparameters $\beta = [\sigma^2, \ell]$ are determined by maximizing the marginal log-likelihood of the observed data (Mattos et al., 2016):

$$\log p(\mathbf{T} \,|\, \mathbf{DI}, \beta) = -\frac{1}{2}\log\left|\mathbf{K} + \sigma_S^2 \mathbf{I}\right| - \frac{1}{2}\mathbf{T}^T\left(\mathbf{K} + \sigma_S^2 \mathbf{I}\right)^{-1}\mathbf{T} - \frac{N}{2}\log(2\pi), \tag{3.10}$$

where $\mathbf{I} \in \mathbb{R}^{N \times N}$ is the identity matrix. In this work, this maximization is performed using an evolutionary optimization method, and the optimum model is used to predict new outputs resulting from new inputs.
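The squared exponential kernel of Eq. (3.9) and the marginal log-likelihood of Eq. (3.10) can be evaluated with a few lines of NumPy, as sketched below. The data and hyperparameter values are hypothetical, and the function names `se_kernel` and `log_marginal_likelihood` are introduced only for this illustration. The coarse grid scan at the end is a simple stand-in for the evolutionary optimizer used in this work, not a reproduction of it.

```python
import numpy as np


def se_kernel(a, b, sigma2, ell):
    """Squared exponential kernel of Eq. (3.9) between two sets of scalar inputs."""
    a = np.asarray(a, dtype=float).reshape(-1, 1)
    b = np.asarray(b, dtype=float).reshape(1, -1)
    return sigma2 * np.exp(-0.5 * ((a - b) / ell) ** 2)


def log_marginal_likelihood(T, DI, sigma2, ell, sigma2_s):
    """Marginal log-likelihood of Eq. (3.10), computed via a Cholesky factor."""
    T = np.asarray(T, dtype=float).ravel()
    N = T.size
    Ky = se_kernel(DI, DI, sigma2, ell) + sigma2_s * np.eye(N)
    L = np.linalg.cholesky(Ky)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, T))  # Ky^{-1} T
    # log|Ky| = 2 * sum(log(diag(L))), hence the first term below.
    return (-np.sum(np.log(np.diag(L)))
            - 0.5 * T @ alpha
            - 0.5 * N * np.log(2.0 * np.pi))


# Hypothetical learning data: a smooth trend plus noise.
rng = np.random.default_rng(0)
DI = rng.uniform(0.0, 5.0, size=30)
T = np.sin(DI) + 0.1 * rng.normal(size=30)

# Coarse grid scan over beta = [sigma2, ell] with an assumed fixed noise variance.
grid = [(s2, l) for s2 in (0.1, 1.0, 10.0) for l in (0.1, 1.0, 10.0)]
scores = [log_marginal_likelihood(T, DI, s2, l, sigma2_s=0.01) for s2, l in grid]
best_sigma2, best_ell = grid[int(np.argmax(scores))]
```

Working with the Cholesky factor avoids forming the explicit inverse and determinant, which tends to be more stable when the kernel matrix is nearly singular.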
Bayesian inference is used to condition a posterior predictive distribution $p(f_* \,|\, DI_*, \mathbf{DI}, \mathbf{T})$ over the predicted function $f_*$ on the new input samples $DI_*$, which yields the main relationship for the GP regression (Rasmussen; Williams, 2006):

$$p(f_* \,|\, DI_*, \mathbf{DI}, \mathbf{T}) = \mathcal{N}\left(f_* \,|\, \mu_*, \sigma_*^2\right), \tag{3.11}$$

where the posterior predictive mean is given by

$$\mu_* = \mathbf{k}_{*N}\left(\mathbf{K} + \sigma_S^2 \mathbf{I}\right)^{-1}\mathbf{T}, \tag{3.12}$$

and the posterior predictive variance is given by

$$\sigma_*^2 = k_{**} - \mathbf{k}_{*N}\left(\mathbf{K} + \sigma_S^2 \mathbf{I}\right)^{-1}\mathbf{k}_{N*}, \tag{3.13}$$

where

$$\mathbf{k}_{*N} = [k(DI_*, DI_1), \cdots, k(DI_*, DI_N)], \tag{3.14}$$

$$\mathbf{k}_{N*} = \mathbf{k}_{*N}^T, \tag{3.15}$$

$$k_{**} = k(DI_*, DI_*). \tag{3.16}$$

The predictive distribution of $T$ is analogous to that of $f_*$, with the noise variance $\sigma_S^2$ added to $\sigma_*^2$. The numerical implementation can be carried out using readily available packages, for instance, UQLab (https://www.uqlab.com/) in Matlab (see Lataniotis, Marelli and Sudret (2018)) or Python.

3.5 TIGHTENING TORQUE ESTIMATION VIA GPR

A natural step after detecting the loss of tightening torque is to estimate its value without requiring a direct measurement using, for instance, a torque wrench or an in situ structural inspection. This way, the proposed framework covers not only the first hierarchical level of SHM postulated by Rytter (1993) (damage detection) but also the third (damage quantification). Since the natural frequency-based DI can track torque variations, this score is used to calibrate the GPR for torque estimation.

The learning data $(\mathbf{T}, \mathbf{DI})$ used to learn the GPR-based model and estimate the hyperparameters $\beta$ consist of half of the observed DI samples, randomly selected from a uniform distribution to form the learning input vector $\mathbf{DI}$, whereas the torque values corresponding to each DI form the output vector $\mathbf{T}$. The remaining DI samples validate the GPR-based model.

Figure 10 depicts the torque predicted by the GPR-based model with 95% statistical confidence bands in direct comparison with the validation data.
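The predictive equations (3.12)-(3.16) and the random half split used for learning and validation can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in this work: the DI-torque pairs are synthetic, the hyperparameters are fixed by hand instead of being optimized, and `se_kernel` and `gpr_predict` are hypothetical helper names.

```python
import numpy as np


def se_kernel(a, b, sigma2, ell):
    """Squared exponential kernel of Eq. (3.9)."""
    a = np.asarray(a, dtype=float).reshape(-1, 1)
    b = np.asarray(b, dtype=float).reshape(1, -1)
    return sigma2 * np.exp(-0.5 * ((a - b) / ell) ** 2)


def gpr_predict(DI_train, T_train, DI_new, sigma2, ell, sigma2_s):
    """Posterior predictive mean (3.12) and variance (3.13) of f* at DI_new."""
    T_train = np.asarray(T_train, dtype=float).ravel()
    Ky = se_kernel(DI_train, DI_train, sigma2, ell) + sigma2_s * np.eye(T_train.size)
    k_sN = se_kernel(DI_new, DI_train, sigma2, ell)       # rows are k_{*N}, Eq. (3.14)
    mu = k_sN @ np.linalg.solve(Ky, T_train)
    v = np.linalg.solve(Ky, k_sN.T)                        # Ky^{-1} k_{N*}
    var = sigma2 - np.sum(k_sN * v.T, axis=1)              # k_** = sigma2 for this kernel
    return mu, var


# Synthetic DI-torque pairs (hypothetical trend, for illustration only).
rng = np.random.default_rng(0)
DI_all = rng.uniform(0.0, 50.0, size=80)
T_all = 60.0 * np.exp(-DI_all / 15.0) + rng.normal(0.0, 2.0, size=80)

# Random half split: one half for learning, the other half for validation.
idx = rng.permutation(DI_all.size)
learn, valid = idx[:40], idx[40:]

# Assumed (not optimized) hyperparameters.
sigma2, ell, sigma2_s = 400.0, 10.0, 4.0
mu, var = gpr_predict(DI_all[learn], T_all[learn], DI_all[valid], sigma2, ell, sigma2_s)

# 95% bands for the torque T, i.e. including the observation noise variance.
half_width = 1.96 * np.sqrt(var + sigma2_s)
lower, upper = mu - half_width, mu + half_width
```

Because the predictive distribution is Gaussian and unbounded, nothing prevents the lower band from crossing zero where the data are sparse or widely scattered.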
The model has wide confidence bands for low torque values, even covering negative torque values. This is explained by the dispersion of the features used to compute the DI. Note in Fig. 3 that, for the 5th resonance frequency, the variability between the different assemblies increases as the tightening torque decreases. This result is expected because the uncertainties related to the contact pressure distribution are reduced when the preload increases. Notably, the reductions in the undamped natural frequency tend to be more pronounced at lower tightening torques. In this context, the MSD increases correspondingly at low torque values, making the points more distant from each other and thus hampering the regression in this region. It also stands out that the relationship between torque and DI is not one-to-one: there is more than one DI for the same torque and more than one torque for the same DI. Nonetheless, this GPR-based model can help the user make decisions about the tightening torque based on score changes and quantify possible damage.

Figure 10 – Torque versus DI with a GPR model. Source: Prepared by the author.

Figure 11 compares the actual and estimated torques. It is worth noting that the estimated mean values are close to the actual ones for each torque condition, especially for damaged conditions. There are some expected limitations in accurately estimating the tightening torque in healthy conditions, as all of them are considered to be in the reference state when computing the MSD. Therefore, the DI tends to be ≈ 0. The zoom shown in Fig. 10 reveals that, for all conditions between 80 and 60 cNm, a mean torque of ≈ 70 cNm is predicted. Since all of these are indistinguishable healthy states, this imprecision entails no practical loss.
Combining the GMM and GPR algorithms, first to detect whether there is damage in the structural connection and subsequently, in the affirmative case, to quantify the severity of the damage with reasonable accuracy, provides a simple tool with the potential to aid decision-making in maintenance procedures involving bolted joints. This result is feasible for industrial applications once a calibrated GPR-based model is obtained through a supervised learning procedure. Based on this, the most viable alternative is to consider the mean or even the upper and/or lower limits of the confidence bands to make a safer decision.

Figure 11 – Estimated torque versus actual torque for all indices (•) (grey) and the mean of the estimated torque (×) (red) in each test condition. Source: Prepared by the author.

3.6 CONCLUSIONS

The following considerations and benefits of using the proposed SHM framework are outlined:

• The advantage of applying a GMM is that it builds several distributions from different assembly configurations to learn how the modal parameters change with the torque, making outlier detection much more robust. However, attention must be paid: in practical terms, new clusters associated with the healthy condition must be produced with each new assembly of the structure. This is not a direct limitation of the method, since the assembly protocol is expected to guarantee torque values consistent with healthy conditions;

• The approach is not limited to modal parameters; for instance, features extracted from output-only time series could be used to obtain additional damage-sensitive indicators and calculate the score for classification. Nonetheless, caution should be taken, since some features extracted by output-only methods capture global vibration effects, which may compromise the detection of the damage position;

• For this application, where the intention is only to detect whether or not significant torque loss exists, and not to locate the damage, a single pair of measurement points is satisfactory for extracting features based on modal parameters. Also, the number of samples required for learning is not burdensome to acquire;

• GPR provided a good fit of the nonlinear relation between the torque and the damage index. However, since more than one torque condition is taken as healthy by the GMM, the GPR cannot distinguish between these torque conditions, as the damage index tends to 0 for all of them. So, the SHM framework can be divided into two steps: first, verify the existence of damage through the GMM damage index; then, only if there is damage, quantify its severity through GPR;

• The limitation in quantifying the undamaged state is irrelevant, since there are no functional impairments between healthy torque conditions. Also, as the methodology has proven effective in early damage detection, there is a good safety margin to monitor the damage before it can cause a failure.

4 ON THE GAUSSIAN PROCESS REGRESSION OF THE TRANSMISSIBILITY CURVE FOR LOOSENING DETECTION IN BOLTED STRUCTURES

Chapter 3 illustrated that there is a clear relationship between the tightening torque loss and modal features. Consequently, loosening directly affects the transmissibility curve (see Fig. 2). Also, as discussed in Section 3.1, it is not easy to estimate damping accurately due to noise, signal processing, and transmissibility resolution. For this reason, damping estimation errors are generally larger than the variability generated by damage, preventing the use of damping as a damage feature.
Therefore, much information in the frequency domain is lost. Moreover, modal parameters require further processing, since a modal analysis procedure must be performed on all new unlabeled data. This chapter proposes a different approach that simplifies the feature extraction procedure. The key idea is to use the transmissibility function directly as a feature and compare the curves from a baseline condition with new unlabeled data. In this way, all relevant frequency information can be considered while avoiding a modal analysis step and its possible estimation errors. A GPR model is identified from the undamaged data to define the baseline condition. Then, for every new data point, a statistical analysis evaluates how far the unlabeled sample is from the healthy model.

Notably, the effects of loosening in the Orion beam appear to be more pronounced in vibration modes with higher frequencies (Teloli et al., 2022; Miguel et al., 2022; Coelho et al., 2024). This seems to be a commonly observed behavior in bolted connections due to the geometrical complexity around the joint in high-order vibration modes, as illustrated by Huda et al. (2013). Consequently, employing the full range of the transmissibility function could generate an excessive volume of data. This abundance of data might negatively impact computational efficiency without enhancing the model's ability to detect loosening within the machine learning framework. This observation becomes apparent following a detailed examination of the modal parameters, as discussed in Chapter 3. However, it is important to note that, in numerous applications, identifying the most relevant set of data or features to consider in an SPR algorithm is often complex and nontrivial.

A GSA is performed to ensure that the machine learning model and analysis only
consider relevant information in the loosening detection context. For this, Sobol's indices are calculated to quantify the influence of a selected frequency range on the structural state. In other words, the analysis tries to verify whether there is any direct relation between a change in the features, i.e., the frequency signature, and a difference in the structural state. This approach allows the selective omission of any data elements that increase computational expense without significantly enhancing the classifier's performance. Furthermore, it is illustrated that, in some instances, using non-significant data could even be detrimental to the classifier. Therefore, the classifier's effectiveness can be optimized by filtering out such information while maintaining computational efficiency.

Section 4.1 presents the framework for directly using transmissibility functions modeled by a GPR for damage detection. Some difficulties of indiscriminately using the whole frequency spectrum are addressed. As part of the solution, Section 4.2 presents the concept of GSA and Sobol's indices and how they are applied in practice. Then, in Section 4.3, the performance and influence of every frequency range are analyzed to choose the ones that ensure the construction of an optimized classifier, and the final damage detection results illustrate the effectiveness of the methodology. Lastly, some remarks are made.

4.1 DAMAGE DETECTION BASED ON THE GPR OF TRANSMISSIBILITY

The proposed methodology is based on transmissibility functions similar to those presented in Fig. 2. Although this chapter still deals with the Orion beam, the data set is different. The instrumentation, sampling parameters, and input statistical characteristics are the same, but there is greater signal variability because different input signals are used for each realization. More details are given in Section 2.2.
In addition, white noise with a 35 dB signal-to-noise ratio (SNR) was superimposed on each velocity signal to increase uncertainty and variability. Furthermore, this white noise made it possible to generate new synthetic time series from the measurements presented in Table 2. In this way, each subsignal generated 10 different noisy time series, composing a data set of 120 time series for each torque condition. The safe torque condition was taken as 60 cNm, and the damaged conditions as 30 cNm, 20 cNm, and 10 cNm.

First, a GPR model is learned from healthy transmissibility data and is taken as a reference model. Note that, as it uses a frequency signature of the structure, it is important to consider an excitation that covers a wide frequency range. Some harmonic inputs are available, such as sweep sine or step sine. This class of signal has the advantage of inducing a larger amount of energy along the frequency domain, in particular the step sine, which reaches a steady-state response. In this way, it is possible to strongly excite the nonlinearities and possibly enhance the changes in the structural behavior due to the presence of damage. However, these types of excitation are more difficult to reproduce in industrial applications. Because of that, the use of broadband random signals is considered.

The procedure for identifying the model is the same as described in Section 3.4. With a more appropriate notation, the GP from Eq. 3.5 is described as

$$\mathcal{Y}^{(i)} = f(\mathcal{X}^{(i)}) + \varepsilon_S^{(i)}, \quad i = 1, 2, \ldots, N_t, \tag{4.1}$$

in which $\mathcal{Y}^{(i)} \in \mathbb{R}^{N_f \times 1}$ is the $i$th transmissibility function from a data set with $N_t$ measurements, $\mathcal{X}^{(i)} \in \mathbb{R}^{N_f \times 1}$ is the respective frequency vector, and $\varepsilon_S^{(i)} \in \mathbb{R}^{N_f \times 1}$ is a Gaussian error of zero mean along the frequency range. For this application, the learning data set $\mathcal{D}$ from Eq.
3.7 has the components

$$\mathcal{D} = \left(\mathcal{Y}^{(i)}, \mathcal{X}^{(i)}\right)_{i=1}^{N_t} \equiv (\mathcal{Y}, \mathcal{X}), \tag{4.2}$$

where $\mathcal{Y} \in \mathbb{R}^{(N_t N_f) \times 1}$ is the vector of $N_t$ transmissibility amplitudes measured at $N_f$ frequency values within the frequency ranges of interest for the undamaged condition. Similarly, $\mathcal{X} \in \mathbb{R}^{(N_t N_f) \times 1}$ is the vector of frequencies. Using the whole frequency range from 0 to 2000 Hz indiscriminately as a first example, the GPR in Fig. 12 is obtained.

Figure 12 – GPR model identified for the whole transmissibility. Source: Prepared by the author.

The main idea of the proposed method is to assess whether an unlabeled transmissibility lies inside the identified confidence bands and use this information to indicate the structural state. This way, all relevant spectral information is considered for the classification procedure. Outlier detection was carried out to quantify the performance of this GPR model and identify damage. At this point, a convenient advantage of identifying a non-parametric stochastic model such as GPR is highlighted: since it yields a mean vector $\mu_*$ and a covariance matrix $\Sigma_*$, it integrates directly the information provided by the stochastic regression with the statistical knowledge required to train an MSD classifier. In this way, a GPR-based MSD is proposed to compute the distance between a new transmissibility curve $\mathcal{Y}_n$ and the identified multivariate Gaussian distribution. An essential advantage of using this GPR-based version instead of applying the MSD directly to the data is that too many spectral lines are taken as features when it is applied to a frequency curve. Consequently, the problem may have more features than realizations, which generates a singular covariance matrix when the covariance is estimated directly from the data. Thus, the proposed damage index is given by

$$DI_{GPR}(\mathcal{Y}_n) = (\mathcal{Y}_n - \mu_*)^T \Sigma_*^{-1} (\mathcal{Y}_n - \mu_*). \tag{4.3}$$

Figure 13 illustrates the poor performance of this first attempt at damage identification when applying the new damage index $DI_{GPR}$.
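Given a predictive mean vector and covariance matrix, the damage index of Eq. (4.3) reduces to a single quadratic form. The sketch below evaluates it through a Cholesky factor. The numerical values and the function name `gpr_msd` are hypothetical. It is also assumed here that the covariance matrix passed in is positive definite, for example because the noise variance is included on its diagonal.

```python
import numpy as np


def gpr_msd(y_new, mu_star, sigma_star):
    """Eq. (4.3): Mahalanobis squared distance of a curve to the GPR predictive Gaussian."""
    r = np.asarray(y_new, dtype=float) - np.asarray(mu_star, dtype=float)
    # Sigma = L L^T, so r^T Sigma^{-1} r = ||L^{-1} r||^2.
    L = np.linalg.cholesky(np.asarray(sigma_star, dtype=float))
    w = np.linalg.solve(L, r)
    return float(w @ w)


# Hypothetical three-line "transmissibility" with a diagonal predictive covariance.
mu_star = np.zeros(3)
sigma_star = np.diag([1.0, 4.0, 9.0])
di_gpr = gpr_msd([1.0, 2.0, 3.0], mu_star, sigma_star)
```

Each spectral line is weighted by the inverse of its predictive variance, so lines where the model is uncertain contribute less to the index than lines where the confidence bands are narrow.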
This result prompts some first considerations about the approach: this first model highlights the difference between considering all information and all relevant information. Analyzing Fig. 12, one can see that the frequency ranges between the vibration modes carry no information about the structural state, as they show the same behavior for all conditions. Also, as the amplitude in these intervals is small (especially for the high-order modes), the background noise becomes more significant, harming the GPR model. Note that, as the model is identified along the entire frequency range, using all spectral lines from the whole learning data set impedes model identification due to the high computational cost. In the model exemplified in Fig. 12, only 15 of the 120 realizations were taken for learning due to computational limitations. Additionally, as the approach considers the entire transmissibility curve as a feature, every spectral line is a separate feature. In this way, the indiscriminate use of the whole transmissibility confuses the classifier in a high-dimensional feature space.

Figure 13 – DI computed by GPR-based MSD using the whole transmissibility. Type I errors are ≈ 17.5%, type II errors are ≈ 44.2%, and macro-F1 ≈ 0.64. Source: Prepared by the author.

To address the abovementioned issues, the frequency ranges between the vibration modes were disregarded in the learning data set and in the GPR-based MSD. Figure 14 shows the improved model, followed by the classifier in Fig. 15, which considers only the learned frequency ranges. These ranges were selected by taking the mean value of each resonance frequency and a band of 10% around it. To capture sufficient frequency bands, vibration modes at low frequencies had their ranges limited to a minimum of 50 Hz. As the computational cost decreases, it was possible to train the GPR using half of the healthy realizations, randomly selected (60 frequency curves), to compose $\mathcal{D}$.
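The band selection described above can be sketched as a Boolean mask over the frequency vector. The interpretation adopted here is an assumption: the 10% band is taken as the total width centred on each mean resonance frequency, and the 50 Hz minimum is also applied to the total width. The function name `resonance_band_mask` and the resonance values in the example are hypothetical.

```python
import numpy as np


def resonance_band_mask(freqs, resonances, rel_width=0.10, min_width=50.0):
    """Mark spectral lines inside a band centred on each mean resonance frequency.

    Assumption: rel_width and min_width both refer to the total band width.
    """
    freqs = np.asarray(freqs, dtype=float)
    mask = np.zeros(freqs.shape, dtype=bool)
    for f0 in resonances:
        half = 0.5 * max(rel_width * f0, min_width)
        mask |= (freqs >= f0 - half) & (freqs <= f0 + half)
    return mask


# Frequency axis from 0 to 2000 Hz with 1 Hz resolution (assumed resolution).
freqs = np.arange(0.0, 2001.0, 1.0)

# Hypothetical mean resonance frequencies of a low-order and a high-order mode.
mask = resonance_band_mask(freqs, resonances=[100.0, 1460.0])
selected_freqs = freqs[mask]
```

Only the masked spectral lines would then enter both the GPR learning set and the GPR-based MSD, which is what keeps the classifier restricted to the frequency ranges the model has actually seen.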
Figure 14 – GPR model identified using only the frequency range around the vibration modes. Source: Prepared by the author.

Figure 15 – DI computed by GPR-based MSD using only the frequency range around the vibration modes. Type I errors are ≈ 16.67%, type II errors are ≈ 25.00%, and macro-F1 ≈ 0.78. Source: Prepared by the author.

The identified model in Fig. 14 depicts important aspects of a GPR. First, the model is a robust method for uncertainty quantification and for making stochastic predictions that interpolate between learning data. However, GPR cannot make accurate predictions in a region where it does not have sufficient information. When moving away from the end of a resonance region, the confidence bands quickly become wider, showing that the model completely loses the ability to predict the region's shape and begins to accept any response that follows the decreasing trend of the transmissibility data. However, it should be noted that these expected model limitations do not compromise the classification, since only the amplitudes belonging to the ranges used in training were used to calculate the damage index, which considerably improves damage identification.

4.2 GLOBAL SENSITIVITY ANALYSIS

A GSA was employed to obtain further improvements. The main idea is to select which frequency ranges, each associated with a vibration mode, are related to the change in the structural state. This analysis is very relevant because it allows a classifier to use all (and only) the pertinent features. Additionally, in the context of growing big data and deep learning approaches, it is common for algorithms to use more and more information that may be redundant or irrelevant for a specific analysis, which unnecessarily burdens the solution of simple problems.
In general, the sensitivity analysis (SA) applied to any given system controlled by a set of parameters aims to correlate the variability of its parameters with the output of the system and then verify which parameters propagate the most uncertainty in the system (Saltelli et al., 2004). This way, one can determine the set of parameters that should be determined more rigorously (requiring a more careful stochastic analysis) or the set of parameters that are worth changing to obtain specific characteristics or improvements in the behavior of a system, since changing the others would generate little change in the output. The specific class of analysis named "global" (GSA) differs from those called "local" (LSA) because GSA allows the influence of the parameters to be evaluated simultaneously: the effect of each one is computed as an average over the possible values of the others. On the other hand, LSA provides a more individualized analysis in which the parameters are varied one at a time. A brief but sufficiently detailed overview of GSA, including the concepts and mathematical background, can be found in Saltelli (2004), whereas Saltelli et al. (2004) provides a complete guide for its application.

4.2.1 Variance-based global sensitivity analysis

Among the different GSA methods, such as the Morris method (Morris, 1991) and the Borgonovo indices (Borgonovo, 2007), the variance-based global sensitivity analysis method (Sobol' indices) (Sobol, 1993) stands out as the most general and flexible approach to dealing with GSA in different scenarios (linear, nonlinear, analytical, experimental, etc.), besides obtaining superior results (Homma; Saltelli, 1996). These indices are based on the decomposition of the system response $Y$ in terms of the contribution of a set of $N_p$ parameters $X = \{X_1, X_2, \ldots, X_{N_p}\}$, which is known as the Hoeffding-Sobol decomposition (Sobol, 1993; Hoeffding, 1948): Y = M(X) = M0