Machine learning for classification of soybean populations for industrial technological variables based on agronomic traits
| dc.contributor.author | Teodoro, Larissa Pereira Ribeiro | |
| dc.contributor.author | Silva, Maik Oliveira | |
| dc.contributor.author | dos Santos, Regimar Garcia [UNESP] | |
| dc.contributor.author | de Alcântara, Júlia Ferreira | |
| dc.contributor.author | Coradi, Paulo Carteri | |
| dc.contributor.author | Biduski, Bárbara | |
| dc.contributor.author | da Silva Junior, Carlos Antonio | |
| dc.contributor.author | Torres, Francisco Eduardo | |
| dc.contributor.author | Teodoro, Paulo Eduardo | |
| dc.contributor.institution | Universidade Federal de Mato Grosso do Sul (UFMS) | |
| dc.contributor.institution | Universidade Estadual Paulista (UNESP) | |
| dc.contributor.institution | Federal University of Santa Maria | |
| dc.contributor.institution | University of Passo Fundo | |
| dc.contributor.institution | State University of Mato Grosso (UNEMAT) | |
| dc.contributor.institution | Universidade Estadual de Mato Grosso do Sul (UEMS) | |
| dc.date.accessioned | 2025-04-29T20:05:59Z | |
| dc.date.issued | 2024-03-01 | |
| dc.description.abstract | A current challenge for genetic breeding programs is to increase grain yield and protein content while at least maintaining oil content. However, evaluating industrial traits is time-consuming and costly. Accurate models that classify genotypes with better industrial technological performance from traits that are easier and faster to measure, such as agronomic ones, are therefore of great importance for soybean breeding programs. The objective was to classify groups of soybean genotypes for industrial technological variables based on agronomic traits measured in the field, using machine learning (ML) techniques. Field experiments were carried out at two sites in a randomized block design with two replications and 206 F2 soybean populations. The agronomic traits evaluated were days to maturation (DM), first pod height (FPH), plant height (PH), number of branches (NB), main stem diameter (SD), mass of one hundred grains (MHG), and grain yield (GY). The industrial technological variables evaluated were oil yield and crude protein, crude fiber, and ash contents, determined by high-optical-accuracy near-infrared spectroscopy (NIRS). The models tested were support vector machine (SVM), artificial neural network (ANN), the decision tree models J48 and REPTree, random forest (RF), and logistic regression (LR, used as a control). Genotypes were clustered using principal component analysis (PCA) and the k-means algorithm; the resulting clusters were then used as the output variables of the ML models, with the agronomic traits as input variables. ML techniques provided accurate models for classifying soybean genotypes for the more complex industrial technological variables based on agronomic traits. RF outperformed the other models and can contribute to soybean breeding programs by classifying genotypes for industrial technological traits. | en |
| dc.description.affiliation | Federal University of Mato Grosso do Sul (UFMS), MS | |
| dc.description.affiliation | Department of Agronomy State University of São Paulo (UNESP), SP | |
| dc.description.affiliation | Department of Agricultural Engineering Federal University of Santa Maria, RS | |
| dc.description.affiliation | Department of Food Science and Technology University of Passo Fundo, RS | |
| dc.description.affiliation | Department of Geography State University of Mato Grosso (UNEMAT), MT | |
| dc.description.affiliation | State University of Mato Grosso do Sul (UEMS), MS | |
| dc.description.affiliationUnesp | Department of Agronomy State University of São Paulo (UNESP), SP | |
| dc.identifier | http://dx.doi.org/10.1007/s10681-024-03301-w | |
| dc.identifier.citation | Euphytica, v. 220, n. 3, 2024. | |
| dc.identifier.doi | 10.1007/s10681-024-03301-w | |
| dc.identifier.issn | 1573-5060 | |
| dc.identifier.issn | 0014-2336 | |
| dc.identifier.scopus | 2-s2.0-85185689105 | |
| dc.identifier.uri | https://hdl.handle.net/11449/306336 | |
| dc.language.iso | eng | |
| dc.relation.ispartof | Euphytica | |
| dc.source | Scopus | |
| dc.subject | Fiber | |
| dc.subject | Glycine max (L.) Merrill | |
| dc.subject | Oil | |
| dc.subject | Protein | |
| dc.subject | Random forest | |
| dc.title | Machine learning for classification of soybean populations for industrial technological variables based on agronomic traits | en |
| dc.type | Article | en |
| dspace.entity.type | Publication |
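
The two-stage workflow described in the abstract (cluster the genotypes with PCA and k-means, then predict the cluster from agronomic traits) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses scikit-learn rather than whatever software the study used, the data are synthetic placeholders, and the choice to cluster on the industrial variables, the number of principal components, the number of clusters, and the 5-fold cross-validation are all assumptions that the record does not state.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_populations = 206  # number of F2 populations reported in the abstract

# Synthetic placeholder data, NOT the study's measurements.
# Agronomic inputs, columns: DM, FPH, PH, NB, SD, MHG, GY
agronomic = rng.normal(size=(n_populations, 7))
# Industrial technological variables, columns: oil, crude protein, crude fiber, ash
industrial = rng.normal(size=(n_populations, 4))

# Stage 1: group genotypes with PCA followed by k-means.
# Assumptions: clustering on the industrial variables, 2 components, 3 clusters.
scaled = StandardScaler().fit_transform(industrial)
scores = PCA(n_components=2).fit_transform(scaled)
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scores)

# Stage 2: predict the cluster label from the agronomic traits.
# Only two of the models named in the abstract are shown: RF and the LR control.
models = {
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
}

# Mean cross-validated accuracy per model (assumed 5-fold, stratified by default).
accuracy = {
    name: cross_val_score(model, agronomic, clusters, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}

for name, value in accuracy.items():
    print(f"{name}: mean accuracy = {value:.3f}")
```

Because the placeholder inputs are random and unrelated to the cluster labels, the printed accuracies should sit near chance level; the sketch only shows how the cluster labels become the output variable of the classifiers.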

