Universidad Tecnológica Indoamérica

An Approach Based on Web Scraping and Denoising Encoders to Curate Food Security Datasets

Journal
Agriculture (Switzerland)
Date Issued
2023
Author(s)
Santos, Fabián
Centro de Investigación para el Territorio y el Hábitat Sostenible
Acosta N.
Type
Article
DOI
10.3390/agriculture13051015
URL
https://cris.indoamerica.edu.ec/handle/123456789/8244
Abstract
Ensuring food security requires the publication of data in a timely manner, but often this information is not properly documented and evaluated. Therefore, the combination of databases from multiple sources is a common practice to curate the data and corroborate the results; however, this also results in incomplete cases. These tasks are often labor-intensive since they require a case-wise review to obtain the requested and completed information. To address these problems, an approach based on Selenium web-scraping software and the multiple imputation denoising autoencoders (MIDAS) algorithm is presented for a case study in Ecuador. The objective was to produce a multidimensional database, free of data gaps, with 72 species of food crops based on the data from 3 different open data web databases. This methodology resulted in an analysis-ready dataset with 43 parameters describing plant traits, nutritional composition, and planted areas of food crops, whose imputed data obtained an R-square of 0.84 for a control numerical parameter selected for validation. This enriched dataset was later clustered with K-means to report unprecedented insights into food crops cultivated in Ecuador. The methodology is useful for users who need to collect and curate data from different sources in a semi-automatic fashion. © 2023 by the authors.
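The two steps the abstract describes (combining sources, which leaves incomplete cases, and scoring imputed values against a control parameter) can be illustrated with a minimal sketch. It is not the authors' code: the species records, parameter names, and helper functions `merge_sources` and `r_squared` are hypothetical, the numbers are invented for illustration, and R-square is assumed here to mean the coefficient of determination (the paper may define it differently). The MIDAS imputation and Selenium scraping steps are not reproduced.

```python
from statistics import mean


def merge_sources(*sources):
    """Outer-join dicts keyed by species; parameters absent from a source stay None."""
    columns = sorted({col for src in sources for row in src.values() for col in row})
    species = sorted({name for src in sources for name in src})
    merged = {}
    for name in species:
        row = dict.fromkeys(columns)  # every parameter starts as a gap (None)
        for src in sources:
            row.update(src.get(name, {}))
        merged[name] = row
    return merged


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mu = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mu) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical toy records standing in for the three web databases.
traits = {"maize": {"height_m": 2.5}, "potato": {"height_m": 0.6}}
nutrition = {"maize": {"protein_g": 9.4}, "quinoa": {"protein_g": 14.1}}
areas = {"potato": {"area_ha": 1200.0}, "quinoa": {"area_ha": 300.0}}

merged = merge_sources(traits, nutrition, areas)
gaps = sum(v is None for row in merged.values() for v in row.values())

# Validation idea: hide known values of a control parameter, impute them,
# then compare the imputed values with the hidden originals.
held_out = [1.0, 2.0, 3.0, 4.0]   # invented "true" values
imputed = [1.1, 1.9, 3.2, 3.8]    # invented imputed values
score = r_squared(held_out, imputed)
```

In this toy case each species is missing exactly one of the three parameters, which is the kind of gap an imputation model would then be asked to fill.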
Subjects
  • Alzheimer; apache net...

Scopus© citations
1
Acquisition Date
Jun 6, 2024
Views
3
Acquisition Date
May 24, 2025