Introduction
Addressing Challenges in Practical Data Utilization
Addressing Data Errors to Ensure Field-Level and Spatial Accuracy
Addressing Insufficient Soil Attribute Data for Agricultural Modeling
Conclusions
Introduction
Agricultural systems are confronting a range of significant challenges (Smith et al., 2016). Although issues like declining soil fertility, deteriorating environmental conditions, and the uncertainty of crop production under climate change are not new, they remain critically important to address and require urgent attention. Consequently, there is a growing demand for advanced management practices to enhance crop production conditions. On commercial land, most agricultural technologies focus on alternative soil tillage and the efficient use of energy and inputs to optimize carbon, nutrient, and moisture retention, thereby enhancing agricultural productivity and profitability (Lal et al., 2015).
Soil plays a crucial role in supporting various ecosystem functions. One of its most important functions is to serve as a source and a sink for essential fertility elements (Oertel et al., 2016). Globally, soil contains approximately 1,500 billion tonnes of carbon in the form of organic matter, roughly twice the amount stored in the atmosphere (760 billion tonnes) and nearly three times that stored in vegetation (560 billion tonnes) (Friedlingstein et al., 2020). This reservoir of carbon is integral to the chemical, physical, and biological properties of soil. Soil carbon sequestration, which involves both storing organic matter and reducing CO2 emissions, plays a key role in maintaining soil quality and ecosystem integrity. Furthermore, nutrient availability, especially that of nitrogen, phosphorus, and sulfur, can limit organic matter formation due to carbon-nutrient stoichiometry (van Groenigen et al., 2006). These figures highlight the critical importance of soil as a natural resource. As a result, effective soil management is vital for sustaining soil fertility and crop yields (Elbasiouny et al., 2022). Yet managing soils is expected to become increasingly challenging over time, and the physical and chemical variability of soil further complicates management efforts.
At this juncture, we must assess our current knowledge of soil and leverage existing soil data to address both present and future challenges. Compiling and analyzing relevant data is crucial for optimizing crop management, reducing costs, and improving yields, all of which can support sustainable agricultural intensification. Additionally, this data is vital for developing and implementing solutions to various challenges of soil fertility management. Recently, process-based models have emerged as valuable data-driven tools for estimating crop yields, soil organic matter stocks, and greenhouse gas emissions under various scenarios. Many models have been developed for these purposes; some remain in active use, while others have become obsolete. Effective models rely on easily accessible input data related to crops, management practices, soil properties, and climate conditions. Specifically, soil data should include organic carbon, nitrogen, pH, sand and clay fractions, bulk density, soil temperature, and soil moisture (Jahn et al., 2006). This data provides crucial information about organic matter and soil fertility, thereby helping farmers make informed decisions about soil management.
There is growing interest in soil data with optimal spatial and temporal coverage to effectively address agricultural challenges. Fortunately, the Republic of Korea has a valuable repository of soil data known as HeukToram. However, several issues with this data persist. This opinion paper reviews the current state of national soil databases and highlights the challenges that must be addressed to improve their effectiveness and broaden their scope.
Addressing Challenges in Practical Data Utilization
Since 1964, the Rural Development Administration (RDA) of Korea has produced extensive data and maps obtained from nationwide agricultural soil surveys, which are made available through the HeukToram database. The database is also based on comprehensive soil test results provided by Agricultural Technology Centers (ATC), part of the National Institute of Agricultural Science (NAS). It offers field diagnostic services that assist farmers in optimizing manure application and fertilizer use. The measured parameters include soil organic matter, macronutrients (phosphorus, potassium, calcium, and magnesium), pH, and electrical conductivity. These parameters are used to determine crop-specific land preparation requirements, such as lime application rates. However, the online platform currently supports soil test data for only one field at a time, based on postal address (see https://soil.rda.go.kr/sibi/sibiExam.do). While this may not be an inherent issue, the service would benefit from the capability to obtain raw data for multiple fields simultaneously, allowing for more efficient analysis using data analytics. Accessing multi-field chemical data via an application programming interface (API) is possible, but the process is cumbersome for both farmers and researchers. It involves determining basic map components, such as the x and y coordinates of each field by address, and then matching these with administrative division codes.
In addition to the soil test database, 31 thematic soil maps are available to users upon request. These precision maps, with a spatial scale of 1:25,000, cover four categories: 1) soil profile, 2) classification, 3) topography, and 4) commentary. However, extracting physical data for individual fields from these maps can be time-consuming. For instance, displaying data for multiple points simultaneously is impractical, and checking various soil characteristics for a single point requires querying each characteristic individually. Users find accessing data via mobile devices more convenient due to faster loading times, whereas the process can be cumbersome on PCs. As a result, physical data verification often takes longer. For modeling purposes, which often require regional data, providing large datasets in attribute table format could significantly reduce the time needed to prepare input data. The soil maps represent both surface and subsurface layers, but it is important to specify the soil depths and apply consistent classification models for each attribute (e.g., texture). The HeukToram database includes soil profile information for 405 soil series, but this data needs to be downscaled to specific field levels.
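To illustrate the attribute table format suggested above, the following minimal Python sketch merges the results of separate single-characteristic queries into one table with a row per field. The field identifiers, attribute names, and values are hypothetical placeholders chosen for illustration; they are not actual HeukToram records, and the layout does not reflect the database's real schema.

```python
import csv
import io


def build_attribute_table(query_results):
    """Merge per-attribute query results into one row per field.

    query_results maps an attribute name to {field_id: value}. This
    input layout is an assumption made for illustration only.
    """
    attributes = sorted(query_results)
    field_ids = sorted(
        {fid for values in query_results.values() for fid in values}
    )
    rows = []
    for fid in field_ids:
        row = {"field_id": fid}
        for attr in attributes:
            # Missing values stay empty rather than being guessed.
            row[attr] = query_results[attr].get(fid, "")
        rows.append(row)
    return ["field_id"] + attributes, rows


def to_csv(header, rows):
    """Serialize the attribute table as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=header, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical results of three separate single-attribute queries.
example_queries = {
    "texture_class": {"F001": "loam", "F002": "sandy loam"},
    "drainage_class": {"F001": "well", "F002": "moderately well"},
    "slope_class": {"F001": "2-7%"},
}

header, rows = build_attribute_table(example_queries)
print(to_csv(header, rows))
```

A table of this shape can be loaded directly into statistical or modeling software, which is the main time saving relative to querying each characteristic for each point individually.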
To improve domestic use of the data, it is essential to address the issues at the levels of data administration and coordination. Additionally, investing in multilingual services is worthwhile, as Korea’s soil database has limited global visibility compared to more established international databases (e.g., Harmonized World Soil Database, European Soil Database) and national databases (e.g., Soil Survey Geographic Database) (Fig. 1). Approval-required data access is another issue that must be addressed, as open data policies are widely recommended. Such policies can enhance the international use of the data and contribute to continental and global databases.
Addressing Data Errors to Ensure Field-Level and Spatial Accuracy
There may be human errors in the soil test data and spatial inaccuracies in the soil thematic maps. Our primary concern lies with data-entry mistakes, such as mistakenly recording results from a previous sampling period instead of the current data, rather than errors related to sampling or analysis. Some approaches to address the sampling and analytical errors have already been proposed. However, errors like incorrect data entry or failing to record valid data can seriously compromise the reliability of the database (data not shown). Therefore, measures should be implemented to manage these errors statistically and minimize their occurrence whenever possible. Our analysis of the shapefiles for the soil maps also revealed discrepancies between the official administrative boundary data, specifically the nationwide cadastral map of Korea from the V-World Digital Twin National Spatial Data Infrastructure (NSDI Portal), and the boundaries shown on the soil maps, with spatial discrepancies of approximately 100–300 meters (Fig. 2). These discrepancies led to data mismatching due to coordinate errors, as the spatial extent covered by the maps was incomplete. Fortunately, these data issues did not arise with soil maps from the online server, suggesting that the problems are related to spatial errors in the shapefiles. The discrepancies were particularly pronounced around coastal areas, likely due to the complexity of the coastlines. Therefore, these issues should be addressed or clarified by officials.
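As one possible screening step for the data-entry mistakes described above, the sketch below flags soil test records whose values are identical to those of the same field in the previous sampling period. It is a minimal illustration rather than an established procedure: the record layout, field identifiers, attribute names, and values are all hypothetical, and an identical record is only a candidate for manual review, since genuinely unchanged results are possible.

```python
from collections import defaultdict


def flag_carried_over(records, attributes):
    """Return (field_id, period) pairs whose values match the same
    field's previous sampling period for every listed attribute."""
    by_field = defaultdict(list)
    for rec in records:
        by_field[rec["field_id"]].append(rec)

    flagged = []
    for fid, recs in by_field.items():
        # Compare each record with the one from the preceding period.
        recs.sort(key=lambda r: r["period"])
        for prev, curr in zip(recs, recs[1:]):
            if all(curr[a] == prev[a] for a in attributes):
                flagged.append((fid, curr["period"]))
    return sorted(flagged)


# Hypothetical soil test records (pH, organic matter in g/kg, EC in dS/m).
records = [
    {"field_id": "F001", "period": 2019, "ph": 6.1, "som": 24.0, "ec": 0.45},
    {"field_id": "F001", "period": 2021, "ph": 6.1, "som": 24.0, "ec": 0.45},
    {"field_id": "F002", "period": 2019, "ph": 5.8, "som": 19.5, "ec": 0.38},
    {"field_id": "F002", "period": 2021, "ph": 6.0, "som": 20.1, "ec": 0.41},
]

suspects = flag_carried_over(records, ["ph", "som", "ec"])
print(suspects)
```

Requiring a match on every attribute keeps false positives low; a looser rule (for example, a match on most attributes) would catch partial copy errors at the cost of more manual checks.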
Addressing Insufficient Soil Attribute Data for Agricultural Modeling
Establishing and providing large-scale quantitative datasets is crucial for accurate agricultural modeling. For instance, a typical process-based model that simulates the carbon and nitrogen cycles in soil requires fundamental data, including organic carbon, nitrogen, pH, clay content, and bulk density (Lee et al., 2020; Li, 2000). However, the current soil database is missing essential data (Table 1), such as nitrogen and hydraulic properties. Although soil organic carbon content can be estimated from organic matter values, the absence of detailed metadata on analytical methods and coefficients used across different ATCs complicates this calculation. Standardizing soil analysis methods across all centers is crucial to prevent confusion and ensure the reliability of predictive data used in modeling. Additionally, soil physical properties like clay content and bulk density are provided as categorical data rather than quantitative values, limiting their direct application in modeling and ecosystem simulation. Quantitative data are vital for enhancing the precision of various modeling studies, including those related to greenhouse gases, crops, and spectral analysis.
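The dependence of the organic carbon estimate on the chosen coefficient can be shown with a short Python sketch. It assumes the conventional van Bemmelen factor of 1.724, which treats organic matter as about 58% carbon; the coefficients actually used by individual centers are not documented here, and the second factor (2.0) and the soil test value are included only to illustrate sensitivity.

```python
# Conventional factor; assumes organic matter is about 58% carbon.
VAN_BEMMELEN_FACTOR = 1.724


def som_to_soc(som_g_per_kg, factor=VAN_BEMMELEN_FACTOR):
    """Estimate soil organic carbon (g/kg) from organic matter (g/kg)."""
    if som_g_per_kg < 0:
        raise ValueError("organic matter content cannot be negative")
    if factor <= 0:
        raise ValueError("conversion factor must be positive")
    return som_g_per_kg / factor


som = 25.0  # hypothetical soil test value, g/kg
for factor in (VAN_BEMMELEN_FACTOR, 2.0):
    print(f"factor {factor}: SOC = {som_to_soc(som, factor):.2f} g/kg")
```

For the same organic matter value, the two factors give estimates that differ by roughly 14%, which is why recording the coefficient in the metadata matters for model inputs.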
Unlike widely used global soil databases that offer soil profile data down to 2 meters, the current HeukToram database includes soil test results only for the top 0.2 meters at the field scale. The associated soil maps represent surface and subsurface layers but lack depth specifications. In addition, HeukToram offers a separate soil profile database for 405 soil series that can help classify deeper conditions (down to 2 meters) based on surface data; however, it provides average data for these soil series rather than field-scale data. Additionally, soil point data and thematic maps lack uncertainty estimates, which are crucial for conducting uncertainty and sensitivity analyses.
Table 1.
Group | Variable | Proposed units |
Climate | Evapotranspiration | mm |
 | Prescott index | - |
Soil | Total organic carbon¹ | g kg⁻¹ |
 | Total nitrogen | g kg⁻¹ |
 | Bulk density | Mg m⁻³ |
 | Cation exchange capacity | cmolc kg⁻¹ |
 | Available water content | % |
 | Sand content¹ | % |
 | Silt content¹ | % |
 | Clay content¹ | % |
 | Kaolinite | % |
 | Illite | % |
 | Smectite | % |
Parent material | Thorium (gamma Th) | mg kg⁻¹ |
 | Uranium (gamma U) | mg kg⁻¹ |
 | Potassium (gamma K) | mg kg⁻¹ |
Vegetation | Net primary productivity | g C m⁻² y⁻¹ |
 | Fpar-raingreen | - |
 | Fpar-evergreen | |
Conclusions
The HeukToram soil database offers comprehensive chemical data crucial for soil fertility assessment but lacks map components and quantitative data on soil properties needed for other applications. While 31 thematic soil maps effectively cover soil profiles, classes, topography, and commentary, extracting physical data for specific areas can be time-consuming due to technical issues. This gap poses challenges for users seeking essential soil attribute data for agricultural management and modeling. To improve the database, we have outlined key challenges that must be addressed to enhance agricultural research capabilities. We believe that addressing them will help establish a successful soil database that supports both extension services and research initiatives.