Mapping soil pollution by using drone image recognition and machine learning at an arsenic-contaminated agricultural field

Jia, Xiyue, Cao, Yining, O'Connor, David, Zhu, Jin, Tsang, Daniel C.W., Zou, Bin and Hou, Deyi (2020) Mapping soil pollution by using drone image recognition and machine learning at an arsenic-contaminated agricultural field. Environmental Pollution, 270. p. 116281. ISSN 0269-7491

Manuscript_EP HRAI 2020-11-30.pdf - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Mapping soil contamination enables the delineation of areas where protection measures are needed. Traditional soil sampling on a grid pattern followed by chemical analysis and geostatistical interpolation methods (GIMs), such as Kriging interpolation, can be costly, slow and not well-suited to highly heterogeneous soil environments. Here we propose a novel method to map soil contamination by combining high-resolution aerial imaging (HRAI) with machine learning algorithms. To support model establishment and validation, 1068 soil samples were collected from an arsenic (As) contaminated area in Zhongxiang, Hubei province, China. The average arsenic concentration was 39.88 mg/kg (SD = 213.70 mg/kg), with individual sample points classified as low risk (66.9%), medium risk (29.4%), or high risk (3.7%). Identified features were then extracted from an HRAI image of the study area. Four machine learning algorithms were developed to predict As risk levels: (i) support vector machine (SVM), (ii) multi-layer perceptron (MLP), (iii) random forest (RF), and (iv) extreme random forest (ERF). Among these, we found that the ERF algorithm performed best overall and that its prediction performance was generally better than that of traditional Kriging interpolation. The accuracy of ERF in test area 1 reached 0.87, performing better than RF (0.81), MLP (0.78) and SVM (0.77). The F1-score of ERF for discerning high-risk points in test area 1 was as high as 0.8. The complexity of the distribution of points with different risk levels was a decisive factor in model prediction ability. Identified features in the study area associated with fertilizer factories made the most important contribution to the ERF model. This study demonstrates that HRAI combined with machine learning has good potential to predict soil As risk levels.
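
The abstract reports both overall accuracy and a class-specific F1-score for high-risk points. The minimal sketch below illustrates how these two metrics are commonly defined and why they can diverge when one class is rare, as high-risk points are here (3.7% of samples). The function names and the toy labels are hypothetical and are not taken from the study; the paper's own evaluation code is not shown in this record.

```python
# Minimal sketch of the two metrics named in the abstract.
# All labels below are made-up toy values, NOT data from the study.

def accuracy(y_true, y_pred):
    """Fraction of points whose predicted risk level matches the measured one."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_for_class(y_true, y_pred, label):
    """F1-score (harmonic mean of precision and recall) for one risk level."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)  # true positives
    fp = sum(t != label and p == label for t, p in pairs)  # false positives
    fn = sum(t == label and p != label for t, p in pairs)  # false negatives
    if tp == 0:
        # No correct detections of this class: define F1 as zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: 6 low-risk, 3 medium-risk and 1 high-risk point.
y_true = ["low"] * 6 + ["medium"] * 3 + ["high"]
# A classifier that never predicts "high" and misses one "medium" point.
y_pred = ["low"] * 6 + ["medium", "medium", "low"] + ["low"]

if __name__ == "__main__":
    print("accuracy:", accuracy(y_true, y_pred))
    for level in ("low", "medium", "high"):
        print(level, "F1:", f1_for_class(y_true, y_pred, level))
```

In this toy case most points are labelled correctly, yet the single high-risk point is missed, so the high-risk F1-score is zero. This is one reason a per-class F1-score is informative alongside accuracy when the classes are imbalanced.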

Item Type: Article
Keywords: Arsenic contamination, Soil pollution, HRAI, Remote sensing, Machine learning
Divisions: Real Estate and Land Management
Depositing User: Dr David O'Connor
Date Deposited: 18 Jan 2021 13:41
Last Modified: 14 Dec 2022 05:20
