Analysis of IoT Spatial and Spatiotemporal Data: A Smart Farming Use Case

Publisher

Université d'Ottawa / University of Ottawa

Creative Commons

Attribution 4.0 International

Abstract

Farmers will be expected to produce more food due to an increasing population despite challenges like climate change. Precision agriculture (PA) and smart farming can be used to help farmers achieve this goal by reducing input costs and enabling agricultural resource optimization. Smart farming can enable PA by gathering site-specific agriculture data (agri-data) from a) sensors using in-field gateways and/or b) other data sources. The high spatiotemporal variability of agri-data, privacy concerns, and high system deployment costs act as challenges for PA. PA model performance evaluation risks being over-optimistic if spatial (and/or temporal) structure (such as spatial autocorrelation) is not carefully considered in the model evaluation process. Block cross-validation (CV) can address this by creating folds of spatially (or temporally) disjoint blocks of data, although data from new locations (or time periods) may lead to pessimistic model extrapolation performance when using this evaluation technique. Cloud computing-based centralized learning (CL) could be used to train PA models, but CL does not scale well in the Internet of Things (IoT) setting and suffers from poor privacy. In addition to high system deployment costs, ignoring farmers' privacy concerns will lead to poor smart farming system (SFS) adoption rates. Limiting an SFS's farmer user-base would result in less available training data, and this could in turn negatively impact model performance. Fog computing-based local learning (LL) could instead be used, by training many local models at the edge of networks using only local data, but unfortunately, applying LL to agri-data may lead to loss of useful geographical trends. Distributed machine learning (ML) can be applied to address these challenges by only sharing model updates to train models. We proposed an IoT SFS architecture that uses privacy-aware distributed ML to train PA models without having to share farmers' private data.
Expensive sensing equipment is first required to train the models using expensive-to-sense ground truth data and affordable sensors' data, but once the training is complete, new farmers can join the system to benefit from the models without needing any expensive equipment. By leveraging data from a Canadian smart farm, we performed yield prediction and nitrous oxide (N$_2$O) emission prediction experiments as use cases to showcase the proposed architecture. We used various forms of spatial and temporal block CV for evaluating PA model performance using datasets of varying heterogeneity (independent and identically distributed (IID) and non-IID datasets). We performed experiments using CL, LL, federated learning, and distributed ensemble learning, where clients/nodes were simulated on a single machine. Our results showed that when using IID datasets, distributed ML could do reasonably well and even compete with CL in terms of model performance. However, the IID dataset experiment results may have been over-optimistic due to the stronger presence of spatial/temporal autocorrelation. When using non-IID datasets (which represent the more realistic scenario of having high spatiotemporal variability in agri-data), we found that distributed ML performed worse and failed on multiple occasions. Despite this, by using distributed ML and non-IID datasets, we were able to generate useful yield precision maps for most clients. The results reported in this thesis demonstrate that the proposed smart farming IoT architecture combined with distributed ML can potentially be used for achieving high spatiotemporal resolution agri-data sensing in a manner that is a) privacy-aware, b) affordable, and c) scalable, at the expense of reduced sensing accuracy.
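
As an illustration of the spatial block CV idea mentioned in the abstract, the following is a minimal sketch and not the thesis's actual implementation. It assumes samples carry planar (x, y) coordinates and that blocks are square grid cells; the function name `spatial_block_folds` and the parameters `block_size`, `n_folds`, and `seed` are hypothetical names chosen for this example. Whole blocks are assigned to folds, so no block contributes samples to both the training and the test side of a split.

```python
import random
from collections import defaultdict


def spatial_block_folds(coords, block_size, n_folds, seed=0):
    """Yield (train_idx, test_idx) pairs in which each fold holds whole spatial blocks.

    coords     -- sequence of (x, y) sample locations (assumed planar units)
    block_size -- side length of the square grid cells used as blocks (assumption)
    n_folds    -- number of CV folds; must not exceed the number of occupied blocks
    """
    # Group sample indices by the grid cell (block) their coordinates fall in.
    blocks = defaultdict(list)
    for i, (x, y) in enumerate(coords):
        blocks[(int(x // block_size), int(y // block_size))].append(i)

    if n_folds > len(blocks):
        raise ValueError("n_folds exceeds the number of occupied spatial blocks")

    # Shuffle the block ids reproducibly, then deal them round-robin into folds.
    block_ids = sorted(blocks)
    random.Random(seed).shuffle(block_ids)
    fold_of_block = {b: k % n_folds for k, b in enumerate(block_ids)}

    for fold in range(n_folds):
        test_idx = sorted(
            i for b in block_ids if fold_of_block[b] == fold for i in blocks[b]
        )
        train_idx = sorted(
            i for b in block_ids if fold_of_block[b] != fold for i in blocks[b]
        )
        yield train_idx, test_idx


if __name__ == "__main__":
    # Toy 10 x 10 field sampled on a regular grid, split into four 5 x 5 blocks.
    field = [(x + 0.5, y + 0.5) for x in range(10) for y in range(10)]
    for train, test in spatial_block_folds(field, block_size=5, n_folds=4):
        print(len(train), len(test))
```

A temporal variant would follow the same pattern with time windows in place of grid cells. Note that this sketch leaves no buffer between neighbouring blocks, so samples near a block boundary can still be spatially correlated with samples in the adjacent fold.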

Keywords

internet of things, precision agriculture, smart farming, nitrous oxide prediction, yield prediction, distributed machine learning, spatiotemporal data analysis
