An Application of Principal Component Analyses and Correlation Graphs to Assess Multivariate Soil Health Properties

Bezuidenhout, Carel N.1; Antwerpen, Rianto van2,3; Berry, Shaun D.2

doi: 10.1097/SS.0b013e3182639e01
Technical Article

ABSTRACT A complex system is defined by two aspects: a relatively large number of influencing factors and a high degree of connectivity between these factors. The health of a soil can be viewed as such a system. A range of multivariate analytical techniques has been applied in the past to gain a deeper understanding of soil health. Network analyses and, in particular, asset or correlation graphs have not previously been applied in soil health studies. In this article, we hypothesize that correlation graphs may provide valuable complementary information when multivariate soil data are analyzed for soil health purposes. We investigate the use of these tools in conjunction with principal component analysis (PCA) for assessing a multivariate soil health data set. A database containing 56 soil samples from three land-use types, with a wide range of physical, chemical, and biological soil property measurements, was analyzed. Many expected relationships between soil properties were confirmed. Nematode genera were generally poorly correlated. Aggregate formation was regulated by clay content in clayey soils but by biological processes in sandy soils. Carbon correlated strongly with many other variables in virgin soils, but its influence diminished in agricultural soils, especially where residues were burnt. In clayey soils where carbon levels were depleted, the correlation structure revolved around calcium and small soil aggregates, whereas in similarly treated sandy soils correlations were weak and the degrees of freedom were high. The statistical techniques applied seem to complement each other well and support a two-step multivariate analysis approach in which PCA is first used to explore the strong relationships and correlation graphs are then used to explore the weaker relationships in a data set. Pareto correlation graphs, which incorporate only the highest 20% of correlation coefficients into a graph, appear particularly useful in depicting the larger aggregated manageability and measurability of soils.
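
As an illustration of the Pareto correlation graph idea described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes Pearson correlations, assumes that "highest" refers to the absolute value of the coefficient, and uses synthetic data with hypothetical variable names; the function names `pareto_correlation_graph` and `node_degrees` are likewise invented for this example.

```python
import numpy as np


def pareto_correlation_graph(data, names, top_fraction=0.2):
    """Return edges (var_a, var_b, r) for the strongest pairwise correlations.

    Assumption: strength is ranked by |r| (Pearson); the abstract does not
    state whether signed or absolute coefficients were ranked.
    """
    corr = np.corrcoef(data, rowvar=False)          # columns are variables
    rows, cols = np.triu_indices(len(names), k=1)   # each unordered pair once
    pairs = sorted(
        zip(rows, cols),
        key=lambda p: abs(corr[p[0], p[1]]),
        reverse=True,
    )
    keep = max(1, int(round(top_fraction * len(pairs))))  # top 20% by default
    return [(names[i], names[j], float(corr[i, j])) for i, j in pairs[:keep]]


def node_degrees(edges, names):
    """Count how many retained edges touch each variable."""
    degree = {name: 0 for name in names}
    for a, b, _ in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


# Synthetic stand-in data: 56 samples, 5 hypothetical soil variables.
rng = np.random.default_rng(0)
names = ["carbon", "clay", "calcium", "aggregates", "nematodes"]
data = rng.normal(size=(56, len(names)))
# Make calcium track carbon closely so one strong edge is guaranteed.
data[:, 2] = data[:, 0] + 0.1 * rng.normal(size=56)

edges = pareto_correlation_graph(data, names)
degrees = node_degrees(edges, names)
for a, b, r in edges:
    print(f"{a} -- {b}: r = {r:.2f}")
print(degrees)
```

In a sketch like this, variables with a high degree in the retained graph are the ones around which the correlation structure revolves; a graph with few well-connected nodes would correspond to the weak correlation structure the abstract reports for some soils.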

1School of Engineering, University of KwaZulu-Natal, Scottsville, South Africa.

2South African Sugarcane Research Institute, Mount Edgecombe, KwaZulu-Natal, South Africa.

3Department of Soil, Crops and Climate Sciences, University of the Free State, Bloemfontein, South Africa.

Address for correspondence: Carel N. Bezuidenhout, PhD, University of KwaZulu-Natal, Scottsville, KwaZulu-Natal, South Africa. E-mail:

Financial Disclosures/Conflicts of Interest: None reported.

Received February 22, 2012.

Accepted for publication June 7, 2012.

© 2012 Lippincott Williams & Wilkins, Inc.