Journal of Natural Resources ›› 2019, Vol. 34 ›› Issue (6): 1345-1356. doi: 10.31497/zrzyxb.20190617


Construction and application of a random forest-based model for measuring cultivated land utilization efficiency

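Dan-ling CHEN1, Xin-hai LU2, Bing KUANG2

  1. College of Public Administration, Huazhong University of Science and Technology, Wuhan 430074
  2. College of Public Administration, Central China Normal University, Wuhan 430079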
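  • Received: 2018-11-12 Revised: 2019-03-18 Published: 2019-06-20 Online: 2019-06-20
  • About the author:
    Dan-ling CHEN (1993- ), female, born in Xuzhou, Jiangsu; PhD candidate; research interests: land use and management. E-mail: hustcdl93@163.com
  • Funding:
    National Natural Science Foundation of China (71673096); National Social Science Fund of China (16CGL054)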

Measurement of cultivated land utilization efficiency: Construction and application of random forest

Dan-ling CHEN1, Xin-hai LU2, Bing KUANG2

  1. College of Public Administration, Huazhong University of Science and Technology, Wuhan 430074, China
  2. College of Public Administration, Central China Normal University, Wuhan 430079, China
  • Received: 2018-11-12 Revised: 2019-03-18 Online: 2019-06-20 Published: 2019-06-20

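Abstract:

Building an appropriate quantitative model is fundamental to a sound understanding of the state and efficiency of cultivated land use. Such a model can inform effective policies for managing and controlling cultivated land resources and for coordinating cultivated land use with the ecological environment. To reflect the complexity, dynamics and heterogeneity of the cultivated land use system more accurately, this paper draws on the basic idea of random forest (RF): classification trees are built on Bootstrap random samples, and an RF model for measuring cultivated land utilization efficiency is constructed on that basis. The model is trained on 172 cities in China's major grain-producing areas and applied to measure cultivated land utilization efficiency for 2003-2015, with a BP neural network (BPNN) and the entropy weight method (EW) as comparisons to verify its consistency, representativeness and superiority. The results show that: (1) The RF model is not constrained by the units of the indicators, needs few parameters, and simplifies computation. It can simulate the complex relationships among the evaluation indicators fairly accurately and quantify each indicator's contribution to cultivated land utilization efficiency. (2) For the same spatial unit, the efficiency values rank RF > BPNN > EW. The overall distribution patterns obtained from RF and BPNN are similar, and both differ considerably from the EW results. (3) Judged by how well the results match reality and by the accuracy parameters, the RF results agree better with observed natural and socio-economic conditions and show high applicability and reliability. Compared with the other two commonly used models, RF reduces computational complexity and improves training efficiency. Its correlation coefficient R is 0.8685 and its MRPD is 2.3533, and it has the smallest MMSE (0.0174) and MMAE (0.0211), which makes it better suited to studying cultivated land utilization efficiency with complex nonlinear characteristics.

Keywords: cultivated land utilization efficiency, random forest, major grain-producing areas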

Abstract:

Building a suitable quantitative model is basic to a sound understanding of cultivated land utilization efficiency and its distribution pattern, and can inform decisions on the sustainable use of cultivated land and on the coordinated development of cultivated land resources and the environment. To describe the complexity, dynamics and heterogeneity of the cultivated land use system effectively, a random forest (RF) model for measuring cultivated land utilization efficiency is constructed, using Bootstrap random sampling to build classification trees. Taking 172 cities in the major grain-producing areas of China as an example, the RF model was trained and used to measure cultivated land utilization efficiency in 2003-2015, and was compared with a Back Propagation Neural Network (BPNN) and the entropy weight method (EW) to verify the consistency, representativeness and superiority of RF. The results show that: (1) The RF model has fewer parameters and a simpler implementation. It can simulate the complex relations among the evaluation indexes, which makes it convenient to analyze the contribution of each index. (2) For the same spatial unit, the measured efficiency ranks RF > BPNN > EW. The overall distribution pattern of cultivated land utilization efficiency is similar under RF and BPNN, while that under EW differs greatly. (3) Judged by how well the evaluation results match reality and by the accuracy parameters, the RF results are reasonable and consistent with the facts, which reflects the model's high applicability and reliability. Compared with the other two commonly used models, RF can reduce the dimensionality of the input vectors and the computational complexity, and thus raise training efficiency. The correlation coefficient R of RF is 0.8685 and its MRPD is 2.3533, with the minimum MMSE and MMAE of 0.0174 and 0.0211, respectively, so RF is more suitable for studying cultivated land utilization efficiency with complex nonlinear characteristics. The method offers a new way to evaluate cultivated land utilization efficiency.

Key words: cultivated land utilization efficiency, random forest, main grain producing areas
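
The abstract describes building trees on Bootstrap samples and judging the model by accuracy parameters. The following is a minimal sketch of that general idea, not the authors' implementation. Everything in it is an assumption made for illustration: the data are synthetic, each tree is a depth-1 regression tree (a stump) rather than a full classification tree, the function names are hypothetical, and the metrics are plain MSE, MAE and Pearson's r, because the abstract does not define MMSE, MMAE or MRPD.

```python
import random

def avg(values):
    return sum(values) / len(values)

def fit_stump(X, y, features):
    """Fit a depth-1 regression tree using only the given feature indices."""
    best = None  # (sse, feature, threshold, left_mean, right_mean)
    for f in features:
        values = sorted({row[f] for row in X})
        for t in values[:-1]:  # split between two neighbouring observed values
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            lm, rm = avg(left), avg(right)
            sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, f, t, lm, rm)
    if best is None:  # every candidate feature is constant in this sample
        m = avg(y)
        return (features[0], float("inf"), m, m)
    return best[1:]

def predict_stump(stump, row):
    f, t, lm, rm = stump
    return lm if row[f] <= t else rm

def fit_forest(X, y, n_trees=30, n_features=2, seed=0):
    """Train each stump on a Bootstrap sample and a random subset of features."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # n rows drawn with replacement
        feats = rng.sample(range(p), n_features)    # random feature subset
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest

def predict_forest(forest, row):
    return avg([predict_stump(s, row) for s in forest])  # average over trees

def mse(y, p):
    return avg([(a - b) ** 2 for a, b in zip(y, p)])

def mae(y, p):
    return avg([abs(a - b) for a, b in zip(y, p)])

def pearson_r(a, b):
    ma, mb = avg(a), avg(b)
    cov = sum((x - ma) * (z - mb) for x, z in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((z - mb) ** 2 for z in b)
    return cov / (va * vb) ** 0.5

# Synthetic stand-in data: three hypothetical indicators and a mildly nonlinear score.
data_rng = random.Random(42)

def make_rows(n):
    X = [[data_rng.random() for _ in range(3)] for _ in range(n)]
    y = [0.6 * a + 0.3 * b * b + 0.1 * c + data_rng.gauss(0, 0.02) for a, b, c in X]
    return X, y

X_train, y_train = make_rows(120)
X_test, y_test = make_rows(60)

forest = fit_forest(X_train, y_train)
pred = [predict_forest(forest, row) for row in X_test]

rf_mse = mse(y_test, pred)
rf_mae = mae(y_test, pred)
rf_r = pearson_r(y_test, pred)
baseline_mse = mse(y_test, [avg(y_train)] * len(y_test))  # always predict the training mean

print(f"MSE={rf_mse:.4f} MAE={rf_mae:.4f} r={rf_r:.4f} baseline MSE={baseline_mse:.4f}")
```

The stumps keep the sketch short. A practical model would grow deeper trees, for example with an established random forest library, and would be evaluated on the real indicator data rather than on generated values.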