Journal of Natural Resources ›› 2015, Vol. 30 ›› Issue (11): 1922-1932. doi: 10.11849/zrzyxb.2015.11.012

• Resource Research Methods •

Discussion on Two Key Problems of Multivariable Linear Regression Models for Spatialization of Grain Yield

LIAO Shun-bao1, 2, JI Guang-xing1, HOU Peng-min1, YUE Yan-lin1, YANG Xu1

  1. College of Environment and Planning, Henan University, Kaifeng 475004, China;
  2. Institute of Geographic Sciences and Natural Resources Research, CAS, Beijing 100101, China
  • Received: 2014-11-17 Revised: 2015-02-12 Online: 2015-11-14 Published: 2015-11-14
  • About the author: LIAO Shun-bao (1966- ), male, born in Deyang, Sichuan; PhD, professor and doctoral supervisor; member of the China Society of Natural Resources (S300000153M). His research focuses on applications of remote sensing and GIS, spatialization of attribute data, and quality evaluation of data products. E-mail: liaosb@henu.edu.cn
  • Supported by:

    Strategic Priority Research Program of the Chinese Academy of Sciences, "Carbon Budget Certification for Climate Change and Related Issues" (XDA05050000); Special Scientific Research Fund for Public Welfare in Environmental Protection (201209030)

Abstract:

Spatialization of statistical and observed data is an important method for processing geospatial data, and it facilitates integrated analysis of data from different disciplines. Multivariable linear regression models are often used to spatialize attribute data. Because spatialization is a downscaling problem, both the scale of the variables and the setting of the constant term must be considered when such a model is constructed. This paper discusses these two issues using the national grain yield of China as a case, with multivariable linear regression models relating grain yield to the areas of paddy fields, irrigated land and dry land. First, the country was treated as a whole: the correlation coefficient of the yield-area model was 0.74 with county-level samples and 0.83 with prefecture-level samples. When the country was divided into 7 regions for modeling, the correlation coefficient was 0.82 with county-level samples and 0.90 with prefecture-level samples. Prefecture-level samples therefore fitted better than county-level samples, and partitioned modeling fitted better than modeling the country as a whole, so partitioned modeling with prefecture-level samples was the reasonable choice. Second, based on partitioned modeling at prefecture level, the effects of the variable scale (for both dependent and independent variables) and of the constant setting on model application were examined at three scales: region (prefecture level), grid cell (1 km×1 km) and sub-grid cell (100 m×100 m).

The following conclusions were drawn: 1) a multivariable linear regression model built from samples at regional scale (for example, prefecture level) cannot be used for spatialization at grid scale if its constant is not 0, but it can be used if the constant is 0; 2) a model built from samples at grid scale can be applied directly to spatialization at grid scale whether or not the constant is 0; 3) a model built from samples at sub-grid scale can also be used for spatialization at grid scale whether or not the constant is 0, but its results must be multiplied by a coefficient equal to the ratio of the area of a grid cell to the area of a sub-grid cell. Although these conclusions were drawn from the spatialization of grain yield, they offer guidance and reference for the spatialization of other kinds of statistical data.
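Conclusions 1) and 3) can be checked numerically with a minimal sketch. The code below is illustrative and is not the authors' implementation: the coefficients, the constants, the number of cells and the randomly generated land-use values are all invented, and `predict` is a hypothetical helper. Part 1 assumes the predictors are cropland areas per unit; part 2 assumes, as one possible reading of conclusion 3), that the predictors are cropland area fractions of each cell.

```python
import numpy as np

def predict(x, coefs, const=0.0):
    """Linear model: const + sum_k coefs[k] * x[..., k] (hypothetical helper)."""
    return const + x @ coefs

rng = np.random.default_rng(0)

# --- Part 1: regional-scale model applied to grid cells -------------------
# Made-up region of 1,000 grid cells (1 km x 1 km). Each row holds the paddy,
# irrigated and dry cropland area (km^2) inside one cell.
n_cells = 1000
cell_areas = rng.uniform(0.0, 0.3, size=(n_cells, 3))
region_areas = cell_areas.sum(axis=0)

coefs = np.array([600.0, 450.0, 300.0])  # invented yield per km^2 by land type
const = 5000.0                           # invented nonzero constant

# Zero constant: the model is additive, so the cell values sum to the
# regional estimate.
region_zero = predict(region_areas, coefs)
cells_zero = predict(cell_areas, coefs).sum()

# Nonzero constant: every cell receives the full constant, so the summed
# cells exceed the regional estimate by (n_cells - 1) * const.
region_const = predict(region_areas, coefs, const)
cells_const = predict(cell_areas, coefs, const).sum()
excess = cells_const - region_const

# --- Part 2: sub-grid-scale model applied to a grid cell ------------------
# Assumption: predictors are area fractions, so one 1 km cell is described by
# the mean fractions of its 100 sub-cells (100 m x 100 m).
ratio = (1000 * 1000) / (100 * 100)      # grid cell area / sub-grid cell area
sub_fracs = rng.uniform(0.0, 1.0 / 3.0, size=(int(ratio), 3))
grid_fracs = sub_fracs.mean(axis=0)

sub_coefs = np.array([6.0, 4.5, 3.0])    # invented yield per sub-cell
sub_const = 0.2                          # invented constant, may be nonzero

grid_total = predict(sub_fracs, sub_coefs, sub_const).sum()
scaled = ratio * predict(grid_fracs, sub_coefs, sub_const)

print(np.isclose(region_zero, cells_zero))  # additive when const == 0
print(excess, (n_cells - 1) * const)        # bias caused by the constant
print(np.isclose(grid_total, scaled))       # ratio scaling recovers the total
```

Under these assumptions, the zero-constant regional model reproduces the regional total when summed over cells, the nonzero constant inflates that total by the constant times the number of cells minus one, and the sub-grid model matches the summed sub-cell estimates once multiplied by the area ratio of 100, whether or not its constant is zero.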

CLC Number:

  • F224