Featured Paper | Latest Work from Professor Du Peijun's Team Published in JGSA!

May 19, 2020, 17:47:49

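The core tasks of most machine learning applications can be framed as classification or regression problems. In spatial data handling, regression and classification models are likewise commonly used to extract useful geographic information from discrete observed or measured data, for example in land cover classification, spatial interpolation, and quantitative parameter retrieval. The paper reviews progress in applying four machine learning methods to spatial data handling: support vector machines, semi-supervised and active learning, ensemble learning, and deep learning. Three core elements determine how machine learning is applied to spatial data handling: the learning algorithm, the training samples, and the input features. The four methods improve classification and regression performance from different angles: the support vector machine (SVM) focuses on feature space transformation and decision function optimization, semi-supervised and active learning emphasize the use of unlabeled samples, and ensemble learning and deep learning concentrate on strengthening the learning model and its generalization ability. To apply machine learning to spatial data handling successfully, the paper sets out a four-level strategy: experimenting with and evaluating the applicability of a method, extending algorithms by introducing spatial features, optimizing algorithm parameters to improve performance, and boosting performance by combining multiple methods.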
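As a reminder of what "feature space transformation and decision function" refers to for SVM, the textbook kernel SVM decision function can be written as below. The notation is an assumption made for illustration and is not taken from the paper: support-vector coefficients \alpha_i, labels y_i in {-1, +1}, bias b, and the RBF kernel with width parameter \gamma as one common kernel choice.

```latex
f(\mathbf{x}) = \operatorname{sign}\!\left( \sum_{i=1}^{n} \alpha_i \, y_i \, K(\mathbf{x}_i, \mathbf{x}) + b \right),
\qquad
K(\mathbf{x}_i, \mathbf{x}) = \exp\!\left( -\gamma \, \lVert \mathbf{x}_i - \mathbf{x} \rVert^{2} \right)
```

The kernel K implicitly maps samples into a higher-dimensional feature space, and the weighted sum over training samples defines the decision boundary in that space.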

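Following this strategy, the paper first reviews research progress on SVM to show the merits of new machine learning methods for spatial data handling. The line of work starts from direct application and comparison with traditional classifiers, moves on to targeted improvements for multi-class classification, tunes SVM parameters with methods such as particle swarm optimization, and classifies using spatial and spectral features together. To overcome the limits of insufficient training samples, semi-supervised learning and active learning bring unlabeled samples into classifier construction, addressing the shortage of labeled samples and showing the potential of learning from small sample sets. To address the poor generalization and instability of single algorithms, ensemble learning is introduced to combine the strengths of multiple learners or models and improve generalization; the paper highlights progress in multiple classifier combination, improved ensemble classifiers, and ensemble-based spatial interpolation. Finally, applications of deep learning are reviewed, with scene classification of high-resolution remote sensing images and urban structural type recognition as examples.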
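The idea of combining multiple classifiers can be illustrated with plain majority voting, one of the simplest combination rules. The following is a minimal sketch, not the method used in the paper: the three threshold classifiers, their feature meanings (greenness, brightness, texture), and the class labels are hypothetical stand-ins for real trained models.

```python
from collections import Counter
from typing import Callable, List, Sequence

# Hypothetical type alias: a classifier maps a feature vector to a class label.
Classifier = Callable[[Sequence[float]], str]


def majority_vote(classifiers: List[Classifier], x: Sequence[float]) -> str:
    """Return the label predicted by the largest number of base classifiers.

    Ties are broken in favour of the label that received its first vote earliest.
    """
    votes = [clf(x) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]


# Three toy base classifiers (illustrative thresholds, not from the paper).
def by_greenness(x: Sequence[float]) -> str:
    return "vegetation" if x[0] > 0.4 else "built-up"


def by_brightness(x: Sequence[float]) -> str:
    return "built-up" if x[1] > 0.6 else "vegetation"


def by_texture(x: Sequence[float]) -> str:
    return "built-up" if x[2] > 0.5 else "vegetation"


ensemble = [by_greenness, by_brightness, by_texture]

# A made-up pixel: (greenness, brightness, texture).
pixel = (0.55, 0.70, 0.20)
label = majority_vote(ensemble, pixel)
print(label)
```

Here the brightness rule alone would label the pixel as built-up, but the other two rules outvote it. That is the basic sense in which an ensemble can be more stable than any single member.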

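The review shows that new machine learning methods are highly effective for spatial data handling and have broad application prospects, particularly in the era of artificial intelligence and big data.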

Bonus: Professor Du Peijun's 93-page presentation (PPT) is attached at the end of the article.

Abstract

Most machine learning tasks can be categorized into classification or regression problems. Regression and classification models are normally used to extract useful geographic information from observed or measured spatial data, such as land cover classification, spatial interpolation, and quantitative parameter retrieval. This paper reviews the progress of four advanced machine learning methods for spatial data handling, namely, support vector machine (SVM)-based kernel learning, semi-supervised and active learning, ensemble learning, and deep learning. These four machine learning modes are representative because they improve learning performances from different views, for example, feature space transform and decision function (SVM), optimized uses of samples (semi-supervised and active learning), and enhanced learning models and capabilities (ensemble learning and deep learning). For spatial data handling via machine learning that can be improved by the four machine learning models, three key elements are learning algorithms, training samples, and input features. To apply machine learning methods to spatial data handling successfully, a four-level strategy is suggested: experimenting and evaluating the applicability, extending the algorithms by embedding spatial properties, optimizing the parameters for better performance, and enhancing the algorithm by multiple means. Firstly, the advances of SVM are reviewed to demonstrate the merits of novel machine learning methods for spatial data, running the line from direct use and comparison with traditional classifiers, and then targeted improvements to address multiple class problems, to optimize parameters of SVM, and to use spatial and spectral features. To overcome the limits of small-size training samples, semi-supervised learning and active learning methods are then utilized to deal with insufficient labeled samples, showing the potential of learning from small size training samples. 
Furthermore, considering the poor generalization capacity and instability of machine learning algorithms, ensemble learning is introduced to integrate the advantages of multiple learners and to enhance the generalization capacity. The typical research lines, including the combination of multiple classifiers, advanced ensemble classifiers, and spatial interpolation, are presented. Finally, deep learning, one of the most popular branches of machine learning, is reviewed with specific examples for scene classification and urban structural type recognition from high-resolution remote sensing images. By this review, it can be concluded that machine learning methods are very effective for spatial data handling and have wide application potential in the big data era.

Citation:

Du, P., Bai, X., Tan, K. et al. Advances of Four Machine Learning Methods for Spatial Data Handling: a Review. J geovis spat anal, 2020, 4(1), 13.

About the Author

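Du Peijun is a professor and doctoral supervisor in the Department of Geographic Information Science at Nanjing University, deputy director of the Key Laboratory for Land Satellite Remote Sensing Applications of the Ministry of Natural Resources, and an IEEE Senior Member. Du was selected for the Ministry of Education's Program for New Century Excellent Talents and for the Jiangsu Province Distinguished Young Scholars fund. Awards include the Young Talent Award for Science and Technology Innovation in Surveying, Mapping and Geoinformation from the Chinese Society for Geodesy, Photogrammetry and Cartography, the Young Geological Science and Technology Award from the Geological Society of China, the GIS Innovator Award from the University GIS Forum, and the Fok Ying Tung Education Foundation Award for Young Teachers in Higher Education. Du also serves as vice chair of the Photogrammetry and Remote Sensing Committee of the Chinese Society for Geodesy, Photogrammetry and Cartography, and as a member of the Chinese National Committee of the International Society for Digital Earth (CNISDE) and vice chair of its Imaging Spectroscopy Earth Observation Committee, among other roles.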

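Du's research interests are urban remote sensing, remote sensing image processing and geoscience applications, and spatiotemporal data analysis and geocomputation. Du has produced a substantial body of work on applying new machine learning methods to spatial data handling, multiple classifier ensembles for remote sensing, remote sensing of urban expansion and the ecological environment, and remote sensing of land resources and the geological environment. Publications include more than 120 SCI-indexed papers and 10 monographs and textbooks, and honors include two second-class National Teaching Achievement Awards and more than 10 provincial and ministerial teaching and research awards.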

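Du has served as vice chair or co-chair of the academic (program or organizing) committees of international conferences including the Joint Urban Remote Sensing Event (JURSE), the Workshop on Hyperspectral Image and Signal Processing (WHISPERS), the International Workshop on Earth Observation and Remote Sensing Applications (EORSA), the International Symposium on Intelligent Earth Observing Systems and Applications (IEOAs), and the IAPR Workshop on Pattern Recognition in Remote Sensing (IAPR-PRRS), and as an academic committee member for more than 20 international and domestic conferences.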