Predictive Analysis of PM<sub>2.5</sub> in Nanjing under Multiple Machine Learning Models

JU Yang

Abstract

This study addresses the prediction of PM<sub>2.5</sub> concentration in Nanjing using five machine learning models: multiple linear regression, random forest, K-nearest neighbors (KNN), a BP neural network (BPNN), and eXtreme Gradient Boosting (XGBoost). The models were trained and tested on air quality and meteorological data for Nanjing from 2021 and 2022, after data preprocessing and feature scaling. The evaluation metrics were the coefficient of determination (R²), root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). The results show that all five models predicted well overall, with the random forest model achieving the highest prediction accuracy and the smallest error. A seasonal analysis shows that multiple linear regression and BPNN were more accurate in spring and winter than in summer and fall, whereas random forest, KNN, and XGBoost were most accurate in winter. In terms of running efficiency, BPNN had the longest training time and the highest memory usage, while KNN had the shortest running time and the lowest memory usage. Considering both prediction accuracy and running efficiency, the random forest model performed best for PM<sub>2.5</sub> concentration prediction in Nanjing. The methods and models in this study could also serve as a reference for air quality prediction in other regions.
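
The abstract names four evaluation metrics (R², RMSE, MAE, MAPE) without giving their formulas. The sketch below shows how they are commonly computed, assuming the standard textbook definitions; the function name `regression_metrics` and the sample values are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE and MAPE (in percent) for paired values.

    Assumes standard definitions; y_true must not be constant (R2)
    and must not contain zeros (MAPE), otherwise a division by zero occurs.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
    }


# Made-up observed and predicted concentrations, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

A higher R² and lower RMSE, MAE, and MAPE indicate a better fit, which is the sense in which the abstract ranks the five models.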
