Prediction Models Based on Neural Networks

November 21, 2019

http://zhidao.baidu.com/link?url=fLsRwA_uBmpyE5YIYC6TekEUzWK8xvOTClbL4wRPea9PmJpmzwkRNM7kNh-svP3ikxWKHInCugOPjS0aHZPXEJ7quYysefMYX6JQ4mUX_Dq

http://blog.csdn.net/desilting/article/details/38981673

Basic idea:

Use the previous few observations to predict the next one.

The data need to be "periodic", and the period must be known.
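The sliding-window idea above can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper name `make_lagged` and the toy series are made up for the example and do not come from the original scripts.

```python
def make_lagged(series, lag):
    """Split a series into (inputs, targets) pairs for one-step-ahead prediction.

    Each input is a window of `lag` consecutive values; the matching target
    is the value that immediately follows that window.
    """
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must satisfy 1 <= lag < len(series)")
    inputs = [series[i:i + lag] for i in range(len(series) - lag)]
    targets = series[lag:]
    return inputs, targets


# Toy series, purely illustrative
series = [10, 12, 11, 13, 12, 14]
inputs, targets = make_lagged(series, lag=3)
for window, target in zip(inputs, targets):
    print(window, "->", target)
```

Any regression model can then be trained on these pairs; the window length plays the role of `lag` in the MATLAB script and of the 7-day period in the R script.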

MATLAB code:

x=[54167
55196
56300
57482
58796
60266
61465
62828
64653
65994
67207
66207
65859
67295
69172
70499
72538
74542
76368
78534
80671
82992
85229
87177
89211
90859
92420
93717
94974
96259
97542
98705
100072
101654
103008
104357
105851
107507
109300
111026
112704
114333
115823
117171
118517
119850
121121
122389
123626
124761
125786
126743
127627
128453
129227
129988
130756
131448
132129
132802
134480
135030
135770
136460
137510]';
% This script performs NAR neural network forecasting
% Author: Macer程
lag=3;    % autoregressive order
iinput=x; % x is the original series (row vector)
n=length(iinput);

% Prepare the input and output data
inputs=zeros(lag,n-lag);
for i=1:n-lag
    inputs(:,i)=iinput(i:i+lag-1)';
end
targets=x(lag+1:end);

% Create the network
hiddenLayerSize = 10; % number of hidden-layer neurons
net = fitnet(hiddenLayerSize);

% To avoid overfitting, set the ratios of the training, validation, and test sets
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;

% Train the network
[net,tr] = train(net,inputs,targets);
%% Judge the quality of the fit from the plots
yn=net(inputs);
errors=targets-yn;
figure, ploterrcorr(errors)                      % autocorrelation of the errors (20 lags)
figure, parcorr(errors)                          % partial autocorrelation
%[h,pValue,stat,cValue]= lbqtest(errors)         % Ljung-Box Q test (20 lags)
figure,plotresponse(con2seq(targets),con2seq(yn)) % predicted trend against the original trend
%figure, ploterrhist(errors)                      % error histogram
%figure, plotperform(tr)                          % error descent curve during training

%% Forecast several periods ahead
fn=7; % number of forecast steps

f_in=iinput(n-lag+1:end)';
f_out=zeros(1,fn); % forecast output
% For multi-step forecasting, the loop below feeds each network output back in as input
for i=1:fn
    f_out(i)=net(f_in);
    f_in=[f_in(2:end);f_out(i)];
end
% Plot the forecast
figure,plot(1949:2013,iinput,'b',2013:2020,[iinput(end),f_out],'r')

Figure 1: Autocorrelation

Figure 2: Error

Figure 3: Forecast
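The multi-step loop at the end of the MATLAB script feeds each prediction back in as the newest input. A minimal Python sketch of that feedback pattern follows. The names `forecast_recursive` and `drift_model` are hypothetical, and `drift_model` is only a stand-in for the trained network so that the example runs on its own.

```python
def forecast_recursive(model, history, lag, steps):
    """Forecast `steps` values ahead by feeding each prediction back as input."""
    window = list(history[-lag:])       # the last `lag` observed values
    forecasts = []
    for _ in range(steps):
        nxt = model(window)             # one-step-ahead prediction
        forecasts.append(nxt)
        window = window[1:] + [nxt]     # drop the oldest value, append the prediction
    return forecasts


def drift_model(window):
    """Stand-in model (assumption): last value plus the average step in the window."""
    return window[-1] + (window[-1] - window[0]) / (len(window) - 1)


print(forecast_recursive(drift_model, [1, 2, 3, 4], lag=3, steps=3))
```

Because later steps are computed from earlier predictions rather than from observations, errors can accumulate as the horizon grows, which is worth keeping in mind when reading a 7-step forecast.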

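The program above is general-purpose: with modifications to suit your needs, it can be used elsewhere. The basic idea is to use the population figures of the previous lag years to predict the next year's population, and you can change the value of lag yourself. When judging how good the result is, looking at the error plot alone is not enough. For a good forecast, every autocorrelation coefficient in the autocorrelation plot other than the one at lag 0 should stay within the upper and lower confidence bounds. The other statistics and plots are all written after "%" as comments; remove the "%" to use them. The final forecast is f_out, and my forecast values were: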

138701.065269972 139467.632609654 140207.209707364 141210.109373609 141981.285378849 142461.332139592 143056.073139776
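The autocorrelation criterion described above can also be checked numerically. The sketch below is an assumption-laden illustration: it uses the common large-sample bound of ±1.96/√n for a 95% band, which may not be exactly how MATLAB draws its confidence limits, and the function names and the toy residuals are invented for the example.

```python
import math


def autocorr(residuals, max_lag):
    """Sample autocorrelation of the residuals for lags 0..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    denom = sum(c * c for c in centered)
    if denom == 0:
        raise ValueError("residuals have zero variance")
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


def lags_outside_bounds(residuals, max_lag=20):
    """Return the lags (excluding lag 0) whose autocorrelation leaves the band."""
    bound = 1.96 / math.sqrt(len(residuals))   # approximate 95% bound (assumption)
    acf = autocorr(residuals, max_lag)
    return [k for k in range(1, max_lag + 1) if abs(acf[k]) > bound]


# Toy residuals that alternate in sign, so they are strongly autocorrelated
alternating = [1.0, -1.0] * 20
print(lags_outside_bounds(alternating, max_lag=5))
```

An empty list would mean that no lag exceeds the band, which is the behaviour the paragraph above asks of a good forecast.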

R code:

Note: the test data are periodic (with a 7-day period).

library(nnet)
source <- c(10930,10318,10595,10972,7706,6756,9092,10551,9722,10913,11151,8186,6422,
6337,11649,11652,10310,12043,7937,6476,9662,9570,9981,9331,9449,6773,6304,9355,10477,
10148,10395,11261,8713,7299,10424,10795,11069,11602,11427,9095,7707,10767,12136,12812,
12006,12528,10329,7818,11719,11683,12603,11495,13670,11337,10232,13261,13230,15535,
16837,19598,14823,11622,19391,18177,19994,14723,15694,13248,9543,12872,13101,15053,
12619,13749,10228,9725,14729,12518,14564,15085,14722,11999,9390,13481,14795,15845,
15271,14686,11054,10395,14775,14618,16029,15231,14246,12095,10473,15323,15381,14947)
srcLen<-length(source)
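for (i in 1:10) {  # predict each of the last ten values in turn
  real <- source[srcLen - i + 1]   # actual value
  xNum <- (srcLen - i + 1) %/% 7   # number of groups
  yNum <- 7                        # 7 values per group
  data <- array(1:(xNum * yNum), c(xNum, yNum))
  pre <- srcLen - i + 1
  for (x in 1:xNum) {              # fill the array, walking backwards from the target
    for (y in 1:yNum) {
      data[x, y] <- source[pre]
      pre <- pre - 1
    }
    if (pre < 7) {
      break
    }
  }
  ascData <- array(1:(xNum * yNum), c(xNum, yNum))  # reverse the row order
  for (x in 1:xNum) {
    for (y in 1:yNum) {
      ascData[x, y] <- data[xNum - x + 1, y]
    }
  }
  colnames(ascData) <- c("a", "b", "c", "d", "e", "f", "g")  # column names
  trainData <- data.frame(scale(ascData[, c(1:7)]))
  # linout=F keeps nnet's default logistic output unit, whose output lies in (0, 1);
  # set linout=T for a linear output unit if values below the column mean must be predicted
  nn <- nnet(a ~ b + c + d + e + f + g, trainData[1:(xNum - 1), ], size = 10, decay = 0.01, maxit = 1000, linout = F, trace = F)
  pred <- predict(nn, trainData[xNum, ])
  pred <- pred * sd(ascData[, 1]) + mean(ascData[, 1])  # undo the standardization
  percent <- (pred - real) * 100 / real                 # error in percent
  res <- paste("Predicted:", pred, "Actual:", real, "Error (%):", percent)
  print(res)
}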
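The data preparation in the R script is easy to lose track of, so here is a rough Python sketch of the same two steps: cutting the series into rows of 7 while walking backwards from the target, and standardizing a column so that a prediction can be mapped back to the original scale. The function names and the toy series are assumptions made for illustration; only the logic mirrors the script.

```python
import statistics


def build_rows(series, target_index, period=7):
    """Cut the series into rows of `period` values, walking back from the target.

    Within a row the newest value comes first (column "a" in the R script),
    followed by the preceding values (columns "b" to "g").
    """
    n_rows = (target_index + 1) // period
    rows = []
    pos = target_index
    for _ in range(n_rows):
        rows.append([series[pos - k] for k in range(period)])
        pos -= period
    rows.reverse()   # oldest group first; the last row holds the target
    return rows


def standardize(values):
    """Z-score a column with the sample standard deviation, as R's scale() does."""
    mu = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mu) / sd for v in values], mu, sd


def unstandardize(z, mu, sd):
    """Map a standardized prediction back to the original scale."""
    return z * sd + mu


toy = list(range(1, 16))                  # 15 toy values: 1, 2, ..., 15
rows = build_rows(toy, target_index=14)   # the target is the last value
print(rows)
scaled, mu, sd = standardize([row[0] for row in rows])
print(scaled, unstandardize(scaled[-1], mu, sd))
```

All rows except the last would serve as training data, and the last row supplies the six inputs for the value being predicted.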

First run:

[1] "Predicted: 16279.0513125717 Actual: 14947 Error (%): 8.91183055176118"
[1] "Predicted: 14645.5327512872 Actual: 15381 Error (%): -4.78166080692271"
[1] "Predicted: 14502.4347443558 Actual: 15323 Error (%): -5.35512142298625"
[1] "Predicted: 9812.9237303024 Actual: 10473 Error (%): -6.30264747157069"
[1] "Predicted: 11366.9396330435 Actual: 12095 Error (%): -6.01951522907361"
[1] "Predicted: 15417.6946929827 Actual: 14246 Error (%): 8.22472759358924"
[1] "Predicted: 15117.3154726064 Actual: 15231 Error (%): -0.746402254570258"
[1] "Predicted: 16066.7818969626 Actual: 16029 Error (%): 0.235709632307469"
[1] "Predicted: 14360.1836579368 Actual: 14618 Error (%): -1.76369094310545"
[1] "Predicted: 14762.5499357273 Actual: 14775 Error (%): -0.084264394400848"

Second run:

[1] "Predicted: 16274.2977340541 Actual: 14947 Error (%): 8.88002765808567"
[1] "Predicted: 14645.5829417616 Actual: 15381 Error (%): -4.78133449215549"
[1] "Predicted: 14477.3978320906 Actual: 15323 Error (%): -5.51851574697771"
[1] "Predicted: 9851.55020425515 Actual: 10473 Error (%): -5.93382789787881"
[1] "Predicted: 11337.4015608863 Actual: 12095 Error (%): -6.26373244409858"
[1] "Predicted: 15417.5053358782 Actual: 14246 Error (%): 8.2233983986956"
[1] "Predicted: 15123.5455847284 Actual: 15231 Error (%): -0.705498097771535"
[1] "Predicted: 16049.9242132398 Actual: 16029 Error (%): 0.130539729488775"
[1] "Predicted: 14369.3231442035 Actual: 14618 Error (%): -1.70116880419038"
[1] "Predicted: 14765.8214583581 Actual: 14775 Error (%): -0.0621221092516397"

Third run:

[1] "Predicted: 16278.9064421534 Actual: 14947 Error (%): 8.91086132436858"
[1] "Predicted: 14634.2898096302 Actual: 15381 Error (%): -4.85475710532337"
[1] "Predicted: 14483.9746718714 Actual: 15323 Error (%): -5.47559438836129"
[1] "Predicted: 9818.37752965315 Actual: 10473 Error (%): -6.25057261860837"
[1] "Predicted: 11366.9309261672 Actual: 12095 Error (%): -6.01958721647616"
[1] "Predicted: 15417.7054099752 Actual: 14246 Error (%): 8.22480282167056"
[1] "Predicted: 15126.2971700737 Actual: 15231 Error (%): -0.687432407105736"
[1] "Predicted: 16066.8686059418 Actual: 16029 Error (%): 0.236250582954744"
[1] "Predicted: 14364.9178141514 Actual: 14618 Error (%): -1.73130514330666"
[1] "Predicted: 14771.585261145 Actual: 14775 Error (%): -0.023111599695413"
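Ten signed percentages are hard to compare across runs, so one option is to condense each run into a single mean absolute percentage error. The sketch below does this for the last run listed above, with the predicted values rounded to two decimals. The function name `mean_abs_pct_error` is an assumption for illustration and is not part of the original script.

```python
def mean_abs_pct_error(pairs):
    """Mean absolute percentage error over (predicted, actual) pairs."""
    if not pairs:
        raise ValueError("need at least one (predicted, actual) pair")
    return sum(abs(p - a) / abs(a) for p, a in pairs) / len(pairs) * 100


# (predicted, actual) pairs copied from the last run above, predictions rounded
last_run = [
    (16278.91, 14947), (14634.29, 15381), (14483.97, 15323),
    (9818.38, 10473), (11366.93, 12095), (15417.71, 14246),
    (15126.30, 15231), (16066.87, 16029), (14364.92, 14618),
    (14771.59, 14775),
]
print(round(mean_abs_pct_error(last_run), 2))
```

The same function can be applied to the other two runs to see how much the random weight initialization of the network moves the overall error.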