TIME SERIES FORECASTING BY USING A NEURAL ARIMA MODEL BASED ON WAVELET DECOMPOSITION
Eliete Nascimento Pereira
Federal University of Paraná, Brazil
E-mail: elietenp@gmail.com
Cassius Tadeu Scarpin
Federal University of Paraná, Brazil
E-mail: cassiusts@gmail.com
Luíz Albino Teixeira Júnior
Latin American Integration University (UNILA), Brazil
E-mail: luiz.a.t.junior@gmail.com
Submission: 30/10/2015
Revision: 23/11/2015
Accept: 26/11/2015
ABSTRACT
In the prediction of (stochastic) time series, it is commonly assumed that an individual predictive method, for instance an Auto-Regressive Integrated Moving Average (ARIMA) model, produces residuals that behave like a white noise process. However, mainly because of auto-dependence structures not mapped by a given individual predictive method, this assumption is easily violated in practice, as pointed out by Firmino et al. (2015). In order to correct this (and thereby produce more accurate forecasts), this paper puts forward a Wavelet Hybrid Forecaster (WHF) that integrates the following numerical techniques: wavelet decomposition; ARIMA models; Artificial Neural Networks (ANNs); and linear combination of forecasts. Basically, the proposed WHF can map simultaneously the linear (by means of a linear combination of ARIMA forecasts) and the non-linear (through a linear combination of ANN forecasts) auto-dependence structures exhibited by a given time series. Unlike other hybrid methodologies in the literature, the WHF forecasts implicitly take into account the frequency information present in the underlying time series, by means of the Wavelet Components (WCs) obtained by the wavelet decomposition. The numerical results show that the WHF method achieved remarkable accuracy gains, when compared with other competitive forecasting methods already published in the specialized literature, in the prediction of a well-known annual sunspot time series.
Keywords: Wavelet decomposition, ARIMA model,
Artificial neural networks, Linear combination of forecasts
1. INTRODUCTION
Over the years, several forecasting methods have been proposed with the aim of producing increasingly accurate predictions of (stochastic) time series. In general, they can be split into two broad, mutually exclusive categories: the individual (or single) predictive methods, such as the well-known Auto-Regressive Integrated Moving Average (ARIMA) models, as in Hamilton (1994); and the combinations of individual predictive methods, proposed initially by Bates and Granger (1969). The collection of single predictive methods can in turn be regrouped into two exclusive classes: the statistical one (which includes, for instance, the (linear) ARIMA models) and the machine learning one (which includes the (non-linear) Artificial Neural Networks (ANNs), as in Haykin (2001)). Hybrid forecasting methods are those that model a given time series according to the following three steps: in Step 1, a single forecasting method from the statistical class (or from the machine learning class) is applied to the time series to produce its forecasts as well as its residuals; in Step 2, the forecasting errors generated in Step 1 are predicted by an individual forecaster from the other class; and, in Step 3, the forecasts provided in Step 1 are “corrected” by the predictions of the residuals produced in Step 2, so as to generate the hybrid forecasts of the underlying time series. In effect, a hybrid predictive method can be regarded as a particular case of combined forecasters.
In time series prediction, it has often been assumed that a single predictive method produces residuals that behave like a white noise process (i.e., that are unpredictable). However, mainly because of linear or non-linear auto-dependence structures not captured by the single predictive method adopted by the decision maker, this assumption is easily violated in most real-world applications. For instance, the (linear) ARIMA models are able to statistically guarantee, at a given significance level, only uncorrelated residuals (but not statistically independent ones, as is often assumed); this is because, from a mathematical point of view, they can be regarded as linear filters, as pointed out by Hamilton (1994). In turn, Zhang (2003) shows in his numerical experiments, in which three very popular time series were forecast by a hybrid forecaster, that the residuals produced by the ARIMA models could be efficiently predicted by non-linear forecasters (namely, the Multi-Layer Perceptron ANNs (MLP-ANNs), as in Haykin (2001)). In his research, the ARIMA model forecasts were in fact remarkably improved when the predictions of their respective residuals were added to them. Finally, in the numerical experiments in Firmino et al. (2015), each time series was modelled by means of several plausible ARIMA models whose forecasts were added to the predictions of their residuals produced by different ANNs; the results exhibit a remarkable accuracy gain of the hybrid forecasts over other traditional methods, in all adherence statistics.
In turn, wavelet decompositions of level r have proven to be very efficient in time series forecasting because of their multi-scaling property, as highlighted in Teixeira Jr et al. (2015). Basically, a wavelet decomposition of level r of a given time series with T observations (t=1,…,T) produces its r+1 Wavelet Components (WCs): one WC of approximation and r WCs of detail, each associated with its own scale. On the one hand, the WCs can be interpreted as frequency patterns of the underlying time series represented in the time domain. Each WC tends to exhibit less noise in its stochastic pattern than the underlying time series. On the other hand, more formally speaking, each WC is an orthogonal projection of the time series onto one of a family of pairwise orthogonal subspaces (as in Kubrusly (2011)), known as the “wavelet subspaces”, of the underlying space (defined in Section 2.1). In this way, accuracy gains may be achieved when wavelet methods (such as the wavelet decomposition) are adopted in the forecasting process.
Thus, the Wavelet Hybrid Forecaster (WHF) proposed here aims to produce hybrid forecasts that aggregate information from different sources (i.e., numerical methods) and achieve a higher level of accuracy than other methodologies in the literature. The WHF integrates the following approaches: wavelet decomposition of level r, in Section 2.1; ARIMA models, in Section 2.2; and ANN methods, in Section 2.3. Section 3 describes in detail all the steps needed to carry out the WHF (in particular, it presents the Linear Combination (LC) of forecasts proposed here). In general terms, the proposed WHF can capture, at the same time, linear (by means of an LC of ARIMA models) and non-linear (by using an LC of ANNs) auto-dependence structures exhibited by the time series to be forecast, as well as its spectral frequency information. Unlike other hybrid predictive methods in the time series literature, the WHF hybrid forecasts implicitly incorporate spectral information through the WCs and rely on an alternative linear combination of forecasts, as will be seen. In order to illustrate the proposed methodology, the sunspot, the Canadian lynx and the British pound/US dollar exchange rate time series are modelled and then predicted, following Zhang (2003).
The paper is organized in five sections. After this introduction (Section 1), Section 2 reviews the methodologies used; Section 3 describes the steps of the proposed WHF; Section 4 presents and comments on the main numerical results; and, finally, Section 5 concludes the paper.
2. REVIEW OF LITERATURE
The purpose of this section is to present a brief review of some basic concepts needed to define the WHF method, described in Section 3. It starts, in Section 2.1, by describing the wavelet decomposition of level r, which is the algorithm adopted as the initial step of the ARIMA and ANN modelling. This is followed by the basic concepts of ARIMA and ANN models, in Sections 2.2 and 2.3, respectively.
Based on Kubrusly and Levan (2006), and Teixeira Jr et al.(2015), if a subset , wherein takes a fixed integer value, is in fact an orthonormal basis for the space , wherein , then a vector belonging to can be orthogonally decomposed, in terms of , as in (1).
(1)
Where: is recognized as the WC of approximation at level of the state , with ; and is referred to as the WC of detail at level of , with . The orthogonal decomposition in (1) is usually called a wavelet decomposition of .
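For reference, an orthogonal expansion of the kind described above is usually written as follows. The notation is an assumption rather than the authors' own: $f$ stands for the vector being decomposed, $\phi_{m_0,n}$ and $\omega_{m,n}$ for the scaling and wavelet elements of the orthonormal basis, $m_0$ for the fixed integer level, and $\langle\cdot,\cdot\rangle$ for the inner product of the space.

```latex
f \;=\;
\underbrace{\sum_{n \in \mathbb{Z}} \langle f, \phi_{m_0,n} \rangle \, \phi_{m_0,n}}_{\text{WC of approximation at level } m_0}
\;+\;
\sum_{m \ge m_0}
\underbrace{\sum_{n \in \mathbb{Z}} \langle f, \omega_{m,n} \rangle \, \omega_{m,n}}_{\text{WC of detail at level } m}
```

Under this reading, truncating the outer sum after r detail levels gives the r+1 components used in the rest of the paper.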
Tautologically, any finite scalar-valued complex time series may be interpreted as an infinite sequence in , defined as follows: , if ; and , if . In effect, each state can be orthogonally expanded by means of a wavelet decomposition, as the identity in (1). However, in practical terms, it cannot model individually all WCs generated by the expansion (1). Consequently, an adaptation is required in order to obtain a finite number of WCs. In this perspective, Donoho and Johnstone (1994) have proposed the wavelet decomposition of level r, wherein and , defined for each state , as below
(2)
where is the level parameter (which is commonly assumed to be equal to r); and (that consist, respectively, of the approximations of the WCs and from Equation (1), wherein is such that ); is a parameter that takes an integer value such that ; and is the error vector of approximation of the state (where is usually assumed to be equals zero).
Importantly, if T is not an integer power of 2, it is common to pad the underlying time series with zeros until its length reaches the next integer power of 2. This procedure can always be carried out because the added zeros do not affect the calculation of the WCs in (2), which means that the auto-correlation previously exhibited by the time series, as well as by its WCs, is preserved (HAVEN; LIU; SHEN, 2012).
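The zero padding and a level-r decomposition can be illustrated with the Haar basis, which is also the basis used later in the numerical results. The sketch below is a minimal illustration, not the authors' MATLAB implementation: it relies on the fact that, for the Haar basis, the approximation component at depth j is the series of block means over blocks of length 2^j, and each detail component is the difference between two successive approximations. The function names and the numbering of the detail components (1 = finest) are assumptions made for this example and do not follow the level indexing of the paper.

```python
def pad_to_power_of_two(series):
    """Append zeros until the length is the next integer power of 2."""
    n = 1
    while n < len(series):
        n *= 2
    return list(series) + [0.0] * (n - len(series))


def block_means(values, size):
    """Replace each consecutive block of `size` points by its mean."""
    out = []
    for start in range(0, len(values), size):
        block = values[start:start + size]
        mean = sum(block) / len(block)
        out.extend([mean] * len(block))
    return out


def haar_components(series, level):
    """Return (approximation, [detail_1, ..., detail_level]) in the time domain.

    The components add up, point by point, to the zero-padded series.
    """
    x = pad_to_power_of_two(series)
    if 2 ** level > len(x):
        raise ValueError("level too deep for this series length")
    details = []
    previous = x
    for j in range(1, level + 1):
        current = block_means(x, 2 ** j)
        # Detail at depth j = what is lost when moving to the coarser approximation.
        details.append([p - c for p, c in zip(previous, current)])
        previous = current
    return previous, details


if __name__ == "__main__":
    approx, details = haar_components([4, 6, 10, 12, 8, 6, 5], level=2)
    print("approximation:", approx)
    for j, d in enumerate(details, start=1):
        print("detail", j, ":", d)
```

Each component has the same length as the padded series, so the components can be modelled separately and their forecasts recombined, as done in Section 3.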
Consider a time series that exhibits an auto-correlation structure. According to Liu (2006), it is an ARIMA(p,d,q) process if, and only if, it can be represented as in Equation (3):
(3)
Where: B is a backward operator, defined by , wherein k runs over in the set ; is the difference operator, with d representing its order; and are the lists of model complex parameters, with and , where they must satisfy both the invertibility and the stationarity conditions (as in HAMILTON, 1994); is the innovation consisting of a state of the random variable of a white noise stochastic process, denoted by , with zero mean and null auto-covariance; p and q are, respectively, the orders of the auto-regressive part, denoted by , and of the moving average part, represented by part .
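In assumed notation (with $y_t$ for the time series and $\varepsilon_t$ for the innovation, which are not necessarily the authors' symbols), the standard ARIMA(p,d,q) representation that matches these definitions reads as follows. The sign convention of the moving average polynomial varies between textbooks; the one below is only one common choice.

```latex
\phi(B)\,\nabla^{d} y_t = \theta(B)\,\varepsilon_t,
\qquad \nabla^{d} = (1 - B)^{d},
\qquad B^{k} y_t = y_{t-k},
\\[4pt]
\phi(B) = 1 - \phi_1 B - \dots - \phi_p B^{p},
\qquad
\theta(B) = 1 + \theta_1 B + \dots + \theta_q B^{q}.
```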
According to Liu (2006), in order to obtain the best possible ARIMA model, three basic steps should be carried out: (i) test the plausible values for the parameters p, d, and q in Equation (3), which can be obtained by analysing the plots of the simple and partial auto-correlation functions of the residuals (as in HAMILTON, 1994); (ii) define the method to be used to estimate the ARIMA parameters (the most common is the Maximum Likelihood Estimation method, as in Liu (2006)); and (iii) perform a diagnostic check to choose the most parsimonious and adequate model for generating both the in-sample and the out-of-sample forecasts of the time series.
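The auto-correlation analysis mentioned in step (i), and reused in the diagnostic check of step (iii), can be sketched as follows. This is a minimal illustration with hypothetical function names; it computes the sample auto-correlation function and flags the lags that fall outside the usual approximate 95% band of ±1.96/√n. It does not handle a constant series, for which the variance is zero.

```python
import math


def sample_acf(series, max_lag):
    """Sample auto-correlations for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    c0 = sum(v * v for v in centered) / n
    acf = []
    for k in range(max_lag + 1):
        ck = sum(centered[t] * centered[t - k] for t in range(k, n)) / n
        acf.append(ck / c0)
    return acf


def significant_lags(series, max_lag):
    """Lags whose sample auto-correlation lies outside +/- 1.96 / sqrt(n)."""
    bound = 1.96 / math.sqrt(len(series))
    acf = sample_acf(series, max_lag)
    return [k for k in range(1, max_lag + 1) if abs(acf[k]) > bound]


if __name__ == "__main__":
    alternating = [1.0, -1.0] * 10
    print(sample_acf(alternating, 2))
    print(significant_lags(alternating, 2))
```

When applied to the residuals of a fitted model, an empty list of significant lags is consistent with uncorrelated residuals, although it says nothing about non-linear dependence.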
Artificial Neural Networks (ANNs) are well-known to be flexible computing frameworks for modeling and forecasting a broad range of stochastic time series exhibiting either linear or non-linear auto-dependence structures. Contrary to many linear statistical forecasting models, stationarity is not required by ANN methods (see e.g. TEIXEIRA JR. et al., 2015).
Another important aspect of ANNs is that they are universal approximators of compact (i.e., closed and bounded) support functions, as shown in Khashei and Bijari (2011). In effect, since observations from a time series that depend on past values may be seen as points of the domain of an unknown compact support function, it follows that ANNs are capable of approximating them (for modelling or forecasting) with a high degree of accuracy.
According to Zhang (2003), the predictive power of ANNs comes from the parallel processing of the information exhibited by the data. In addition, ANN models are largely determined by the stochastic characteristics inherent in the time series.
In this context, the feedforward multilayer perceptron ANNs (see e.g. HAYKIN, 2001) are the most widely used neural prediction models for time series forecasting. In particular, the network considered here is composed of three layers (namely, input, hidden and output layers) of simple processing units connected by acyclic links. The relationship between the output and the L lagged inputs is given by Equation (4):
(4)
where and are the ANN parameters, called connection weights; is the number of input nodes; is the number of hidden nodes; is the approximation error at time t; and is the transfer function, here, a logistic function - although it is possible to adopt other functions (see e.g HAYKIN, 2001). The logistic function is widely employed as the hidden layer transfer function in neural network forecasting and is mathematically defined by Equation (5):
(5)
where the exponential function has Euler's number as its base (as in Haykin, 2001). Because the transfer function is non-linear, the ANN model in (4) in fact performs a non-linear mapping of the past observations to produce a forecast of the time series.
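The mapping performed by a three-layer perceptron with a logistic hidden layer and a linear output node can be sketched as below. This is an illustration of the general form found in Zhang (2003), not a reproduction of the authors' MATLAB network: the function and argument names are hypothetical, and the weights in the usage example are arbitrary rather than trained.

```python
import math


def logistic(x):
    """Logistic transfer function 1 / (1 + exp(-x)), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def mlp_forecast(lagged_inputs, hidden_weights, hidden_biases,
                 output_weights, output_bias):
    """One-step forecast from lagged observations.

    hidden_weights[j] holds the connection weights from every input to hidden node j.
    """
    hidden = []
    for weights, bias in zip(hidden_weights, hidden_biases):
        activation = bias + sum(w * x for w, x in zip(weights, lagged_inputs))
        hidden.append(logistic(activation))
    # Linear output node: weighted sum of the hidden-layer outputs plus a bias.
    return output_bias + sum(w * h for w, h in zip(output_weights, hidden))


if __name__ == "__main__":
    # Two lagged inputs, two hidden nodes, arbitrary (untrained) weights.
    print(mlp_forecast([0.3, 0.7],
                       hidden_weights=[[0.5, -0.2], [0.1, 0.4]],
                       hidden_biases=[0.0, -0.1],
                       output_weights=[1.2, -0.8],
                       output_bias=0.05))
```

In practice the weights are estimated by a training algorithm that minimises the in-sample approximation error; that step is outside the scope of this sketch.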
3. PROPOSED METHODOLOGY
Consider a time series for which k-steps-ahead forecasts are required over a forecasting horizon h. For this purpose, the WHF proposed here is carried out in the following six steps.
Step 1: A wavelet decomposition of level r of the time series is performed, producing its r+1 WCs: one WC of approximation and r WCs of detail.
Step 2: The r+1 WCs generated in Step 1 are individually modelled by ARIMA models in order to yield, for each WC, a list of in-sample and out-of-sample forecasts, where h represents the forecasting horizon and the in-sample forecasts start after the observations (degrees of freedom) lost in the ARIMA modelling. It is important to point out that the forecasts of the WC of approximation come from ARIMA model 1, whereas the forecasts of the m-th WC of detail (m = 1, …, r) come from ARIMA model m+1.
Step 3: The forecasts of the r+1 WCs from Step 2 are linearly combined by means of the LWC, defined in Equation (6).
(6)
Where is the combined linear forecast of ; and , , …, are the optimal LWC parameters, which are obtained by solving the NLP (RAGSDALE, 2004) described below.
Objective: minimize MSE.
Subject to:
Where, is the list of (in-sample) forecasting errors; and and are the decision variables (to be optimized and substituted in Equation (6)).
Step 4: The list of (in-sample) forecasting errors from Step 3 is decomposed by a wavelet decomposition of level k, providing its k+1 WCs: one WC of approximation and k WCs of detail.
Step 5: The k+1 WCs produced in Step 4 are used as input patterns of the ANN in order to generate the combined (non-linear) in-sample and out-of-sample forecasts of the forecasting errors, which start after the degrees of freedom lost up to this point.
Step 6: The combined linear forecasts (from Step 3) and the combined non-linear forecasts (from Step 5) are simply summed for each instant t, producing the in-sample and out-of-sample hybrid forecasts of the underlying time series.
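Steps 3 and 6 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the constraints of the optimisation problem in Step 3 are not reproduced here, so the weights are obtained by unconstrained least squares (which minimises the in-sample MSE), whereas the paper reports using a spreadsheet solver. All function names are hypothetical, and the forecast lists are assumed to be aligned on the same time instants.

```python
def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("component forecasts are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def fit_combination_weights(component_forecasts, actual):
    """Step 3 (assumed unconstrained): weights minimising the in-sample MSE."""
    m = len(component_forecasts)
    gram = [[sum(a * b for a, b in zip(component_forecasts[i], component_forecasts[j]))
             for j in range(m)] for i in range(m)]
    rhs = [sum(f * y for f, y in zip(component_forecasts[i], actual))
           for i in range(m)]
    return solve_linear_system(gram, rhs)


def combine(component_forecasts, weights):
    """Weighted sum of the component forecasts at each instant."""
    horizon = len(component_forecasts[0])
    return [sum(w * f[t] for w, f in zip(weights, component_forecasts))
            for t in range(horizon)]


def hybrid_forecasts(linear_forecasts, error_forecasts):
    """Step 6: add the predicted errors to the combined linear forecasts."""
    if len(linear_forecasts) != len(error_forecasts):
        raise ValueError("both lists must cover the same time instants")
    return [lin + err for lin, err in zip(linear_forecasts, error_forecasts)]


if __name__ == "__main__":
    wc_forecasts = [[1.0, 2.0, 3.0, 4.0], [1.0, 0.0, 1.0, 0.0]]
    observed = [2.5, 4.0, 6.5, 8.0]
    weights = fit_combination_weights(wc_forecasts, observed)
    print(weights)
```

The fitted weights would be estimated on the in-sample period only and then applied unchanged to the out-of-sample component forecasts.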
Regarding the six steps of the WHF above, it is worth noting some important aspects. Firstly, Steps 1 and 4 aim to obtain a finite number of temporal subseries (WCs) with a better-behaved stochastic pattern than the underlying time series. This is possible because each WC has a stationary spectral frequency, as noted in Mallat (2009), whereas the time series itself can be seen as the sum of spectral components with different frequencies, as pointed out by Levan and Kubrusly (2003).
Secondly, in Step 2, the goal is to identify a plausible linear auto-dependence structure (i.e., a linear forecaster); ARIMA models were chosen here because they are widely adopted for this purpose, as highlighted by Hamilton (1994). Thirdly, in Step 3, after a plausible linear structure has been identified for each WC, their predictions are linearly combined by means of the WLC to generate the combined linear forecasts, which capture the linear information exhibited by the time series.
Consequently, the combined linear forecast is in fact a forecast of the time series itself, and it can be interpreted as an aggregator of distinct linear information from different linear sources (i.e., the r+1 different linear ARIMA models).
Fourthly, in Step 4, each WC produced carries noise (similar to white noise) and the non-linear auto-dependence structure of the time series, since the forecasting errors are, following Hamilton (1994), outcomes of a linear filter (in this case, the WLC of ARIMA models, in Step 3).
The WLC of ARIMA models is not able to map non-linear auto-dependence structures properly. It follows that each WC of the residuals needs to be modelled by a non-linear forecaster; otherwise, important information could be lost in the predictive process. Thus, in Step 5, the WCs were combined by a (non-linear) ANN model. The software library used here allows its parameters to be set by the decision maker.
Fifthly, in Step 6, the forecasts produced in Step 5 form the list of in-sample and out-of-sample combined forecasts that provide information on the non-linear auto-dependence structure of the time series. Similarly to the combined linear forecast, the combined non-linear forecast can be interpreted as an aggregator of non-linear information from different non-linear sources (namely, the r’+1 ANN models). Sixthly, the hybrid forecasts, endowed with both linear and non-linear information, can be seen as a version of the time series filtered by both a linear filter (namely, the WLC of ARIMA models, in Step 3) and a non-linear filter (i.e., the WLC of ANNs, in Step 5). Accordingly, the list of hybrid forecasting errors can in fact be classified as a noise process.
4. NUMERICAL RESULTS
To evaluate the proposed hybrid methodology, three time series were forecast here: the annual sunspot time series for the period from 1700 to 1987 (288 points), plotted in Figure 1; the number of lynx caught per year in the Mackenzie River district of northern Canada in the period from 1821 to 1934 (114 observations), in Figure 2; and the weekly British pound/US dollar exchange rate from 1980 to 1993 (731 points), in Figure 3. As pointed out in Zhang (2003), such predictions have practical importance for decision makers in several areas, such as geophysicists, environmental scientists, and climatologists.
In this paper, the actual values and their forecasts are compared over an out-of-sample forecasting horizon, which is the same for all approaches considered here. The E-Views 8 software was used for the ARIMA modelling, while the MLP ANN modelling was performed in MATLAB R2013a.
In turn, the LWC parameters in Step 3 were adjusted with the Solver package of Excel 2013. Finally, the wavelet decompositions were implemented in MATLAB R2013a. Regarding accuracy, the Mean Absolute Error (MAE) and the Mean Squared Error (MSE) (see HAMILTON, 1994) were used for evaluation.
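The two accuracy measures can be written down directly. The sketch below assumes the usual definitions (the mean of the squared errors and the mean of the absolute errors over the out-of-sample points); the function names are illustrative.

```python
def mse(actual, forecasts):
    """Mean Squared Error between observed values and their forecasts."""
    errors = [y - f for y, f in zip(actual, forecasts)]
    return sum(e * e for e in errors) / len(errors)


def mae(actual, forecasts):
    """Mean Absolute Error between observed values and their forecasts."""
    errors = [abs(y - f) for y, f in zip(actual, forecasts)]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0]
    predicted = [1.0, 2.0, 5.0]
    print(mse(observed, predicted), mae(observed, predicted))
```

Because the MSE squares the errors, it penalises a few large misses more heavily than the MAE does, which is why the two statistics are reported side by side.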
Figure 1: Sunspot time series (1700-1987).
Figure 2: Canada's lynx time series (1821 to 1934).
Figure 3: Exchange rate British pound/US dollar (1980 to 1993).
In Step 1, a wavelet decomposition of level 2, with the Haar orthonormal basis, of the training sample of the Sunspot time series was performed. The plots of the WCs can be seen in Figure 4.
(a) WC of approximation at level 2.
(b) WC of detail at level 2.
(c) WC of detail at level 3.
Figure 4: WCs of the training sample of the Sunspot annual time series.
Regarding Step 2, the following models were obtained:
a) An ARIMA(28,1,32) model, with logarithmic transformation, for predicting the WC of approximation at level 2. The model is fitted, for each instant t, to the first difference of this WC, and is algebraically given by Equation (7):
(7)
b) An ARIMA(20,0,0) model for predicting the WC of detail at level 2, (algebraically) defined by Equation (8):
(8)
c) An ARIMA(20,1,28) model for forecasting the WC of detail at level 3, (algebraically) defined by Equation (9):
(9)
The method used here for estimating the complex parameters in items (a), (b) and (c) above was the Likelihood approach, as in Hamilton (1994).
The optimal adaptive parameters from Step 3 are equal to , , .
Next, in Step 4, the forecasting errors produced in Step 3 were decomposed by means of a wavelet decomposition of level 2, with the Haar orthonormal basis (see MALLAT, 2009).
The best configuration of the ANN required in Step 5 to model the three WCs of the forecasting errors has five observations of each WC in the input layer and seven neurons in the hidden layer. Finally, in Step 6, the forecasts produced in Steps 3 and 5 are simply summed, generating the hybrid forecasts of the Sunspot time series. The plot of the out-of-sample actual values and their hybrid forecasts can be seen in Figure 5.
Figure 5: Hybrid forecasts and actual values of out-of-sample.
Table 1 exhibits the out-of-sample MSE and MAE adherence statistics for the forecasts of the competing predictive methods and of the proposed WHF method (shown in the last row of Table 1).
Table 1: Out-of-sample MSE and MAE for the sunspot time series (67 steps-ahead).
| References | Methods | MSE | MAE |
|---|---|---|---|
| (ZHANG, 2003) | ARIMA | 306.0 | 13.03 |
| | ANN | 351.1 | 13.54 |
| | Hybrid | 280.1 | 12.78 |
| (KHASHEI; BIJARI, 2011) | ANN/ARIMA | 218.6 | 11.45 |
| (ADHIKARI; AGRAWAL, 2013) | SVM | 792.9 | - |
| | Ensemble ANN | 280.5 | - |
| | Proposed WHF | 69.28 | 3.90 |
Source: The authors.
The models and parameters for the exchange rate time series are given in Table 2, according to the steps of the proposed method.
Table 2: Models and parameters for the exchange rate time series.
| | Variable | Values |
|---|---|---|
| ARIMA | | |
| | | |
| | | |
| Adaptive Parameters | | 1.014978 |
| | | 0.88357 |
| | | 0.942313 |
Source: The authors.
The forecasting errors produced in Step 3 were decomposed by means of a wavelet decomposition of level 2, with the Daubechies 45 orthonormal basis. The ANN input patterns have five observations of each WC, and the hidden layer has six neurons.
The plot of the out-of-sample actual values and their hybrid forecasts of exchange rate can be visualized in Figure 6.
Figure 6: Hybrid forecasts and actual values of out-of-sample.
Table 3 exhibits the out-of-sample MSE and MAE adherence statistics regarding the forecasts of competing predictive methods; and the proposed WHF method for the time series of exchange rate.
Table 3: Out-of-sample MSE and MAE for the exchange rate time series (52 steps-ahead).
| References | Methods | MSE | MAE |
|---|---|---|---|
| (ZHANG, 2003) | ARIMA | 4.52977 × 10^-5 | 0.005397 |
| | ANN | 4.52657 × 10^-5 | 0.0052513 |
| | Hybrid | 4.35907 × 10^-5 | 0.0051212 |
| (KHASHEI; BIJARI, 2011) | ANN/ARIMA | 3.64774 × 10^-5 | 0.0049691 |
| | Proposed WHF | 4.23 × 10^-6 | 0.000907 |
Source: The authors.
The information needed to apply the proposed method to the Canadian lynx time series is shown in Table 4.
Table 4: Models and parameters for the Canadian lynx time series.
| | Variable | Values |
|---|---|---|
| ARIMA | | |
| | | |
| | | |
| Adaptive Parameters | | 0.998526 |
| | | 1.001407 |
| | | 0.990531 |
Source: The authors.
The Canadian lynx forecasting errors were decomposed by means of a wavelet decomposition of level 2, with the Haar orthonormal basis. The ANN input patterns have five observations of each WC, and the hidden layer has five neurons. The results can be seen in Figure 7 and the adherence statistics in Table 5.
Figure 7: Hybrid forecasts and actual values of out-of-sample.
Table 5: Out-of-sample MSE and MAE for the Canadian lynx time series (14 steps-ahead).
| References | Methods | MSE | MAE |
|---|---|---|---|
| (ZHANG, 2003) | ARIMA | 0.020486 | 0.112255 |
| | ANN | 0.020466 | 0.112109 |
| | Hybrid | 0.017233 | 0.103972 |
| (KHASHEI; BIJARI, 2011) | ANN/ARIMA | 0.00999 | 0.085055 |
| (ADHIKARI; AGRAWAL, 2013) | SVM | 0.05267 | - |
| | Ensemble ANN | 0.00715 | - |
| | Proposed WHF | 0.000981 | 0.018828 |
Source: The authors.
5. CONCLUSIONS
The comparisons in Tables 1, 3 and 5 show that the proposed WHF method achieved lower out-of-sample MSE and MAE than all the other predictive methods cited in this paper, for the three time series considered.
According to Figures 5, 6 and 7, the observed values and the predictions produced by the proposed method over the out-of-sample period are strongly correlated, which indicates that a high predictive power was achieved for the Sunspot, exchange rate and Canadian lynx data.
It is also worth pointing out that, despite the relative complexity of the mathematical techniques integrated in the proposed methodology (described in Section 2), its implementation is relatively straightforward with appropriate software, such as E-Views 8 and MATLAB R2013a.
REFERENCES
ADHIKARI, R.; AGRAWAL, R. K. (2013) A Homogeneous Ensemble of Artificial Neural Networks for Time Series Forecasting. International Journal of Computer Applications, v. 32, n. 7, p. 8.
BATES, J. M.; GRANGER, C. W. J. (1969) The Combination of Forecasts. Journal of the Operational Research Society.
DONOHO, D. L.; JOHNSTONE, I. M. (1994) Ideal spatial adaptation by wavelet shrinkage. Biometrika, v. 81, n. 3, p. 425–455.
FIRMINO, P. R. A.; DE MATTOS NETO, P. S. G.; FERREIRA, T. A. E. (2015) Error modeling approach to improve time series forecasters. Neurocomputing, v. 153, p. 242–254.
HAMILTON, J. D. (1994) Time Series Analysis, 1 ed. New Jersey: Princeton University Press.
HAVEN, E.; LIU, X.; SHEN, L. (2012) De-noising option prices with the wavelet method. European Journal of Operational Research, v. 222, n. 1, p. 104–112.
HAYKIN, S. S. (2001) Redes Neurais, 2 ed. Porto Alegre: Bookman.
KHASHEI, M.; BIJARI, M. (2011) A New Hybrid Methodology for Nonlinear Time Series Forecasting. Modelling and Simulation in Engineering, v. 2011, p. 1–5.
KUBRUSLY, C. S. (2011) The Elements of Operator Theory, 2 ed. New York: Birkhäuser.
KUBRUSLY, C. S.; LEVAN, N. (2006) Abstract wavelet generated by Hilbert space shift operators. Advances in Mathematical Sciences and Applications, v. 16, p. 643–660.
LEVAN, N.; KUBRUSLY, C. S. (2003) A wavelet “time-shift-detail” decomposition. Mathematics and Computers in Simulation, v. 63, n. 2, p. 73–78.
LIU, L.-M. (2006) Time Series Analysis and Forecasting, 2 ed. Chicago, IL: Scientific Computing Associates Corporation.
LUTKEPOHL, H. (2006) Forecasting with VARMA Models. In: Handbook of Economic Forecasting. [s.l.] Elsevier, v. 1, p. 287–325.
MALLAT, S. (2009) A Wavelet Tour of Signal Processing: The Sparse Way, 3 ed. Burlington: Elsevier Inc.
RAGSDALE, C. (2004) Spreadsheet Modeling & Decision Analysis: A Practical Introduction to Management Science, 4 ed. [s.l.] South-Western.
TEIXEIRA JR, L. A. et al. (2015) Artificial Neural Network and Wavelet decomposition in the Forecast of Global Horizontal Solar Radiation. Sobrapo, v. 35, n. 1, p. 1–16.
ZHANG, G. P. (2003) Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing, v. 50, p. 159–175.