MODELO PARA A FORMATAÇÃO DOS ARTIGOS A SEREM UTILIZADOS NO ENEGEP 2003

TIME SERIES FORECASTING BY USING A NEURAL ARIMA MODEL BASED ON WAVELET DECOMPOSITION

Eliete Nascimento Pereira

Federal University of Paraná, Brazil

E-mail: elietenp@gmail.com

Cassius Tadeu Scarpin

Federal University of Paraná, Brazil

E-mail: cassiusts@gmail.com

Luíz Albino Teixeira Júnior

Latin American Integration University (UNILA),Brazil

E-mail: luiz.a.t.junior@gmail.com

Submission: 30/10/2015

Revision: 23/11/2015

Accept: 26/11/2015

ABSTRACT

In the prediction of (stochastic) time series, it has been common to suppose that an individual predictive method – for instance, an Auto-Regressive Integrated Moving Average (ARIMA) model – produces residuals like a white noise process. However, mainly due to the structures of auto-dependence not mapped by a given individual predictive method, this assumption might be easily violated, in practice, as pointed out by Firmino et al. (2015). In order to correct it (and accordingly to produce forecasts with more accuracy power), this paper puts forward a Wavelet Hybrid Forecaster (WHF) that integrates the following numerical techniques: wavelet decomposition; ARIMA models; Artificial Neural Networks (ANNs); and linear combination of forecasts. Basically, the proposed WHF can map simultaneously linear – by means of a linear combination of ARIMA forecasts – and non-linear – through a linear combination of ANN forecasts – auto-dependence structures exhibited by a given time series. Differently of other hybrid methodologies existing in literature, the WHF forecasts are produced carrying into account implicitly the information from the frequency presenting in the underlying time series by means of the Wavelet Components (WCs) obtained by the wavelet decomposition approach. All numerical results show that WHF method has achieved remarkable accuracy gains, when comparing with other competitive forecasting methods already published in specialized literature, in the prediction of a well-known annual time series of sunspot.

Keywords: Wavelet decomposition, ARIMA model, Artificial neural networks, Linear combination of forecasts

1. INTRODUCTION

Over the years, several forecasting methods have been proposed with the aim of producing increasingly accurate predictions of (stochastic) time series. In general, they could be split into two great exclusive categories: the individual (or single) predictive methods, such as the well-known Auto-Regressive Integrated Moving Average (ARIMA) models, as in Hamilton (1994); and the combination of individual predictive methods, proposed initially by Bates and Granger (1969). Indeed, the collection of single predictive methods might yet be regrouped into two exclusive classes: the statistical one (here it lies, for instance, the (linear) ARIMA models), and the machine learning one (here it lies the (non-linear) Artificial Neural Networks (ANNs), as in HAYKIN, 2001). By hybrid forecasting methods, it means those ones that always carry out the modelling of a given time series, denoted by , according to the following three steps: in Step 1, a single forecasting method from statistical class/machine learning class is applied to for producing its forecasts as well as its residuals; in Step 2, the forecasting errors generated in Step 1 are predicted by using an individual forecaster from machine learning class/statistical class; and, in step 3, the forecasts provided in Step 1 are “corrected” by the predictions of the residuals produced in Step 2 such that to generate the hybrid forecasts of the underlying time series . In effect, a hybrid predictive method can be referred to as particular case of combined forecasters.

In the process of prediction of time series, it has often been to assume that a single predictive method produces residuals like a white noise process (i.e., unpredictable). However, mainly due to the linear or non-linear structures of auto-dependence not captured by a single predictive method adopted by the decision maker, this supposition may be trivially broken, most applications in real world. For instance, the (linear) ARIMA models are able to statistically guarantee, based on a significance level , only uncorrelated residuals (but not statistical independence, as it is often assumed); this because, from mathematical point of view, those ones may be visualized as linear filters, as pointed out by Hamilton (1994). In turn, Zhang (2003) shows in his numerical experiments – on which three very popular time series were forecasted by employing an hybrid forecaster – that the residuals produced by the ARIMA models could be efficiently predicted by using non-linear forecasters (namely, the Multi-Layer Perceptron ANNs (MLP-ANNs), as in HAYKIN, 2001)). In his research, it can be verified that the ARIMA model forecasts were in fact remarkably improved when received the sum of the prediction of their respective residuals. Finally, in the numerical experiments in Firmino et al. (2015), each time series was modelled by means of several plausible ARIMA models whose forecasts were added by the predictions of their residuals produced by different ANNs; the results achieved exhibit a remarkable accuracy gain in the hybrid forecasts when compared with other traditional methods, in all adherence statistics.

In turn, wavelet decompositions of levels r have shown to be very efficient in dealing with time series forecasting due to its multi-scaling property, as highlighted in Teixeira Jr et al. (2015). Basically, from a given time series , through a wavelet decomposition of level r, its r+1 Wavelet Components (WCs) – namely, one WC of approximation at scale , represented by (t=1,…,T), and r WCs of detail at scales , denoted by , , …, , respectively – are produced. In one hand, the WCs can be interpreted as frequency patterns of the underlying time series represented at time domain. Each WC tends to exhibit lesser noise in its stochastic pattern than the underlying time series. On the other hand, more formally speaking, each WC is an orthogonal projection of onto a pairwise orthogonal subspace (as in Kubrusly (2011)), recognized as the “wavelet subspaces”, of the space (defined in Section 2.1). In this way, accuracy gains may be achieved when the wavelet methods (as the wavelet decomposition) are adopted in the forecasting process.

Thus, the Wavelet Hybrid Methodology (WHF) proposed here aims to produce hybrid forecasts aggregating different information from different sources (i.e., numerical methods) with higher level of accuracy than other methodologies presenting in literature. The WHF integrates the following approaches: wavelet decomposition of level r, in Section 2.1; ARIMA models, in Section 2.2; ANN methods, in Section 2.3. Section 3 describes detail all steps to carried out the WHF (in particular, presents the Linear Combination (LC) of forecasts proposed here). In general lines, the proposed WHF can capture, at the same time, linear (by means of a LC of ARIMA models) and non-linear (by using a LC of ANNs) auto-dependence structures exhibited by a given time series to be forecasted, as well as its information on spectral frequency. In a different way of other hybrid predictive methods in time series literature, the WHF hybrid forecasts implicitly adheres spectral information through the WCs and perform an alternative linear combination of forecasts, as it will be seen. In order to illustrate the proposed methodology, the Canadian lynx and the British pound/US dollar exchange rate time series are modelled and hence predicted, following Zhang (2003).

The current paper is split into 5 sections. Beyond the introduction (Section 1), in Section 2, a review of the used methodologies is presented; Section 3 describes the steps of the proposed WHF; in Section 4, the main numerical results are delayed and comments upon; and, finally, in Section 5, the paper is closed.

2. REVIEW OF LITERATURE

The purpose of this section is to present a brief review of some basic concepts which are needed for defining the WHF method, described in Section 3. It starts, in Section 2.1, by describing the wavelet decomposition of level r, which is the algorithm adopted in initial step of the ARIMA and ANN methods. This is followed by the basic concepts on ARIMA and ANN models, in Sections 2.2 and 2.3, respectively.

2.1. Wavelet decomposition of level r

Based on Kubrusly and Levan (2006), and Teixeira Jr et al.(2015), if a subset , wherein takes a fixed integer value, is in fact an orthonormal basis for the space , wherein , then a vector belonging to can be orthogonally decomposed, in terms of , as in (1).

(1)

Where: is recognized as the WC of approximation at level of the state , with ; and is referred to as the WC of detail at level of , with . The orthogonal decomposition in (1) is usually called a wavelet decomposition of .

Tautologically, any finite scalar-valued complex time series may be interpreted as an infinite sequence in , defined as follows: , if ; and , if . In effect, each state can be orthogonally expanded by means of a wavelet decomposition, as the identity in (1). However, in practical terms, it cannot model individually all WCs generated by the expansion (1). Consequently, an adaptation is required in order to obtain a finite number of WCs. In this perspective, Donoho and Johnstone (1994) have proposed the wavelet decomposition of level r, wherein and , defined for each state , as below

(2)

where is the level parameter (which is commonly assumed to be equal to r); and (that consist, respectively, of the approximations of the WCs and from Equation (1), wherein is such that ); is a parameter that takes an integer value such that ; and is the error vector of approximation of the state (where is usually assumed to be equals zero).

Importantly, if T is not an integer power of 2, it is often to fill in the underlying time series with zeros until its length T is increased up to the next integer power of 2. This procedure may be always carried out due to the fact that the zeros added up do not affect the calculation of the WCs and in (2), what means the previously auto-correlation exhibited by , as well its WCs, is preserved (HAVEN; LIU; SHEN, 2012).

2.2. ARIMA models

Let be a time series exhibiting structure of auto-correlation. According to Liu (2006), is an ARIMA (p,d,q) process, if only if, it can be represented as in Equation (3):

(3)

Where: B is a backward operator, defined by , wherein k runs over in the set ; is the difference operator, with d representing its order; and are the lists of model complex parameters, with and , where they must satisfy both the invertibility and the stationarity conditions (as in HAMILTON, 1994); is the innovation consisting of a state of the random variable of a white noise stochastic process, denoted by , with zero mean and null auto-covariance; p and q are, respectively, the orders of the auto-regressive part, denoted by , and of the moving average part, represented by part .

According to Liu (2006), in order to obtain the best possible ARIMA model, three basic steps should be carried out: (i) test the plausible values for the parameters p, d, and q, in Equation (3), which can be obtained through the profile analysis of the plots of simple and partial auto-correlation functions of the residuals (as in HAMILTON, 1994); (ii) define the method to be used to estimate the ARIMA parameters (the most common is the Maximum Likelihood Estimation method (as in Liu (2006)); and, (iii) make a diagnostic check to choose the most parsimonious and adequate model to be used for generating both the in-sample and the out-of-sample forecasts of .

2.3. Artificial Neural Network

Artificial Neural Networks (ANNs) are well-known to be flexible computing frameworks for modeling and forecasting a broad range of stochastic time series exhibiting either linear or non-linear auto-dependence structures. Contrary to many linear statistical forecasting models, stationarity is not required by ANN methods (see e.g. TEIXEIRA JR. et al., 2015).

Another important aspect of ANNs is that they are universal approximates of compact (i.e., closed and bounded) support functions, as showed in Khashei and Bijari (2011). In effect, since observations from a time series that exhibit dependency on past values may be seen as points of the domain of an unknown compact support function, it follows that the ANNs are capable of approximating them (for modeling or forecasting) with a high degree of accuracy.

According to Zhang (2003), the predictive power of ANNs comes from the parallel processing of the information exhibited by the data. In addition, AAN models are largely determined by the stochastic characteristics inherent in the time series.

In this context, the feedforward multilayer perceptron ANNs (see e.g. HAYKIN, 2001) are the most widely used neural prediction models for time series forecasting. Particularly, an artificial network composed by three layers (namely, input, hidden an output layers) of simple processing units numerically connected by acyclic links. The relationship between the output and the L-lagged inputs, represented by , has the following Equation (4):

(4)

where and are the ANN parameters, called connection weights; is the number of input nodes; is the number of hidden nodes; is the approximation error at time t; and is the transfer function, here, a logistic function - although it is possible to adopt other functions (see e.g HAYKIN, 2001). The logistic function is widely employed as the hidden layer transfer function in neural network forecasting and is mathematically defined by Equation (5):

(5)

where and is the exponential function with Euler’s basis (as in Haykin, 2001). Due to being a non-linear transfer function, the ANN model in (5), in fact performs a non-linear mapping of the past observations to produce a forecast for .

3. PROPOSED METHODOLOGY

Let be a time series for which k steps-ahead forecasts are required in a forecasting horizon h. For this purpose, the WHF proposed here is carried out accordingly by the following six steps.

Step 1: A wavelet decomposition of level r of is performed, producing its r+1 WCs – namely, one WC of approximation at level , denoted by , and r WCs of detail at levels from to , denoted, respectively, by , , ….,.

Step 2: The r+1 WCs generated in Step 1 are individually modelled through ARIMA model in order to yield the following lists of in-sample and out-of-sample forecasts: , and , …,, where h represents the forecasting horizon, and , the value of degrees of freedom lost in the ARIMA modelling. It is important to point out that and , for each, denote, respectively, the prediction (from ARIMA model 1) of and the prediction (from ARIMA model m+1) of.

Step 3: The forecasts of the r+1 WCs from Step 2 are linearly combined by means of the LWC, defined in Equation (6).

(6)

Where is the combined linear forecast of ; and , , …, are the optimal LWC parameters, which are obtained by solving the NLP (RAGSDALE, 2004) described below.

Objective: minimize MSE.

Subject to:

Where, is the list of (in-sample) forecasting errors; and and are the decision variables (to be optimized and substituted in Equation (6)).

Step 4: The list is decomposed by a wavelet decomposition of level k, providing its k+1 WCs – i.e., one WC of approximation at level , denoted by , and k WCs of detail at levels from to , denoted, respectively, by ,.

Step 5: Each k+1 WC produced in Step 4 is used as input patterns in the ANN in order to generate the combination of the following list of in-sample and out-of-sample forecasts , where , is the degrees of freedom lost until here.

Step 6: The combined linear forecasts (from Step 3) and the combined non-linear forecast (from Step 5) are simply summed for each instant t (i.e., ), producing the in-sample and out-of-sample hybrid forecasts, denoted by , of the underlying time series .

Following the above seven steps of the WHF, it is worth noting some important aspects. Firstly, Steps 1 and 4 are aimed to obtain a finite number of temporal subseries (WCs) with better stochastic pattern than the underlying time series. It is possible due to the fact each WC has a stationary spectral frequency, as noted in Mallat (2009); while can be seen as the result of the sum of spectral components with different frequencies, as pointed by Levan and Kubrusly (2003).

Secondly, in Step 2, the goal is to identify a plausible linear auto-dependence structure (i.e., a linear forecaster), the ARIMA models were chosen here, as they are widely adopted for this purpose, as highlighted by Hamilton (1994). Thirdly, in Step 3, after identifying a plausible linear structure for each WC, their predictions are linearly combined by means of the WLC and then generated the combined linear forecasts, which decode linear information exhibited by .

Consequently, the forecast is in fact a forecast of and allows for being interpreted as an aggregator of distinct linear information from different linear sources (i.e., the r+1 different linear ARIMA models).

Fourthly, in Step 4, each WC produced here is endowed with noise (like a white noise) and non-linear auto-dependence structure of , as the forecasting errors are, following (HAMILTON, 1994), outcomes from a linear filter (in this case, the WLC of ARIMA models, in Step 3).

The WLC of ARIMA models is not able to map properly non-linear auto-dependence structures. In effect, it is easily noticed that each WCs of the residuals needed to be modelled by a non-linear forecaster. Unlikely, important information could be lost in the predictive process. Thus, in Step 5, each WC was combined by a (non-linear) ANN model. The software library used here allows for the parameters may be set by the decision maker.

Fifthly, in Step 6, the forecast produced in Step 5 generate the list of in-sample and out-of-sample combined forecasts that provide information on the non-linear auto-dependence structure of . Similarly to the combined linear forecast , can be interpreted as an aggregator of non-linear information from different nonlinear sources (namely, the r’+1 ANN models). Sixthly, the hybrid forecasts , endowed with both linear and non-linear information, can be seen as a version of filtered by both a linear filter (namely, the WLC of ARIMA models, in Step 3) and a non-linear filter (i.e., the WLC of ANNs, in Step 5). Accordingly, the list of forecasting errors, note that , can be in fact classified as a noise process.

4. NUMERICAL RESULTS

For evaluating the proposed hybrid methodology, an annual time series of sunspot concerning the period from 1700 to 1987 (resulting 288 points), plotted in Figure 1, the number of lynx caught per year in the Mackenzie Lake in Canada's northern district in period 1821 to 1934 (114 observations) in Figure 2, the weekly British pound/US dollar exchange rate from the year 1980 to 1993 (731 points), in Figure 3, have been projected here. As pointed out in Zhang (2003), prediction has practical importance for decision makers of several areas, as geophysicists, environment scientists, and climatologists.

In this paper, a comparison between the actual values and their forecasts is performed accordingly for a forecasting horizon out-of-sample, which is used for all approaches considered here. The E-Views 8 software was used for the ARIMA modelling; while the MLP ANN modelling was performed in MATLAB R2013a software.

In turn, the adjustment of LWC parameters in Steps 3 was done through the solver package of Excel 2013. Finally, the wavelet decompositions were implemented in MATLAB R2013a software. Regarding the accurate performances, the MAE and MSE (see HAMILTON, 1994)), were taken into evaluation here.

Figure 1: Sunspot time series (1700-1987).

Figure 2: Canada's lynx time series (1821 to 1934).

Figure 3: Exchange rate British pound/US dollar (1980 to 1993).

In Step 1, a Wavelet decomposition of level 2 with the Haar orthonormal basis of the training sample of time series of Sunspot was performed. The plots of the WCs can be seen in Figure 4.

(a) WC of approximation at level 2, .

(b) WC of detail at level 2, .

Figure 4: WCs of the training sample of the Sunspot annual time series.

Regarding Step 2, it has:

a) An ARIMA(28,1,32) model, with logarithm transformation, for predicting the WC of approximation at level 2. Let be, for each instant t, the first difference of (which is the WC of approximation at level 2 of the state ). Thereby, the mentioned predictor is algebraically given by Equation (7):

(7)

b) b) An ARIMA (20,0,0) model for predicting the WC of detail at level 2. Assume that = for each instant t. So, the mentioned model is (algebraically) defined by Equation (8):

(8)

c) c) An ARIMA(20,1,28) model in order to forecast the WC of detail at level 3. Assume that =, for each instant t. So, the mentioned model is (algebraically) defined by Equation (9):

(9)

The method used here for estimating the complex parameters in items (a), (b) and (c) above was the Likelihood approach, as in Hamilton (1994).

The optimal adaptive parameters from Step 3 are equal to , , .

Following, in Step 4, the predicting errors , being , produced in Step 3, was decomposed by means of a wavelet decomposition of level 2, with orthonormal basis of Haar (see MALLAT, 2009)).

The best configurations of the ANN required in Step 5 to model the three WCs of the forecasting errors , has five data of each decomposition in input layer and seven neurons in hidden layer. Finally, in Step 6, the forecasts produced in Steps 3 and 5 are simply summed, generating the hybrid forecasts (t=), where , of the Sunspot time series. The plot of the out-of-sample actual values and their hybrid forecasts can be visualized in Figure 5.

Figure 5: Hybrid forecasts and actual values of out-of-sample.

Table 1 exhibits the out-of-sample MSE and MAE adherence statistics regarding the forecasts of competing predictive methods; and the proposed WHF method (highlighted at the bottom of the Table 1).

Table 1: MAE and MSE of out-of-sample values sunspot time series.

References	METHODS	67 steps-ahead
References	METHODS	MSE	MAE
(ZHANG, 2003)	ARIMA	306.0	13.03
	ANN	351.1	13.54
	Hybrid	280.1	12.78
(KHASHEI; BIJARI, 2011)	ANN/ARIMA	218.6	11.45
(ADHIKARI; AGRAWAL, 2013)	SVM	792.9	-
(ADHIKARI; AGRAWAL, 2013)	Ensemble ANN	280.5	-
	Proposed WHF	69.28	3.90

Source: The authors.

The models and parameters to the exchange rate time series are given in the Table 2, according to steps of the proposed method.

Table 2: Models and parameters to exchange rate methodology.

	Variable	Values
ARIMA


Adaptative Parameters		1.014978
		0.88357
		0.942313

Source: The authors.

The predicting errors, being produced in Step 3, was decomposed by means of a wavelet decomposition of level 2, with orthonormal basis of Daubechies 45. The ANN input patterns has five observation of each decomposition and the hidden layer has six neurons.

The plot of the out-of-sample actual values and their hybrid forecasts of exchange rate can be visualized in Figure 6.

Figure 6: Hybrid forecasts and actual values of out-of-sample.

Table 3 exhibits the out-of-sample MSE and MAE adherence statistics regarding the forecasts of competing predictive methods; and the proposed WHF method for the time series of exchange rate.

Table 3: MAE and MSE of out-of-sample values exchange rate time series.

References	METHODS	52 steps-ahead
References	METHODS	MSE	MAE
(ZHANG, 2003)	ARIMA	4.52977x10^-5	0.005397
	ANN	4.52657x10^-5	0.0052513
	Hybrid	4.35907x10^-5	0.0051212
(KHASHEI; BIJARI, 2011)	ANN/ARIMA	3.64774x10^-5	0.0049691
	Proposed WHF	4.23 x10^-6	0.000907

Source: The authors.

The necessary information to application of the proposed method for Canadian lynx time series is shown in Table 4.

Table 4: Models and parameters to Canadian lynx methodology.

	Variable	Values
ARIMA


Adaptative Parameters		0.998526
		1.001407
		0.990531

Source: The authors.

The Canadian lynx error was decomposed by means of a wavelet decomposition of level 2, with Haar orthonormal basis. The ANN input patterns has five observation of each decomposition and the hidden layer has five neurons. The results can be seen in Figure 7 and the adherence statistics in Table 5

Figure 7: Hybrid forecasts and actual values of out-of-sample.

Table 5: MAE and MSE of out-of-sample values Canadian lynx time series.

References	METHODS		14 steps-ahead
			MSE		MAE
Zhang (2003)	ARIMA		0.020486		0.112255
	ANN		0.020466		0.112109
	Hybrid		0.017233		0.103972
Khashei e Bijari (2011)	ANN/ARIMA		0.00999		0.085055
(ADHIKARI; AGRAWAL, 2013)	SVM		0.05267	-
	Ensemble ANN		0.00715	-
		Proposed WHF	0.000981		0.018828

Source: The authors.

5. CONCLUSIONS

Comparisons in Table 1, 3 and 5 clearly proved that the proposed WHF method achieved remarkably better results than any of other predictive methods cited in this paper, on out-of-sample performance measures.

According to Figure 5, 6 and 7, the observed values and the predictions produced by the proposed method over the out-of-sample period are strongly correlated, meaning that a high predictive power was achieved in the Sunspot, Exchange rate and Canadian lynx data application.

It is also worth pointing out that despite the relative complexity of the mathematical techniques that integrate the proposed methodology, described in Section 2, its implementation is indeed relatively straightforward with use of appropriate software such as E-Views 8 and MATLAB R2013a software.

REFERENCES

ADHIKARI, R.; AGRAWAL, R. K. (2013) A Homogeneous Ensemble of Artificial Neural Networks for Time Series Forecasting. International Journal of Computer Applications, v. 32, n. 7, p. 8..

BATES, J. M.; GRANGER, C. W. J. (1969) The Combination of ForecastsJournal of the Operational Research Society.

DONOHO, D. L.; JOHNSTONE, J. (1994) M. Ideal spatial adaptation by wavelet shrinkage. Biometrika, v. 81, n. 3, p. 425–455.

FIRMINO, P. R. A.; DE MATTOS NETO, P. S. G.; FERREIRA, T. A. (2015) E. Error modeling approach to improve time series forecasters. Neurocomputing, v. 153, p. 242–254.

HAMILTON, J. D. (1994) Time Series Analysis,1ed. New Jersey : Princeton University Press.

HAVEN, E.; LIU, X.; SHEN, L. (2012) De-noising option prices with the wavelet method. European Journal of Operational Research, v. 222, n. 1, p. 104–112.

HAYKIN, S. S. (2001) Redes Neurais, 2ed. Porto Alegre: Bookman.

KHASHEI, M.; BIJARI, M. A. (2011) New Hybrid Methodology for Nonlinear Time Series Forecasting. Modelling and Simulation in Engineering, v. 2011, p. 1–5.

KUBRUSLY, C. S. (2009) The Elements of Operator Theory, 2 ed. New York: Birkhäuser.

KUBRUSLY, C. S.; LEVAN, N. (2006) Abstract wavelet generated by hilbert space shift operators. Adavances in mathematical Sciences and applications, v. 16, p. 643–660.

LEVAN, N.; KUBRUSLY, C. S. (2003) A wavelet “time-shift-detail” decomposition. Mathematics and Computers in Simulation, v. 63, n. 2, p. 73–78.

LIU, L.-M. (2006) Time Series Analysis and Forecasting. second ed. Chicago, IL: Scientific Computing Associates Corporation.

LUTKEPOHL, H.( 2006) Forecasting with VARMA Models. In: Handbook of Economic Forecasting. [s.l.] Elsevier, v. 1, p. 287–325.

MALLAT, S. (2011) A Wavelet Tour of Signal Processing: The Sparse Way, 3 ed. Burlington: Elsevier Inc.

RAGSDALE, C. (2004) Spreadsheet Modeling & Decision Analysis: A Practical Introduction to Management Science. Fourth edi ed. [s.l.] South-Western.

TEIXEIRA JR, L. A. et al. (2015) Artificial Neural Network and Wavelet decomposition in the Forecast of Global Horizontal Solar Radiation. Sobrapo, v. 35, n. 1, p. 1–16.

ZHANG, G. P. (2003) Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing, v. 50, p. 159–175.