Combination Forecasting Model for Predicting the Shelf Life of Two-State Materials Based on Support Vector Machine

A combination forecasting model based on the Support Vector Machine (SVM), whose objective is to minimize the structural risk, is proposed. Two-state materials in storage tend to fail immediately without any recognizable defects prior to the failure, which increases the difficulty of forecasting, so combination forecasting models are often used to improve the prediction. The core idea of previous combination forecasting models, such as those based on forecasting error and those based on nonlinear weighted averages, is to find the optimal weights, but the structure of the forecasting model is fixed. In this paper, three single forecasting models with completely different forecast mechanisms are chosen: the Weibull distribution statistic method, the BP neural network prediction method and the SPFM (Sliding Polynomial Fitting Method). The results of the single forecasting methods are used as the training set of the SVM. Using the libsvm toolbox, we obtain the nonlinear mapping functions that have the minimum structural risk. Finally, a simulation using data from the Petroleum Center is conducted to verify the model.


INTRODUCTION
Due to a variety of environmental stresses, two-state materials in storage tend to fail immediately without any recognizable defects prior to failure. Typical two-state materials include electronic products, electromechanical products and POL (Petroleum, Oil and Lubricants) products. The mechanism of storage failure is complicated and nonlinear owing to environmental, management and design factors, and can hardly be predicted. If only one forecasting method is selected, we run the risk of selecting the wrong one. Meanwhile, different models have different strengths and weaknesses; they are interrelated and complement each other.

Bates and Granger [1] were the first to study combination forecasting systematically. Over the last few decades, combination forecasting has become an important research topic. According to the forecast criterion used, combination forecasting models can be divided into the following categories: the least variance method [2], the unconstrained least squares method [3], the constrained least squares method [4], the Bayes method [5], combination forecasting methods based on different criteria and standards, and the recursive combination forecasting method. F. Geng proposed a superior combination forecasting criterion that minimizes the error sum of squares without non-negativity constraints, and used the properties of the absolute error information matrix to explain that the unweighted average method can achieve a superior combination forecast. X. W. Tang [6] studied the error bounds of combination forecasting. Considering the standard deviation of predictive precision, H. Y. Chen [7] developed a combination forecasting model with better prediction precision, in which the weight coefficients are calculated by linear programming. At present, different weights are assigned to each single forecasting method [8][9][10][11][12][13][14][15][16][17][18][19][20], and the common feature is that the structure of the forecasting model is fixed no matter what the values of the weight coefficients are.

The original SVM (support vector machine) was invented in 1963 by Vladimir N. Vapnik and Alexey Ya. Chervonenkis. An SVM constructs a hyperplane or set of hyperplanes in a high- or infinite-dimensional space, which can be used for regression or forecasting. The SVM method can reduce the structural risk of the combination forecasting model. In order to improve the forecasting accuracy, this paper uses the Weibull distribution statistic method, the BP neural network method and the SPFM to obtain the reliability of two-state materials respectively, and then uses these forecasting outcomes as training data for the SVM model to calculate the 90% reliable life of two-state materials.

SINGLE FORECASTING MODELS
Two-state materials in storage are either in a good state or in a failure state, and the storage failure data obtained from annual technical inspection mainly consist of three items: storage age, sample size and number of failures in the sample. Considering the storage failure characteristics of two-state materials, the Weibull distribution statistic method, the BP neural network method and the SPFM are usually used to forecast their shelf life, and the predictive mechanisms of the three methods are quite different. Each method can be used on its own to predict the shelf life of two-state materials.

Weibull Distribution Statistic Method
The statistic method used in this paper is the two-parameter Weibull distribution. This probability distribution was initially developed by Waloddi Weibull, professor of Applied Physics at the Royal Institute of Technology in Stockholm, Sweden, and is widely applied in reliability, fatigue testing and quality control. The distribution equation gives the probability of shelf life in terms of a shape parameter and a scale parameter. The best method for estimating these parameters is the Maximum Likelihood (ML) method, which yields a solution through simultaneous equations. Generally, the unknown parameters are determined by applying the ML method to the extreme value distribution, whose parameters are related to those of the Weibull distribution. One problem is that the constraining equations Eq. (2) and Eq. (3) are not linear, so they must be solved iteratively by computer. Well-known iterative methods for this purpose are the Newton-Raphson method and the EM (Expectation Maximization) method.
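The iterative ML solution described above can be illustrated with a minimal Python sketch. It is not the paper's procedure: it assumes complete (uncensored) failure times rather than the censored inspection data, it works with the Weibull score equation directly instead of the extreme value form, and the names `weibull_mle` and `reliable_life` as well as the synthetic sample are hypothetical. The reliability function is taken in its standard form R(t) = exp(-(t/eta)^beta), with beta the shape and eta the scale parameter.

```python
import math
import random


def weibull_mle(times, beta0=1.0, tol=1e-10, max_iter=100):
    """Estimate (shape, scale) of a two-parameter Weibull distribution by ML.

    Assumes complete (uncensored) failure times, which is a simplification
    of the censored-data case treated in the paper.
    """
    n = len(times)
    logs = [math.log(t) for t in times]
    mean_log = sum(logs) / n
    beta = beta0
    for _ in range(max_iter):
        powers = [t ** beta for t in times]
        s0 = sum(powers)
        s1 = sum(p * l for p, l in zip(powers, logs))
        s2 = sum(p * l * l for p, l in zip(powers, logs))
        # Score equation for the shape parameter: g(beta) = 0.
        g = s1 / s0 - 1.0 / beta - mean_log
        # Derivative of g with respect to beta (always positive).
        dg = (s2 * s0 - s1 * s1) / (s0 * s0) + 1.0 / (beta * beta)
        step = g / dg
        new_beta = beta - step  # Newton-Raphson update
        # Safeguard: halve the step until the shape parameter stays positive.
        while new_beta <= 0:
            step /= 2
            new_beta = beta - step
        converged = abs(new_beta - beta) < tol
        beta = new_beta
        if converged:
            break
    # The scale parameter follows in closed form once the shape is known.
    eta = (sum(t ** beta for t in times) / n) ** (1.0 / beta)
    return beta, eta


def reliable_life(beta, eta, reliability=0.9):
    """Time at which R(t) = exp(-(t / eta) ** beta) falls to `reliability`."""
    return eta * (-math.log(reliability)) ** (1.0 / beta)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic failure times (scale 10, shape 2); not the paper's data.
    sample = [rng.weibullvariate(10.0, 2.0) for _ in range(2000)]
    shape, scale = weibull_mle(sample)
    print(shape, scale, reliable_life(shape, scale))
```

Calling `reliable_life` with a reliability of 0.9 gives the kind of 90 percent reliable life that the case study reports, although the figures printed here come from synthetic data only.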

Back Propagation (BP) Network Method
An Artificial Neural Network (ANN) is a non-linear computational method based on the local sensitivity of human brain neurons. It is widely applied to nonlinear, uncertain or unknown complex engineering problems that lack an explicit understanding of the physical mechanism, such as pattern recognition, fault diagnosis and trend prediction. ANNs such as the BP network, the RBF network and the Elman network are commonly used in prediction problems. The RBF network needs a large number of sample data, but the sample data obtained in practice are limited. The Elman network is a simple recurrent network that has a context layer as an internal self-referencing layer, which enhances its ability to predict time-varying systems. However, the shelf life of most two-state materials is quite stable. We therefore conclude that the BP network is suitable for this task.
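As an illustration of how a BP network learns a reliability curve, the following Python sketch trains a network with one sigmoid hidden layer by plain gradient descent. It is a sketch under stated assumptions, not the network used in the paper: the single input (a scaled storage age), the learning rate, the epoch count, the function name `train_bp` and the data in the usage note are all hypothetical, and only the choice of 6 hidden nodes mirrors the case study.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_bp(xs, ys, hidden=6, lr=0.5, epochs=2000, seed=0):
    """Train a 1-input, 1-output network with one sigmoid hidden layer."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-1, 1) for _ in range(hidden)]  # input -> hidden weights
    b1 = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden biases
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden -> output weights
    b2 = rng.uniform(-1, 1)                           # output bias
    history = []                                      # mean squared error per epoch
    for _ in range(epochs):
        sse = 0.0
        for x, y in zip(xs, ys):
            # Stage 1: feed forward of the input pattern.
            h = [sigmoid(w1[j] * x + b1[j]) for j in range(hidden)]
            out = sigmoid(sum(w2[j] * h[j] for j in range(hidden)) + b2)
            # Stage 2: calculation and back propagation of the error.
            err = out - y
            sse += err * err
            d_out = err * out * (1.0 - out)
            d_hid = [d_out * w2[j] * h[j] * (1.0 - h[j]) for j in range(hidden)]
            # Stage 3: adjustment of the weights (gradient descent step).
            for j in range(hidden):
                w2[j] -= lr * d_out * h[j]
                w1[j] -= lr * d_hid[j] * x
                b1[j] -= lr * d_hid[j]
            b2 -= lr * d_out
        history.append(sse / len(xs))

    def predict(x):
        h = [sigmoid(w1[j] * x + b1[j]) for j in range(hidden)]
        return sigmoid(sum(w2[j] * h[j] for j in range(hidden)) + b2)

    return predict, history
```

A possible use, with made-up values, is `predict, history = train_bp([0.0, 0.2, 0.4, 0.6, 0.8, 1.0], [1.0, 0.99, 0.98, 0.96, 0.93, 0.90])`, after which `history` plays the role of an error trend curve and `predict` returns a reliability estimate for a scaled storage age.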

Sliding Polynomial Fitting Method (SPFM)
The SPFM is based on the PFM (Polynomial Fitting Method) and follows a similar principle. The essence of the method is least-squares estimation, which is carried out as a matrix calculation.
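The least-squares matrix calculation behind a sliding polynomial fit can be sketched in Python as follows. The text does not give the window length, the polynomial degree or the forecasting step, so the values `window=5` and `degree=2`, the one-step-ahead scheme and all function names are assumptions made for illustration only.

```python
def polyfit_ls(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest order first."""
    m = degree + 1
    # Normal equations (A^T A) c = A^T y for the Vandermonde matrix A.
    ata = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for row in range(col + 1, m):
            factor = ata[row][col] / ata[col][col]
            for k in range(col, m):
                ata[row][k] -= factor * ata[col][k]
            aty[row] -= factor * aty[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(ata[i][k] * coeffs[k] for k in range(i + 1, m))
        coeffs[i] = (aty[i] - tail) / ata[i][i]
    return coeffs


def polyval(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))


def sliding_poly_forecast(xs, ys, window=5, degree=2):
    """One-step-ahead forecasts: fit each window, extrapolate to the next x."""
    forecasts = []
    for start in range(len(xs) - window):
        wx = xs[start:start + window]
        wy = ys[start:start + window]
        # Shift x so each window starts at zero, which keeps the system well scaled.
        coeffs = polyfit_ls([x - wx[0] for x in wx], wy, degree)
        target = xs[start + window]
        forecasts.append((target, polyval(coeffs, target - wx[0])))
    return forecasts
```

Because the fit is a pure polynomial, nothing in this sketch keeps a forecast reliability within the range from 0 to 1.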

Introduction to the Combination Forecasting Model
The combination forecasting model can take full advantage of every single forecasting method to improve the predictive precision. In the n-model combination case, the combined forecast value is obtained from the n single-model forecasts, and the reference series is the technical inspection data of a given two-state material, indexed by the number of technical inspections. The aim of combination forecasting is to find a mapping function, from the single forecasts to the combined forecast, that makes the Euclidean norm of the difference between the combined forecast and the inspection data as small as possible. The key point is how to find a proper mapping function. The usual combination forecast method defines the mapping function as a weighted mean; in other words, it defines a weight vector whose elements sum to one.
This method has a flaw. Each element of the weight vector is a constant, but the shelf life of two-state materials depends on many factors, including environmental, design and management factors, which can hardly be represented by a constant. Thus, the mapping function should be defined as a non-linear function.
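The fixed-weight combination criticized above can be made concrete with a short Python sketch. The function names and the inverse-squared-error weighting heuristic are illustrative assumptions rather than the method of any cited reference. The point is that the weights are constants, so the structure of the combined model never changes.

```python
import math


def combine_weighted(forecasts, weights):
    """Fixed-weight combination: forecasts[i][t] is model i at inspection t."""
    total = sum(weights)
    norm = [w / total for w in weights]  # rescale so the weights sum to one
    horizon = len(forecasts[0])
    return [sum(w * f[t] for w, f in zip(norm, forecasts)) for t in range(horizon)]


def euclidean_error(combined, observed):
    """Euclidean norm of the difference between forecast and inspection data."""
    return math.sqrt(sum((c - o) ** 2 for c, o in zip(combined, observed)))


def inverse_error_weights(forecasts, observed):
    """A common heuristic (assumed here): weight by inverse squared error."""
    sse = [sum((f - o) ** 2 for f, o in zip(model, observed)) for model in forecasts]
    return [1.0 / max(s, 1e-12) for s in sse]
```

For example, `combine_weighted([[1.0, 0.9], [0.8, 0.7]], [1, 1])` averages the two hypothetical forecast series point by point, and `euclidean_error` then measures how far the result lies from the inspection data.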

Combination Forecast Model of SVM
The combination forecast model of SVM uses the inner-product kernel function of the SVM as the mapping function. SVM minimizes both the empirical risk and the confidence interval. The basic idea of SVM is to map data that are inseparable in a low-dimensional space into a high-dimensional feature space. The algorithm model is given in Eq. (4), which contains a precision error together with relaxation factors (slack variables) that account for the fitting error. Based on the theorem of optimal separating hyperplanes, the equivalent optimal regression problem is Eq. (5), in which a penalty coefficient controls the degree of penalty applied when a sample exceeds the precision error. The problem is transformed into its dual by the Lagrange method, Eq. (6); for the pair of Lagrange multipliers, the objective function to be maximized is Eq. (7), and the resulting regression function is Eq. (8). The samples whose Lagrange multipliers are nonzero are called "support vectors". If the inner product in Eq. (4) is replaced with a kernel function, the inner-product operation is transformed into the non-linear fitting function Eq. (9), which is defined by the coefficients, the support vectors and a threshold, and can be solved by the least-squares method. Considering the characteristics of storage, the RBF is chosen as the kernel function, Eq. (10), whose width parameter adjusts the radial range of the RBF.
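For reference, the steps described above correspond to the standard textbook form of epsilon-insensitive support vector regression. The notation below is an assumption, since the original symbols are not reproduced in this text, and the paper's own Eq. (4) to Eq. (10) may differ in detail. Here $\varepsilon$ is the precision error, $\xi_i$ and $\xi_i^*$ are the relaxation factors, $C$ is the penalty coefficient, $\alpha_i$ and $\alpha_i^*$ are the Lagrange multipliers, $b$ is the threshold and $\sigma$ is the width parameter of the RBF kernel.

```latex
% Primal problem: flatness term plus penalized violations of the epsilon tube
\min_{w,\,b,\,\xi,\,\xi^*}\ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{l}\left(\xi_i + \xi_i^*\right)
\quad\text{s.t.}\quad
\begin{cases}
y_i - w\cdot x_i - b \le \varepsilon + \xi_i,\\
w\cdot x_i + b - y_i \le \varepsilon + \xi_i^*,\\
\xi_i \ge 0,\ \xi_i^* \ge 0, \quad i = 1,\dots,l.
\end{cases}

% Dual problem obtained by the Lagrange method
\max_{\alpha,\,\alpha^*}\
-\frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}(\alpha_i-\alpha_i^*)(\alpha_j-\alpha_j^*)\,K(x_i,x_j)
-\varepsilon\sum_{i=1}^{l}(\alpha_i+\alpha_i^*)
+\sum_{i=1}^{l} y_i(\alpha_i-\alpha_i^*)
\quad\text{s.t.}\quad
\sum_{i=1}^{l}(\alpha_i-\alpha_i^*) = 0,\quad 0\le\alpha_i,\ \alpha_i^*\le C.

% Non-linear regression function; samples with \alpha_i - \alpha_i^* \neq 0 are the support vectors
f(x) = \sum_{i=1}^{l}(\alpha_i-\alpha_i^*)\,K(x_i,x) + b

% RBF kernel with width parameter \sigma
K(x_i,x) = \exp\!\left(-\frac{\|x-x_i\|^2}{2\sigma^2}\right)
```

With the linear kernel $K(x_i,x)=x_i\cdot x$ the regression function reduces to the linear case, which is why replacing the inner product with a kernel is enough to obtain the non-linear mapping.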

ANALYSIS OF A CASE
Many types of POL products are typical two-state materials. A continuing study was conducted in cooperation with the Petroleum Center, which routinely conducts shelf life testing on various stored POL products. The laboratory shelf life data for a Silicone Brake Fluid that meets specification MIL-B-46176 are listed in Table 1.

Weibull Statistic Model
We substitute the shelf life data in Table 1 into Eq. (1)~Eq. (3) and solve them by the Newton-Raphson method. The results give the shape parameter and the scale parameter, and the 90 percent reliable life estimate, with two-sided 90 percent confidence intervals, is 6.3. Fig. (1) shows the probability shelf life diagram obtained for the silicone brake fluid product evaluated.

BP Network
Given the sample data in Table 1, we initialize a BP network with 6 hidden-layer nodes. The training of a BP network by back propagation involves three stages: the feed forward of the input training pattern, the calculation and back propagation of the associated error, and the adjustment of the weights. We use the 10th year's shelf life data to verify the trained network; the result is 0.9851, whereas the expected result is nearly 1, so the error rate is 1.49%. A bi-cubic interpolation algorithm was applied to calculate the reliability values for the 3rd, 5th, 7th and 9th years. These data and the training data are marked with asterisks, and the forecast reliability values are marked with circles, in Fig. (3). Fig. (3) shows that the prediction errors for the 9th and 10th years are -0.79% and -1.51% respectively, which are larger than the others. The result indicates that the 90 percent reliable life is 4.1807 years.

For the SVM combination model, the MAPE curve over the RBF parameter shows that an under-fitting problem occurs toward one end of the range and an over-fitting problem toward the other. The parameter is then set accordingly, followed by the selection of the error accuracy.

Analysis
The prediction results and error analysis of the three single forecast methods and the combination forecast method are listed in Table 2. Comparative analysis suggests that:
1. The result of the combination forecast has higher accuracy than the other three methods, which demonstrates the feasibility of the combination forecast method.
2. When the samples are insufficient, the statistic method leads to an obvious error.
3. The forecast accuracies of the BP network and the SPFM are at the same level.

CONCLUSION
The combination forecast model of SVM is an effective means of forecasting the shelf life of two-state materials, provided that the parameters and the kernel function are chosen properly. Although its result is better than that of any single forecast model, the combination forecast model has some deficiencies. Its accuracy relies on the accuracy of the single forecast models, but the training of the BP network tends not to converge when the samples contain a run of consecutive 0 values, and the SPFM can produce predicted values greater than 100 percent because it takes no account of physical meaning. These problems will be addressed in future work.
For the Weibull statistic method, the parameters are estimated through the extreme value distribution, which has its own pair of parameters. The ML equation and the two partial differential equations of the extreme value distribution are described in Eq. (1)~Eq. (3). The extreme value log-likelihood equation covers both failure data and censored data; its terms are the sum over the failure data, the sum over the censored data and the number of failures in a sample.
For the BP network, the back propagation of error and the adjustment of the weights were repeated until the training finished after 88 iterations. The 10th year's shelf life data was chosen to verify the accuracy of the BP network. Fig. (2) shows the trend curve of the associated error.

A MATLAB program was designed to implement the SPFM mentioned in section 1.3. Fig. (4) displays the expected values, marked with asterisks, and the predicted values, marked with circles.

Combination Forecasts of SVM
The RBF was selected as the kernel function of the SVM, and a simulation was conducted with the libsvm toolbox developed by Professor Chih-Jen Lin at National Taiwan University. The parameter of the RBF has a great influence on the generalization ability of the model. Fig. (5) shows the mean absolute percentage error (MAPE) as this parameter ranges from 1 to 1000.
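A parameter sweep of this kind can be sketched in Python as follows. The MAPE formula is the usual one. The function `sweep_parameter` and its `fit_predict` callback are hypothetical stand-ins for training the SVM combination model at a given RBF parameter value, and no libsvm call is shown.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    terms = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(terms) / len(terms)


def sweep_parameter(candidates, fit_predict, actual):
    """Return (best_value, best_mape, curve) over candidate parameter values.

    `fit_predict` is a hypothetical callback: given a parameter value, it
    trains the combination model and returns predictions aligned with `actual`.
    """
    curve = [(value, mape(actual, fit_predict(value))) for value in candidates]
    best_value, best_mape = min(curve, key=lambda item: item[1])
    return best_value, best_mape, curve
```

Passing logarithmically spaced candidates such as `[1, 10, 100, 1000]` yields a `curve` that can be plotted like a MAPE-versus-parameter figure, with high MAPE at either end indicating under-fitting or over-fitting.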