Bond Price Prediction

This is part of a group project I did in the course STAT447. This part of the material is not covered in lectures and was done entirely by myself. I believe it was a nice experience in exploring things on my own, so I am posting the detailed version here on my personal blog.


The dataset we used is from a Kaggle competition (link here). In our group project, we tried several models: ordinary least squares (OLS), weighted least squares (WLS), a regression tree, a quantile random forest, and a time series model. It turns out that the time series model is the best. A time series model, in my view, should work well in basically two cases: when the process naturally has periodic behavior (e.g., precipitation), or when there is dependence on previous stages, as in financial data, where people always look at past performance when making decisions.

The dataset has very high usability, with only 2.3% missing values. It contains 14 categorical variables and 48 numerical variables. The variable set consists of two parts: the true variables and the generated variables. The true variables are

and the generated variables are simply the variables above at different times. One interesting variable is trade_type, a binary variable indicating whether a deal is made by an individual or a dealer. We should expect the performance of individuals to be worse than that of dealers: they buy high and sell low more often.

Plotting the series, we see the following,

where the curve-based price is computed from the theoretical discounting formula

P = Fr \frac{1 - (1+i)^{-n}}{i} + C(1+i)^{-n}

where F is the book value (a fixed amount), r is the coupon rate (so Fr is the fixed coupon payment), i is the interest rate, n is the number of remaining payments, and C is the final payment. The plot above indicates two things: first, the series is not stationary, since it has an obvious trend (i.e., it is not mean-stationary); second, the right graph shows extreme heteroscedasticity. This in turn implies two things: it is not valid to apply the usual ARMA model to the original price data (stationarity is required), and there must be a way to deal with the heteroscedasticity.
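As a quick sanity check on the formula, here is a small R snippet; the numbers are made up purely for illustration, not taken from the dataset:

```r
# Discounted cash flow price of a bond; arguments match the formula above
# (face = F, coupon = r, i = interest rate, n = remaining payments, final = C)
curve_price <- function(face, coupon, i, n, final) {
  face * coupon * (1 - (1 + i)^(-n)) / i + final * (1 + i)^(-n)
}

# Illustrative values: 5% coupon, face value 100, 10 annual payments, 3% rate;
# the bond prices above par because the coupon rate exceeds the interest rate
curve_price(face = 100, coupon = 0.05, i = 0.03, n = 10, final = 100)
# [1] 117.0604
```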


So the first idea for dealing with the non-stationarity is differencing, which gives ARIMA. In fact, it took almost 20 rounds of differencing to see stationarity, so this idea was abandoned. Also, the ACF plots mostly show a shape that does not give any useful information for parameter determination.


Next, I turned to dealing with the heteroscedasticity. The difference between the curve-based price and the actual trading price seems stationary (the right graph). The model suggested by the TA in this course is then the ARMA-GARCH model.

One important thing is that people at time t only have access to data up to time t - 1, so the response variable should be

P_{t-1} - P_{\text{curve}, t}

The ARMA(p_1, q_1)-GARCH(p_2, q_2) model has the form

Y_t = \sum_{i=1}^{p_1} \alpha_i Y_{t-i} + \sum_{j=1}^{q_1} \beta_j e_{t-j} + e_t, \quad e_t \sim N(0, \sigma_t^2)


and the GARCH part is

\sigma_t^2 = \omega + \sum_{k=1}^{q_2} \gamma_k e_{t-k}^2 + \sum_{l=1}^{p_2} \theta_l \sigma_{t-l}^2

where \omega is a positive constant.


This GARCH part can handle the heteroscedasticity since the conditional variance is regressed not only on the previous squared residuals but also on their corresponding variances.
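To see what such a process looks like, here is a minimal base-R simulation of an AR(1) mean equation with a GARCH(1,1)-type variance; all coefficients are arbitrary illustrative values, not estimates from the bond data:

```r
set.seed(447)
n <- 1000
alpha1 <- 0.6                                # AR(1) coefficient in the mean
omega <- 0.1; gamma1 <- 0.3; theta1 <- 0.5   # GARCH coefficients (illustrative)

y <- e <- numeric(n)
sigma2 <- numeric(n)
sigma2[1] <- omega / (1 - gamma1 - theta1)   # start at the unconditional variance
e[1] <- rnorm(1, sd = sqrt(sigma2[1]))
y[1] <- e[1]

for (t in 2:n) {
  # the variance regresses on the previous squared shock and the previous variance
  sigma2[t] <- omega + gamma1 * e[t - 1]^2 + theta1 * sigma2[t - 1]
  e[t] <- rnorm(1, sd = sqrt(sigma2[t]))
  y[t] <- alpha1 * y[t - 1] + e[t]           # AR(1) mean equation
}

plot(y, type = "l")   # the volatility clustering from the GARCH part is visible
```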

Now the only problem is to choose the best hyperparameters, which are p_1, q_1, p_2, q_2. The time series model differs from the OLS and WLS models: its training and test sets cannot be obtained by random draws, since the data are time-dependent, and you can’t fit a time series without a time index. So I restructured the data into a time series for each bond, then took the first 60% of the time records as the training set and the later 40% as the test set. I also added one more restriction:

p_1 + p_2 + q_1 + q_2 < 6


This type of constraint comes from concerns about overfitting, and also because only 10 previous data points are available in the test set, which means we can have at most 10 parameters for prediction. Notice that we also need to predict \sigma^2, so to get a reasonable prediction we give one parameter to each lag and then add one more degree of freedom for each process. I also considered adding external regressors, so the code is done in two parts, as shown below. A prediction for a single bond looks like

where the yellow band is the confidence interval. Notice that in the following algorithm the forecasting method is a one-step-ahead fit; to be more specific, it is a rolling one-step-ahead forecast.

Model Comparison Criteria

In the whole project, we computed several criteria: the average length of the confidence interval with alpha = 0.5 and 0.8, the coverage rate, and the interval score with alpha = 0.5 and 0.8. I will introduce these in more detail if I find the time. After obtaining the model, we use the corresponding historical data to predict one step forward, and then base the next prediction on the predicted value, until the desired number of predictions (steps ahead) is obtained.
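For reference, the interval score we computed is, I believe, the standard one (Gneiting and Raftery, 2007): for a central (1 - \alpha) prediction interval [l, u] and a realized value y,

S_\alpha(l, u; y) = (u - l) + \frac{2}{\alpha}(l - y)\mathbf{1}\{y < l\} + \frac{2}{\alpha}(y - u)\mathbf{1}\{y > u\}

so narrow intervals are rewarded, misses are penalized, and lower is better.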

Coding up the model with the rugarch package

The R package I used is rugarch. The code is divided into two parts: one fits ARMA-GARCH only and tries all the different combinations of p and q under the total-order constraint of 6 above; the other tries adding external regressors, which were chosen to be trade_size, trade_type, time_to_maturity, and reporting_delay.

Combinations of hyperparameter candidates
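A minimal sketch of how the candidate set can be generated under the constraint above (my reconstruction, not necessarily the original chunk):

```r
# Enumerate all (p1, q1, p2, q2) with p1 + q1 + p2 + q2 < 6
grid <- expand.grid(p1 = 0:5, q1 = 0:5, p2 = 0:5, q2 = 0:5)
grid <- grid[rowSums(grid) < 6, ]
nrow(grid)  # 126 candidate order combinations
```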

The function for criteria
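A sketch of how the criteria can be computed (again a reconstruction); l and u are vectors of lower and upper interval bounds, y the realized values, and alpha the miscoverage level:

```r
# Average interval length
interval_length <- function(l, u) mean(u - l)

# Fraction of realized values falling inside the interval
coverage_rate <- function(l, u, y) mean(y >= l & y <= u)

# Interval score as defined above; logicals coerce to 0/1 in the products
interval_score <- function(l, u, y, alpha) {
  mean((u - l) +
       (2 / alpha) * (l - y) * (y < l) +
       (2 / alpha) * (y - u) * (y > u))
}
```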

Some helper functions
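One helper that is certainly needed somewhere is the chronological 60/40 split for each bond's series; a plausible sketch:

```r
# Split one bond's series in time order: first 60% train, last 40% test
split_series <- function(y, train_frac = 0.6) {
  n_train <- floor(length(y) * train_frac)
  list(train = y[seq_len(n_train)], test = y[-seq_len(n_train)])
}
```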

Then the model with no external regressor is
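A sketch of the specification via rugarch::ugarchspec; note that in rugarch, garchOrder gives the ARCH order first and the GARCH order second:

```r
library(rugarch)

# ARMA(p1, q1)-GARCH spec with normal innovations and no external regressor
make_spec <- function(p1, q1, arch, garch) {
  ugarchspec(
    variance.model = list(model = "sGARCH", garchOrder = c(arch, garch)),
    mean.model     = list(armaOrder = c(p1, q1), include.mean = TRUE),
    distribution.model = "norm"
  )
}
```

For the external-regressor variant, ugarchspec accepts external.regressors = as.matrix(x) inside mean.model, and ugarchforecast then takes the future regressor values through external.forecasts = list(mregfor = ...).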

Then the fitting process
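A sketch of fitting one bond's series and producing the rolling one-step-ahead forecasts described above, using rugarch's out.sample/n.roll mechanism (which conditions each forecast on the realized test observations up to that point):

```r
# y: one bond's full response series in time order; n_test: test set length
fit_and_forecast <- function(spec, y, n_test) {
  fit <- ugarchfit(spec, data = y, out.sample = n_test)  # fit on training part
  fc  <- ugarchforecast(fit, n.ahead = 1, n.roll = n_test - 1)
  list(mu  = as.numeric(fitted(fc)),   # one-step-ahead conditional means
       sig = as.numeric(sigma(fc)))    # one-step-ahead conditional sigmas
}
```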

The performance evaluation function
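A sketch tying the pieces together: normal prediction intervals built from the forecast mean and sigma, scored with the criteria functions above. I assume here that alpha denotes the miscoverage level; the project's alpha may denote the coverage level instead:

```r
# mu, sig: rolling one-step forecasts; y_test: realized test values
evaluate <- function(mu, sig, y_test, alpha = 0.5) {
  z <- qnorm(1 - alpha / 2)  # normal quantile for a central (1 - alpha) interval
  l <- mu - z * sig
  u <- mu + z * sig
  c(avg_length = interval_length(l, u),
    coverage   = coverage_rate(l, u, y_test),
    int_score  = interval_score(l, u, y_test, alpha))
}
```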

Results

The distribution of performance across the different combinations is

The scale of the graph above shows that all combinations of p and q have very similar performance. The best model is ARMA(3,1)-GARCH(1,0) with trade_size as an external regressor.

Summary

This was quite an interesting experience. Afterward, doing some research of my own, I found that this is a very popular model in financial modeling. The method applied here could be improved: it would be more precise if the treasury bond interest rates were merged into the dataset and we computed the curve price ourselves, instead of using the given curve-based price.

The full project is much longer than what I showed above; it starts with data cleaning and the necessary PCA and transformations. I have only shown the part I did myself. I have attached my code and the dataset here in case you are interested.

