Modelling and Forecasting the Volatility of Cryptocurrencies: A Comparison of Nonlinear GARCH-Type Models

This study sets out to model and forecast the cryptocurrency market by concentrating on several stylized features of cryptocurrencies. The results indicate the presence of an inherently nonlinear mean-reverting process, leading to the presence of asymmetry in the considered return series. Consequently, nonlinear GARCH-type models taking into account distributions of innovations that capture skewness, kurtosis and heavy tails constitute excellent tools for modelling returns in cryptocurrencies. Finally, it is found that, given the high volatility dynamics present in all cryptocurrencies, correct forecasting could help investors to assess the unique risk-return characteristics of a cryptocurrency, thus helping them to allocate their capital.


Introduction
Since the introduction of cryptocurrencies to the financial market, many researchers have sought to clarify their behaviour. Cryptocurrencies form a secured electronic cash system that enables people to transfer payments online. Moreover, they have no intrinsic value (Cheah and Fry, 2015) and do not promise any future payment. Some researchers (e.g. Yermack, 2015) do not consider them to be currencies at all, but rather speculative assets. Their main distinguishing feature is the absence of any legal and official authority to control cryptocurrency transactions, which makes them riskier than other assets in the market. This feature results in a highly volatile market: its average monthly volatility is higher than that of gold or of any set of currencies (see Ciaian et al., 2016; Dwyer, 2015).
In spite of its high volatility, this market is found by some researchers to offer diversification benefits for investors with short-horizon investment plans (Eisl et al., 2015; Brière et al., 2015). Urquhart (2017) also finds price clustering at round numbers in cryptocurrencies. A few studies maintain that persistence in the cryptocurrency market could be used as a basis for trading strategies that would make abnormal profits (e.g. Charfeddine and Maouchi, 2018; Caporale et al., 2018; Jiang et al., 2018). The seminal studies in this area are by Tiwari et al. (2018) and Bariviera (2017); they provide empirical evidence that return volatility displays long-memory characteristics. Hence, studying the volatility of cryptocurrencies is very important.
A handful of studies have investigated cryptocurrencies from two standpoints, volatility and asymmetry. On the one hand, the few studies that have begun to examine volatility focus mainly on linear GARCH-type models (Glaser et al., 2014; Gronwald, 2014). On the other, various studies have attempted to explain stylized features of financial time series, such as volatility clustering and time-varying asymmetric volatility. An influential study by Baur and Dimpfl (2018) claims that the asymmetry in this case is due to liquidity traders, who provide liquidity to the market and trade for reasons other than exploiting information. In the same way, a vast literature accounts for asymmetry using a range of GARCH-type models, building on the assumption that the errors are normally distributed (for instance, Pichl and Kaizoji, 2017; Katsiampa, 2017; Balcilar et al., 2017; Bariviera, 2017 and the references therein).
The debate about cryptocurrency has recently come into prominence again, with many arguing that the normal distribution is flawed in that it assumes symmetry (a loss is just as probable as a gain). However, investors are more averse to negative shocks resulting from underestimating extreme losses than they are to positive shocks from unexpected substantial gains. Consequently, the presence of long-range dependence in the kind of data generating process referred to above creates a tendency to inaccuracy in estimating the persistence of volatility. For instance, the presence of level shifts in a return series might appear as increased persistence (Diebold and Inoue, 2001). Very little attention, however, has focused on different distributions of innovations. Xiong and Idzorek (2011) and Ghalanos (2018) have demonstrated that accounting for the stylized features of financial time series in return modelling and optimization has a significant impact on asset-allocation decisions, especially when it comes to performance during a crisis. Early examples of research comparing the performance of GARCH-type models with different distributions of innovations include the work of Chu et al. (2017), in which the authors estimated the volatility of seven cryptocurrencies and concluded that the IGARCH (1, 1) model is the most appropriate for estimating Bitcoin volatility. Liu and Tsyvinski (2018) find that Bitcoin returns can be best fitted using a GARCH-type model with Student's t distributed innovations.
ISSN 1923-4023 E-ISSN 1923
Surprisingly, most previous studies of cryptocurrency volatility have used the price of Bitcoin or of a few other cryptocurrencies with a single conditional heteroskedasticity model or a single innovation distribution, specifically in an in-sample modelling framework. Only the study of Ngunyi et al. (2019) has sought to determine the most appropriate GARCH-type model as well as the best-fitting distribution for modelling the volatility of the returns of the major cryptocurrencies.
Drawing upon this fact, the present study extends the work of Ngunyi et al. (2019) and related research in three ways. First, because scrupulous investigations of the nonlinear statistical properties of most cryptocurrencies are, to the best of our knowledge, absent from recent works, this paper statistically tests whether nonlinearities in the time series behaviour of cryptocurrency returns arise because the data generating process is inherently a nonlinear mean-reverting process of the STAR type. In this regard, this paper raises the following question: do cryptocurrency returns follow the hypothesis of nonlinear mean reversion? To answer this question, we first use the non-parametric Triples test of Randles et al. (1980). The advantages of this test over the other available inference procedures are its good finite-sample properties and its robustness to outliers (see Eubank, LaRiccia and Rosenstein, 1992). For robustness, we borrow from the methods described in the financial cycle literature and deploy a third-order auxiliary regression, as in Luukkonen et al. (1988). The findings imply that a linear GARCH model is inadequate to characterize the behaviour of the series under review.
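As a rough illustration of the idea behind the Triples test, the sketch below computes only the point estimate of asymmetry on which the test is built: for every triple of observations it averages the signs of X_i + X_j - 2X_k over the three possible arrangements. The function name `triples_eta` and the toy samples are illustrative assumptions, and the variance estimate needed for the full test statistic of Randles et al. (1980) is deliberately left out.

```python
from itertools import combinations


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def triples_eta(x):
    """Average triple-based asymmetry indicator over all distinct triples.

    Positive values point to right skew, negative values to left skew,
    and a value of zero is what a perfectly symmetric sample produces.
    """
    total = 0  # integer sum of signs, divided only once at the end
    count = 0
    for a, b, c in combinations(x, 3):
        total += (sign(a + b - 2 * c)
                  + sign(a + c - 2 * b)
                  + sign(b + c - 2 * a))
        count += 1
    if count == 0:
        raise ValueError("need at least three observations")
    return total / (3.0 * count)


# Toy samples (hypothetical, not taken from the cryptocurrency data).
symmetric_sample = [-2, -1, 0, 1, 2]
right_skewed_sample = [0, 1, 2, 3, 50]

print(triples_eta(symmetric_sample))     # 0.0
print(triples_eta(right_skewed_sample))  # 0.2
```

The loop visits every triple, so the cost grows with the cube of the sample size; for a series of about two thousand observations a vectorised or subsampled version would be more practical.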
Second, this paper seeks to shed light on the significance of adopting nonlinear (asymmetrical) GARCH-type models with different innovation distributions, estimated over a more extended time period, which take into account fat tails, excess kurtosis, the Taylor effect and leverage effects. This enables us to investigate which conditional heteroskedasticity model can describe the asymmetrical (tail risk) properties of cryptocurrency returns. This raises two critical empirical questions: first, does the volatility of cryptocurrency returns display nonlinearity? And, second, do positive shocks increase volatility more than negative shocks? If the second question can be answered positively, how does this asymmetric effect of past shocks affect the persistence of volatility?
Finally, in the existing literature, the issue of the forecasting performance of nonlinear models is still open. Consequently, a comprehensive out-of-sample comparison is implemented to consider whether forecasting with nonlinear GARCH-type models with different distributions of innovations leads to essential improvements over forecasting with an incorrectly specified linear GARCH-type model. In this regard, we also make a comprehensive out-of-sample comparison between these individual nonlinear models and another type of nonlinear model, the Artificial Neural Network (ANN), which has been used successfully to predict the volatility of stock returns.
The next section describes the procedures and methods used in this investigation. Section 3 presents the estimation results. We conclude in section 4.

[...] (1)
and the error term, with the conditional variance of the returns, follows a process taking the form [...]. Glosten et al. (1993) introduced the GJR-GARCH (1,1) model to capture the long-lasting impact of a negative shock that may cause the asymmetric leverage effect in volatility. This model can be expressed as
[...] (2)
where the dummy controls the impact of the news (shocks) such that [...].

Other types of nonlinear GARCH models may be used to identify both shifts and rotations in the news impact curve, where the shift is the main source of asymmetry for small shocks while the rotation drives the asymmetry for large shocks. Following Hentschel (1995), this family of GARCH models can be generally expressed as
[...] (3)
where the shape is determined by one parameter, and a second parameter transforms the absolute value function, which is subject to rotations and shifts through two further parameters.

As a special case of Equation (3), Higgins and Bera (1992) proposed the nonlinear GARCH (NGARCH) model, in which a small shock is treated no differently from a large one and in which the rotations and shifts are zero. This can be given the following form:
[...] (4)
Another attractive nonlinear model is the one proposed by Engle and Ng (1993), in which the rotation parameter is eliminated. Thus
[...] (5)
Ding et al. (1993) present the Asymmetric Power ARCH (APARCH) model, which delivers a general class of volatility models that capture fat tails, excess kurtosis, the Taylor effect and leverage effects. Such a model can be expressed as follows:
[...] (6)
In Equations (2-6), two of the parameters measure, respectively, the persistence and the size effect of the shocks on volatility, whereas a third gives the sign effect and a fourth denotes the power term. Further, the estimated parameters should satisfy the following conditions: [...].

Having discussed nonlinear GARCH-class models, it may be helpful now to explain in greater detail the distributional behaviour for cryptocurrency optimisation and hedging risk.
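For orientation, the models named in this section are commonly written in the textbook forms sketched below. The notation is an assumption rather than something taken from the text: \varepsilon_t is the shock, \sigma_t^2 the conditional variance, z_t = \varepsilon_t / \sigma_t the standardised shock, \eta_1 the rotation and \eta_2 the shift of the news impact curve, and \lambda and \delta the two power parameters. The labels are descriptive only and are not meant to reproduce the numbering of Equations (2-6).

```latex
% Standard forms under assumed notation; z_t = \varepsilon_t / \sigma_t.
\begin{align*}
\text{GJR-GARCH(1,1):}\quad
  \sigma_t^2 &= \omega + \alpha\,\varepsilon_{t-1}^2
              + \gamma\, I_{t-1}\,\varepsilon_{t-1}^2 + \beta\,\sigma_{t-1}^2,
  \qquad I_{t-1} =
  \begin{cases} 1, & \varepsilon_{t-1} < 0 \\ 0, & \text{otherwise} \end{cases} \\
\text{Family GARCH:}\quad
  \sigma_t^{\lambda} &= \omega + \alpha\,\sigma_{t-1}^{\lambda}
     \bigl( |z_{t-1} - \eta_2| - \eta_1 (z_{t-1} - \eta_2) \bigr)^{\delta}
     + \beta\,\sigma_{t-1}^{\lambda} \\
\text{NGARCH } (\lambda = \delta,\ \eta_1 = \eta_2 = 0):\quad
  \sigma_t^{\delta} &= \omega + \alpha\,|\varepsilon_{t-1}|^{\delta}
     + \beta\,\sigma_{t-1}^{\delta} \\
\text{NAGARCH } (\lambda = \delta = 2,\ \eta_1 = 0):\quad
  \sigma_t^2 &= \omega + \alpha\,(\varepsilon_{t-1} - \eta_2\,\sigma_{t-1})^2
     + \beta\,\sigma_{t-1}^2 \\
\text{APARCH } (\lambda = \delta,\ \eta_2 = 0,\ \gamma = \eta_1):\quad
  \sigma_t^{\delta} &= \omega
     + \alpha\,\bigl( |\varepsilon_{t-1}| - \gamma\,\varepsilon_{t-1} \bigr)^{\delta}
     + \beta\,\sigma_{t-1}^{\delta}
\end{align*}
```

Under this convention the NGARCH, NAGARCH and APARCH rows follow from the family equation by substituting the restrictions shown in parentheses and using \sigma_{t-1} z_{t-1} = \varepsilon_{t-1}.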
Following the literature in this regard, we assumed that the cryptocurrency index returns exhibit return distributions that are skewed from the mean and that have fatter tails (excess kurtosis) than a normal distribution. Consequently, accounting for this skewness and excess kurtosis in the modelling and forecasting of returns makes a significant impact on asset-allocation decisions. In this study, we distinguished between four selected conditional distributions in the selected nonlinear GARCH models.
The Generalized Error Distribution (henceforward, GED) proposed by Giller (2005) is a symmetrical exponential-family distribution with standardised p.d.f. given by
[...] (7)
According to Equation (7), the distribution is defined by three parameters: the mode of the distribution; the dispersion of the distribution; and a third parameter, which controls the shape (the thickness of the tails). It is worth noting that the moments of this distribution are functions of all the parameters mentioned, so it is not obvious how to obtain the standardised form given in Equation (7).
One of the best-known distributions for which a standardised density can be derived and estimated is the Generalised Hyperbolic Distribution (GHYP). This distribution was introduced to finance as a more realistic model for return series in Eberlein (2001) and Eberlein and Prause (2002). As a location-scale mixture of normals, an N-dimensional random vector with this distribution is given as
[...] (8)
If the mixing variable in Equation (8) follows a Generalised Inverse Gaussian (GIG) distribution, the density of the GH random vector is given by
[...]
where two of the parameters are N-dimensional vectors, a third is a positive definite matrix, and the mixing variable is independent and positive. Particular care, however, should be exercised when choosing the GHYP distribution in GARCH models, in order to avoid any identification problems resulting from variations in the GIG parameter.
Another motivating distribution for modelling asset returns is Johnson's SU-distribution, which is a four-parameter family of probability distributions given as [...].
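For reference, the density of Johnson's SU-distribution in its usual four-parameter form can be sketched as follows. The parameter names (`gamma` and `delta` for shape, `xi` for location, `lam` for scale) follow a common convention and are assumptions here, as are the numerical values; the crude midpoint integration is only a sanity check that the function behaves like a density.

```python
import math


def johnson_su_pdf(x, gamma, delta, xi=0.0, lam=1.0):
    """Johnson SU density with shape (gamma, delta), location xi, scale lam."""
    if delta <= 0 or lam <= 0:
        raise ValueError("delta and lam must be positive")
    z = (x - xi) / lam
    w = gamma + delta * math.asinh(z)
    return (delta / (lam * math.sqrt(2.0 * math.pi))
            / math.sqrt(1.0 + z * z)
            * math.exp(-0.5 * w * w))


def integrate(f, lo, hi, n=200_000):
    """Simple midpoint rule, adequate for a sanity check only."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


# Hypothetical parameter values chosen purely for illustration.
area = integrate(lambda x: johnson_su_pdf(x, gamma=0.5, delta=1.5),
                 -200.0, 200.0)
print(round(area, 4))  # close to 1 if the density is correctly normalised
```

A non-zero `gamma` makes the density asymmetric, while smaller values of `delta` give heavier tails, which is why this family is attractive for skewed, leptokurtic returns.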

Data and Preliminary Analysis
The dataset consisted of 2016 daily closing prices for the period from 1st June 2014 to 8th December 2019. While almost all previous studies in this field have been limited to Bitcoin, our dataset comprised the five cryptocurrencies with the largest market capitalization and a sufficient history of data throughout the period of interest, namely Bitcoin, Ripple (XRP), Litecoin (LTC), Monero, and Dash.
The summary statistics for the daily closing returns of these five cryptocurrencies are presented in panel A of Table 1. Panel A shows that the returns are not normally distributed but instead positively skewed in all cases (except for Bitcoin). Additionally, the estimated kurtosis was much higher than that of the normal distribution, indicating excess kurtosis and, thus, leptokurtic behaviour. The ARCH (p) test confirmed the existence of ARCH effects in the return series that we considered, suggesting the usefulness of adopting a GARCH model for the conditional variance. In addition, stationarity of the return series was supported, since we did not reject the null hypothesis of the KPSS test.
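The kind of summary statistics discussed here can be computed along the following lines. This is a minimal sketch that assumes continuously compounded (log) returns, which the text does not state explicitly, and uses the simple moment-based estimators of skewness and excess kurtosis; the price list is hypothetical.

```python
import math


def log_returns(prices):
    """Continuously compounded returns from a list of closing prices."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def skewness_and_excess_kurtosis(x):
    """Moment-based sample skewness and excess kurtosis (normal = 0, 0)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    if m2 == 0:
        raise ValueError("series has zero variance")
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# Hypothetical closing prices, not taken from the dataset.
prices = [100.0, 102.0, 101.0, 98.0, 105.0, 104.0, 96.0, 99.0]
skew, excess_kurt = skewness_and_excess_kurtosis(log_returns(prices))
print(skew, excess_kurt)
```

A positive excess kurtosis is what the text refers to as leptokurtic behaviour, and the sign of the skewness shows on which side the longer tail lies.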
Furthermore, an interesting finding appears in panel C of Table 1: statistically significant steepness and deepness in every case. To be specific, the deepness parameters indicated negative skewness of the series relative to the mean or trend. These negative signs, calculated for the series in levels, indicated that the negative shocks were deeper than the positive shocks were high. Moreover, the steepness could be seen as negative skewness in the differenced series, which also provided evidence of such asymmetry. As a robustness check, we tested whether the asymmetric time series behaviour of cryptocurrencies arises because the data generating process is an inherently nonlinear mean-reverting process. Following the relevant literature (e.g. Omay, 2011; Canepa et al., 2019), this was done using the LM test of Luukkonen et al. (1988). If the hypothesis that the return series follows a nonlinear mean-reverting process is valid (see, among others, Bahmani-Oskooee et al., 2019), we should expect the series under consideration to be nonlinear and to feature more asymmetric cycles. As shown in Panel D of Table 1, linearity was rejected in all cases since the p-value was less than 5%; hence, we accepted nonlinearity. These features of the return series are illustrated in Figure 1 and are consistent with those referred to in Caporale et al. (2018) and Jiang et al. (2018). Therefore, we adopted the nonlinear GARCH models to account for asymmetry, excess kurtosis, and long-term memory in order to obtain more accurate results.

The Nonlinear GARCH Models
This study considered various conditional heteroskedasticity models and error distributions, leading to a total of 16 models estimated by maximum likelihood. The models were compared using the AIC (Akaike Information Criterion) and the BIC (Bayesian Information Criterion). Interestingly, the specification results indicated that even an AR (1) term was not necessary, since there was no evidence of autocorrelation in any of the cryptocurrency returns under review.
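The model comparison described above can be sketched as follows, using the textbook definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of estimated parameters, n the number of observations and ln L the maximised log-likelihood. The candidate labels, log-likelihoods and sample size below are hypothetical and are not results from the paper; some software reports these criteria divided by n, which does not change the ranking.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion (smaller is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion (smaller is better)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical candidates: (label, maximised log-likelihood, parameter count).
candidates = [
    ("model_a", -1508.0, 4),
    ("model_b", -1502.5, 6),
    ("model_c", -1501.9, 8),
]
n_obs = 2000  # hypothetical sample size

best_by_aic = min(candidates, key=lambda c: aic(c[1], c[2]))[0]
best_by_bic = min(candidates, key=lambda c: bic(c[1], c[2], n_obs))[0]
print(best_by_aic, best_by_bic)  # model_b model_a
```

In this toy example the two criteria disagree because the BIC penalises each extra parameter by ln n rather than 2, so it favours the more parsimonious specification.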
Table 2 presents a comparison of the Information Criterion values from the different nonlinear GARCH-model specifications (NGARCH, NAGARCH, GJR-GARCH and APARCH) fitted to the five cryptocurrencies with the selected innovation distributions. Panel E of Table 2 implies that the NGARCH model with GHYP innovations is preferred in the case of Litecoin and XRP, suggesting that these two series have semi-heavy tails (thus introducing heavier tails and skewness) and hence are highly prone to the news effect. Surprisingly, however, the size of shocks had no impact, according to the NGARCH. Another remarkable outcome was that the semi-heavy tail in the Monero returns could be best fitted through NAGARCH with a GHYP. Strong evidence of a leptokurtic return series was found in the case of Bitcoin and Dash, indicating series with fatter tails than the previous series had. Moreover, the adoption of the APARCH in the case of Dash implies the presence of the Taylor effect in the observed data. The results obtained from the nonlinear GARCH models (Note 1) are summarised in Panel A of Table 3. We computed robust standard errors to obtain robust inferences about the estimated models. It is apparent from this table that all the parameters are significant. Loosely speaking, the estimated [...] and [...] guaranteed the non-negativity of the conditional variance. Similarly, the skew and shape parameters (where applicable) were significant.
Closer inspection of the table shows that the leverage term is negative in the case of Monero and Bitcoin, suggesting an unequal response to market innovations. That is to say, a positive return had less influence on future volatility than a negative return did. For the adopted cryptocurrencies, these results are likely to be related to the negative correlation between returns and volatility. Surprisingly, an inverted asymmetric effect was found in the case of Dash. This effect can be explained by the herding of uninformed investors whenever prices go up and contrarian behaviour among informed investors when prices go down. This result is consistent with findings in other studies reporting that uninformed or unsophisticated investors are particularly active in these markets (e.g. Baur and Dimpfl, 2018). These results were further supported by the sign bias test of Engle and Ng (1993) for misspecification of the conditional volatility models, which tests for any impact of positive and negative shocks on volatility that is not predicted by the model. In all cases, the null hypothesis was generally not rejected: there was no evidence that the sign of the shocks played an important role in predicting volatility beyond what the model captured.
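To make the notion of an asymmetric response concrete, the sketch below applies one step of a standard GJR-GARCH(1,1) variance update to a negative and a positive shock of equal size. The parameter values are hypothetical and are not the estimates in Table 3, and the sign convention is an assumption: here a positive `gamma` means that negative shocks raise volatility more, whereas other models in this family use a different sign convention for their asymmetry term.

```python
def gjr_next_variance(shock, prev_variance, omega, alpha, gamma, beta):
    """One-step GJR-GARCH(1,1) update of the conditional variance.

    sigma2_t = omega + (alpha + gamma * 1[shock < 0]) * shock**2
               + beta * sigma2_{t-1}
    """
    indicator = 1.0 if shock < 0 else 0.0
    return (omega
            + (alpha + gamma * indicator) * shock ** 2
            + beta * prev_variance)


# Hypothetical parameters chosen purely for illustration.
params = dict(omega=0.05, alpha=0.05, gamma=0.10, beta=0.85)
prev_variance = 1.0

after_negative = gjr_next_variance(-2.0, prev_variance, **params)
after_positive = gjr_next_variance(+2.0, prev_variance, **params)
print(after_negative, after_positive)  # the negative shock gives the larger value

# Under this convention an inverted asymmetry corresponds to gamma < 0.
inverted = dict(params, gamma=-0.03)
print(gjr_next_variance(-2.0, prev_variance, **inverted)
      < gjr_next_variance(+2.0, prev_variance, **inverted))  # True
```

Tracing the update over a grid of shocks instead of two points gives the news impact curve, whose shift and rotation are what the nonlinear specifications are designed to capture.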

Forecasting Accuracy of the Nonlinear GARCH Models
In order to assess the predictive properties of the adopted nonlinear GARCH models, we compared the forecasting performance offered by these models with that of a free dynamics model. Our example of the latter is the Artificial Neural Network (ANN) model, which has been widely used in forecasting tasks because it efficiently handles the uncertainty inherent in a nonlinear time series. Moreover, an ANN is determined by the characteristics of the data and thus requires no prior assumption in the model-building process (Chen, Leung, & Hazem, 2003; Zhang & Min Qi, 2005).
The model connects a network of three layers of simple processing units by acyclic links, as shown in Figure 2, below:

Figure 2
These layers link the input nodes to the output through a number of hidden nodes, which can be presented mathematically as follows:
[...] (11)
where the connection weights are the model parameters to be estimated.
In Equation (11), the activation functions provide a smooth change in the output nodes as the input values change. Further, the type of this function is determined by the position of the neuron within the network. In our case, the logistic function is used, as in Equation (12) below:
f(x) = 1 / (1 + exp(-x)) (12)
For the experiments in the present study, the R package 'nnfor' was used, where the numbers of input and hidden nodes of the individual ANN models for the data were investigated 2016 times before we selected the one that produced the lowest BIC value. We also excluded the possibility of zero hidden nodes because, in this case, the resulting ANN turns out to be merely a linear autoregressive model.
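The structure described above, lagged inputs feeding logistic hidden nodes whose outputs are combined into a single forecast, can be sketched as a forward pass. This is an illustration in Python rather than the 'nnfor' implementation; the linear output node, the argument names and the weight values are assumptions made for the example.

```python
import math


def logistic(x):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-x))


def ann_forecast(lags, hidden_weights, hidden_biases,
                 output_weights, output_bias):
    """Forward pass of a single-hidden-layer feedforward network.

    lags           : list of p lagged observations (the input nodes)
    hidden_weights : q lists of p input-to-hidden weights
    hidden_biases  : q hidden-node biases
    output_weights : q hidden-to-output weights
    output_bias    : bias of the (assumed linear) output node
    """
    hidden = [
        logistic(bias + sum(w * x for w, x in zip(weights, lags)))
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return output_bias + sum(v * h for v, h in zip(output_weights, hidden))


# Hypothetical weights for two inputs and two hidden nodes.
forecast = ann_forecast(
    lags=[0.01, -0.02],
    hidden_weights=[[0.5, -0.3], [-0.2, 0.8]],
    hidden_biases=[0.0, 0.1],
    output_weights=[0.4, -0.6],
    output_bias=0.05,
)
print(forecast)
```

With no hidden nodes the forecast would reduce to a weighted sum of the lags plus a constant, which is the linear autoregressive case that the text excludes.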
The forecasts obtained using both approaches were evaluated using several forecasting accuracy measures. These comprised the Root Mean Squared Error (RMSE), the Average Mean Absolute Percentage Error (AMAPE) and the Mean Directional Accuracy (MDA). RMSE and AMAPE measure the magnitude of the forecast errors, whereas MDA evaluates the accuracy of forecasts of the direction of change. Table 4 reports the results on forecasting performance. It is worth noting that each series is split into two subsamples: a pre-forecast period ([...]), from which the model was estimated, and a forecast period ([...]). Then h-step-ahead forecasts were computed and compared with the pre-forecast period. The forecast period under consideration was [...] days. Panels A, B and C present the h-step-ahead forecasts. It is apparent from Table 4 that the fitted nonlinear GARCH-model specifications with the selected innovation distributions produced smaller forecasting errors than the ANN. It may be the case, therefore, that the fitted nonlinear GARCH models outperformed the ANN model in forecasting.
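Two of the accuracy measures named above can be sketched as follows. The definitions used are the common textbook ones and are assumptions with respect to this study: RMSE is the square root of the mean squared forecast error, and MDA is the share of periods in which the forecast moves in the same direction, relative to the previous realised value, as the realised series. AMAPE is left out because its exact definition is not given in the text. The two series are hypothetical.

```python
import math


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def rmse(actual, forecast):
    """Root mean squared error between realised and forecast values."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


def mda(actual, forecast):
    """Share of periods with a correctly forecast direction of change."""
    hits = [
        sign(actual[t] - actual[t - 1]) == sign(forecast[t] - actual[t - 1])
        for t in range(1, len(actual))
    ]
    return sum(hits) / len(hits)


# Hypothetical realised values and forecasts.
actual = [1.0, 2.0, 1.5, 1.8]
forecast = [1.1, 1.7, 2.1, 1.6]

print(rmse(actual, forecast))  # magnitude of the errors
print(mda(actual, forecast))   # two of the three directions are correct
```

A model can score well on RMSE while missing turning points, which is why a directional measure is reported alongside the magnitude-based ones.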

Conclusion
This paper contributes to the growing understanding of modelling and forecasting the cryptocurrency market by concentrating on several stylized features of cryptocurrencies, such as fat tails and excess kurtosis, volatility clustering, long memory and leverage effects. To this end, we conducted a three-step investigation. In the first step, we tested for potential asymmetric behaviour in the cryptocurrency markets using the non-parametric Triples test. For robustness, we tested whether such asymmetry occurred because the data generating process was inherently a nonlinear mean-reverting process of the STAR type. Drawing upon these results, the second step shed light on the significance of adopting nonlinear (asymmetrical) GARCH-type models with different innovation distributions, together with a more extended time period, to investigate which model most informatively described the asymmetrical (tail risk) properties of cryptocurrency returns.
The results of this study are threefold. First, it finds evidence of asymmetry in the return series reviewed, in which the negative shocks are deeper than the positive shocks are high. Moreover, such asymmetry stems from the inherently nonlinear mean-reverting process of the STAR type. Second, this study finds generally that nonlinear GARCH-type models taking into account distributions of innovations that capture skewness, kurtosis and heavy tails constitute excellent tools for modelling cryptocurrency returns.