How Early Can Non-Performance Loan Predict Bank Failure? Evidence from US Bank Failure during 2008-2010

A probit model was applied to the non-performance loans (NPL) of eight quarters, quarter 1 through quarter 8, before a bank was declared failed, in order to determine which quarters significantly predict failure. The probit estimates show that bank failure can be signalled and predicted as early as one year (four quarters) ahead: the NPL of the 4th quarter before failure was a significant predictor of bank failure. The estimated model correctly predicts 89.6 percent of the U.S. banks that failed and 97.6 percent of the banks that survived during 2008-2010. Overall, the estimated model correctly predicts 95.5 percent of the observations (89.6 percent of the failure, Y = 1, and 97.6 percent of the survival, Y = 0, observations). The paper's policy prescription is that bank management and bank regulators should pay attention to the early quarter(s) whose NPL is a significant factor for bank failure.
JEL Classification: G01, G21, G28, G33


Introduction
The early prediction of bank failure has been an important issue since the bank failures of the 1920s, and it has received renewed attention since the U.S. bank failures of 2008-2010. More than two hundred U.S. banks were wiped out in 2009-2010. In 2009 alone, one hundred fifty U.S. banks failed (Report from FDIC).
Under the easy U.S. monetary policy of 2002-2007, commercial banks offered subprime mortgage rates, rates significantly lower than the prime rate, and did not maintain strict procedures for fulfilling lending requirements, minimum collateral in particular. As a result, buying houses became easy, which generated a bubble in the U.S. housing market. When house prices plummeted, the value of the collateral against loans fell significantly below its original market value, and the housing market collapsed. Defaults on mortgage payments increased. Consequently, banks' non-performance loans as a percentage of total loans increased, banks' liabilities came to exceed their assets, and a large number of banks failed. Bank failures on such a scale had not happened since the Great Depression of the 1930s.
Banks may take precautionary measures before it is too late if they know well ahead of time that they are moving into deep financial danger. In this respect, the study of bank failure prediction using quarterly NPL data is of great importance to bank management and bank regulators because it provides an early warning.
The survey of literature shows no evidence of an early bank failure prediction study using quarterly NPL as the predictor of bank failure. By determining the significant quarter ahead of failure, this study makes an important contribution to the bank failure literature.
The paper is organized as follows: a survey of the bank failure literature is presented in Section 2. Data and methodology are outlined in Section 3. Section 4 provides empirical results and conclusions.

Survey of Literature
The study of bank failure prediction and bankruptcy has been popular among researchers since the publication of Secrist (1938). His pioneering work examined national banks that failed and banks that survived during the 1920s. Beaver (1966), one of the first researchers to study bankruptcy prediction, investigated the predictive ability of several financial ratios. Altman (1968) introduced bankruptcy models based on discriminant analysis and classified bankruptcies according to five financial variables: working capital/total assets, retained earnings/total assets, earnings before interest and taxes/total assets, market value of equity/total debt, and sales/total assets. Barr and Sims (1996) claimed to have built two new bank failure prediction models. In their models of CAMEL variables, they used a new measure of efficiency representing management quality (M). In both the one-year-ahead and the two-year-ahead model, they found that the new efficiency measure, obtained from Data Envelopment Analysis (DEA) as developed by Charnes, Cooper, and Rhodes (1978), was a significant factor among the CAMEL variables in predicting bank failure. Jordan et al. (2010) studied banks' failure risk using multiple discriminant analysis and found that failure could be detected up to four years in advance for banks that failed between February 2, 2007 and April 23, 2010. Kolari et al. (2002) applied the logit approach to a small sample of banks to predict large U.S. commercial bank failures and found that failures could be predicted one year and two years prior to failure. Zaghdoudi (2013) examined Tunisian bank failures using logistic regression and found that a bank's ability to repay its debt, the bank's operational variable, bank profitability per employee, and the leverage ratio had a negative impact on the probability of failure.
Using a large quarterly data set of FDIC-insured U.S. banks from 1992 to 2012, Mayes and Hano (2012) contrasted two methods, the logit technique and discrete survival time analysis, to predict bank failures and drew inferences about the stability of the contributing bank characteristics. The models incorporated CAMELS indicators and macroeconomic variables and contrasted risk-based and non-risk-weighted measures of capital adequacy. They found that the non-risk-weighted capital measure and the adjusted leverage ratio best explained bank distress and failures.
Shuangjie & Wang (2014) developed a new Financial Early Warning (FEW) logit model using non-financial efficiency indicators from data envelopment analysis. The model was applied to Chinese firms. They claimed that the proposed FEW logit model improved the accuracy and stability of prediction, and that using non-financial efficiency indicators to verify the results of the FEW logit model significantly improved the reliability of FEW models.
The early study of Martin (1977) used both logit and discriminant analysis to predict bank failures during 1975-1976 and found that the two models gave similar results in identifying failed and non-failed banks.
Using a non-parametric proportional-hazard model on the Venezuelan banking sector, Molina (2014) found that a bank's ability to generate more and sounder profits during the crisis was the most important factor. Banks with higher ROA and more investments in government bonds were less likely to fail.
Arabi (2013) estimated bank failure in Sudan by logistic regression and discriminant analysis and found that earnings were the most significant factor for bank failure, followed by asset quality, liquidity, and capital adequacy. Samad (2012) empirically examined which credit risk variables were significant determinants of U.S. bank failure. Applying the probit model, he found that, among the five credit risk variables, credit loss provision to net charge-offs, loan loss allowance to non-current loans, and non-current loans to loans were significant for predicting bank failures. These factors predicted 76.8 percent to 77.25 percent of total observations correctly. The model predicted 97 out of 121 failures, i.e. 80.17 percent, correctly. Net charge-offs to loans and loan loss to non-current loans, though the most reliable measures, were not significant predictors of the U.S. bank failures during 2009. Samad (2011) examined failed and non-failed banks using ANOVA and Kruskal-Wallis tests and found that failed banks had significantly lower capital ratios than non-failed banks. Thomson (1991) studied the factors that influenced commercial bank failures during the 1980s, estimating his model for the bank failures that occurred during 1984-1989. He found that the economic environment in which banks operated affected the probability of bank failure.
The survey of literature finds no evidence of an early bank failure prediction model using NPL. The paper is, thus, an important contribution to the bank failure literature.

Data
This study examined 202 failed banks and 554 surviving banks of the United States. Quarterly data on non-performance loans (NPL) for the eight quarters immediately preceding failure, from the 1st through the 8th quarter before the bank was declared failed during 2008-2010, were collected from the Federal Deposit Insurance Corporation (FDIC).
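The quarter-ahead variables described above can be illustrated with a minimal Python sketch. It assumes each bank's NPL ratios are stored oldest first, with the last entry being the quarter immediately before failure; the function name `npl_lags` and the example series are hypothetical and are not taken from the FDIC data.

```python
from typing import Dict, List


def npl_lags(npl_by_quarter: List[float], n_lags: int = 8) -> Dict[str, float]:
    """Map a bank's quarterly NPL ratios to NPL1..NPL{n_lags}.

    Assumption: the list is ordered oldest first, and the last entry is the
    quarter immediately before the bank was declared failed.
    """
    if len(npl_by_quarter) < n_lags:
        raise ValueError("need at least n_lags quarterly observations")
    # NPL1 is the most recent quarter before failure, NPL8 the earliest.
    return {f"NPL{k}": npl_by_quarter[-k] for k in range(1, n_lags + 1)}


# Hypothetical bank: NPL as a percentage of total loans over eight quarters.
example_series = [1.2, 1.5, 1.9, 2.4, 3.0, 3.6, 4.5, 5.3]
features = npl_lags(example_series)
print(features)
```

Indexing from the end of the series keeps the "quarters before failure" convention explicit, so NPL4 always refers to the quarter one year before failure regardless of the calendar date.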

Methodology
As bank failure is a binary outcome (failure = 1, non-failure = 0), probit or logit is the appropriate model. The logit model uses the logistic distribution for the probability of an event occurring, whereas the probit model assumes the standard normal distribution. In this paper, probit regression is used as:
$$\Pr(Y_i = 1 \mid X_i) = \Phi(X_i \beta) \qquad (1)$$
where $\Phi$ is the cumulative distribution function of the standard normal distribution, which takes a real value ranging between zero and one.
The probability function used in the probit model is the standard normal distribution. Its density is symmetric around 0 with variance equal to 1, and its cumulative distribution function is bounded between 0 and 1 (Amemiya, 1981).
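As an illustration of these properties, the following Python sketch evaluates the standard normal cumulative distribution function through the error function and uses it to compute a probit failure probability. The coefficients and the NPL value are made up for illustration and are not estimates from this paper.

```python
import math


def std_normal_cdf(z: float) -> float:
    """Phi(z), the standard normal CDF, written via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_probability(x, beta) -> float:
    """Pr(Y = 1 | x) = Phi(x . beta) for a single observation."""
    index = sum(xj * bj for xj, bj in zip(x, beta))
    return std_normal_cdf(index)


# Hypothetical coefficients: a constant and one NPL ratio (illustration only).
beta = [-2.0, 0.5]
x = [1.0, 4.0]  # constant term, NPL = 4 percent of total loans
p = probit_probability(x, beta)
print(p)
```

Because the index here is exactly zero, the probability sits at the midpoint of the distribution; larger NPL values push the index, and hence the failure probability, upward when the coefficient is positive.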
Using the conventional notation, the estimated model can be written in general form:
$$Y_i^* = \beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \beta_k X_{ik} + u_i \qquad (2)$$
where $Y_i^*$ is the latent index behind the dependent variable $Y_i$, which represents the final outcome: $Y_i = 1$ for failed banks and $Y_i = 0$ for surviving banks. $X_{i1}, \dots, X_{ik}$ are the explanatory variables that have an impact on bank failure or survival; $X_{ij}$ is the value of the jth variable for the ith observation. The coefficients $\beta_1, \beta_2, \dots, \beta_k$ associated with the explanatory variables are estimated from the sample. In compact notation, (2) can be stated as:
$$Y_i^* = X_i \beta + u_i \qquad (3)$$
where $X_i$ is a (1 x k) vector of explanatory variables, $\beta$ is a (k x 1) vector of unknown parameters to be estimated, and $u_i$ is white noise. The predicted outcome is then determined by whether the fitted probability exceeds a threshold value, 0.5 in this paper. That is,
$$\hat{Y}_i = \begin{cases} 1 & \text{if } \Phi(X_i \hat{\beta}) > 0.5 \\ 0 & \text{otherwise} \end{cases} \qquad (4)$$
The robustness of the model, i.e. its expectation-prediction performance, as well as the likelihood ratio (LR) statistic will be examined. The LR statistic tests the joint null hypothesis that all slope coefficients are simultaneously equal to zero and is computed as $-2\,(l(\tilde{\beta}) - l(\hat{\beta}))$, where $l(\tilde{\beta})$ and $l(\hat{\beta})$ are the maximized log-likelihoods of the restricted (intercept-only) and unrestricted models.
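The classification rule and the LR statistic described above can be sketched in Python as follows. The toy sample and the coefficients are hypothetical; in particular, the "unrestricted" log-likelihood below is simply evaluated at made-up coefficients, whereas an actual application would use the maximum likelihood estimates. The restricted log-likelihood uses the fact that an intercept-only model fits every observation with the sample proportion of failures.

```python
import math


def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fitted_probabilities(X, beta):
    """Phi(X_i . beta) for every observation."""
    return [std_normal_cdf(sum(a * b for a, b in zip(row, beta))) for row in X]


def classify(probs, cutoff=0.5):
    """Predict failure (1) when the fitted probability exceeds the cutoff."""
    return [1 if p > cutoff else 0 for p in probs]


def log_likelihood(y, probs):
    """Sum of ln(p_i) for failures and ln(1 - p_i) for survivors."""
    return sum(math.log(p) if yi == 1 else math.log(1.0 - p)
               for yi, p in zip(y, probs))


def lr_statistic(l_restricted, l_unrestricted):
    """LR = -2 * (l(restricted) - l(unrestricted))."""
    return -2.0 * (l_restricted - l_unrestricted)


# Hypothetical toy sample: column 0 is a constant, column 1 an NPL ratio.
y = [1, 1, 0, 0, 0, 0]
X = [[1.0, 6.0], [1.0, 5.0], [1.0, 2.0], [1.0, 1.5], [1.0, 1.0], [1.0, 3.0]]
beta_hat = [-4.0, 1.0]  # made-up coefficients, not estimates from the paper

probs = fitted_probabilities(X, beta_hat)
predicted = classify(probs)
l_unrestricted = log_likelihood(y, probs)

# Intercept-only model: every bank gets the sample proportion of failures.
p_bar = sum(y) / len(y)
l_restricted = log_likelihood(y, [p_bar] * len(y))

lr = lr_statistic(l_restricted, l_unrestricted)
print(predicted, lr)
```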
The model's goodness of fit is assessed with the Hosmer-Lemeshow and Andrews tests.

Dependent variable:
$Y_i$ is a binary variable: Y = 1 for failed banks; Y = 0 for survived banks.

Independent variables:
The independent variables were the non-performance loans as a percentage of total net loans for each of the eight quarters, quarter 1 through quarter 8, before the bank was declared failed.
NPL1 = non-performance loan in the 1st quarter before the failure
NPL2 = non-performance loan in the 2nd quarter before the failure
NPL3 = non-performance loan in the 3rd quarter before the failure
NPL4 = non-performance loan in the 4th quarter before the failure
NPL5 = non-performance loan in the 5th quarter before the failure
NPL6 = non-performance loan in the 6th quarter before the failure
NPL7 = non-performance loan in the 7th quarter before the failure
NPL8 = non-performance loan in the 8th quarter before the failure
The descriptive statistics of the independent variables from quarter 8 to quarter 1 before the failure are presented in Table 1. Table 1 reflects what is observed in practice: the closer the quarter is to the one in which the bank was declared failed, the higher the mean non-performance loan (NPL). The mean NPL as a percentage of total loans in the 8th, 7th, 6th, 5th, 4th, 3rd, 2nd, and 1st quarter before the bank was declared failed was 1.29, 1.55, 1.90, 2.45, 2.96, 3.65, 4.44, and 5.32 respectively.
The relationship between the independent variables (NPL) and the dependent variable Y (bank failure) is presented in Table 2. The result of the probit model for bank failure prediction is presented in Table 3. The signs of the coefficients of NPL, the predictors of bank failure, are positive, as expected under the probit model outlined in Section 3, for all quarters except the 3rd, 5th, 7th, and 8th. The coefficients with the unexpected negative sign were not statistically significant.
Since the coefficients of the 4th-quarter and the 2nd-quarter non-performance loans (NPL) are significant, the probit model suggests that bank failure can be predicted as early as one year before the bank is officially declared failed: the 4th-quarter NPL was statistically significant, with a Z-statistic of 1.94.
A McFadden $R^2$ of 0.79 suggests that the independent variables of the model explain 79 percent of U.S. bank failure.
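For readers unfamiliar with the measure, the McFadden $R^2$ compares the maximized log-likelihood of the estimated model with that of an intercept-only model. The Python sketch below assumes the standard definition, one minus the ratio of the two log-likelihoods, and uses the paper's sample counts (202 failed, 554 surviving) only for the intercept-only benchmark; the unrestricted log-likelihood is a hypothetical placeholder, not a value reported in this paper.

```python
import math


def restricted_loglik(n_fail: int, n_survive: int) -> float:
    """Maximized log-likelihood of an intercept-only binary model.

    The fitted probability equals the sample proportion of failures.
    """
    n = n_fail + n_survive
    p_bar = n_fail / n
    return n_fail * math.log(p_bar) + n_survive * math.log(1.0 - p_bar)


def mcfadden_r2(l_restricted: float, l_unrestricted: float) -> float:
    """McFadden R-squared = 1 - l(unrestricted) / l(restricted)."""
    return 1.0 - l_unrestricted / l_restricted


l0 = restricted_loglik(202, 554)  # sample counts used in this study
l1 = -100.0                       # hypothetical unrestricted log-likelihood
r2 = mcfadden_r2(l0, l1)
print(l0, r2)
```

The measure is zero when the regressors add nothing to the intercept-only model and approaches one as the fitted log-likelihood approaches zero.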
The probability of 0.0000 associated with the LR statistic of 6975.71 rejects the null hypothesis that the coefficients of all variables are simultaneously equal to zero. The marginal effects of the independent variables on U.S. bank failure are presented in Table 4. The marginal effect of the 4th-quarter NPL shows that every 1 percent increase in NPL increased the probability of bank failure in the USA by 10 percent, and the effect is statistically significant. Failure is most strongly predicted by the 1st quarter before the failure: the 1st-quarter NPL had a significant impact on U.S. bank failure, with a Z-statistic of 6.37.
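In a probit model the marginal effect of a regressor is not the coefficient itself but the coefficient scaled by the standard normal density at the index, i.e. $\partial \Pr(Y_i = 1)/\partial X_{ij} = \phi(X_i \beta)\,\beta_j$. The Python sketch below illustrates this under the assumption that the effects are evaluated at a single chosen point; the paper does not state the evaluation point used for Table 4, and the coefficients here are made up.

```python
import math


def std_normal_pdf(z: float) -> float:
    """phi(z), the standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def probit_marginal_effects(x, beta):
    """phi(x . beta) * beta_j for each regressor j, evaluated at the point x."""
    index = sum(a * b for a, b in zip(x, beta))
    density = std_normal_pdf(index)
    return [density * bj for bj in beta]


# Hypothetical coefficients: a constant and one NPL ratio (illustration only).
beta = [-2.0, 0.5]
x = [1.0, 4.0]  # evaluation point: constant term, NPL = 4 percent
effects = probit_marginal_effects(x, beta)
# effects[0] belongs to the constant and has no economic interpretation;
# effects[1] is the change in failure probability per unit change in NPL.
print(effects[1])
```

Because the density term depends on the evaluation point, the same coefficient implies a larger marginal effect for banks whose failure probability is near 0.5 than for banks far from it.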
The marginal effect of the 1st-quarter NPL shows that every 1 percent increase in NPL increased the probability of bank failure in the USA by 20 percent, and the effect is statistically significant. As not all quarters-ahead NPL were significant, bank management and bank regulators should scrutinize loan payments that are 90 days past due for the quarters that are statistically significant. This is an important policy prescription of this paper. Bank management should be alerted when the quarter-ahead NPL is a significant predictor of failure.
The predictive power of the estimated model, in terms of correct and wrong predictions, is provided in Table 5. In the left-hand table, observations are classified as having predicted probabilities that are above or below the specified cutoff value of 0.5. In the upper right-hand table, observations are classified using $\bar{p}$, the sample proportion of Y = 1 observations. This probability, which is constant across individuals, is the value computed from estimating a model that includes only the intercept term, C.
"Correct" classifications are obtained when the predicted probability is less than or equal to the cutoff of 0.5 and the observed Y = 0, or when the predicted probability is greater than the cutoff of 0.5 and the observed Y = 1. The results show that the restricted model predicts that all 756 banks will have Dep=0. This prediction is correct for the 554 Dep=0 observations, but is incorrect for the 202 Dep=1 observations. The estimated model improves on the Dep=1 predictions by 89.6 percentage points, but does slightly worse on the Dep=0 predictions (-2.3 percentage points). Overall, the estimated equation is 22.22 percentage points better at predicting responses than the constant probability model. This change represents an 83.17 percent improvement over the 73.28 percent correct prediction of the default model.
The p-value for the HL test is large, indicating no significant difference between the fitted expected values and the observed values, while the p-value for the Andrews test statistic (304.92, Prob = 0.0000) is small, so the two tests provide mixed evidence of problems. Judged by the HL test, the estimated model provides a good fit.
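The arithmetic behind these expectation-prediction figures can be reproduced with a short Python sketch. The counts of correctly classified banks (181 failed, 541 surviving) are not reported in the text; they are an assumption inferred from the reported percentages and the sample sizes, and the function name is hypothetical.

```python
def classification_summary(n_fail, n_survive, fail_correct, survive_correct,
                           cutoff=0.5):
    """Percent correct for the estimated and constant-probability models."""
    n = n_fail + n_survive
    pct_fail = 100.0 * fail_correct / n_fail
    pct_survive = 100.0 * survive_correct / n_survive
    pct_total = 100.0 * (fail_correct + survive_correct) / n
    # Constant-probability model: every bank receives the sample proportion of
    # failures, so all banks are predicted Dep=0 unless it exceeds the cutoff.
    default_correct = n_survive if n_fail / n <= cutoff else n_fail
    pct_default = 100.0 * default_correct / n
    total_gain = pct_total - pct_default  # percentage points
    # Gain as a share of the errors made by the constant-probability model.
    percent_gain = 100.0 * total_gain / (100.0 - pct_default)
    return {
        "pct_fail": pct_fail,
        "pct_survive": pct_survive,
        "pct_total": pct_total,
        "pct_default": pct_default,
        "total_gain": total_gain,
        "percent_gain": percent_gain,
    }


# 202 failed and 554 surviving banks; 181 and 541 correct are inferred counts.
summary = classification_summary(202, 554, 181, 541)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With these inferred counts the sketch reproduces the 73.28 percent default accuracy, the 22.22 percentage point total gain, and the 83.17 percent improvement quoted above.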

Conclusion
A total of 756 U.S. banks, 202 failed and 554 non-failed, were examined by applying the probit model to determine the significant early quarter(s) for U.S. bank failure during 2008-2010. The binary outcome $Y_i$, where Y = 1 is failure and Y = 0 is non-failure, was regressed on the non-performance loans (NPL) of eight quarters, quarter 1 through quarter 8, before the bank failure.
The result of the probit regression, in Table 3, shows that the NPL of quarter 1 and of quarter 4 before the bank was declared failed were significant factors, which suggests that bank failure could be predicted as early as one year ahead (4th quarter).
The estimates of the marginal impact, in Table 4, show that the 2nd-quarter-ahead NPL and the 4th-quarter-ahead NPL were the significant predictors of U.S. bank failure. So, using NPL as the index, bank failure can be predicted as early as the 4th quarter ahead, i.e. as early as one year before the bank was officially declared failed.
The estimated model correctly predicts 89.6 percent of the U.S. banks that failed and 97.6 percent of the U.S. banks that survived. Overall, the estimated model correctly predicts 95.5 percent of the observations (89.6 percent of the Dep=1 and 97.6 percent of the Dep=0 observations).
The estimated model improves on the Dep=1 predictions by 89.6 percentage points, but does poorly on the Dep=0 predictions (-2.3 percentage points). Overall, the estimated equation is 22.22 percentage points better at predicting responses than the constant probability model.
Policy makers should pay attention to the quarter(s) whose non-performance loans are statistically significant. That the non-performance loans of the early quarter(s) are a significant factor in predicting bank failure is an important policy prescription of this paper for bank management and bank regulators. They should seriously scrutinize the NPL as early as one year ahead of bank failure.