Financial Models After COVID-19: Challenges and Opportunities

The performance of models built on pre-COVID data is likely to deteriorate in the wake of this unique crisis. The Modelry has developed a proprietary, technology-driven approach to assess these effects and a consistent set of remediation strategies.

The Modelry development framework
adapted for a post-COVID assessment

Input data

Internal and external data inputs used in the development process. Examples include:

  • Historical transaction level attributes as well as losses and/or credit migrations
  • Macroeconomic data (including regulatory scenarios as applicable)
  • Simulation scenarios
  • Operational assumptions
  • Etc.
Property data:
  • Geography (state, MSA, ZIP)
  • Type (office, retail, etc.)
  • Construction or Income Producing (NOI)
Loan & obligor data:
  • Sponsor history / strength
  • LTV, collateral, DCR
  • Term, price
Economic data:
  • Interest rate forecasts
  • GDP, unemployment at state level
  • Cap rates at MSA or ZIP code level

Internal and external inputs typically include historical transaction / obligor level data, performance and losses, macroeconomic data and scenarios, etc.

For a post-COVID model assessment some additional considerations apply:

  • Data needs to be segregated into pre- and post-pandemic cohorts
  • Additional external / industry data generally has to be incorporated since the post-pandemic period is quite short
  • In many cases, structural scenarios and synthetic data points need to be used as supplements
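As an illustration, the cohort split in the first bullet can be sketched in a few lines. The cutoff date and field names below are assumptions made for the example, not fixed elements of the framework:

```python
from datetime import date

# Hypothetical cutoff; the actual boundary is a modeling choice
# made as part of the assessment.
COVID_CUTOFF = date(2020, 3, 1)

def segregate_cohorts(records, date_key="obs_date"):
    """Split observations into pre- and post-pandemic cohorts."""
    pre = [r for r in records if r[date_key] < COVID_CUTOFF]
    post = [r for r in records if r[date_key] >= COVID_CUTOFF]
    return pre, post

# Toy records for illustration only.
records = [
    {"obs_date": date(2019, 6, 30), "default_rate": 0.012},
    {"obs_date": date(2020, 6, 30), "default_rate": 0.047},
]
pre, post = segregate_cohorts(records)
```

In practice the post-pandemic cohort is short, which is why the external data and synthetic supplements in the bullets above become necessary.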

Model execution inputs

Input data for the execution of the models in the suite. This will typically include portfolios in canonical form (prepared in the data pre-processors) and, depending on the model, scenarios for forecast models, application data for origination models, transaction history for behavioral models, and so on.

Data preparation components

Data preprocessing components include a suite of tools to homogenize and prepare various types of datasets for modeling. Some of the standardized items included in this step are:

  • Transforming raw input data into canonical data structures that can be processed by the analytic and reporting engines
  • Patching gaps using a variety of smoothing techniques (e.g. arithmetic and geometric attribution, exponential and weighted smoothing)
  • Removing outliers and other data defects
  • De-trending time-series data (using spectral analysis, ARIMA, GARCH, etc.)
  • Attaching data tags that are persisted throughout the process and facilitate pivoting and aggregation at the end
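A minimal sketch of two of these steps, gap patching via exponential smoothing and outlier winsorization, is shown below; the parameter values are illustrative:

```python
def exponential_smooth(series, alpha=0.3):
    """Exponentially weighted smoothing that also patches gaps
    (None values) by carrying the last smoothed level forward."""
    smoothed, level = [], None
    for x in series:
        if x is None:
            smoothed.append(level)  # gap: reuse last smoothed level
            continue
        level = x if level is None else alpha * x + (1 - alpha) * level
        smoothed.append(level)
    return smoothed

def winsorize_outliers(series, n_sigmas=3.0):
    """Clip points more than n_sigmas standard deviations from the mean."""
    mean = sum(series) / len(series)
    sd = (sum((x - mean) ** 2 for x in series) / len(series)) ** 0.5
    lo, hi = mean - n_sigmas * sd, mean + n_sigmas * sd
    return [min(max(x, lo), hi) for x in series]
```

The production components also tag each transformed point, so its provenance survives through to the final pivoting and aggregation step.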

Additional functionality specific to a COVID-19 suitability assessment includes:

  • Inclusion of external data where previously none was used
  • New blending schemes for internal and external data
  • More sophisticated weighting schemes that can handle multiple distinct phase transitions
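A weighting scheme that handles multiple distinct phase transitions can be sketched as a piecewise-constant weight over hypothetical regime boundaries; the dates and weights below are assumptions for illustration, not fitted values:

```python
from datetime import date

# Hypothetical phase boundaries and per-regime weights; in practice
# these would be fitted and validated, not hard-coded.
REGIMES = [
    (date(2020, 3, 1), "pre_covid", 1.0),
    (date(2020, 9, 1), "lockdown", 0.25),   # down-weight the acute phase
    (date.max,         "post_covid", 1.5),  # emphasize the new regime
]

def observation_weight(obs_date):
    """Piecewise-constant weight that changes at each phase transition,
    allowing multiple distinct regimes in one estimation sample."""
    for boundary, _name, weight in REGIMES:
        if obs_date < boundary:
            return weight
    return REGIMES[-1][2]
```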

Parameterized business assumptions

Our modeling framework is designed to incorporate business assumptions as parameterized constraints in the development phase, rather than the usual ad-hoc overlays.

In addition to making all related models consistent among themselves, these assumptions are fully captured into the validation and governance process, as well as automatically inserted into documentation.

A comprehensive review of models post-COVID requires an assessment of the suitability of most business rules and assumptions, separating them into the following two groups:

  • Temporary changes during the lockdown that are expected to revert back to normal quickly (e.g. spikes in credit-card late payments became a poor indicator of subsequent defaults during the pandemic, but the correlation will almost certainly revert to previous levels)
  • Secular changes that will take a long time to unwind, if ever (e.g. usage patterns of office real estate)
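One way to picture "business assumptions as parameterized constraints" is as bounded parameters carried through development, each flagged as temporary or secular. The names, bounds, and fitted values below are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class AssumptionConstraint:
    """A business assumption expressed as a bounded parameter so it is
    enforced during development rather than applied as an ad-hoc overlay."""
    name: str
    lower: float
    upper: float
    temporary: bool  # True: lockdown-era change expected to revert

    def satisfied_by(self, value):
        return self.lower <= value <= self.upper

# Hypothetical examples of the two groups described above.
constraints = [
    AssumptionConstraint("late_payment_default_corr", 0.2, 0.8, temporary=True),
    AssumptionConstraint("office_occupancy_trend", -0.05, 0.0, temporary=False),
]

fitted = {"late_payment_default_corr": 0.1, "office_occupancy_trend": -0.02}
violations = [c.name for c in constraints if not c.satisfied_by(fitted[c.name])]
```

Because the constraints are explicit objects rather than overlays, they can be exported directly into validation checklists and documentation.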

COVID model assessment

At this stage we apply our diagnostic scoring methodology to classify models into the following groups:

No significant deterioration
The model can continue to be used as is, subject to its regular checks
Significant deterioration in performance but not in factor sensitivities
The model should undergo the changes prescribed under its own Performance Monitoring plan, most likely a re-calibration
Factor sensitivity outside valid thresholds
Models whose performance is too degraded for a simple re-calibration are likely reliant on the wrong risk factors and need to be re-developed
Performance and/or sensitivity tests failed to converge
The regime change is so severe that even a new model is unlikely to work across the board. These extreme cases will require different classes of models, splitting, or substantial qualitative overlays
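The four groups above can be sketched as a simple decision function over the diagnostic scores. The score scale and thresholds here are assumptions for illustration; the actual scoring methodology is proprietary:

```python
def classify_model(perf_score, sens_score, converged=True,
                   perf_threshold=0.7, sens_threshold=0.7):
    """Map diagnostic scores (0 = total failure, 1 = perfect; the
    thresholds are illustrative) onto the four remediation groups."""
    if not converged:
        return "new model class / split / qualitative overlays"
    if perf_score >= perf_threshold and sens_score >= sens_threshold:
        return "continue as is"
    if sens_score >= sens_threshold:
        return "re-calibrate"  # performance degraded, sensitivities intact
    return "re-develop"        # sensitivities outside valid thresholds
```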

Model suites

A model suite is a set of integrated models and model components that cover a particular analytical space - usually a product.

A banking example would be a Commercial and Industrial (C&I) suite that includes:

  • Credit scorecards (PD, LGD, EAD)
  • Behavioral scorecards
  • Loss forecasts (baseline, stress, lifetime)
  • Balance dynamics & origination volumes
  • Pricing

COVID data assessment

Pre-processed "data components" are stored in production databases and are used by any modeling component that needs them. Most of the items stored here are:

  • Product specific data that will be used by a particular suite (e.g. historical portfolio characteristics)
  • Macroeconomic data such as stress test time series that have been homogenized and transformed, and are used by any component that runs scenarios

Statistical analysis of the pre- and post-COVID-19 data cohorts creates the first decision point. If both data sets are sufficiently similar, we proceed to running the model under various parameter combinations. Otherwise, the following choices apply:

Go back and re-process the input data
This will generally work if the differences are not extreme. A sample approach here would be to change the blending algorithm for internal and external data, change the weighting schemes, etc. Note that this would usually mean inserting new rules into the "context specific" hooks of the modeling machinery
Attempt to run anyway
If data re-processing does not produce better results, running the existing models under a wide set of parameters may lead to the right solution
Stop and consider alternatives
If the extent of the difference is such that the existing model clearly cannot work in the new regime, then an entirely new model or approach needs to be considered
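The similarity check at the first decision point can be illustrated with a plain two-sample Kolmogorov-Smirnov statistic; the decision threshold below is an assumption for the example:

```python
import bisect

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the two empirical CDFs. A large value flags a material difference
    between the pre- and post-pandemic cohorts."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d

def cohorts_similar(pre, post, threshold=0.15):
    """Illustrative decision rule: proceed when the KS gap is small."""
    return ks_statistic(pre, post) < threshold
```

In production one would also apply a significance test and repeat the comparison per segment, but the structure of the decision is the same.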

Methodology components

Components that include all primary data science techniques:

Linear regressions:
  • Ordinary least squares
  • Weighted OLS
  • Kalman filter
Generalized linear models:
  • Log-Linear models
  • Multivariate (Logit / Probit)
  • Poisson Models
AI and ML techniques:
  • Neural networks
  • Deep learning
  • Decision trees
The powerful graph infrastructure that underpins our model development machinery enables us to run hundreds or thousands of parameter / data combinations in a fraction of the time this would otherwise take. This in turn enables us to more accurately diagnose the root of potential problems and find more stable and understandable solutions.

Standardized quantitative tests

Statistical tests, with defined parameters and soft and hard passing thresholds, control the modeling "loop" through a scoring and weighting scheme. Typical examples include:

  • Residual analysis
  • Goodness of fit (R-squared)
  • Discriminatory power (Kolmogorov-Smirnov, Gini)
  • Correlation / auto-correlation analysis
  • Stationarity tests (Dickey-Fuller (DF), modified DF, KPSS)
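As an example of one of these tests, discriminatory power (the Gini coefficient) can be computed from scores and observed defaults by pairwise rank comparison. This is a generic textbook formulation, not the framework's specific implementation:

```python
def gini_coefficient(scores, defaults):
    """Discriminatory power as Gini = 2*AUC - 1, with AUC computed by
    pairwise rank comparison (higher score = riskier obligor)."""
    bads = [s for s, d in zip(scores, defaults) if d == 1]
    goods = [s for s, d in zip(scores, defaults) if d == 0]
    wins = ties = 0
    for b in bads:
        for g in goods:
            if b > g:
                wins += 1
            elif b == g:
                ties += 1
    auc = (wins + 0.5 * ties) / (len(bads) * len(goods))
    return 2.0 * auc - 1.0
```

A Gini near 1 indicates near-perfect rank ordering of defaulters; near 0 indicates no discriminatory power, which would trip the hard threshold in the modeling loop.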

Additional tests check against constraints imposed by business or context specific assumptions.

Because of this structured AI development approach, we are able to insert any number of additional tests, or fine-tune existing ones, specifically for the COVID-19 assessment exercise.

Standardized performance tests

Preliminary model outputs are processed through a series of performance tests, which can be quantitative or in some instances qualitative rules. If all tests are successful, the outputs are stored in a production database. If not, any of the following may happen, depending on the nature of the test and the issue:

  • Model goes back for redevelopment or recalibration
  • Overlays are applied to results
  • Outputs are accepted with conditions and/or limitations on their use

If this point is reached, the current model cannot be used. Either a redevelopment or an entirely new approach is necessary

Context specific components

A critical aspect of our analytical framework is the ability to incorporate context-specific components into the core of the development machinery. This enables our "tree trunk" approach to building model suites: a common core "trunk" that incorporates the risk drivers for that particular product, and "branches" that are then adapted for specific uses.

A common example is a data weighting component that can, for instance, increase the weight on stress periods for stress test models, while emphasizing recency for the models used for day-to-day business decisions.
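Such a context-specific weighting component might look like the sketch below; the context names, weights, and half-life are assumptions for illustration:

```python
from datetime import date

def context_weights(dates, context, stress_periods=()):
    """Context-specific observation weights: the stress-test context
    up-weights stress periods, while the business-decision context
    emphasizes recency via exponential decay. The contexts, weights,
    and one-year half-life are illustrative."""
    if context == "stress_test":
        return [3.0 if any(lo <= d <= hi for lo, hi in stress_periods) else 1.0
                for d in dates]
    if context == "business_decision":
        latest = max(dates)
        return [0.5 ** ((latest - d).days / 365.0) for d in dates]
    return [1.0] * len(dates)
```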

For the purpose of assessing models' post-COVID suitability and remediating deficiencies we use additional contexts that represent pre and post pandemic environments as well as stable long-term blends.

Production outputs
Model outputs are stored in production databases and feed reporting, decision tools and applications throughout the institution.

The Modeling Engine

COVID-19 Assessment FAQ

The COVID-19 pandemic amounts to a real life economic scenario that is unlike any other in modern history, with several unique factors:

  • Breakdown of many correlations that had been typically modeled as "stable"
  • The speed of change - both on the way down as well as the recovery
  • The extremely idiosyncratic behavior of unemployment, the most common variable across banking stress models

All of this implies that models calibrated to pre-pandemic data are likely to significantly lose predictive power and fail to capture secular post-pandemic shifts.

While all models should be reviewed to ensure they can continue to perform, some classes will experience more severe impact. Banking stress test models, which typically overweight data from the single 2008 financial crisis, are particularly vulnerable, as the behavior of many key scenario variables has been drastically different.

We have developed a systematic approach and a proprietary "scoring" mechanism for each model based on a large number of input sets and parameters. This approach is made possible by the power of our platform and its dependency graph which enables computations under thousands of scenarios in a fraction of the time it would take to run them otherwise.

For the assessment, we first review the characteristics of the development data, then, based on validated rules, create a set of "scenarios" each representing a different blend of pre-, post- (and in some cases "during") pandemic data. We then "score" the model performance under each such scenario against a set of standardized statistical tests appropriate for that model class. The resulting scores, with some additional expert judgement form the basis of the assessment.
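The scenario-and-scoring loop just described can be sketched as a simple sweep over data blends; the blend names, test battery, and pass threshold below are placeholders, not the proprietary scoring mechanism itself:

```python
def assess_model(run_model, blends, tests, pass_score=0.7):
    """Score a model under each data blend against a battery of
    standardized tests (each returning a score in [0, 1]) and flag
    the model when its worst blend falls below pass_score. Names
    and thresholds are illustrative."""
    results = {}
    for name, data in blends.items():
        outputs = run_model(data)
        scores = [test(outputs) for test in tests]
        results[name] = sum(scores) / len(scores)
    verdict = "pass" if min(results.values()) >= pass_score else "fail"
    return results, verdict
```

On the actual platform, the dependency graph executes the blend-by-parameter combinations in parallel, which is what makes sweeping thousands of scenarios practical.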

In most cases we are able to connect clients' models to our platform via wrappers or other techniques, and while there is some additional work involved, it will typically still be a much faster process than a manual, unstructured review of each model one at a time.

The assessment classifies failing outcomes into three categories, in increasing order of severity:

  1. The model fails some statistical tests but remains fundamentally sound - this is comparable to failing performance monitoring which typically requires a recalibration or some other routine type of adjustment
  2. The drivers of the model no longer work and a full redevelopment is most likely the right approach (additional analysis may be needed in these cases to determine the appropriate action)
  3. The regime for this particular model has changed so drastically that an entirely new approach may be necessary, using different methodologies, data and/or segmentations

While the amount of work will vary substantially depending on the number, complexity and specific client implementations of various models, our systematic approach and technology specifically designed for the fast execution of large numbers of scenarios is certain to make the process as efficient as it can be.

Before any commitment, we perform an initial analysis of the clients' model inventory and implementation - typically a few days to a couple of weeks - at the end of which we are better able to estimate the effort required for the full assessment.