Abstract
Damage models for natural hazards are used for decision making on reducing and transferring risk. The damage estimates from these models depend on many variables and their complex sometimes nonlinear relationships with the damage. In recent years, data‐driven modeling techniques have been used to capture those relationships. The available data to build such models are often limited. Therefore, in practice it is usually necessary to transfer models to a different context. In this article, we show that this implies the samples used to build the model are often not fully representative for the situation where they need to be applied on, which leads to a “sample selection bias.” In this article, we enhance data‐driven damage models by applying methods, not previously applied to damage modeling, to correct for this bias before the machine learning (ML) models are trained. We demonstrate this with case studies on flooding in Europe, and typhoon wind damage in the Philippines. Two sample selection bias correction methods from the ML literature are applied and one of these methods is also adjusted to our problem. These three methods are combined with stochastic generation of synthetic damage data. We demonstrate that for both case studies, the sample selection bias correction techniques reduce model errors, especially for the mean bias error this reduction can be larger than 30%. The novel combination with stochastic data generation seems to enhance these techniques. This shows that sample selection bias correction methods are beneficial for damage model transfer.