A Relational Model for Predicting Farm-Level Crop Yield Distributions in the Absence of Farm-Level Data

Individual farm-level expected yields serve as the foundation for crop insurance design and rating. Constructing a reasonable, accurate, and robust model for the farm-level loss distribution is therefore essential. Unfortunately, in many regions, and especially in developing countries, farm-level yield data is insufficient or unavailable for sound statistical inference. This paper develops a new two-step relational model to predict farm-level crop yield distributions in the absence of farm-level yield data by "borrowing" information from a neighbouring country where detailed farm-level yield experience is available. The first step of the relational model defines a similarity measure, based on a Euclidean metric, to select an optimal county from the neighbouring country, taking into account weather information, average farm size, county size, and county-level yield volatility. The second step links the selected county with the county to be predicted by modeling the dependence structure between farm-level and county-level yield losses. Detailed farm-level and county-level corn yield data from the U.S. and Canada are used to empirically examine the performance of the proposed relational model. The results show that the approach developed in this paper may be useful for improving yield forecasts and pricing where farm-level data is limited or unavailable. Further, this approach may also help to address the issue of aggregation bias, which arises when county-level data is used as a substitute for farm-level data and which tends to result in the predicted risk underestimating the true risk.
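
The county-selection step described above can be illustrated with a minimal sketch. It is not the paper's exact similarity measure: it assumes that each feature is z-scored before the distance is computed and that all features are weighted equally, and the function names, county names, and feature values are hypothetical placeholders chosen only for illustration.

```python
import math
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each feature column so that no single unit dominates the distance.

    Assumption: the abstract does not say how features are scaled;
    z-scoring is one common choice.
    """
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(r, mus, sds)] for r in rows]


def select_county(target, candidates):
    """Return the candidate county closest to the target, plus all distances."""
    names = list(candidates)
    scaled = standardize([target] + [candidates[n] for n in names])
    target_z, cand_z = scaled[0], scaled[1:]
    # Equal-weight Euclidean distance in standardized feature space.
    dists = {n: math.dist(target_z, z) for n, z in zip(names, cand_z)}
    return min(dists, key=dists.get), dists


# Hypothetical feature vectors, in the order:
# (growing-season precipitation in mm, mean temperature in deg C,
#  average farm size in ha, county area in km^2, county-level yield CV)
target = (480.0, 18.5, 120.0, 1500.0, 0.22)
candidates = {
    "county_A": (500.0, 18.0, 110.0, 1400.0, 0.20),
    "county_B": (300.0, 22.0, 400.0, 3000.0, 0.35),
    "county_C": (650.0, 15.0, 60.0, 900.0, 0.15),
}

best, distances = select_county(target, candidates)
print(best)  # county_A is nearest to the target in every feature
```

In this sketch the selected county would then supply the farm-level experience used in the second step, where the dependence between farm-level and county-level yield losses is modeled; that step is not shown here.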

