Modeling Non-response in National Agricultural Statistics Service (NASS) Surveys Using Classification Trees

This paper describes the use of classification trees to predict survey refusals and inaccessibles. Data from auxiliary sources were matched to sample members of the March, September, and December Crops/Stocks surveys for 2006 and 2007. The matched data included establishment size (in both dollars and acres), type of commodities produced, operating arrangement, and operator characteristics (such as race, age, and gender) from the Census of Agriculture; paradata describing each operation's NASS reporting history (past NASS survey response, refusals, etc.); Joint Burden Indicators; and characteristics of the operation's location (by county and zip code) available from the Census Bureau. The classification trees used these data to repeatedly divide the dataset, identifying subsets of records more likely to be survey non-respondents. This approach was first applied to the NASS Crops/Stocks survey and then to other NASS surveys. The results indicate that a relatively small subset of variables is important in predicting survey response, and that the most useful variables all come from the set of NASS reporting history variables. The models work consistently for the Crops/Stocks survey and for some other surveys such as Cattle, but less well for others such as ARMS. Using these models, sampled operations can be ranked by their predicted response likelihood. These rankings may help field offices plan alternative data collection strategies for the operations most likely to be non-respondents.
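The recursive partitioning and ranking described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record layout, field names (`prior_refusals`, `prior_responses`, `acres`), the toy data, and the choice of Gini impurity as the split criterion are all assumptions made for illustration. The code grows a small binary tree by repeatedly choosing the split that most reduces impurity, stores the observed non-response rate in each leaf, and ranks operations by that predicted rate.

```python
from collections import namedtuple

# Hypothetical record layout; field names are illustrative, not actual NASS variables.
Op = namedtuple("Op", "op_id prior_refusals prior_responses acres nonrespondent")
FEATURES = ["prior_refusals", "prior_responses", "acres"]

def gini(rows):
    """Gini impurity of the binary non-response label."""
    if not rows:
        return 0.0
    p = sum(r.nonrespondent for r in rows) / len(rows)
    return 2.0 * p * (1.0 - p)

def best_split(rows, min_leaf):
    """Return (feature, threshold, left, right) with the lowest weighted Gini, or None."""
    best, best_score = None, gini(rows)
    for feat in FEATURES:
        for thr in sorted({getattr(r, feat) for r in rows}):
            left = [r for r in rows if getattr(r, feat) <= thr]
            right = [r for r in rows if getattr(r, feat) > thr]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score - 1e-12:  # keep only strict improvements
                best, best_score = (feat, thr, left, right), score
    return best

def grow(rows, depth=0, max_depth=3, min_leaf=2):
    """Recursively partition rows; each leaf stores its observed non-response rate."""
    split = best_split(rows, min_leaf) if depth < max_depth else None
    if split is None:
        return {"rate": sum(r.nonrespondent for r in rows) / len(rows)}
    feat, thr, left, right = split
    return {"feat": feat, "thr": thr,
            "left": grow(left, depth + 1, max_depth, min_leaf),
            "right": grow(right, depth + 1, max_depth, min_leaf)}

def predict_rate(tree, row):
    """Walk from the root to a leaf and return that leaf's non-response rate."""
    while "rate" not in tree:
        tree = tree["left"] if getattr(row, tree["feat"]) <= tree["thr"] else tree["right"]
    return tree["rate"]

# Toy data invented for this sketch (1 = non-respondent).
sample = [
    Op(1, 0, 5, 120, 0), Op(2, 0, 4, 300, 0), Op(3, 1, 3, 80, 0),
    Op(4, 3, 1, 500, 1), Op(5, 2, 0, 250, 1), Op(6, 0, 6, 1000, 0),
    Op(7, 4, 0, 60, 1), Op(8, 1, 2, 400, 0), Op(9, 2, 1, 150, 1),
    Op(10, 0, 3, 700, 0), Op(11, 3, 2, 90, 1), Op(12, 1, 4, 220, 0),
]

tree = grow(sample)
# Rank operations from most to least likely to be non-respondents.
ranked = sorted(sample, key=lambda r: predict_rate(tree, r), reverse=True)
print(tree.get("feat"), tree.get("thr"))
print([r.op_id for r in ranked[:5]])
```

In this toy data the reporting-history field separates the two groups, which loosely mirrors the abstract's finding that reporting history variables were the most useful predictors. In practice the tree would be fitted on one set of survey periods and evaluated on another, rather than ranked on its own training data as done here.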

Series Statement: RDD Research Report

 Record created 2017-04-01, last modified 2017-04-24
