Housing is more than a physical structure; it has a profound impact on health. Enforcing housing codes is a primary strategy for breaking the link between poor housing and poor health.
The objective of this study was to determine whether machine learning algorithms can identify properties with housing code violations at a higher rate than inspector-informed prioritization. We also show how city data can be used to describe the prevalence and location of housing-related health risks, which can inform public health policy and programs.
This study took place in Chelsea, Massachusetts, a demographically diverse, densely populated, low-income city near Boston.
Using data from 1611 proactively inspected properties, representative of the city's housing stock, we developed machine learning models to predict the probability that a given property would have (1) any housing code violation, (2) a set of high-risk health violations, and (3) a specific violation with a high risk to health and safety (overcrowding). We generated predicted probabilities of each outcome for all residential properties in the city (N = 5989).
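The workflow described above (fit a model on the inspected subset, then score every residential property) can be sketched as follows. This is a minimal illustration, not the study's method: the abstract does not name the algorithms used, so plain logistic regression stands in for them, and the two features, the data, and all function names are synthetic and hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, x):
    # w[0] is the intercept; w[1:] pair with the features in x.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit logistic regression weights by batch gradient descent on log loss."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xij in enumerate(xi):
                grad[j + 1] += err * xij
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

rng = random.Random(0)

# ASSUMPTION: two made-up, standardized features per property
# (e.g. building age and prior complaints); not the study's variables.
all_properties = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(600)]

def simulate_violation(x):
    # Synthetic outcome: 1 if an inspection would find a violation.
    return 1 if rng.random() < sigmoid(0.2 + 1.5 * x[0] + 0.8 * x[1]) else 0

# Only a subset of properties has been inspected and therefore has labels.
inspected = all_properties[:150]
labels = [simulate_violation(x) for x in inspected]

# Train on inspected properties, then score every property in the city.
weights = fit_logistic(inspected, labels)
probabilities = [predict_proba(weights, x) for x in all_properties]

# Property indices ordered from highest to lowest predicted risk.
ranked = sorted(range(len(all_properties)),
                key=lambda i: probabilities[i], reverse=True)
```

The same pattern would be repeated once per outcome (any violation, high-risk health violations, overcrowding), each with its own labels and fitted model.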
Housing code violations were present in 54% of inspected properties, 85% of which were classified as high-risk health violations. We predict that if the city were to use integrated city data and machine learning to identify at-risk properties, it could achieve a 1.8-fold increase in the number of inspections that identify code violations as compared with current practices.
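One common way to express a fold increase of this kind is to divide the violation hit rate among the properties a model ranks highest by the hit rate under current practice, for the same number of inspections. The sketch below shows that calculation on toy numbers; the abstract does not state how its 1.8-fold figure was derived, so the function names, scores, and outcomes here are hypothetical.

```python
def hit_rate(outcomes):
    """Share of inspections that found a violation (outcomes are 0 or 1)."""
    return sum(outcomes) / len(outcomes)

def fold_increase(scores, outcomes, baseline_outcomes, k):
    """Hit rate among the k highest-scored properties divided by the baseline hit rate."""
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    top_k = [outcome for _, outcome in ranked[:k]]
    return hit_rate(top_k) / hit_rate(baseline_outcomes)

# Toy data: predicted probabilities and what an inspection would find.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
outcomes = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]

# Toy baseline: results of four inspections chosen under current practice.
baseline_outcomes = [1, 0, 1, 0]

# Top 4 by score find 3 violations (0.75); the baseline finds 2 of 4 (0.5).
lift = fold_increase(scores, outcomes, baseline_outcomes, k=4)
print(lift)  # 1.5
```

Because both groups contain the same number of inspections, the ratio of hit rates equals the ratio of the number of inspections that identify violations.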
Given the strong connection between housing and health, reducing public health risk at more properties, without the need for additional inspection resources, represents an opportunity for significant public health gains. Integrated city data and machine learning can be used to describe the prevalence and location of housing-related health problems and make housing code enforcement more efficient, effective, and equitable in responding to public health threats.