The CFPB last updated its BISG proxy methodology in April 2017, and the updated materials are available at the link provided in the resources below. Because the BISG proxy methodology is built on publicly available surname and geography data from the U.S. Census Bureau, we would not expect another update until 2020 census data is available.
The BISG proxy methodology combines geography- and surname-based information into a proxy for a customer’s demographic characteristics when the lender does not know them. The CFPB uses the BISG proxy methodology when conducting fair lending analysis of non-mortgage credit products, since auto lenders and other non-mortgage lenders generally are not allowed to collect consumers’ demographic information. As of the 2017 update, the CFPB’s BISG proxy methodology relies on a surname list derived from 2010 census data, supplemented with data from the 2000 census.
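To illustrate the general idea of combining the two sources, here is a minimal Python sketch of a Bayesian update that multiplies a surname-based probability by a geography-based share and renormalizes. It is only an illustration under stated assumptions: the function name, the group labels, and all numbers are hypothetical, and the CFPB’s actual implementation is the Stata code in its repository, which may differ in its details.

```python
from typing import Dict


def bisg_posterior(
    p_group_given_surname: Dict[str, float],
    p_geo_given_group: Dict[str, float],
) -> Dict[str, float]:
    """Combine surname and geography information with Bayes' rule.

    For each group g, the unnormalized posterior is
        P(g | surname) * P(geography | g)
    and the results are rescaled so they sum to 1.
    """
    unnormalized = {
        group: p_group_given_surname[group] * p_geo_given_group[group]
        for group in p_group_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no group has a nonzero probability")
    return {group: value / total for group, value in unnormalized.items()}


# Hypothetical, made-up inputs (NOT Census figures).
# Share of people with a given surname who belong to each group:
surname_shares = {"group_a": 0.60, "group_b": 0.30, "group_c": 0.10}
# Share of each group's total population living in the customer's area:
geo_shares = {"group_a": 0.001, "group_b": 0.004, "group_c": 0.002}

posterior = bisg_posterior(surname_shares, geo_shares)
print({group: round(p, 2) for group, p in posterior.items()})
# prints {'group_a': 0.3, 'group_b': 0.6, 'group_c': 0.1}
```

In this made-up example, the surname alone points to group_a, but the geography information shifts the combined proxy probability toward group_b.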
Regarding other resources for fair lending reviews, the U.S. Census Bureau website provides demographic information for surnames based on 2010 census data and on 2000 census data. We are not aware of any resources that contain surname-based race and ethnicity information based on 2020 census data.
For resources related to our guidance, please see:
- CFPB Report, Using publicly available information to proxy for unidentified race and ethnicity (Summer 2014) (“Information on consumer race and ethnicity is required to conduct fair lending analysis of non-mortgage credit products, but auto lenders and other non-mortgage lenders are generally not allowed to collect consumers’ demographic information. As a result, substitute, or ‘proxy’ information is utilized to fill in information about consumers’ demographic characteristics. In conducting fair lending analysis of non-mortgage credit products in both supervisory and enforcement contexts, the Bureau’s Office of Research (OR) and Division of Supervision, Enforcement, and Fair Lending (SEFL) rely on a Bayesian Improved Surname Geocoding (BISG) proxy method, which combines geography- and surname-based information into a single proxy probability for race and ethnicity.”)
- CFPB Report, Using publicly available information to proxy for unidentified race and ethnicity (Summer 2014) (“The statistical software code, written in Stata, and the publicly available census data files used to build the BISG proxy are available at: https://github.com/cfpb/proxy-methodology.”)
- CFPB/proxy-methodology, Update to proxy methodology – April 2017 (“In the summer 2014 edition of Supervisory Highlights, the Bureau previously reported that examination teams use a Bayesian Improved Surname Geocoding (BISG) proxy methodology for race and ethnicity in their fair lending analysis of non-mortgage credit products. . . . As of April 2017, examination teams are relying on an updated proxy methodology that reflects the newly available surname data from the Census Bureau. Our updated proxy methodology relies on the race and ethnicity shares for the 162,253 names that appear on the 2010 list and supplements this list with the race and ethnicity shares for the 5,156 names that appear on the 2000 list but not on the 2010 list, resulting in a list of 167,409 surnames in total. The updated name list, statistical software code written in Stata, and other publicly available data used to build the BISG proxy are now available in this repository.”)
- U.S. Census Bureau, Frequently Occurring Surnames from the 2010 Census (Last revised December 27, 2016)
- U.S. Census Bureau, Frequently Occurring Surnames from the 2000 Census (Last revised September 15, 2014)
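
As a rough illustration of how the two Census surname lists are combined in the April 2017 update quoted above (the 2010 list takes priority, and names that appear only on the 2000 list are added), here is a minimal Python sketch. The function name, surnames, and shares are hypothetical placeholders rather than Census data, and the CFPB’s own Stata code may handle the merge differently.

```python
from typing import Dict

Shares = Dict[str, float]


def supplement_surname_list(
    list_2010: Dict[str, Shares],
    list_2000: Dict[str, Shares],
) -> Dict[str, Shares]:
    """Keep every 2010 entry; add 2000 entries only for names missing in 2010."""
    combined = dict(list_2010)
    for surname, shares in list_2000.items():
        if surname not in combined:
            combined[surname] = shares
    return combined


# Hypothetical, made-up entries (NOT Census figures).
list_2010 = {
    "NAMEONE": {"group_a": 0.70, "group_b": 0.30},
    "NAMETWO": {"group_a": 0.20, "group_b": 0.80},
}
list_2000 = {
    "NAMETWO": {"group_a": 0.25, "group_b": 0.75},    # also on the 2010 list
    "NAMETHREE": {"group_a": 0.50, "group_b": 0.50},  # only on the 2000 list
}

combined = supplement_surname_list(list_2010, list_2000)
print(sorted(combined))
# prints ['NAMEONE', 'NAMETHREE', 'NAMETWO']
```

Here NAMETWO keeps its 2010 shares, while NAMETHREE is carried over from the 2000 list because it has no 2010 entry.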