Prepare ethnicity label mapping from variable metadata
prepare_ethnicity_labels.Rd
Generates a mapping of survey variable names to standardized ethnicity labels using regular expressions and variable metadata. Use it to harmonize survey ethnicity columns for imputation and analysis.
Arguments
- ethnicity_map
named character vector. Ethnicity labels and regex patterns for matching descriptions.
- default_eth
character(1). Default ethnicity label when none found.
- variable_list
data.table. Survey variable metadata. Required columns:
variable: column name in persons
description: label or description
- settings
list. Settings object (not used directly).
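As an illustration of the argument shapes described above, the inputs might be built as follows. This is a minimal sketch: it assumes that the names of ethnicity_map are the standardized labels and the values are the regex patterns, and every label, pattern, and description shown is invented for the example rather than taken from the package.

```r
library(data.table)

# Assumption: names are standardized labels, values are regex patterns that
# are matched against lowercase descriptions. All entries are illustrative.
ethnicity_map <- c(
  mexican      = "mexican|chicano",
  puerto_rican = "puerto rican",
  cuban        = "cuban",
  not_hispanic = "not hispanic"
)

# Label to use when no ethnicity column is found (hypothetical value).
default_eth <- "not_hispanic"

# Variable metadata with the two required columns.
variable_list <- data.table(
  variable    = c("ethnicity_1", "ethnicity_2", "ethnicity_3", "age"),
  description = c(
    "Mexican, Mexican American, Chicano",
    "Puerto Rican",
    "Not Hispanic or Latino",
    "Age"
  )
)
```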
Value
data.table. Ethnicity label mapping. Columns:
variable: column name in persons
label: lowercase description
short_label: standardized ethnicity label
Rows: one per ethnicity variable.
Keys: (variable).
Details
Filters variable_list for ethnicity columns using regex (e.g., ethnicity_[0-9]+$).
Converts descriptions to lowercase labels.
If no ethnicity columns are found, falls back to race columns and assigns default_eth as the label.
Assigns short_label to each variable using ethnicity_map regex patterns.
Checks that all expected labels are present and non-missing.
Returns a data.table with variable, label, and short_label columns.
Assumes valid variable_list and ethnicity_map; errors if mapping is incomplete.
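The steps above can be sketched roughly as follows. This is not the package's implementation: sketch_ethnicity_labels is a hypothetical helper, the race_[0-9]+$ fallback pattern is an assumption, ethnicity_map is assumed to hold labels as names and regex patterns as values, and the completeness check is reduced to testing for unmatched (NA) labels. The settings argument is omitted because the documentation says it is not used directly.

```r
library(data.table)

# Hypothetical helper illustrating the documented steps.
sketch_ethnicity_labels <- function(ethnicity_map, default_eth, variable_list) {
  # Keep ethnicity columns and convert descriptions to lowercase labels.
  out <- variable_list[
    grepl("ethnicity_[0-9]+$", variable),
    .(variable, label = tolower(description))
  ]

  if (nrow(out) == 0L) {
    # Fallback: use race columns (pattern assumed) with the default label.
    out <- variable_list[
      grepl("race_[0-9]+$", variable),
      .(variable, label = tolower(description))
    ]
    out[, short_label := rep(default_eth, .N)]
  } else {
    # Assign the first label whose pattern matches the lowercase description.
    out[, short_label := NA_character_]
    for (eth in names(ethnicity_map)) {
      out[
        is.na(short_label) & grepl(ethnicity_map[[eth]], label),
        short_label := eth
      ]
    }
  }

  # Error if any variable was left without a label.
  if (anyNA(out$short_label)) {
    stop(
      "No ethnicity label matched for: ",
      paste(out[is.na(short_label), variable], collapse = ", ")
    )
  }

  setkey(out, variable)
  out[]
}

# Illustrative inputs; all labels, patterns, and descriptions are invented.
ethnicity_map <- c(
  mexican      = "mexican|chicano",
  puerto_rican = "puerto rican",
  not_hispanic = "not hispanic"
)
variable_list <- data.table(
  variable    = c("ethnicity_1", "ethnicity_2", "ethnicity_3", "age"),
  description = c("Mexican or Chicano", "Puerto Rican",
                  "Not Hispanic or Latino", "Age")
)

labels <- sketch_ethnicity_labels(ethnicity_map, "not_hispanic", variable_list)

# Fallback case: only race columns are present.
race_only <- data.table(
  variable    = c("race_1", "race_2"),
  description = c("White", "Asian")
)
fallback <- sketch_ethnicity_labels(ethnicity_map, "not_hispanic", race_only)
```

Taking the first matching pattern makes the result depend on the order of ethnicity_map, so under this sketch more specific patterns would need to come before broader ones.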
See also
prepare_ethnicity_survey_data, impute_ethnicity
Other imputation utilities:
calculate_acs_proportions(),
get_acs_ethnicity(),
get_acs_race(),
get_hh_person_sums(),
impute_ethnicity(),
impute_gender(),
impute_income_nonrelatives(),
impute_income_pnta(),
impute_race(),
make_binary(),
prep_hhs_for_income_imputation(),
prepare_acs_income(),
prepare_ethnicity_survey_data(),
prepare_impute_targets(),
prepare_income_fit_dt(),
prepare_persons_dt(),
update_hh_income_imputed()