Generates a mapping from survey variable names to standardized ethnicity labels by matching variable descriptions in the metadata against regular expressions. Use it to harmonize survey ethnicity columns for imputation and analysis.

Usage

prepare_ethnicity_labels(ethnicity_map, default_eth, variable_list, settings)

Arguments

ethnicity_map

named character vector. Ethnicity labels and regex patterns for matching descriptions.

default_eth

character(1). Default ethnicity label when none found.

variable_list

data.table. Survey variable metadata. Required columns:

  • variable — column name in persons

  • description — label or description

settings

list. Settings object (not used directly).

Value

data.table. Ethnicity label mapping. Columns:

  • variable — column name in persons

  • label — lowercase description

  • short_label — standardized ethnicity label

Rows: one per ethnicity variable. Keys: (variable).
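
Because the result is keyed on variable, a label can be looked up by column name with a keyed subset. The following is a minimal sketch only: the table is built by hand to mimic the documented columns, and the variable names and labels in it are hypothetical, not output from this function.

```r
library(data.table)

# Hand-built stand-in for the returned mapping (hypothetical values).
mapping <- data.table(
  variable    = c("ethnicity_1", "ethnicity_2", "ethnicity_3"),
  label       = c("ethnicity: mexican", "ethnicity: cuban", "ethnicity: other"),
  short_label = c("mexican", "cuban", "other")
)
setkey(mapping, variable)

# Keyed lookup: a character value in i is joined against the key column.
looked_up <- mapping["ethnicity_2", short_label]
print(looked_up)
```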

Details

  • Filters variable_list for ethnicity columns using regex (e.g., ethnicity_[0-9]+$).

  • Converts descriptions to lowercase labels.

  • If no ethnicity columns are found, falls back to race columns and sets the default label (default_eth).

  • Assigns short_label to each variable using ethnicity_map regex patterns.

  • Checks that all expected labels are present and non-missing.

  • Returns a data.table with variable, label, and short_label columns.

  • Assumes valid variable_list and ethnicity_map; errors if mapping is incomplete.
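
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the function's actual source: it assumes the names of ethnicity_map are the standardized labels and the values are the regex patterns, that the first matching pattern wins, and that race columns match a pattern such as ^race; the sample inputs are hypothetical.

```r
library(data.table)

# Hypothetical inputs. Assumption: names = labels, values = regex patterns.
ethnicity_map <- c(mexican = "mexican", cuban = "cuban", other = "other")
default_eth <- "other"
variable_list <- data.table(
  variable    = c("age", "ethnicity_1", "ethnicity_2", "ethnicity_3"),
  description = c("Age", "Ethnicity: Mexican", "Ethnicity: Cuban", "Ethnicity: Other")
)

# Filter for ethnicity columns and lowercase the descriptions.
eth <- variable_list[
  grepl("ethnicity_[0-9]+$", variable),
  .(variable, label = tolower(description))
]
eth[, short_label := NA_character_]

if (nrow(eth) == 0L) {
  # Fallback (the "^race" pattern is an assumption): use race columns
  # and give every row the default label.
  eth <- variable_list[
    grepl("^race", variable),
    .(variable, label = tolower(description))
  ]
  eth[, short_label := default_eth]
} else {
  # Assign short_label from the first pattern that matches each label.
  for (nm in names(ethnicity_map)) {
    eth[is.na(short_label) & grepl(ethnicity_map[[nm]], label), short_label := nm]
  }
}

# Error if any variable was left without a label, then key the result.
stopifnot(!anyNA(eth$short_label))
setkey(eth, variable)
print(eth)
```

Pattern order matters in a sketch like this: if two patterns can match the same description, the earlier entry in ethnicity_map takes precedence.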

Settings

None.

Examples

## Not run:
prepare_ethnicity_labels(ethnicity_map, default_eth = "hispanic", variable_list, settings)
## End(Not run)