Prepare PUMS data for non-relative income imputation
Cleans and formats PUMS person-level data for use in non-relative income imputation models, generating predictors and flags for model fitting. Use it to create model-ready data for quantile regression or similar approaches.
Arguments
- pums_cleaned
data.table. Cleaned PUMS person-level data. Required columns:
PUMA: PUMA identifier
SERIALNO: household ID
SPORDER: person sequence number
PINCP: personal income
AGEP: age
RELSHIPP_label: relationship label
Rows: one per person. Keys: (SERIALNO, SPORDER). Modified by reference: no (returns copy).
- target_names
character vector. Names of model predictor variables (e.g., 'p_employment', 'p_age', 'p_univstudent').
- settings
list. Settings object with configs. Keys:
age_employable (optional): minimum age for employable persons. Default: 16.
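The input contract described above can be checked before calling the function. The following is a minimal sketch, assuming the data.table package is installed. `check_pums_input()` and `get_age_employable()` are hypothetical helpers written for illustration and are not part of the package; the toy values are made up.

```r
library(data.table)

# Columns the documentation lists as required in pums_cleaned.
required_cols <- c("PUMA", "SERIALNO", "SPORDER", "PINCP", "AGEP", "RELSHIPP_label")

# Hypothetical helper: stop if a required column is missing or if
# (SERIALNO, SPORDER) does not identify exactly one row per person.
check_pums_input <- function(dt) {
  missing_cols <- setdiff(required_cols, names(dt))
  if (length(missing_cols) > 0) {
    stop("Missing required columns: ", paste(missing_cols, collapse = ", "))
  }
  if (anyDuplicated(dt, by = c("SERIALNO", "SPORDER")) > 0) {
    stop("Rows are not unique on (SERIALNO, SPORDER)")
  }
  invisible(TRUE)
}

# Hypothetical helper: read age_employable from settings, falling back
# to the documented default of 16 when the key is absent.
get_age_employable <- function(settings) {
  if (is.null(settings$age_employable)) 16 else settings$age_employable
}

# Toy input (made-up values, for illustration only).
pums_toy <- data.table(
  PUMA = c("00101", "00101", "00102"),
  SERIALNO = c("A1", "A1", "B7"),
  SPORDER = c(1L, 2L, 1L),
  PINCP = c(52000, -300, 18000),
  AGEP = c(41L, 15L, 23L),
  RELSHIPP_label = c("Reference person", "Biological son or daughter",
                     "Roommate or housemate")
)

check_pums_input(pums_toy)
age_employable <- get_age_employable(list())
```

With an empty settings list, `age_employable` falls back to 16; passing `list(age_employable = 18)` would override it.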
Value
data.table. Model-ready PUMS data for imputation. Columns:
person_id: unique person identifier
hh_id: household identifier
one model predictor column for each variable in target_names
PINCP: personal income
AGEP: age
unrelated: flag for nonrelative status
Rows: one per person. Keys: (person_id, hh_id).
Details
Reads PUMS codebook and sets employable age threshold from settings (default 16).
Copies input data to avoid reference modification.
Adds identifiers: puma_id, hh_id, person_id.
Calls prepare_impute_targets to generate model predictors for each person.
Sets income (PINCP) to zero where it is negative or where the person's age is below the employable threshold.
Flags nonrelatives using RELSHIPP_label pattern matching.
Subsets to working-age, unrelated persons for model fitting.
Returns a data.table with predictors and flags for imputation.
Assumes valid PUMS data and codebook; errors if required columns are missing.
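The steps above can be sketched with data.table as follows. This is an illustration under stated assumptions, not the package's implementation: the identifier formats, the regular expression used to detect nonrelatives, and the reading of the zeroing step as applying to income are all guesses. The codebook read and the call to prepare_impute_targets are omitted because their interfaces are not documented here. The toy values are made up.

```r
library(data.table)

# Toy input (made-up values, for illustration only).
pums_cleaned <- data.table(
  PUMA = c("00101", "00101", "00101", "00102"),
  SERIALNO = c("A1", "A1", "A1", "B7"),
  SPORDER = c(1L, 2L, 3L, 1L),
  PINCP = c(52000, 4000, -300, 18000),
  AGEP = c(41L, 15L, 30L, 23L),
  RELSHIPP_label = c("Reference person", "Roommate or housemate",
                     "Other nonrelative", "Reference person")
)
age_employable <- 16  # documented default

# Copy so the caller's table is not modified by reference.
dt <- copy(pums_cleaned)

# Add identifiers; the exact formats are assumptions.
dt[, puma_id := PUMA]
dt[, hh_id := SERIALNO]
dt[, person_id := paste(SERIALNO, SPORDER, sep = "_")]

# Zero out negative incomes and incomes of persons below the threshold
# (one reading of the documented step).
dt[PINCP < 0 | AGEP < age_employable, PINCP := 0]

# Flag nonrelatives by pattern matching on the label.
# The pattern itself is an assumption.
dt[, unrelated := grepl("roommate|housemate|nonrelative|boarder",
                        RELSHIPP_label, ignore.case = TRUE)]

# Keep working-age, unrelated persons for model fitting.
fit_dt <- dt[AGEP >= age_employable & unrelated == TRUE]
setkey(fit_dt, person_id, hh_id)
```

In this toy example only the third person of household A1 survives the subset: the 15-year-old roommate is below the age threshold, and the two reference persons are not flagged as unrelated. Because the work is done on a copy, `pums_cleaned` is left unchanged.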
See also
prepare_impute_targets, prepare_persons_dt
Other imputation utilities:
calculate_acs_proportions(),
get_acs_ethnicity(),
get_acs_race(),
get_hh_person_sums(),
impute_ethnicity(),
impute_gender(),
impute_income_nonrelatives(),
impute_income_pnta(),
impute_race(),
make_binary(),
prep_hhs_for_income_imputation(),
prepare_acs_income(),
prepare_ethnicity_labels(),
prepare_ethnicity_survey_data(),
prepare_impute_targets(),
prepare_persons_dt(),
update_hh_income_imputed()
Examples
## Not run:
prepare_income_fit_dt(pums_cleaned, target_names = c('p_employment', 'p_age'), settings)
## End(Not run)