Checks that initial weights reproduce PUMS reference counts to within a threshold. Use it to validate the initial weighting step for survey data.

Usage

check_initial_weights(
  seed,
  pums_targets,
  settings,
  threshold = 0.05,
  stop_on_fail = TRUE
)

Arguments

seed

data.table with required columns:

  • p_total

  • h_total

  • initial_weight

  • day_group

Rows: one per household-day. Keys: (day_group). Modified by reference: no (returns copy).

pums_targets

data.table with required columns:

  • h_total

  • p_total

Rows: one per household-day. Keys: (day_group). Modified by reference: no (returns copy).

settings

list. Must include:

  • weight_dow_groups — valid day-of-week groups

  • study_unit — study unit type

threshold

numeric(1). Maximum acceptable relative difference, expressed as a proportion (0.05 means 5%). Default 0.05.

stop_on_fail

logical. If TRUE, stops with an error when a check fails. Default TRUE.
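
To illustrate how threshold is read as a proportion, the short sketch below redoes the person-level comparison using the two totals printed in the Examples output (120 and 50). The formula (weighted minus target, divided by target) is an assumption inferred from that output, not taken from the function source, and the variable names are hypothetical.

```r
# Totals copied from the Examples output; the formula is an assumption.
weighted_total <- 120   # weighted person total ("5-year")
target_total   <- 50    # PUMS person target ("1-year")
threshold      <- 0.05  # default: 5%, expressed as a proportion

# Relative difference, then the same value formatted as a percentage.
rel_diff  <- (weighted_total - target_total) / target_total
pct_label <- sprintf("%.0f%%", 100 * rel_diff)

# TRUE means the difference exceeds the threshold.
exceeds <- abs(rel_diff) > threshold

print(pct_label)
print(exceeds)
```

Under this reading, a difference of 140% is far above the 5% default, which matches the error shown in the Examples.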

Value

NULL. Called for its validation side effect.

Details

  • Calculates expected total weights for households and persons.

  • Compares weighted sums to PUMS targets by day group.

  • Reports percent differences and, when stop_on_fail is TRUE, stops if any difference exceeds the threshold.

  • Handles household and person study units.

  • Used for validation; returns nothing.
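
The steps above can be sketched roughly as follows. This is a minimal illustration, not the function's actual implementation: it assumes that weighted totals are sums of h_total and p_total multiplied by initial_weight, and that targets are compared unadjusted. The real function may scale or filter targets before comparing. All data values are hypothetical.

```r
library(data.table)

# Hypothetical inputs using the documented column names.
seed <- data.table(
  day_group      = c("weekday", "weekday", "weekend"),
  h_total        = c(1, 1, 1),
  p_total        = c(2, 3, 1),
  initial_weight = c(10, 20, 15)
)
pums_targets <- data.table(
  day_group = c("weekday", "weekend"),
  h_total   = c(30, 20),
  p_total   = c(80, 15)
)
threshold <- 0.05

# Weighted household and person totals per day group (assumed formula).
weighted <- seed[, .(
  h_weighted = sum(h_total * initial_weight),
  p_weighted = sum(p_total * initial_weight)
), by = day_group]

# Join the targets and compute relative differences against them.
cmp <- merge(weighted, pums_targets, by = "day_group")
cmp[, h_diff := (h_weighted - h_total) / h_total]
cmp[, p_diff := (p_weighted - p_total) / p_total]

# A day group passes only if both differences are within the threshold.
cmp[, pass := abs(h_diff) <= threshold & abs(p_diff) <= threshold]
print(cmp)
```

In this toy data the weekday group matches its targets exactly, while the weekend household total falls 25% short of its target, so only the weekday group passes.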

Settings

  • weight_dow_groups (direct): Valid day-of-week groups. Default from settings.

  • study_unit (direct): Study unit type. Default from settings.
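
Because both settings entries are required, a caller may want to confirm they are present before running the check. The sketch below is one possible pre-flight test in base R. The settings values are copied from the Examples, and the `required` and `missing_entries` names are hypothetical.

```r
# Settings list with the values used in the Examples.
settings <- list(
  weight_dow_groups = c("Mon", "Tue"),
  study_unit        = "household"
)

# Entries the documentation lists as required.
required <- c("weight_dow_groups", "study_unit")

# Fail early with a readable message if any entry is absent.
missing_entries <- setdiff(required, names(settings))
if (length(missing_entries) > 0) {
  stop("settings is missing: ", paste(missing_entries, collapse = ", "))
}
```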

See also

  • scripts/weighting/initial_weights.R

Other validation helpers: check_daypat_results(), check_group_sum(), check_group_sums(), check_ref_counts(), check_weight_skew()

Examples

## Not run:
library(data.table)
seed <- data.table(p_total = 100, h_total = 50, initial_weight = 1.2, day_group = "weekday")
pums_targets <- data.table(h_total = 50, p_total = 100, day_group = "weekday")
settings <- list(weight_dow_groups = c("Mon", "Tue"), study_unit = "household")
check_initial_weights(seed, pums_targets, settings)
#> Checking base weights (5-year ACS) against PUMS targets (1-year ACS)...
#> Day group: weekday
#>     PER: 120 (5-year), 50 (1-year), 140%
#>     HH: 60 (5-year), 25 (1-year), 140%
#> Error in check_initial_weights(seed, pums_targets, settings): 
#> Reference count differs from ACS by more than 5%.
#> Check the sample plan or the seed data.
## End(Not run)