Summarizes disaggregated survey data, applying target updates for consistency with PUMS targets, and computes totals and confidence intervals by group. Use for reporting and validation of survey results against control targets.

Usage

summarize_survey(
  survey_wide,
  puma_zone_group_xwalk,
  id_cols = "hh_id",
  weight_col = "final_weight",
  strata_col = NULL,
  group_col = NULL,
  ci_level = 0.9,
  settings
)

Arguments

survey_wide

data.table. Disaggregated survey target data. Required columns:

  • hh_id : household ID

  • person_id : person ID

  • final_weight : survey weights

  • Additional columns for targets

Rows: one per person or household. Modified by reference: no (returns copy).

puma_zone_group_xwalk

data.table. Crosswalk between PUMS and zone groups via client zones. Must include all grouping columns except geometry.

id_cols

character vector. Columns to identify unique units (hierarchy). Passed to ids argument of survey::svydesign. Default: "hh_id".
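
As a rough illustration of how a character vector such as id_cols could reach the ids argument of survey::svydesign, the sketch below converts it to a one-sided formula and builds a design on toy data. The toy table, the is_worker column, and the use of stats::reformulate() are assumptions made for this example; the page does not show how summarize_survey() performs the conversion internally.

```r
library(survey)

# Toy person-level data (hypothetical values): two persons per household.
toy <- data.frame(
  hh_id        = c(1, 1, 2, 2, 3, 3),
  person_id    = 1:6,
  final_weight = c(10, 10, 20, 20, 15, 15),
  is_worker    = c(1, 0, 1, 1, 0, 0)
)

# A character vector of ID columns becomes a one-sided formula.
# With several columns, e.g. c("hh_id", "person_id"), this would give
# ~hh_id + person_id, which svydesign reads as successive sampling stages.
id_cols <- "hh_id"
ids_formula <- stats::reformulate(id_cols)

design <- svydesign(ids = ids_formula, weights = ~final_weight, data = toy)

# Weighted total of the toy target with a 90% confidence interval.
est <- svytotal(~is_worker, design)
ci  <- confint(est, level = 0.9)
print(est)
print(ci)
```

Households act as the sampling clusters here, so persons in the same household are not treated as independent observations when the standard error is computed.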

weight_col

character(1). Column with weights. Default: "final_weight".

strata_col

character(1). Column to stratify by. Default: NULL.

group_col

character vector. Columns to group by. Default: NULL.

ci_level

numeric. Confidence interval level (fraction, not percent). Default: 0.9.
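
To make the "fraction, not percent" convention concrete, the sketch below shows how a level of 0.9 maps to a two-sided normal multiplier. It assumes a normal-approximation interval of the form total ± z × SE; whether summarize_data() computes intervals this way is not stated on this page, and the total and standard error are hypothetical numbers.

```r
ci_level <- 0.9                      # a fraction; 90 would be wrong

# Two-sided critical value: leaves (1 - ci_level) / 2 in each tail.
z <- qnorm(1 - (1 - ci_level) / 2)   # roughly 1.645 for ci_level = 0.9

# Hypothetical estimate and standard error, for illustration only.
total <- 1200
se    <- 85

lower <- total - z * se
upper <- total + z * se
c(lower = lower, total = total, upper = upper)
```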

settings

list. Project settings; must include target update definitions.

Value

data.table. Summarized target data by group, with columns:

  • Grouping columns (as specified)

  • Target variable columns

  • total, lower, upper for each target (confidence interval)

Rows are ordered by group and target.

Details

  • Checks for required columns: hh_id, final_weight, and the columns named in strata_col and group_col (when supplied).

  • Drops geometry from crosswalk to reduce memory usage.

  • Applies update_targets() to harmonize survey targets with PUMS definitions.

  • Calls summarize_data() to aggregate by group and calculate confidence intervals.

  • Uses message() to report grouping.

  • Returns a data.table of summarized targets by group, with confidence intervals.

  • Error handling: stops if required columns are missing.
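
The required-column check described above could look roughly like the following. This is a minimal sketch, not the package's implementation: check_required_cols() is a hypothetical helper, and a plain data.frame stands in for the data.table input so the example needs no extra packages.

```r
# Hypothetical helper: stop with a readable message if any column is absent.
check_required_cols <- function(dt, required) {
  missing_cols <- setdiff(required, names(dt))
  if (length(missing_cols) > 0) {
    stop(
      "Missing required columns: ",
      paste(missing_cols, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

toy <- data.frame(hh_id = 1:2, final_weight = c(1.5, 2))

# Passes silently: both columns are present.
check_required_cols(toy, c("hh_id", "final_weight"))

# Fails: the grouping column is absent, so the error message is captured.
res <- tryCatch(
  check_required_cols(toy, c("hh_id", "final_weight", "zone_group")),
  error = function(e) conditionMessage(e)
)
print(res)
```

Listing every missing column in a single message saves the caller from fixing them one failed run at a time.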

Settings

  • Uses target update definitions from settings for harmonization.

See also

summarize_data, update_targets, summarize_pums

Other reporting utilities: find_level_idx(), get_target_map(), summarize_data(), summarize_pums(), tabulate_target()

Examples

## Not run:
summarize_survey(survey_wide, 
                 puma_zone_group_xwalk, 
                 id_cols = "hh_id", 
                 weight_col = "final_weight", 
                 group_col = "zone_group", 
                 settings = settings)
## End(Not run)