Audience: analysts who want to run, explore, or extend the HTS weighting workflow.
Purpose: Explain how to use the repo day-to-day, including running it end-to-end, stepping into specific chapters, using the package functions, and managing caches.
What is This Repository?
This repository contains three tightly connected things:
- An R package (hts.weighting): a library of helper functions that the book uses internally and that you can also call interactively.
- A YAML configuration system (configs/): a set of example configuration files that define inputs, outputs, and parameters for different regions and scenarios. You can create your own by copying and modifying these.
- A runnable workflow: the Quarto book (manual/) executes the full weighting process, step by step, from configuration through final weights.
The Quarto chapters are the “scripts.” They read your YAML configuration (see configs/), process data, and write both intermediate data caches and final results to a structured folder.
Running the Workflow End-to-End
Once your environment and configuration file are set up (see INSTALL.md and CONFIGURATION.qmd), you can recreate all weights by rendering the entire manual:
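As a minimal sketch, assuming the Quarto CLI is installed and the command is run from the repository root, the render can be wrapped in a small shell function. The name `render_manual` is hypothetical, and the project's own invocation may differ.

```shell
# Hypothetical helper: render the whole Quarto book.
# Assumes the Quarto CLI is on PATH and the book lives in manual/.
render_manual() {
  book_dir="${1:-manual}"
  if ! command -v quarto >/dev/null 2>&1; then
    echo "quarto not found on PATH" >&2
    return 127
  fi
  # `quarto render <dir>` renders the project found in that directory.
  quarto render "$book_dir"
}
```

Calling `render_manual` with no arguments targets manual/; pass a different directory as the first argument if your book lives elsewhere.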
Rendering runs every chapter in order, using cached results when possible. The final HTML lives at manual/_book/index.html, and the data reside in the project directories under the WEIGHTING_DATA_PATH specified in your .Renviron (default: test_cache/).
If you only want to rerun a single stage (for example, data cleaning or initial expansion), you can render only that chapter's .qmd file.
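A single-chapter render might look like the sketch below. It assumes the Quarto CLI is on your PATH; the helper name `render_chapter` is hypothetical, and the chapter path in the usage note is the example file named elsewhere in this guide.

```shell
# Hypothetical helper: render one chapter of the book.
# Assumes the Quarto CLI is on PATH.
render_chapter() {
  chapter="$1"
  if [ ! -f "$chapter" ]; then
    echo "chapter not found: $chapter" >&2
    return 1
  fi
  # Passing a single .qmd renders just that file.
  quarto render "$chapter"
}
```

For example, `render_chapter manual/040_initial_expansion.qmd` would rerun only the initial expansion stage.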
Each chapter is self-contained: it reads inputs from the cache, runs its step, and saves new outputs back to the cache. Later chapters depend on earlier ones, however, so you may need to run prior steps first, or update subsequent chapters if you change earlier outputs. Future versions may include more granular dependency tracking, e.g., with makepipe or targets.
Some code chunks hide code, messages, or warnings for cleaner output. To view all code and outputs, open the .qmd files and set echo: true, message: true, and warning: true in chunk options. See Quarto Execution Options for details.
Exploring and Modifying Quarto “Scripts”
You can open any .qmd in your IDE (Positron, VSCode, RStudio) and run it interactively, line by line.
This is the best way to explore logic, test small changes, or inspect objects mid-run.
Typical workflow:
Open the chapter you want (e.g., manual/040_initial_expansion.qmd).
Run code cells interactively using your IDE’s “Run Cell” or “Run Line” tools.
Unpack hts.weighting functions by navigating to their definitions (in the R/ folder) and running them line by line as needed, or with debug().
When satisfied, re-render the chapter to update its cached outputs.
Exploring and Modifying the Package Functions
You can call the same functions used in the Quarto “scripts” directly from an R session:
devtools::load_all()        # load the package from source
settings <- get_settings()  # reads your YAML config, sets up paths
From there, you can inspect, modify, or reuse the intermediate data created in the caches.
For example:
# load an intermediate dataset
hh <- readRDS(file.path(settings$working_dir, "household_clean.rds"))
# experiment with a package function interactively
debug(calc_initial_weights)
calc_initial_weights(hh, settings)
undebug(calc_initial_weights)
Edits to functions under R/ take effect immediately after running (or re-running) devtools::load_all(); no reinstall is needed.
Understanding and Managing Caches
The weighting project maintains two layers of caching:
Quarto Execution Caches
- Stored in _freeze/ inside the manual/ directory.
- Controls whether Quarto re-runs code chunks.
- Each chapter’s cache is automatically invalidated when its code or inputs change.
- You can safely delete manual/_freeze/ to force a clean rebuild of the manual.
- Or turn off caching entirely by adding cache: false to the chapter or book YAML header (see: manual/_quarto.yml). This is the safest option when testing code.
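Deleting the freeze directory can be sketched as a guarded shell function. The name `clear_freeze` is hypothetical, and the default path assumes the book lives in manual/ as described above.

```shell
# Hypothetical helper: remove Quarto's _freeze/ directory so the next
# render re-executes every chapter. Assumes the book lives in manual/.
clear_freeze() {
  book_dir="${1:-manual}"
  freeze_dir="$book_dir/_freeze"
  if [ -d "$freeze_dir" ]; then
    rm -rf "$freeze_dir"
    echo "removed $freeze_dir"
  else
    echo "nothing to remove at $freeze_dir"
  fi
}
```

This only touches the Quarto execution cache; the data caches described in the next section are left in place.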
Data Caches in the Project Root (the “Weighting Cache”)
Located under your working data path, typically test_cache/ or a directory defined in your .Renviron settings.
A typical structure:
test_cache/
input/ # raw survey + control data
working/ # intermediate outputs per stage
output/ # final household/person/day weights
report/ # diagnostics, summaries, plots
Each chapter reads from and writes to these subfolders using paths defined in settings.
Cache management tips:
- To rerun one stage cleanly: delete that stage’s files in working/ and report/, then re-render its .qmd.
- To reset the entire project: delete all subfolders inside test_cache/. (Sometimes: delete all except input/ to keep raw data.)
- To test edits without overwriting production outputs: point to a different cache root in your YAML (e.g., test_cache_dev/).
- To inspect results: you can open any intermediate RDS or CSV file in working/ or report/.
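The "delete all except input/" reset from the tips above could be scripted roughly as follows. This is a sketch under the folder layout shown earlier; the function name `reset_cache_keep_input` is hypothetical, and it is worth checking the cache root argument before running it against real data.

```shell
# Hypothetical helper: delete every subfolder of the cache root except
# input/, so raw survey and control data survive the reset.
reset_cache_keep_input() {
  cache_root="${1:-test_cache}"
  if [ ! -d "$cache_root" ]; then
    echo "cache root not found: $cache_root" >&2
    return 1
  fi
  for dir in "$cache_root"/*/; do
    # If there are no subfolders the glob stays literal; skip it.
    if [ ! -d "$dir" ]; then
      continue
    fi
    name=$(basename "$dir")
    if [ "$name" != "input" ]; then
      rm -rf "$dir"
    fi
  done
  return 0
}
```

Run as `reset_cache_keep_input test_cache`, this would remove working/, output/, and report/ while leaving input/ untouched.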
Because the Quarto book uses the same cache directories across chapters, outputs cascade automatically: one chapter’s outputs become the next chapter’s inputs.
Debugging
To step through a function while running a chapter or in the console:
debug(calc_initial_weights)
In Positron and VSCode, execution will pause inside the function, letting you inspect variables and step through the code with the browser commands n (next), c (continue), and Q (quit).
To turn debugging off afterward:
undebug(calc_initial_weights)