2 Installation & Setup
Installation Guide for hts_weighting
2.1 At a Glance
Purpose: Guide users through installing all necessary software and configuring their machine to run the hts_weighting project.
Audience: anyone who needs to run the hts_weighting project locally.
Time to Complete: ~30-60 minutes, depending on prior experience and what’s already installed.
Key Components: You’ll install R, Python, and optionally an IDE before configuring project-specific settings.
- R 4.4.3 with renv package management
- Python 3.12.x with PopulationSim via the uv environment manager
- Quarto for rendering documentation
- Plus some extras if you want them!
2.2 Prerequisites
Git & Repository Setup
Install Git from https://git-scm.com/downloads if you haven’t already.
Next, clone this repository to your local machine. Do this with your preferred Git client, or run in terminal:
git clone https://github.com/RSGInc/hts_weighting-psrc_2025.git
- RSG Users: If you’ll be doing a lot of weighting, you might want to place all your clones of hts_weighting in a common parent directory, e.g., C:/Users/your.name/Documents/locals/hts_weighting_forks/. This makes it easier to switch between different versions or forks of the project, and avoids clutter.
R Environment Setup
We pin to R 4.4.3 for reproducibility across machines and CI. R 4.4.3 is the final release of the 4.4.x series, i.e., the last stable version before the 4.5.x series. You will also need build tools to compile packages from source: Rtools on Windows, or the Xcode Command-Line Tools on macOS.
Windows:
- Install R 4.4.3 (default C:\Program Files\R\R-4.4.3).
- Install Rtools44 from its install page: Rtools44 for Windows.
- Verify your R build tools are working. In R, run:
install.packages("pkgbuild")
pkgbuild::has_build_tools(debug = TRUE)
macOS:
- Install R 4.4.3 for your CPU (Apple silicon or Intel).
- Install Xcode Command-Line Tools (CLT):
xcode-select --install
Willing to be a guinea pig? Mac users may want to try installing the toolchain with the package macrtools:
remotes::install_github("rmacoslib/macrtools")
macrtools::macos_rtools_install()
RSG has not tested macrtools (we are a Windows shop); please report back any issues, or if you find success!
Python Environment Manager
Install uv
Install uv, a lightweight Python environment manager, from https://docs.astral.sh/uv/reference/installer/. You can do this via command line:
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows (PowerShell)
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
Create and Sync Virtual Environment
Then, from the repo root, sync the environment and install PopulationSim:
# from the repo root
uv sync
This creates .venv/ and installs PopulationSim.
Verify Python Executable
Open a bash or PowerShell terminal (use the dropdown arrow beside the terminal name).
Then run:
uv run python --version
You should see something like:
Python 3.12.x
IDE (optional but recommended)
Choose one editor.
Option 1: Positron
Positron is the preferred IDE for this repository. It combines RStudio’s tight R integration with VS Code’s Quarto and Python tooling, and it offers really nice YAML support through YAML Language Support.
Download from: https://positron.posit.co/download
After installing:
Open the repository folder (File → Open Folder, then select your cloned hts_weighting-psrc_2025 directory).
Confirm Positron detects R, Python, and Quarto. Open an R Terminal in Positron (Ctrl+Shift+R, or use the dropdown in the terminal panel).
Then run:
R.version.string
You should see something like:
"R version 4.4.3 (2025-02-28 ucrt)"
(The "ucrt" suffix appears on Windows builds only.)
Next check for Python:
uv run python --version
You should see something like:
Python 3.12.x
Not seeing the right versions? Check the Positron documentation at Interpreter Startup.
Quarto comes bundled with Positron, so just verify it’s working.
Open the command palette with Cmd/Ctrl + Shift + P → search “Extensions: Show Installed Extensions”
Confirm Quarto appears and is Enabled.
In terminal, run:
quarto check
You should see something like:
Quarto 1.7.x
[>] Checking environment information...
Quarto cache location: C:\Users\your.name\AppData\Local\quarto
(And a bunch of other stuff.) You can ignore warnings about the R version here, since we’ll fix that later.
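If you’d rather not eyeball three separate version strings, the checks above can be wrapped in a small shell helper. This is only a sketch: check_version is a hypothetical function (it is not part of hts_weighting), and it simply tests whether the reported string carries the expected version prefix.

```shell
# Hypothetical helper (not part of hts_weighting): succeed when the reported
# version string starts with the expected prefix, or contains it right after
# a space (as in "Python 3.12.4" or "R version 4.4.3 (...)").
check_version() {
  # $1 = label, $2 = reported version string, $3 = expected version prefix
  case "$2" in
    "$3"*|*" $3"*)
      echo "OK: $1 ($2)"
      ;;
    *)
      echo "MISMATCH: $1 reported '$2', expected $3" >&2
      return 1
      ;;
  esac
}

# Usage, assuming R, uv, and Quarto are already on PATH:
#   check_version "R"      "$(Rscript -e 'cat(R.version.string)')" "4.4.3"
#   check_version "Python" "$(uv run python --version)"            "3.12."
#   check_version "Quarto" "$(quarto --version)"                   "1.7."
```

A non-zero exit status from any call means that tool is missing or resolving to a different version than this guide expects.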
Option 2: RStudio
- Install RStudio Desktop.
- Set R version: Tools → Global Options → General → R version → Change… → 4.4.3.
Option 3: VS Code
We recommend VS Code for power users or people who want tight integration with GitHub Copilot (still in development for Positron and RStudio). VS Code has excellent support for R, Python, Quarto, and YAML, but requires more setup (e.g., JSON workspace settings and extensions).
Install these extensions: R Extension for Visual Studio Code, Quarto, and YAML (Red Hat).
Verify your R version:
R --version
# or in R
R.version.string
If VS Code can’t find R 4.4.3, or loads the wrong version, edit the workspace Settings JSON:
Open VS Code.
Press Ctrl+Shift+P (or Cmd+Shift+P on Mac) and type Preferences: Open Workspace Settings (JSON)
Add or edit the following lines, then restart your terminal:
Windows
{
  "r.rterm.windows": "C:\\Program Files\\R\\R-4.4.3\\bin\\x64\\R.exe",
  "r.alwaysUseActiveTerminal": true,
  "r.plot.useHttpgd": true
}
macOS
{
  "r.rterm.mac": "/Library/Frameworks/R.framework/Resources/bin/R",
  "r.alwaysUseActiveTerminal": true,
  "r.plot.useHttpgd": true
}
2.3 Configure Environment Variables
Create .Renviron
This installation guide assumes you are using a project-level .Renviron file to set environment variables specific to this project and your machine. This file is git-ignored by default.
What is .Renviron? .Renviron is a simple text file that R reads on startup to set environment variables. It can be placed at the user level (e.g., ~/.Renviron) or project level (in the root of your project). See Managing R for details.
Before proceeding, create a .Renviron file in the root of this repository like the following, adjusting paths and credentials as needed. We’ll cover each variable below.
Template .Renviron File (Minimal)
PYTHON_VENV_PATH=.venv/Scripts/python.exe
RAW_DATA_PATH=path/to/your/PSRC_2025_client_inputs
# POPS_USER=YourUsername
# POPS_PASSWORD=SecretPassword
# POPS_HOST=db.example.com
# POPS_PORT=xxxx
WEIGHTING_SETTINGS_PATH=psrc_2025.yaml
WEIGHTING_DATA_PATH=test_cache
Add Python Executable Path to .Renviron
To ensure R can find the Python executable managed by uv, set the PYTHON_VENV_PATH variable in your .Renviron file. This path should be relative to the repository root and point to the Python executable inside the .venv directory created by uv sync.
On Windows, use:
PYTHON_VENV_PATH=.venv/Scripts/python.exe
On macOS/Linux, use:
PYTHON_VENV_PATH=.venv/bin/python
Edit your .Renviron file accordingly after running uv sync.
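If you move between machines, a small shell sketch can print the right line for the current platform. It assumes you run it from a Unix-style shell (Git Bash on Windows, or any terminal on macOS/Linux); venv_python_path is a hypothetical helper name, and the two paths are the ones given above.

```shell
# Hypothetical helper: choose the PYTHON_VENV_PATH value for a platform name
# as reported by `uname -s`.
venv_python_path() {
  case "$1" in
    MINGW*|MSYS*|CYGWIN*)
      # Git Bash and similar Unix-style shells on Windows
      echo ".venv/Scripts/python.exe"
      ;;
    *)
      # macOS, Linux
      echo ".venv/bin/python"
      ;;
  esac
}

# Print the line to paste into .Renviron for the current machine.
echo "PYTHON_VENV_PATH=$(venv_python_path "$(uname -s)")"
```

PowerShell has no uname, so Windows users working there should just copy the Windows line by hand.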
Configure Data Source Access
Next, add variables that point hts_weighting to the raw data inputs. RSG users point to the POPS database with the POPS_* variables. Clients set RAW_DATA_PATH to the folder of raw client inputs, typically shared on SharePoint.
RAW_DATA_PATH=path/to/your/PSRC_2025_client_inputs
# POPS_USER=YourUsername
# POPS_PASSWORD=SecretPassword
# POPS_HOST=db.example.com
# POPS_PORT=xxxx
Set Weighting Config Path
Next, specify the path to the settings config YAML file. If you give a relative path rather than an absolute one (recommended), the configs directory is assumed to be at getwd()/configs.
WEIGHTING_SETTINGS_PATH=psrc_2025.yaml
Set Weighting Data Path
Finally, we set the WEIGHTING_DATA_PATH variable to point to where the project directories (input/, working/, output/, report/) will be created on first run. This path can be absolute or relative to the repository root. If not set, a temporary directory will be used (see ?tempdir), which is not persistent.
Note: If you name this anything but “test_cache”, you’ll need to add it to the .gitignore file in the root of this repository. Don’t commit your data!
WEIGHTING_DATA_PATH=test_cache
Optional Developer Settings
These settings are optional, and mainly useful for developers working on the package or running tests. To use, uncomment and set in your .Renviron file.
WEIGHTING_CODE_PATH can be used to specify where the hts_weighting code is stored, if different from the current working directory. This is useful when running devtools::check() to avoid path issues. It should point to the absolute path of the cloned repository.
TESTTHAT_CPUS can be used to specify the number of CPUs to use for parallel testing with testthat. Defaults to 2.
TEARDOWN_TEST_DATA can be used to specify which data directories to tear down after tests complete. Options are inputs_dir, outputs_dir, report_dir, and working_dir. Use * to tear down the entire data directory.
# WEIGHTING_CODE_PATH=C:/your/local/path/hts_weighting
# TESTTHAT_CPUS=4
# TEARDOWN_TEST_DATA=inputs_dir, outputs_dir, report_dir, working_dir
# To tear down the entire data dir instead:
# TEARDOWN_TEST_DATA=*
Install Census API Key
Many steps (ACS/PUMS pulls, geographies) rely on the tidycensus package, which calls the Census Bureau’s API. To use this service, you need a Census API key. Follow these steps to install it:
- Sign up for a free API key at the Census Bureau’s website.
- Once you have your key, add it to your user-level (preferred) or project-level .Renviron file:
CENSUS_API_KEY=your_api_key_here
Restart R afterward.
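Once the variables in this section are in place, a quick scan of the project-level .Renviron can catch a name that is still commented out or left empty. The sketch below is a hypothetical helper, not something shipped with hts_weighting; the list of names is simply the set of uncommented variables in the minimal template above.

```shell
# Hypothetical helper: print each variable from the minimal template that is
# not defined as an uncommented, non-empty NAME=value line in the given file.
missing_renviron_vars() {
  # $1 = path to a .Renviron file
  for name in PYTHON_VENV_PATH RAW_DATA_PATH WEIGHTING_SETTINGS_PATH WEIGHTING_DATA_PATH; do
    # Lines starting with "#" do not match, so commented-out entries count as missing.
    grep -q "^${name}=." "$1" 2>/dev/null || echo "$name"
  done
}

# Usage from the repository root:
#   missing_renviron_vars .Renviron
```

No output means all four names are set. RSG users who connect through the POPS_* variables can ignore a report that RAW_DATA_PATH is missing.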
2.4 Set Up R Package Environment
Initialize renv and Restore Packages
This repository currently uses renv 1.1.5. If needed, first install or update renv:
install.packages("renv")
renv::activate()
Some users may need to run:
renv::load()
We don’t know why this is, but suspect it has something to do with OneDrive mucking up file paths. If you discover the source of this error, please let us know!
Next, restore the R environment from the lockfile (renv.lock):
renv::restore()
On restart you should see something like:
- Project '...' loaded. [renv 1.1.5]
If you add new packages later, remember to run renv::snapshot() to update the lockfile.
Load the hts.weighting Package
Although hts.weighting is an R package, we prefer to load locally from source rather than installing from GitHub. Though it adds an extra step to installation and use, the main advantage is that it permits users to modify functions in tandem with the scripts that use them. We may change this workflow in the future, as the package matures.
Install devtools (once per project):
renv::install("devtools")
renv::snapshot()
Then, try loading the package:
devtools::load_all()
Finally, test that the package loaded correctly and that your environment is set up:
settings <- get_settings()
Optional: If you’d like to avoid calling devtools::load_all() every time you open an R terminal, you can tell R to auto-load the package in interactive sessions via a git-ignored .Rprofile.local file in the root of your project. Add this chunk to .Rprofile.local:
if (interactive() && !identical(Sys.getenv("CI"), "true")) {
  if (!requireNamespace("devtools", quietly = TRUE)) {
    message("devtools not installed; run renv::install('devtools') then renv::snapshot(); restart R.")
  } else {
    try(devtools::load_all("."), silent = TRUE)
  }
}
Restart R, then verify with:
settings <- get_settings()
Sometimes this is persnickety in Positron or VS Code; if you run into issues, just call devtools::load_all() manually.
2.5 Install Quarto (only for RStudio and VSCode users)
Note: Quarto is used to render the book and some reporting .qmd files in scripts/. It is not needed to run all the weighting scripts.
Positron bundles Quarto in its download; only RStudio/VS Code users will need to install Quarto ≥ 1.7. Download and install Quarto from https://quarto.org/docs/get-started/. After installation, verify that quarto is on your PATH. In a terminal, run:
quarto check
If an old version appears, update your PATH and restart your terminal and R session.
Additionally, you may need to point Quarto to the correct R version.
During quarto check, you should see something like:
Checking R installation...........OK
Version: 4.4.3
VS Code users may see the wrong R version here. To fix, add the path to your R installation to your Workspace Settings (Command Palette → “Preferences: Open Workspace Settings (JSON)”):
{
  "terminal.integrated.env.windows": {
    "QUARTO_R": "C:\\Program Files\\R\\R-4.4.3\\bin\\x64"
  }
}
For details and other potential solutions, see Quarto Environment Variables.
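The “Quarto ≥ 1.7” requirement in this section can also be checked from a script rather than by reading the quarto check output. The following is a minimal sketch; quarto_version_ok is a hypothetical helper that compares only the major and minor parts of a version string such as the one printed by quarto --version.

```shell
# Hypothetical helper: succeed when a version string such as "1.7.32" is at
# least the given major.minor (defaults to 1.7, the minimum named in this guide).
quarto_version_ok() {
  version="$1"
  min_major="${2:-1}"
  min_minor="${3:-7}"
  major="${version%%.*}"   # text before the first dot
  rest="${version#*.}"     # text after the first dot
  minor="${rest%%.*}"      # text before the next dot
  [ "$major" -gt "$min_major" ] ||
    { [ "$major" -eq "$min_major" ] && [ "$minor" -ge "$min_minor" ]; }
}

# Usage, assuming quarto is on PATH:
#   quarto_version_ok "$(quarto --version)" || echo "Quarto is older than 1.7; update it or fix PATH"
```

The comparison is numeric, so 1.10 correctly counts as newer than 1.7, which a plain string comparison would get wrong.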
2.6 Optional: IDE Enhancements
YAML Schema Validation
This is optional but recommended - only for Positron/VSCode users.
To get live validation and autocomplete for your configuration .yaml files:
- In Positron or VSCode, install the YAML (Red Hat) extension.
- Open the Command Palette → “Preferences: Open Workspace Settings (JSON)”.
- Add this entry:
"yaml.schemas": {
  "./configs/utils/configs_schema.json": ["configs/*.yaml"]
}
You’ll now see instant linting and tooltips when editing any .yaml file in configs/. This is powered by the JSON schema at configs/utils/configs_schema.json.
⚠️ In development: Settings schema validation is relatively new to the hts.weighting package. Please report any issues you encounter, and use as a general guide rather than a definitive source of truth.
2.7 Troubleshooting
(If you run into new issues, report back here and add to this list!)
Windows: compilation errors
Install Rtools44, then in R:
install.packages("pkgbuild")
pkgbuild::has_build_tools(debug = TRUE)
Quarto not found
Install Quarto and ensure it’s on PATH; quarto check should pass. Restart terminal + R after changing PATH.
VS Code can’t find R
Set r.rterm.* in Settings JSON (examples above).