
Converts a set of probability columns to binary indicators by marking the column with the maximum value in each row. Intended for multinomial imputation or categorical assignment from probabilities. This function is experimental and not ready for production use: it currently halts with an error when called (see Details).

Usage

make_binary(data, grp_pattern, id_var)

Source

Placeholder logic adapted from impute_race and make_probs_binary.

Arguments

data

data.table. Input with probability columns to convert.

  • Required columns: all matching grp_pattern (e.g., f_a, f_b, ...)

  • Other columns: must include id_var (character or integer, unique row ID)

  • Rows: one per observation.

  • Modified by reference: no (returns copy).

grp_pattern

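character(1). Regex pattern matching the names of the columns to convert (e.g., 'f_'). The pattern is anchored to the start of the column name internally, so pass 'f_' rather than '^f_'.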

id_var

character(1). Name of ID column for row identification.

Value

data.table. Copy of input with binary indicator columns.

  • Columns: all original columns except replaced probability columns (now binary indicators)

  • Each binary column: 1 in rows where that column holds the maximum probability, 0 otherwise

  • Row order preserved
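
The shape of the returned value can be illustrated with a minimal sketch. It is not the package's implementation: binarize_max is a hypothetical helper, it uses a plain data.frame and base R so that it runs without extra packages, and it assumes that ties go to the first matching column, which this page does not specify.

```r
# Hypothetical stand-in for the documented behaviour; not the package's code.
binarize_max <- function(data, grp_pattern, id_var) {
  stopifnot(id_var %in% names(data))

  # Anchor the pattern at the start of the column name.
  prob_cols <- grep(paste0("^", grp_pattern), names(data), value = TRUE)
  stopifnot(length(prob_cols) > 0)

  probs <- as.matrix(data[, prob_cols, drop = FALSE])

  # Index of the row-wise maximum. ASSUMPTION: ties go to the first column.
  winner <- max.col(probs, ties.method = "first")

  # Start from all zeros, then set a single 1 per row.
  binary <- matrix(0L, nrow = nrow(probs), ncol = ncol(probs),
                   dimnames = list(NULL, prob_cols))
  binary[cbind(seq_len(nrow(probs)), winner)] <- 1L

  # Work on a copy so the input is left untouched.
  out <- data
  out[prob_cols] <- as.data.frame(binary)
  out
}

df <- data.frame(id = 1:3, f_a = c(0.2, 0.5, 0.3), f_b = c(0.8, 0.5, 0.7))
res <- binarize_max(df, "f_", "id")
print(res)
```

In this sketch the ID column and the row order pass through unchanged, and only the matched probability columns are overwritten.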

Details

  • Identifies the probability columns by prefixing grp_pattern with '^' and matching the result against the column names.

    • Uses: str_subset(names(data), paste0('^', grp_pattern))

    • Example: grp_pattern = 'f_' becomes the regex '^f_', which matches columns like f_a, f_b, etc.

  • For each row, sets 1 for the column with the maximum probability, 0 for others.

  • Returns a copy of input with binary columns replacing probabilities.

  • Checks that row sums equal 1 (within tolerance).

  • Assumes input is a data.table; does not modify by reference.

  • Currently a placeholder: calling the function halts with stop("Reimplement using sample").

  • A future implementation is expected to use probabilistic sampling (see commented code).

  • Example effective pattern: ^f_ selects columns whose names start with 'f_'. This is what grp_pattern = 'f_' expands to.

  • Typical use: multinomial assignment from probability columns.
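
The list above mentions a row-sum check and a planned sampling-based reimplementation. The following sketch shows one way these two pieces could fit together. Everything in it is an assumption made for illustration: sample_binary is a hypothetical name, the tolerance of 1e-8 is an arbitrary choice (this page only says "within tolerance"), and the commented code the page refers to may work differently.

```r
# Hypothetical sketch of sampling-based assignment; not the package's code.
sample_binary <- function(probs, tol = 1e-8) {
  probs <- as.matrix(probs)

  # Row-sum check. ASSUMPTION: the tolerance value is illustrative only.
  if (any(abs(rowSums(probs) - 1) > tol)) {
    stop("Each row of probabilities must sum to 1")
  }

  k <- ncol(probs)

  # Draw one category per row, weighted by that row's probabilities.
  drawn <- apply(probs, 1, function(p) sample.int(k, size = 1, prob = p))

  # One-hot encode the drawn category.
  out <- matrix(0L, nrow = nrow(probs), ncol = k,
                dimnames = list(NULL, colnames(probs)))
  out[cbind(seq_len(nrow(probs)), drawn)] <- 1L
  out
}

set.seed(1)
probs <- cbind(f_a = c(0.2, 0.5, 1), f_b = c(0.8, 0.5, 0))
draws <- sample_binary(probs)
print(draws)
```

Unlike taking the row maximum, sampling can assign a row to a lower-probability category, so repeated runs differ unless a seed is set. A row with probability 1 in one column is always assigned to that column.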

Settings

None.

Examples

## Not run:
library(data.table)
dt <- data.table(id = 1:3, f_a = c(0.2, 0.5, 0.3), f_b = c(0.8, 0.5, 0.7))
make_binary(dt, 'f_', 'id')
#> Error in make_binary(dt, "f_", "id"): could not find function "make_binary"
## End(Not run)