Software Implementation¶
This page describes the bca4abm software implementation and how to contribute.
The implementation starts with the ActivitySim framework, which serves as the foundation for the software. The framework, as briefly described below, includes features for data pipeline management, expression handling, testing, etc. Built upon the framework are additional core components for benefits calculation.
ActivitySim Framework¶
bca4abm is implemented in the ActivitySim framework. As summarized below, being implemented in the ActivitySim framework means:
- Overall Design
  - Implemented in Python, making heavy use of the vectorized backend C/C++ libraries in pandas and numpy
- Data Handling
  - Inputs are in CSV format, with the exception of settings
  - CSVs are read in as pandas tables and stored in an intermediate HDF5 binary file that is used for data I/O throughout the model run
  - Key outputs are written to CSV files
- Key Data Structures
  - pandas.DataFrame - a data table with rows and columns, similar to an R data frame, Excel worksheet, or database table
  - pandas.Series - a vector of data, a column in a DataFrame table, or a 1D array
  - numpy.array - an N-dimensional array of items of the same type, such as a matrix
- Model Orchestrator
  - ORCA is used for running the overall model system and for defining dynamic data tables, columns, and injectables (functions). ActivitySim wraps ORCA functionality to make a Data Pipeline tool, which allows for re-starting at any model step.
- Expressions
  - Model expressions are in CSV files and contain Python expressions, mainly pandas/numpy expressions that operate on the input data tables. This helps avoid modifying Python code when making changes to the model calculations. (See the sketches after this list.)
- Code Documentation
  - Python code follows the pycodestyle style guide
  - Documentation is written in reStructuredText markup, built with Sphinx, with docstrings written in numpydoc
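To make the expression mechanism concrete, here is a minimal sketch of evaluating a spec of Python expressions against a pandas table. It is illustrative only, not the actual ActivitySim assign_variables implementation; the table and target names are invented::

    import pandas as pd

    # Illustrative trips table
    df = pd.DataFrame({
        'ivt_base': [20.0, 35.0, 12.0],   # in-vehicle time, base scenario
        'ivt_build': [18.0, 30.0, 12.0],  # in-vehicle time, build scenario
    })

    # Illustrative spec: target names and Python/pandas expressions,
    # as they might appear in a Description,Target,Expression CSV
    spec = pd.DataFrame({
        'target': ['ivt_saved', 'ivt_saved_hours'],
        'expression': ['df.ivt_base - df.ivt_build', 'ivt_saved / 60.0'],
    })

    # Evaluate each expression in order; earlier targets are visible to
    # later expressions, so a spec can build up intermediate results
    locals_d = {'df': df}
    for target, expression in zip(spec.target, spec.expression):
        locals_d[target] = eval(expression, {}, locals_d)

    print(locals_d['ivt_saved_hours'])

And a minimal sketch of the ORCA orchestration style (again illustrative; the injectable and step names are invented)::

    import orca

    @orca.injectable()
    def settings():
        # injectables are functions whose results are injected by name
        return {'discount_rate': 0.03}

    @orca.step()
    def example_step(settings):
        # ORCA injects the settings injectable into this model step
        print(settings['discount_rate'])

    orca.run(['example_step'])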
Common¶
Software components common to both ABM and four-step model usage.
bca4abm¶
bca4abm.bca4abm.calc_rows_per_chunk(chunk_size, df, spec, extra_columns=0, trace_label=None)¶
Simple rows_per_chunk calculator for chunking calls to assign_variables. ActivitySim's chunk.rows_per_chunk method handles the main logic, including a missing/zero chunk size.
Parameters
- chunk_size : int
- df : pandas.DataFrame
- spec : pandas.DataFrame
- extra_columns : int, optional
- trace_label : str, optional
Returns
- num_rows : int
- effective_chunk_size : int
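A hypothetical call, assuming the two return values documented above; the DataFrames here are placeholders::

    import pandas as pd
    from bca4abm.bca4abm import calc_rows_per_chunk

    # Placeholder data table and spec
    df = pd.DataFrame({'ivt_base': [20.0, 35.0], 'ivt_build': [18.0, 30.0]})
    spec = pd.DataFrame({'target': ['ivt_saved'],
                         'expression': ['df.ivt_base - df.ivt_build']})

    # Size chunks for a subsequent assign_variables call
    num_rows, effective_chunk_size = calc_rows_per_chunk(
        chunk_size=10000, df=df, spec=spec, trace_label='example')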
bca4abm.bca4abm.eval_and_sum(assignment_expressions, df, locals_dict, group_by_column_names=None, df_alias=None, chunk_size=0, trace_rows=None)¶
Evaluate assignment_expressions against df and sum the results, summing by group if a list of group_by_column_names is specified (e.g. group by community-of-concern (coc) column names and return sums grouped by community of concern).
Parameters
- assignment_expressions
- df
- locals_dict
- group_by_column_names : array of str
  List of names of the columns to group by (e.g. coc_column_names of trip_coc_end)
- df_alias : str
  assign_variables df_alias (name of df in assignment_expressions)
- chunk_size : int
- trace_rows : array of bool
  Array indicating which rows in df are to be traced
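A hypothetical call with invented table and column names, summing an evaluated expression by a community-of-concern flag::

    import pandas as pd
    from bca4abm.bca4abm import eval_and_sum

    trips = pd.DataFrame({
        'ivt_base': [20.0, 35.0, 12.0],
        'ivt_build': [18.0, 30.0, 12.0],
        'coc_poverty': [1, 0, 1],  # illustrative community-of-concern flag
    })
    spec = pd.DataFrame({'target': ['ivt_saved'],
                         'expression': ['trips.ivt_base - trips.ivt_build']})

    # Sum the evaluated expression, grouped by the coc column
    sums = eval_and_sum(
        assignment_expressions=spec,
        df=trips,
        locals_dict={},
        group_by_column_names=['coc_poverty'],
        df_alias='trips')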
bca4abm.bca4abm.read_assignment_spec(fname)¶
Read a CSV model specification into a pandas DataFrame or Series.
The CSV is expected to have columns for component descriptions, targets, and expressions, and is required to have a header with column names. For example:
Description,Target,Expression,Silos
Parameters
- fname : str
  Name of a CSV spec file.
Returns
- spec : pandas.DataFrame
  The description column is dropped from the returned data and the expression values are set as the table index.
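For example, loading an illustrative spec file (the file name and contents are invented)::

    from bca4abm.bca4abm import read_assignment_spec

    # contents of example_spec.csv (illustrative):
    #   Description,Target,Expression
    #   in-vehicle time savings,ivt_saved,df.ivt_base - df.ivt_build
    #   convert to hours,ivt_saved_hours,ivt_saved / 60.0

    spec = read_assignment_spec('example_spec.csv')
    # spec is indexed by the expression values,
    # with the description column dropped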
bca4abm.bca4abm.scalar_assign_variables(assignment_expressions, locals_dict)¶
Evaluate a set of variable expressions from a spec in the context of a given data table.
Python expressions are evaluated using Python's eval function. Users should take care that each expression must result in a scalar.
Parameters
- assignment_expressions : pandas sequence of str
- locals_dict : dict
  Dictionary of local variables providing the environment for evaluating any expression that begins with @
Returns
- variables : pandas.DataFrame
  Will have the index of df and columns of exprs.
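A hypothetical usage, assuming the expressions come from a spec file read with read_assignment_spec; the file name and contents are invented::

    from bca4abm.bca4abm import read_assignment_spec, scalar_assign_variables

    # contents of scalar_spec.csv (illustrative):
    #   Description,Target,Expression
    #   annual discount rate,discount_rate,0.03
    #   value of time from settings,value_of_time,@settings.get('value_of_time')

    spec = read_assignment_spec('scalar_spec.csv')
    variables = scalar_assign_variables(
        assignment_expressions=spec,
        locals_dict={'settings': {'value_of_time': 15.0}})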
aggregate_trips¶
bca4abm.processors.aggregate_trips.aggregate_trips_processor(aggregate_trips_spec, settings, data_dir)¶
Compute aggregate trips benefits.
The data manifest contains a list of trip count files (one for base, one for build) along with their corresponding in-vehicle-time (ivt), operating cost (aoc), and toll skims.
Since the skims are all aligned numpy arrays, we can express their benefit calculation as vector computations in the aggregate_trips_spec, as the sketch below illustrates.
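For illustration only, a benefit expression over aligned base/build arrays often takes the rule-of-half form; this is a generic sketch with invented values, not a row from the bca4abm example spec::

    import numpy as np

    # Illustrative aligned arrays (one value per O-D pair, flattened)
    base_trips = np.array([100.0, 250.0])
    build_trips = np.array([110.0, 240.0])
    base_ivt = np.array([22.0, 31.0])    # in-vehicle minutes
    build_ivt = np.array([20.0, 31.0])

    # Rule-of-half consumer-surplus benefit for the in-vehicle-time change:
    # the average of base and build trips, times the change in cost
    ivt_benefit_minutes = 0.5 * (base_trips + build_trips) * (base_ivt - build_ivt)
    print(ivt_benefit_minutes.sum())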
bca4abm.processors.aggregate_trips.logger = <Logger bca4abm.processors.aggregate_trips (WARNING)>¶
Aggregate trips processor
ABM Processors¶
Software components for ABM model usage.
abm_results¶
auto_ownership¶
bca4abm.processors.abm.auto_ownership.auto_ownership_processor(persons_merged, auto_ownership_spec, auto_ownership_settings, coc_column_names, chunk_size, trace_hh_id)¶
Compute auto ownership benefits.
bca4abm.processors.abm.auto_ownership.logger = <Logger bca4abm.processors.abm.auto_ownership (WARNING)>¶
Auto ownership processor
demographics¶
bca4abm.processors.abm.demographics.logger = <Logger bca4abm.processors.abm.demographics (WARNING)>¶
Demographics processor
person_trips¶
bca4abm.processors.abm.person_trips.logger = <Logger bca4abm.processors.abm.person_trips (WARNING)>¶
Person trips processor
bca4abm.processors.abm.person_trips.person_trips_processor(trips_with_demographics, person_trips_spec, person_trips_settings, coc_column_names, settings, chunk_size, trace_hh_id)¶
Compute disaggregate trips benefits.
physical_activity¶
bca4abm.processors.abm.physical_activity.logger = <Logger bca4abm.processors.abm.physical_activity (WARNING)>¶
Physical activity processor
bca4abm.processors.abm.physical_activity.physical_activity_processor(trips_with_demographics, persons_merged, physical_activity_trip_spec, physical_activity_person_spec, physical_activity_settings, coc_column_names, settings, chunk_size, trace_hh_id)¶
Compute physical activity benefits.
Physical activity benefits generally accrue if the net physical activity for an individual exceeds a certain threshold. We calculate individual physical activity based on trips, so we need to compute trip activity and then sum up to the person level to calculate benefits. We chunk trips by household id to ensure that all of a person's trips are in the same chunk. The sketch below illustrates the person-level aggregation.
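A minimal sketch of the person-level aggregation and threshold test, with invented column names and an invented 30-minute threshold::

    import pandas as pd

    # Illustrative trip-level activity minutes (e.g. walk/bike trip legs)
    trips = pd.DataFrame({
        'person_id': [1, 1, 2, 2, 2],
        'activity_minutes': [10.0, 12.0, 5.0, 4.0, 6.0],
    })

    # Sum trip activity to the person level, then apply a daily threshold
    person_activity = trips.groupby('person_id')['activity_minutes'].sum()
    meets_threshold = person_activity >= 30.0
    print(meets_threshold)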
Four Step Processors¶
Software components for four-step model usage.
aggregate_demographics¶
bca4abm.processors.four_step.aggregate_demographics.aggregate_demographics_processor(zone_hhs, aggregate_demographics_spec, settings, trace_od)¶
Parameters
- zone_hhs : orca table
  Input zone demographics
bca4abm.processors.four_step.aggregate_demographics.logger = <Logger bca4abm.processors.four_step.aggregate_demographics (WARNING)>¶
Aggregate demographics processor
Each row in the data table to solve is an origin zone, and this processor calculates communities of concern (COC) / market segments based on mf.cval.csv.
aggregate_zone¶
bca4abm.processors.four_step.aggregate_zone.aggregate_zone_processor(zones, trace_od)¶
Parameters
- zones : orca table
  Zone data for the base and build scenario dat files, combined into a single dataframe with column names prefixed with base_ or build_ and indexed by ZONE. (A sketch of this layout follows.)
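A sketch of how such a combined table can be built from two zone tables, assuming both are indexed by ZONE; the column name is invented::

    import pandas as pd

    # Illustrative base and build zone tables, both indexed by ZONE
    base = pd.DataFrame({'autos': [1.2, 1.5]},
                        index=pd.Index([1, 2], name='ZONE'))
    build = pd.DataFrame({'autos': [1.3, 1.4]},
                         index=pd.Index([1, 2], name='ZONE'))

    # Prefix the columns and combine side by side
    zones = pd.concat([base.add_prefix('base_'),
                       build.add_prefix('build_')], axis=1)
    print(zones.columns.tolist())  # ['base_autos', 'build_autos']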
bca4abm.processors.four_step.aggregate_zone.logger = <Logger bca4abm.processors.four_step.aggregate_zone (WARNING)>¶
Aggregate zone processor
Each row in the data table to solve is an origin zone, and this processor calculates zonal auto ownership differences as well as the differences in the destination choice logsums (ma.<purpose|income>dcls.csv). Perhaps the ma.<purpose|income>dcls.csv files should be added to the mf.cval.csv before input to the bca tool?
aggregate_od¶
class bca4abm.processors.four_step.aggregate_od.ODSkims(omx_file_path, name, zone_index, transpose=False, cache_skims=True)¶
Wrapper for skim arrays to facilitate use of skims by the aggregate_od_processor.
Parameters
- skims_dict : dict
  Empty dict to cache skims read from file
- omx : open omx file object
  Only used to load skims on demand that were not preloaded
- length : int
  Number of zones to return in the skim matrix, in case the skims contain additional external zones that should be trimmed out so the skim array is the correct shape to match the (flattened) O-D tiled columns in the od dataframe
- transpose : bool
  Whether to transpose the matrix before flattening (i.e. act as a D-O instead of an O-D skim). The sketch below shows the trim/flatten alignment.
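To show how the trimming and flattening align a skim with the od dataframe's rows, here is a sketch with invented sizes::

    import numpy as np

    n_zones = 3                            # internal zones to keep
    skim = np.arange(16.0).reshape(4, 4)   # 4x4 skim with one external zone

    # Trim to internal zones and flatten in row-major (origin-major) order,
    # so element k corresponds to origin k // n_zones, destination k % n_zones
    od_values = skim[:n_zones, :n_zones].flatten()

    # A D-O skim is obtained by transposing before flattening
    do_values = skim[:n_zones, :n_zones].T.flatten()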
bca4abm.processors.four_step.aggregate_od.create_zone_matrices(model_settings, zones)¶
Create ODSkims look-alikes that have identical values for all zone origins/dests; i.e. we either repeat (origin_zone_matrices) or tile (dest_zone_matrices) zone values to expand zone columns into ODSkims-style flattened arrays, as sketched below.
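The repeat/tile distinction, sketched for a three-zone example with invented values::

    import numpy as np

    zone_values = np.array([10.0, 20.0, 30.0])
    n = len(zone_values)

    # Origin-style expansion: each origin's value repeats across all destinations
    origin_expanded = np.repeat(zone_values, n)  # [10 10 10 20 20 20 30 30 30]

    # Destination-style expansion: the whole vector tiles once per origin
    dest_expanded = np.tile(zone_values, n)      # [10 20 30 10 20 30 10 20 30]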
bca4abm.processors.four_step.aggregate_od.logger = <Logger bca4abm.processors.four_step.aggregate_od (WARNING)>¶
Aggregate OD processor
Each row in the data table to solve is an OD pair, and this processor calculates trip differences. It requires access to the input zone tables, the COC coding, trip matrices, and skim matrices. The new OD_aggregate_manifest.csv file tells this processor what data it can use and how to reference it. The following input data tables are required: assign_mfs.omx, the inputs and results of the zone aggregate processor, and skims_mfs.omx.
Contribution Guidelines¶
bca4abm development follows the same development guidelines as ActivitySim.
Release Notes¶
v0.4 - first release
v0.5 - add Python 3.5+ support
v0.6 - update 4step example