Software Implementation¶
This page describes the bca4abm software implementation and how to contribute.
The implementation builds on the ActivitySim framework. The framework, briefly described below, includes features for data pipeline management, expression handling, and testing, among others. Additional core components for benefits calculation are built on top of it.
ActivitySim Framework¶
bca4abm is implemented in the ActivitySim framework. As summarized here, this means:
- Overall Design 
- Data Handling
  - Inputs are in CSV format, with the exception of settings
  - CSVs are read in as pandas tables and stored in an intermediate HDF5 binary file that is used for data I/O throughout the model run
  - Key outputs are written to CSV files
 
- Key Data Structures
  - pandas.DataFrame - a data table with rows and columns, similar to an R data frame, Excel worksheet, or database table
  - pandas.Series - a vector of data, such as a column in a DataFrame table or a 1D array
  - numpy.array - an N-dimensional array of items of the same type, such as a matrix
 
- Model Orchestrator - ORCA is used for running the overall model system and for defining dynamic data tables, columns, and injectables (functions). ActivitySim wraps ORCA functionality to make a Data Pipeline tool, which allows for re-starting at any model step. 
 
- Expressions - Model expressions are in CSV files and contain Python expressions, mainly pandas/numpy expressions that operate on the input data tables. This helps to avoid modifying Python code when making changes to the model calculations.
 
- Python code follows the pycodestyle style guide
- Documentation is written in reStructuredText markup and built with Sphinx; docstrings follow the numpydoc format
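To make the expression-file idea above concrete, the following is a minimal sketch of evaluating CSV-stored expressions against a pandas table. It is not ActivitySim's evaluator: the spec contents, the `trips` table, and the convention of exposing the table as `df` are all assumptions made for illustration.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical expression file: each row names a target column and the
# Python expression that computes it.
SPEC_CSV = """Description,Target,Expression
travel time savings,time_saved,df.base_time - df.build_time
log of distance,log_dist,np.log(df.distance)
"""

# Hypothetical input table.
trips = pd.DataFrame({
    "base_time": [30.0, 45.0],
    "build_time": [25.0, 50.0],
    "distance": [1.0, np.e],
})

spec = pd.read_csv(io.StringIO(SPEC_CSV))

results = pd.DataFrame(index=trips.index)
for target, expression in zip(spec["Target"], spec["Expression"]):
    # Evaluate each expression with only the names it is allowed to see.
    results[target] = eval(expression, {"np": np, "pd": pd}, {"df": trips})
```

Changing a calculation in this arrangement means editing a row of the CSV rather than the Python code, which is the benefit the Expressions item describes.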
 
Common¶
Software components common to both ABM and four-step model usage.
bca4abm¶
- 
bca4abm.bca4abm.calc_rows_per_chunk(chunk_size, df, spec, extra_columns=0, trace_label=None)¶
- Simple rows_per_chunk calculator for chunking calls to assign_variables
- ActivitySim’s chunk.rows_per_chunk method handles the main logic, including a missing/zero chunk size
- Parameters
- chunk_size : int
- df : pandas DataFrame
- spec : pandas DataFrame
- extra_columns : int, optional
- trace_label : str, optional

- Returns
- num_rows : int
- effective_chunk_size : int
 
 
- 
bca4abm.bca4abm.eval_and_sum(assignment_expressions, df, locals_dict, group_by_column_names=None, df_alias=None, chunk_size=0, trace_rows=None)¶
- Evaluate assignment_expressions against df and sum the results. If a list of group_by_column_names is specified, sum by group (e.g. group by the coc column names to return sums grouped by community of concern).
- Parameters
- assignment_expressions
- df
- locals_dict
- group_by_column_names : array of str
  - list of names of the columns to group by (e.g. coc_column_names of trip_coc_end)
- df_alias : str
  - assign_variables df_alias (name of df in assignment_expressions)
- chunk_size : int
- trace_rows : array of bool
  - array indicating which rows in df are to be traced
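The evaluate-then-sum pattern described for eval_and_sum can be sketched as follows. This is a simplified stand-in, not the bca4abm implementation: the function name `eval_and_sum_sketch`, the dict-based expression format, and the sample columns are hypothetical, and chunking and tracing are omitted.

```python
import pandas as pd


def eval_and_sum_sketch(expressions, df, group_by_column_names=None):
    """Hypothetical stand-in: evaluate {target: expression} against df, then sum."""
    values = pd.DataFrame(
        {target: eval(expr, {}, {"df": df}) for target, expr in expressions.items()},
        index=df.index,
    )
    if group_by_column_names:
        # Group the evaluated values by the requested columns of df.
        keys = [df[name] for name in group_by_column_names]
        return values.groupby(keys).sum()
    return values.sum()


# Hypothetical trips table with a community-of-concern flag (1 = in the COC).
trips = pd.DataFrame({
    "coc_low_income": [1, 1, 0],
    "base_cost": [4.0, 6.0, 10.0],
    "build_cost": [3.0, 4.0, 9.0],
})
expressions = {"cost_savings": "df.base_cost - df.build_cost"}

total = eval_and_sum_sketch(expressions, trips)
by_coc = eval_and_sum_sketch(expressions, trips, ["coc_low_income"])
```

Without grouping, the result is one sum per target; with grouping, it is a table with one row per distinct combination of the grouping columns.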
 
 
- 
bca4abm.bca4abm.read_assignment_spec(fname)¶
- Read a CSV model specification into a pandas DataFrame or Series.
- The CSV is expected to have columns for component descriptions, targets, and expressions.
- The CSV is required to have a header with column names. For example:
- Description,Target,Expression,Silos
- Parameters
- fname : str
  - Name of a CSV spec file.

- Returns
- spec : pandas.DataFrame
  - The description column is dropped from the returned data and the expression values are set as the table index.
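As an illustration of the behavior described for read_assignment_spec, here is a minimal sketch that reads a spec with the example header, drops the description column, and indexes the table by expression. The reader function `read_spec_sketch` and the spec rows are hypothetical; the real function may handle column names and options differently.

```python
import io

import pandas as pd

# Hypothetical spec contents using the example header.
SPEC_CSV = """Description,Target,Expression,Silos
value of time saved,time_benefit,df.time_saved * VOT,all
toll savings,toll_benefit,df.base_toll - df.build_toll,all
"""


def read_spec_sketch(source):
    """Hypothetical reader: drop the description, index by expression."""
    spec = pd.read_csv(source)
    return spec.drop(columns="Description").set_index("Expression")


spec = read_spec_sketch(io.StringIO(SPEC_CSV))
```

In practice `source` would be the path of a CSV spec file; an in-memory buffer is used here only to keep the example self-contained.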
 
 
- 
bca4abm.bca4abm.scalar_assign_variables(assignment_expressions, locals_dict)¶
- Evaluate a set of variable expressions from a spec in the context of a given data table.
- Python expressions are evaluated in the context of this function using Python’s eval function. Users should take care that these expressions result in a scalar.
- Parameters
- assignment_expressions : pandas sequence of str
- locals_dict : Dict
  - This is a dictionary of local variables that will be the environment for an evaluation of an expression that begins with @

- Returns
- variables : pandas.DataFrame
  - Will have the index of df and columns of exprs.
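A rough sketch of scalar expression evaluation with Python's eval is shown below. It is an assumption-laden illustration rather than the bca4abm code: the function `scalar_assign_sketch`, the (target, expression) pair format, the scalar check, and the rule that later expressions can see earlier targets are all hypothetical.

```python
def scalar_assign_sketch(assignment_expressions, locals_dict):
    """Hypothetical stand-in: evaluate (target, expression) pairs in order."""
    local_values = dict(locals_dict)  # copy so the caller's dict is untouched
    for target, expression in assignment_expressions:
        # A leading @ marks a plain Python expression; strip it before eval.
        value = eval(expression.lstrip("@"), {}, local_values)
        if hasattr(value, "__len__") and not isinstance(value, str):
            raise ValueError(f"{target} did not evaluate to a scalar")
        # Assumption: make the result visible to later expressions.
        local_values[target] = value
    return {target: local_values[target] for target, _ in assignment_expressions}


constants = {"DISCOUNT_RATE": 0.03, "YEARS": 2}
expressions = [
    ("growth", "@(1 + DISCOUNT_RATE) ** YEARS"),
    ("discount_factor", "@1 / growth"),
]
scalars = scalar_assign_sketch(expressions, constants)
```

Because eval executes arbitrary Python, expression files of this kind should only come from trusted sources.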
 
 
aggregate_trips¶
- 
bca4abm.processors.aggregate_trips.aggregate_trips_processor(aggregate_trips_spec, settings, data_dir)¶
- Compute aggregate trips benefits
- The data manifest contains a list of trip count files (one for base, one for build) along with their corresponding in-vehicle-time (ivt), operating cost (aoc), and toll skims.
- Since the skims are all aligned numpy arrays, we can express their benefit calculation as vector computations in the aggregate_trips_spec.
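Because the trip and skim matrices share the same shape, a benefit measure can be written as element-wise array arithmetic. The sketch below uses a rule-of-half travel time calculation purely as an illustrative formula; the matrices and the formula are assumptions, not the contents of any bca4abm spec file.

```python
import numpy as np

# Hypothetical 2x2 zone-to-zone matrices; rows are origins, columns destinations.
base_trips = np.array([[10.0, 20.0], [30.0, 40.0]])
build_trips = np.array([[12.0, 18.0], [30.0, 44.0]])
base_ivt = np.array([[5.0, 15.0], [15.0, 5.0]])   # in-vehicle time, minutes
build_ivt = np.array([[5.0, 12.0], [14.0, 5.0]])

# Illustrative rule-of-half formula: average trips times the time saved,
# computed for every origin-destination cell at once.
benefit = 0.5 * (base_trips + build_trips) * (base_ivt - build_ivt)

total_minutes_saved = benefit.sum()
```

No explicit loop over zone pairs is needed; numpy applies the arithmetic to all cells of the aligned arrays.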
- 
bca4abm.processors.aggregate_trips.logger= <Logger bca4abm.processors.aggregate_trips (WARNING)>¶
- Aggregate trips processor 
ABM Processors¶
Software components for ABM model usage.
abm_results¶
auto_ownership¶
- 
bca4abm.processors.abm.auto_ownership.auto_ownership_processor(persons_merged, auto_ownership_spec, auto_ownership_settings, coc_column_names, chunk_size, trace_hh_id)¶
- Compute auto ownership benefits 
- 
bca4abm.processors.abm.auto_ownership.logger= <Logger bca4abm.processors.abm.auto_ownership (WARNING)>¶
- auto ownership processor 
demographics¶
- 
bca4abm.processors.abm.demographics.logger= <Logger bca4abm.processors.abm.demographics (WARNING)>¶
- Demographics processor 
person_trips¶
- 
bca4abm.processors.abm.person_trips.logger= <Logger bca4abm.processors.abm.person_trips (WARNING)>¶
- Person trips processor 
- 
bca4abm.processors.abm.person_trips.person_trips_processor(trips_with_demographics, person_trips_spec, person_trips_settings, coc_column_names, settings, chunk_size, trace_hh_id)¶
- Compute disaggregate trips benefits 
physical_activity¶
- 
bca4abm.processors.abm.physical_activity.logger= <Logger bca4abm.processors.abm.physical_activity (WARNING)>¶
- physical activity processor 
- 
bca4abm.processors.abm.physical_activity.physical_activity_processor(trips_with_demographics, persons_merged, physical_activity_trip_spec, physical_activity_person_spec, physical_activity_settings, coc_column_names, settings, chunk_size, trace_hh_id)¶
- Compute physical activity benefits
- Physical activity benefits generally accrue if the net physical activity for an individual exceeds a certain threshold. We calculate individual physical activity based on trips, so we need to compute trip activity and then sum up to the person level to calculate benefits. We chunk trips by household id to ensure that all of a person's trips are in the same chunk.
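The reason for chunking by household can be seen in a small sketch: if a person's trips were split across chunks, their per-chunk totals could each fall below the threshold even when the combined total exceeds it. The table, column names, threshold value, and the helper `household_chunks` are hypothetical.

```python
import pandas as pd

# Hypothetical trip table: one row per trip with minutes of walking or biking.
trips = pd.DataFrame({
    "hh_id": [1, 1, 1, 2, 2, 3],
    "person_id": [10, 10, 11, 20, 20, 30],
    "active_minutes": [20.0, 15.0, 5.0, 10.0, 10.0, 40.0],
})
THRESHOLD_MINUTES = 30.0  # hypothetical threshold


def household_chunks(df, households_per_chunk):
    """Yield slices of df that never split a household across chunks."""
    hh_ids = df["hh_id"].drop_duplicates().tolist()
    for start in range(0, len(hh_ids), households_per_chunk):
        chunk_ids = hh_ids[start:start + households_per_chunk]
        yield df[df["hh_id"].isin(chunk_ids)]


# Sum trip activity to the person level within each chunk, then combine.
person_totals = pd.concat([
    chunk.groupby("person_id")["active_minutes"].sum()
    for chunk in household_chunks(trips, households_per_chunk=2)
])
meets_threshold = person_totals > THRESHOLD_MINUTES
```

Since every person belongs to exactly one household, keeping households whole guarantees that each person appears in exactly one chunk.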
Four Step Processors¶
Software components for four-step model usage.
aggregate_demographics¶
- 
bca4abm.processors.four_step.aggregate_demographics.aggregate_demographics_processor(zone_hhs, aggregate_demographics_spec, settings, trace_od)¶
- Parameters
- zone_hhs : orca table
  - input zone demographics
 
 
- 
bca4abm.processors.four_step.aggregate_demographics.logger= <Logger bca4abm.processors.four_step.aggregate_demographics (WARNING)>¶
- Aggregate demographics processor - each row in the data table to solve is an origin zone and this processor calculates communities of concern (COC) / market segments based on mf.cval.csv 
aggregate_zone¶
- 
bca4abm.processors.four_step.aggregate_zone.aggregate_zone_processor(zones, trace_od)¶
- zones : orca table
  - zone data for base and build scenario dat files combined into a single dataframe, with column names prefixed with base_ or build_, indexed by ZONE
- 
bca4abm.processors.four_step.aggregate_zone.logger= <Logger bca4abm.processors.four_step.aggregate_zone (WARNING)>¶
- Aggregate zone processor - each row in the data table to solve is an origin zone, and this processor calculates zonal auto ownership differences as well as the differences in the destination choice logsums (ma.<purpose|income>dcls.csv). Open question: should the ma.<purpose|income>dcls.csv files be added to mf.cval.csv before input to the bca tool?
aggregate_od¶
- 
class bca4abm.processors.four_step.aggregate_od.ODSkims(omx_file_path, name, zone_index, transpose=False, cache_skims=True)¶
- Wrapper for skim arrays to facilitate use of skims by aggregate_od_processor
- Parameters
- skims_dict : empty dict to cache skims read from file
- omx : open omx file object
  - this is only used to load skims on demand that were not preloaded
- length : int
  - number of zones in skim to return in skim matrix, in case the skims contain additional external zones that should be trimmed out so that the skim array is the correct shape to match the (flattened) O-D tiled columns in the od dataframe
- transpose : bool
  - whether to transpose the matrix before flattening (i.e. act as a D-O instead of O-D skim)
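The trimming, transposing, and flattening described for the skim wrapper can be sketched with plain numpy. The function `flatten_skim` and the sample matrix are hypothetical and only illustrate the array operations, not the ODSkims class itself.

```python
import numpy as np

# Hypothetical 3x3 skim where the last row and column are an external zone.
skim = np.array([
    [1.0, 2.0, 9.0],
    [3.0, 4.0, 9.0],
    [9.0, 9.0, 9.0],
])


def flatten_skim(matrix, length, transpose=False):
    """Trim to length x length, optionally transpose, then flatten row-major."""
    trimmed = matrix[:length, :length]
    if transpose:
        trimmed = trimmed.T
    return trimmed.flatten()


od_values = flatten_skim(skim, length=2)                  # O-D order
do_values = flatten_skim(skim, length=2, transpose=True)  # D-O order
```

After flattening, each element lines up with one row of an O-D table whose rows are ordered by origin and then by destination.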
 
 
- 
bca4abm.processors.four_step.aggregate_od.create_zone_matrices(model_settings, zones)¶
- ODSkims look-alikes that have identical values for all zone origins/dests - i.e. we either repeat (origin_zone_matrices) or tile (dest_zone_matrices) zone values to expand zones columns into ODSkims-style flattened arrays 
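The difference between repeating and tiling zone values can be shown with numpy's `repeat` and `tile` functions. The zone values are hypothetical; the sketch only illustrates how a per-zone column expands into a flattened O-D array.

```python
import numpy as np

zone_values = np.array([10.0, 20.0, 30.0])  # hypothetical value per zone
zone_count = len(zone_values)

# Repeat: each origin's value is held constant across all of its destinations.
origin_values = np.repeat(zone_values, zone_count)

# Tile: the full list of destination values is cycled once per origin.
dest_values = np.tile(zone_values, zone_count)
```

Reshaping either result to (zone_count, zone_count) gives a matrix that is constant along rows (origin values) or along columns (destination values).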
- 
bca4abm.processors.four_step.aggregate_od.logger= <Logger bca4abm.processors.four_step.aggregate_od (WARNING)>¶
- Aggregate OD processor - each row in the data table to solve is an OD pair, and this processor calculates trip differences. It requires access to input zone tables, the COC coding, trip matrices, and skim matrices. The new OD_aggregate_manifest.csv file tells this processor what data it can use and how to reference it. The following input data tables are required: assign_mfs.omx, inputs and results of the zone aggregate processor, and skims_mfs.omx.
Contribution Guidelines¶
bca4abm development follows the same development guidelines as ActivitySim.
Release Notes¶
v0.4 - first release
v0.5 - add Python 3.5+ support
v0.6 - update 4step example