
Remove outliers from a numeric variable

Usage

hts_remove_outliers(var_dt, numvar = NULL, threshold = 0.975)

Arguments

var_dt

A data.table containing the numeric variable to remove outliers from.

numvar

Name of the numeric variable to remove outliers from, given as a character string (for example, "speed_mph"). Default is NULL.

threshold

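Threshold that defines an outlier. Default is 0.975. In the example below, this threshold removes 378 of 15,100 rows (about 2.5%), which is consistent with an upper-quantile cutoff.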
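
The following is a minimal sketch of how a threshold of this kind could be applied, not the function's actual implementation. It assumes that threshold is an upper quantile of the variable and that rows at or above the resulting cutoff are dropped; this page does not state the exact rule. The names toy_dt, cutoff, outliers, and kept are hypothetical.

```r
library(data.table)

# Hypothetical toy data: 99 ordinary speeds plus one implausible value
toy_dt <- data.table(trip_id = 1:100, speed_mph = c(1:99, 5000))

# Assumption: the threshold is an upper quantile of the variable
threshold <- 0.975
cutoff <- quantile(toy_dt$speed_mph, probs = threshold, na.rm = TRUE)

# Assumption: rows at or above the cutoff are treated as outliers
outliers <- toy_dt[speed_mph >= cutoff]
kept <- toy_dt[speed_mph < cutoff]

nrow(outliers)  # rows that would be removed
nrow(kept)      # rows that would remain
```

Under these assumptions, lowering threshold removes a larger share of the highest values, and raising it toward 1 removes fewer.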

Value

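A list with two elements: outlier_description, a one-row summary of the removal (threshold, number of rows removed, and the minimum and maximum outlier values), and dt, the dataset with the outliers removed.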
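
Because the result is a list rather than a bare table, later steps need to pull out the piece they want. The sketch below illustrates that pattern with a hypothetical stand-in, sketch_remove_outliers(), which mimics the element names shown in the Examples output. It is not the package's code, and its quantile-based rule is an assumption.

```r
library(data.table)

# Hypothetical stand-in that mimics the documented return shape;
# it is NOT the implementation of hts_remove_outliers().
sketch_remove_outliers <- function(var_dt, numvar, threshold = 0.975) {
  values <- var_dt[[numvar]]
  # Assumed rule: values at or above the upper quantile are outliers
  cutoff <- quantile(values, probs = threshold, na.rm = TRUE)
  is_outlier <- !is.na(values) & values >= cutoff
  list(
    outlier_description = data.table(
      threshold   = threshold,
      num_removed = sum(is_outlier),
      min_outlier = min(values[is_outlier]),
      max_outlier = max(values[is_outlier])
    ),
    dt = var_dt[which(!is_outlier)]
  )
}

# Hypothetical toy data
toy_dt <- data.table(trip_id = 1:100, speed_mph = c(1:99, 5000))
result <- sketch_remove_outliers(toy_dt, numvar = "speed_mph")

result$outlier_description  # one-row summary of what was dropped
clean_dt <- result$dt       # filtered data to carry into later steps
```

Keeping the summary alongside the filtered data makes it easy to report how many rows were dropped and over what range.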

Examples


require(data.table)

hts_remove_outliers(var_dt = trip, numvar = "speed_mph")
#> Warning: 378 outliers were removed based on the threshold of 0.975.
#> $outlier_description
#>    threshold num_removed min_outlier max_outlier
#> 1:     0.975         378    112.9918    228233.1
#> 
#> $dt
#>        day_id trip_id  speed_mph distance_miles mode_type mode_1 mode_2
#>     1:      1    6848  0.3570582     0.07736261         8      6    995
#>     2:      1    6099  3.8030030     0.31691692         8     34    995
#>     3:      1   15759  9.2827577     0.16244826         1      1    995
#>     4:      1   13883 10.7289440    10.72894403        13      2     23
#>     5:      1    9240  1.3936891     0.47308002         2      2    995
#>    ---                                                                 
#> 14718:   4125    4505 16.3377147     3.23577517         8      6    995
#> 14719:   4125    7897 42.9297111    22.65734754         8      6    995
#> 14720:   4125     719  1.5648402     7.77203953         1      1    995
#> 14721:   4125   14260 10.5795319     1.76325532         8      7    995
#> 14722:   4125    4397  8.5320851     1.42201419         8     34    995
#>        num_travelers d_purpose_category hh_id person_id travel_date trip_weight
#>     1:             1                  7   642       820  2023-05-28         957
#>     2:             2                  7   642       820  2023-05-28         237
#>     3:             1                  9   642       820  2023-05-28         287
#>     4:             1                 11   642       820  2023-05-28         361
#>     5:             1                  1   642       820  2023-05-28         578
#>    ---                                                                         
#> 14718:             1                 12   876      1684  2023-05-30         999
#> 14719:             1                  2   876      1684  2023-05-30         167
#> 14720:             1                 12   876      1684  2023-05-30         954
#> 14721:             1                  2   876      1684  2023-05-30         841
#> 14722:             2                  7   876      1684  2023-05-30         977
#>