Convenience function that splits a column at a delimiter, unnests the result (one row per value after splitting), and trims surrounding whitespace. By default, the column is split at parentheses. Returns a tibble (tbl_df).
Arguments
- df
A data.frame or tibble
- ab
(character(1), default "Antigen") Name of the column to split.
- split
(character(1), default "[\(\)]") A regular expression for use with strsplit. The default expression splits at "(" or ")".
- new_col
(character(1), default NA) Name of the column to add to df. If NA, column ab is modified.
- exclude
(character(1), default NA) A regular expression. Values of ab that match it are not split.
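To make the interplay of the arguments concrete, the following is a minimal base-R sketch of the split, unnest, and trim steps described above. It is not the package implementation: `split_unnest_sketch` is a hypothetical helper, it returns a plain data.frame rather than a tibble, and it assumes that values matching exclude are kept as a single unsplit row and that empty pieces are dropped.

```r
# Hypothetical helper, NOT the package's splitUnnest(): a base-R sketch of
# the split -> unnest -> trim steps under the assumptions stated above.
split_unnest_sketch <- function(df, ab = "Antigen", split = "[\\(\\)]",
                                new_col = NA, exclude = NA) {
  values <- as.character(df[[ab]])

  # Split each value at the delimiter; one list element per input row
  parts <- strsplit(values, split)

  # Assumption: values matching `exclude` are kept whole
  if (!is.na(exclude)) {
    keep <- grepl(exclude, values)
    parts[keep] <- as.list(values[keep])
  }

  # Trim surrounding whitespace and drop empty pieces
  parts <- lapply(parts, function(p) {
    p <- trimws(p)
    p[nzchar(p)]
  })

  # "Unnest": repeat each input row once per piece
  out <- df[rep(seq_len(nrow(df)), lengths(parts)), , drop = FALSE]
  rownames(out) <- NULL

  # Write to `new_col`, or overwrite `ab` when `new_col` is NA
  target <- if (is.na(new_col)) ab else new_col
  out[[target]] <- unlist(parts)
  out
}

df <- data.frame(Antigen = c("CD279 (PD-1)", "Mac-2 (Galectin-3)"))

# All rows split at parentheses
res_all <- split_unnest_sketch(df, "Antigen", new_col = "Split")

# Rows starting with "Mac" are left unsplit
res_excl <- split_unnest_sketch(df, "Antigen", new_col = "Split",
                                exclude = "^Mac")
```

In this sketch, `res_all` has four rows and `res_excl` has three, because "Mac-2 (Galectin-3)" is kept whole. Whether splitUnnest treats exclude in exactly this way should be checked against the package itself.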
Examples
df <- data.frame(Antigen = c("CD279 (PD-1)", "Mac-2 (Galectin-3)"))
splitUnnest(df, "Antigen", new_col = "Split")
#> # A tibble: 4 × 2
#>   Antigen            Split
#>   <chr>              <chr>
#> 1 CD279 (PD-1)       CD279
#> 2 CD279 (PD-1)       PD-1
#> 3 Mac-2 (Galectin-3) Mac-2
#> 4 Mac-2 (Galectin-3) Galectin-3