Filter by First Order Parameters

Usage

filterFoc(
  foc_param,
  tid_data,
  cw_selection,
  is_dep_fun = function(foc_param) stringr::str_starts(foc_param, "LEMMA"),
  max_steps_fun = function(foc_param) if (foc_param == "LEMMAPATH2") 2 else 3,
  window_filter_fun = windowFilter,
  pos_filter_fun = posFilter,
  bound_filter_fun = function(foc_param) stringr::str_starts(foc_param, "nobound")
)

Arguments

foc_param

Character string coding the relevant first-order parameters.

tid_data

Subset of a context-word-by-token dataframe, as returned by setupConcordancer, containing the rows for a single token.

cw_selection

Vector of context words selected by the model for that token.

is_dep_fun

Function that takes foc_param as input and returns TRUE if dependency information should be collected and FALSE if the model is based on bag-of-words instead.

max_steps_fun

Function that takes foc_param as input and returns, for dependency-based models, the maximum number of steps in the dependency path at which a word is still accepted as a context word.
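
As a rough illustration of how the two dependency-related defaults shown under Usage behave, the sketch below re-creates them outside the package. It uses base R startsWith() as a stand-in for stringr::str_starts() so that it runs without extra packages, and the strings "LEMMAPATH3" and "bound5all" are hypothetical parameter codes chosen only for the example.

```r
# Re-creation of the defaults from the Usage section.
# Assumption: startsWith() is used here in place of stringr::str_starts().
is_dep_fun <- function(foc_param) startsWith(foc_param, "LEMMA")
max_steps_fun <- function(foc_param) if (foc_param == "LEMMAPATH2") 2 else 3

# Hypothetical parameter codes, for illustration only.
foc_examples <- c("LEMMAPATH2", "LEMMAPATH3", "bound5all")

# Which codes would be treated as dependency-based models?
dep_flags <- vapply(foc_examples, is_dep_fun, logical(1))

# max_steps_fun compares with `==` inside `if`, so it expects a single
# string; vapply() calls it once per code. Only the dependency-based
# codes are passed, since the step limit is meaningful only for them.
max_steps <- vapply(foc_examples[dep_flags], max_steps_fun, numeric(1))

print(dep_flags)
print(max_steps)
```

Because the default max_steps_fun is not vectorised, a custom replacement should likewise expect one parameter string at a time.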

window_filter_fun

Function that takes foc_param as input and returns a vector or list with two elements: the left and right window sizes (for bag-of-words models).
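
A custom function with this contract might look like the sketch below. It is not the package's windowFilter; it assumes a hypothetical naming scheme in which the parameter string contains the window as "left-right" digits, as in "bound5-3all".

```r
# Hypothetical replacement for window_filter_fun.
# Assumption: foc_param encodes the window as "<left>-<right>", e.g. "bound5-3all".
my_window_filter <- function(foc_param) {
  # regexec() returns the full match followed by the two captured groups.
  m <- regmatches(foc_param, regexec("([0-9]+)-([0-9]+)", foc_param))[[1]]
  if (length(m) == 0) {
    stop("no window sizes found in: ", foc_param)
  }
  # Two elements, as the argument description requires: left and right size.
  c(left = as.integer(m[2]), right = as.integer(m[3]))
}

window <- my_window_filter("bound5-3all")
print(window)
```

Such a function could then be supplied as window_filter_fun = my_window_filter.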

pos_filter_fun

Function that takes foc_param as input and returns a vector of part-of-speech tags. If the vector is empty, no part-of-speech filter is applied; otherwise, only the rows whose pos value is included in that vector are selected.
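
The sketch below illustrates the empty-versus-non-empty behaviour described above. Everything in it is an assumption made for the example: the "lex" suffix convention, the tag names, the toy data frame with its cw column, and the apply_pos_filter helper, which only mimics the described selection and is not the package's internal code.

```r
# Hypothetical replacement for pos_filter_fun.
# Assumption: codes ending in "lex" restrict context words to content words.
my_pos_filter <- function(foc_param) {
  if (endsWith(foc_param, "lex")) c("noun", "verb", "adj", "adv") else character(0)
}

# Toy data with a `pos` column; the `cw` column and tag names are made up.
toy <- data.frame(
  cw = c("the", "dog", "barks"),
  pos = c("det", "noun", "verb")
)

# Illustrative helper: an empty vector means no filtering,
# otherwise keep only rows whose pos is in the vector.
apply_pos_filter <- function(df, keep) {
  if (length(keep) == 0) df else df[df$pos %in% keep, , drop = FALSE]
}

filtered <- apply_pos_filter(toy, my_pos_filter("bound5-5lex"))
unfiltered <- apply_pos_filter(toy, my_pos_filter("bound5-5all"))
print(filtered)
print(unfiltered)
```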

bound_filter_fun

Function that takes foc_param as input and returns TRUE if words outside the sentence are modelled and FALSE if they are not.

Value

Enriched dataframe including columns with filtering information.