semasioFlow.contextwords module
- semasioFlow.contextwords.listContextwords(type_name, tokenlist, fnames, settings, left_win=None, right_win=None)
Create a data frame with details on the context words of the given tokens.
It includes the elements that global_columns and line_machine extract from the corpus, along with each context word's distance (and side) relative to the target and whether the two occur in the same sentence.
- Parameters
type_name (str) – Name of the type
tokenlist (list of str) – List of token IDs
fnames (list of str) – List of file names to find the tokens in
settings (dict) – Settings as created for the full workflow
left_win (int, optional) – Number of context words to extract from the left side of the target, counting sentence delimiters. If None, the value from settings is used.
right_win (int, optional) – Number of context words to extract from the right side of the target, counting sentence delimiters. If None, the value from settings is used.
- Returns
Data frame with one row per context word per token, containing the information extracted from the corpus and the information relative to the target.
- Return type
pandas.DataFrame
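
The target-relative columns described above (distance, side, same sentence) can be illustrated with a minimal, library-independent sketch. This is not the implementation of listContextwords: the helper name context_rows, the delimiter strings "<s>" and "</s>", and the output keys are all hypothetical. The sketch also assumes that delimiters count toward the window size but are not returned as rows themselves.

```python
def context_rows(words, target_idx, left_win, right_win,
                 delimiters=("<s>", "</s>")):
    """Hypothetical helper: one dict per context word around a target.

    Delimiters occupy window positions (they count toward left_win and
    right_win) but are skipped in the output; this is an assumption.
    """
    rows = []
    start = max(0, target_idx - left_win)
    end = min(len(words), target_idx + right_win + 1)
    for i in range(start, end):
        if i == target_idx or words[i] in delimiters:
            continue
        lo, hi = sorted((i, target_idx))
        # A delimiter strictly between the two positions means the
        # context word and the target are in different sentences.
        crossed = any(w in delimiters for w in words[lo + 1:hi])
        rows.append({
            "word": words[i],
            "distance": abs(i - target_idx),
            "side": "L" if i < target_idx else "R",
            "same_sentence": not crossed,
        })
    return rows


# Toy corpus: two sentences, target "dog" at index 6.
words = ["the", "cat", "sat", "</s>", "<s>", "a", "dog", "barked", "loudly"]
rows = context_rows(words, target_idx=6, left_win=4, right_win=2)
for row in rows:
    print(row)
```

With a left window of 4, the word "sat" falls inside the window even though it belongs to the previous sentence, which is why a separate same-sentence flag is useful alongside the distance.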