The function expects a dataframe with at least a column of token IDs (e.g. _id), a column with character vectors of context words (e.g. cws) and a column with the names of the clusters (e.g. cluster).
The example below shows how to turn ;-separated values into character vectors within a tibble dataframe.
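As a minimal sketch of that conversion, the snippet below builds a small tibble and splits its ;-separated context words into one character vector per row. The data values are invented for illustration, and the column names simply follow the examples mentioned above; they are not required by the function.

```r
library(tibble)

# Hypothetical token-level data; values are made up for illustration.
variables <- tibble(
  `_id` = c("token_1", "token_2", "token_3"),
  cws = c("dog;bark;loud", "cat;purr", "dog;leash"),
  cluster = factor(c("1", "2", "1"))  # cluster names stored as a factor
)

# strsplit() returns a list with one character vector per row,
# which the tibble stores as a list column.
variables$cws <- strsplit(variables$cws, ";", fixed = TRUE)
```

After this step, each element of the cws column is a character vector, so variables$cws[[1]] holds the three context words of the first token.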
Arguments
- variables
Dataframe with IDs, clusters and lists of context words
- cws_column
Character string: Name of the column with the character vectors (one per row) of context words
- cluster_column
Character string: Name of the column with the names of the clusters (must be a factor)
- b
Weight for computing fscore
Value
A tibble with one row per context word per cluster, with frequency information.
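To give a rough idea of the "one row per context word per cluster" shape, the sketch below unnests a list column of context words and counts them per cluster with tidyr and dplyr. This is only an illustration of the layout, not the function's own implementation: the data are invented, the column name freq is hypothetical, and the actual output may contain further columns (such as the fscore weighted by b).

```r
library(tibble)
library(tidyr)
library(dplyr)

# Hypothetical input with context words already stored as character vectors.
variables <- tibble(
  `_id` = c("token_1", "token_2", "token_3"),
  cws = list(c("dog", "bark"), c("cat", "purr"), c("dog", "leash")),
  cluster = factor(c("1", "2", "1"))
)

# One row per context word per token, then one row per
# context word per cluster with its raw frequency.
cw_counts <- variables %>%
  unnest(cws) %>%
  count(cluster, cws, name = "freq")  # "freq" is an illustrative name
```

Here the context word dog appears in two tokens of cluster 1, so its row for that cluster carries a frequency of 2.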