semasioFlow.load module

semasioFlow.load.loadColloc(fname, settings, row_vocab=None, fnames=None, col_vocab=None)

Load an existing collocation matrix or create one.

Parameters
  • fname (str) – Path where an existing collocation matrix is stored, or where a new one will be stored.

  • settings (dict) – Settings for creating the matrix and for extracting the encoding information.

  • row_vocab (Vocab, optional if fname exists) – Vocabulary for the rows of the collocation matrix.

  • fnames (str or list, optional) – Corpus file names.

  • col_vocab (Vocab, optional) – Vocabulary for the columns of the collocation matrix.

Returns

freqMTX – Type-level co-occurrence matrix.

Return type

TypeTokenMatrix

Note

If the file does not exist, the matrix is created and stored under the given file name.
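The load-or-create behaviour described in the note can be sketched generically. This is not the semasioFlow implementation: `load_or_create`, `count_pairs` and the JSON storage format are hypothetical stand-ins chosen so the example runs with the standard library alone, whereas the real function works with nephosem matrix objects.

```python
import json
import tempfile
from pathlib import Path


def load_or_create(fname, create):
    """Return the object stored at `fname`, creating and saving it first if needed."""
    path = Path(fname)
    if path.exists():
        # A stored result exists: read it back instead of recomputing.
        with path.open(encoding="utf-8") as handle:
            return json.load(handle)
    # No stored result yet: build it and save it for the next call.
    result = create()
    with path.open("w", encoding="utf-8") as handle:
        json.dump(result, handle)
    return result


calls = []


def count_pairs():
    # Hypothetical stand-in for an expensive pass over the corpus files.
    calls.append(1)
    return {"dog/noun": {"bark/verb": 3}}


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "toy.colloc.json"
    first = load_or_create(target, count_pairs)   # file missing: computed and stored
    second = load_or_create(target, count_pairs)  # file present: read from disk
```

The second call never runs `count_pairs`, which is the point of passing a path that may or may not exist yet.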

semasioFlow.load.loadFocRegisters(register_path, type_name, prefixes=['bow', 'rel', 'path'])

Load and combine first-order register dataframes.

Parameters
  • register_path (str) – Directory where the dataframes are stored.

  • type_name (str) – First part of the file names.

  • prefixes (list of str) – Infixes in the file names.

Returns

registers – Merged register dataframes

Return type

pandas.DataFrame

semasioFlow.load.loadMacro(templates_dir, graphml_name, macro_name)

Load patterns and templates to create dependency-based models.

The output can be used as macro argument for nephosem.depmodel.DepHandler objects.

Parameters
  • templates_dir (str) – Directory where the templates are stored.

  • graphml_name (str) – Basename of the pattern file (before the “.template.graphml” extension).

  • macro_name (str) – Basename of the feature-template file (before the “.target-feature-macro.xml” extension).

Returns

Macros built from the pattern and feature-template files.

Return type

list of MacroGraph
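The parameter descriptions imply how the two file names are assembled from the basenames. A minimal sketch, assuming both files sit directly inside templates_dir; `template_paths` is a hypothetical helper, not part of semasioFlow, and the directory and basenames in the call are made up for illustration.

```python
from pathlib import Path


def template_paths(templates_dir, graphml_name, macro_name):
    """Build the pattern and feature-template paths from their basenames."""
    base = Path(templates_dir)
    # The extensions are the ones named in the parameter descriptions.
    graphml_path = base / f"{graphml_name}.template.graphml"
    macro_path = base / f"{macro_name}.target-feature-macro.xml"
    return graphml_path, macro_path


# Hypothetical directory and basenames.
graphml_path, macro_path = template_paths("templates", "LEMMAPATH", "LEMMAPATH")
```

Checking that both paths exist before calling loadMacro is a cheap way to catch a basename that accidentally includes the extension.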

semasioFlow.load.loadVocab(fname, settings, fnames=None)

Load an existing vocabulary or create one.

Parameters
  • fname (str) – Path where an existing vocabulary is stored, or where a new one will be stored.

  • settings (dict) – Settings for creating the vocabulary and for extracting the encoding information.

  • fnames (str or list, optional) – Corpus file names.

Returns

vocab – The loaded or newly created vocabulary.

Return type

Vocab

Note

If the file does not exist, the vocabulary is created and stored under the given file name.