kindred.load

kindred.load(dataFormat, path, ignoreEntities=[], ignoreComplexRelations=True)

Load a corpus from one of several supported formats. If path is a directory, all files of the corresponding data type in it are loaded. For the standoff format, any associated annotation files (with suffixes .ann, .a1 or .a2) are loaded as well.

Parameters:
  • dataFormat (str) – Format of the data files to load ('standoff', 'biocxml', 'pubannotation' or 'simpletag')
  • path (str) – Path to the data. Can be a directory or an individual file. For standoff, this should be the .txt file.
  • ignoreEntities (list) – List of entity types to ignore while loading
  • ignoreComplexRelations (bool) – Whether to filter out relations where one argument is another relation (must be True as kindred doesn’t currently support complex relations)
Returns:
  Corpus of loaded documents
Return type:
  kindred.Corpus
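Example:

The following is a minimal sketch of calling kindred.load on a directory of standoff files, not an official example. The helper names write_standoff_example and load_example, the file name doc1, the sample sentence and the entity and relation types (drug, disease, treats) are all made up for illustration. It assumes the .ann file follows the usual brat-style standoff layout and that kindred is installed; if it is not, load_example returns None.

```python
import tempfile
from pathlib import Path

try:
    import kindred  # optional: the loading step is skipped if kindred is missing
except ImportError:
    kindred = None


def write_standoff_example(directory):
    """Write one tiny standoff document (doc1.txt plus doc1.ann) into directory.

    The sentence and annotation types are illustrative only.
    """
    directory = Path(directory)
    text = "Erlotinib is used to treat lung cancer."
    # Assumed brat-style lines: ID <TAB> type start end <TAB> covered text
    ann = (
        "T1\tdrug 0 9\tErlotinib\n"
        "T2\tdisease 27 38\tlung cancer\n"
        "R1\ttreats Arg1:T1 Arg2:T2\n"
    )
    (directory / "doc1.txt").write_text(text, encoding="utf-8")
    (directory / "doc1.ann").write_text(ann, encoding="utf-8")
    return text, ann


def load_example(directory):
    """Load every standoff document in directory, or return None without kindred."""
    if kindred is None:
        return None
    # Parameter names follow the signature documented above.
    return kindred.load(
        dataFormat="standoff",
        path=str(directory),
        ignoreEntities=[],            # e.g. ["disease"] to skip that entity type
        ignoreComplexRelations=True,  # documented as required to be True
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        write_standoff_example(tmp)
        corpus = load_example(tmp)
        print(type(corpus))
```

Passing the directory rather than a single file lets the loader pick up each .txt file together with its annotation file, as described above. To load one document only, pass the path to its .txt file instead.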