kindred.load
kindred.load(dataFormat, path, ignoreEntities=[], ignoreComplexRelations=True)
Load a corpus from one of several supported formats. If path is a directory, it will try to load all files of the corresponding data type in that directory. For standoff format, it will also use any associated annotation files (with suffixes .ann, .a1 or .a2).
Parameters:
- dataFormat (str) – Format of the data files to load ('standoff', 'biocxml', 'pubannotation', 'simpletag')
- path (str) – Path to the data. Can be a directory or an individual file. For standoff format, this should be the .txt file.
- ignoreEntities (list) – List of entity types to ignore while loading
- ignoreComplexRelations (bool) – Whether to filter out relations where one argument is another relation (must be True as kindred doesn’t currently support complex relations)
Returns: Corpus of loaded documents
Return type:
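
The call pattern described above can be sketched as follows. This is a minimal illustration, not an official example: the helper `write_standoff_example`, the file names and the sample sentence are hypothetical, and the `.a1` layout (one tab-separated entity per line, with character offsets into the text) is assumed to follow the usual BioNLP-style standoff convention. The `kindred` import is guarded so the file-writing part runs even where the package is not installed.

```python
import tempfile
from pathlib import Path


def write_standoff_example(directory):
    """Write a tiny standoff document (example.txt + example.a1) into directory.

    Hypothetical helper for illustration only; returns the path of the .txt file.
    """
    text = "The colorectal cancer was caused by mutations in APC"
    entities = [("disease", "colorectal cancer"), ("gene", "APC")]

    lines = []
    for i, (entity_type, mention) in enumerate(entities, start=1):
        # Character offsets are computed from the text rather than hard-coded
        start = text.index(mention)
        end = start + len(mention)
        lines.append(f"T{i}\t{entity_type} {start} {end}\t{mention}")

    directory = Path(directory)
    (directory / "example.txt").write_text(text, encoding="utf-8")
    (directory / "example.a1").write_text("\n".join(lines) + "\n", encoding="utf-8")
    return directory / "example.txt"


try:
    import kindred
except ImportError:  # kindred is optional for this sketch
    kindred = None

with tempfile.TemporaryDirectory() as tmp:
    txt_path = write_standoff_example(tmp)
    a1_contents = (Path(tmp) / "example.a1").read_text(encoding="utf-8")

    if kindred is not None:
        # path as a directory: tries to load every standoff document in it
        corpus = kindred.load("standoff", tmp)

        # path as a single file: for standoff, pass the .txt file;
        # entities of type 'gene' are skipped while loading
        single = kindred.load("standoff", str(txt_path), ignoreEntities=["gene"])

        print(type(corpus).__name__, type(single).__name__)
```

Passing the directory is convenient for a whole dataset, while passing the `.txt` file is useful for checking a single document. `ignoreComplexRelations` is left at its default of `True`, as the parameter description requires.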