kindred.Document

class kindred.Document(text, entities=None, relations=None, sourceFilename=None, metadata=None, loadFromSimpleTag=False)

Span of text with associated tagged entities and relations between entities.

Variables:
  • text – Text in document (plain text or SimpleTag)
  • entities – Entities in document
  • relations – Relations in document
  • sourceFilename – Filename that this document came from
  • metadata – IDs and other information associated with the source (e.g. PMID)
  • sentences – List of sentences (kindred.Sentence) if the document has been parsed

Methods

__init__(text, entities=None, relations=None, sourceFilename=None, metadata=None, loadFromSimpleTag=False)

Constructor for a Document. It accepts either text in the SimpleTag XML format, or plain text together with a set of entities and relations.

Parameters:
  • text (str) – Text in document (plain text or SimpleTag)
  • entities (list of kindred.Entity) – Entities in document
  • relations (list of kindred.Relation) – Relations in document
  • sourceFilename (str) – Filename that this document came from
  • metadata (dict) – IDs and other information associated with the source (e.g. PMID)
  • loadFromSimpleTag (bool) – If True, the text parameter is assumed to be in the SimpleTag format, and entities and relations are extracted from it
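
To make the loadFromSimpleTag idea concrete, here is a minimal standard-library sketch of pulling entities and their character offsets out of inline-tagged text. It does not call kindred. The tag syntax shown (an element named after the entity type, carrying an id attribute), the function name strip_inline_tags and the dictionary layout are all assumptions made for illustration; kindred's own SimpleTag parser may behave differently, and it also handles relations, which this sketch ignores.

```python
import re

# Hypothetical inline-tag markup, loosely modelled on SimpleTag.
# Matches e.g. <drug id="1">Erlotinib</drug>; nested tags are not supported.
TAG = re.compile(r'<(?P<type>\w+) id="(?P<id>[^"]+)">(?P<text>[^<]*)</(?P=type)>')


def strip_inline_tags(tagged):
    """Return (plain_text, entities) for a tagged string.

    Each entity is a dict with its type, id, surface text and the
    (start, end) character offsets of that text within plain_text.
    """
    plain_parts = []
    entities = []
    cursor = 0  # how far we have read in the tagged input
    offset = 0  # how much plain text has been emitted so far
    for match in TAG.finditer(tagged):
        # Copy the untagged text that precedes this entity.
        before = tagged[cursor:match.start()]
        plain_parts.append(before)
        offset += len(before)

        # Copy the entity text and record where it landed.
        surface = match.group('text')
        entities.append({
            'type': match.group('type'),
            'id': match.group('id'),
            'text': surface,
            'position': (offset, offset + len(surface)),
        })
        plain_parts.append(surface)
        offset += len(surface)
        cursor = match.end()

    # Copy whatever follows the last tag.
    plain_parts.append(tagged[cursor:])
    return ''.join(plain_parts), entities


tagged = ('<drug id="1">Erlotinib</drug> is a common treatment for '
          '<cancer id="2">NSCLC</cancer>.')
plain, entities = strip_inline_tags(tagged)
print(plain)
for entity in entities:
    print(entity)
```

The offsets are computed against the stripped text rather than the tagged input, so slicing the plain text with an entity's position gives back that entity's surface text.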

addEntity(entity)

Add an entity to this document. If the document has been parsed, the entity is also added to the sentence structure and associated with the corresponding tokens.

Parameters: entity (kindred.Entity) – Entity to add

addRelation(relation)

Add a relation to this document.

Parameters: relation (kindred.Relation) – Relation to add

addSentence(sentence)

Add a sentence to this document.

Parameters: sentence (kindred.Sentence) – Sentence to add

clone()

Clone the document.

Returns: Clone of the document
Return type: kindred.Document

removeEntities()

Remove all entities in this document.

removeRelations()

Remove all relations in this document.

splitIntoSentences()

Create a new corpus with one document for each sentence in this document.

Returns: Corpus with one document per sentence
Return type: kindred.Corpus
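
The one-document-per-sentence idea can be sketched without kindred. The example below is a standard-library illustration only: the function split_with_entities, the tuple layouts and the hand-written sentence boundaries are hypothetical, and it assumes that entity offsets are re-based so that they are relative to the start of their own sentence. How kindred itself finds sentence boundaries and carries entities and relations across is not described here.

```python
def split_with_entities(text, sentence_spans, entities):
    """Split text into per-sentence pieces, re-basing entity offsets.

    sentence_spans: list of (start, end) character offsets, one per sentence.
    entities: list of (entity_type, start, end) offsets into text.
    Returns a list of (sentence_text, local_entities) pairs, where the
    offsets in local_entities are relative to sentence_text.
    """
    pieces = []
    for s_start, s_end in sentence_spans:
        # Keep only entities that lie entirely inside this sentence and
        # shift their offsets so they index into the sentence text.
        local_entities = [
            (entity_type, e_start - s_start, e_end - s_start)
            for entity_type, e_start, e_end in entities
            if s_start <= e_start and e_end <= s_end
        ]
        pieces.append((text[s_start:s_end], local_entities))
    return pieces


text = "Erlotinib treats NSCLC. Aspirin relieves pain."
sentence_spans = [(0, 23), (24, 46)]  # hand-written for this example
entities = [("drug", 0, 9), ("cancer", 17, 22), ("drug", 24, 31)]

pieces = split_with_entities(text, sentence_spans, entities)
for sentence_text, local_entities in pieces:
    print(sentence_text, local_entities)
```

An entity that straddled a sentence boundary would be dropped by this sketch. A real implementation would need an explicit policy for that case.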