kindred.Document
class kindred.Document(text, entities=None, relations=None, sourceFilename=None, metadata=None, loadFromSimpleTag=False)
Span of text with associated tagged entities and relations between entities.
Variables:
- text – Text in document (plain text or SimpleTag)
- entities – Entities in document
- relations – Relations in document
- sourceFilename – Filename that this document came from
- metadata – IDs and other information associated with the source (e.g. PMID)
- sentences – List of sentences (kindred.Sentence) if the document has been parsed
Methods
__init__(text, entities=None, relations=None, sourceFilename=None, metadata=None, loadFromSimpleTag=False)
Constructor for a Document. It accepts either text in the SimpleTag XML format, or plain text together with a set of Entities and Relations.
Parameters:
- text (str) – Text in document (plain text or SimpleTag)
- entities (list of kindred.Entity) – Entities in document
- relations (list of kindred.Relation) – Relations in document
- sourceFilename (str) – Filename that this document came from
- metadata (dict) – IDs and other information associated with the source (e.g. PMID)
- loadFromSimpleTag (bool) – If True, the text parameter is assumed to be in the SimpleTag format, and entities and relations are extracted from it accordingly
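As a rough illustration of what loading from SimpleTag involves, the sketch below strips inline entity tags from a string and records the character offsets of each entity in the resulting plain text. It uses only the standard library and is not kindred's implementation: the tag shape (`<type id="...">text</type>`), the helper name `strip_simpletag`, and the example sentence are assumptions made for illustration, and relation tags are ignored.

```python
import re

# Assumed SimpleTag entity shape: <type id="N">entity text</type>
TAG = re.compile(r'<(?P<type>\w+) id="(?P<id>[^"]+)">(?P<text>[^<]*)</(?P=type)>')

def strip_simpletag(tagged):
    """Return (plain_text, entities); each entity is (type, id, start, end)."""
    plain_parts = []
    entities = []
    cursor = 0   # read position in the tagged input
    length = 0   # length of the plain text built so far
    for match in TAG.finditer(tagged):
        # Copy the untagged text that precedes this entity.
        before = tagged[cursor:match.start()]
        plain_parts.append(before)
        length += len(before)
        # Copy the entity text and record its span in the plain text.
        start = length
        plain_parts.append(match.group('text'))
        length += len(match.group('text'))
        entities.append((match.group('type'), match.group('id'), start, length))
        cursor = match.end()
    plain_parts.append(tagged[cursor:])
    return ''.join(plain_parts), entities

tagged = ('<drug id="1">Erlotinib</drug> is a common treatment for '
          '<cancer id="2">NSCLC</cancer>.')
plain, entities = strip_simpletag(tagged)
print(plain)
print(entities)
```

The offsets refer to the plain text rather than the tagged input, which is the form an entity list with positions would need if the same document were built from plain text and explicit entities instead.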
addEntity(entity)
Add an entity to this document. If the document has already been parsed, the entity is also added to the sentence structure and associated with tokens.
Parameters: entity (kindred.Entity) – Entity to add
addRelation(relation)
Add a relation to this document.
Parameters: relation (kindred.Relation) – Relation to add
addSentence(sentence)
Add a sentence to this document.
Parameters: sentence (kindred.Sentence) – Sentence to add
clone()
Clones the document.
Returns: Clone of the document
Return type: kindred.Document
splitIntoSentences()
Create a new corpus with one document for each sentence in this document.
Returns: Corpus with one document per sentence
Return type: kindred.Corpus