kindred.RelationClassifier

class kindred.RelationClassifier(classifierType='SVM', tfidf=True, features=None, threshold=None, entityCount=2, acceptedEntityTypes=None, model='en_core_web_sm')[source]

Manages binary classifier(s) for relation classification.

Parameters:
  • classifierType – Which classifier is used (‘SVM’ or ‘LogisticRegression’)
  • tfidf – Whether it will use tfidf for the vectorizer
  • features – A list of specific features. Valid features are “entityTypes”, “unigramsBetweenEntities”, “bigrams”, “dependencyPathEdges”, “dependencyPathEdgesNearEntities”
  • threshold – A specific threshold to use for classification (which will then use a logistic regression classifier)
  • entityCount – Number of entities in each relation (default=2). Passed to the CandidateBuilder (if needed)
  • acceptedEntityTypes – Tuples of entity types that relations must match. None will allow relations of any entity types. Passed to the CandidateBuilder (if needed)
  • isTrained – Whether the classifier has been trained yet. predict will throw an error if it is called before the classifier is trained.
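
To illustrate how entityCount and acceptedEntityTypes could interact, here is a minimal pure-Python sketch. It is not kindred's CandidateBuilder: the function name candidate_tuples, the (id, type) pair representation, the use of ordered tuples, and the 'Drug'/'Disease' types are all assumptions made for illustration.

```python
from itertools import permutations

def candidate_tuples(entities, entityCount=2, acceptedEntityTypes=None):
    """Hypothetical helper. entities is a list of (entity_id, entity_type) pairs."""
    candidates = []
    # Assumption: candidates are ordered tuples of distinct entities
    for combo in permutations(entities, entityCount):
        types = tuple(entity_type for _, entity_type in combo)
        # None accepts every combination of entity types
        if acceptedEntityTypes is None or types in acceptedEntityTypes:
            candidates.append(tuple(entity_id for entity_id, _ in combo))
    return candidates

# Made-up entities from a single sentence
entities = [('T1', 'Drug'), ('T2', 'Disease'), ('T3', 'Drug')]

all_pairs = candidate_tuples(entities)
drug_disease = candidate_tuples(entities, acceptedEntityTypes=[('Drug', 'Disease')])
print(len(all_pairs), drug_disease)
```

With acceptedEntityTypes left as None every ordered pair is kept; restricting it to ('Drug', 'Disease') keeps only the pairs whose types appear in that order.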

Methods

__init__(classifierType='SVM', tfidf=True, features=None, threshold=None, entityCount=2, acceptedEntityTypes=None, model='en_core_web_sm')[source]

Constructor for the RelationClassifier class

Parameters:
  • classifierType (str) – Which classifier to use (must be ‘SVM’ or ‘LogisticRegression’)
  • tfidf (bool) – Whether to use tfidf for the vectorizer
  • features (list of str) – A list of specific features. Valid features are “entityTypes”, “unigramsBetweenEntities”, “bigrams”, “dependencyPathEdges”, “dependencyPathEdgesNearEntities”
  • threshold (float) – A specific threshold to use for classification (which will then use a logistic regression classifier)
  • entityCount (int) – Number of entities in each relation (default=2). Passed to the CandidateBuilder (if needed)
  • acceptedEntityTypes (list of tuples) – Tuples of entity types that relations must match. None will allow relations of any entity types. Passed to the CandidateBuilder (if needed)
  • model (str) – Name of an available spaCy language model for any parsing needed (e.g. en/de/es/pt/fr/it/nl)
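
The keyword arguments above could be combined as in the following sketch. The dictionary names, the make_classifier helper, the threshold value 0.7, and the 'Drug'/'Disease' entity types are hypothetical; only the argument names and the listed feature names come from the signature documented here. kindred is imported inside the helper so the configurations can be inspected without the package installed.

```python
# Keyword arguments mirroring the documented constructor signature
SVM_DEFAULTS = dict(classifierType='SVM', tfidf=True)

# A threshold is paired with logistic regression, as the threshold description suggests
THRESHOLDED = dict(classifierType='LogisticRegression', threshold=0.7)

# Restrict relations to hypothetical (Drug, Disease) pairs and a subset of features
TYPED_PAIRS = dict(
    entityCount=2,
    acceptedEntityTypes=[('Drug', 'Disease')],
    features=['entityTypes', 'unigramsBetweenEntities'],
)

def make_classifier(**kwargs):
    """Hypothetical helper: build a RelationClassifier from keyword arguments."""
    import kindred  # assumes kindred and a spaCy model are installed
    return kindred.RelationClassifier(**kwargs)
```

Calling make_classifier(**THRESHOLDED) would then return a classifier configured with the threshold, assuming kindred is available in the environment.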

predict(corpus)[source]

Use the relation classifier to predict new relations for a corpus. The new relations will be added to the Corpus.

Parameters:
  • corpus (kindred.Corpus) – Corpus to make predictions on
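
When a threshold is set, one plausible reading is that a candidate becomes a predicted relation only if its estimated probability reaches the threshold. The sketch below illustrates that idea in plain Python. It is an assumption about the behaviour, not kindred's code: the function name apply_threshold, the inclusive comparison, and the scores are all made up.

```python
def apply_threshold(scored_candidates, threshold):
    """Hypothetical helper: keep candidates with probability >= threshold.

    scored_candidates is a list of (candidate, probability) pairs.
    """
    # Assumption: the comparison is inclusive; the real classifier may differ
    return [candidate for candidate, prob in scored_candidates if prob >= threshold]

# Made-up probabilities for three candidate entity pairs
scored = [(('T1', 'T2'), 0.91), (('T3', 'T2'), 0.42), (('T1', 'T4'), 0.70)]

kept = apply_threshold(scored, 0.7)
print(kept)
```

Raising the threshold in this sketch trades recall for precision: fewer candidates are kept, but each one has a higher estimated probability.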

train(corpus)[source]

Trains the classifier using this corpus. All relations in the corpus will be used for training.

Parameters:
  • corpus (kindred.Corpus) – Corpus to use for training
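
Putting train and predict together, a typical workflow might look like the sketch below. The helper name train_and_predict and its argument names are hypothetical; it uses only the constructor, train, and predict as documented on this page, and it assumes both arguments are kindred.Corpus objects prepared elsewhere.

```python
def train_and_predict(train_corpus, target_corpus, **classifier_kwargs):
    """Hypothetical helper: train on one corpus, then predict on another.

    train_corpus should already contain annotated relations; target_corpus
    is expected to contain entities but no relations yet.
    """
    import kindred  # assumes kindred and a spaCy model are installed

    classifier = kindred.RelationClassifier(**classifier_kwargs)
    # Train first: predict is documented to fail on an untrained classifier
    classifier.train(train_corpus)
    # Predicted relations are added to target_corpus in place
    classifier.predict(target_corpus)
    return target_corpus
```

Because predict adds relations to the corpus it receives, it is usually worth passing a copy if the original corpus needs to stay unchanged.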