kindred.Parser

class kindred.Parser(model='en_core_web_sm')

Runs spaCy on a corpus to split it into sentences and their associated tokens.

Variables:	model – Model for parsing (e.g. en/de/es/pt/fr/it/nl)
	nlp – The underlying spaCy language model used for parsing

Methods

__init__(model='en_core_web_sm')

Create a Parser object that uses spaCy for parsing. It supports every language that spaCy offers; see https://spacy.io/usage/models. Note that the language model must be downloaded before it can be used (e.g. python -m spacy download en_core_web_sm).

Parameters:	model (str) – Name of an available spaCy language model for parsing (e.g. en/de/es/pt/fr/it/nl)
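
Because the model has to be downloaded before a Parser can use it, it may help to check for the model up front and fail with a clear hint. The following is a minimal sketch, not part of kindred: the helper names model_is_installed and make_parser are hypothetical, and the check assumes the model was installed as a pip package (which is how python -m spacy download installs models). Models loaded from a path or a shortcut link would not be detected this way.

```python
import importlib.util


def model_is_installed(model_name):
    """Return True if an importable package with this name can be found.

    Assumption: spaCy models installed via `python -m spacy download`
    are ordinary Python packages named after the model.
    """
    return importlib.util.find_spec(model_name) is not None


def make_parser(model="en_core_web_sm"):
    """Hypothetical helper: build a kindred.Parser, failing early if the model is missing."""
    if not model_is_installed(model):
        raise RuntimeError(
            "spaCy model %r not found; try: python -m spacy download %s"
            % (model, model)
        )
    # Imported lazily so the model check above works even without kindred installed.
    import kindred

    return kindred.Parser(model=model)
```

With the default model installed, make_parser() is equivalent to calling kindred.Parser() directly; the only difference is the earlier, more explicit error when the model is absent.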

parse(corpus)

Parse the corpus. Each document is split into sentences, which are then tokenized and parsed to obtain their dependency graph. All parsed information is stored within the corpus object.

Parameters:	corpus (kindred.Corpus) – Corpus to parse
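
Since parse stores its results on the corpus rather than returning them, a short sketch of reading them back may be useful. This example rests on assumptions that are not stated in this section: that kindred.Corpus accepts a text string, that a parsed corpus exposes documents, that each document exposes sentences, each sentence exposes tokens, and each token exposes word. The helper names parse_text and words_per_sentence are hypothetical.

```python
def words_per_sentence(corpus):
    """Collect token words from a parsed corpus, one list per sentence.

    Assumes the attribute chain corpus.documents -> .sentences -> .tokens -> .word.
    """
    return [
        [token.word for token in sentence.tokens]
        for document in corpus.documents
        for sentence in document.sentences
    ]


def parse_text(text, model="en_core_web_sm"):
    """Hypothetical helper: build a one-document corpus and parse it."""
    # Imported lazily so words_per_sentence can be used without kindred installed.
    import kindred

    corpus = kindred.Corpus(text)  # assumption: Corpus takes raw text
    parser = kindred.Parser(model=model)
    parser.parse(corpus)  # results are stored on the corpus object itself
    return corpus


# Example usage (requires kindred and the spaCy model to be installed):
# corpus = parse_text("Erlotinib inhibits EGFR. It is used to treat lung cancer.")
# for words in words_per_sentence(corpus):
#     print(words)
```

Creating the Parser once and reusing it across corpora avoids reloading the spaCy model on every call; parse_text builds a new one each time only to keep the sketch short.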