Class that holds token-level linguistic information and the text of the token.

class UCREL_Token[source]

UCREL_Token(text:str, lemma:Optional[str]=None, pos_tag:Optional[str]=None, usas_tag:Optional[str]=None, mwe_tag:Optional[str]=None)

Class that holds token-level linguistic information and the text of the token.

This class is inspired by the Token class from the spaCy API.

UCREL_Token.__init__[source]

UCREL_Token.__init__(text:str, lemma:Optional[str]=None, pos_tag:Optional[str]=None, usas_tag:Optional[str]=None, mwe_tag:Optional[str]=None)

  1. text: Text of the token.
  2. lemma: Lemma of the token.
  3. pos_tag: POS tag of the token.
  4. usas_tag: USAS tag of the token.
  5. mwe_tag: Multi Word Expression (MWE) tag. This takes the form "unique ID.length of MWE.position in MWE", e.g. 2.2.1 means that the token is in the second unique MWE within its context, the length of the MWE is 2, and this is the first token in that MWE.
great_token = UCREL_Token('Great', 'great', 'JJ', 'A5.1+', '1.1.1')
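The MWE tag format described above can be unpacked with a small helper. The following is only an illustrative sketch: `parse_mwe_tag` and `MWEPosition` are hypothetical names that are not part of the UCREL_Token API, and the code assumes the tag always contains exactly three dot-separated integers.

```python
from typing import NamedTuple


class MWEPosition(NamedTuple):
    """Hypothetical container for the three parts of an MWE tag."""
    unique_id: int  # which MWE within the context
    length: int     # number of tokens in the MWE
    position: int   # 1-based position of this token within the MWE


def parse_mwe_tag(mwe_tag: str) -> MWEPosition:
    """Split a tag such as '2.2.1' into its three integer fields.

    Assumes exactly three dot-separated integers and raises
    ValueError for anything else.
    """
    parts = mwe_tag.split('.')
    if len(parts) != 3:
        raise ValueError(f'Expected three dot-separated fields, got: {mwe_tag!r}')
    unique_id, length, position = (int(part) for part in parts)
    return MWEPosition(unique_id, length, position)


parsed = parse_mwe_tag('2.2.1')
print(parsed.unique_id, parsed.length, parsed.position)
```

Here '2.2.1' yields the second unique MWE, an MWE length of 2, and the first position within it, matching the description of mwe_tag.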

UCREL_Token.__eq__[source]

UCREL_Token.__eq__(other:Any)

Compare another instance with the current instance of this class.

  1. other: Another instance. If it is not an instance of this class, a NotImplementedError is raised.

returns True if the two instances are the same based on the token attributes.

raises NotImplementedError: If the other instance is not of the same class type as self.

great_token = UCREL_Token('Great', 'great', 'JJ', 'A5.1+', '1.1.1')
assert great_token == UCREL_Token('Great', 'great', 'JJ', 'A5.1+', '1.1.1')

great_without_usas = UCREL_Token('Great', 'great', 'JJ', mwe_tag='1.1.1')
assert great_token != great_without_usas

try:
    {'text': 'Great', 'pos_tag': 'JJ'} == great_without_usas
except NotImplementedError:
    print('UCREL_Token instances can only be compared '
          'with other UCREL_Token instances:')
UCREL_Token instances can only be compared with other UCREL_Token instances:
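The comparison behaviour described above can be reproduced with a small stand-in class. This is a sketch under stated assumptions, not the library's implementation: `SketchToken` is a hypothetical class, and comparing via `vars()` is just one possible way to check equality "based on the token attributes".

```python
from typing import Any, Optional


class SketchToken:
    """Hypothetical stand-in used for illustration, not the real UCREL_Token."""

    def __init__(self, text: str, lemma: Optional[str] = None,
                 pos_tag: Optional[str] = None,
                 usas_tag: Optional[str] = None,
                 mwe_tag: Optional[str] = None) -> None:
        self.text = text
        self.lemma = lemma
        self.pos_tag = pos_tag
        self.usas_tag = usas_tag
        self.mwe_tag = mwe_tag

    def __eq__(self, other: Any) -> bool:
        # Refuse to compare with anything that is not the same class.
        if not isinstance(other, SketchToken):
            raise NotImplementedError('Can only compare SketchToken instances.')
        # Two tokens are equal when all of their attributes match.
        return vars(self) == vars(other)


full = SketchToken('Great', 'great', 'JJ', 'A5.1+', '1.1.1')
no_usas = SketchToken('Great', 'great', 'JJ', mwe_tag='1.1.1')
print(full == SketchToken('Great', 'great', 'JJ', 'A5.1+', '1.1.1'))
print(full != no_usas)
```

Note that the error is raised even when the dictionary is on the left-hand side of ==. Python first calls the dictionary's own __eq__, which returns NotImplemented for an unknown type, and then falls back to the reflected __eq__ of the token, which raises.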

UCREL_Token.__repr__[source]

UCREL_Token.__repr__()

String representation of the UCREL Token instance, format:

UCREL Token: {self.text} Lemma: {self.lemma} POS tag: {self.pos_tag} USAS tag: {self.usas_tag} MWE tag: {self.mwe_tag}

The Lemma, POS tag, USAS tag, and MWE tag fields only appear if they are not None.

print(UCREL_Token('Great', 'great', 'JJ', 'A5.1+', '1.1.1'))
UCREL Token: Great	Lemma: great	POS tag: JJ	USAS tag: A5.1+	MWE tag: 1.1.1
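To illustrate how fields that are None drop out of the representation, here is a minimal sketch. `format_token` is a hypothetical helper, not part of the API, and the tab separator is an assumption taken from the example output above.

```python
from typing import Optional


def format_token(text: str, lemma: Optional[str] = None,
                 pos_tag: Optional[str] = None,
                 usas_tag: Optional[str] = None,
                 mwe_tag: Optional[str] = None) -> str:
    """Hypothetical helper that mimics the documented string format."""
    parts = [f'UCREL Token: {text}']
    optional_fields = [('Lemma', lemma), ('POS tag', pos_tag),
                       ('USAS tag', usas_tag), ('MWE tag', mwe_tag)]
    for label, value in optional_fields:
        # Fields that are None are left out of the representation.
        if value is not None:
            parts.append(f'{label}: {value}')
    # The tab separator is an assumption based on the example output.
    return '\t'.join(parts)


print(format_token('Great', 'great', 'JJ', 'A5.1+', '1.1.1'))
print(format_token('Great', pos_tag='JJ'))
```

With only text and pos_tag supplied, the second call prints just the token text and the POS tag.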

UCREL_Token.to_json[source]

UCREL_Token.to_json()

returns This UCREL_Token as a JSON string.

great_token.to_json()
'{"text": "Great", "lemma": "great", "pos_tag": "JJ", "usas_tag": "A5.1+", "mwe_tag": "1.1.1"}'

Static Methods

UCREL_Token.from_json[source]

UCREL_Token.from_json(json_string:str)

A static method that, given a json_string, returns a UCREL_Token representation of that string.

  1. json_string: A string returned by the UCREL_Token.to_json method.

returns The given json_string represented as a UCREL_Token.

great_token_json_string = great_token.to_json()
another_great_token = UCREL_Token.from_json(great_token_json_string)
another_great_token
UCREL Token: Great	Lemma: great	POS tag: JJ	USAS tag: A5.1+	MWE tag: 1.1.1
great_token == another_great_token
True
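The JSON string shown for to_json has the same shape as what the standard library json module produces for a plain dictionary of the token attributes. The round trip can therefore be sketched without the class itself. This assumes, without confirmation from the source, that the serialisation is a flat JSON object keyed by attribute name; `token_fields` is a hypothetical variable.

```python
import json

# Attribute names and values taken from the to_json example output.
token_fields = {
    'text': 'Great',
    'lemma': 'great',
    'pos_tag': 'JJ',
    'usas_tag': 'A5.1+',
    'mwe_tag': '1.1.1',
}

# Serialise to a JSON string, then parse it back into a dictionary.
json_string = json.dumps(token_fields)
restored_fields = json.loads(json_string)

print(json_string)
print(restored_fields == token_fields)

# An attribute that is None is written as JSON null and read back as None.
partial_string = json.dumps({'text': 'Great', 'lemma': None})
print(partial_string)
```

If the assumption holds, passing `restored_fields` as keyword arguments to the constructor would be one way to rebuild an equivalent token.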