ranking_meta_data

pymusas.rankers.ranking_meta_data

RankingMetaData

@dataclass(init=True, repr=True, eq=True, order=False,
           unsafe_hash=False, frozen=True)
class RankingMetaData

A RankingMetaData object contains all of the meta data about a lexicon entry match during the tagging process. This meta data can then be used to determine the ranking of the match comapred to other matches within the same text/sentence that is being tagged.

Instance Attributes¶

lexicon_type : LexiconType
Type associated to the lexicon entry.
lexicon_n_gram_length : int
The n-gram size of the lexicon entry, e.g. *_noun boot*_noun will be of length 2 and all single word lexicon entries will be of length 1.
lexicon_wildcard_count : int
Number of wildcards in the lexicon entry, e.g. *_noun boot*_noun will be 2 and ski_noun boot_noun will be 0.
exclude_pos_information : bool
Whether the POS information was excluded in the match. This is only True when the match ignores the POS information for single word lexicon entries. This is always False when used in a Multi Word Expression (MWE) lexicon entry match.
lexical_match : LexicalMatch
What LexicalMatch the lexicon entry matched on.
token_match_start_index : int
Index of the first token in the lexicon entry match.
token_match_end_index : int
Index of the last token in the lexicon entry match.
lexicon_entry_match : str
The lexicon entry match, which can be either a single word or MWE entry match. In the case for single word this could be Car|noun and in the case for a MWE it would be it's template, e.g. snow_noun boots_noun.
semantic_tags : Tuple[str, ...]
The semantic tags associated with the lexicon entry. The semantic tags are in rank order, the most likely tag is the first tag in the tuple. The Tuple can be of variable length hence the ... in the type annotation.

lexicon_type

class RankingMetaData:
 | ...
 | lexicon_type: LexiconType = None

lexicon_n_gram_length

class RankingMetaData:
 | ...
 | lexicon_n_gram_length: int = None

lexicon_wildcard_count

class RankingMetaData:
 | ...
 | lexicon_wildcard_count: int = None

exclude_pos_information

class RankingMetaData:
 | ...
 | exclude_pos_information: bool = None

lexical_match

class RankingMetaData:
 | ...
 | lexical_match: LexicalMatch = None

token_match_start_index

class RankingMetaData:
 | ...
 | token_match_start_index: int = None

token_match_end_index

class RankingMetaData:
 | ...
 | token_match_end_index: int = None

lexicon_entry_match

class RankingMetaData:
 | ...
 | lexicon_entry_match: str = None

semantic_tags

class RankingMetaData:
 | ...
 | semantic_tags: Tuple[str, ...] = None

RankingMetaData​

Instance Attributes¶​

lexicon_type​

lexicon_n_gram_length​

lexicon_wildcard_count​

exclude_pos_information​

lexical_match​

token_match_start_index​

token_match_end_index​

lexicon_entry_match​

semantic_tags​