utils
pymusas.utils
token_pos_tags_in_lexicon_entry
def token_pos_tags_in_lexicon_entry(
    lexicon_entry: str
) -> Iterable[Tuple[str, str]]
Yields each token and its associated POS tag in the given lexicon_entry.
Parameters
- lexicon_entry : str
  Either a Multi Word Expression (MWE) template or a single word lexicon entry. An entry is a sequence of words/tokens and Part Of Speech (POS) tags, where each token is joined to its POS tag by an underscore and the resulting pairs are separated by a single whitespace, e.g. word1_POS1 word2_POS2 word3_POS3. For a single word lexicon entry it would be word1_POS1.
Returns
Iterable[Tuple[str, str]]
Raises
ValueError
If the lexicon entry, when split on whitespace and then split by _, does not create an Iterable[Tuple[str, str]] whereby each tuple contains the token text and its associated POS tag.
Examples
from pymusas.utils import token_pos_tags_in_lexicon_entry

mwe_template = 'East_noun London_noun is_det great_adj'
assert ([('East', 'noun'), ('London', 'noun'), ('is', 'det'), ('great', 'adj')]
        == list(token_pos_tags_in_lexicon_entry(mwe_template)))

single_word_lexicon = 'East_noun'
assert ([('East', 'noun')]
        == list(token_pos_tags_in_lexicon_entry(single_word_lexicon)))
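The parsing rule described in this section (split on whitespace, then split each piece on the underscore) can be sketched in plain Python. This is a minimal illustration and not the library's actual implementation: the name sketch_token_pos_tags is hypothetical, and the error message and the rejection of pieces that contain more than one underscore are assumptions.

```python
from typing import Iterable, Tuple


def sketch_token_pos_tags(lexicon_entry: str) -> Iterable[Tuple[str, str]]:
    """Hypothetical re-implementation, for illustration only."""
    # Split the entry into `token_POS` pieces on whitespace.
    for token_pos in lexicon_entry.split():
        # Assumption: each piece must split into exactly two parts.
        parts = token_pos.split('_')
        if len(parts) != 2:
            raise ValueError(
                f'Expected `token_POS`, got {token_pos!r} in {lexicon_entry!r}'
            )
        token, pos = parts
        yield token, pos


pairs = list(sketch_token_pos_tags('East_noun London_noun'))

# Because this sketch is a generator, the error surfaces only once
# the result is iterated, not when the function is first called.
try:
    list(sketch_token_pos_tags('East London_noun'))
    raised = False
except ValueError:
    raised = True
```

Here 'East' carries no POS tag, so the second call fails as soon as it is consumed. Whether the real function also defers the error until iteration is not stated in this reference.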
unique_pos_tags_in_lexicon_entry
def unique_pos_tags_in_lexicon_entry(
    lexicon_entry: str
) -> Set[str]
Returns the unique POS tag values in the given lexicon_entry.
Parameters
- lexicon_entry : str
  Either a Multi Word Expression (MWE) template or a single word lexicon entry. An entry is a sequence of words/tokens and Part Of Speech (POS) tags, where each token is joined to its POS tag by an underscore and the resulting pairs are separated by a single whitespace, e.g. word1_POS1 word2_POS2 word3_POS3. For a single word lexicon entry it would be word1_POS1.
Returns
Set[str]
Raises
ValueError
If the lexicon entry, when split on whitespace and then split by _, does not create a List[Tuple[str, str]] whereby each tuple contains the token text and its associated POS tag.
Examples
from pymusas.utils import unique_pos_tags_in_lexicon_entry

mwe_template = 'East_noun London_noun is_det great_adj'
assert ({'noun', 'adj', 'det'}
        == unique_pos_tags_in_lexicon_entry(mwe_template))

single_word_lexicon = 'East_noun'
assert {'noun'} == unique_pos_tags_in_lexicon_entry(single_word_lexicon)
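As a rough sketch of why the result is a set, the snippet below collects the POS half of each token_POS pair so that repeated tags collapse into one value. It is an illustration under stated assumptions, not the library's code: sketch_unique_pos_tags is a hypothetical name, and the validation and error message are assumed.

```python
from typing import Set


def sketch_unique_pos_tags(lexicon_entry: str) -> Set[str]:
    """Hypothetical re-implementation, for illustration only."""
    pos_tags: Set[str] = set()
    for token_pos in lexicon_entry.split():
        # Assumption: each piece must split into exactly two parts.
        parts = token_pos.split('_')
        if len(parts) != 2:
            raise ValueError(
                f'Expected `token_POS`, got {token_pos!r} in {lexicon_entry!r}'
            )
        # Keep only the POS tag; adding a duplicate to a set is a no-op.
        pos_tags.add(parts[1])
    return pos_tags


tags = sketch_unique_pos_tags('East_noun London_noun is_det great_adj')
# 'noun' occurs twice in the entry but only once in `tags`.
```

Since sets are unordered, compare the result against another set, as the examples for this function do, and not against a list.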