
pymusas.utils



This module contains various helper functions that are used in other modules.

Attributes

  • NEURAL_EXTRA_PACKAGES : list[str]
    The Python packages that are required for the pymusas[neural] extra.

NEURAL_EXTRA_PACKAGES

NEURAL_EXTRA_PACKAGES: list[str] = ['transformers', 'wsd_torch_models', 'torch']
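
A small usage sketch (an assumption, not taken from this page) showing how the constant can be passed to are_packages_installed, which is documented below:

from pymusas.utils import NEURAL_EXTRA_PACKAGES, are_packages_installed

# True only when 'transformers', 'wsd_torch_models', and 'torch' are all installed.
print(are_packages_installed(NEURAL_EXTRA_PACKAGES))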

token_pos_tags_in_lexicon_entry

def token_pos_tags_in_lexicon_entry(
    lexicon_entry: str
) -> Iterable[Tuple[str, str]]

Yields each token and its associated POS tag in the given lexicon_entry.

Parameters

  • lexicon_entry : str
    Either a Multi Word Expression (MWE) template or a single word lexicon entry: a sequence of words/tokens and Part Of Speech (POS) tags, in which each token is joined to its POS tag by an underscore and token/POS pairs are separated by a single whitespace, e.g. word1_POS1 word2_POS2 word3_POS3. A single word lexicon entry would be just word1_POS1.

Returns

  • Iterable[Tuple[str, str]]
    The token text and its associated POS tag for each token in the lexicon_entry.

Raises

  • ValueError
    If the lexicon entry, when split on whitespace and then split on _, does not produce an Iterable[Tuple[str, str]] in which each tuple contains the token text and its associated POS tag.

Examples

from pymusas.utils import token_pos_tags_in_lexicon_entry

mwe_template = 'East_noun London_noun is_det great_adj'
assert ([('East', 'noun'), ('London', 'noun'), ('is', 'det'), ('great', 'adj')]
        == list(token_pos_tags_in_lexicon_entry(mwe_template)))

single_word_lexicon = 'East_noun'
assert ([('East', 'noun')]
        == list(token_pos_tags_in_lexicon_entry(single_word_lexicon)))

unique_pos_tags_in_lexicon_entry

def unique_pos_tags_in_lexicon_entry(
    lexicon_entry: str
) -> Set[str]

Returns the unique POS tag values in the given lexicon_entry.

Parameters

  • lexicon_entry : str
    Either a Multi Word Expression (MWE) template or a single word lexicon entry: a sequence of words/tokens and Part Of Speech (POS) tags, in which each token is joined to its POS tag by an underscore and token/POS pairs are separated by a single whitespace, e.g. word1_POS1 word2_POS2 word3_POS3. A single word lexicon entry would be just word1_POS1.

Returns

  • Set[str]
    The unique POS tags found in the lexicon_entry.

Raises

  • ValueError
    If the lexicon entry, when split on whitespace and then split on _, does not produce a List[Tuple[str, str]] in which each tuple contains the token text and its associated POS tag.

Examples

from pymusas.utils import unique_pos_tags_in_lexicon_entry

mwe_template = 'East_noun London_noun is_det great_adj'
assert ({'noun', 'adj', 'det'}
        == unique_pos_tags_in_lexicon_entry(mwe_template))

single_word_lexicon = 'East_noun'
assert {'noun'} == unique_pos_tags_in_lexicon_entry(single_word_lexicon)

are_packages_installed

def are_packages_installed(packages: list[str]) -> bool

Returns True if all packages are installed, False otherwise.

Parameters

  • packages : list[str]
    A list of package names to check for installation.

Returns

  • bool
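
Examples

A minimal usage sketch (not taken from the original documentation); it assumes the check is performed by package name, so pymusas itself should be reported as installed:

from pymusas.utils import are_packages_installed

# `pymusas` is installed if we can import from it, so this should print True
# (assumption: the check is performed by package name).
print(are_packages_installed(['pymusas']))
# Adding a name that (almost certainly) does not exist should make the check fail.
print(are_packages_installed(['pymusas', 'package_that_does_not_exist']))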

neural_extra_installed

def neural_extra_installed() -> None

Checks whether the pymusas[neural] extra is installed by verifying that all of the packages required for the neural extra are available.

Raises

  • ImportError
    If pymusas[neural] is not installed.
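
Examples

A hedged sketch of how this check could be used to guard neural-only code paths; the pip command shown is the standard way to install an extra and is an assumption, not taken from this page:

from pymusas.utils import neural_extra_installed

try:
    neural_extra_installed()
    # Safe to import the neural-only dependencies from here on.
except ImportError:
    # Assumed install command for the extra; adjust to your environment.
    print('The neural extra is missing; install it with: pip install "pymusas[neural]"')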