pyms-nist-search 0.8.0 documentation

API Reference¶

PyMassSpec extension for searching mass spectra using NIST’s Mass Spectrum Search Engine.

base¶

Base class for other PyMassSpec NIST Search classes.

class NISTBase(name='', cas='---')[source]¶

Bases: object

Base class for other PyMassSpec NIST Search classes.

Parameters
  • name (str) – The name of the compound. Default ''.

  • cas (Union[str, int]) – The CAS number of the compound. Default '---'.

Methods:

__eq__(other)

Return self == other.

__str__()

Return str(self).

from_dict(dictionary)

Construct an object from a dictionary.

from_json(json_data)

Construct an object from JSON data.

from_pynist(pynist_dict)

Create an object from the raw data returned by the C extension.

to_dict()

Convert the object to a dictionary.

to_json()

Convert the object to JSON.

Attributes:

cas

The CAS number of the compound.

name

The name of the compound.

__eq__(other)[source]¶

Return self == other.

Return type

bool

__str__()[source]¶

Return str(self).

Return type

str

property cas¶

The CAS number of the compound.

Return type

str

classmethod from_dict(dictionary)[source]¶

Construct an object from a dictionary.

Parameters

dictionary (Dict[str, Any])

classmethod from_json(json_data)[source]¶

Construct an object from JSON data.

Parameters

json_data (str)

classmethod from_pynist(pynist_dict)[source]¶

Create an object from the raw data returned by the C extension.

Parameters

pynist_dict (Dict[str, Any])

property name¶

The name of the compound.

Return type

str

to_dict()[source]¶

Convert the object to a dictionary.

New in version 0.6.0.

Return type

Dict[str, Any]

to_json()[source]¶

Convert the object to JSON.

Return type

str
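
The dictionary and JSON methods above suggest a simple round trip. The following is a minimal sketch of that pattern using a hypothetical stand-in class (CompoundSketch); it is not the library's NISTBase, and the dictionary keys shown are assumptions made for illustration.

```python
import json
from typing import Any, Dict


class CompoundSketch:
    """Hypothetical stand-in that mirrors the method names documented above."""

    def __init__(self, name: str = '', cas: str = "---"):
        self.name = name
        self.cas = str(cas)

    def to_dict(self) -> Dict[str, Any]:
        # Assumed key names; the real class may use different ones.
        return {"name": self.name, "cas": self.cas}

    def to_json(self) -> str:
        return json.dumps(self.to_dict())

    @classmethod
    def from_dict(cls, dictionary: Dict[str, Any]) -> "CompoundSketch":
        return cls(**dictionary)

    @classmethod
    def from_json(cls, json_data: str) -> "CompoundSketch":
        return cls.from_dict(json.loads(json_data))

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, CompoundSketch):
            return NotImplemented
        return self.to_dict() == other.to_dict()


original = CompoundSketch(name="Caffeine", cas="58-08-2")
restored = CompoundSketch.from_json(original.to_json())
```

In this sketch, restored compares equal to original because both serialise to the same dictionary.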

docker_engine¶

Search engine for Linux and other platforms supporting Docker.

Classes:

Engine(lib_path[, lib_type, work_dir, debug])

Search engine for Linux and other platforms supporting Docker.

Functions:

hit_list_from_json(json_data)

Parse JSON data into a list of SearchResult objects.

hit_list_with_ref_data_from_json(json_data)

Parse JSON data into a list of (SearchResult, ReferenceData) tuples.

require_init(func)

Decorator to ensure that functions do not run after the class has been uninitialised.

class Engine(lib_path, lib_type=1, work_dir=None, debug=False)[source]¶

Bases: object

Search engine for Linux and other platforms supporting Docker.

The first time the engine is initialized it will download the latest version of the docker image automatically. This can also be performed manually, such as to upgrade to the latest version, with the following command:

docker pull domdfcoding/pywine-pyms-nist

The engine must be uninitialized when it is no longer required, so that the underlying docker container is shut down. This is done with the uninit() method. Alternatively, the class can be used as a context manager, which uninitializes the engine automatically on leaving the with block:

with pyms_nist_search.Engine(
        FULL_PATH_TO_MAIN_LIBRARY,
        pyms_nist_search.NISTMS_MAIN_LIB,
        FULL_PATH_TO_WORK_DIR,
        ) as search:
    search.full_spectrum_search(ms, n_hits=5)

Changed in version 0.6.0: Added context manager support.
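
To illustrate why the with block guarantees cleanup, here is a minimal sketch of the context manager protocol. SketchEngine and its initialised flag are hypothetical stand-ins; the real Engine shuts down a docker container in uninit(), whereas this sketch only flips a flag.

```python
class SketchEngine:
    """Hypothetical stand-in; not the real pyms_nist_search.Engine."""

    def __init__(self):
        self.initialised = True

    def uninit(self):
        # The real method would shut down the docker container.
        self.initialised = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs on leaving the with block, even if an exception was raised.
        self.uninit()
        return False  # do not swallow exceptions


with SketchEngine() as search:
    inside = search.initialised

after = search.initialised
```

Here inside records the state within the block and after records the state once the block has been left.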

Parameters
  • lib_path (Union[str, Path, PathLike, Sequence[Tuple[Union[str, Path, PathLike], int]]]) – The path to the mass spectral library.

  • lib_type (int) – The type of library. One of NISTMS_MAIN_LIB, NISTMS_USER_LIB, NISTMS_REP_LIB. Default 1.

  • work_dir (Union[str, Path, PathLike, None]) – The path to the working directory. Default None.

Methods:

cas_search(cas)

Search for a compound by CAS number.

full_search_with_ref_data(mass_spec[, n_hits])

Perform a Full Spectrum Search of the mass spectral library, including reference data.

full_spectrum_search(mass_spec[, n_hits])

Perform a Full Spectrum Search of the mass spectral library.

get_active_libs()

Returns the active libraries, as their (zero-based) indices in the output of get_lib_names().

get_lib_paths()

Returns the list of library names currently in use.

get_reference_data(spec_loc)

Get reference data from the library for the compound at the given location.

spectrum_search(mass_spec[, n_hits])

Perform a Quick Spectrum Search of the mass spectral library.

uninit()

Uninitialize the Search Engine.

Attributes:

image_name

The name (and label) of the docker image to use.

static cas_search(cas)[source]¶

Search for a compound by CAS number.

Note

This function does not appear to work with user libraries converted using LIB2NIST.

Parameters

cas (str)

Return type

List[SearchResult]

Returns

List of results for CAS number (usually just one result).

full_search_with_ref_data(mass_spec, n_hits=5)[source]¶

Perform a Full Spectrum Search of the mass spectral library, including reference data.

Parameters
  • mass_spec (MassSpectrum) – The mass spectrum to search against the library.

  • n_hits (int) – The number of hits to return. Default 5.

Return type

List[Tuple[SearchResult, ReferenceData]]

Returns

List of tuples containing possible identities for the mass spectrum, and the reference data.

full_spectrum_search(mass_spec, n_hits=5)[source]¶

Perform a Full Spectrum Search of the mass spectral library.

Parameters
  • mass_spec (MassSpectrum) – The mass spectrum to search against the library.

  • n_hits (int) – The number of hits to return. Default 5.

Return type

List[SearchResult]

Returns

List of possible identities for the mass spectrum.

get_active_libs()[source]¶

Returns the active libraries, as their (zero-based) indices in the output of get_lib_names().

Return type

List[int]

get_lib_paths()[source]¶

Returns the list of library names currently in use.

Return type

List[str]

get_reference_data(spec_loc)[source]¶

Get reference data from the library for the compound at the given location.

Parameters

spec_loc (int)

Return type

ReferenceData

image_name = 'domdfcoding/pywine-pyms-nist:latest'¶

Type:    str

The name (and label) of the docker image to use.

New in version 0.8.0.

spectrum_search(mass_spec, n_hits=5)[source]¶

Perform a Quick Spectrum Search of the mass spectral library.

Parameters
  • mass_spec (MassSpectrum) – The mass spectrum to search against the library.

  • n_hits (int) – The number of hits to return. Default 5.

Return type

List[SearchResult]

Returns

List of possible identities for the mass spectrum.

uninit()[source]¶

Uninitialize the Search Engine.

hit_list_from_json(json_data)[source]¶

Parse JSON data into a list of SearchResult objects.

Parameters

json_data (str)

Return type

List[SearchResult]

hit_list_with_ref_data_from_json(json_data)[source]¶

Parse JSON data into a list of (SearchResult, ReferenceData) tuples.

Parameters

json_data (str)

Return type

List[Tuple[SearchResult, ReferenceData]]

require_init(func)[source]¶

Decorator to ensure that functions do not run after the class has been uninitialised.

Parameters

func (Callable) – The function or method to wrap.

Return type

Callable
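
A decorator of this kind can be sketched as follows. This is not the library's implementation: the name require_init_sketch, the initialised attribute, and the choice of RuntimeError are all assumptions made for illustration.

```python
import functools


def require_init_sketch(func):
    """Hypothetical guard: refuse to run a method on an uninitialised object."""

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        # Assumed attribute name; the real class may track state differently.
        if not self.initialised:
            raise RuntimeError("The engine has been uninitialised.")
        return func(self, *args, **kwargs)

    return wrapper


class DemoEngine:
    """Hypothetical stand-in used only to exercise the decorator."""

    def __init__(self):
        self.initialised = True

    def uninit(self):
        self.initialised = False

    @require_init_sketch
    def search(self):
        return "searching"


engine = DemoEngine()
first_call = engine.search()
engine.uninit()
try:
    engine.search()
    blocked = False
except RuntimeError:
    blocked = True
```

functools.wraps keeps the wrapped method's name and docstring, which matters for generated documentation such as this page.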

reference_data¶

Class to store reference data from NIST MS Search.

class ReferenceData(name='', cas='---', nist_no=0, id='', mw=0.0, formula='', contributor='', mass_spec=None, synonyms=None, exact_mass=None, lib_idx=0)[source]¶

Bases: NISTBase

Class to store reference data from NIST MS Search.

Parameters
  • name (str) – The name of the compound. Default ''.

  • cas (Union[str, int]) – The CAS number of the compound. Default '---'.

  • nist_no (Union[int, str]) – Default 0.

  • id (Union[str, int]) – Default ''.

  • mw (Union[float, str]) – Default 0.0.

  • formula (str) – The formula of the compound. Default ''.

  • contributor (str) – The contributor to the library. Default ''.

  • mass_spec (Optional[MassSpectrum]) – The reference mass spectrum. Default None.

  • synonyms (Optional[Sequence[str]]) – List of synonyms for the compound. Default None.

  • exact_mass (Optional[Any]) – Not used. Default None.

  • lib_idx (int) – The (zero-based) index of the library the result was found in (see get_lib_names()). Default 0.

Methods:

__repr__()

Return a string representation of the ReferenceData.

from_jcamp(file_name[, ignore_warnings])

Create a ReferenceData object from a JCAMP-DX file.

from_json(json_data)

Construct an object from JSON data.

from_mona_dict(mona_data)

Construct an object from MassBank of North America (MoNA) JSON data that has been loaded into a dictionary.

from_pynist(pynist_dict)

Create a ReferenceData object from the raw data returned by the C extension.

to_dict()

Convert the object to a dictionary.

to_json()

Convert the object to JSON.

to_msp()

Returns the ReferenceData object as an MSP file similar to that produced by NIST MS Search's export function.

Attributes:

contributor

The name of the contributor to the library.

exact_mass

The exact mass of the compound (not used).

formula

The formula of the compound.

id

The ID of the compound.

lib_idx

The (zero-based) index of the library the result was found in (see get_lib_names()).

mass_spec

The mass spectrum of the compound.

mw

The molecular weight of the compound.

nist_no

The NIST number of the compound.

synonyms

A list of synonyms for the compound.

__repr__()[source]¶

Return a string representation of the ReferenceData.

Return type

str

property contributor¶

The name of the contributor to the library.

Return type

str

property exact_mass¶

The exact mass of the compound (not used).

Return type

float

property formula¶

The formula of the compound.

Return type

str

classmethod from_jcamp(file_name, ignore_warnings=True)[source]¶

Create a ReferenceData object from a JCAMP-DX file.

Parameters
  • file_name (Union[str, Path, PathLike]) – Path of the file to read.

  • ignore_warnings (bool) – Whether to ignore (suppress) warnings about invalid tags. Default True.

Authors

Qiao Wang, Andrew Isaac, Vladimir Likic, David Kainer, Dominic Davis-Foster

Return type

ReferenceData

classmethod from_json(json_data)[source]¶

Construct an object from JSON data.

Parameters

json_data (str)

Return type

ReferenceData

classmethod from_mona_dict(mona_data)[source]¶

Construct an object from MassBank of North America (MoNA) JSON data that has been loaded into a dictionary.

Parameters

mona_data (Dict)

Return type

ReferenceData

classmethod from_pynist(pynist_dict)[source]¶

Create a ReferenceData object from the raw data returned by the C extension.

Parameters

pynist_dict (Dict[str, Any])

Return type

ReferenceData

property id¶

The ID of the compound.

Return type

str

property lib_idx¶

The (zero-based) index of the library the result was found in (see get_lib_names()).

Return type

int

property mass_spec¶

The mass spectrum of the compound.

Return type

Optional[MassSpectrum]

property mw¶

The molecular weight of the compound.

Return type

int

property nist_no¶

The NIST number of the compound.

Return type

int

property synonyms¶

A list of synonyms for the compound.

Return type

List[str]

to_dict()[source]¶

Convert the object to a dictionary.

New in version 0.6.0.

Return type

Dict[str, Any]

to_json()[source]¶

Convert the object to JSON.

Return type

str

to_msp()[source]¶

Returns the ReferenceData object as an MSP file similar to that produced by NIST MS Search’s export function.

Return type

str

search_result¶

Class to store search results from NIST MS Search.

class SearchResult(name='', cas='---', match_factor=0, reverse_match_factor=0, hit_prob=0.0, spec_loc=0, lib_idx=0)[source]¶

Bases: NISTBase

Class to store search results from NIST MS Search.

Parameters
  • name (str) – The name of the compound. Default ''.

  • cas (Union[str, int]) – The CAS number of the compound. Default '---'.

  • match_factor (float) – Default 0.

  • reverse_match_factor (float) – Default 0.

  • hit_prob (float) – Default 0.0.

  • spec_loc (float) – The location of the reference spectrum in the library. Default 0.

  • lib_idx (int) – The (zero-based) index of the library the result was found in (see get_lib_names()). Default 0.

Methods:

from_pynist(pynist_dict)

Create a SearchResult object from the raw data returned by the C extension.

to_dict()

Convert the object to a dictionary.

Attributes:

hit_prob

Returns the probability of the hit being the compound responsible for the mass spectrum.

lib_idx

The (zero-based) index of the library the result was found in (see get_lib_names()).

match_factor

Returns a score (out of 1000) representing the similarity of the searched mass spectrum to the search result.

reverse_match_factor

A score (out of 1000) representing the similarity of the searched mass spectrum to the search result, but ignoring any peaks that are in the searched mass spectrum but not in the library spectrum.

spec_loc

The location of the reference spectrum in the library.

classmethod from_pynist(pynist_dict)[source]¶

Create a SearchResult object from the raw data returned by the C extension.

Parameters

pynist_dict (Dict[str, Any])

Return type

SearchResult

property hit_prob¶

Returns the probability of the hit being the compound responsible for the mass spectrum.

Return type

float

property lib_idx¶

The (zero-based) index of the library the result was found in (see get_lib_names()).

Return type

int

property match_factor¶

Returns a score (out of 1000) representing the similarity of the searched mass spectrum to the search result.

Return type

int

property reverse_match_factor¶

A score (out of 1000) representing the similarity of the searched mass spectrum to the search result, but ignoring any peaks that are in the searched mass spectrum but not in the library spectrum.

Return type

int
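
As an illustration of how the two scores might be used together, the sketch below keeps only hits whose match factor and reverse match factor both reach a threshold. HitSketch is a hypothetical stand-in for SearchResult, and the thresholds of 800 are arbitrary values chosen for the example, not recommendations from the library.

```python
from typing import List, NamedTuple


class HitSketch(NamedTuple):
    """Hypothetical stand-in carrying the two scores described above."""

    name: str
    match_factor: int
    reverse_match_factor: int


def confident_hits(
        hits: List[HitSketch],
        min_match: int = 800,  # arbitrary example threshold (scores are out of 1000)
        min_reverse: int = 800,  # arbitrary example threshold
        ) -> List[HitSketch]:
    # Keep hits that score well both with and without the extra query peaks.
    return [
        hit for hit in hits
        if hit.match_factor >= min_match and hit.reverse_match_factor >= min_reverse
    ]


hits = [
    HitSketch("compound A", 920, 940),
    HitSketch("compound B", 610, 890),
    HitSketch("compound C", 850, 700),
]
kept = confident_hits(hits)
```

A hit like "compound B", with a much higher reverse match factor than match factor, may indicate that the searched spectrum contains peaks absent from the library spectrum.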

property spec_loc¶

The location of the reference spectrum in the library.

This can then be searched using the get_reference_data() method of the search engine to obtain the reference data.

Return type

int

to_dict()[source]¶

Convert the object to a dictionary.

New in version 0.6.0.

Return type

Dict[str, Any]

utils¶

General utilities.

Functions:

lib_name_from_path(lib_path)

Given the path to a mass spectral library, returns the library name (the final path component).

pack(mass_spec[, top])

Convert a pyms.Spectrum.MassSpectrum object into a string.

parse_name_chars(name_char_list)

Takes a list of Unicode character codes and converts them to characters, taking into account the special codes used by the NIST DLL.

lib_name_from_path(lib_path)[source]¶

Given the path to a mass spectral library, returns the library name (the final path component).

Return type

str
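
Taking the final path component can be sketched with the standard library as shown below. The function name lib_name_from_path_sketch and the example path are hypothetical, and the library's own implementation may differ.

```python
from pathlib import PurePath


def lib_name_from_path_sketch(lib_path) -> str:
    """Return the final component of a library path (illustrative only)."""
    # PurePath.name ignores a trailing separator, so "/data/nist/mainlib/"
    # and "/data/nist/mainlib" give the same result.
    return PurePath(lib_path).name


name = lib_name_from_path_sketch("/data/nist/mainlib")
```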

pack(mass_spec, top=20)[source]¶

Convert a pyms.Spectrum.MassSpectrum object into a string.

Adapted from https://sourceforge.net/projects/mzapi-live/

Parameters
  • mass_spec (MassSpectrum)

  • top (int) – The number of largest peaks to identify. Default 20.

Return type

str
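
The role of the top parameter can be illustrated by selecting the most intense peaks from parallel mass and intensity lists. This sketch covers only the peak selection; it does not reproduce the string format that pack() produces, and the function name and sample values are hypothetical.

```python
import heapq
from typing import List, Sequence, Tuple


def largest_peaks_sketch(
        masses: Sequence[float],
        intensities: Sequence[float],
        top: int = 20,
        ) -> List[Tuple[float, float]]:
    """Return the `top` (mass, intensity) pairs with the highest intensity."""
    pairs = zip(masses, intensities)
    # nlargest returns the pairs sorted from most to least intense.
    return heapq.nlargest(top, pairs, key=lambda pair: pair[1])


peaks = largest_peaks_sketch([50, 51, 52, 53], [10, 200, 5, 80], top=2)
```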

parse_name_chars(name_char_list)[source]¶

Takes a list of Unicode character codes and converts them to characters, taking into account the special codes used by the NIST DLL.

Parameters

name_char_list (Sequence[int])

Return type

str

Returns

The parsed name.

win_engine¶

Search engine for Windows systems.

class Engine(lib_path, lib_type=1, work_dir=None, debug=False)[source]¶

Bases: object

Search engine for Windows systems.

Parameters
  • lib_path (Union[str, Path, PathLike, Sequence[Tuple[Union[str, Path, PathLike], int]]]) – The path to the mass spectral library.

  • lib_type (int) – The type of library. One of NISTMS_MAIN_LIB, NISTMS_USER_LIB, NISTMS_REP_LIB. Default 1.

  • work_dir (Union[str, Path, PathLike, None]) – The path to the working directory. Default None.

Methods:

cas_search(cas)

Search for a compound by CAS number.

full_search_with_ref_data(mass_spec[, n_hits])

Perform a Full Spectrum Search of the mass spectral library, including reference data.

full_spectrum_search(mass_spec[, n_hits])

Perform a Full Spectrum Search of the mass spectral library.

get_active_libs()

Returns the active libraries, as their (zero-based) indices in the output of get_lib_names().

get_lib_paths()

Returns the list of library names currently in use.

get_reference_data(spec_loc)

Get reference data from the library for the compound at the given location.

spectrum_search(mass_spec[, n_hits])

Perform a Quick Spectrum Search of the mass spectral library.

uninit()

Uninitialize the Search Engine.

static cas_search(cas)[source]¶

Search for a compound by CAS number.

Note

This function does not appear to work with user libraries converted using LIB2NIST.

Parameters

cas (str)

Return type

List[SearchResult]

Returns

List of results for CAS number (usually just one result).

full_search_with_ref_data(mass_spec, n_hits=5)[source]¶

Perform a Full Spectrum Search of the mass spectral library, including reference data.

Parameters
  • mass_spec (MassSpectrum) – The mass spectrum to search against the library.

  • n_hits (int) – The number of hits to return. Default 5.

Return type

List[Tuple[SearchResult, ReferenceData]]

Returns

List of tuples containing possible identities for the mass spectrum, and the reference data.

full_spectrum_search(mass_spec, n_hits=5)[source]¶

Perform a Full Spectrum Search of the mass spectral library.

Parameters
  • mass_spec (MassSpectrum) – The mass spectrum to search against the library.

  • n_hits (int) – The number of hits to return. Default 5.

Return type

List[SearchResult]

Returns

List of possible identities for the mass spectrum.

static get_active_libs()[source]¶

Returns the active libraries, as their (zero-based) indices in the output of get_lib_names().

Return type

List[int]

get_lib_paths()[source]¶

Returns the list of library names currently in use.

Return type

List[str]

static get_reference_data(spec_loc)[source]¶

Get reference data from the library for the compound at the given location.

Parameters

spec_loc (int)

Return type

ReferenceData

static spectrum_search(mass_spec, n_hits=5)[source]¶

Perform a Quick Spectrum Search of the mass spectral library.

Parameters
  • mass_spec (MassSpectrum) – The mass spectrum to search against the library.

  • n_hits (int) – The number of hits to return. Default 5.

Return type

List[SearchResult]

Returns

List of possible identities for the mass spectrum.

uninit()[source]¶

Uninitialize the Search Engine.

Copyright © 2020-2021 Dominic Davis-Foster