MSA File
This module defines functions and classes for parsing, manipulating, and analyzing multiple sequence alignments.
-
class MSAFile(msa, mode='r', format=None, aligned=True, **kwargs)
Handle MSA files in FASTA, SELEX, CLUSTAL and Stockholm formats.
msa may be a filename or a stream. Multiple sequence alignments can be read from or written in FASTA (.fasta), Stockholm (.sth), CLUSTAL (.aln), or SELEX (.slx) format. When the filename has one of these extensions, the format argument is not needed. If aligned is True, unaligned sequences in the file or stream will raise an IOError exception. filter, a function that returns a boolean, can be used for filtering sequences; see MSAFile.setFilter() for details. slice can be used to slice sequences and is applied after filtering; see MSAFile.setSlice() for details.
-
setFilter(filter, filter_full=False)
Set the function used for filtering sequences. By default, filter is applied to the split sequence label. If filter_full is True, filter is applied to the full label.
-
setSlice(slice)
Set the object used to slice sequences, which may be a slice() or a list() of numbers.
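The order of operations described above (filter first, then slice) can be illustrated with a small pure-Python sketch. It does not use this module: filter_then_slice is a hypothetical helper, and the two-argument (label, sequence) filter signature is an assumption made for the illustration.

```python
def filter_then_slice(records, keep=None, index=None):
    """Yield (label, sequence) pairs, filtering first and slicing second.

    Hypothetical helper for illustration only; not part of this module.
    """
    for label, seq in records:
        # Filtering comes first, so rejected sequences are never sliced.
        if keep is not None and not keep(label, seq):
            continue
        if index is None:
            yield label, seq
        elif isinstance(index, slice):
            # A slice object selects a contiguous (or strided) range of columns.
            yield label, seq[index]
        else:
            # A list of numbers selects individual columns.
            yield label, ''.join(seq[i] for i in index)


records = [
    ('seqA/1-6', 'ACD-EF'),
    ('seqB/1-6', 'ACDQEF'),
    ('other/3-8', '--DQEF'),
]

# Keep labels starting with 'seq', then take the first three columns.
first_three = list(filter_then_slice(
    records,
    keep=lambda label, seq: label.startswith('seq'),
    index=slice(0, 3),
))

# Keep everything, then pick columns 0, 2, and 4.
picked = list(filter_then_slice(records, index=[0, 2, 4]))
```

Because slicing runs only on sequences that pass the filter, a filter that inspects the sequence sees the full, unsliced row.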
-
closed
True if the file is closed.
-
format
Format of the MSA file.
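The class description lists one extension per format, which is why format can be omitted for those filenames. A minimal sketch of that lookup follows. The names EXTENSION_TO_FORMAT and guess_format_sketch are hypothetical, and the case-insensitive match and the ValueError for unknown extensions are assumptions of this sketch, not documented behavior of MSAFile.

```python
import os

# Extensions and format names as listed in the class description above.
EXTENSION_TO_FORMAT = {
    '.fasta': 'FASTA',
    '.sth': 'Stockholm',
    '.aln': 'CLUSTAL',
    '.slx': 'SELEX',
}


def guess_format_sketch(filename, format=None):
    """Return the explicit format if given, else infer it from the extension.

    Hypothetical helper for illustration only; not part of this module.
    """
    if format is not None:
        return format
    extension = os.path.splitext(filename)[1].lower()
    try:
        return EXTENSION_TO_FORMAT[extension]
    except KeyError:
        # Assumption: an unrecognized extension is treated as an error.
        raise ValueError('cannot infer MSA format from {0!r}; '
                         'pass format explicitly'.format(filename))
```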
-
-
splitSeqLabel(label)
Returns the label, starting residue number, and ending residue number parsed from a sequence label.
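To make the return value concrete, here is a rough sketch of this kind of label parsing. It assumes labels follow the common "name/start-end" convention (as in Pfam alignments) and that labels without a residue range come back unchanged with None for both numbers. split_label_sketch is a hypothetical stand-in, not the function's actual implementation.

```python
import re


def split_label_sketch(label):
    """Split 'name/start-end' into (name, start, end).

    Hypothetical helper for illustration only. Labels that do not end in a
    '/start-end' residue range are returned as (label, None, None).
    """
    match = re.fullmatch(r'(.+)/(\d+)-(\d+)', label.strip())
    if match is None:
        return label, None, None
    name, start, end = match.groups()
    return name, int(start), int(end)


with_range = split_label_sketch('YKR5_CAEEL/12-86')
without_range = split_label_sketch('YKR5_CAEEL')
```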
-
parseMSA(filename, **kwargs)
Returns an MSA instance that stores the multiple sequence alignment and sequence labels parsed from the file filename, which may be in Stockholm, SELEX, CLUSTAL, PIR, or FASTA format and may be compressed. Uncompressed MSA files are parsed using C code at a fraction of the time it would take to parse compressed files in Python.
-
writeMSA(filename, msa, **kwargs)
Returns filename containing msa, an MSA or MSAFile instance, in the specified format, which can be SELEX, Stockholm, or FASTA. If compressed is True or filename ends with .gz, a compressed file will be written. MSA instances are written to uncompressed files using a C function. CLUSTAL and PIR format files can also be written, using Python functions.
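The compression rule above (compressed is True, or the filename ends with .gz) can be sketched in plain Python for the FASTA case. write_fasta_sketch is a hypothetical helper that takes (label, sequence) pairs rather than an MSA instance, and appending .gz when compressed is True but the suffix is missing is an assumption of this sketch.

```python
import gzip


def write_fasta_sketch(filename, records, compressed=False):
    """Write (label, sequence) pairs as FASTA and return the filename used.

    Hypothetical helper for illustration only; not part of this module.
    """
    # Assumption: compressed=True adds the '.gz' suffix when it is missing.
    if compressed and not filename.endswith('.gz'):
        filename += '.gz'
    # A '.gz' suffix selects gzip output; anything else is written as plain text.
    opener = gzip.open if filename.endswith('.gz') else open
    with opener(filename, 'wt') as stream:
        for label, sequence in records:
            stream.write('>{0}\n{1}\n'.format(label, sequence))
    return filename
```

Returning the final filename mirrors the description above, where the function returns the name of the file it wrote, which may differ from the name passed in once the suffix is added.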