- link in reference to buy cheap neurontin online
- relent online without prescription out of indiana
- read in relation to buy ultimate viagra pack (viagra + viagra soft tabs + viagra oral jelly) without prescription out of new mexico
- news relatively purchase amoxicillin online out of nevada
- cephalexin online
- sites relatively reglan of kansas
- find info in respect of synthroid
- tips concerning methotrexate no prescription
- article as regards tetracycline no prescription from canada
This is another blog post to highlight one of the neat tricks you’ll be able to do with Biopython 1.54 (which you can help test with the Biopython 1.54 beta release).
It is often useful to be able to extract a few records from a larger sequence file – for example, some sequences of interest from a full UniProt or GenBank dump. One obvious way to try to do this is to parse the file into an object representation (i.e. SeqRecord objects using Bio.SeqIO.parse(...)), filter to pick out the entries you want, and then write them back to disk (using Bio.SeqIO.write(...)). However, for complex file formats like GenBank this can be lossy (Bio.SeqIO does not support a 100% identical round trip), and Biopython don’t currently support writing out the SwissProt plain text format used by UniProt. So, that approach won’t work.
Biopython 1.52 introduced a new indexing function, Bio.SeqIO.index(...), which allows a large multi-sequence file to be treated like a Python dictionary – parsing requested records on request. This has been enhanced for Biopython 1.54 with a method get_raw(...) to extract the raw for a record as a string.
How is this useful? Well, take your large (UniProt) file, index it, then extract the records you want and write them to your output file unmodified:
from Bio import SeqIO
uniprot = SeqIO.index("uniprot_sprot.dat", "swiss")
handle = open("selected.dat", "w")
for acc in ["P33487", "P19801", "P13689", "Q8JZQ5", "Q9TRC7"]: