.. Created by Adam Cunnningham on Mon May 30 2016.

**Week 15: December 5 - 9**
===========================

Bioinformatics
--------------

Biopython is a set of Python modules containing tools for computational molecular biology. We install Biopython and learn to use one such tool to identify a set of :download:`unknown genetic sequences <./sequences.txt>`.

Class Activity: Bioinformatics
------------------------------

Use the tools available in Biopython to identify the origin of the six real nucleotide sequences presented in class, and determine which of the sequences presented is the fake one. The code we will use for this is shown below.

.. code-block:: python

    from Bio.Blast import NCBIWWW, NCBIXML


    def identify_sequence(seq_data):
        'Identify a genetic sequence'
        # Submit the sequence to NCBI BLAST over the network.
        # Second (database) argument can also be "nt"
        results = NCBIWWW.qblast("blastn", "nr", seq_data, hitlist_size=2)
        records = NCBIXML.parse(results)

        # Only report matches whose expect value is below this threshold
        E_VALUE_THRESH = 0.04
        for record in records:
            for alignment in record.alignments:
                for hsp in alignment.hsps:
                    if hsp.expect < E_VALUE_THRESH:
                        print('****Alignment****')
                        print('sequence:', alignment.title)
                        print('length:', alignment.length)
                        print('e value:', hsp.expect)

                        # Show short alignments in full; abbreviate long ones
                        nshow = 95
                        if len(hsp.query) <= nshow:
                            print(hsp.query)
                            print(hsp.match)
                            print(hsp.sbjct)
                        else:
                            print(hsp.query[0:nshow-10] + '...' + hsp.query[-10:])
                            print(hsp.match[0:nshow-10] + '...' + hsp.match[-10:])
                            print(hsp.sbjct[0:nshow-10] + '...' + hsp.sbjct[-10:])

Quiz 12: More Plotting
----------------------

- **loglog**
- **semilogx**
- **semilogy**
- **polar**
- **hist**
- **contour**
- **contourf**
- **colorbar**
- **axvline**
- **axhline**

:download:`Sample Quiz 12 <../Quizzes/Quiz12Sample.pdf>`

Report 10
---------

:doc:`language_models`
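
Returning to the class activity code: the display step in ``identify_sequence`` abbreviates any alignment string longer than ``nshow`` characters by keeping its start and its last 10 characters. The sketch below isolates that step so it can be tried without a network connection. The helper name ``shorten`` and its ``tail`` parameter are hypothetical and are not part of Biopython or the class code.

```python
def shorten(text, nshow=95, tail=10):
    """Abbreviate a long alignment string for display.

    Hypothetical helper mirroring the slicing used in identify_sequence:
    strings of at most nshow characters are returned unchanged; longer
    ones keep the first (nshow - tail) characters and the last tail
    characters, joined by '...'.
    """
    if len(text) <= nshow:
        return text
    return text[:nshow - tail] + '...' + text[-tail:]


if __name__ == '__main__':
    # A made-up 110-character string, used only to exercise the helper
    example = 'A' * 90 + 'C' * 20
    print(shorten(example))
```

With such a helper, the three ``print`` calls in each branch of the class code could be written once, for example ``print(shorten(hsp.query))``.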