There was a question on Biostars about how to make EC assignments based on sequence. I gave one of the answers, suggesting a few possible solutions. One of my solutions was to Blast against ExPASy ENZYME and base the annotations on the best hit from there. In this post I explain how to do that, and supply the necessary Python code.
[Note: PRIAM is another tool that assigns EC numbers to sequences based on the data in the ENZYME database. Instead of individual Blast hits, it uses profiles built from multiple sequence alignments of peptides known to catalyze a given reaction. It's also a lot quicker to use than the method outlined here, so it's probably worth checking out first.]
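To give a rough idea of what "base the annotations on the best hit" can look like, here's a minimal sketch (not the code from the full post). It assumes the Blast results are in the standard 12-column '-outfmt 6' tabular form, where the bit score is the last column, and it assumes you've already built a lookup from ENZYME sequence IDs to EC numbers. The names `best_hits`, `assign_ec`, and `subject_to_ec` are made up for this illustration.

```python
import csv


def best_hits(tabular_lines):
    """Return {query_id: (subject_id, bitscore)}, keeping the highest bit score per query.

    Assumes standard 12-column '-outfmt 6' rows:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    best = {}
    for row in csv.reader(tabular_lines, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def assign_ec(best, subject_to_ec):
    """Map each query to the EC number of its best hit (None if the hit is unknown).

    subject_to_ec is a hypothetical dict from subject sequence ID to EC number.
    """
    return {query: subject_to_ec.get(subject) for query, (subject, _) in best.items()}
```

A query with no hits simply won't appear in the result, so you'd want to handle those separately if you need a line for every input sequence.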
Tuesday, July 29, 2014
Sunday, July 27, 2014
Wikipedia (cat)
Here's a silly poem I wrote a few years ago. It comes with its own silly introduction. I think this is pretty much the apex of my poetical achievement, so if you're not a fan of it... well, it just goes downhill from here folks... Also, I'd love to see this turned into a music video, so if anyone with musical or artistic talent would like to collaborate on that, let me know and maybe we can make something really cool.
srj_chembiolib: a set of scripts for doing Bioinformaticky type stuff
This post is to announce srj_chembiolib, which is an uncreatively named set of scripts I've written to perform various bioinformatics (and hopefully eventually some cheminformatics) tasks that would otherwise be a pain in the rear to accomplish. Most of the scripts work both as libraries that can be imported into other scripts, and as standalone command line scripts. Some depend on external libraries such as BioPython. Documentation is mostly found in block comments at the top of the script files. Where possible, for example with the scripts that manipulate fasta files, I've tried to make it so that multiple programs can be chained together with pipes on the command line to accomplish more complex tasks. I hope they're useful for someone.
Here's an overview of some (but not all!) of the scripts in the package:
blast_xml_to_outfmt6.py: Converts Blast+ xml output to '-outfmt 6' style output (a tab separated form). Allows for some additional features not available in the standard outfmt 6, such as printing a line for query sequences that had no hits.
subset_fasta.py: Give it a fasta file and a list of strings, and it will give you a fasta file containing only those sequences whose names contain something from the list of strings as a substring.
extract_massbank.py: A class to read and store data from MassBank format text files, such as those used by MassBank, ReSpect, and Spektraris (my favorite database...).
extract_top_blast_hits.py: Reads a Blast xml file and outputs the names of the top hits for each query sequence. There's an option for making a file listing the queries whose top hit matched in the reverse direction, which is useful, for example with Blastx, to determine whether a nucleotide sequence represents the coding strand or the non-coding strand.
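To make the subset_fasta.py description a bit more concrete, here's a minimal sketch of the same idea: keep only the fasta records whose names contain one of the given strings. This is not the actual script from the package, just an illustration using plain Python with no BioPython; the function names `read_fasta` and `subset_fasta` are made up for this example.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of fasta-formatted lines."""
    name, seq = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new header: emit the previous record, if there was one.
            if name is not None:
                yield name, "".join(seq)
            name, seq = line[1:], []
        elif name is not None:
            seq.append(line.strip())
    if name is not None:
        yield name, "".join(seq)


def subset_fasta(lines, wanted):
    """Yield only the records whose name contains any string in `wanted` as a substring."""
    for name, seq in read_fasta(lines):
        if any(w in name for w in wanted):
            yield name, seq
```

Because both functions are generators that take any iterable of lines, they can read from `sys.stdin`, which is the kind of thing that makes chaining with pipes possible.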
Monday, July 14, 2014
Head hair as a sensory organ
It's no mystery that hair acts to amplify the sense of touch. Everyone is familiar with the feeling of a bug crawling on their arm, and everyone who has ever gone swimming with a beard knows how much more pleasurable (like a million tiny hands gently pulling at your chin) it is than swimming without a beard. There are also more mystical ideas about long hair granting a kind of sixth sense.
I recently gave myself a buzz cut after letting my hair grow for about 2 years. My hair is moderately curly, so when it's long it poofs out an inch or two from my scalp. From my experience, I don't think long hair grants any kind of mysterious powers (nor do I think it has any effect on personality or intelligence, although people with certain personality traits may be more likely to choose to grow their hair long), but I was surprised to find that scalp hair does seem to contribute to spatial awareness. My evidence for this is that in the 24 hours after I cut my hair, I bonked my head into the wall twice. (This is admittedly circumstantial and has a low sample size, but it would be a very hard thing to test in a controlled environment; you can't very well have a "double blind" haircut.) Once I was leaning over to put something in the trash can, and hit my head on the corner of the doorway that the can is next to. The other time I was in the shower and leaning around the too-hot stream of water to adjust the knobs. So I think scalp hair can function for people kind of like how whiskers function for cats.
I think that when I had poofy hair, I subconsciously used it to provide spatial information about the environment immediately around my head. Lacking that information, but not yet having had a chance to compensate by other means, my lack of hair temporarily increased my propensity to bonk into things.
