Thursday, August 17, 2017

targetP wrapper for large queries

As far as I know, TargetP is still (17 years after its original publication!) the best software for predicting subcellular localization for plant proteins, and also the location of truncation sites.

Without any modifications, TargetP works well with queries that are small by modern standards (fewer than 2,000 sequences at a time), but it becomes glitchy with larger queries, such as the 30k-100k genes that are typical of a plant transcriptome assembly.

To adapt TargetP for larger queries, I wrote a Python script that acts as a wrapper around TargetP, called targetp_all.py. The script works by separating the input into smaller subsets of sequences and running those, and combining the output.
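
The split-run-combine idea can be sketched in a few lines of Python. This is only an illustration of the approach, not the code from targetp_all.py: the names `batches` and `run_in_batches` are made up for this example, the batch size of 1,000 is an assumption chosen to stay under the 2,000-sequence limit mentioned above, and `run_batch` is a hypothetical stand-in for whatever function actually calls TargetP on one subset.

```python
from itertools import islice


def batches(records, batch_size=1000):
    """Yield successive lists of at most batch_size records."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def run_in_batches(records, run_batch, batch_size=1000):
    """Run run_batch on each batch and concatenate the result rows.

    run_batch is a hypothetical callable standing in for one TargetP run:
    it takes a list of records and returns a list of output rows.
    """
    rows = []
    for batch in batches(records, batch_size):
        rows.extend(run_batch(batch))
    return rows


if __name__ == "__main__":
    # Fake sequence IDs and a fake predictor, used only to show the flow.
    fake_ids = ["seq%d" % i for i in range(5)]

    def fake_run(batch):
        return [(seq_id, "unknown") for seq_id in batch]

    # Print the combined results as tab-separated lines.
    for seq_id, location in run_in_batches(fake_ids, fake_run, batch_size=2):
        print(seq_id, location, sep="\t")
```

Because `batches` only pulls one batch at a time from the iterator, the same pattern should also work when the records come from a lazy FASTA parser rather than a list.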

The interface is the same as that of the original program, with a few additional options. The output is simplified to a tab-separated format.

It would also be nice to parallelize the execution of TargetP so that it runs on multiple cores at once, but I haven't attempted this yet. I suspect there will be complications involving conflicting temporary files, which may require careful modification of the original source code.

Source code follows. BioPython is a dependency.

Saturday, April 29, 2017

Mira4 assembly of 454 reads from SRA

I want to make an assembly of the Annona squamosa fruit transcriptome data from this paper (http://dx.doi.org/10.1186/s12864-015-1248-3). The paper gives a link to a web resource (http://www.annonatranscriptome.nabi.res.in/), but the resource now appears to be defunct, so to get contigs, I will have to assemble the reads myself. The reads are from two different cultivars of Annona squamosa, so I'm going to assemble each cultivar separately first, and then if that works, I'll try a combined assembly.

MIRA is a nice, free software package that can assemble 454 data. I've had success with it before, so that's what I'll use for this project too.


Monday, March 20, 2017

Tips for Methods Development and Optimization in Biology



I haven't posted anything in quite a while. Mostly that's because I haven't written anything that I thought would be of general interest. I once again have a young lab assistant to preach at, so I might as well preach at the world too. Here is what I've written for her about methods development in the biology laboratory.

Thursday, November 10, 2016

The Loss of the Creature, Walker Percy, detailed commentary

I think the essay "The Loss of the Creature" by Walker Percy (from the book The Message in the Bottle) is well worth reading and thinking about. In this post I offer a detailed, paragraph-by-paragraph commentary on the essay. This post isn't really meant to be read from beginning to end (if you try that, it may get repetitive). It's meant more as a set of detailed footnotes for people who find the essay confusing. The way to follow this post is to print off Percy's essay, then number the paragraphs. By my numbering, there are 38 paragraphs in section I, and 24 paragraphs in section II (starting with #39, ending with #62). When reading, if you get stuck on a paragraph, look it up here, and maybe my comments will make it clearer (hopefully they won't make it even more confusing). For a more personalized discussion, please leave a comment. I'm well aware of the irony of writing an analysis of this essay as though I expect people to experience the essay through my interpretation (exactly opposite to how the essay encourages us to experience the world). I would encourage you not to think of my commentary as authoritative, but maybe just as a spark for your own thought.

(This is a work in progress. I got about halfway through and then set it aside. It's been sitting unfinished for long enough that I feel I might as well just publish what I have. I hope to finish it eventually. In the meantime, if anybody else would like to contribute commentary for the remaining paragraphs, that would be nice.)

Wednesday, October 26, 2016

My dissertation is now available for download from ProQuest

My PhD dissertation is now available for free online at http://pqdtopen.proquest.com/pubnum/10164019.html

I'm also making my laboratory notebooks from grad school available. They don't cover everything I did, they're pretty messy, and I'm not sure they will be useful or interesting to anyone, but here they are.
book1
book2
book3


Here is the abstract to my dissertation:
Plant natural products are useful for many different applications, including medicines, flavors and fragrances, and industrial uses. Two important aspects of plant natural products research are the identification of compounds in their source plants, and the characterization of the processes involved in their biosynthesis. To aid in the identification of plant natural products, we developed the Spektraris family of databases. These databases include high-performance liquid chromatography mass spectrometry data, and 13C and 1H nuclear magnetic resonance data, which are searchable through an online interface. The utility of Spektraris was validated by using it to identify compounds in plant extracts and as part of a workflow to elucidate the structure of a previously undescribed compound.
Mints have a long history of use as model systems for studying the processes of terpene natural products biosynthesis in specialized plant tissues. The mint family (Lamiaceae) synthesizes and stores volatile terpenes in glandular trichomes. Using a comparative transcriptomic approach, we identified differences in gene expression of monoterpene biosynthetic genes among mint species with different oil profiles. We also assembled the genome of a mint species, Mentha longifolia. The genome assembly will be valuable for future mint research.
To further investigate biosynthetic processes in mint, I developed a detailed mathematical model of the metabolism of peppermint glandular trichomes. The model incorporates multiple sources of data, including transcriptome data, metabolite data, enzymatic data from the peppermint literature, and previously developed models of plant metabolism. The creation of a new metabolic modeling software package, called YASMEnv, facilitated construction of the model. Model-based simulated reaction knockouts using flux balance analysis revealed that fermentation may be important for ATP regeneration in secretory phase glandular trichomes. Follow-up experiments confirmed high levels of alcohol dehydrogenase activity in secretory phase isolated trichomes. Simulations also supported an essential role for ferredoxin and ferredoxin-NADP reductase. Transcriptome analysis revealed the presence of an isoform of ferredoxin in trichomes distinct from the one expressed in root. The presence of a distinct ferredoxin isoform in trichomes supports the hypothesis that selection pressure for efficient natural products biosynthesis may also act on the enzymes of primary metabolism.

Thursday, October 13, 2016

rendering zdock server results using python and UCSF Chimera

Problem:
I wanted to learn where the protein-protein interaction sites are on a particular protein (the receptor). I had PDB files for that protein, and for three proteins that it is known to bind to (the ligands). The receptor has specific amino acids of interest that vary among species. For every combination of receptor and ligand, I wanted to predict where the proteins interact, and then generate an image with the variable amino acids highlighted.
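
As a rough sketch of the bookkeeping only, the receptor and ligand pairings can be enumerated with `itertools.product`. The file names and the `docking_jobs` helper below are hypothetical placeholders, and nothing here calls the ZDOCK server or Chimera; it just lists which predictions and images would need to be made.

```python
from itertools import product
from pathlib import Path


def docking_jobs(receptors, ligands):
    """Return one (receptor, ligand, image_name) tuple per pairing."""
    jobs = []
    for receptor, ligand in product(receptors, ligands):
        # Name the output image after the two input structures.
        image = "%s_%s.png" % (Path(receptor).stem, Path(ligand).stem)
        jobs.append((receptor, ligand, image))
    return jobs


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    receptors = ["receptor_speciesA.pdb", "receptor_speciesB.pdb"]
    ligands = ["ligand1.pdb", "ligand2.pdb", "ligand3.pdb"]
    for job in docking_jobs(receptors, ligands):
        print(*job, sep="\t")
```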

Gratefulness and Compassion meditation

Meditation is a way to slow down and appreciate life. For me, it is a way to relax, and prepare for the day, and to fight back against negative thoughts and attitudes that sometimes accumulate in my mind.

There's no one right way to meditate. You can try to empty your mind. You can focus intently on your breath, or on some mantra, or on different parts of your body. You can listen carefully and try to concentrate on perceiving your surroundings. The only rules are to be calm and to be positive. Meditation can sometimes veer into negative thinking or rumination: thinking about how other people owe you something, or how you've been mistreated or disrespected, or how you're somehow in an unfixable situation (you're not). Take care that your meditation does not become rumination.

I like to try out different kinds of meditation. Some I read about, some I invent on my own. I recently explained how I think that gratefulness and compassion are the two most important emotions. The rest of this post explains Gratefulness and Compassion Meditation, which is a way to cultivate those two emotions through practice.