Saturday, April 29, 2017

Mira4 assembly of 454 reads from SRA

I want to make an assembly of the Annona squamosa fruit transcriptome data from this paper. The paper gives a link to a web resource, but the resource appears to now be defunct, so to get contigs, I will have to assemble the reads myself. The reads are from two different cultivars of Annona squamosa, so I'm going to assemble each cultivar separately first, and then if that works, I'll try a combined assembly.

MIRA is a nice, free software package that can assemble 454 data. I've had success with it before, so that's what I'll use for this project too.

Monday, March 20, 2017

Tips for Methods Development and Optimization in Biology

I haven't posted anything in quite a while. Mostly that's because I haven't written anything that I thought would be of general interest. I once again have a young lab assistant to preach at, so I might as well preach at the world too. Here is what I've written for her about methods development in the biology laboratory.

Thursday, November 10, 2016

The Loss of the Creature, Walker Percy, detailed commentary

I think the essay "The Loss of the Creature" by Walker Percy (from the book The Message in the Bottle) is well worth reading and thinking about. In this post I offer a detailed, paragraph-by-paragraph commentary on the essay. This post isn't really meant to be read from beginning to end (if you try that, it may get repetitive). It's meant more as a set of detailed footnotes for people who find the essay confusing. The way to follow this post is to print off Percy's essay, then number the paragraphs. By my numbering, there are 38 paragraphs in section I, and 24 paragraphs in section II (starting with #39, ending with #62). When reading, if you get stuck on a paragraph, look it up here, and maybe my comments will make it clearer (hopefully they won't make it even more confusing). For a more personalized discussion, please leave a comment. I'm well aware of the irony of writing an analysis of this essay as though I expect people to experience the essay through my interpretation (exactly opposite to how the essay encourages us to experience the world). I would encourage you not to think of my commentary as authoritative, but maybe just as a spark for your own thought.

(This is a work in progress. I got about halfway through and then set it aside. It's been sitting unfinished for long enough that I feel I might as well just publish what I have. I hope to finish it eventually. In the meantime, if anybody else would like to contribute commentary for the remaining paragraphs, that would be nice.)

Wednesday, October 26, 2016

My dissertation is now available for download from ProQuest

My PhD dissertation is now available for free online at

I'm also making my laboratory notebooks from grad school available. They don't cover everything I did, they're pretty messy, and I'm not sure they will be useful or interesting to anyone, but here they are.

Here is the abstract to my dissertation:
Plant natural products are useful for many different applications, including medicines, flavors and fragrances, and industrial uses. Two important aspects of plant natural products research are the identification of compounds in their source plants, and the characterization of the processes involved in their biosynthesis. To aid in the identification of plant natural products, we developed the Spektraris family of databases. These databases include high-performance liquid chromatography mass spectrometry data, and 13C and 1H nuclear magnetic resonance data, which are searchable through an online interface. The utility of Spektraris was validated by using it to identify compounds in plant extracts and as part of a workflow to elucidate the structure of a previously undescribed compound.
Mints have a long history of use as model systems for studying the processes of terpene natural products biosynthesis in specialized plant tissues. The mint family (Lamiaceae) synthesizes and stores volatile terpenes in glandular trichomes. Using a comparative transcriptomic approach, we identified differences in gene expression of monoterpene biosynthetic genes among mint species with different oil profiles. We also assembled the genome of a mint species, Mentha longifolia. The genome assembly will be valuable for future mint research.
To further investigate biosynthetic processes in mint, I developed a detailed mathematical model of the metabolism of peppermint glandular trichomes. The model incorporates multiple sources of data, including transcriptome data, metabolite data, enzymatic data from the peppermint literature, and previously developed models of plant metabolism. The creation of a new metabolic modeling software package, called YASMEnv, facilitated construction of the model. Model-based simulated reaction knockouts using flux balance analysis revealed that fermentation may be important for ATP regeneration in secretory phase glandular trichomes. Follow-up experiments confirmed high levels of alcohol dehydrogenase activity in secretory phase isolated trichomes. Simulations also supported an essential role for ferredoxin and ferredoxin-NADP reductase. Transcriptome analysis revealed the presence of an isoform of ferredoxin in trichomes distinct from the one expressed in root. The presence of a distinct ferredoxin isoform in trichomes supports the hypothesis that selection pressure for efficient natural products biosynthesis may also act on the enzymes of primary metabolism.

Thursday, October 13, 2016

rendering zdock server results using python and UCSF Chimera

I was interested in learning where the protein-protein interaction sites were on a particular protein (the receptor). I had pdb files for that protein, and for three proteins that it is known to bind to (the ligands). There are specific amino acids of interest on the receptor protein, where there is variability among different species. For every combination of receptor and ligand, I wanted to predict where the proteins interact, and then generate an image where the variable amino acids are highlighted.
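As a rough sketch of the bookkeeping involved (not the actual script from the full post), the receptor and ligand combinations can be enumerated in Python before any docking or rendering happens. The file names, the residue numbers, and the plan_jobs helper below are all made up for illustration.

```python
from itertools import product
from pathlib import Path

# Hypothetical inputs, for illustration only: one receptor structure,
# three ligand structures, and made-up variable residue numbers.
receptors = ["receptor.pdb"]
ligands = ["ligand_a.pdb", "ligand_b.pdb", "ligand_c.pdb"]
variable_residues = [45, 88, 132]


def plan_jobs(receptors, ligands, residues):
    """Return one job description per receptor/ligand combination."""
    jobs = []
    for rec, lig in product(receptors, ligands):
        # Name the output image after the two input structures.
        stem = f"{Path(rec).stem}_{Path(lig).stem}"
        jobs.append({
            "receptor": rec,
            "ligand": lig,
            "highlight": list(residues),  # residues to color in the image
            "image": f"{stem}.png",
        })
    return jobs


jobs = plan_jobs(receptors, ligands, variable_residues)
for job in jobs:
    print(job["receptor"], job["ligand"], "->", job["image"])
```

Each job would then be handed to the docking prediction and to the rendering step in turn.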

Gratefulness and Compassion meditation

Meditation is a way to slow down and appreciate life. For me, it is a way to relax, and prepare for the day, and to fight back against negative thoughts and attitudes that sometimes accumulate in my mind.

There's no one right way to meditate. You can try to empty your mind. You can focus intently on your breath, or on some mantra, or on different parts of your body. You can listen carefully and try to concentrate on perceiving your surroundings. The only rules are to be calm and to be positive. Meditation can sometimes veer into negative thinking, or rumination: thinking about how other people owe you something, or how you've been mistreated or disrespected, or how you're somehow in an unfixable situation (you're not). Take care that your meditation does not become rumination.

I like to try out different kinds of meditation. Some I read about, some I invent on my own. I recently explained how I think that gratefulness and compassion are the two most important emotions. The rest of this post explains Gratefulness and Compassion Meditation, which is a way to cultivate those two emotions through practice.

Monday, July 11, 2016

Figuring out which adapters to trim in Illumina data

Oftentimes it's difficult to know what adapter sequences should be trimmed from Illumina data. This can occur if you download public data, for example from SRA, or if you send samples for sequencing to a company that doesn't communicate with you very well (not that I have any experience with that...).

Previously it's been a bit of a struggle for me to figure out which adapters to trim when processing Illumina high-throughput sequencing data. With a little bit of time, and some thought about how Illumina sequencing works, adapters can be identified and removed even if we don't know beforehand what the sequences are.

Read on to see how I figured out the adapter sequences in my most recent RNAseq analysis.
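One generic way to spot an unknown adapter, which is not necessarily the approach taken in the full post, is to count short subsequences (k-mers) across the reads. When the sequenced fragment is shorter than the read length, the read runs into the adapter, so adapter k-mers turn up in many reads while k-mers from the inserts mostly do not. Here is a minimal Python sketch of that idea on toy data. The top_kmers helper is hypothetical, the reads are made up, and the adapter is the 13-base prefix commonly reported for Illumina TruSeq adapters, used here only as toy data.

```python
from collections import Counter


def top_kmers(reads, k=12, n=5):
    """Count every k-mer across the reads and return the n most common."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts.most_common(n)


# Toy reads (made up for illustration): inserts of different lengths,
# each followed by the start of the same adapter.
ADAPTER = "AGATCGGAAGAGC"
reads = [
    "ACGTTGCATGCA" + ADAPTER,
    "TTGACCGTA" + ADAPTER,
    "GGCATCGATCGTAAC" + ADAPTER,
    "CATG" + ADAPTER,
]

# The most frequent k-mer should come from the shared adapter.
best_kmer, best_count = top_kmers(reads, k=12, n=1)[0]
print(best_kmer, best_count)
```

On real data you would run this on a sample of reads rather than the whole file, and highly expressed transcripts or low-complexity sequence can also be overrepresented, so the top hits still need to be checked by eye.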