Thursday, December 10, 2015

Extracting gene ontology categories using python pandas and rdflib

I've got a table I exported from Blast2GO, which associates transcript IDs with GO term IDs (an "annot" file in Blast2GO terminology). I want to separate the GO IDs by category: biological_process, cellular_component, or molecular_function. (Note: in this example I use rdflib. A faster way to do the same thing would be to use the more specialized goatools, which is probably the best GO library for Python. I used rdflib because I wanted to practice SPARQL.)

Thursday, November 12, 2015

Circle the aardvark in the back of the pickup truck

I've been working on this game for an embarrassingly long time. It's pretty much done now.
check it out at:

read on for more details and screen shots

Tuesday, September 8, 2015

Recipe: Hearty Chickpea Curry

At the Moscow Farmers Market there's a vendor who sells a delicious chickpea curry. I wanted to have access to chickpea curry on days other than Saturday, so I decided to try to make my own. Here's what I came up with. I think the recipe is fairly robust, and you can make various changes and still end up with something delicious. I'm not in the habit of measuring my ingredients, so I don't have precise measurements here.

Wednesday, July 22, 2015

resizing (scaling) svg images using rsvg-convert

I had some svg files that I wanted to display on a Drupal website, but they were larger than I wanted for the page. For some reason scaling in the html tags wasn't working, so I had to change the source images.
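As a rough sketch of what changing the source images can look like, here's a small Python wrapper around rsvg-convert. The file names and target width are hypothetical, and the particular flags (`--width`, `--keep-aspect-ratio`, `--format svg`, `--output`) are my assumption about a reasonable invocation, not necessarily the one used in the full post.

```python
import shutil
import subprocess


def rsvg_scale_command(src, dest, width):
    """Build an rsvg-convert call that writes a scaled SVG copy of src."""
    return [
        "rsvg-convert",
        "--width", str(width),       # target width in pixels
        "--keep-aspect-ratio",       # scale the height to match
        "--format", "svg",           # write SVG rather than the default PNG
        "--output", dest,
        src,
    ]


def scale_svg(src, dest, width):
    """Run rsvg-convert; raises if the tool is missing or the call fails."""
    if shutil.which("rsvg-convert") is None:
        raise RuntimeError("rsvg-convert is not installed or not on PATH")
    subprocess.run(rsvg_scale_command(src, dest, width), check=True)


# Hypothetical file names, shown only to illustrate the resulting command.
print(" ".join(rsvg_scale_command("figure.svg", "figure_small.svg", 300)))
```

Calling `scale_svg("figure.svg", "figure_small.svg", 300)` would then write the resized copy next to the original.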

Sunday, June 28, 2015

Lupin milk and lupin yoghurt

In the previous post I talked about how to make lupin ice cream. You can use the same protein isolation technique (steps 1 to 10 of the previous post) as the basis for lupin milk and yoghurt. The lupin yoghurt has a softer taste (less bitter) than the milk or ice cream. The milk is a bit bitter, but still definitely drinkable.

Sunday, June 21, 2015

Lupin Ice Cream: making vegan ice cream from scratch!

As part of my ongoing long-term project to research ways to cook with lupins, I've developed a process for producing home-made lupin ice cream. The consistency of the ice cream turned out great, and the flavor is acceptable, but unfortunately there is a bitter aftertaste that many people will probably find unpleasant. The only flavoring I added was sugar and vanilla extract; maybe some kind of stronger flavoring would be better at masking the bitter taste. In this post I show you how to make lupin ice cream for yourself. I made this recipe based on two patents: US20080089990 and US20070154611. I recommend reading those patents if you are interested in learning more about the science behind lupin ice cream.

Thursday, June 18, 2015

Radical tolerance: A newish take on an old old morality

Maybe you've heard of radical honesty, or radical kindness (I think my favorite quote from Scaughdt Iam is "many of the platitudes that most of us still take for granted actually function exactly as they are written - with no exceptions whatsoever!"). I tend to favor radical kindness (as explained by Scaughdt Iam and Peace Pilgrim). And while I think dishonesty in personal relationships is almost always bad, I don't think silence is necessarily dishonest, nor do I think people have an obligation to volunteer honest opinions unsolicited (that is: to say everything on their mind as soon as it comes into their mind). My take on honesty ("radical" or otherwise), is that everyone should strive to be the sort of person whose thoughts would not surprise his/her friends, nor lower their opinion of him/her. In this spirit, I offer you the concept of "Radical Tolerance" as a step towards honesty and radical kindness. If a person thinks in a way that is radically tolerant, it will be easier for him/her to speak honestly (what could a tolerant person possibly have to hide?), and to act kindly.

Monday, April 13, 2015

MGR CMS: a simple Django-based content management system for genomic data

MGR-CMS (Mint Genomics Resource) is a content management system for genomic and transcriptomic data. The code for MGR-CMS can be found Here. As implied by the name, it was originally built to house data for mint plants, but it can be easily adapted for data from other organisms. I made MGR because I couldn't figure out how to get Tripal to behave how I wanted it to. I later found out that was my fault, not Tripal's fault, so if you want a more or less feature complete genomics content management system, look there. If you want a simple system written in Python (Django), or you have the misfortune of having a Windows server (as I do), then MGR might be just the thing for you (but it still might be worth your while to shoot the Tripal people an e-mail, just to make sure it really can't handle your situation).

Monday, April 6, 2015

evaluating boolean gene-reaction associations and assigning expression levels to reactions in stoichiometric models

Problem: You've got a stoichiometric model and an RNAseq dataset, and you want to use the transcriptome data to determine which reactions in the model are likely to be active, and at what level.

Solution: Use pandas to read the model and the RNAseq data. Use a pyparsing grammar to parse the boolean gene-reaction association expressions. Write a function to calculate a reaction expression level based on the parsed gene-reaction association expression and the RNAseq data. Write a new model file with a column for gene expression.
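To make the evaluation step concrete, here's a minimal sketch under some loud assumptions. It swaps the pyparsing grammar for a small hand-written recursive-descent parser so that it runs with only the standard library. It also assumes a common scoring convention that the text above does not specify: "and" takes the minimum of its operands (all subunits are needed) and "or" takes the maximum (any isozyme will do). Genes missing from the expression data are treated as 0.0. The gene names and values are made up.

```python
import re

# Tokens are parentheses or runs of non-space, non-parenthesis characters.
TOKEN = re.compile(r"\(|\)|[^\s()]+")


def reaction_level(rule, expression):
    """Evaluate a boolean gene-reaction rule against gene expression values.

    Assumed convention: 'and' -> min, 'or' -> max, missing gene -> 0.0.
    'and' binds more tightly than 'or'.
    """
    tokens = TOKEN.findall(rule)
    pos = 0

    def parse_or():
        nonlocal pos
        value = parse_and()
        while pos < len(tokens) and tokens[pos].lower() == "or":
            pos += 1
            value = max(value, parse_and())
        return value

    def parse_and():
        nonlocal pos
        value = parse_atom()
        while pos < len(tokens) and tokens[pos].lower() == "and":
            pos += 1
            value = min(value, parse_atom())
        return value

    def parse_atom():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of rule")
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return value
        if token == ")" or token.lower() in ("and", "or"):
            raise ValueError(f"unexpected token: {token}")
        return expression.get(token, 0.0)

    level = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return level


# Hypothetical expression values for three genes.
expression = {"g1": 10.0, "g2": 4.0, "g3": 7.0}
print(reaction_level("(g1 and g2) or g3", expression))  # 7.0
```

Applied row by row to the gene-association column of the model table, a function like this would produce the new expression-level column.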

Read on for details....

Saturday, April 4, 2015

Easter Memories: a poem

Easter has always been one of my favorite holidays. It's an opportunity to see my extended family. The weather is usually nice. The trees are becoming green, and the early-season flowers are blooming. I also like it because I think it is less touched by commercialism than other holidays. People still give each other Easter gifts, but it's not taken to the ridiculous extremes of Christmas. And the gifts are mostly chocolate, which is much more welcome than yet another pair of socks, or some game I'll play twice and set on the shelf for the next 100 years. There's no Easter equivalent of Black Friday. People get together on Easter to enjoy each other and to enjoy nature, and that's really how every holiday should be.

As a child, the Easter morning church service was probably my favorite service of the year. The sanctuary, the clothes, the congregation, and the music were always so bright and cheerful. Even as I've gotten older and more cynical about organized religion, and no longer attend an Easter morning church service, I still feel a strong sense of nostalgia about Easter. As organized religion is becoming less popular with each generation in the US, I think my sentiment is one that increasing numbers of young people can probably relate to.

Here's a poem I wrote to try to capture that sentiment.

Wednesday, April 1, 2015

Personality tests: are they baloney?

I've been thinking recently about personality tests (specifically the MBTI): the one I consistently test as is INFP, (wiki) (maybe I should quit science and become a social worker...). Reading the descriptions of that type, I have to admit that it sounds awfully familiar. When I read the description of the exact opposite type: ESTJ, it was pretty clear that I'm not anything like that. One letter switches from INFP tend to look fairly familiar to me also. So there certainly seems to be some validity to personality tests (this has also been shown in the scientific literature), they measure something, even if it's not always clear exactly what. However, I think people need to be very careful with interpretation and application of the results of personality tests, and not take them too seriously. You are precisely who you are, and not precisely who some silly test says you are!

parameter fitting and optimisation with PySCeS: hack for allowing integration at specific time points

PySCeS is a really awesome modeling environment for kinetic models of biochemical networks. Scripting it is a heck of a lot easier to pick up and learn than the COPASI python bindings.

Clustal Omega not good for nucleotide alignment

Clustal Omega claims that "the quality of alignments is superior to previous versions, as measured by a range of popular benchmarks." But it always seems to do worse than ClustalW and Kalign when I try to align nucleotides. I suspect I'm just using the wrong settings, but even after I play with the settings a bit it still gives the same results. So I'm kind of at a loss. I think it may be that it's optimized for alignments of huge numbers of long sequences, and ends up not performing as well on small scale alignments. (If you know what's going on here, please let me know in the comments.)

Annotating a de-Novo transcriptome assembly

Here's how I'm annotating a de-Novo transcriptome assembly:

This post is kind of halfway between a how-to and a lab notebook entry.

I'm on a Linux box with 12 cores (24 if you count hyper-threading), 48 GB of RAM, and Xubuntu Desktop 13.10.

I'm basically just using the Trinity pipeline. I think it's turning into a really nice set of tools both for assembly and for annotation (because Trinotate is so new and apparently under heavy active development, the information in this guide may not be current for very long). I'm also using Blast2GO, which has been around longer and is very widely used.

I'm trying to annotate 5 transcriptomes simultaneously, so as I'm going, I'm writing little shell scripts to apply each step to all of the transcriptomes at once (sadly, I had a hard-drive crash and lost those scripts... oh well). At some point I might consolidate them into a more fully automated pipeline, but I'm not going to worry about that for now.

Monday, March 23, 2015

Prejudice against women's leg hair and why it has everything to do with everything

I think it is fascinating that most people (at least in the present-day United States) think that women's leg hair is gross. I think it's even more fascinating that pretty much no one is interested in talking about that prejudice, or even admitting that it's something worth talking about. As explained in detail in the following exchange, I think the leg hair issue is a very useful issue to talk about because it is a fairly benign (funny even) topic that people nevertheless have strong opinions about and are reluctant to discuss, so it is useful as a kind of model for self-examination, and a test case for fighting prejudice. Below is an interesting discussion of the topic I had on Facebook a few weeks ago. I edited it slightly to remove personally-identifiable information. My favorite part is where we come to an agreement that the only way to really solve the problem of leg-hair bias is to destroy the whole world. I've also been collecting web-links and academic papers related to the topic, which I will annotate and post later sometime.

Thursday, March 19, 2015

DNA and RNA Nucleotide Structure and Mnemonics Video

I've been planning to do a video about DNA and RNA nucleotide mnemonics for quite a while. I finally got around to making that video. I think I have a unique take on the mnemonics at least that may be helpful for some people. Hopefully this will make up for all the other goofy/useless videos I've been posting... I need to get something for my mic that makes it so it doesn't amplify my breathing so bad. Also, I need to say "umm" less, and instead of making half-under-my-breath asides, I should just say those things loudly like everything else. Other than that, I think it's a good video.

In summary:
The nucleotide mnemonic I use is:
GACT. I like it because it's goofy sounding, so it's easy to remember. The purines are before the pyrimidines (just like purine comes first in the dictionary). You can rearrange it like so:
to remember that G binds to C and A binds to T. GC bonds are stronger than AT bonds, so that pair comes first.

For RNA, it becomes GACU (pronounced: "Gackoo"). Everything about GACT is also true about GACU. If you can remember both, you will easily be able to remember that U replaces T in RNA.
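If you think in code, the mnemonic can be written out as a tiny Python sketch: the first half of GACT holds the purines, the second half the pyrimidines, and lining the halves up gives the base pairs. The function names are just illustrative.

```python
MNEMONIC = "GACT"

# First two letters are the purines, last two are the pyrimidines.
purines, pyrimidines = MNEMONIC[:2], MNEMONIC[2:]

# Lining up the halves pairs G with C and A with T.
pairs = dict(zip(purines, pyrimidines))
# Base pairing works in both directions, so add C -> G and T -> A as well.
pairs.update({pyrimidine: purine for purine, pyrimidine in pairs.items()})


def complement(dna):
    """Return the complementary base for each position of a DNA sequence."""
    return "".join(pairs[base] for base in dna)


def to_rna(dna):
    """Swap T for U, which turns GACT into GACU."""
    return dna.replace("T", "U")


print(complement("GATTACA"))  # CTAATGT
print(to_rna(MNEMONIC))       # GACU
```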

Wednesday, February 18, 2015

I am who I am

The breadth of topics covered on this blog may strike some people as weird. Just in the past five days the posts have included the personal (poetry), the professional (my lab notebook and code from work for an article published in a peer-reviewed scientific journal), and the silly (a video of myself shaving with a trowel). My justification for this is that the blog represents my interests, and I basically use it as a forum to make available anything I do that I think other people might find useful or entertaining. I don't magically turn into a different person when I walk into the laboratory, or when I come back home. I'm me all the time and, like Walt Whitman, I contain multitudes, and so does this blog, and that's just how I like it. And if you're only interested in one particular topic, then you can use the keyword labels to filter out the things you couldn't care less about..... yep.

My other shaver is a double-edged razor blade duct-taped to a trowel

Even though I only shave a few times a year, I consider myself somewhat of a shaving enthusiast. I've tried straight razors, and double edged razors, and electric razors, and whatever the heck you call the things Gillette is selling these days.

What I've learned is that you don't really need anything special to get a decent shave. All you really need is a really sharp blade mounted firmly on a handle. To prove that point, I decided to cut my big beard off with a razor blade duct-taped to a trowel!  Read on for details (and a 7.5 minute video of the act itself).

Monday, February 16, 2015

Comparing data from open natural products chemical spectral databases

I recently wrote a review article called "Open-access metabolomics databases for natural product research: present capabilities and future potential", for the journal Frontiers in Bioengineering and Biotechnology, section Bioinformatics and Computational Biology (link).

Part of the review involved evaluating and comparing the contents of these databases. This post consists of the notes I took when generating those comparisons. The code I used is available from my Bitbucket account (I apologize for the poor organization and near-lack of documentation). Some of the scripts depend on srj_chembiolib and/or Pandas. InChI normalization relies on ChemAxon molconvert.

Factors to compare: how many unique compounds are present in the database, how many structures have the various kinds of associated analytical spectra, what compound classes are represented.
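As a rough illustration of the first two factors (not the actual scripts from my Bitbucket account), here's a standard-library sketch that counts unique compounds and tallies how many of them have each kind of spectrum. It assumes each database record carries an already-normalized InChI string plus a list of spectrum types; the record layout and the toy records are hypothetical.

```python
from collections import Counter

# Toy records: (assumed) one entry per database row, InChI already normalized.
records = [
    {"inchi": "InChI=1S/H2O/h1H2", "spectra": ["MS"]},
    {"inchi": "InChI=1S/H2O/h1H2", "spectra": ["NMR"]},  # duplicate compound
    {"inchi": "InChI=1S/CH4/h1H4", "spectra": ["MS", "NMR"]},
    {"inchi": "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3", "spectra": []},
]


def summarize(records):
    """Return (number of unique compounds, compounds per spectrum type)."""
    spectra_by_compound = {}
    for record in records:
        # Merge duplicate rows for the same compound into one set of spectra.
        kinds = spectra_by_compound.setdefault(record["inchi"], set())
        kinds.update(record["spectra"])
    per_spectrum = Counter(
        kind for kinds in spectra_by_compound.values() for kind in kinds
    )
    return len(spectra_by_compound), dict(per_spectrum)


n_unique, per_spectrum = summarize(records)
print(n_unique, per_spectrum)
```

Running the same summary on each database, and intersecting the sets of InChI strings, gives a simple basis for comparing their coverage.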

Friday, February 13, 2015

Valentine's Day poems

It's Valentine's Day! To celebrate, here are some "love poems" that I've written over the years. I'll start out with a serious one before quickly devolving into the absurd.

Thursday, January 22, 2015

Match The Mustelidae: a weasel themed memory game

In keeping with the recent trend of posting projects from my misspent youth, I present to you "Match the Mustelidae", which was the second "complete" game I ever made, right after Fasmo (which I will post later).

Saturday, January 17, 2015

"Feeling Down" music video

This music video was my winter break project in 2010. The song was written and recorded by my brother Kyle. The animation was done with a Wacom pad, Gimp, and Blender. The animation style was inspired by the music video for the Moondoggies' Empress of the North (although I obviously don't have half the talent of Drew Christie).

Thursday, January 15, 2015

On the pronunciation of "dehydrogenase", "hydrogenated", and "hydrogenation"

The words "dehydrogenase", "hydrogenated", and "hydrogenation" are typically (as far as I've noticed) pronounced with "drog" as the most emphasized syllable. That's how I used to pronounce them too. However, when TAing a biochemistry class taught by a rather ancient professor (Ron Brosemer, who has been teaching at WSU since 1963) I noticed that he pronounces these words with the emphasis on the syllable "hy". So the "hydrogen" part of the word sounds exactly like it would if it were pronounced on its own.

At first that sounded weird to me, but now I think it actually makes a lot more sense than the typical pronunciation. It may be that Dr. Brosemer pronounces the words that way intentionally for pedagogical purposes. But I suspect that when he learned these words (presumably in the 1950's), his pronunciation was actually the typical way to pronounce them. There is no element pronounced "hydROGen", so it seems unlikely that the chemists who coined these derivative terms would have said it that way in their new terms. I think it is more likely that the modern common pronunciation arose in people (like me) who learned the terms before understanding their chemical significance.

Using Dr. Brosemer's pronunciation with these words brings to mind their actual chemical meaning: a dehydrogenase is an enzyme that removes hydrogens from one of its substrates, and hydrogenated oils are oils which have had hydrogens added.

I've adopted Dr. Brosemer's pronunciation, and I think you should too.