Thursday, September 16, 2010

Linking a remote command line and the clipboard

When I am logged in to a server via ssh in iTerm, I sometimes want to copy the output of a remote command to my local clipboard. The commands pbcopy and pbpaste do this on the local machine, but don't work remotely. However, it's possible to ssh back to your machine (if no firewalls interfere) and invoke these commands there:
alias pbcopy "ssh `echo $SSH_CLIENT | cut -f 1 -d ' '` pbcopy"
alias pbpaste "ssh `echo $SSH_CLIENT | cut -f 1 -d ' '` pbpaste"
Of course this is only convenient if you have ssh key authentication set up.
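To spell out what the backtick expression in the aliases does, here is a minimal Python sketch. It assumes the usual OpenSSH layout of SSH_CLIENT ("client-ip client-port server-port"); the function names client_host and clipboard_command are made up for illustration, and the address is a documentation placeholder.

```python
def client_host(ssh_client):
    """Return the first field of an SSH_CLIENT value, i.e. the client's IP.

    Mirrors `echo $SSH_CLIENT | cut -f 1 -d ' '` from the aliases.
    """
    fields = ssh_client.split()
    if not fields:
        raise ValueError("SSH_CLIENT is empty; not inside an ssh session?")
    return fields[0]


def clipboard_command(ssh_client, tool="pbcopy"):
    """Build the argument list for running pbcopy/pbpaste back on the client."""
    return ["ssh", client_host(ssh_client), tool]


# Hypothetical SSH_CLIENT value, as seen on the remote server.
example = "192.0.2.10 52814 22"
print(client_host(example))                   # 192.0.2.10
print(clipboard_command(example, "pbpaste"))  # ['ssh', '192.0.2.10', 'pbpaste']
```

In a real session the value would come from os.environ["SSH_CLIENT"], and the list could be handed to subprocess.run.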

Friday, August 27, 2010

Convert Stockholm sequence format to Fasta

The original stockholm2fasta.pl didn't preserve the order of sequences. I fixed it to preserve the order:
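For illustration, the order-preserving idea can be sketched in a few lines of Python. This is not the Perl script mentioned above, just a minimal sketch under some assumptions: lines starting with "#" are markup, "//" ends the alignment, and interleaved blocks repeat the sequence names. The function name stockholm_to_fasta and the keep_gaps option are my own.

```python
def stockholm_to_fasta(text, keep_gaps=True):
    """Convert one Stockholm alignment to FASTA, keeping the input order."""
    sequences = {}  # dicts keep insertion order (Python 3.7+)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines, the header and #=G* markup
        if line == "//":
            break  # end of the alignment
        parts = line.split()
        if len(parts) < 2:
            continue  # not a "name sequence" line
        # Interleaved blocks repeat the name; append to the same record.
        sequences.setdefault(parts[0], []).append("".join(parts[1:]))

    records = []
    for name, chunks in sequences.items():
        seq = "".join(chunks)
        if not keep_gaps:
            seq = seq.replace(".", "").replace("-", "")
        records.append(">%s\n%s\n" % (name, seq))
    return "".join(records)


example = """# STOCKHOLM 1.0
seqB  AC-GT
seqA  ACGGT

seqB  TT
seqA  T.
//
"""
print(stockholm_to_fasta(example), end="")
```

The point is simply that the first appearance of a name fixes its position in the output, so seqB is written before seqA here.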

Thursday, August 19, 2010

Evolvability and the rate-limiting step

Do complex cellular processes (like the cell cycle) have a single rate-limiting step? (I.e., do they follow the Arrhenius equation?) Intuitively, this seems to be the case, as there should be one reaction that is the slowest (and thus rate-limiting).

Now, from the point of view of fitness, it would make sense to have all reactions occur at the same speed (e.g. by tuning enzyme levels). This would mean that there is not a single rate-limiting step, but rather a series of equally fast steps. Assuming that mutations are more likely to influence protein expression than to increase the catalytic activity of enzymes, this "balanced state" could actually occur.

However, once all reactions are tuned to be equally fast, how do you evolve? If there's only one slow reaction, clearly you have selection pressure on this reaction. If all reactions are equally fast, making one of them even faster (by changing the enzyme's activity) would have no influence, no?

Perhaps a way out is that after the enzyme's catalytic activity has increased, lower levels of this enzyme suffice, thus increasing fitness.

Thursday, July 1, 2010

Makefiles and concurrency

GNU make offers an option for parallel execution (-j), but there is a stumbling block: race conditions can corrupt output files if two Makefiles that run in parallel want to make the same target.

For example:

Makefile:
all: a b

a:
        $(MAKE) -f Makefile.a

b:
        $(MAKE) -f Makefile.b
Makefile.a:
a: x
        echo A
        cat x > a

x:
        echo A makes X!
        sleep 3
        echo 'A' >> x

Makefile.b:
b: x
        echo B
        cat x > b

x:
        echo B makes X!
        sleep 2
        echo 'B' >> x

When run serially (plain make), the file x just contains 'A': the first sub-make builds x, and the second finds it already up to date. With parallel jobs (e.g. make -j2), however, x contains both lines, 'B' and then 'A'. Each sub-make keeps its own dependency graph, so nothing checks whether another process is already making the same file. Hence, files that are prerequisites for the parallel parts need to be made first.
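The timing argument can be replayed with a toy model in Python. This is only a sketch of the reasoning, not of how make works internally: it assumes each sub-make checks once, at start-up, whether x exists, and that x comes into existence when the recipe appends to it. The function name simulate and the job list are made up for this example.

```python
def simulate(jobs, parallel):
    """Return the lines that end up in x, in the order they are written.

    jobs: (label, seconds the recipe for x takes) in the order in which
    the top-level make starts the sub-makes.
    """
    writes = []   # (time at which a recipe appends to x, label)
    clock = 0.0   # time at which the next sub-make starts
    for label, build_seconds in jobs:
        # A sub-make only runs its recipe if x does not exist when it starts.
        x_exists = any(done <= clock for done, _ in writes)
        if not x_exists:
            writes.append((clock + build_seconds, label))
        if not parallel:
            # Serial make waits for this sub-make before starting the next.
            clock = max([clock] + [done for done, _ in writes])
    return [label for _, label in sorted(writes)]


# Makefile.a sleeps 3 seconds, Makefile.b sleeps 2 seconds.
JOBS = [("A", 3), ("B", 2)]
print(simulate(JOBS, parallel=False))  # ['A']
print(simulate(JOBS, parallel=True))   # ['B', 'A']
```

In the parallel case both sub-makes start before x exists, so both recipes run, and the faster one (B) writes first.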

Wednesday, April 28, 2010

Cuckoo-host evolutionary dynamics

A perfect example of selection pressures: Some host species are very good at recognizing cuckoo eggs. In other species, however, this is not the case. Why? Because there, the cuckoos actually check if their egg has been kicked out, and destroy all the other eggs if the hosts recognized the cuckoo eggs. So you have a selection pressure against being able to discriminate cuckoo eggs. (The second article even describes other species and their tactics, all of this via reddit)

Monday, March 29, 2010

Test post

If you can see this post, then the migration from publishing via FTP to hosting at Blogger has worked.

Update: I can see this post myself now. I didn't actually use the migration assistant, just created a new subdomain for the old images and told my ISP to direct this subdomain to Google.

Friday, March 26, 2010

How to find non-GPCR targets for neuroactive drugs

In a follow-up to their Science paper, Hillenmeyer et al. recently published an extended analysis in Genome Biology. While the first paper was more focussed on the genes, this paper examines the chemicals in more detail. In particular, I am intrigued by this finding:
Because their neurological targets do not exist in yeast, the sensitivity we observe is likely a result of these compounds affecting additional cellular targets in yeast [44] these “secondary” targets, if conserved, may correspond to additional targets of these compounds in human cells.
The cited paper, by Ericson et al. (PLoS Genetics), seems to be rather underappreciated, with just 6 citations in Web of Science (as of this writing). Both studies point to numerous non-GPCR off-targets of many drugs. I think the next step should be to actually test the predicted drug targets, and I'm not quite sure why Hillenmeyer et al. didn't do this. Determining the affinities will be the only way to find out if the targets found in yeast are relevant to humans.

Wednesday, February 17, 2010

While we predict drug targets, pharma already knows them

All the excitement about predicting drug targets from remote chemical structure similarities (Nature) and drug side effects (Science) suddenly seems strangely unfounded when you realize that "they" already have the answer:
This figure is a tiny section of the data used for Preclinical Safety Profiling at the Novartis Institutes for BioMedical Research. They are not alone: BioPrint from Cerep is a repository of 2500 compounds assayed against 159 targets. Paolini et al. mention 600,000 binding data points at Pfizer.

Things are beginning to change, thanks to academic curation efforts like BindingDB, PDSP Ki database, DrugBank etc. and not least to EMBL-EBI's acquisition of ChEMBL. Still, one can only imagine how different current academic research questions would be if the all-against-all drug–target assay information were public. In essence, academia is working on predicting additional data points from a sparse matrix, while pharma companies already have the full matrix in hand, but want to add more rows/columns to the matrix in silico.

Update: As always, good discussion on FriendFeed.

Thursday, February 4, 2010

Human Protein Atlas data for download

As I just learned in our lab's journal club, the data from the Human Protein Atlas is available for download, thanks to their recent paper in MSB. Curiously enough, the HPA help page still states that they do not make data available as a matter of "general policy."

Friday, January 15, 2010

A Newick parser for Python, supporting internal node labels

I just pushed a fork of Thomas Mailund's nice Newick parser for Python to bitbucket. I added support for labeled internal nodes, but probably partially broke support for bootstrap values.
>>> from newick import parse_tree
>>> t = parse_tree("((Human,Chimp)Primate,(Mouse,Rat)Rodent)Supraprimates;")
>>> print t
(('Human', 'Chimp')Primate, ('Mouse', 'Rat')Rodent)Supraprimates
>>> print t.identifier
Supraprimates
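
To show what "labeled internal nodes" means for the grammar, here is a small self-contained sketch of a recursive-descent Newick reader. It is not the code from the fork and does not use its API: parse_newick is a hypothetical name, the tree is returned as nested (label, children) tuples, and branch lengths, bootstrap values and quoted labels are deliberately not handled.

```python
def parse_newick(text):
    """Parse a Newick string into nested (label, children) tuples."""
    text = text.strip()
    if not text.endswith(";"):
        raise ValueError("a Newick string must end with ';'")
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # consume "("
            while True:
                children.append(parse_node())
                if text[pos] == ",":
                    pos += 1  # another sibling follows
                elif text[pos] == ")":
                    pos += 1  # end of this group of children
                    break
                else:
                    raise ValueError("unexpected %r at %d" % (text[pos], pos))
        # Whatever follows, up to the next delimiter, is the node's label.
        # After ")" this is the label of the internal node.
        start = pos
        while text[pos] not in "(),;":
            pos += 1
        return (text[start:pos].strip(), children)

    tree = parse_node()
    if pos != len(text) - 1:
        raise ValueError("unexpected %r at %d" % (text[pos], pos))
    return tree


tree = parse_newick("((Human,Chimp)Primate,(Mouse,Rat)Rodent)Supraprimates;")
print(tree[0])                         # Supraprimates
print([label for label, _ in tree[1]])  # ['Primate', 'Rodent']
```

The only change needed for internal labels is that the label is read after the closing parenthesis as well as for leaves; an unlabeled internal node simply gets an empty string.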