Thursday, April 25, 2013

A walk around Bristol

The power was off today - tree cropping by the electricity company - so we decided to walk around historic Bristol. Brilliant chalk and charcoal pictures at the art school opposite the Victoria Rooms. Then via Brandon Hill (pictured) to Bristol Cathedral followed by lunch at the Shakespeare Tavern by Queen Square.

The other picture is from St Nicholas' market (Bristol pounds accepted, Elaine) where Clare bought a handmade mobile. Then back to Park Row NCP via a shopping detour in Broadmead.


Tuesday, April 23, 2013

Genome on a Hard Drive

I ignored all of 23andMe's warnings about loss of security and downloaded my raw chromosome/SNP-level genome data from their servers onto my hard drive. It's a 10 Megabyte text file which starts at chromosome 1 and ends with X & Y chromosomes, and my mitochondrial DNA. Here is how the file starts.


# This data file generated by 23andMe at: Tue Apr 23 09:13:29 2013
#
# Below is a text version of your data.  Fields are TAB-separated
# Each line corresponds to a single SNP.  For each SNP, we provide its identifier
# (an rsid or an internal id), its location on the reference human genome, and the
# genotype call oriented with respect to the plus strand on the human reference sequence.
# We are using reference human assembly build 37 (also known as Annotation Release 104).
# Note that it is possible that data downloaded at different times may be different due to ongoing
# improvements in our ability to call genotypes. More information about these changes can be found at:
# https://www.23andme.com/you/download/revisions/
#
# More information on reference human assembly build 37 (aka Annotation Release 104):
# http://www.ncbi.nlm.nih.gov/mapview/map_search.cgi?taxid=9606
#
# rsid chromosome position genotype
rs4477212 1       82154 AA
rs3094315 1       752566 AA
rs3131972 1       752721 GG
rs12124819 1       776546 AA
rs11240777 1       798959 GG
rs6681049 1       800007 CC
rs4970383 1       838555 AC
rs4475691 1       846808 CT
rs7537756 1       854250 AG
rs13302982 1       861808 GG
rs1110052 1       873558 GT

... and so on for 16,563 pages ...

In Microsoft Word it takes ten minutes to open due to its vast size and occupies 20 Megabytes.

As you can see, to the human eye this enormous heap of data is both incomprehensible and useless, but there are obviously tools - programs - which can access it. These were used by 23andMe itself to profile my health risks, inherited traits and ethnicity/ancestry.
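To give a flavour of what such a program looks like, here is a minimal Python sketch that reads the raw-data format shown above. It assumes only what the file header states (comment lines start with #, four whitespace-separated fields per SNP); the function name, the embedded sample and the "heterozygous" summary are my own illustration, not anything 23andMe supplies.

```python
import io
from collections import Counter

# A few lines copied from the start of the file, with TAB separators.
SAMPLE = """\
# rsid\tchromosome\tposition\tgenotype
rs4477212\t1\t82154\tAA
rs3094315\t1\t752566\tAA
rs3131972\t1\t752721\tGG
rs4970383\t1\t838555\tAC
"""

def parse_snps(lines):
    """Yield (rsid, chromosome, position, genotype) for each data line."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the commented header
        rsid, chrom, pos, genotype = line.split()
        yield rsid, chrom, int(pos), genotype

snps = list(parse_snps(io.StringIO(SAMPLE)))

# How many SNPs were read per chromosome?
per_chrom = Counter(chrom for _, chrom, _, _ in snps)

# SNPs where the two alleles differ (e.g. "AC").
heterozygous = [rsid for rsid, _, _, gt in snps
                if len(gt) == 2 and gt[0] != gt[1]]

print(per_chrom)
print(heterozygous)
```

For the real file you would pass an open file handle to parse_snps instead of the embedded sample.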

Some people in the genetics trade (Razib Khan, for example) have in fact pledged to make their genome publicly available on the Internet.

So: boring or useful?

The insurance company argument has these organisations running their programs over your genome and raising your premiums (or denying you cover) for heritable conditions. This is meant to be illegal in some places but that won't stop them.

The police/security argument is that these agencies would like nothing more than everyone's full genotype publicly available (on Facebook?) because (i) it makes DNA matching so much more powerful; and (ii) as 23andMe show, you can get a lot of phenotype even from today's restricted genotyping (i.e. 23andMe know a scary amount about me just from running their analysis).

The personal identity argument is that, assuming a benign prenatal and childhood environment, most of the key facts about personal identity are gene-encoded (how could they not be - we were built by these things). So personality, intelligence, appearance, height and even many social attitudes are heavily influenced by the genome: see this surprising graphic from here where genetic contribution is to the right in blue.



Graphics like this are built from twin studies, not genomic analysis, but there is intense ongoing research looking at the specific gene-variants (alleles) which drive such phenotypical characteristics. At some stage after the research is in, there will be tools which can grab your or my genome and read off this kind of rather personal information.

I guess those nice folk from the security and police services will be first in line, followed by employers and then potential life partners. Actually, the line could be in any order!

I ignore really futuristic options open perhaps to our great-great-grandchildren to clone their genome-publicising ancestor either in virtuality or in the flesh! (I wrote about this at science-fiction.com).

Sunday, April 21, 2013

What would it take?

The Kepler space telescope has found two Earth-like planets orbiting in the habitable zones of distant stars. These are prime candidates for life, but no-one cares: the news has passed without comment.

If life is subsequently found - bacteria,  algae, animals - it will be a ten day wonder.  What would get people's sustained attention?

An interstellar version of North Korea.

+++

There's a plethora of 'fast diet' cook books on sale now, with tasty 500 calorie meals for 'fast days'.

Of course,  Michael Mosley made up the 500 calorie rule based on the minimum amount of food he could eat without feeling starving. Ideally you would eat nothing on a fast day.

Saturday, April 20, 2013

Blaise Castle

The paths were dry, the trees budding and kiddies were out in force for the first decent day of spring. We traversed the dark tunnel behind St Mary's, walked through Blaise Castle woods and returned to the car across sun-lit parkland.

Name one good thing about feudalism: the great estates were preserved from short-term development.

Clare at the witch's cave


The author: 'lovers' leap' behind


Clare with the Folly

Thursday, April 18, 2013

My 23andMe results

The results of my genetic screening at 23andMe have now come back - almost a million SNPs analysed.

[Update (August 2017): see also my Personal Genome Project report (PDF)].

The human genome SNP database currently has around 20 million known SNPs but most have unknown consequences.

So, here are the highlights: overall I guess I was pleased and relieved! The impact on family? Well, I share half of my alleles with my siblings, my parents and my children (and I'm uncorrelated with my wife). So extrapolate with care .. we're none of us clones.

I was a bit surprised to find that my paternal lineage appears to be from Ireland! We had traced my great-great-grandfather up to Oldham, Lancs where he was a hatter (see here also). I think we had supposed the Oldham Seels were Anglo-Saxon, but the Y-chromosome is pointing instead to Ireland. On the maternal side, it seems my mother's maternal ancestors are solidly Atlantic coastal Celts.

Click on images to make them larger.





Clare says just by appearance - Neanderthal is obvious!



The site itself has masses of detailed results and research reports which will keep me occupied for a while.

Wednesday, April 17, 2013

Quadratic Equations

When I was in my first year at secondary school I used to walk down the road to the bus stop with a fellow pupil two years older than me. He was a rounded lad with, I recall, red hair and his name was Phil. I was learning to solve linear and simultaneous equations at the time and was becoming aware of quadratic equations, a topic which fascinated me. I also knew he had already studied this in his maths class ...

Suppose we have x^2 - 5x + 6 = 0, or, less mysteriously to a 12 year old,

5x = x^2 + 6.

Look! x occurs on both sides of the equation but not in a way which can be eliminated. There seems no easy way to proceed, but I had heard that there was a mysterious technique using "factorisation". I badgered Phil to tell me what the secret was.

As we sat on the 25 minute ride to the centre of Bristol, Phil wearily explained that you write:

x^2 - 5x + 6 = (x-3)(x-2) = 0.

"So x must be either 3 or 2, see?" he concluded brusquely.

I did not see. Surely x couldn't be two things at once. I was happy that the values of 2 and 3 both 'worked' in the sense that each satisfied the equation  - and this itself was a source of wonderment, two different values and they both work! The teacher eventually explained that if two things, when multiplied together, equal zero, then one or the other must itself be zero because otherwise their product would be something non-zero. So take your pick of the two factors.

This is more sophisticated reasoning than it appears. The equation doesn't tell you a fact about a pre-existing x (as I thought, hence my confusion as to how it could be two different things at once). Instead, the equation is a kind of sieve, a constraint which "checks" all possible values for x and only allows through those for which it is true. I confess I did not have a mental model of the entire number line flowing through the equation, with just 2 and 3 being sieved out!

Quadratic equations are not trivial. Consider that we could rearrange x^2 - 5x + 6 = 0 as an iterative equation

x_{n+1} = (x_n^2 + 6)/5.

A solution emerges when x_{n+1} and x_n are equal, but if you start with a blind guess, perhaps the sequence you get by plugging each x_{n+1} back into the equation's right-hand-side will 'converge' to the right answer?

Is it obvious if, or when, this happens? Look at the spreadsheet below: starting values run along the top row in yellow; below we have twenty iterations. When stuff starts getting big, it really gets big and blows Excel apart, so not all the cells have numbers in them.


It's a bit more advanced to work out the domains of convergence, and to realise that 2 is a local attractor and 3 isn't, so you'd get just one of the two solutions if you relied on naive iteration.
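The same experiment can be sketched in a few lines of Python rather than a spreadsheet. The function name, the step count and the blow-up limit are arbitrary choices of mine. The reasoning behind the result: the map f(x) = (x^2 + 6)/5 has slope 2x/5, which is 0.8 at x = 2 (so errors shrink) and 1.2 at x = 3 (so errors grow).

```python
def iterate(x0, steps=500, limit=1e12):
    """Apply x -> (x^2 + 6)/5 repeatedly; return None if it blows up."""
    x = x0
    for _ in range(steps):
        x = (x * x + 6) / 5
        if abs(x) > limit:
            return None  # diverging: stop before the numbers overflow
    return x

for start in (0, 1, 2.9, 3.0, 3.1, -2.9, -5):
    print(start, iterate(start))
```

Starting anywhere strictly between -3 and 3 the sequence settles on 2; starting at exactly 3 or -3 it sits at 3; anything further out runs away to infinity.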

I wonder how many kids in their school maths classes today experience the wow factor of getting to really know quadratic equations?

Sunday, April 14, 2013

"The Selfish Gene" - Richard Dawkins

There are some books which have been around for so long, and which are so well known that it appears pointless to actually read them. Surely we know the entirety of their content in advance.

Reading "The Selfish Gene" for the first time I'm reminded - yet again - how wrong this prejudice is. Dawkins' writing is rational and sophisticated,  his arguments both counterintuitive and persuasive. It's deeply unfamiliar to think of plant and animal (incl. human) behaviour in terms of whatever works best to maximise a gene's (or more accurately, an allele's) inclusive fitness.

I have two quibbles. 

1. Dawkins has avoided maths in favour of a textual, metaphorical presentation - this succeeds in making his book something which can be read rather than studied. I would, however, have appreciated being told explicitly when there was a robust mathematical analysis supporting his many hand-wavy conclusions.

Recall that evolutionary psychology is riddled with vaguely-plausible speculative arguments. I understand that there is a formal underpinning (population genetics) to Dawkins' book: I'd just like to know when it's being appealed to.

2. Dawkins, in common with other popularisers like Pinker, is keen to avoid the charge of genetic determinism. This leads to arguments like: 'I am not bound by my genes, I choose for example not to have children (by using contraception)'. 

This is a very sloppy argument as it's unclear what is being counterposed to the 'best interests of the genes'. It sounds like 'free will' but that isn't a concept within scientific explanation (not as long as we believe the laws of physics and discount magic). Dawkins' and Pinker's recourse to contraception is something which needs to be explained by biology, and I think it can be cast as the same kind of 'categorisation mistake' as in the adoption by a desperate mother of a genetically-unrelated child. Surely the alleles which permit it will markedly decrease in frequency in the future.

By evading a proper discussion of this point, Dawkins departs from his own high standards and is thus diminished.

"The Selfish Gene" is billed as an accessible popularisation but is not at all an easy read: the concepts are strange and counterintuitive, the argumentation is frequently sophisticated and subtle and the overall worldview driving the book unfamiliar. It requires engagement and thought by the reader, a personal growth experience which should alter the lens through which the affairs of the world are viewed. The resulting paradigm is far from conventional, let alone politically correct. 

You may observe once again how everyone pays lip service to Darwinian evolution, yet carefully avoids thinking through to the (unpalatable) consequences.

Saturday, April 13, 2013

Poetry please

Rolly, volely, puddin' and pie
In through the cat flap loudly you cry
Is that a vole in your mouth that I spy?
Toss it and tap it but don't let it die!

Three voles brought into the house and three saved so far this spring.

Thursday, April 11, 2013

Things I learned at Heathrow today

1. If you take a perfectly ordinary cheese,  tomato and ham baguette and put it into one of those hot, squashing panini-making machines, it cooks to something delicious.

2. If afterwards, you use the vacant basin in the baby-change booth to floss your teeth, you will be chased away by a member of cleaning staff despite protesting "Excuse me, do you see any babies here?"

As we were driving past Stonehenge, there were the usual 625 tourists forming a strung-out circle around the stones. 'A fine example of a Poisson distribution,' I thought to myself. 'The probability of any particular tourist being here today is minuscule .. but there are a lot of tourists.'

If we take the mean number of Stonehenge tourists (625 say) as the Poisson parameter λ, this will also be the variance. The standard deviation is therefore 25 (square root of λ). Almost always the number of tourists will be within 3 std dev of the mean (i.e. 550 - 700) which explains why Stonehenge, tourist-wise, always looks the same.
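The arithmetic can be checked with a short Python sketch. It takes λ = 625 from the figures above and sums the exact Poisson probabilities over the 3-standard-deviation band; the helper names are mine, and the log-gamma trick is just a way of avoiding enormous factorials.

```python
import math

def poisson_pmf(k, lam):
    """P(N = k) for a Poisson variable with mean lam, computed via logs."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def prob_between(lo, hi, lam):
    """P(lo <= N <= hi), summing the exact probabilities."""
    return sum(poisson_pmf(k, lam) for k in range(lo, hi + 1))

lam = 625
sd = math.sqrt(lam)                      # variance equals the mean
lo, hi = int(lam - 3 * sd), int(lam + 3 * sd)
p = prob_between(lo, hi, lam)

print(f"sd = {sd}, band = {lo}-{hi}, P(in band) = {p:.4f}")
```

The probability comes out well above 99%, so a head-count outside 550 - 700 would be a rare day indeed.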

Adrian arrived safely and is currently adjusting to the local time zone.

Tuesday, April 09, 2013

Buying double glazing - via microeconomics

We're in the market for some double-glazing, replacing windows in the hall plus the patio door to the back garden. We had the salesman from one of the UK's leading double-glazing companies here this morning to give us a quote. There followed a mild form of 'sticker shock' - the number was big! - and though the salesman told us that his company's products were "by far the best, that's why they're so expensive," we decided to get some more quotes first.

If we have three quotes we can try to run an auction to get the best value. So what's the scope of a double-glazing company to reduce its price? Some of their websites advertise "50% discounts"! How on earth can they afford to do that?

We estimated that the cost of labour for installing our two windows and replacement door would be of the order of £500 - or c.15% of the quote we actually received. This means that roughly 85% of the price must cover the products, the hi-tech double-glazed windows/door themselves. Now, making any sophisticated product involves high fixed costs (the factory, advertising, marketing, R&D) as well as the variable cost of labour and materials for the products themselves. These fixed and variable costs naturally need to be covered by the price per unit sold.

However, the marginal cost per unit is far less. Just what it costs in extra materials and manpower to make one additional unit - without regard for paying for all the other fixed stuff.

If the double-glazing company sells a unit at any price above its marginal cost, it will make some money (albeit not enough to justify its whole operation in the longer term). However, better to make some money than lose a sale and make nothing.

So, in theory, we could bid the auction process down until we're in the region of the winning company's marginal cost - this could easily result in a 50% discount ... if the winning company could keep secret the fact it was prepared to sell at this price (or would do so under very limited marketing conditions). Anything so long as it could still hope to hold its regular price in the general market and so cover all its costs and make a reasonable return on capital.
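A back-of-envelope version in Python, for what it's worth. The £500 labour estimate and the 15% share come from the reasoning above; the marginal-cost fraction is a purely hypothetical number I have made up to illustrate the argument, since we have no idea what the factory's real figure is.

```python
LABOUR_COST = 500.0    # our estimate of installation labour, in pounds
LABOUR_SHARE = 0.15    # labour as a fraction of the quote (c.15%)

quote = LABOUR_COST / LABOUR_SHARE       # implied size of the quote
product_price = quote - LABOUR_COST      # the part covering windows and door

# HYPOTHETICAL: fraction of the product price that is true marginal cost
# (extra materials and manpower); the rest covers fixed costs and profit.
MARGINAL_PRODUCT_FRACTION = 0.30

marginal_cost = LABOUR_COST + MARGINAL_PRODUCT_FRACTION * product_price
half_price = 0.5 * quote                 # the advertised "50% discount"
contribution = half_price - marginal_cost

print(f"quote ~ {quote:.0f}, half price ~ {half_price:.0f}")
print(f"marginal cost ~ {marginal_cost:.0f}, left over ~ {contribution:.0f}")
```

With that made-up fraction, a half-price sale still leaves a few hundred pounds towards fixed costs, which is the whole point of the argument.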

Bring it on, guys! We promise to keep quiet!


Intermittent fasting: month 8

Margaret Thatcher's death yesterday is in reality a ten-minute news story, followed by an hour's biography at some point. Instead we've had all-channels saturation coverage, which continues as I write. It's almost enough to drive one to Frasier re-runs.

Here's my current intermittent fasting protocol. As a rule, Monday, Wednesday and Friday I go to the gym for a one-hour workout and then just have a light lunch c. 500 calories. The other days I try not to go over the 2,500 calorie daily-allowance, and avoid junk.

After eight months I would say that almost all of the "apple-shaped" abdominal fat I used to have is gone, save for a small ridge of flab under my ribs, just above my waist.

Date        Stone  Lb    Pounds  Kg    d(Lb)  BMI    BMI-new
07/08/2012  13     8     190     86.4         27.20  26.49
08/09/2012  12     13    181     82.3  9      25.91  25.23
06/10/2012  12     7     175     79.5  6      25.05  24.39
08/11/2012  11     12    166     75.5  9      23.76  23.14
08/12/2012  11     6     160     72.7  6      22.90  22.30
08/01/2013  11     2     156     70.9  4      22.33  21.75
10/02/2013  11     1     155     70.5  1      22.19  21.61
09/03/2013  11     0.4   154     70.2  1      22.10  21.52
09/04/2013  10     12.2  152     69.2  2      21.79  21.22

Height (m): 1.782    Waist: 34.5 inches
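For anyone wanting to reproduce the two BMI columns, here is a small Python sketch. The standard BMI is weight in kg divided by height in metres squared. The BMI-new figures appear to match the 'new BMI' formula 1.3 x kg / height^2.5; that is an inference from the numbers, so treat it as an assumption. Small differences in the second decimal place come from the Kg column being rounded.

```python
HEIGHT_M = 1.782

# (date, weight in kg) taken from the first and last rows of the table
ROWS = [("07/08/2012", 86.4), ("09/04/2013", 69.2)]

def bmi(kg, height_m):
    """Standard body mass index: kg / m^2."""
    return kg / height_m ** 2

def new_bmi(kg, height_m):
    """ASSUMED formula for the BMI-new column: 1.3 * kg / m^2.5."""
    return 1.3 * kg / height_m ** 2.5

for date, kg in ROWS:
    print(date, round(bmi(kg, HEIGHT_M), 2), round(new_bmi(kg, HEIGHT_M), 2))
```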

Anyway, enough of this. It's time to spend an hour hoovering the house - penance for having just eaten a rather delicious ginger biscuit (with xylitol), one of a batch Clare made yesterday.

Sunday, April 07, 2013

Extreme assortative mating and IQ

People assortatively mate on traits such as intelligence. Does this make a difference to the usual bell-shaped curve for IQ? Surely it ought to increase the spread, the standard deviation?

Take a European Caucasian population with mean IQ of 100 and standard deviation 15. Now suppose the right-hand side of the population with IQ > 100 reproduce assortatively only amongst themselves. The median of the new population is at the 75th percentile of the original bell-shaped curve, at 0.67 sigma or approx 10 IQ points above the old mean; its mean is a little higher, at 0.80 sigma or approx 12 points.

This new distribution, the right-hand side only of the original curve, is far from normal .. but by the magic of the central limit theorem, over some number of generations, random mating will result in a new bell curve centred on IQ 112 with a sigma, as before, of 15.

(Update: Oops! The distribution has been pulled in; the standard deviation is actually reduced to 9 points. This counteracts the increased mean as far as very intelligent offspring are concerned, absent continued selective pressure for increased intelligence and time for advantageous mutations.)

The situation for the reproductively-isolated left-hand side is symmetrical: they end up as a normally-distributed population with mean IQ 88.
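A quick check of the half-bell-curve numbers, sketched with Python's standard library. The closed forms for the mean and standard deviation of a half-normal distribution are textbook results; the variable names are mine. Note that the 75th percentile of the original curve is strictly the median of the upper half, with the mean sitting slightly higher.

```python
import math
from statistics import NormalDist

MEAN, SD = 100.0, 15.0
population = NormalDist(MEAN, SD)

# Median of the upper half = 75th percentile of the whole population.
median_upper = population.inv_cdf(0.75)

# Half-normal results: mean offset = sd * sqrt(2/pi),
# standard deviation = sd * sqrt(1 - 2/pi).
mean_upper = MEAN + SD * math.sqrt(2 / math.pi)
sd_upper = SD * math.sqrt(1 - 2 / math.pi)

print(f"median {median_upper:.1f}, mean {mean_upper:.1f}, sd {sd_upper:.1f}")
```

The standard deviation of about 9 points agrees with the update above; by symmetry the left-hand half has the same spread around a correspondingly lower mean.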

If the environment is systematically culling the less-intelligent, as in our evolutionary history, then this analysis - simple as it is - applies.

It would be shocking to suggest that advanced societies today are culling the right-hand side of the bell-curve - as smart professionals decline to have offspring!

Saturday, April 06, 2013

Visualising Div and Curl

Some people are happy with just the maths equations. For myself, I want a geometric picture to really understand and use a concept. Thanks to Daniel Fleisch for the best description of how to visualise Div and Curl from this book. Click on the images to make them larger and readable.

A good introduction to Maxwell's Equations


Visualising the Div operator


Visualising the Curl operator

The Grad operator (on a scalar field), applied at a point on the surface, delivers the tangent vector pointing up the steepest slope, with its length measuring the steepness. That's pretty easy to visualise.
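For those who like to poke at numbers as well as pictures, here is a rough Python sketch of all three operators in two dimensions, using central differences. The function names, the step size and the three test fields are my own choices for illustration and are not taken from the book.

```python
def grad(f, x, y, h=1e-5):
    """Gradient of a scalar field f(x, y): points up the steepest slope."""
    return ((f(x + h, y) - f(x - h, y)) / (2 * h),
            (f(x, y + h) - f(x, y - h)) / (2 * h))

def div(F, x, y, h=1e-5):
    """Divergence of a vector field F(x, y) -> (Fx, Fy): net outflow."""
    dFx_dx = (F(x + h, y)[0] - F(x - h, y)[0]) / (2 * h)
    dFy_dy = (F(x, y + h)[1] - F(x, y - h)[1]) / (2 * h)
    return dFx_dx + dFy_dy

def curl_z(F, x, y, h=1e-5):
    """z-component of the curl of F(x, y): local rotation."""
    dFy_dx = (F(x + h, y)[1] - F(x - h, y)[1]) / (2 * h)
    dFx_dy = (F(x, y + h)[0] - F(x, y - h)[0]) / (2 * h)
    return dFy_dx - dFx_dy

bowl = lambda x, y: x * x + y * y     # scalar field: a paraboloid
source = lambda x, y: (x, y)          # flows outward from the origin
whirl = lambda x, y: (-y, x)          # rotates about the origin

print(grad(bowl, 1.0, 2.0))                               # roughly (2, 4)
print(div(source, 1.0, 2.0), curl_z(source, 1.0, 2.0))    # roughly 2 and 0
print(div(whirl, 1.0, 2.0), curl_z(whirl, 1.0, 2.0))      # roughly 0 and 2
```

The outward flow has divergence but no curl, and the whirlpool has curl but no divergence, which is exactly what the pictures suggest.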

Richard Feynman commented somewhere that in the distant centuries yet-to-be, when people look back to the nineteenth century, they will remember it for just one thing: it was the century when Maxwell's equations were formulated.

Thursday, April 04, 2013

Studies checkpoint

I see before me Book One of the Open University's S377 Molecular and Cell Biology course. The four-volume half-credit course covers how cells work, how they reproduce and how they go wrong (now of non-academic interest to Iain Banks). With my 23andMe results due next month, it's time to get my head around how genes work (I bought the four volumes via Amazon).

So here's an interesting test. I've got a good background in maths and physics but I have never formally studied biology. So how accessible is a final-year undergraduate course likely to be? My feeling is that, unlike maths and physics, biology is not deeply hierarchical or cumulative - with some background reading (for example on cell substructures) I'm confident that I can get through it. I'll let you know if such confidence is misplaced.

When I first started the Open University's Calculus of Variations course - M820, the foundation topic in the MSc - I was completely bewildered: it was like entering a foreign country where I knew nothing of the language, customs or ways of life. Now I'm about to start Chapter 13 of 14 and I've grown familiar with the apparatus of stationary values of functionals, the Euler-Lagrange equation, natural boundary conditions and the like. Today I was working through the solution to the brachistochrone problem in the presence of air-resistance and friction (separate cases): mountains of dense algebra but I was fine with the general techniques being used.

The course finishes with Sturm-Liouville theory and the Rayleigh-Ritz method: both are relevant in physics so I'm budgeting another month or so to complete.

The calculus of variations is a technical prerequisite to the application of Emmy Noether's theorem in physics, and I have a book about that next on my list. In parallel I'm going to work through a first course on differential geometry, a prerequisite both for Dr Einstein's more general theory and for quantum field theory (my holy grail of interest). I guess I should finally mention Consistent Quantum Theory which is conceptually dense but accessible - it's a path which leads to concepts such as density matrices and decoherence, critical to the modern understanding of what quantum mechanics actually means. So far, I'm about halfway through and feeling the enlightenment!

Wednesday, April 03, 2013

Iain Banks

Such a shock today to hear the news that Iain Banks has terminal cancer, and only months to live. He seems to be dealing with the situation competently, stoically, and with what dark humour is possible.

Perhaps we should send in our Culture question before it's too late: Iain, what actually happened to the Culture in the end?

Tuesday, April 02, 2013

A Parallel Universe

I  was sat in The Swan, by the window, eating lunch with my mother and watching traffic on the A38.

Just a note on the Swan's cuisine: they have pretensions - food was served on rectangular wooden 'plates' - but the food itself was bland. The apple and apricot crumble was lukewarm and light on apricots. I guess another pub crossed off the list.

Back to the A38. It had a lot of white cars:  BMWs, VWs mostly. I never see white cars,  and now it's every other car. Did something happen over Easter?