Evan, even focusing on English, the amount of information in a single letter is indeed not a simple question. While ASCII text is usually stored with 8 bits per character, the number of letters, even including capitals, is smaller than 64, so 6 bits are enough to write them directly (compare the 5-bit Baudot code and 6-dot Braille).

But optimal compression uses -lg(p) bits for an event of probability p, and Morse code moves in this direction by giving frequent letters shorter codes.

Treating letters independently, with the letter frequencies from http://en.wikipedia.org/wiki/Letter_frequency

(0.08167, 0.01492, 0.02782, 0.04253, 0.12702, 0.02228, 0.02015, 0.06094, 0.06966, 0.00153, 0.00772, 0.04025, 0.02406, 0.06749, 0.07507, 0.01929, 0.00095, 0.05987, 0.06327, 0.09056, 0.02758, 0.00978, 0.0236, 0.0015, 0.01974, 0.00074)

the Shannon entropy is -sum_i p_i lg(p_i) ~ 4.176 bits per letter, which is not much smaller than lg(26) ~ 4.7. So Morse-like approaches cannot do much better (we would need to group letters, as in Chinese).
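The entropy figure can be checked with a few lines of Python. This is only a sketch: the names FREQS and shannon_entropy are mine, and the frequencies are copied as listed above without renormalizing (they sum to slightly less than 1).

```python
from math import log2

# Letter frequencies a..z, copied from the list above (not renormalized).
FREQS = [
    0.08167, 0.01492, 0.02782, 0.04253, 0.12702, 0.02228, 0.02015,
    0.06094, 0.06966, 0.00153, 0.00772, 0.04025, 0.02406, 0.06749,
    0.07507, 0.01929, 0.00095, 0.05987, 0.06327, 0.09056, 0.02758,
    0.00978, 0.0236, 0.0015, 0.01974, 0.00074,
]


def shannon_entropy(probs):
    """Return -sum_i p_i lg(p_i) in bits, skipping zero probabilities."""
    return -sum(p * log2(p) for p in probs if p > 0)


entropy = shannon_entropy(FREQS)   # independent-letter model
uniform = log2(len(FREQS))         # all 26 letters equally likely
print(f"entropy ~ {entropy:.3f} bits, uniform ~ {uniform:.3f} bits")
```

The gap between the two numbers is what a per-letter code of the Morse type can save at best.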

... but language contains a lot of redundancy: very complex correlations between letters. This redundancy lets us reconstruct damaged messages, and removing it further reduces the number of bits per letter. The best known compressors ( http://mattmahoney.net/dc/text.html ) can squeeze 1GB of text into 127MB, which is about 1 bit per letter, as you have written.
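The "about 1 bit per letter" is just the compression ratio times 8. A quick check, assuming one byte per character in the original text and decimal units (1GB = 10^9 bytes, 127MB = 127*10^6 bytes); the function name is mine:

```python
def bits_per_char(compressed_bytes, original_chars):
    """Average number of compressed bits spent on each original character."""
    return 8 * compressed_bytes / original_chars


# Assumption: 1 byte per character, decimal megabytes/gigabytes.
rate = bits_per_char(127e6, 1e9)
print(f"{rate:.3f} bits per character")
```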

But it is not simple or straightforward. Here is a nice table for the Scruples novel from the 17th page of http://kuscholarworks.ku.edu/dspace/bitstream/1808/411/1/j42-hamid.pdf - the successive numbers are entropies in bits when using letter groups of lengths 1 to 7, respectively:

Letter group length     1     2     3     4     5     6     7
Upper Bound Entropy  4.07  3.43  2.71  2.62  3.46  2.75  2.73
Lower Bound Entropy  3.14  2.49  1.68  1.59  2.44  1.71  1.67

Going to biology, the 33 bits from the article you mention is the absolute minimum needed to distinguish an individual within the world population (almost 2^33 people).

In practice we need more ... because of randomness, the average length of a distinguishing identifier is 1.33275 bits longer (D_n above), and the DNA test has to be general: it is not known in advance how many bits are required to distinguish a given individual.

The article says that current tests use about 54 bits, which seems reasonable: the probability that a profile matching you also fits someone else is about 1/2^21, roughly 1 in 2 million (assuming the bits are well used (no twins etc.) and there are no errors).
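The 1/2^21 comes from 54 - 33 = 21 spare bits. A rough sketch of the arithmetic, under idealized assumptions that are mine: a population of exactly 2^33 people and profiles behaving like uniformly random, error-free 54-bit strings. The names below are hypothetical.

```python
WORLD_POPULATION = 2 ** 33  # assumption: the "almost 2^33" figure, rounded up
PROFILE_BITS = 54           # bits used by current tests, per the article


def expected_other_matches(population, bits):
    """Expected number of other people sharing one given random profile."""
    return (population - 1) / 2 ** bits


expected = expected_other_matches(WORLD_POPULATION, PROFILE_BITS)
print(f"about 1 in {1 / expected:,.0f}")
```

Since this expected number is tiny, it is also a good approximation of the probability that at least one other person matches.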

Our genetic individuality is on the scale of 0.2% of the whole genome ( http://en.wikipedia.org/wiki/Human_genetic_variation ), which is a few orders of magnitude larger (plus epigenetics, variation between cells ...). And we have many more individual features...

The limit of 2.33 bits/individual is a much stricter lower bound (or maybe there is an even lower one?). We would need very simplistic models to reach it, or to look from a specific point of view, as in sociophysical models of crowds.

And it requires looking at the population as a whole: summing the lengths of the distinguishing labels of individuals gives faster than linear entropy growth (at least n lg(n)).
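To see the n lg(n) growth, here is a small sketch under a simplifying assumption of mine: every individual gets a fixed-length binary label just long enough to tell n individuals apart (the random-prefix labels behind D_n would be a bit longer on average). The function name is hypothetical.

```python
def total_label_bits(n):
    """Total bits when each of n individuals gets a fixed-length distinct label."""
    bits_each = (n - 1).bit_length()  # smallest k with 2**k >= n
    return n * bits_each


for n in (8, 1024, 2 ** 20, 2 ** 33):
    total = total_label_bits(n)
    print(n, total, total / n)  # bits per individual keep growing with n
```

The bits-per-individual column grows like lg(n), so the total cannot stay linear in n.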