Super Codons


Super Codons and Super Nucleotides allow researchers to introduce diversity into their DNA constructs that matches a chosen amino acid distribution.

When each position of a codon carries a mixture of nucleotides instead of a single nucleotide, the codon encodes a distribution of amino acids rather than a single amino acid. In this way, researchers can introduce randomization at sites of interest.



Each of four Super Nucleotides is formed by some mixture of the four traditional nucleotides.

For instance, one set of Super Nucleotides might look like this:

Super Codons are then formed from three Super Nucleotides, one for each codon position.

The Super Codon created using Super Nucleotide 1, Super Nucleotide 1, Super Nucleotide 3 would give the following amino acid distribution:

The distribution is obtained by calculating the probability of each of the 64 possible codons and then summing the probabilities of the codons that encode each amino acid.

For instance, Proline can be formed from the codons CCT, CCC, CCA, and CCG.
p(CCT) = p(C in the 1st position) * p(C in the 2nd position) * p(T in the 3rd position) = 0.65 * 0.65 * 0.05 = 0.021125
p(CCC) = p(C in the 1st position) * p(C in the 2nd position) * p(C in the 3rd position) = 0.65 * 0.65 * 0.1 = 0.04225
p(CCA) = p(C in the 1st position) * p(C in the 2nd position) * p(A in the 3rd position) = 0.65 * 0.65 * 0.05 = 0.021125
p(CCG) = p(C in the 1st position) * p(C in the 2nd position) * p(G in the 3rd position) = 0.65 * 0.65 * 0.8 = 0.338
p(Proline) = p(CCT) + p(CCC) + p(CCA) + p(CCG) = 0.021125 + 0.04225 + 0.021125 + 0.338 = 0.4225
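
The same calculation can be carried out for all 20 amino acids at once. The following Python sketch is one possible way to do this, not the tool's own implementation. It assumes the standard genetic code. The Super Nucleotide 3 frequencies and the value p(C) = 0.65 for Super Nucleotide 1 are taken from the Proline example above; the remaining Super Nucleotide 1 frequencies, and the function name amino_acid_distribution, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def amino_acid_distribution(sn_first, sn_second, sn_third):
    """Return {amino acid: probability} for a Super Codon.

    Each argument maps the nucleotides T, C, A, G to their frequency
    in the Super Nucleotide used at that codon position.
    """
    dist = defaultdict(float)
    for codon, aa in CODON_TABLE.items():
        b1, b2, b3 = codon
        # Codon probability is the product of the three position probabilities.
        dist[aa] += sn_first[b1] * sn_second[b2] * sn_third[b3]
    return dict(dist)


# Super Nucleotide 1: only C = 0.65 comes from the text; T, A, G are hypothetical.
sn1 = {"T": 0.10, "C": 0.65, "A": 0.15, "G": 0.10}
# Super Nucleotide 3: frequencies as used in the Proline example.
sn3 = {"T": 0.05, "C": 0.10, "A": 0.05, "G": 0.80}

dist = amino_acid_distribution(sn1, sn1, sn3)
print(round(dist["P"], 6))  # 0.4225, matching p(Proline) above
```

Because Proline is encoded by all four CCN codons, its probability depends only on p(C) at the first two positions, so the hypothetical values do not affect that result. They do affect the probabilities of the other amino acids.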

Here, we offer researchers the ability to create their own Super Codons that yield amino acid distributions close to their target distributions.

We offer five different options for creating Super Codons:
  1. A pre-computed set of Super Codons based upon the BLOSUM-62 substitution matrix. We created 18 target distributions - one for each amino acid (except cysteine and proline). Each distribution sets one "wild-type" amino acid at a target frequency of 50%, and the remaining amino acids are assigned frequencies based upon the substitutions from the "wild-type" amino acid observed in the alignment data used to build the BLOSUM-62 table.
  2. The second method works like the first, but rather than creating Super Codons for all 18 amino acids, it allows the researcher to specify their own subset of the 20 amino acids to target.
  3. The third method allows the researcher to specify positions in the CDR regions of antibodies. We use the data at ~site~ to create a multiple alignment of the amino acids seen in antibodies at those positions, and attempt to create Super Codons which will mirror those distributions.
  4. The fourth method allows the researcher to upload their own target distributions.
  5. The fifth method allows the researcher to upload a multiple alignment. We then attempt to create Super Codons which will mirror the distribution of amino acids at each position in the multiple alignment.
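
For the fifth method, the per-position target distributions can be read directly off the columns of the alignment. The following Python sketch shows one way this could be done; it is an illustration, not the tool's actual procedure. The function name column_distributions is hypothetical, and skipping gap characters ("-" and ".") is an assumption about how gaps are treated.

```python
from collections import Counter


def column_distributions(alignment, gap_chars="-."):
    """Return one {amino acid: frequency} dict per alignment column.

    `alignment` is a list of equal-length aligned protein sequences.
    Gap characters are ignored (an assumption); a column that contains
    only gaps yields an empty dict.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    distributions = []
    for column in zip(*alignment):
        counts = Counter(res.upper() for res in column if res not in gap_chars)
        total = sum(counts.values())
        distributions.append(
            {aa: n / total for aa, n in counts.items()} if total else {}
        )
    return distributions


# Toy alignment of four sequences, three columns.
targets = column_distributions(["ARN", "ARD", "GR-", "AKN"])
print(targets[0])  # {'A': 0.75, 'G': 0.25}
```

Each resulting dictionary is a target distribution for one position, which can then be compared with the amino acid distribution produced by a candidate Super Codon.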

Thanks to Valerie Hammer for logo design.