RNA to protein X95757


Statement
 


Recall that the primary structure of a protein can be represented as a sequence over the alphabet of amino acids A (alanine, Ala), R (arginine, Arg), N (asparagine, Asn), D (aspartate, Asp), C (cysteine, Cys), E (glutamate, Glu), Q (glutamine, Gln), G (glycine, Gly), H (histidine, His), I (isoleucine, Ile), L (leucine, Leu), K (lysine, Lys), M (methionine, Met), F (phenylalanine, Phe), P (proline, Pro), S (serine, Ser), T (threonine, Thr), W (tryptophan, Trp), Y (tyrosine, Tyr), and V (valine, Val).

A codon of three nucleotides is translated into a single amino acid within a protein, with translation beginning with a start codon (AUG) and ending with a stop codon (UAA, UAG, or UGA). The 4^3 = 64 different nucleotide triplets code for 20 amino acids, one translation start signal (methionine, one of these amino acids) and three translation stop signals, with some redundancies. The genetic code defines a mapping between codons and amino acids, and despite variations in the genetic code across species, there is a standard genetic code common to most species.

AAA K   AAC N   AAG K   AAU N   ACA T   ACC T   ACG T   ACU T
AGA R   AGC S   AGG R   AGU S   AUA I   AUC I   AUG M   AUU I
CAA Q   CAC H   CAG Q   CAU H   CCA P   CCC P   CCG P   CCU P
CGA R   CGC R   CGG R   CGU R   CUA L   CUC L   CUG L   CUU L
GAA E   GAC D   GAG E   GAU D   GCA A   GCC A   GCG A   GCU A
GGA G   GGC G   GGG G   GGU G   GUA V   GUC V   GUG V   GUU V
UAA -   UAC Y   UAG -   UAU Y   UCA S   UCC S   UCG S   UCU S
UGA -   UGC C   UGG W   UGU C   UUA L   UUC F   UUG L   UUU F
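
The table above lists the 64 codons in lexicographic order over A, C, G, U, with "-" marking a stop codon, so it can be stored compactly as a 64-character string. The following is a minimal sketch in Python under that assumption. The names AMINO_ACIDS, CODON_TABLE and STOP_CODONS are illustrative and do not come from the statement.

```python
from itertools import product

# Amino acids for the 64 codons in lexicographic order over A, C, G, U
# (AAA, AAC, AAG, AAU, ACA, ...); "-" marks a stop codon.
# One 8-character group per row of the table in the statement.
AMINO_ACIDS = (
    "KNKNTTTT" "RSRSIIMI"
    "QHQHPPPP" "RRRRLLLL"
    "EDEDAAAA" "GGGGVVVV"
    "-Y-YSSSS" "-CWCLFLF"
)

# Map each codon (a 3-letter string) to its one-letter amino acid code.
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product("ACGU", repeat=3), AMINO_ACIDS)
}

# Codons that terminate translation.
STOP_CODONS = sorted(c for c, a in CODON_TABLE.items() if a == "-")
```

With this layout, CODON_TABLE["AUG"] is "M", STOP_CODONS is ["UAA", "UAG", "UGA"], and the remaining 61 codons map onto the 20 amino acids, which is where the redundancy of the code shows up.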

Write code for the protein translation problem. The program must implement and use the RNA-TO-PROTEIN function in the pseudocode discussed in class, which is iterative and is not allowed to perform input/output operations. Make one submission with Python code and another submission with C++ code.

Input

The input is a string s over the alphabet {A,C,G,U}.

Output

The output is the translation of a minimal substring of s, from a start codon to a stop codon, into a string (proteomic sequence) over the alphabet {A,R,N,D,C,E,Q,G,H,I,L,K,M,F,P,S,T,W,Y,V}.
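
One possible reading of "minimal substring" can be sketched as follows. It is not the pseudocode discussed in class, which the statement does not reproduce. The sketch assumes that every occurrence of AUG, at any offset, opens a reading frame, that the frame is read codon by codon up to its first stop codon, and that the shortest such substring is the one translated (the leftmost on ties). The function name rna_to_protein, the empty-string result when no start-to-stop substring exists, and the compact table layout are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code from the statement; "-" marks a stop codon.
# Codons are generated in lexicographic order over A, C, G, U.
CODON_TABLE = dict(zip(
    ("".join(c) for c in product("ACGU", repeat=3)),
    "KNKNTTTTRSRSIIMIQHQHPPPPRRRRLLLLEDEDAAAAGGGGVVVV-Y-YSSSS-CWCLFLF",
))


def rna_to_protein(s):
    """Translate the shortest start-to-stop substring of s (assumed reading).

    Iterative and free of input/output, as the statement requires.
    Returns "" (an assumption) if s has no such substring.
    """
    best = None  # protein of the shortest substring found so far
    for start in range(len(s) - 2):
        if s[start:start + 3] != "AUG":
            continue
        protein = []
        # Read codons in the frame opened by this start codon.
        for i in range(start, len(s) - 2, 3):
            amino = CODON_TABLE[s[i:i + 3]]
            if amino == "-":
                # Fewer amino acids means a shorter substring.
                if best is None or len(protein) < len(best):
                    best = "".join(protein)
                break
            protein.append(amino)
    return best if best is not None else ""
```

On the public test case, the input contains two consecutive AUG codons that share the same stop codon UGA. Starting from the second one gives the shorter substring, so the function returns MVVIIPSRTV with a single leading M. Reading the input and printing the result stay in the caller, outside the function. The sketch takes quadratic time in the worst case, which is a simplification rather than a requirement of the problem.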

Public test cases
  • Input

    GUCGCCAUGAUGGUGGUUAUUAUACCGUCAAGGACUGUGUGACUA
    

    Output

    MVVIIPSRTV
    
  • Information
    Author: Gabriel Valiente
    Language: English
    Official solutions: C++, Python
    User solutions: C++, Python