A nucleic acid or amino acid sequence of length n can be viewed as a collection of possibly overlapping k-mers, or words of length k, for 1 ≤ k ≤ n. An interesting problem is to generate all the words of length k contained in a genomic sequence of n nucleotides, for every k with 1 ≤ k ≤ n; that is, to generate all the subwords of the sequence.
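To make the notion of overlapping k-mers concrete, here is a minimal Python sketch for a single value of k. The function name kmers is a hypothetical helper chosen for illustration and is not part of the assignment; it assumes 1 ≤ k ≤ n.

```python
def kmers(s, k):
    """Return the k-mers of s in order of occurrence, repetitions included.

    Hypothetical helper for illustration only; assumes 1 <= k <= len(s).
    A sequence of length n has n - k + 1 (possibly repeated) k-mers.
    """
    return [s[i:i + k] for i in range(len(s) - k + 1)]


print(kmers("TATAAT", 2))  # prints ['TA', 'AT', 'TA', 'AA', 'AT']
```

Summing n - k + 1 over all k from 1 to n gives n(n + 1)/2 occurrences in total, so a sequence of length 6 has 21 subword occurrences, some of which may be repeated.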
Write a program that solves the subwords problem. The program must implement and use the SUBWORDS function from the pseudocode discussed in class. The function is recursive and must not perform any input/output operations. Make one submission in Python and another in C++.
Input
The input is a string s over the alphabet Σ={A,C,G,T}.
Output
The output is the list of all the nonempty subwords of s, without repetitions, sorted in lexicographic order.
Sample input
TATAAT
Sample output
A AA AAT AT ATA ATAA ATAAT T TA TAA TAAT TAT TATA TATAA TATAAT
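The SUBWORDS pseudocode discussed in class is not reproduced in this statement, so the following Python sketch is only one possible recursive formulation and may differ from the required one. It assumes that the subwords of s are the nonempty prefixes of s together with the subwords of s without its first symbol. The name subwords and the choice to return a set are assumptions made for illustration.

```python
def subwords(s):
    """Return the set of all nonempty subwords of s.

    Assumed recursive formulation (not necessarily the one from class):
    every subword is a prefix of some suffix of s, so collect the
    prefixes of s and recurse on s without its first symbol.
    The function performs no input/output.
    """
    if not s:
        return set()
    prefixes = {s[:k] for k in range(1, len(s) + 1)}
    return prefixes | subwords(s[1:])


# Input/output is kept outside the recursive function.
print(" ".join(sorted(subwords("TATAAT"))))
# prints: A AA AAT AT ATA ATAA ATAAT T TA TAA TAAT TAT TATA TATAA TATAAT
```

Using a set removes repetitions, and sorting is done once by the caller. The recursion depth equals the length of s, so very long inputs may exceed Python's default recursion limit.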