To remove extra spaces or symbols from FASTA format sequences

If you copy sequence data from a webpage, or a consensus sequence from a sequence assembly program, the sequence is very likely to contain junk symbols or spaces that can disrupt sequence analysis programs such as Sim4 or even BLAST. Such problems can be corrected here.

A further caution: this is not "readseq", which converts between sequence formats. This tool only works on "dirty" FASTA format data, so do not submit GenBank or GCG format data and expect a clean FASTA file in return.

Only A/T/G/C/R/Y/W/S/M/K/B/D/H/V/N are allowed in the sequences.

Also, we only handle plain text files, not other file types such as MS Word or MS Excel files.
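The cleanup described above can be sketched roughly as follows. This is only an illustration, not the code behind this page. The function name clean_fasta is hypothetical, and several behaviours are assumptions: header lines (those starting with ">") are kept as they are, sequence letters are converted to upper case, and any character outside the allowed list is simply dropped.

```python
# Nucleotide codes listed above as allowed in sequences.
ALLOWED = set("ATGCRYWSMKBDHVN")


def clean_fasta(text):
    """Return FASTA text with disallowed characters removed from sequence lines.

    Assumptions (not confirmed by this page): headers are left untouched,
    sequence letters are upper-cased, and lines left empty are dropped.
    """
    cleaned = []
    for line in text.splitlines():
        if line.startswith(">"):
            # Header line: keep it, trimming only trailing whitespace.
            cleaned.append(line.rstrip())
        else:
            # Sequence line: keep only the allowed codes.
            kept = "".join(ch for ch in line.upper() if ch in ALLOWED)
            if kept:
                cleaned.append(kept)
    return "".join(line + "\n" for line in cleaned)
```

For example, a sequence line such as "AC GT-12nn" would come out as "ACGTNN" under these assumptions.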

 

 

1. Please give the FASTA sequence data file:

2. The cleaned sequence data file can be delivered to:

Email address

3. Preferred output file name:

Choice of output in gzip-compressed format:

Maximum size per message that your email box accepts:

If your input data file is from a Mac machine, please be sure to check this box
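
The Mac checkbox presumably exists because text files from classic Mac OS end each line with a carriage return alone, which programs expecting Unix line endings read as one very long line. A minimal sketch of normalizing the line endings yourself before upload is shown below. The function name normalize_newlines is hypothetical, and this is not necessarily what the checkbox does.

```python
def normalize_newlines(data: bytes) -> bytes:
    """Convert Windows (CRLF) and classic Mac (CR) line endings to Unix (LF)."""
    # Replace CRLF first so a Windows line ending does not become two newlines.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```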