The following Perl script removes all digits (0-9) and all
whitespace, including newlines, from a file, so the output is a
single line. It can easily be modified to also remove letters
that do not represent sequence data in a sequence file. To write
the output to a file, use
perl my_perl.pl infile > outfile
#!perl
use strict;
use warnings;

# name of the input file, taken from the command line
my $infile = $ARGV[0];

# remind the user if there is no input file
if (!$infile) {
    print "No input file.\nUsage: perl my_perl.pl infile\n";
}

# read the input file if given
if ($infile) {
    # open the input file for reading
    open(IN, '<', $infile) or die "cannot open $infile: $!\n";
    # read the file line by line
    while (my $line = <IN>) {
        # remove digits and whitespace (including the newline)
        $line =~ s/[\d\s]//g;
        print $line;
    }
    close(IN) or die "cannot close $infile: $!\n";
}
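
As a rough sketch of the modification mentioned above, the substitution can be replaced with a negated character class that deletes everything except the letters allowed in the sequence. The helper name clean_sequence and the alphabet A, C, G, T, N (DNA bases plus N for an unknown base) are assumptions made for illustration; adjust the class for protein or other sequence data.

```perl
#!perl
use strict;
use warnings;

# Hypothetical helper: keep only the assumed DNA letters (A, C, G, T, N),
# in either case, and delete every other character.
sub clean_sequence {
    my ($text) = @_;
    # negated class: removes digits, whitespace, and non-sequence letters
    $text =~ s/[^ACGTN]//gi;
    return $text;
}

# prints "acgtngatc"
print clean_sequence("1 acgtxn gatc 61\n"), "\n";
```

In the script above, the same effect could be had by changing the substitution line to $line =~ s/[^ACGTN]//gi; since the negated class already covers digits and whitespace.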