How do researchers identify the species of a bacterium or other organism? In the past, scientists relied on observable traits (phenotypes), such as the ability to catabolize ("eat") different nutrients, growth on specific types of media, the presence of flagella, or cell shape. Now we can determine the sequence of an organism's genomic DNA.
By comparing the sequence of parts of an isolate's genome to sequences from known bacteria, we can identify the known bacteria most similar to it. For bacteria, we will use the gene for the 16S ribosomal RNA (rRNA), a component of the small ribosomal subunit. Because every bacterium carries this gene, we can compare its DNA sequence across all bacteria to group them into species.
List the example files for this exercise:
ls /nfs1/Teaching/CGRB/dbbc_s16/data/examples/
Each of these files contains the 16S rDNA sequence from a bacterial plant pathogen. We will identify that bacterium by running a BLAST search against the SILVA database of 16S sequences.
Use a random number generator to pick a number between 1 and 10:
rand 10
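If the rand command is not available on your system, the shuf utility from GNU coreutils can do the same job. This is only a sketch of an alternative, not part of the course setup, and the variable name num is chosen here for illustration:

```shell
# Pick one integer between 1 and 10 (inclusive) at random.
num=$(shuf -i 1-10 -n 1)
echo "Your number is $num"
```

Keeping the number in a variable means you can reuse it in later commands instead of typing it each time.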
Copy (cp) the file corresponding to your number to the current directory, replacing # with your number:
cp /nfs1/Teaching/CGRB/dbbc_s16/data/examples/#.fasta ./
Now we will BLAST your file against a database using the blastn program.
The blastn program takes several arguments:
-query ./inputfile.fasta This is your input file with the sequence to search for
-db /path/to/database This is your database of sequences to search against
-out ./outputfile.txt This is where to store the output from your blast search
Let's run the BLAST search, replacing # with your number:
blastn -query ./#.fasta -db /nfs1/Teaching/CGRB/dbbc_s16/data/db/bacteria16s -out ./blastoutput.txt
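The default report is long. As an optional variation, blastn can also write a compact table with one line per hit. The sketch below assumes you picked number 3 (a hypothetical value) and writes to blastoutput.tsv, a file name chosen here for illustration. It checks that blastn and your query file are present before running:

```shell
# Hypothetical value; replace 3 with the number you picked.
num=3
query="./${num}.fasta"
db=/nfs1/Teaching/CGRB/dbbc_s16/data/db/bacteria16s

if command -v blastn >/dev/null 2>&1 && [ -f "$query" ]; then
    # -outfmt 6 writes one tab-separated line per hit.
    # -max_target_seqs 5 limits the report to five database sequences.
    blastn -query "$query" -db "$db" -outfmt 6 -max_target_seqs 5 -out ./blastoutput.tsv
    head -n 5 ./blastoutput.tsv
else
    echo "blastn or $query not found; run this on the course server after copying your file"
fi
```

With -outfmt 6, the default columns are query ID, subject ID, percent identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, E-value, and bit score.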
Look at the output of your BLAST search using nano:
nano blastoutput.txt
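If you would rather not scroll through the whole file, you can print just the summary of hits. This is a sketch that assumes the default blastn report format, in which the hit list follows a line reading "Sequences producing significant alignments":

```shell
out=./blastoutput.txt

if [ -f "$out" ]; then
    # Print the summary header and the 12 lines after it (the top hits).
    grep -A 12 'Sequences producing significant alignments' "$out"
else
    echo "$out not found; run the blastn command first"
fi
```

The number 12 is arbitrary. Increase it to see more of the hit list.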
The search results are sorted from most similar to least similar. What are the genus and species of the bacterium most similar to yours?