Genome sequences of candidate wheat blast biocontrol bacteria

Emilie Chanclud, Joe Win, Jacob Malone, Musrat Zahan Surovy, Dipali Rani Gupta, Tofazzal Islam, Sophien Kamoun

 

Prof. Tofazzal Islam and team have identified several bacterial biocontrol agents that can inhibit fungal growth on wheat (Surovy et al., 2017). They isolated a number of these agents, and we have sequenced the genomes of four of the bacterial strains to 30x coverage. The genome sequence data are now available to download from the links in the tables below. An initial analysis of the genome data by Tofazzal and team is described here:

Dutta, Sudipta; Surovy, Musrat Zahan; Gupta, Dipali Rani; Mahmud, Nur Uddin; Chanclud, Emilie; Win, Joe; Kamoun, Sophien; Islam, Tofazzal. 2018. Genomic analyses reveal that biocontrol of wheat blast by Bacillus spp. may be linked with production of antimicrobial compounds and induced systemic resistance in host plants. https://doi.org/10.6084/m9.figshare.5558641.v1

 

Wheat blast biocontrol bacteria

| Bacterial isolate | Source | Appearance | Identification from 16S rRNA sequence | Filtered reads | Assemblies |
| --- | --- | --- | --- | --- | --- |
| BTS 3 (R2) | Seeds of Ranga binni (local rice) | White | Bacillus subtilis | 10294_R2_1_trimmed.fastq.gz, 10294_R2_2_trimmed.fastq.gz, 10294_R2_U1_trimmed.fastq.gz, 10294_R2_U2_trimmed.fastq.gz | fasta, gff, gbk |
| BTS 4 (R3) | Seeds of Ranga binni (local rice) | White, sticky | Bacillus amyloliquefaciens | 10295_R3_1_trimmed.fastq.gz, 10295_R3_2_trimmed.fastq.gz, 10295_R3_U1_trimmed.fastq.gz, 10295_R3_U2_trimmed.fastq.gz | fasta, gff, gbk |
| BTS 5 (SL3) | Seeds of Shakhorkora (local rice) | Light yellowish | Staphylococcus saprophyticus | 10293_SL3_1_trimmed.fastq.gz, 10293_SL3_2_trimmed.fastq.gz, 10293_SL3_U1_trimmed.fastq.gz, 10293_SL3_U2_trimmed.fastq.gz | fasta, gff, gbk |
| BTLK6A (K6A) | Seeds of Kanchan (wheat) | White | Bacillus amyloliquefaciens | 10296_K6A_1_trimmed.fastq.gz, 10296_K6A_2_trimmed.fastq.gz, 10296_K6A_U1_trimmed.fastq.gz, 10296_K6A_U2_trimmed.fastq.gz | fasta, gff, gbk |

Below are the metadata from bacterial genome sequencing and assembly provided by MicrobesNG.

 

Trimmed Reads

The reads were trimmed using Trimmomatic, and the quality was assessed using in-house scripts combined with the following software: Samtools, BedTools and bwa-mem. There are four files for each sample: samplename_1_trimmed.fastq.gz (forward reads), samplename_2_trimmed.fastq.gz (reverse reads), samplename_U1_trimmed.fastq.gz (unpaired forward reads, whose reverse read was lost during trimming) and samplename_U2_trimmed.fastq.gz (unpaired reverse reads, whose forward read was lost during trimming).
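The file naming convention described above can be sketched in a few lines of Python. This is only an illustration: the function name `trimmed_read_files` and the role labels are assumptions made for this example, not part of the MicrobesNG output.

```python
def trimmed_read_files(sample_name):
    """Return the four trimmed-read file names expected for one sample.

    The role labels (keys) are hypothetical names chosen for this sketch;
    only the file name pattern comes from the description above.
    """
    suffixes = {
        "paired_forward": "1",     # forward reads
        "paired_reverse": "2",     # reverse reads
        "unpaired_forward": "U1",  # forward reads whose mate was lost in trimming
        "unpaired_reverse": "U2",  # reverse reads whose mate was lost in trimming
    }
    return {
        role: f"{sample_name}_{tag}_trimmed.fastq.gz"
        for role, tag in suffixes.items()
    }


# Example with the sample name used for isolate BTS 3 (R2).
for role, name in trimmed_read_files("10294_R2").items():
    print(f"{role}: {name}")
```

Keeping the paired and unpaired files apart matters downstream, because most aligners and assemblers expect the two paired files to list reads in the same order.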

| Sample id | Median insert size | Mean coverage | Mean coverage excluding 0s | Number of reads | Number of reads w/ insert size > 300 | Links to sequence reads |
| --- | --- | --- | --- | --- | --- | --- |
| BTS 3 (R2) | 510 | 91.2 | 91.2 | 888940 | 601581 | 10294_R2_1_trimmed.fastq.gz, 10294_R2_2_trimmed.fastq.gz, 10294_R2_U1_trimmed.fastq.gz, 10294_R2_U2_trimmed.fastq.gz |
| BTS 4 (R3) | 591 | 58.8 | 58.79 | 541941 | 393264 | 10295_R3_1_trimmed.fastq.gz, 10295_R3_2_trimmed.fastq.gz, 10295_R3_U1_trimmed.fastq.gz, 10295_R3_U2_trimmed.fastq.gz |
| BTS 5 (SL3) | 444 | 217.14 | 217.17 | 1251774 | 262243 | 10293_SL3_1_trimmed.fastq.gz, 10293_SL3_2_trimmed.fastq.gz, 10293_SL3_U1_trimmed.fastq.gz, 10293_SL3_U2_trimmed.fastq.gz |
| BTLK6A (K6A) | 573 | 79.03 | 79.03 | 731187 | 504962 | 10296_K6A_1_trimmed.fastq.gz, 10296_K6A_2_trimmed.fastq.gz, 10296_K6A_U1_trimmed.fastq.gz, 10296_K6A_U2_trimmed.fastq.gz |
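The difference between "Mean coverage" and "Mean coverage excluding 0s" can be illustrated with a small sketch. It assumes a list of per-base read depths, for example as reported by a tool such as `samtools depth -a`. The exact procedure used by the MicrobesNG in-house scripts is not described here, so the function below is an illustration of the two averages rather than their method, and the depth values are made up.

```python
def coverage_summary(depths):
    """Return (mean depth over all positions, mean depth over covered positions).

    `depths` is a list of per-base read depths; positions with depth 0
    are included in the first mean and excluded from the second.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    total = sum(depths)
    covered = [d for d in depths if d > 0]
    mean_all = total / len(depths)
    mean_nonzero = total / len(covered) if covered else 0.0
    return mean_all, mean_nonzero


# Toy depths for six positions, two of which have no coverage.
toy_depths = [0, 10, 20, 30, 0, 40]
mean_all, mean_nonzero = coverage_summary(toy_depths)
print(f"mean coverage: {mean_all:.2f}")
print(f"mean coverage excluding 0s: {mean_nonzero:.2f}")
```

When almost every position is covered, the two values are nearly identical.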

 

Assembly Data

The assemblies are in fasta format and can be opened in a genome browser such as Artemis or IGV. The annotations are in general feature format (gff) and genbank (gbk) format; these can also be opened using the same software. The assembly metrics in the table below were calculated using QUAST; for further details about what they mean, see the QUAST manual.

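| Sample id | Download links | #contigs (>= 0 bp) | #contigs (>= 1000 bp) | Total length (>= 0 bp) | Total length (>= 1000 bp) | #contigs | Largest contig | Total length | GC (%) | N50 | N75 | L50 | L75 | #N's per 100 kbp |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BTS 3 (R2) | fasta, gff, gbk | 49 | 16 | 4122253 | 4108665 | 23 | 1140720 | 4113250 | 43.48 | 1063829 | 1023790 | 2 | 3 | 0 |
| BTS 4 (R3) | fasta, gff, gbk | 28 | 13 | 3907814 | 3901959 | 13 | 2032688 | 3901959 | 46.51 | 2032688 | 1024524 | 1 | 2 | 0 |
| BTS 5 (SL3) | fasta, gff, gbk | 80 | 37 | 2644501 | 2626880 | 39 | 572144 | 2628542 | 33.5 | 128421 | 84554 | 6 | 12 | 0 |
| BTLK6A (K6A) | fasta, gff, gbk | 33 | 15 | 3908827 | 3901415 | 16 | 1083238 | 3902222 | 46.51 | 1024542 | 947863 | 2 | 3 | 0 |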
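For readers unfamiliar with the N50, N75, L50 and L75 columns, the sketch below shows the usual definition: contigs are sorted from longest to shortest and added up until the running total reaches the chosen fraction of the assembly length. The function name `nx_lx` and the toy contig lengths are assumptions for illustration. QUAST itself also filters out short contigs (below 500 bp by default) before computing these statistics, which this sketch does not do.

```python
def nx_lx(contig_lengths, fraction):
    """Return (Nx, Lx) for a list of contig lengths.

    Nx is the length of the contig at which the running total of the
    longest contigs first reaches `fraction` of the total assembly
    length; Lx is the number of contigs needed to get there.
    """
    lengths = sorted(contig_lengths, reverse=True)
    threshold = sum(lengths) * fraction
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= threshold:
            return length, count
    raise ValueError("contig_lengths must not be empty")


# Toy assembly of five contigs with a total length of 300 bp.
toy_contigs = [100, 80, 60, 40, 20]
n50, l50 = nx_lx(toy_contigs, 0.50)
n75, l75 = nx_lx(toy_contigs, 0.75)
print(f"N50={n50} L50={l50} N75={n75} L75={l75}")
```

A larger N50 together with a smaller L50 indicates a more contiguous assembly.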

 

Taxonomic Distribution

The table below shows the top families, genera and species that the reads map to, as calculated using Kraken.

| Sample | Unclassified (%) | Most frequent family (%) | 2nd most frequent family (%) | Most frequent genus (%) | 2nd most frequent genus (%) | Most frequent species (%) | Escherichia coli (%) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| BTS 3 (R2) | 1.73 | Bacillaceae (96.65) | Enterobacteriaceae (0.02) | Bacillus (96.63) | Salmonella (0.01) | Bacillus subtilis (91.86) | 0.0 |
| BTS 4 (R3) | 1.9 | Bacillaceae (97.98) | Rhizobiaceae (0.02) | Bacillus (97.96) | Rhizobium (0.02) | Bacillus amyloliquefaciens (32.1) | 0.0 |
| BTS 5 (SL3) | 72.25 | Staphylococcaceae (22.44) | Enterococcaceae (2.49) | Staphylococcus (22.22) | Enterococcus (2.47) | Staphylococcus saprophyticus (7.64) | 0.0 |
| BTLK6A (K6A) | 2.27 | Bacillaceae (97.61) | Rhizobiaceae (0.01) | Bacillus (97.59) | Rhizobium (0.01) | Bacillus amyloliquefaciens (32.1) | 0.0 |
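A summary like the one above can be derived from a Kraken report. The sketch below assumes the tab-separated, six-column layout written by `kraken-report` (percentage of reads in the clade, reads in the clade, reads assigned directly, rank code, NCBI taxonomy ID, indented name). The function name `top_taxa` and all numbers in the toy report are made up for illustration and are not taken from these samples.

```python
def top_taxa(report_text, rank_code, n=2):
    """Return the n most frequent taxa of one rank as (name, percent) pairs.

    `rank_code` is the single-letter code in the fourth column of the
    report, e.g. "F" for family, "G" for genus, "S" for species.
    """
    rows = []
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        percent, _clade_reads, _direct_reads, code, _taxid, name = fields[:6]
        if code == rank_code:
            rows.append((name.strip(), float(percent)))
    return sorted(rows, key=lambda row: row[1], reverse=True)[:n]


# Toy report with invented percentages and read counts.
toy_report = "\n".join([
    "  2.00\t20\t20\tU\t0\tunclassified",
    " 98.00\t980\t0\t-\t1\troot",
    " 90.00\t900\t0\tF\t186817\t      Bacillaceae",
    " 89.00\t890\t10\tG\t1386\t        Bacillus",
    "  5.00\t50\t0\tF\t543\t      Enterobacteriaceae",
    "  4.00\t40\t40\tG\t561\t        Escherichia",
])

print(top_taxa(toy_report, "F"))
print(top_taxa(toy_report, "G"))
```

A high unclassified percentage, as seen for BTS 5 (SL3), means that most reads did not match anything in the database used, so the family and genus percentages for that sample cover only a minority of the reads.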