The Global Threat of ST147 Klebsiella pneumoniae

Bangalore, India 2023


Introduction

ST147 Klebsiella pneumoniae has emerged as a significant global health concern due to its multidrug resistance and rapid spread. While extensively studied in European and North American settings, its prevalence and characteristics in India and Southeast Asia remain under-explored. This study aimed to comprehensively characterize ST147 isolates from India and compare them to a global dataset to understand their unique features and contribution to the global burden of MDR K. pneumoniae.

Background

My Role in the Study

As a member of the Bioinformatics team, I contributed to the phylogenetic and plasmid analyses described in the Results below.

Results

Global Phylogeny and Time-Scaled Analysis

I played a pivotal role in constructing the time-scaled phylogeny, building the tree with the GHRU-SNP phylogeny pipeline and removing recombinant regions with Gubbins. Analysis of the genomic data identified multiple introduction events of ST147 into India, suggesting ongoing clonal expansion. We also used the BactDating R package to estimate the time of the most recent common ancestor (MRCA), which we placed around 1987.

Plasmid Replicon Types and Resistance Gene Associations

I also contributed to the analysis of plasmid replicon types and their association with resistance genes. Using PlasmidFinder and MOB-suite, we classified the contigs containing AMR genes as plasmid-borne or chromosomal. This analysis gave insight into the mechanisms behind the observed multidrug resistance phenotype, which was facilitated by hybrid plasmids carrying carbapenemase genes.
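The contig classification step can be illustrated with a small join. This is only a sketch on made-up toy data, not the study's actual code: the column names (contig_id, gene, molecule_type) and all values are assumptions chosen for illustration, and real AMR and plasmid-typing outputs would need to be read in and renamed to match.

```r
# Toy AMR hits: which contig each resistance gene was found on (hypothetical data)
amr_hits <- data.frame(
  contig_id = c("contig_1", "contig_2", "contig_3"),
  gene      = c("gene_A", "gene_B", "gene_C"),
  stringsAsFactors = FALSE
)

# Toy contig classification: plasmid or chromosome for every contig (hypothetical data)
contig_types <- data.frame(
  contig_id     = c("contig_1", "contig_2", "contig_3", "contig_4"),
  molecule_type = c("plasmid", "chromosome", "plasmid", "chromosome"),
  stringsAsFactors = FALSE
)

# Attach the molecule type to each AMR hit; hits on unclassified contigs would get NA
located <- merge(amr_hits, contig_types, by = "contig_id", all.x = TRUE)

# Count how many AMR genes sit on plasmids versus the chromosome
location_counts <- table(located$molecule_type)

print(located)
print(location_counts)
```

Keeping all.x = TRUE means an AMR hit is never silently dropped if its contig is missing from the classification table.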

Here are some R scripts that I have written to contribute to this study:

Script 1: Phylogenetic Tree Plot with Metadata Bars

In this section, we load the phylogenetic tree data and metadata, create a plot using the ggtree package, and overlay bars for each gene from the metadata.

Required Libraries

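# Load required libraries
library(ggtree)   # phylogenetic tree visualization
library(ggplot2)  # geom_segment(), aes(), scale_color_manual()
library(ape)      # reading and handling tree objects
library(dplyr)    # left_join(), mutate()
library(rlang)    # sym() for building column references from strings
library(tidyr)

Libraries: We load ggtree for phylogenetic tree visualization, ggplot2 for the plot layers, ape for handling tree objects, dplyr and tidyr for data manipulation, and rlang for referring to metadata columns by name.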

Reading Metadata and Tree File

# Read the metadata from the CSV file
metadata <- read.csv("/data/internship_data/srikanth_kpn/global_st147_fastqs/metadata/lj.csv", stringsAsFactors = FALSE)

# Read the tree file (assuming it's in Newick format)
tree <- read.tree("/data/internship_data/srikanth_kpn/global_st147_fastqs/new_fastqs/snp_phylogeny_output/2023/gubbins_new/mid_point_rooted_ape.nwk")
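Before merging, it can help to confirm that the tree's tip labels and the metadata IDs agree. The following is a minimal sketch on toy data; the toy tree, the toy metadata, and the assumption that the ID column is called tip.label are all illustrative.

```r
library(ape)

# Toy three-tip tree and toy metadata (hypothetical values)
toy_tree <- read.tree(text = "((A:1,B:2):0.5,C:3);")
toy_meta <- data.frame(
  tip.label = c("A", "B", "D"),
  gene_X    = c("yes", "no", "yes"),
  stringsAsFactors = FALSE
)

# Tips that have no metadata row, and metadata rows that are not in the tree
missing_from_meta <- setdiff(toy_tree$tip.label, toy_meta$tip.label)
missing_from_tree <- setdiff(toy_meta$tip.label, toy_tree$tip.label)

print(missing_from_meta)  # "C"
print(missing_from_tree)  # "D"
```

Any label reported here would end up with missing values after a left join, so it is worth resolving before plotting.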

Preparing the Tree Data Frame

# Identify the terminal edges: rows of tree$edge whose child node is a tip
# (in ape, column 1 is the parent node, column 2 the child; tips are numbered 1..Ntip)
n_tips <- length(tree$tip.label)
tip_edges <- which(tree$edge[, 2] <= n_tips)

# Create the tree_df data frame of tip labels and their terminal branch lengths
tree_df <- data.frame(tip.label = tree$tip.label[tree$edge[tip_edges, 2]],
                      branch = tree$edge.length[tip_edges],
                      stringsAsFactors = FALSE)

# Save the data frame to a CSV file
write.csv(tree_df, file = "/data/internship_data/srikanth_kpn/global_st147_fastqs/metadata/tree_df.csv", row.names = FALSE)
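How tips, edges, and branch lengths relate in an ape tree can be seen on a three-tip toy tree. This sketch is illustrative only (the Newick string is made up). It also shows the root-to-tip distance, which may be the more useful quantity when something has to be positioned at a tip's horizontal location.

```r
library(ape)

# Toy tree: A and B share an internal branch of length 0.5
toy_tree <- read.tree(text = "((A:1,B:2):0.5,C:3);")
n_tips <- length(toy_tree$tip.label)

# Each row of edge is (parent, child); a terminal edge has a tip as its child
tip_edges <- which(toy_tree$edge[, 2] <= n_tips)
terminal <- data.frame(
  tip.label = toy_tree$tip.label[toy_tree$edge[tip_edges, 2]],
  branch    = toy_tree$edge.length[tip_edges],
  stringsAsFactors = FALSE
)
print(terminal)      # terminal branch lengths: A = 1, B = 2, C = 3

# Root-to-tip distance sums every branch on the path from the root to the tip
root_to_tip <- node.depth.edgelength(toy_tree)[seq_len(n_tips)]
names(root_to_tip) <- toy_tree$tip.label
print(root_to_tip)   # A = 1.5, B = 2.5, C = 3
```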

Merging Metadata with Tree Data

# Merge the metadata with the tree data based on sample IDs
# (assumes the metadata has a tip.label column matching the tree's tip labels)
tree_with_metadata <- left_join(tree_df, metadata, by = "tip.label")

Plotting the Tree with Metadata Bars

# Make the original tree plot (ggtree expects the tree object, not a data frame)
p <- ggtree(tree)

# Look up where ggtree placed each tip so the bars line up with the tips
tip_pos <- as.data.frame(p$data[p$data$isTip, c("label", "x", "y")])
bars <- left_join(tree_with_metadata, tip_pos, by = c("tip.label" = "label"))
bar_width <- 0.02 * max(tip_pos$x)

# Create one column of bars per gene, to the right of the tree
genes <- colnames(metadata)[-1]  # Exclude the sample ID column (assumed to be first)
for (i in seq_along(genes)) {
  x_start <- max(tip_pos$x) + (2 * i - 1) * bar_width
  p <- p + geom_segment(data = bars,
                        aes(y = y, yend = y, color = !!sym(genes[i])),
                        x = x_start, xend = x_start + bar_width,
                        linewidth = 3, inherit.aes = FALSE)
}

# Colour the bars once: red where the gene is present, white where it is absent
p <- p + scale_color_manual(values = c("yes" = "red", "no" = "white"), guide = "none")

# Show the plot
print(p)
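As an alternative to drawing the bars by hand, ggtree provides gheatmap(), which aligns a data frame to the tips by row name. The sketch below uses a random toy tree and made-up presence/absence values, so the object names and data are assumptions for illustration rather than the study's figure code.

```r
library(ape)
library(ggtree)
library(ggplot2)

set.seed(1)            # reproducible toy tree
toy_tree <- rtree(5)   # random 5-tip tree from ape

# Toy presence/absence table; row names must match the tip labels
toy_genes <- data.frame(
  gene_A = c("yes", "no", "yes", "yes", "no"),
  gene_B = c("no", "no", "yes", "no", "yes"),
  row.names = toy_tree$tip.label,
  stringsAsFactors = FALSE
)

# Draw the tree, then attach one heatmap column per gene next to the tips
p_tree <- ggtree(toy_tree) + geom_tiplab(size = 3)
p_heat <- gheatmap(p_tree, toy_genes, offset = 0.2, width = 0.3, colnames = TRUE) +
  scale_fill_manual(values = c(yes = "red", no = "white"), name = "Gene present")

print(p_heat)
```

The offset and width values are guesses that suit this toy tree; they usually need tuning to the branch-length scale of a real tree.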

Script 2: BactDating Phylogenetic Analysis

This section demonstrates how to run BactDating to estimate the divergence times in a phylogenetic tree, including loading the tree, merging metadata, and visualizing the results.
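The intuition behind dating a tree can be shown with a plain root-to-tip regression: if substitutions accumulate at a roughly constant rate, root-to-tip distance grows linearly with sampling date, and the date at which the fitted line reaches zero estimates the age of the root. The sketch below uses made-up numbers in base R and is not a substitute for the Bayesian model that BactDating fits.

```r
# Toy data: sampling years and root-to-tip distances (hypothetical values,
# generated from a rate of 0.5 per year and a root in 2000, plus small deviations)
sampling_date <- c(2010, 2012, 2014, 2016, 2018)
root_to_tip   <- c(5.2, 5.6, 7.4, 7.6, 9.2)

# Linear regression of distance on date
fit <- lm(root_to_tip ~ sampling_date)

# The slope is the estimated substitution rate per year
rate_est <- unname(coef(fit)["sampling_date"])

# The x-intercept (distance = 0) is the estimated date of the root
root_date_est <- unname(-coef(fit)["(Intercept)"] / rate_est)

print(rate_est)
print(root_date_est)
```

With real data the points scatter much more, which is one reason to move to a model that also reports credible intervals.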

Required Libraries and Reading Data

# Load libraries
library(ape)
library(BactDating)

# Read the tree file and metadata
t <- read.tree(file="~/pw_fastqs/snp_phylogeny_output/gubbin_out/aligned_pseudogenome.node_labelled.final_tree.tre")
metadata <- read.csv(file="~/Kpn_data/bactdating/dates.csv")

# Extract tree tip labels and merge with metadata (tips without a metadata row get NA)
tree_labels <- data.frame(name = t$tip.label)
merged <- merge(tree_labels, metadata, by = "name", all.x = TRUE, all.y = FALSE)

# Reorder to match the tree tip order
merged2 <- merged[match(t$tip.label, merged$name), ]

# Initialize the rooted tree
rooted <- initRoot(t, merged2$date)

# Root-to-tip regression for time estimates (returns regression results, not a tree)
r <- roottotip(rooted, merged2$date)

# Run BactDating with a relaxed gamma model
res <- bactdate(unroot(t), merged2$date, nbIts=10000, initSigma=0.000005, 
                updateSigma=TRUE, updateRoot=TRUE, updateAlpha=TRUE, 
                updateMu=TRUE, model="relaxedgamma")

# Plot the results with confidence intervals
plot(res, 'treeCI', show.tip.label = TRUE)

# Additional BactDating run on previously rooted tree
res0 <- bactdate(rooted, merged2$date)

# Run BactDating again on the unrooted version of the tree rooted by initRoot()
res2 <- bactdate(unroot(rooted), merged2$date, nbIts=10000, initSigma=0.000005, 
                 updateSigma=TRUE, updateRoot=TRUE, updateAlpha=TRUE, 
                 updateMu=TRUE, model="relaxedgamma")

# Plot the results of the run on the previously rooted tree
plot(res0, 'treeCI', show.tip.label = TRUE)

Srikanth Srinivas

Bioinformatics graduate student at University of Bristol | Former Bioinformatics Intern at Global Health Research Unit, CRL, KIMS
