HrdDetective

Overview

HrdDetective is an R package for detecting Homologous Recombination Deficiency (HRD) from whole-genome sequencing data. It implements an algorithm based on the PELT algorithm and Sequenza statistical model to identify HRD in human cancer cells. The package provides a complete workflow from BAM file processing to copy number segmentation, ploidy/cellularity estimation, and HRD scar score calculation.

The package performs:

BAM to seqz conversion – extracts heterozygous SNPs and computes depth/BAF
Copy number segmentation – identifies genomic segments with uniform copy number using PELT algorithm
Model optimization – estimates tumor purity (cellularity) and ploidy based on Sequenza statistical model
Visualization – generates publication-ready plots of segments and model fits

Installation

# Install from GitHub (if available)
# devtools::install_github("limeng12/HrdDetective")

# For development installation:
# install.packages("devtools")
devtools::install_github("limeng12/HrdDetective")

Dependencies

Required R Packages

options(BioC_mirror = "https://mirrors.westlake.edu.cn/bioconductor")
options(repos = c(CRAN = "https://mirrors.westlake.edu.cn/CRAN/"))

# CRAN 
cran_pkgs <- c("XML", "RCurl", "Rcpp", "plyr", "dplyr", "iotools", "readr", 
               "tidyr", "devtools", "slider", "reshape2", "callr", "squash", 
               "ggplot2", "stringr", "testthat", "knitr", "rmarkdown", 
               "pheatmap", "gridExtra", "cowplot")
install.packages(cran_pkgs)

### Bioconductor Packages
if (!require("BiocManager")) install.packages("BiocManager")
BiocManager::install(c("Rsamtools", "GenomicAlignments", "Biostrings", 
                       "GenomicRanges", "rtracklayer", "BiocParallel", 
                       "BiocStyle", "copynumber"))
              
devtools::install_github("igordot/copynumber")

Quick Start

# Install packages
# Install from CRAN
conda create -n r_hrd -c conda-forge -c bioconda \
r-base=4.4 \
r-xml r-rcurl r-rcpp r-plyr r-dplyr r-iotools r-readr r-tidyr r-devtools \
r-slider r-reshape2 r-callr r-squash r-ggplot2 r-stringr \
r-testthat r-knitr r-rmarkdown r-pheatmap r-gridextra r-cowplot \
bioconductor-rsamtools bioconductor-genomicalignments \
bioconductor-biostrings bioconductor-genomicranges \
bioconductor-rtracklayer bioconductor-biocparallel \
bioconductor-biocstyle bioconductor-copynumber \
-y

conda activate r_hrd

R
# options(BioC_mirror="https://mirrors.westlake.edu.cn/bioconductor")
BiocManager::install("BSgenome.Hsapiens.UCSC.hg38")
install.packages("patchwork")

devtools::install_github("limeng12/HrdDetective")



# Load the package
library(HrdDetective)
library(BSgenome.Hsapiens.UCSC.hg38)

# Run the complete workflow
t_output_dir_name <- "./test_results"

# Step 1: Convert BAM to seqz (using example data)
bamfile_normal <- system.file("extdata", "target.chr1.normal.dedup.bam", package = "HrdDetective")
bamfile_tumor <- system.file("extdata", "target.chr1.tumor.dedup.bam", package = "HrdDetective")

bam2seqz_r_snps(
  normal_bam = bamfile_normal,
  tumor_bam = bamfile_tumor,
  bsgenome = BSgenome.Hsapiens.UCSC.hg38,
  output_file = "sample.snps.seqz.txt"
)

# Step 2: Process seqz file
data.file <- "sample.snps.seqz.txt"
data <- read.table(data.file, header = TRUE, stringsAsFactors = FALSE, sep = "\t")
filtered_data <- data[!grepl("^(M|chrM)", data[, 1]), ]
write.table(filtered_data, data.file, row.names = FALSE, col.names = FALSE, sep = "\t", quote = FALSE)

# Step 3: Run optimization and segmentation
seg_obj<-opti_and_segs(data.file=data.file, min.tumor.reads = 10);

# Step 4: Generate output
html_file <- generate_png_hrd_report(
  seg.tab = seg_obj$seg.tab,
  seqz_list = seg_obj$seqz_list,
  celluPloidyMat = seg_obj$celluPloidyMat,
  best_ploidy = seg_obj$best_ploidy,
  best_cellularity = seg_obj$best_cellularity,
  output_dir_name = t_output_dir_name
)

Output Files

The analysis generates the following output files:

File	Description
`raw_segments.txt`	Copy number segments with A/B allele counts
`scarHRD_input_test.tsv`	Formatted input for HRD scoring
`HRD_re.txt`	Estimated cellularity, ploidy, and sample/tumor ploidy
`hrd_plot_full.pdf`	Per-chromosome BAF, depth ratio, and copy number plots
`hrd_plot_seg.pdf`	Genome-wide horizontal segment plot
`model_fit_contour.pdf`	Likelihood landscape for cellularity/ploidy optimization
Various HRD score files	HRD, LST, and TAI scores

Key Functions

Core Functions

bam2seqz_r_snps() – Converts paired normal/tumor BAM files to seqz format using heterozygous SNPs
opti_and_segs() – Main function for copy number segmentation and model optimization
generate_png_hrd_report() – Generates all output files and visualization plots

Supporting Functions

pileup_whole_bam() – Fast pileup extraction from BAM files (C++ backend)
pileup_from_bam() – Region-specific pileup extraction
draw_model_fit() – Contour plot of cellularity/ploidy likelihood

Parameters

bam2seqz_r_snps()

Parameter	Default	Description
`normal_bam`	Required	Path to normal (germline) sample BAM file
`tumor_bam`	Required	Path to tumor sample BAM file
`genome_fasta`	NULL	Path to reference genome FASTA file
`bsgenome`	NULL	BSgenome object (e.g., BSgenome.Hsapiens.UCSC.hg38)
`min_depth`	10	Minimum read depth per position
`min_af`	0.25	Minimum allele frequency for heterozygous SNPs
`max_af`	0.75	Maximum allele frequency for heterozygous SNPs
`gc_window`	50	Window size for GC content calculation (±250bp)
`output_file`	"output.seqz"	Path to output seqz file

opti_and_segs()

Parameter	Default	Description
`data.file`	Required	Input seqz file path
`number_of_cores`	6	CPU cores for parallel optimization
`CNt.max`	8	Maximum copy number considered
`min.tumor.reads`	20	Minimum tumor reads for SNP inclusion
`cellularity`	seq(0.1, 1, by=0.01)	Cellularity search range
`ploidy`	seq(1, 6, by=0.1)	Ploidy search range

Method Details

Algorithm

Copy number segmentation using the PELT (Pruned Exact Linear Time) algorithm
Model optimization based on the Sequenza statistical model for estimating tumor cellularity and ploidy

Example Data

The package includes example BAM files for chromosome 1: - target.chr1.normal.dedup.bam – Normal sample BAM - target.chr1.tumor.dedup.bam – Tumor sample BAM

These files are located in the extdata directory and can be accessed using system.file().

Performance

Parallel processing for model optimization
Efficient C++ backend for pileup extraction model optimization

Troubleshooting

Common Issues

BAM file errors: Ensure BAM files are sorted and indexed (.bai files present)
Compilation errors: Ensure C++11 compiler is available

Citation

License

MIT License - see LICENSE file for details.

Maintainer

Meng Li limeng49631@aliyun.com{.email}

Acknowledgments

Built upon methods from Sequenza and PELT algorithm
Uses efficient C++ implementation for pileup extraction
Inspired by existing HRD detection tools

Support

For bugs, feature requests, or questions: - Open an issue on GitHub - Contact: Meng Li limeng49631@aliyun.com{.email} ```

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
R		R
inst/extdata		inst/extdata
man		man
src		src
DESCRIPTION		DESCRIPTION
NAMESPACE		NAMESPACE
README.md		README.md
master		master

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

HrdDetective

Overview

Installation

Dependencies

Required R Packages

Quick Start

Output Files

Key Functions

Core Functions

Supporting Functions

Parameters

bam2seqz_r_snps()

opti_and_segs()

Method Details

Algorithm

Example Data

Performance

Troubleshooting

Common Issues

Citation

License

Maintainer

Acknowledgments

Support

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

HrdDetective

Overview

Installation

Dependencies

Required R Packages

Quick Start

Output Files

Key Functions

Core Functions

Supporting Functions

Parameters

bam2seqz_r_snps()

opti_and_segs()

Method Details

Algorithm

Example Data

Performance

Troubleshooting

Common Issues

Citation

License

Maintainer

Acknowledgments

Support

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages