From 351ee5f055e6b4d2bfd15683f2d22bd3ea4865f7 Mon Sep 17 00:00:00 2001
From: Yu Wang <43355429+yuw444@users.noreply.github.com>
Date: Fri, 3 Oct 2025 15:00:24 -0500
Subject: [PATCH 1/2] Clarify GC bias preference in computeGCBias documentation

Update the description of GC bias in sequencing: the DNA polymerases used for PCR amplification favor regions with moderate GC content, not GC-rich regions.
---
 docs/content/tools/computeGCBias.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/content/tools/computeGCBias.rst b/docs/content/tools/computeGCBias.rst
index c3ba679f58..f0e0cfafaf 100644
--- a/docs/content/tools/computeGCBias.rst
+++ b/docs/content/tools/computeGCBias.rst
@@ -15,7 +15,7 @@ Background
 
 ``computeGCBias`` is based on a paper by `Benjamini and Speed <http://nar.oxfordjournals.org/content/40/10/e72>`_.
 The basic assumption of the GC bias diagnosis is that an ideal sample should show a uniform distribution of sequenced reads across the genome, i.e. all regions of the genome should have similar numbers of reads, regardless of their base-pair composition.
-In reality, the DNA polymerases used for PCR-based amplifications during the library preparation of the sequencing protocols prefer GC-rich regions. This will influence the outcome of the sequencing as there will be more reads for GC-rich regions just because of the DNA polymerase's preference.
+In reality, the DNA polymerases used for PCR-based amplification during library preparation prefer regions with moderate GC content. This influences the outcome of the sequencing, as there will be more reads for GC-moderate regions simply because of the DNA polymerase's preference. As the **real-life data** below show, the peak lies where the GC content is moderate.
 
 ``computeGCbias`` will first calculate the **expected GC profile** by counting the number of DNA fragments of a fixed size per GC fraction where GC fraction is defined as the number of G's or C's in a genome region of a given length.
 The result is basically a histogram depicting the frequency of DNA fragments for each type of genome region with a GC fraction between 0 to 100 percent. This will be different for each reference genome, but is independent of the actual sequencing experiment.

From 64779257279b6eebaf8c799eabbb89909815ca1d Mon Sep 17 00:00:00 2001
From: Yu Wang <43355429+yuw444@users.noreply.github.com>
Date: Fri, 3 Oct 2025 15:23:33 -0500
Subject: [PATCH 2/2] Clarify expected values in GC-bias correction

Update the correctGCBias description: regions with coverage above the expected values are typically GC-moderate, not GC-rich.
---
 deeptools/correctGCBias.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/deeptools/correctGCBias.py b/deeptools/correctGCBias.py
index 1154b93688..81fdfd858e 100755
--- a/deeptools/correctGCBias.py
+++ b/deeptools/correctGCBias.py
@@ -33,7 +33,7 @@ def parse_arguments(args=None):
         ' method proposed by [Benjamini & Speed (2012). '
         'Nucleic Acids Research, 40(10)]. It will remove reads'
         ' from regions with too high coverage compared to the'
-        ' expected values (typically GC-rich regions) and will'
+        ' expected values (typically GC-moderate regions) and will'
         ' add reads to regions where too few reads are seen '
         '(typically AT-rich regions). '
         'The tool ``computeGCBias`` needs to be run first to generate the '