@codeflash-ai codeflash-ai bot commented Dec 9, 2025

📄 35% (0.35x) speedup for _get_link_annotation in lib/matplotlib/backends/backend_pdf.py

⏱️ Runtime: 4.95 milliseconds → 3.66 milliseconds (best of 127 runs)
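The headline percentage appears to be measured relative to the optimized runtime, that is (old - new) / new. This reading is inferred from the figures in this report, not taken from Codeflash documentation, and `speedup` is a hypothetical helper name. A minimal sketch of the arithmetic:

```python
def speedup(old_runtime: float, new_runtime: float) -> float:
    """Fractional speedup relative to the new runtime (assumed definition)."""
    return (old_runtime - new_runtime) / new_runtime


# Overall figures from this report, in milliseconds.
overall = speedup(4.95, 3.66)
print(f"{overall:.2f}x ({overall:.0%})")  # prints "0.35x (35%)"
```

Under the same definition, the first per-test timing (13.4μs -> 11.2μs) gives about 19.6%, which agrees with the reported 19.7% once the rounding of the displayed timings is taken into account.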

📝 Explanation and details

The optimization achieves a 35% speedup by eliminating redundant operations and reducing Python overhead in _get_coordinates_of_block, the helper that _get_link_annotation calls to compute the annotation's QuadPoints and bounding Rect.

Key optimizations applied:

  1. Single-pass min/max calculation: Replaced four separate generator expressions (min(v[0] for v in vertices), max(v[0] for v in vertices), etc.) with one explicit loop that computes all bounds in a single pass over the vertices. This eliminates the overhead of creating four separate generators and iterating over the same data four times.

  2. Direct tuple flattening: Removed itertools.chain.from_iterable(vertices) and replaced it with explicit coordinate access (vertices[0][0], vertices[0][1], ...). For the fixed case of 4 vertices (8 coordinates), this direct approach avoids iterator overhead and function call costs.
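The two changes listed above can be sketched side by side. This is an illustration of the described patterns, not the actual diff: the function names are hypothetical, and any extra handling in the real helper (such as padding of the bounds) is left out.

```python
import itertools


def bounds_four_passes(vertices):
    """Pattern described as the original: one generator expression per bound."""
    min_x = min(v[0] for v in vertices)
    min_y = min(v[1] for v in vertices)
    max_x = max(v[0] for v in vertices)
    max_y = max(v[1] for v in vertices)
    flat = tuple(itertools.chain.from_iterable(vertices))
    return flat, (min_x, min_y, max_x, max_y)


def bounds_single_pass(vertices):
    """Pattern described as the optimization: one loop, explicit flattening."""
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = vertices
    min_x = max_x = x0
    min_y = max_y = y0
    # Visit the remaining three vertices once, updating all four bounds.
    for x, y in ((x1, y1), (x2, y2), (x3, y3)):
        if x < min_x:
            min_x = x
        elif x > max_x:
            max_x = x
        if y < min_y:
            min_y = y
        elif y > max_y:
            max_y = y
    # Exactly four vertices are assumed, so the eight coordinates are listed directly.
    flat = (x0, y0, x1, y1, x2, y2, x3, y3)
    return flat, (min_x, min_y, max_x, max_y)
```

The strict comparisons keep the first minimal and first maximal element on ties, which matches what the built-in min and max return, so both versions produce identical output for four-vertex input.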

Why this is faster:

  • Reduced iterations: The original code made 4 separate passes over the vertices list; the optimized version makes just 1 pass
  • Eliminated generator overhead: Direct loops are faster than generator expressions for small, fixed-size datasets
  • Removed function call overhead: Direct tuple creation is faster than itertools.chain for this specific use case
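One way to check these claims locally is a small timeit comparison. The sketch below is an assumption-laden micro-benchmark, not the harness Codeflash used: the function names, the sample vertices, and the repeat counts are all made up for illustration, and absolute timings will vary by machine and Python version.

```python
import timeit

# Hypothetical sample input: four corners of a rotated square.
VERTICES = ((0.0, 0.0), (7.07, -7.07), (14.14, 0.0), (7.07, 7.07))


def four_passes(vertices=VERTICES):
    """Four generator expressions, each iterating over all vertices."""
    return (
        min(v[0] for v in vertices),
        min(v[1] for v in vertices),
        max(v[0] for v in vertices),
        max(v[1] for v in vertices),
    )


def single_pass(vertices=VERTICES):
    """One explicit loop that tracks all four bounds."""
    min_x = max_x = vertices[0][0]
    min_y = max_y = vertices[0][1]
    for x, y in vertices[1:]:
        if x < min_x:
            min_x = x
        elif x > max_x:
            max_x = x
        if y < min_y:
            min_y = y
        elif y > max_y:
            max_y = y
    return min_x, min_y, max_x, max_y


def best_time(func, repeat=5, number=20_000):
    """Best per-call time in seconds over several timeit repeats."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number


print(f"four passes: {best_time(four_passes) * 1e6:.2f} μs per call")
print(f"single pass: {best_time(single_pass) * 1e6:.2f} μs per call")
```

Taking the best of several repeats mirrors the "best of 127 runs" convention in the report and reduces noise from other processes.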

Impact on workloads:
Based on the function references, _get_link_annotation is called from the draw_text, draw_mathtext, and draw_tex methods when gc.get_url() is not None, so it runs only when text carries a clickable link in the PDF output. The optimization is particularly beneficial for:

  • Text-heavy documents with many hyperlinks (roughly 34-38% faster in the bulk regression tests)
  • Mathematical expressions with links (the function runs on the mathtext path whenever a URL is set)
  • Bulk annotation creation where the function may be called hundreds of times
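For readers unfamiliar with the annotation being built, the following toy model shows the shape that the generated tests probe: a bounding Rect is always present, and QuadPoints appear only when the angle is not a multiple of 90. It is a simplified stand-in under stated assumptions, not matplotlib's code: `sketch_link_annotation`, the flat "URI" key, and the counterclockwise rotation direction are all hypothetical.

```python
import math


def sketch_link_annotation(url, x, y, width, height, angle=0):
    """Toy model: Rect always, QuadPoints only for angles off the 90-degree grid."""
    theta = math.radians(angle)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Corners of the rectangle rotated about (x, y); the direction is an assumption.
    corners = [
        (x + dx * cos_t - dy * sin_t, y + dx * sin_t + dy * cos_t)
        for dx, dy in ((0, 0), (width, 0), (width, height), (0, height))
    ]
    xs = [cx for cx, _ in corners]
    ys = [cy for _, cy in corners]
    annotation = {"URI": url, "Rect": (min(xs), min(ys), max(xs), max(ys))}
    if angle % 90:
        # Flatten the four corners into eight coordinates.
        annotation["QuadPoints"] = tuple(c for corner in corners for c in corner)
    return annotation


rotated = sketch_link_annotation("http://example.com", 0, 0, 10, 10, angle=45)
upright = sketch_link_annotation("http://example.com", 0, 0, 10, 10, angle=450)
print("QuadPoints" in rotated, "QuadPoints" in upright)  # prints "True False"
```

Because every call computes the four corners and their bounds, the cost of that step is paid once per linked text item, which is why the bulk tests show the clearest gains.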

Per-test improvements range from about 15% to 38%. The largest gains appear in the bulk loops and in repeated calls within a single test, where the fixed per-call overhead that was removed makes up a larger share of the runtime.

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 1071 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
# imports
from matplotlib.backends.backend_pdf import _get_link_annotation


# Helper class to simulate the graphics context (gc) with a get_url method
class DummyGC:
    def __init__(self, url):
        self._url = url

    def get_url(self):
        return self._url


# Helper class to simulate PDF Name objects
class Name(str):
    def __repr__(self):
        return f"/{self}"


# ------------------- UNIT TESTS -------------------

# 1. BASIC TEST CASES


def test_basic_axis_aligned_rectangle():
    # Test with angle = 0 (no rotation), positive coordinates and size
    gc = DummyGC("http://example.com")
    codeflash_output = _get_link_annotation(gc, 10, 20, 30, 40, angle=0)
    ann = codeflash_output  # 13.4μs -> 11.2μs (19.7% faster)
    # The rect should contain the original rectangle
    min_x, min_y, max_x, max_y = ann["Rect"]
    assert min_x <= 10 and min_y <= 20 and max_x >= 40 and max_y >= 60


def test_basic_axis_aligned_rectangle_negative_coords():
    gc = DummyGC("https://test.org")
    codeflash_output = _get_link_annotation(gc, -50, -60, 20, 10, angle=0)
    ann = codeflash_output  # 13.6μs -> 11.1μs (22.4% faster)
    min_x, min_y, max_x, max_y = ann["Rect"]


def test_basic_rotated_rectangle():
    # Test with a non-zero angle (not a multiple of 90)
    gc = DummyGC("http://rotated.com")
    codeflash_output = _get_link_annotation(gc, 0, 0, 10, 10, angle=45)
    ann = codeflash_output  # 13.7μs -> 11.3μs (21.3% faster)
    quad = ann["QuadPoints"]
    # Rect should contain all quad points
    min_x, min_y, max_x, max_y = ann["Rect"]
    xs = quad[::2]
    ys = quad[1::2]
    for x in xs:
        assert min_x <= x <= max_x
    for y in ys:
        assert min_y <= y <= max_y


def test_angle_multiple_of_90_no_quadpoints():
    # For angle=90, QuadPoints should NOT be present
    gc = DummyGC("http://angle90.com")
    codeflash_output = _get_link_annotation(gc, 0, 0, 10, 10, angle=90)
    ann = codeflash_output  # 13.4μs -> 11.2μs (18.9% faster)
    assert "QuadPoints" not in ann
    # For angle=180, QuadPoints should NOT be present
    codeflash_output = _get_link_annotation(gc, 0, 0, 10, 10, angle=180)
    ann2 = codeflash_output  # 6.33μs -> 4.96μs (27.6% faster)
    assert "QuadPoints" not in ann2


def test_angle_non_multiple_of_90_quadpoints_present():
    # For angle=135, QuadPoints should be present
    gc = DummyGC("http://angle135.com")
    codeflash_output = _get_link_annotation(gc, 0, 0, 10, 10, angle=135)
    ann = codeflash_output  # 13.2μs -> 11.2μs (18.3% faster)
    assert "QuadPoints" in ann


# 2. EDGE TEST CASES


def test_zero_width_height():
    # width=0, height>0
    gc = DummyGC("http://zero.com")
    codeflash_output = _get_link_annotation(gc, 10, 10, 0, 20, angle=0)
    ann = codeflash_output  # 13.5μs -> 11.1μs (22.0% faster)
    min_x, min_y, max_x, max_y = ann["Rect"]
    # height=0, width>0
    codeflash_output = _get_link_annotation(gc, 10, 10, 20, 0, angle=0)
    ann2 = codeflash_output  # 6.30μs -> 4.70μs (34.0% faster)
    min_x2, min_y2, max_x2, max_y2 = ann2["Rect"]
    # width=0, height=0
    codeflash_output = _get_link_annotation(gc, 10, 10, 0, 0, angle=0)
    ann3 = codeflash_output  # 4.95μs -> 3.71μs (33.6% faster)


def test_negative_width_height():
    # Negative width and/or height should still produce a valid rectangle
    gc = DummyGC("http://neg.com")
    codeflash_output = _get_link_annotation(gc, 10, 10, -20, 30, angle=0)
    ann = codeflash_output  # 12.8μs -> 11.1μs (15.3% faster)
    min_x, min_y, max_x, max_y = ann["Rect"]
    # Negative height
    codeflash_output = _get_link_annotation(gc, 10, 10, 20, -30, angle=0)
    ann2 = codeflash_output  # 6.29μs -> 4.77μs (31.9% faster)
    min_x2, min_y2, max_x2, max_y2 = ann2["Rect"]


def test_large_angle_wraparound():
    # Large angle should wrap (e.g. 450 == 90)
    gc = DummyGC("http://wrap.com")
    codeflash_output = _get_link_annotation(gc, 0, 0, 10, 10, angle=450)
    ann = codeflash_output  # 13.2μs -> 11.2μs (17.7% faster)
    assert "QuadPoints" not in ann  # 450 % 90 == 0
    codeflash_output = _get_link_annotation(gc, 0, 0, 10, 10, angle=451)
    ann2 = codeflash_output  # 6.47μs -> 4.94μs (31.1% faster)
    assert "QuadPoints" in ann2  # 451 % 90 != 0


def test_url_with_special_characters():
    # URL with special characters should be preserved
    url = "https://example.com/path?foo=bar&baz=qux#frag"
    gc = DummyGC(url)
    codeflash_output = _get_link_annotation(gc, 0, 0, 10, 10, angle=0)
    ann = codeflash_output  # 12.5μs -> 10.7μs (16.3% faster)


def test_gc_returns_non_string_url():
    # If get_url returns non-string, should still be accepted
    gc = DummyGC(12345)
    codeflash_output = _get_link_annotation(gc, 0, 0, 10, 10, angle=0)
    ann = codeflash_output  # 13.4μs -> 10.5μs (27.1% faster)


def test_rect_and_quadpoints_are_different_for_rotation():
    # For nonzero angle, QuadPoints and Rect should not be the same as axis-aligned
    gc = DummyGC("http://diff.com")
    codeflash_output = _get_link_annotation(gc, 0, 0, 10, 20, angle=0)
    ann0 = codeflash_output  # 13.2μs -> 10.9μs (21.2% faster)
    codeflash_output = _get_link_annotation(gc, 0, 0, 10, 20, angle=45)
    ann45 = codeflash_output  # 6.81μs -> 5.44μs (25.1% faster)


def test_angle_is_negative():
    # Negative angle should be handled
    gc = DummyGC("http://negangle.com")
    codeflash_output = _get_link_annotation(gc, 0, 0, 10, 10, angle=-45)
    ann = codeflash_output  # 13.4μs -> 10.9μs (22.8% faster)


def test_float_inputs():
    # All parameters as floats
    gc = DummyGC("http://float.com")
    codeflash_output = _get_link_annotation(gc, 1.5, 2.5, 3.5, 4.5, angle=17.3)
    ann = codeflash_output  # 13.5μs -> 10.7μs (25.6% faster)


# 3. LARGE SCALE TEST CASES


def test_many_annotations_unique_rects():
    # Generate many annotations and ensure their rects are unique
    gc = DummyGC("http://large.com")
    rects = set()
    for i in range(100):
        codeflash_output = _get_link_annotation(
            gc, i, i * 2, i + 1, i + 2, angle=i % 360
        )
        ann = codeflash_output  # 448μs -> 335μs (33.7% faster)
        rects.add(ann["Rect"])


def test_many_rotated_annotations_quadpoints_present():
    # For many rotated rectangles, QuadPoints should be present when angle%90!=0
    gc = DummyGC("http://rot.com")
    count_with_quad = 0
    for angle in range(1, 100):
        codeflash_output = _get_link_annotation(gc, 0, 0, 10, 10, angle=angle)
        ann = codeflash_output  # 440μs -> 324μs (35.9% faster)
        if angle % 90:
            assert "QuadPoints" in ann
            count_with_quad += 1
        else:
            assert "QuadPoints" not in ann


def test_performance_large_number_of_annotations():
    # Create 500 annotations and check they are all valid and quick
    gc = DummyGC("http://perf.com")
    for i in range(500):
        codeflash_output = _get_link_annotation(
            gc, i, i + 1, i + 2, i + 3, angle=i % 360
        )
        ann = codeflash_output  # 2.15ms -> 1.57ms (36.6% faster)
        # If angle not a multiple of 90, QuadPoints present
        if (i % 360) % 90:
            assert "QuadPoints" in ann
        else:
            assert "QuadPoints" not in ann


def test_rect_encloses_quadpoints_large_sample():
    # For a large sample, ensure rect always encloses quadpoints
    gc = DummyGC("http://enclose.com")
    for i in range(100, 200):
        codeflash_output = _get_link_annotation(
            gc, i, i + 2, i + 3, i + 4, angle=i % 360
        )
        ann = codeflash_output  # 449μs -> 325μs (38.1% faster)
        if "QuadPoints" in ann:
            quad = ann["QuadPoints"]
            min_x, min_y, max_x, max_y = ann["Rect"]
            xs = quad[::2]
            ys = quad[1::2]
            for x in xs:
                assert min_x <= x <= max_x
            for y in ys:
                assert min_y <= y <= max_y


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
# imports
from matplotlib.backends.backend_pdf import _get_link_annotation


# Helper class to simulate gc with get_url
class DummyGC:
    def __init__(self, url):
        self._url = url

    def get_url(self):
        return self._url


# Helper class to simulate PDF Name objects
class Name(str):
    def __repr__(self):
        return f"/{self}"


# unit tests

# BASIC TEST CASES


def test_basic_zero_angle():
    """Test with zero angle and basic rectangle."""
    gc = DummyGC("http://example.com")
    x, y, width, height, angle = 10, 20, 30, 40, 0
    codeflash_output = _get_link_annotation(gc, x, y, width, height, angle)
    annot = codeflash_output  # 13.3μs -> 11.1μs (20.3% faster)


def test_basic_nonzero_angle():
    """Test with non-zero angle and basic rectangle."""
    gc = DummyGC("https://pytest.org")
    x, y, width, height, angle = 0, 0, 10, 20, 45
    codeflash_output = _get_link_annotation(gc, x, y, width, height, angle)
    annot = codeflash_output  # 13.5μs -> 10.9μs (24.2% faster)


def test_basic_90_angle():
    """Test with angle exactly 90 degrees (should not have QuadPoints)."""
    gc = DummyGC("http://angle90.com")
    codeflash_output = _get_link_annotation(gc, 0, 0, 10, 10, 90)
    annot = codeflash_output  # 13.4μs -> 10.8μs (24.0% faster)


def test_basic_url_types():
    """Test with different URL types."""
    gc = DummyGC("ftp://files.local")
    codeflash_output = _get_link_annotation(gc, 1, 2, 3, 4, 0)
    annot = codeflash_output  # 13.2μs -> 10.6μs (24.6% faster)
    gc2 = DummyGC("")
    codeflash_output = _get_link_annotation(gc2, 1, 2, 3, 4, 0)
    annot2 = codeflash_output  # 5.82μs -> 4.28μs (36.0% faster)


# EDGE TEST CASES


def test_edge_zero_width_and_height():
    """Test with zero width and height."""
    gc = DummyGC("http://zero.com")
    codeflash_output = _get_link_annotation(gc, 0, 0, 0, 0, 0)
    annot = codeflash_output  # 12.8μs -> 10.2μs (25.4% faster)


def test_edge_negative_width_height():
    """Test with negative width and height."""
    gc = DummyGC("http://neg.com")
    codeflash_output = _get_link_annotation(gc, 0, 0, -10, -20, 0)
    annot = codeflash_output  # 13.3μs -> 10.4μs (27.3% faster)


def test_edge_large_angle():
    """Test with angle > 360 degrees."""
    gc = DummyGC("http://angle.com")
    codeflash_output = _get_link_annotation(gc, 0, 0, 10, 10, 450)
    annot = codeflash_output  # 13.6μs -> 11.0μs (23.1% faster)


def test_edge_fractional_angle():
    """Test with fractional angle."""
    gc = DummyGC("http://fraction.com")
    codeflash_output = _get_link_annotation(gc, 0, 0, 10, 10, 22.5)
    annot = codeflash_output  # 13.8μs -> 11.3μs (22.1% faster)
    # All coordinates should be floats
    for val in annot["QuadPoints"]:
        pass


def test_edge_extreme_coordinates():
    """Test with very large and very small coordinates."""
    gc = DummyGC("http://extreme.com")
    codeflash_output = _get_link_annotation(gc, 1e9, -1e9, 1e-5, 1e-5, 0)
    annot = codeflash_output  # 12.8μs -> 10.1μs (26.6% faster)


def test_edge_angle_modulo_90():
    """Test angle that is a multiple of 90 but not exactly 90 (e.g., 180, 270)."""
    gc = DummyGC("http://mod90.com")
    for angle in [180, 270, 360, -90]:
        codeflash_output = _get_link_annotation(gc, 0, 0, 10, 10, angle)
        annot = codeflash_output  # 30.4μs -> 23.8μs (27.6% faster)


def test_edge_unusual_url():
    """Test with unusual URL characters."""
    gc = DummyGC("http://weird.com/?a=1&b=2#frag")
    codeflash_output = _get_link_annotation(gc, 0, 0, 10, 10, 0)
    annot = codeflash_output  # 14.3μs -> 11.8μs (21.4% faster)


# LARGE SCALE TEST CASES


def test_large_scale_many_annotations():
    """Test creating many annotations in a loop."""
    gc = DummyGC("http://bulk.com")
    for i in range(100):
        codeflash_output = _get_link_annotation(gc, i, i * 2, i + 1, i + 2, i % 360)
        annot = codeflash_output  # 440μs -> 322μs (36.4% faster)
        # If angle is not a multiple of 90, QuadPoints should be present
        if (i % 360) % 90:
            assert "QuadPoints" in annot
        else:
            assert "QuadPoints" not in annot


def test_large_scale_max_dimensions():
    """Test annotation with very large width/height."""
    gc = DummyGC("http://maxsize.com")
    codeflash_output = _get_link_annotation(gc, 0, 0, 1e6, 1e6, 0)
    annot = codeflash_output  # 12.7μs -> 10.5μs (20.4% faster)


def test_large_scale_many_angles():
    """Test with many different angles."""
    gc = DummyGC("http://angles.com")
    for angle in range(0, 360, 10):
        codeflash_output = _get_link_annotation(gc, 0, 0, 10, 10, angle)
        annot = codeflash_output  # 173μs -> 128μs (35.1% faster)
        if angle % 90:
            assert "QuadPoints" in annot
        else:
            assert "QuadPoints" not in annot


def test_large_scale_varied_urls():
    """Test with many different URLs."""
    for i in range(100):
        url = f"http://test{i}.com"
        gc = DummyGC(url)
        codeflash_output = _get_link_annotation(gc, 0, 0, 10, 10, i)
        annot = codeflash_output  # 440μs -> 320μs (37.3% faster)


def test_large_scale_extreme_values():
    """Test with large negative and positive values."""
    gc = DummyGC("http://extremevalues.com")
    codeflash_output = _get_link_annotation(gc, -1e5, 1e5, 1e5, -1e5, 123)
    annot = codeflash_output  # 12.3μs -> 9.63μs (27.6% faster)
    # All rect values should be floats
    for val in annot["Rect"]:
        pass


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run git checkout codeflash/optimize-_get_link_annotation-miyjt1yv, commit your edits, and push.


@codeflash-ai codeflash-ai bot requested a review from mashraf-222 December 9, 2025 12:20
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Dec 9, 2025