@codeflash-ai codeflash-ai bot commented Dec 2, 2025

📄 94% (0.94x) speedup for StylerRenderer._generate_trimmed_row in pandas/io/formats/style_render.py

⏱️ Runtime : 1.00 millisecond → 516 microseconds (best of 15 runs)

📝 Explanation and details

The optimized code achieves a 94% speedup by eliminating redundant operations and reducing function call overhead in hot paths, particularly in _generate_trimmed_row, the primary performance bottleneck.

Key optimizations:

  1. Eliminated expensive _element function calls: The original code called _element() for every data cell (1,215+ times in tests), which accounted for 43.6% of total runtime. The optimized version constructs dict literals inline, removing the function call overhead from the main loop.

  2. Hoisted expensive lookups outside loops: Values such as self.css, self.data.index.nlevels, and the CSS class strings are now computed once before the loops rather than looked up on every iteration. Attribute access previously accounted for ~4.2% of runtime; after the change it is a negligible share.

  3. Optimized membership testing: Converted self.hidden_columns list to a set for O(1) membership tests instead of O(n) list lookups, particularly beneficial when many columns are hidden.

  4. Reduced string formatting overhead: CSS class strings are pre-computed and reused rather than being formatted multiple times per iteration.

  5. Optimized constructor defaultdict creation: Created a single default_fmt partial function and reused it across all defaultdict lambdas, avoiding redundant partial object creation.

  6. Streamlined _element function: Replaced dict spreading (**kwargs) with direct dict construction + update(), reducing overhead for the few remaining calls.
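
As a rough illustration of optimizations 1 to 4, the sketch below contrasts a per-cell helper call with an inlined dict literal, hoisted CSS lookups, and a set for hidden-column checks. It is a pure-Python stand-in, not the pandas code from this PR: `element`, `build_row_original`, `build_row_optimized`, and the `css` mapping are hypothetical names chosen for the example, and trimming logic is left out.

```python
def element(html_element, html_class, value, is_visible, **kwargs):
    """Hypothetical stand-in for a cell-building helper."""
    if kwargs.get("display_value") is None:
        kwargs["display_value"] = value
    return {
        "type": html_element,
        "value": value,
        "class": html_class,
        "is_visible": is_visible,
        **kwargs,
    }


def build_row_original(css, n_columns, hidden_columns):
    """Baseline: one helper call, three dict lookups and a list scan per cell."""
    row = []
    for c in range(n_columns):
        row.append(
            element(
                "td",
                f"{css['data']} {css['col']}{c} {css['row_trim']}",
                "...",
                c not in hidden_columns,  # O(n) when hidden_columns is a list
                attributes="",
            )
        )
    return row


def build_row_optimized(css, n_columns, hidden_columns):
    """Same output, with the per-cell work reduced."""
    hidden = set(hidden_columns)             # O(1) membership tests
    prefix = f"{css['data']} {css['col']}"   # hoisted out of the loop
    suffix = f" {css['row_trim']}"
    row = []
    for c in range(n_columns):
        # Inline dict literal instead of a helper call per cell.
        row.append(
            {
                "type": "td",
                "value": "...",
                "class": f"{prefix}{c}{suffix}",
                "is_visible": c not in hidden,
                "attributes": "",
                "display_value": "...",
            }
        )
    return row


css = {"data": "data", "col": "col", "row_trim": "row_trim"}
assert build_row_original(css, 5, [1, 3]) == build_row_optimized(css, 5, [1, 3])
```

The final assertion mirrors the role of codeflash_output in the generated tests: the two variants must produce identical rows before any timing comparison is meaningful.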

Impact on workloads: The optimizations are most effective when many columns are actually iterated. In the generated tests, the 100-column and 999-column cases without trimming, and the 100-column case with 90 hidden columns, show 95-102% speedups, while the 100-column case trimmed at 10 columns shows 58.3% and the small cases show roughly 38-70%. The gain grows with the number of cells built, so it matters most for pandas styling operations on medium-to-large DataFrames where rendering performance matters.
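
To check how one of these costs scales with column count on a given machine, a small timing harness along the following lines could be used. This is a sketch only: `count_hidden` and `time_membership` are hypothetical helpers, it isolates the list-versus-set membership test rather than timing the pandas method itself, and it takes the best of several repeats in the same spirit as the "best of 15 runs" figure above.

```python
import timeit


def count_hidden(hidden, n_columns):
    """Run one membership test per column, as a row-building loop would."""
    return sum(1 for c in range(n_columns) if c in hidden)


def time_membership(n_columns, repeats=5, number=5):
    """Return (list_seconds, set_seconds) per pass, best of `repeats`."""
    hidden_list = list(range(n_columns))  # worst case: every column hidden
    hidden_set = set(hidden_list)
    list_best = min(
        timeit.repeat(
            lambda: count_hidden(hidden_list, n_columns),
            repeat=repeats,
            number=number,
        )
    )
    set_best = min(
        timeit.repeat(
            lambda: count_hidden(hidden_set, n_columns),
            repeat=repeats,
            number=number,
        )
    )
    return list_best / number, set_best / number


if __name__ == "__main__":
    for n in (10, 100, 999):
        list_s, set_s = time_membership(n)
        print(f"{n:>4} columns: list {list_s * 1e6:.1f} us, set {set_s * 1e6:.1f} us")
```

Absolute numbers will vary by machine and Python version, so the output is best read as a trend across column counts rather than compared with the figures reported in this PR.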

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 25 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime
import pytest
from pandas import DataFrame, Series
from pandas.io.formats.style_render import StylerRenderer

# ------------------------
# Unit tests for _generate_trimmed_row
# ------------------------

# BASIC TEST CASES


def test_single_level_index_and_columns_no_trim():
    # 1 index level, 3 columns, max_cols=3 (no trim)
    df = DataFrame([[1, 2, 3]], columns=["A", "B", "C"])
    styler = StylerRenderer(df)
    codeflash_output = styler._generate_trimmed_row(3)
    result = codeflash_output  # 12.5μs -> 7.36μs (69.9% faster)
    for i in range(1, 4):
        pass


def test_single_level_index_and_columns_with_trim():
    # 1 index level, 4 columns, max_cols=2 (should trim after 2 visible columns)
    df = DataFrame([[1, 2, 3, 4]], columns=["A", "B", "C", "D"])
    styler = StylerRenderer(df)
    codeflash_output = styler._generate_trimmed_row(2)
    result = codeflash_output  # 11.9μs -> 8.11μs (46.4% faster)


def test_zero_columns():
    # No columns, should only have index headers
    df = DataFrame([[]])
    styler = StylerRenderer(df)
    codeflash_output = styler._generate_trimmed_row(2)
    result = codeflash_output  # 6.02μs -> 4.04μs (48.9% faster)


def test_hidden_columns():
    # Hide some columns, only visible columns are counted for trim
    df = DataFrame([[1, 2, 3, 4]], columns=["A", "B", "C", "D"])
    styler = StylerRenderer(df)
    styler.hidden_columns = [1, 3]  # Hide columns B and D
    codeflash_output = styler._generate_trimmed_row(1)
    result = codeflash_output  # 13.1μs -> 9.29μs (40.7% faster)


def test_all_columns_hidden():
    # All columns hidden, should still produce data cells but all not visible
    df = DataFrame([[1, 2, 3]], columns=["A", "B", "C"])
    styler = StylerRenderer(df)
    styler.hidden_columns = [0, 1, 2]
    codeflash_output = styler._generate_trimmed_row(2)
    result = codeflash_output  # 11.3μs -> 7.21μs (56.0% faster)
    for cell in result[1:]:
        pass


def test_max_cols_zero():
    # max_cols=0, should immediately add a trim indicator after index headers
    df = DataFrame([[1, 2, 3]], columns=["A", "B", "C"])
    styler = StylerRenderer(df)
    codeflash_output = styler._generate_trimmed_row(0)
    result = codeflash_output  # 8.63μs -> 6.25μs (38.2% faster)


def test_negative_max_cols():
    # Negative max_cols should result in immediate trim after index headers
    df = DataFrame([[1, 2, 3]], columns=["A", "B", "C"])
    styler = StylerRenderer(df)
    codeflash_output = styler._generate_trimmed_row(-1)
    result = codeflash_output  # 10.3μs -> 6.91μs (48.7% faster)


def test_series_input():
    # Accepts Series as input
    s = Series([1, 2, 3], index=["a", "b", "c"])
    styler = StylerRenderer(s)
    codeflash_output = styler._generate_trimmed_row(2)
    result = codeflash_output  # 7.50μs -> 5.42μs (38.5% faster)


def test_non_dataframe_or_series_input():
    # Should raise TypeError if not DataFrame or Series
    with pytest.raises(TypeError):
        StylerRenderer([1, 2, 3])


# LARGE SCALE TEST CASES


def test_large_number_of_columns_no_trim():
    # 1 index, 100 columns, max_cols=100 (no trim)
    df = DataFrame([range(100)], columns=[f"col{i}" for i in range(100)])
    styler = StylerRenderer(df)
    codeflash_output = styler._generate_trimmed_row(100)
    result = codeflash_output  # 80.5μs -> 41.2μs (95.1% faster)
    for cell in result[1:]:
        pass


def test_large_number_of_columns_with_trim():
    # 1 index, 100 columns, max_cols=10 (should trim after 10 visible columns)
    df = DataFrame([range(100)], columns=[f"col{i}" for i in range(100)])
    styler = StylerRenderer(df)
    codeflash_output = styler._generate_trimmed_row(10)
    result = codeflash_output  # 17.4μs -> 11.0μs (58.3% faster)
    for i in range(1, 11):
        pass


def test_large_number_of_hidden_columns():
    # 1 index, 100 columns, hide 90, max_cols=5 (should trim after 5 visible)
    df = DataFrame([range(100)], columns=[f"col{i}" for i in range(100)])
    styler = StylerRenderer(df)
    styler.hidden_columns = list(range(90))
    codeflash_output = styler._generate_trimmed_row(5)
    result = codeflash_output  # 92.2μs -> 45.6μs (102% faster)
    for i in range(1, 6):
        pass


def test_performance_large_but_under_1000():
    # 1 index, 999 columns, max_cols=999 (no trim)
    df = DataFrame([range(999)], columns=[f"col{i}" for i in range(999)])
    styler = StylerRenderer(df)
    codeflash_output = styler._generate_trimmed_row(999)
    result = codeflash_output  # 730μs -> 363μs (101% faster)
    for cell in result[1:]:
        pass


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run git checkout codeflash/optimize-StylerRenderer._generate_trimmed_row-mio4gmpr, commit your edits, and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 December 2, 2025 05:13
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: Medium (Optimization Quality according to Codeflash) labels Dec 2, 2025