@codeflash-ai codeflash-ai bot commented Dec 2, 2025

📄 47% (0.47x) speedup for maybe_convert_css_to_tuples in pandas/io/formats/style_render.py

⏱️ Runtime: 1.41 milliseconds → 959 microseconds (best of 57 runs)

📝 Explanation and details

The optimization replaces the list comprehension with an explicit loop that avoids redundant string operations. Key changes:

  1. Single split per property: Uses `str.partition(":")` in place of repeated `str.split(":")` calls. The original code called `x.split(":")` twice per rule: once to take the key with `[0]`, and again to rebuild the value with `":".join(x.split(":")[1:])`. `partition` splits once, at the first colon, and returns the key and the remainder directly.

  2. Reduced strip operations: The original code stripped `x` in the list comprehension condition and then stripped the split results. The optimized version strips `x` once up front (`x_stripped`) and reuses that result.

  3. Eliminated unnecessary string reconstruction: The original `":".join(x.split(":")[1:])` pattern splits a value at every colon and then joins the pieces back together, which is wasted work for values that contain colons (URLs, for example). `partition` leaves everything after the first colon intact.

  4. Simpler control flow: The list comprehension becomes a plain loop, so intermediate results (the stripped rule and the partition result) are stored in local variables and reused instead of being recomputed.
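
The changes listed above can be sketched as two standalone functions. This is a reconstruction from the description only, not the code in `pandas/io/formats/style_render.py`: the names `convert_with_split` and `convert_with_partition` are hypothetical, and the input validation that raises `ValueError` is left out.

```python
def convert_with_split(style: str) -> list[tuple[str, str]]:
    """Pattern described as the original: split(":") runs twice per rule."""
    return [
        (x.split(":")[0].strip(), ":".join(x.split(":")[1:]).strip())
        for x in style.split(";")
        if x.strip() != ""
    ]


def convert_with_partition(style: str) -> list[tuple[str, str]]:
    """Pattern described as the optimization: one strip, one partition."""
    result = []
    for x in style.split(";"):
        x_stripped = x.strip()
        if not x_stripped:
            continue  # skip empty rules such as ";;" or "; ;"
        # partition splits at the first colon only, so colons in the
        # value (URLs, for example) stay untouched
        key, _, value = x_stripped.partition(":")
        result.append((key.strip(), value.strip()))
    return result


print(convert_with_partition("url: http://example.com:8080;"))
# -> [('url', 'http://example.com:8080')]
```

Both functions return the same pairs for every CSS string exercised by the generated tests, including rules with no colon after the first one has been seen, where the value comes back as an empty string.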

The optimization shows a 47% speedup overall, with the strongest gains (roughly 30-60%) on CSS strings with multiple rules, colons in values, and large inputs. Inputs that are already lists are returned unchanged and show no meaningful difference. Based on the function references, this function is called in hot paths within pandas DataFrame styling operations (`_update_ctx`, `_update_ctx_header`, `set_table_styles`), where CSS strings are processed for every styled cell. The optimization reduces CPU overhead when styling large DataFrames or applying complex CSS rules.
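
To get a rough feel for the per-rule cost on a given machine, the two key/value extraction patterns can be timed in isolation with `timeit`. This is a minimal sketch and not the Codeflash benchmark harness: the sample rule and the helper name `best_ns` are assumptions, and absolute numbers will vary by interpreter and hardware.

```python
import timeit

# Hypothetical sample rule whose value contains several colons
RULE = " url: http://example.com:8080 "

# Extraction as described for the original code: split(":") evaluated twice
SPLIT_STMT = 'k = rule.split(":")[0].strip(); v = ":".join(rule.split(":")[1:]).strip()'

# Extraction as described for the optimized code: a single partition
PARTITION_STMT = 'k, _, v = rule.partition(":"); k = k.strip(); v = v.strip()'


def best_ns(stmt: str, number: int = 50_000, repeat: int = 5) -> float:
    """Return the best-of-`repeat` time per execution, in nanoseconds."""
    totals = timeit.repeat(stmt, globals={"rule": RULE}, number=number, repeat=repeat)
    return min(totals) / number * 1e9


split_ns = best_ns(SPLIT_STMT)
partition_ns = best_ns(PARTITION_STMT)
print(f"split/join: {split_ns:.0f} ns per rule, partition: {partition_ns:.0f} ns per rule")
```

Taking the minimum over several repeats mirrors the "best of N runs" convention used in the runtime figures, since the fastest run is the one least disturbed by other activity on the machine.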

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 95 Passed |
| 🌀 Generated Regression Tests | 66 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

⚙️ Existing Unit Tests and Runtime

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| io/formats/style/test_style.py::TestStyler.test_maybe_convert_css_to_tuples | 8.97μs | 6.71μs | 33.6% ✅ |
| io/formats/style/test_style.py::TestStyler.test_maybe_convert_css_to_tuples_err | 1.39μs | 1.30μs | 6.74% ✅ |

🌀 Generated Regression Tests and Runtime
from typing import Union

# imports
import pytest  # used for our unit tests
from pandas.io.formats.style_render import maybe_convert_css_to_tuples

CSSPair = tuple[str, Union[str, float]]
CSSList = list[CSSPair]
CSSProperties = Union[str, CSSList]

# unit tests

# ----------- Basic Test Cases ------------


def test_basic_css_string_single_rule():
    # Test a simple CSS string with one rule
    codeflash_output = maybe_convert_css_to_tuples(
        "color: red;"
    )  # 2.37μs -> 1.81μs (30.9% faster)


def test_basic_css_string_multiple_rules():
    # Test a CSS string with multiple rules
    codeflash_output = maybe_convert_css_to_tuples(
        "color: red; background: blue;"
    )  # 3.18μs -> 2.28μs (39.6% faster)


def test_basic_css_list_input():
    # Test input that is already a list of tuples
    input_list = [("color", "red"), ("background", "blue")]
    codeflash_output = maybe_convert_css_to_tuples(
        input_list
    )  # 436ns -> 422ns (3.32% faster)


def test_basic_css_string_with_extra_spaces():
    # Test CSS string with extra spaces around keys and values
    codeflash_output = maybe_convert_css_to_tuples(
        " color :  red ; background :  blue ; "
    )  # 3.51μs -> 2.54μs (38.2% faster)


def test_basic_css_string_with_float_value():
    # Test CSS string with a float value
    codeflash_output = maybe_convert_css_to_tuples(
        "opacity: 0.5;"
    )  # 2.48μs -> 1.81μs (37.0% faster)


# ----------- Edge Test Cases ------------


def test_empty_string():
    # Test empty string input returns empty list
    codeflash_output = maybe_convert_css_to_tuples("")  # 1.29μs -> 889ns (44.8% faster)


def test_empty_list():
    # Test empty list input returns empty list
    codeflash_output = maybe_convert_css_to_tuples([])  # 442ns -> 452ns (2.21% slower)


def test_string_without_colon_raises():
    # Test that a string without colon raises ValueError
    with pytest.raises(ValueError):
        maybe_convert_css_to_tuples("color")  # 1.21μs -> 1.18μs (2.36% faster)


def test_string_with_trailing_semicolon():
    # Test string with trailing semicolon
    codeflash_output = maybe_convert_css_to_tuples(
        "color: red;"
    )  # 2.64μs -> 2.08μs (26.7% faster)


def test_string_with_multiple_colons_in_value():
    # Test property value with colons
    codeflash_output = maybe_convert_css_to_tuples(
        "font: 12px/14px Arial:Bold;"
    )  # 2.66μs -> 1.94μs (37.3% faster)


def test_string_with_empty_rules_between_semicolons():
    # Test string with empty rules between semicolons
    codeflash_output = maybe_convert_css_to_tuples(
        "color: red;;background: blue;;"
    )  # 3.04μs -> 2.59μs (17.4% faster)


def test_string_with_only_colon_and_semicolon():
    # Test string with only colon and semicolon (no key/value)
    codeflash_output = maybe_convert_css_to_tuples(
        ":;"
    )  # 2.61μs -> 2.05μs (27.2% faster)


def test_string_with_empty_key_and_value():
    # Test a rule whose key and value are both empty after stripping
    codeflash_output = maybe_convert_css_to_tuples(
        ": ;"
    )  # 2.43μs -> 1.92μs (26.7% faster)


def test_string_with_numeric_key_and_value():
    # Test string with numeric key and value
    codeflash_output = maybe_convert_css_to_tuples(
        "123: 456;"
    )  # 2.50μs -> 1.87μs (33.3% faster)


def test_list_with_float_value():
    # Test input list with float value
    codeflash_output = maybe_convert_css_to_tuples(
        [("opacity", 0.5)]
    )  # 440ns -> 380ns (15.8% faster)


def test_string_with_special_characters():
    # Test string with special characters in key and value
    codeflash_output = maybe_convert_css_to_tuples(
        "font-family: 'Open Sans', sans-serif;"
    )  # 2.67μs -> 1.95μs (36.6% faster)


def test_string_with_multiple_semicolons_and_spaces():
    # Test string with multiple semicolons and spaces
    codeflash_output = maybe_convert_css_to_tuples(
        "  color: red; ; ; background: blue; ; "
    )  # 3.63μs -> 2.93μs (24.1% faster)


def test_string_with_unicode_characters():
    # Test string with unicode characters
    codeflash_output = maybe_convert_css_to_tuples(
        "content: '✓';"
    )  # 3.45μs -> 2.81μs (22.7% faster)


def test_list_with_empty_tuple():
    # Test input list with an empty tuple
    codeflash_output = maybe_convert_css_to_tuples(
        [("", "")]
    )  # 428ns -> 381ns (12.3% faster)


def test_string_with_key_only_and_colon():
    # Test string with key and colon but no value
    codeflash_output = maybe_convert_css_to_tuples(
        "color:;"
    )  # 2.44μs -> 1.75μs (39.3% faster)


def test_string_with_value_only_and_colon():
    # Test string with colon and value but no key
    codeflash_output = maybe_convert_css_to_tuples(
        ":red;"
    )  # 2.38μs -> 1.83μs (29.6% faster)


def test_string_with_multiple_rules_and_empty_rule():
    # Test string with multiple rules and an empty rule in the middle
    codeflash_output = maybe_convert_css_to_tuples(
        "color: red;;background: blue;"
    )  # 3.09μs -> 2.54μs (21.6% faster)


def test_list_with_mixed_types():
    # Test input list with mixed types (string and float)
    codeflash_output = maybe_convert_css_to_tuples(
        [("opacity", 0.5), ("color", "red")]
    )  # 438ns -> 434ns (0.922% faster)


def test_string_with_newline_characters():
    # Test string with newline characters
    codeflash_output = maybe_convert_css_to_tuples(
        "color: red;\nbackground: blue;"
    )  # 3.17μs -> 2.46μs (29.1% faster)


def test_string_with_tabs_and_spaces():
    # Test string with tabs and spaces
    codeflash_output = maybe_convert_css_to_tuples(
        "\tcolor: red;\tbackground: blue;"
    )  # 3.13μs -> 2.37μs (32.2% faster)


def test_string_with_multiple_colons_and_semicolons():
    # Test value containing multiple colons (a URL with a port); only the first colon separates key from value
    codeflash_output = maybe_convert_css_to_tuples(
        "url: http://example.com:8080;"
    )  # 2.77μs -> 1.76μs (57.2% faster)


# ----------- Large Scale Test Cases ------------


def test_large_number_of_rules():
    # Test performance and correctness with a large number of rules
    css_string = ";".join([f"prop{i}: val{i}" for i in range(500)]) + ";"
    expected = [(f"prop{i}", f"val{i}") for i in range(500)]
    codeflash_output = maybe_convert_css_to_tuples(
        css_string
    )  # 128μs -> 89.8μs (42.9% faster)


def test_large_list_input():
    # Test performance and correctness with a large input list
    input_list = [(f"prop{i}", f"val{i}") for i in range(500)]
    codeflash_output = maybe_convert_css_to_tuples(
        input_list
    )  # 450ns -> 425ns (5.88% faster)


def test_large_string_with_empty_rules():
    # Test large CSS string with many empty rules interspersed
    css_string = (
        ";".join([f"prop{i}: val{i}" if i % 2 == 0 else "" for i in range(500)]) + ";"
    )
    expected = [(f"prop{i}", f"val{i}") for i in range(0, 500, 2)]
    codeflash_output = maybe_convert_css_to_tuples(
        css_string
    )  # 73.8μs -> 52.7μs (40.1% faster)


def test_large_string_with_float_values():
    # Test large CSS string with float values
    css_string = ";".join([f"opacity{i}: {i * 0.1}" for i in range(500)]) + ";"
    expected = [(f"opacity{i}", f"{i * 0.1}") for i in range(500)]
    codeflash_output = maybe_convert_css_to_tuples(
        css_string
    )  # 133μs -> 89.6μs (48.8% faster)


def test_large_string_with_special_characters():
    # Test large CSS string with special characters
    css_string = (
        ";".join([f"font{i}: 'Open Sans {i}', sans-serif" for i in range(500)]) + ";"
    )
    expected = [(f"font{i}", f"'Open Sans {i}', sans-serif") for i in range(500)]
    codeflash_output = maybe_convert_css_to_tuples(
        css_string
    )  # 133μs -> 85.5μs (55.6% faster)


# ----------- Determinism Test ------------


def test_determinism():
    # Test that repeated calls with the same input produce the same output
    css_string = "color: red; background: blue;"
    codeflash_output = maybe_convert_css_to_tuples(css_string)
    result1 = codeflash_output  # 3.21μs -> 2.51μs (28.3% faster)
    codeflash_output = maybe_convert_css_to_tuples(css_string)
    result2 = codeflash_output  # 1.29μs -> 1.02μs (26.4% faster)


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from typing import Union

# imports
import pytest  # used for our unit tests
from pandas.io.formats.style_render import maybe_convert_css_to_tuples

CSSPair = tuple[str, Union[str, float]]
CSSList = list[CSSPair]
CSSProperties = Union[str, CSSList]

# unit tests

# 1. Basic Test Cases


def test_basic_single_rule():
    # Single CSS rule
    codeflash_output = maybe_convert_css_to_tuples(
        "color:red;"
    )  # 2.31μs -> 1.74μs (32.5% faster)


def test_basic_multiple_rules():
    # Multiple CSS rules
    codeflash_output = maybe_convert_css_to_tuples(
        "color:red; border:1px solid black;"
    )  # 2.99μs -> 2.28μs (31.3% faster)


def test_basic_with_spaces():
    # CSS rules with extra spaces
    codeflash_output = maybe_convert_css_to_tuples(
        " color : red ; border : 1px solid black ; "
    )  # 3.37μs -> 2.60μs (29.8% faster)


def test_basic_empty_string():
    # Empty string should return empty list
    codeflash_output = maybe_convert_css_to_tuples("")  # 1.28μs -> 961ns (32.9% faster)


def test_basic_list_input():
    # Already a list of tuples, should return unchanged
    style = [("color", "red"), ("border", "1px solid black")]
    codeflash_output = maybe_convert_css_to_tuples(
        style
    )  # 407ns -> 418ns (2.63% slower)


def test_basic_float_value():
    # Float value in tuple
    style = [("opacity", 0.5)]
    codeflash_output = maybe_convert_css_to_tuples(
        style
    )  # 442ns -> 400ns (10.5% faster)


# 2. Edge Test Cases


def test_edge_no_colon_raises():
    # String with no colon should raise ValueError
    with pytest.raises(ValueError):
        maybe_convert_css_to_tuples("colorred;")  # 1.24μs -> 1.20μs (3.42% faster)


def test_edge_trailing_semicolon():
    # Trailing semicolon should be ignored
    codeflash_output = maybe_convert_css_to_tuples(
        "color:red;"
    )  # 2.54μs -> 1.79μs (41.8% faster)


def test_edge_multiple_semicolons():
    # Multiple consecutive semicolons should be ignored
    codeflash_output = maybe_convert_css_to_tuples(
        "color:red;;border:1px solid black;;;"
    )  # 3.18μs -> 2.44μs (30.1% faster)


def test_edge_empty_rules_between_semicolons():
    # Empty rules between semicolons should be ignored
    codeflash_output = maybe_convert_css_to_tuples(
        "color:red; ; border:1px solid black; ;"
    )  # 3.38μs -> 2.70μs (24.9% faster)


def test_edge_colon_in_value():
    # Colon in value should be preserved
    codeflash_output = maybe_convert_css_to_tuples(
        "background:url(http://foo.com/img.png);"
    )  # 2.65μs -> 1.69μs (56.8% faster)


def test_edge_colon_in_attribute_name():
    # Colon inside a quoted value: only the first colon separates the attribute name from the value
    codeflash_output = maybe_convert_css_to_tuples(
        "font-family:'Open:Sans';"
    )  # 2.52μs -> 1.81μs (38.7% faster)


def test_edge_empty_list_input():
    # Empty list input should return itself
    codeflash_output = maybe_convert_css_to_tuples([])  # 421ns -> 418ns (0.718% faster)


def test_edge_non_string_non_list_input():
    # Non-string, non-list input should return itself
    # (not specified, but let's test e.g. None)
    codeflash_output = maybe_convert_css_to_tuples(
        None
    )  # 405ns -> 382ns (6.02% faster)


def test_edge_value_with_leading_and_trailing_spaces():
    # Value with leading/trailing spaces should be stripped
    codeflash_output = maybe_convert_css_to_tuples(
        "color:  red  ;"
    )  # 2.63μs -> 2.01μs (30.7% faster)


def test_edge_attribute_with_leading_and_trailing_spaces():
    # Attribute with leading/trailing spaces should be stripped
    codeflash_output = maybe_convert_css_to_tuples(
        "  color  :red;"
    )  # 2.52μs -> 1.99μs (26.7% faster)


def test_edge_attribute_and_value_with_spaces():
    # Both attribute and value with spaces
    codeflash_output = maybe_convert_css_to_tuples(
        "  color  :  red  ;"
    )  # 2.69μs -> 2.09μs (28.3% faster)


def test_edge_semicolon_in_value():
    # Semicolon in value should split the rule
    # CSS does not allow semicolons in values, but let's test what happens
    # "color:red;something:foo;bar" -> [("color","red"),("something","foo"),("bar","")]
    codeflash_output = maybe_convert_css_to_tuples(
        "color:red;something:foo;bar;"
    )  # 3.64μs -> 2.85μs (27.8% faster)


def test_edge_value_is_empty():
    # Value is empty
    codeflash_output = maybe_convert_css_to_tuples(
        "color:;"
    )  # 2.35μs -> 1.87μs (25.4% faster)


def test_edge_attribute_is_empty():
    # Attribute is empty
    codeflash_output = maybe_convert_css_to_tuples(
        ":red;"
    )  # 2.38μs -> 1.82μs (30.7% faster)


def test_edge_attribute_and_value_are_empty():
    # Both attribute and value are empty
    codeflash_output = maybe_convert_css_to_tuples(
        ":;"
    )  # 2.30μs -> 1.75μs (30.8% faster)


def test_edge_only_colon():
    # Only colon
    codeflash_output = maybe_convert_css_to_tuples(
        ":"
    )  # 2.13μs -> 1.55μs (37.3% faster)


def test_edge_multiple_colons():
    # Multiple colons in value
    codeflash_output = maybe_convert_css_to_tuples(
        "foo:bar:baz;"
    )  # 2.42μs -> 1.79μs (35.4% faster)


def test_edge_unicode_and_non_ascii():
    # Unicode characters in attribute and value
    codeflash_output = maybe_convert_css_to_tuples(
        "цвет:красный;"
    )  # 3.31μs -> 2.65μs (25.0% faster)


def test_edge_attribute_with_dash_and_underscore():
    # Attribute with dash and underscore
    codeflash_output = maybe_convert_css_to_tuples(
        "font-weight:bold;_custom:42;"
    )  # 2.92μs -> 2.24μs (30.4% faster)


def test_edge_value_is_float_string():
    # Value is a float string
    codeflash_output = maybe_convert_css_to_tuples(
        "opacity:0.5;"
    )  # 2.35μs -> 1.74μs (34.8% faster)


def test_edge_value_is_integer_string():
    # Value is an integer string
    codeflash_output = maybe_convert_css_to_tuples(
        "z-index:10;"
    )  # 2.31μs -> 1.67μs (38.8% faster)


def test_edge_attribute_with_special_chars():
    # Attribute with special chars (CSS custom property)
    codeflash_output = maybe_convert_css_to_tuples(
        "--my-var:42;"
    )  # 2.28μs -> 1.71μs (33.3% faster)


# 3. Large Scale Test Cases


def test_large_many_rules():
    # Large number of rules
    n = 1000
    css = ";".join([f"attr{i}:val{i}" for i in range(n)]) + ";"
    expected = [(f"attr{i}", f"val{i}") for i in range(n)]
    codeflash_output = maybe_convert_css_to_tuples(css)  # 228μs -> 148μs (54.5% faster)


def test_large_many_rules_with_spaces():
    # Large number of rules with spaces
    n = 1000
    css = ";".join([f"  attr{i}  :  val{i}  " for i in range(n)]) + ";"
    expected = [(f"attr{i}", f"val{i}") for i in range(n)]
    codeflash_output = maybe_convert_css_to_tuples(css)  # 262μs -> 179μs (45.6% faster)


def test_large_empty_rules_between():
    # Large number of rules with empty rules between
    n = 500
    css = ";".join([f"attr{i}:val{i}; ;" for i in range(n)]) + ";"
    expected = [(f"attr{i}", f"val{i}") for i in range(n)]
    codeflash_output = maybe_convert_css_to_tuples(css)  # 149μs -> 110μs (35.8% faster)


def test_large_list_input():
    # Large list input should return itself
    style = [(f"attr{i}", f"val{i}") for i in range(1000)]
    codeflash_output = maybe_convert_css_to_tuples(
        style
    )  # 476ns -> 470ns (1.28% faster)


def test_large_unicode_rules():
    # Large set of unicode rules
    n = 500
    css = ";".join([f"цвет{i}:значение{i}" for i in range(n)]) + ";"
    expected = [(f"цвет{i}", f"значение{i}") for i in range(n)]
    codeflash_output = maybe_convert_css_to_tuples(
        css
    )  # 158μs -> 96.3μs (64.3% faster)


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-maybe_convert_css_to_tuples-mio746gh` and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 December 2, 2025 06:27
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Dec 2, 2025