@adamchainz
Contributor

Changes in this PR

Like #2656, I spotted this function taking ~0.9% of request time in a project using mongoengine.

[`bytes.hex()`](https://docs.python.org/3/library/stdtypes.html#bytes.hex) was added in Python 3.5 and returns the same string as `binascii.hexlify(...).decode()`, only faster, so I've replaced it here. (The `binascii.hexlify()` docs also now point to `bytes.hex()` as a faster alternative.)
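As a quick sanity check of the equivalence claimed above, here is a minimal sketch comparing the two spellings on random 12-byte values. The helper names `hex_old` and `hex_new` are illustrative only and are not part of PyMongo or this PR.

```python
import binascii
import os


def hex_old(raw: bytes) -> str:
    # Previous spelling: hexlify() returns bytes, so decode to get a str.
    return binascii.hexlify(raw).decode()


def hex_new(raw: bytes) -> str:
    # New spelling: bytes.hex() returns a str directly.
    return raw.hex()


# Compare both spellings on a batch of random 12-byte values
# (12 bytes is the length used in the benchmark script).
samples = [os.urandom(12) for _ in range(1_000)]
for raw in samples:
    text = hex_new(raw)
    assert text == hex_old(raw)        # identical output
    assert len(text) == 24             # two hex digits per byte
    assert bytes.fromhex(text) == raw  # round-trips back to the same bytes
```

Both functions produce lowercase hex digits, so the change should not be observable to callers of `str()` on the object.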

Test Plan

Benchmarked again using [`richbench`](https://github.com/tonybaloney/rich-bench) with the script below.

`bench_objectid_str.py`

```python
from __future__ import annotations

import binascii
import os

class ObjectIdBefore:
    __slots__ = ("__id",)

    def __init__(self) -> None:
        self.__id = os.urandom(12)

    def __str__(self) -> str:
        return binascii.hexlify(self.__id).decode()

class ObjectIdAfter:
    __slots__ = ("__id",)

    def __init__(self) -> None:
        self.__id = os.urandom(12)

    def __str__(self) -> str:
        return self.__id.hex()

test_oids_before = [ObjectIdBefore() for _ in range(100)]
test_oids_after = [ObjectIdAfter() for _ in range(100)]

def bench_str_before():
    for oid in test_oids_before:
        for _ in range(100_000):
            str(oid)

def bench_str_after():
    for oid in test_oids_after:
        for _ in range(100_000):
            str(oid)

__benchmarks__ = [
    (bench_str_before, bench_str_after, "str(ObjectId) - hex conversion"),
]
```

Results show a ~1.4x speedup:

```
$ uvx -p 3.13 richbench .
                                            Benchmarks, repeat=5, number=5
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃                      Benchmark ┃ Min     ┃ Max     ┃ Mean    ┃ Min (+)         ┃ Max (+)         ┃ Mean (+)        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ str(ObjectId) - hex conversion │ 3.000   │ 3.061   │ 3.019   │ 2.169 (1.4x)    │ 2.231 (1.4x)    │ 2.196 (1.4x)    │
└────────────────────────────────┴─────────┴─────────┴─────────┴─────────────────┴─────────────────┴─────────────────┘
```
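For anyone who wants the unrounded ratios behind the "1.4x" column, the small sketch below recomputes them from the timings in the table. The numbers are copied from the output above; the dictionary names are illustrative.

```python
# Timings copied from the richbench table above (same units as reported).
before = {"min": 3.000, "max": 3.061, "mean": 3.019}
after = {"min": 2.169, "max": 2.231, "mean": 2.196}

# Speedup = old time / new time, per column.
speedups = {name: before[name] / after[name] for name in before}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x")
# prints:
# min: 1.38x
# max: 1.37x
# mean: 1.37x

# Equivalent view: fraction of the original time saved, based on the mean.
time_saved = 1 - after["mean"] / before["mean"]
print(f"time saved: {time_saved:.1%}")  # prints "time saved: 27.3%"
```

All three columns round to the 1.4x that richbench displays, which corresponds to roughly 27% less time spent in `__str__()` in this benchmark.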

Checklist

Checklist for Author

  • Did you update the changelog (if necessary)?
  • Is there test coverage?
  • Is any followup work tracked in a JIRA ticket? If so, add link(s).

Checklist for Reviewer

  • Does the title of the PR reference a JIRA Ticket?
  • Do you fully understand the implementation? (Would you be comfortable explaining how this code works to someone else?)
  • Is all relevant documentation (README or docstring) updated?

@adamchainz adamchainz requested a review from a team as a code owner December 18, 2025 00:27
@adamchainz adamchainz requested a review from Jibola December 18, 2025 00:27
@blink1073 blink1073 changed the title from "Optimize ObjectId.__str__()" to "PYTHON-5679 Optimize ObjectId.__str__()" Dec 18, 2025
Member

@blink1073 blink1073 left a comment

LGTM!

@blink1073 blink1073 merged commit e507078 into mongodb:master Dec 18, 2025
78 of 80 checks passed
@adamchainz adamchainz deleted the optimize_objectid_str branch December 18, 2025 20:46