Commit e4c8ce2
committed
Optimize str.translate lookup for ASCII characters
This commit introduces two optimizations to charmaptranslate_lookup:
1. Use _PyLong_FromUnsignedChar for characters 0-255 instead of
PyLong_FromLong. This uses the small int singleton cache, avoiding
memory allocation and deallocation for the key object.
2. Use PyDict_GetItemRef directly when the mapping is a dict, instead
of the more general PyMapping_GetOptionalItem. This avoids the
overhead of the generic mapping protocol.
These optimizations reduce instruction count by approximately 7.4%
for ASCII translation workloads (measured with callgrind).
Note: For the specific use case of PEP 503 normalization (lowercase +
character replacement), str.lower().replace().replace() is still
faster than str.translate() because it uses specialized C code paths
that avoid dictionary lookups entirely. However, these optimizations
help str.translate() performance for general use cases.1 parent 7ca9e7a commit e4c8ce2
1 file changed
Lines changed: 24 additions & 4 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
9115 | 9115 | | |
9116 | 9116 | | |
9117 | 9117 | | |
9118 | | - | |
| 9118 | + | |
9119 | 9119 | | |
| 9120 | + | |
9120 | 9121 | | |
9121 | | - | |
9122 | | - | |
9123 | | - | |
| 9122 | + | |
| 9123 | + | |
| 9124 | + | |
| 9125 | + | |
| 9126 | + | |
| 9127 | + | |
| 9128 | + | |
| 9129 | + | |
| 9130 | + | |
| 9131 | + | |
| 9132 | + | |
| 9133 | + | |
| 9134 | + | |
| 9135 | + | |
| 9136 | + | |
| 9137 | + | |
| 9138 | + | |
| 9139 | + | |
| 9140 | + | |
| 9141 | + | |
| 9142 | + | |
9124 | 9143 | | |
| 9144 | + | |
9125 | 9145 | | |
9126 | 9146 | | |
9127 | 9147 | | |
| |||
0 commit comments