safe-docx vs python-docx

python-docx is the standard Python library for creating .docx files and making straightforward edits to existing ones.

When choosing a library to edit existing Word documents, the comparison that matters is behavioral: given the same document and the same operation, what does each library actually produce? Feature checklists answer what a library claims; a conformance suite answers what it does. The results below come from docx-platform-tests, an open test suite in the web-platform-tests tradition where every scenario asserts behavior derivable from a cited clause of ECMA-376 (the Office Open XML standard) — not from either library’s internals.

Each scenario runs unchanged against both libraries through a small adapter. An adapter may decline an operation its library cannot perform — reported as Unsupported rather than Fail, because the suite’s contribution rules forbid implementing missing library capabilities inside the adapter. An Unsupported cell measures the library, not the adapter author.
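The reporting rule above can be sketched in a few lines. This is an illustration only, not the suite's actual harness: `Outcome`, `UnsupportedOperation`, and `run_scenario` are hypothetical names introduced here.

```python
from enum import Enum


class Outcome(Enum):
    PASS = "Pass"
    FAIL = "Fail"
    UNSUPPORTED = "Unsupported"


class UnsupportedOperation(Exception):
    """Hypothetical signal: the adapter's library has no API for this operation."""


def run_scenario(operation, check):
    """Run one scenario through an adapter call and classify the result.

    `operation` is the adapter call; `check` is the spec-derived assertion.
    A declined operation is reported as Unsupported, never as Fail.
    """
    try:
        result = operation()
    except UnsupportedOperation:
        return Outcome.UNSUPPORTED
    return Outcome.PASS if check(result) else Outcome.FAIL
```

Under this sketch an adapter declines by raising rather than by emulating the missing capability, which keeps the cell a measurement of the library.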

One framing note: this matrix measures a single axis — conformance on spec-anchored editing operations against existing documents. python-docx is a generation-first library and is widely used for building documents from scratch in Python; nothing below speaks to that use, where it remains a sound choice.

Results

| Scenario | safe-docx v0.10.0+git.8a748ffd31c1 | python-docx v1.2.0 |
| --- | --- | --- |
| acceptDeletionsRemovesDelContent. ECMA-376 edition 5, Part 1 § 17.13.5.14 (del (Deleted Run Content)) | Pass | Unsupported: python-docx has no tracked-changes (revision) API |
| acceptInsertionsUnwrapsInsWrappers. ECMA-376 edition 5, Part 1 § 17.13.5.18 (ins (Inserted Run Content)) | Pass | Unsupported: python-docx has no tracked-changes (revision) API |
| replaceFirstOccurrencePreservesOffsets. ECMA-376 edition 5, Part 1 § 17.3.3.31 (t (Text)) | Pass | Pass |

Suite run of 2026-06-11. Full matrix across all implementations: cross-implementation conformance.

Tracked changes

Tracked changes (called revisions in the standard) are how Word records edits for later review: inserted content is wrapped in w:ins elements and deleted content in w:del elements (ECMA-376 edition 5, Part 1 § 17.13.5). Accepting an insertion unwraps the w:ins and keeps the content; accepting a deletion removes the w:del and its content.
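As a rough illustration of those two accept rules, the sketch below applies them to a hand-written paragraph fragment using only the standard library. It is not safe-docx's implementation; `accept_revisions` is a hypothetical helper, and it handles only run-level w:ins and w:del wrappers, ignoring moves, paragraph-mark revisions, and formatting changes.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def accept_revisions(parent):
    """Accept tracked changes in place: unwrap w:ins, remove w:del with its content."""
    for child in list(parent):
        accept_revisions(child)  # resolve nested wrappers first
        if child.tag == f"{{{W}}}del":
            parent.remove(child)
        elif child.tag == f"{{{W}}}ins":
            index = list(parent).index(child)
            parent.remove(child)
            # Re-insert the wrapper's children where the wrapper used to be.
            for offset, kept in enumerate(list(child)):
                parent.insert(index + offset, kept)


# Hand-written fixture (an assumption for illustration, not a suite fixture).
PARAGRAPH = f"""<w:p xmlns:w="{W}">
  <w:r><w:t xml:space="preserve">The </w:t></w:r>
  <w:del w:id="1" w:author="A"><w:r><w:delText>old</w:delText></w:r></w:del>
  <w:ins w:id="2" w:author="A"><w:r><w:t>new</w:t></w:r></w:ins>
  <w:r><w:t xml:space="preserve"> clause</w:t></w:r>
</w:p>"""

p = ET.fromstring(PARAGRAPH)
accept_revisions(p)
text = "".join(t.text for t in p.iter(f"{{{W}}}t"))
```

After the call, `text` reads "The new clause" and the paragraph contains no w:ins or w:del elements.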

python-docx reports Unsupported on both accept scenarios because it has no revision API: its object model does not surface runs nested inside w:ins or w:del wrappers — a paragraph’s .runs yields only direct run children, so revision-wrapped text is invisible to it. A redline workflow (programmatically accepting or rejecting reviewer edits) is not expressible with the library’s public API.
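The visibility gap can be reproduced without python-docx itself. The sketch below does not call the library; it mirrors the direct-children lookup described above with the standard library's ElementTree, on a hand-written fragment, and `run_text` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

p = ET.fromstring(f"""<w:p xmlns:w="{W}">
  <w:r><w:t xml:space="preserve">Payment is due in </w:t></w:r>
  <w:ins w:id="1" w:author="Reviewer"><w:r><w:t>45</w:t></w:r></w:ins>
  <w:r><w:t xml:space="preserve"> days.</w:t></w:r>
</w:p>""")


def run_text(runs):
    """Concatenate the w:t text of the given runs."""
    return "".join(t.text for r in runs for t in r.findall("w:t", NS))


direct_runs = p.findall("w:r", NS)    # runs that are direct children of w:p
all_runs = p.findall(".//w:r", NS)    # every run, including those inside w:ins

direct_text = run_text(direct_runs)
all_text = run_text(all_runs)
```

Here `direct_text` omits the inserted "45" while `all_text` includes it, which is the sense in which revision-wrapped text is invisible to a direct-children view.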

safe-docx passes both scenarios: accept and reject are first-class operations, and the same engine is differentially tested against a formal Lean model and a LibreOffice oracle in its own repository.

Find and replace

Both libraries pass the find-replace scenario, which asserts that the replacement text lands at the exact character offset of the match. The scenario’s fixture keeps the matched text inside a single run (a run is WordprocessingML’s unit of identically formatted text) because python-docx’s replace surface is per-run. A match that spans run boundaries is routine in real documents, since Word splits runs freely as text is edited, and handling it requires reassembling runs. The suite’s glue-not-algorithms rule leaves that work to the library rather than the adapter.
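A toy model shows why the fixture matters. The sketch below represents a paragraph as a plain list of run texts and is not python-docx code; `replace_first_per_run` is a hypothetical helper standing in for any replace that inspects one run at a time.

```python
def replace_first_per_run(runs, old, new):
    """Replace the first occurrence of `old` that lies wholly inside one run.

    `runs` is a list of run texts, mutated in place. Returns True on a hit.
    """
    for i, text in enumerate(runs):
        if old in text:
            runs[i] = text.replace(old, new, 1)
            return True
    return False


single_run = ["The Buyer shall pay."]
split_runs = ["The Bu", "yer shall pay."]  # same text, split mid-word

hit_single = replace_first_per_run(single_run, "Buyer", "Purchaser")
hit_split = replace_first_per_run(split_runs, "Buyer", "Purchaser")
```

The single-run paragraph is updated, while the split paragraph is left untouched even though its visible text is identical.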

safe-docx’s replace operates on the paragraph’s concatenated text and splices runs as needed, so run boundaries do not constrain the match.
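One way to picture a paragraph-level replace is sketched below. It is not safe-docx's code: `replace_first_across_runs` is a hypothetical helper over `[text, style]` pairs, and giving the replacement the formatting of the run where the match starts is an assumption made for the example.

```python
def replace_first_across_runs(runs, old, new):
    """Replace the first occurrence of `old` in the concatenated run text.

    `runs` is a list of [text, style] pairs, mutated in place. The replacement
    goes into the run where the match starts; later runs lose only the
    characters that belonged to the match. Returns the match offset, or -1.
    """
    if not old:
        return -1
    full = "".join(text for text, _ in runs)
    start = full.find(old)
    if start < 0:
        return -1
    end = start + len(old)

    pos = 0
    inserted = False
    for run in runs:
        text = run[0]
        run_start, run_end = pos, pos + len(text)
        pos = run_end
        if run_end <= start or run_start >= end:
            continue  # this run does not overlap the match
        keep_before = text[: max(start - run_start, 0)]
        keep_after = text[end - run_start:]
        run[0] = keep_before + ("" if inserted else new) + keep_after
        inserted = True
    return start


runs = [["The Bu", "bold"], ["yer shall pay.", "normal"]]
offset = replace_first_across_runs(runs, "Buyer", "Purchaser")
result = "".join(text for text, _ in runs)
```

In this example the match starts at offset 4, the replacement lands at that same offset in the concatenated text, and both runs keep their original style labels.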

Where to dig deeper