DOCX Comparison
11 requirements · 31 scenarios
Correlation Status Enumeration
JR-docx-comparison-001
The system SHALL provide a CorrelationStatus enum with the following values: Nil, Normal, Unknown, Inserted, Deleted, Equal, Group, MovedSource, MovedDestination, FormatChanged.
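As a rough illustration, the enum could be declared as follows. The member names come from the requirement; the string values and the isTrackedChange helper are assumptions made for this sketch only.

```typescript
// Member names are taken from the requirement; the string values are an assumption.
enum CorrelationStatus {
  Nil = "Nil",
  Normal = "Normal",
  Unknown = "Unknown",
  Inserted = "Inserted",
  Deleted = "Deleted",
  Equal = "Equal",
  Group = "Group",
  MovedSource = "MovedSource",
  MovedDestination = "MovedDestination",
  FormatChanged = "FormatChanged",
}

// Hypothetical helper (not part of the requirement): statuses that would
// surface as tracked changes in the comparison output.
function isTrackedChange(status: CorrelationStatus): boolean {
  switch (status) {
    case CorrelationStatus.Inserted:
    case CorrelationStatus.Deleted:
    case CorrelationStatus.MovedSource:
    case CorrelationStatus.MovedDestination:
    case CorrelationStatus.FormatChanged:
      return true;
    default:
      return false;
  }
}
```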
6 test scenarios
- Status assigned during comparison JR-docx-comparison-001.1
- Status for unmatched atoms JR-docx-comparison-001.2
- Status for deleted content JR-docx-comparison-001.3
- Status for moved source content JR-docx-comparison-001.4
- Status for moved destination content JR-docx-comparison-001.5
- Status for format-changed content JR-docx-comparison-001.6
Legal Numbering Continuation Pattern Detection
JR-docx-comparison-007
The system SHALL detect "continuation patterns" in legal numbering where a paragraph at ilvl > 0 continues a flat sequence rather than creating a nested hierarchy. When detected, the system SHALL use the effective level (level 0) properties instead of the declared level.
A continuation pattern exists when:
1. The paragraph is the first at this level in the current sequence, AND
2. The level's start value equals the parent level's counter + 1
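The two conditions can be sketched as a small predicate. The NumberedParagraph shape and its field names are hypothetical; they stand in for whatever state the numbering tracker actually keeps.

```typescript
// Hypothetical view of a numbered paragraph; field names are assumptions.
interface NumberedParagraph {
  ilvl: number;            // declared level from the numbering properties
  isFirstAtLevel: boolean; // first paragraph at this level in the current sequence
  levelStart: number;      // start value of the declared level
  parentCounter: number;   // current counter of the parent level
}

// Both conditions from the requirement must hold, and only for ilvl > 0.
function isContinuationPattern(p: NumberedParagraph): boolean {
  if (p.ilvl <= 0) return false;
  return p.isFirstAtLevel && p.levelStart === p.parentCounter + 1;
}

// Level whose properties should be used for rendering: level 0 for a
// continuation pattern, otherwise the declared level.
function effectiveLevel(p: NumberedParagraph): number {
  return isContinuationPattern(p) ? 0 : p.ilvl;
}
```

For example, after items 1, 2 and 3 at level 0, a first paragraph at ilvl 1 whose level starts at 4 would be treated as item 4 of the flat sequence, while the same paragraph with a start value of 1 would stay nested.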
3 test scenarios
- Orphan list item renders with parent format JR-docx-comparison-007.1
- Proper nested list renders hierarchically JR-docx-comparison-007.2
- Continuation pattern inherits formatting JR-docx-comparison-007.3
Footnote Sequential Numbering
JR-docx-comparison-008
The system SHALL calculate footnote display numbers sequentially based on document order, NOT using raw XML w:id attribute values. The w:id is a reference identifier linking footnoteReference to footnote definitions; display numbers are determined by the order footnotes appear in the document flow.
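A minimal sketch of the numbering rule, assuming the caller has already collected the w:id values of footnoteReference elements in document order and knows which ids are reserved. The function name, its parameters, and the first-occurrence-wins handling of repeated ids are all assumptions.

```typescript
// Maps each footnote w:id to its display number, based purely on the order
// in which references appear. Reserved ids are skipped (assumed to be
// supplied by the caller, for example the separator footnotes).
function buildFootnoteDisplayNumbers(
  referenceIdsInDocumentOrder: string[],
  reservedIds: string[],
): Map<string, number> {
  const displayNumbers = new Map<string, number>();
  let next = 1;
  for (const id of referenceIdsInDocumentOrder) {
    if (reservedIds.indexOf(id) !== -1) continue; // excluded from numbering
    if (displayNumbers.has(id)) continue;          // assumption: first reference wins
    displayNumbers.set(id, next);
    next += 1;
  }
  return displayNumbers;
}
```

With references appearing in the order 5, 2, 9, the footnotes would display as 1, 2 and 3 regardless of their w:id values.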
3 test scenarios
- First footnote displays as 1 JR-docx-comparison-008.1
- Sequential numbering ignores XML IDs JR-docx-comparison-008.2
- Reserved footnote IDs excluded from numbering JR-docx-comparison-008.3
Move Detection Algorithm
JR-docx-comparison-010
The system SHALL provide a detectMovesInAtomList() function that identifies relocated content after LCS comparison. The algorithm:
1. Groups consecutive atoms by correlationStatus into blocks (Deleted blocks, Inserted blocks)
2. Extracts text from each block by joining content element values
3. Filters blocks by minimum word count (configurable, default: 3)
4. Calculates Jaccard word similarity between deleted and inserted blocks
5. Converts matching pairs (above threshold) to MovedSource and MovedDestination
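The five steps can be sketched roughly as below. This is not the actual detectMovesInAtomList() implementation: the Atom and Block shapes, joining atom text with a space, the greedy best-match pairing, and treating a score equal to the threshold as a match are all assumptions.

```typescript
type Status = "Equal" | "Deleted" | "Inserted" | "MovedSource" | "MovedDestination";

interface Atom { text: string; correlationStatus: Status; } // simplified, hypothetical
interface Block { status: Status; atoms: Atom[]; }

// Step 1: group consecutive atoms that share a status.
function groupIntoBlocks(atoms: Atom[]): Block[] {
  const blocks: Block[] = [];
  for (const atom of atoms) {
    const last = blocks[blocks.length - 1];
    if (last && last.status === atom.correlationStatus) last.atoms.push(atom);
    else blocks.push({ status: atom.correlationStatus, atoms: [atom] });
  }
  return blocks;
}

// Step 2: join the block's text and split it into words.
function blockWords(block: Block, caseInsensitive: boolean): string[] {
  const text = block.atoms.map((a) => a.text).join(" ");
  const normalized = caseInsensitive ? text.toLowerCase() : text;
  return normalized.split(/\s+/).filter((w) => w.length > 0);
}

// Step 4: Jaccard similarity = |A ∩ B| / |A ∪ B| over distinct words.
function jaccard(a: string[], b: string[]): number {
  const setA = new Set(a);
  const setB = new Set(b);
  let shared = 0;
  setA.forEach((w) => { if (setB.has(w)) shared += 1; });
  const union = setA.size + setB.size - shared;
  return union === 0 ? 0 : shared / union;
}

function detectMovesSketch(
  atoms: Atom[], threshold = 0.8, minWords = 3, caseInsensitive = false,
): void {
  const blocks = groupIntoBlocks(atoms);
  // Step 3: drop blocks below the minimum word count.
  const bigEnough = (b: Block) => blockWords(b, caseInsensitive).length >= minWords;
  const deleted = blocks.filter((b) => b.status === "Deleted" && bigEnough(b));
  const inserted = blocks.filter((b) => b.status === "Inserted" && bigEnough(b));
  const used = new Set<Block>();
  for (const del of deleted) {
    let best: Block | null = null;
    let bestScore = 0;
    for (const ins of inserted) {
      if (used.has(ins)) continue;
      const score = jaccard(blockWords(del, caseInsensitive), blockWords(ins, caseInsensitive));
      if (score >= threshold && score > bestScore) { best = ins; bestScore = score; }
    }
    if (best === null) continue;
    used.add(best);
    // Step 5: convert the matched pair in place.
    del.atoms.forEach((a) => { a.correlationStatus = "MovedSource"; });
    best.atoms.forEach((a) => { a.correlationStatus = "MovedDestination"; });
  }
}
```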
3 test scenarios
- Move detected between similar blocks JR-docx-comparison-010.1
- Short blocks ignored JR-docx-comparison-010.2
- Below threshold treated as separate changes JR-docx-comparison-010.3
Move Detection Settings
JR-docx-comparison-012
The system SHALL provide configurable settings for move detection:
- detectMoves: Enable/disable move detection (default: true)
- moveSimilarityThreshold: Jaccard threshold for move matching (default: 0.8)
- moveMinimumWordCount: Minimum words for move consideration (default: 3)
- caseInsensitive: Case-insensitive similarity matching (default: false)
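One possible shape for these settings is sketched here. The property names and defaults come from the list above; the MoveDetectionSettings interface name, the DEFAULT_MOVE_SETTINGS constant, and the resolveMoveSettings helper are hypothetical.

```typescript
// Property names and defaults follow the requirement; the type name is an assumption.
interface MoveDetectionSettings {
  detectMoves: boolean;
  moveSimilarityThreshold: number;
  moveMinimumWordCount: number;
  caseInsensitive: boolean;
}

const DEFAULT_MOVE_SETTINGS: MoveDetectionSettings = {
  detectMoves: true,
  moveSimilarityThreshold: 0.8,
  moveMinimumWordCount: 3,
  caseInsensitive: false,
};

// Hypothetical helper: caller overrides win, everything else keeps its default.
function resolveMoveSettings(
  overrides: Partial<MoveDetectionSettings> = {},
): MoveDetectionSettings {
  return { ...DEFAULT_MOVE_SETTINGS, ...overrides };
}
```

A caller that only wants a looser match could pass { moveSimilarityThreshold: 0.6 } and leave the other three values at their defaults.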
2 test scenarios
- Move detection disabled JR-docx-comparison-012.1
- Custom threshold applied JR-docx-comparison-012.2
OpenXML Move Markup Generation
JR-docx-comparison-013
The system SHALL generate native Word move tracking markup when moves are detected:
For moved source (content moved FROM):
- w:moveFromRangeStart with w:id, w:name, w:author, w:date
- w:moveFrom containing the moved content
- w:moveFromRangeEnd with matching w:id
For moved destination (content moved TO):
- w:moveToRangeStart with w:id, w:name, w:author, w:date
- w:moveTo containing the moved content
- w:moveToRangeEnd with matching w:id
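The element structure above could be produced by a string builder along these lines. This is a sketch only: the MoveMeta shape and the moveMarkup and escapeXml helpers are hypothetical, and the w:id, w:author and w:date attributes on w:moveFrom / w:moveTo are added here following the usual WordprocessingML tracked-change pattern rather than anything stated in the requirement.

```typescript
// Hypothetical metadata for one side of a move.
interface MoveMeta {
  rangeId: number;    // shared by the RangeStart / RangeEnd pair
  revisionId: number; // id of the w:moveFrom / w:moveTo wrapper (assumption)
  name: string;       // move name, assumed to be shared by source and destination
  author: string;
  date: string;       // ISO 8601 timestamp
}

function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Builds RangeStart, the wrapper with a single run, and RangeEnd for either side.
function moveMarkup(kind: "From" | "To", text: string, meta: MoveMeta): string {
  const who = `w:author="${escapeXml(meta.author)}" w:date="${escapeXml(meta.date)}"`;
  return [
    `<w:move${kind}RangeStart w:id="${meta.rangeId}" w:name="${escapeXml(meta.name)}" ${who}/>`,
    `<w:move${kind} w:id="${meta.revisionId}" ${who}>`,
    `<w:r><w:t xml:space="preserve">${escapeXml(text)}</w:t></w:r>`,
    `</w:move${kind}>`,
    `<w:move${kind}RangeEnd w:id="${meta.rangeId}"/>`,
  ].join("");
}
```

Calling moveMarkup("From", ...) and moveMarkup("To", ...) with different rangeId values keeps each RangeStart paired with its own RangeEnd.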
3 test scenarios
- Move source markup structure JR-docx-comparison-013.1
- Move destination markup structure JR-docx-comparison-013.2
- Range IDs properly paired JR-docx-comparison-013.3
Format Change Info Interface
JR-docx-comparison-014
The system SHALL provide a FormatChangeInfo interface with:
- oldRunProperties: The w:rPr element from the original document (may be null)
- newRunProperties: The w:rPr element from the modified document (may be null)
- changedProperties: Array of friendly property names that differ (e.g., "bold", "italic")
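A sketch of the interface and of one way changedProperties might be filled in. The three property names come from the requirement; the XmlElementLike type, the FRIENDLY_NAMES table (only "bold" and "italic" are named in the source), and the describeFormatChange helper are assumptions.

```typescript
// Hypothetical stand-in for whatever XML element type the system uses.
interface XmlElementLike {
  name: string;
  attributes: Record<string, string>;
  children: XmlElementLike[];
}

interface FormatChangeInfo {
  oldRunProperties: XmlElementLike | null;
  newRunProperties: XmlElementLike | null;
  changedProperties: string[];
}

// Assumed mapping; unknown elements fall back to their XML name.
const FRIENDLY_NAMES: Record<string, string> = { "w:b": "bold", "w:i": "italic" };

// One signature per child of w:rPr (attribute key order is not normalized here).
function childSignatures(rPr: XmlElementLike | null): Map<string, string> {
  const signatures = new Map<string, string>();
  if (rPr !== null) {
    for (const child of rPr.children) {
      signatures.set(child.name, JSON.stringify(child.attributes));
    }
  }
  return signatures;
}

function describeFormatChange(
  oldRPr: XmlElementLike | null,
  newRPr: XmlElementLike | null,
): FormatChangeInfo {
  const before = childSignatures(oldRPr);
  const after = childSignatures(newRPr);
  const names: string[] = [];
  before.forEach((_sig, name) => names.push(name));
  after.forEach((_sig, name) => { if (!before.has(name)) names.push(name); });
  // A property counts as changed when it was added, removed, or its attributes differ.
  const changedProperties = names
    .filter((name) => before.get(name) !== after.get(name))
    .map((name) => FRIENDLY_NAMES[name] || name);
  return { oldRunProperties: oldRPr, newRunProperties: newRPr, changedProperties };
}
```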
2 test scenarios
- Bold added JR-docx-comparison-014.1
- Multiple properties changed JR-docx-comparison-014.2
Format Change Detection Algorithm
JR-docx-comparison-015
The system SHALL provide a detectFormatChangesInAtomList() function that identifies formatting differences in Equal atoms after LCS comparison. The algorithm:
1. Iterates through atoms with correlationStatus === Equal
2. Skips atoms without comparisonUnitAtomBefore reference
3. Extracts w:rPr from ancestor w:r element for both original and modified atoms
4. Normalizes w:rPr elements (removes existing w:rPrChange, sorts children)
5. Compares normalized properties for equality
6. Converts non-equal atoms to FormatChanged status with formatChange info
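The steps above might look roughly like this. It is a sketch under several assumptions: the XmlNode and Atom shapes are hypothetical, each atom is assumed to already carry the w:rPr of its ancestor w:r in an rPr field (step 3 is not shown), and a missing w:rPr is treated as equal to an empty one.

```typescript
interface XmlNode { name: string; attributes: Record<string, string>; children: XmlNode[]; }

interface Atom {
  correlationStatus: "Equal" | "FormatChanged" | "Inserted" | "Deleted";
  rPr: XmlNode | null;              // assumed to be pre-extracted from the ancestor w:r
  comparisonUnitAtomBefore?: Atom;  // matching atom in the original document
  formatChange?: { oldRunProperties: XmlNode | null; newRunProperties: XmlNode | null };
}

// Stable text form of a node: attributes sorted by key, children in order.
function serializeNode(node: XmlNode): string {
  const attrs = Object.keys(node.attributes).sort()
    .map((key) => `${key}=${node.attributes[key]}`).join(",");
  const children = node.children.map((child) => serializeNode(child)).join("");
  return `<${node.name} ${attrs}>${children}</${node.name}>`;
}

// Step 4: drop any existing w:rPrChange and sort the remaining children.
function normalizeRPr(rPr: XmlNode | null): string {
  if (rPr === null) return "";
  return rPr.children
    .filter((child) => child.name !== "w:rPrChange")
    .map((child) => serializeNode(child))
    .sort()
    .join("");
}

function detectFormatChangesSketch(atoms: Atom[]): void {
  for (const atom of atoms) {
    if (atom.correlationStatus !== "Equal") continue; // step 1
    const before = atom.comparisonUnitAtomBefore;
    if (!before) continue;                            // step 2
    // Step 5: compare normalized properties.
    if (normalizeRPr(before.rPr) === normalizeRPr(atom.rPr)) continue;
    // Step 6: record the change on the atom.
    atom.correlationStatus = "FormatChanged";
    atom.formatChange = { oldRunProperties: before.rPr, newRunProperties: atom.rPr };
  }
}
```

Sorting the serialized children means that two runs with the same properties in a different order compare as equal, which is the point of the normalization step.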
3 test scenarios
- Text becomes bold JR-docx-comparison-015.1
- No format change JR-docx-comparison-015.2
- Format detection with text change JR-docx-comparison-015.3
Format Change Detection Settings
JR-docx-comparison-019
The system SHALL provide configurable settings for format change detection:
- detectFormatChanges: Enable/disable format change detection (default: true)
2 test scenarios
- Format detection disabled JR-docx-comparison-019.1
- Format detection enabled by default JR-docx-comparison-019.2
OpenXML Format Change Markup Generation
JR-docx-comparison-020
The system SHALL generate native Word format change tracking markup (w:rPrChange) when format changes are detected.
For format-changed content:
- The current w:rPr contains the NEW properties
- w:rPrChange is added as a child of w:rPr containing the OLD properties
- w:rPrChange includes w:id, w:author, and w:date attributes
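The resulting structure can be illustrated with a small string builder. The RevisionMeta shape and the withFormatChange helper are hypothetical, and the property fragments are passed in as already serialized XML to keep the sketch short. The old properties are wrapped in a nested w:rPr inside w:rPrChange, as WordprocessingML expects.

```typescript
// Hypothetical revision metadata.
interface RevisionMeta { id: number; author: string; date: string; }

function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// newPropsXml / oldPropsXml are serialized children of w:rPr, e.g. "<w:b/>".
function withFormatChange(newPropsXml: string, oldPropsXml: string, meta: RevisionMeta): string {
  const change =
    `<w:rPrChange w:id="${meta.id}" w:author="${escapeAttr(meta.author)}" w:date="${escapeAttr(meta.date)}">` +
    `<w:rPr>${oldPropsXml}</w:rPr>` +
    `</w:rPrChange>`;
  // The NEW properties come first; w:rPrChange is the last child of w:rPr.
  return `<w:rPr>${newPropsXml}${change}</w:rPr>`;
}
```

For "bold added", newPropsXml would be "<w:b/>" and oldPropsXml empty; for "bold removed" the two are swapped.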
3 test scenarios
- Format change markup structure JR-docx-comparison-020.1
- Bold added markup JR-docx-comparison-020.2
- Bold removed markup JR-docx-comparison-020.3
Format Change Revision Reporting
JR-docx-comparison-021
The system SHALL include format changes in GetRevisions() output with type FormatChanged, extracting revision information from w:rPrChange elements.
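As an illustration of the extraction step only, the sketch below walks a parsed tree and emits one entry per w:rPrChange element. The XmlNode and FormatRevision shapes and the collectFormatRevisions name are assumptions; the real GetRevisions() output may carry different fields.

```typescript
interface XmlNode { name: string; attributes: Record<string, string>; children: XmlNode[]; }

// Hypothetical revision record; only the type value comes from the requirement.
interface FormatRevision { type: "FormatChanged"; author: string; date: string; }

// Depth-first walk that records every w:rPrChange it encounters.
function collectFormatRevisions(node: XmlNode, found: FormatRevision[] = []): FormatRevision[] {
  if (node.name === "w:rPrChange") {
    found.push({
      type: "FormatChanged",
      author: node.attributes["w:author"] || "",
      date: node.attributes["w:date"] || "",
    });
  }
  for (const child of node.children) {
    collectFormatRevisions(child, found);
  }
  return found;
}
```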
1 test scenario
- Get format change revisions JR-docx-comparison-021.1