safe-docx · Extract Revisions

Format change extraction from rPrChange

When reviewing tracked formatting in OOXML, run property changes (i.e., formatting revisions recorded inside run properties) need to appear in the revision report even when the paragraph text itself stays the same. Because those changes are stored as property records rather than insertion or deletion wrappers, a revision extractor has to treat property-change elements as tracked-change evidence.[3]

extractRevisions walks paragraphs in the tracked document, builds before and after text from rejected and accepted clones respectively, and collects revision entries from both content wrappers and property-change records.[1] For run formatting, a <w:rPrChange> element inside <w:rPr> records the run's previous properties, and the extractor reports it as a FORMAT_CHANGE revision.

The test scenario below covers the baseline success case for extractRevisions: a run property change is extracted as a format-change revision.

The scenario

Given a document with a format change by Carol,
When extractRevisions is called,
Then

  • one change is returned.
  • the revision list contains one revision.
  • the revision type is FORMAT_CHANGE.
  • the revision author is Carol.

Test fixture

The fixture creates a paragraph whose run properties include a tracked run-property change, while the visible paragraph content remains normal run text.

Below is the test fixture code.

test('should extract FORMAT_CHANGE from rPrChange', async ({ given, when, then, and }: AllureBddContext) => {
  let doc: Document;
  let result: ReturnType<typeof extractRevisions>;

  await given('a document with a format change by Carol', async () => {
    // The run is currently bold; <w:rPrChange> wraps the previous run properties (italic).
    doc = makeDoc(
      '<w:p><w:r>' +
        '<w:rPr>' +
          '<w:b/>' +
          '<w:rPrChange w:author="Carol">' +
            '<w:rPr><w:i/></w:rPr>' +
          '</w:rPrChange>' +
        '</w:rPr>' +
        '<w:t>Formatted</w:t>' +
      '</w:r></w:p>',
    );
  });

  await when('extractRevisions is called', async () => {
    result = extractRevisions(doc, []);
  });

  await then('one change is returned', async () => {
    expect(result.total_changes).toBe(1);
  });

  await and('the revision is a FORMAT_CHANGE by Carol', async () => {
    expect(result.changes[0]!.revisions).toHaveLength(1);
    expect(result.changes[0]!.revisions[0]!.type).toBe('FORMAT_CHANGE');
    expect(result.changes[0]!.revisions[0]!.author).toBe('Carol');
  });
});
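The makeDoc helper is not shown here. As a rough sketch, assuming it wraps a body fragment in a minimal WordprocessingML document so that the w: prefix resolves to a namespace, the string-building half might look like the following. The name wrapBody is hypothetical, and the real helper presumably also parses the string into a Document.

```typescript
// Hypothetical sketch: the real makeDoc helper is not shown in this article.
// Standard WordprocessingML main namespace URI.
const W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';

// Wraps a fragment of body XML in <w:document><w:body>, declaring the w: prefix.
function wrapBody(bodyXml: string): string {
  return (
    `<w:document xmlns:w="${W_NS}">` +
      `<w:body>${bodyXml}</w:body>` +
    '</w:document>'
  );
}

// The same fragment the test passes to makeDoc.
const fixtureXml = wrapBody(
  '<w:p><w:r>' +
    '<w:rPr>' +
      '<w:b/>' +
      '<w:rPrChange w:author="Carol">' +
        '<w:rPr><w:i/></w:rPr>' +
      '</w:rPrChange>' +
    '</w:rPr>' +
    '<w:t>Formatted</w:t>' +
  '</w:r></w:p>',
);
```

Without a namespace declaration on an ancestor element, a namespace-aware parser would reject or mis-resolve the w: names, which is why a wrapper of this kind is a plausible part of the helper.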

The expected result shape

The scenario asserts on selected fields of the returned value rather than comparing the whole ExtractRevisionsResult object, so the expected shape is the set of checks made against the returned value.[2]

Below are the assertions the scenario makes against the value returned by extractRevisions.

expect(result.total_changes).toBe(1);
expect(result.changes[0]!.revisions).toHaveLength(1);
expect(result.changes[0]!.revisions[0]!.type).toBe('FORMAT_CHANGE');
expect(result.changes[0]!.revisions[0]!.author).toBe('Carol');

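The expected fields are:

  • total_changes is 1, because only one change entry is returned.
  • changes[0].revisions has length 1, because that entry carries a single revision.
  • revisions[0].type is FORMAT_CHANGE, the type reported for a run property change.
  • revisions[0].author is Carol, taken from the w:author attribute on <w:rPrChange>.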
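Only the fields touched by the assertions are known from this scenario. A minimal TypeScript sketch of that slice follows; the interface names are hypothetical, and it makes no claim about other fields that ExtractRevisionsResult may carry.

```typescript
// Hypothetical partial types: only the fields asserted in the scenario.
interface RevisionSketch {
  type: string;   // 'FORMAT_CHANGE' for a run property change
  author: string; // value of w:author on the property-change element
}

interface ChangeSketch {
  revisions: RevisionSketch[];
}

interface ResultSketch {
  total_changes: number;
  changes: ChangeSketch[];
}

// The value the scenario expects, reduced to the asserted fields.
const expected: ResultSketch = {
  total_changes: 1,
  changes: [{ revisions: [{ type: 'FORMAT_CHANGE', author: 'Carol' }] }],
};

// Mirrors the four checks from the scenario without a test runner.
function matchesScenario(r: ResultSketch): boolean {
  const revs = r.changes[0]?.revisions ?? [];
  return (
    r.total_changes === 1 &&
    revs.length === 1 &&
    revs[0]!.type === 'FORMAT_CHANGE' &&
    revs[0]!.author === 'Carol'
  );
}
```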

A non-obvious detail

Run property changes do not carry inserted or deleted paragraph content, so their revision entry uses the format-change type instead of a text-bearing insertion or deletion type. The implementation collects <w:rPrChange> along with the other property-change records named by PR_CHANGE_LOCALS, then copies the w:author value into the revision entry.[1]
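A minimal sketch of that collection step, assuming a simplified element tree in place of a real DOM. The XmlEl type, the collectFormatChanges function, and the contents of the set are illustrative assumptions; the names that PR_CHANGE_LOCALS actually contains are not listed here.

```typescript
// Simplified stand-in for a DOM element (hypothetical, not the library's type).
interface XmlEl {
  local: string;                 // local name without the w: prefix
  attrs: Record<string, string>; // attributes keyed by qualified name
  children: XmlEl[];
}

// Assumed contents: OOXML property-change element names.
// The real PR_CHANGE_LOCALS may differ.
const PR_CHANGE_LOCALS_SKETCH = new Set([
  'rPrChange', 'pPrChange', 'sectPrChange',
  'tblPrChange', 'trPrChange', 'tcPrChange',
]);

interface FormatChange {
  type: 'FORMAT_CHANGE';
  author: string;
}

// Depth-first walk that records one FORMAT_CHANGE per property-change element.
function collectFormatChanges(root: XmlEl, out: FormatChange[] = []): FormatChange[] {
  if (PR_CHANGE_LOCALS_SKETCH.has(root.local)) {
    out.push({ type: 'FORMAT_CHANGE', author: root.attrs['w:author'] ?? '' });
    return out; // skip the previous-properties snapshot nested inside
  }
  for (const child of root.children) collectFormatChanges(child, out);
  return out;
}

// Small builder so the fixture below stays readable.
const node = (
  local: string,
  attrs: Record<string, string> = {},
  children: XmlEl[] = [],
): XmlEl => ({ local, attrs, children });

// Same structure as the test fixture: bold now, italic before, changed by Carol.
const paragraph = node('p', {}, [
  node('r', {}, [
    node('rPr', {}, [
      node('b'),
      node('rPrChange', { 'w:author': 'Carol' }, [node('rPr', {}, [node('i')])]),
    ]),
    node('t'),
  ]),
]);

const found = collectFormatChanges(paragraph);
```

Keeping the element names in a set means that supporting another property-change element is a one-line addition, and the walk never has to look at the text content of the run.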