safe-docx · Move Detection

Exact move detection

When reviewing tracked edits in a Word document, moved content needs a source location and a destination location so reviewers can read the change as a relocation instead of two unrelated edits. The comparison layer represents candidate changes as comparison unit atoms (i.e., flattened pieces of compared document content), and move detection updates those atoms when deleted content matches inserted content.

The detectMovesInAtomList function groups consecutive deleted and inserted atoms, discards groups that fall below the configured minimum word count, and compares each deleted group against the inserted groups using Jaccard word similarity (i.e., the size of the shared word set divided by the size of the combined word set). When a pair reaches the configured similarity threshold, the function marks the source atoms as MovedSource and the destination atoms as MovedDestination, and gives both sides the same move name.[1]
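
The similarity measure described above can be sketched in a few lines. This is a minimal illustration and not the safe-docx implementation: the helper name jaccardWordSimilarity, whitespace tokenization, and the choice to return 0 for two empty inputs are all assumptions made for this example.

```typescript
// Hypothetical helper for illustration only; the library's tokenization may differ.
function jaccardWordSimilarity(a: string, b: string, caseInsensitive: boolean = true): number {
  // Build the set of distinct words in a text, split on whitespace (assumed).
  const toWordSet = (text: string): Set<string> => {
    const normalized = caseInsensitive ? text.toLowerCase() : text;
    return new Set(normalized.split(/\s+/).filter((word) => word.length > 0));
  };

  const wordsA = toWordSet(a);
  const wordsB = toWordSet(b);

  // Count the words that appear in both sets.
  let shared = 0;
  wordsA.forEach((word) => {
    if (wordsB.has(word)) shared += 1;
  });

  // |A ∪ B| = |A| + |B| - |A ∩ B|
  const combined = wordsA.size + wordsB.size - shared;

  // Assumed convention: two empty inputs score 0 rather than dividing by zero.
  return combined === 0 ? 0 : shared / combined;
}
```

Under these assumptions, two identical texts score 1 and so clear a threshold such as 0.8, while 'a b c' and 'b c d' share two of four distinct words and score 0.5.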

The test scenario below covers the baseline success case for detectMovesInAtomList: it detects exact moves.

The Scenario

Given a deleted atom and an inserted atom with the same text,
When moves are detected,
Then

  • the deleted atom is marked as MovedSource with move name move1.
  • the inserted atom is marked as MovedDestination with move name move1.

The Fixture

The fixture builds a short atom list with one deleted atom, one unchanged atom, and one inserted atom, then runs move detection with a low minimum word count and case-insensitive matching enabled.[2]

Below is the test fixture code.

test('detects exact moves', async ({ given, when, then }: AllureBddContext) => {
  let atoms: ComparisonUnitAtom[];

  await given('a deleted atom and an inserted atom with the same text', () => {
    atoms = [
      createTestAtom('this is some text that was moved', CorrelationStatus.Deleted),
      createTestAtom('unchanged', CorrelationStatus.Equal),
      createTestAtom('this is some text that was moved', CorrelationStatus.Inserted),
    ];
  });

  await when('moves are detected', () => {
    detectMovesInAtomList(atoms, {
      detectMoves: true,
      moveSimilarityThreshold: 0.8,
      moveMinimumWordCount: 1,
      caseInsensitiveMove: true,
    });
  });

  await then('the atoms are marked as MovedSource and MovedDestination', () => {
    const atom0 = atoms[0];
    const atom2 = atoms[2];
    assertDefined(atom0, 'atoms[0]');
    assertDefined(atom2, 'atoms[2]');
    expect(atom0.correlationStatus).toBe(CorrelationStatus.MovedSource);
    expect(atom0.moveName).toBe('move1');
    expect(atom2.correlationStatus).toBe(CorrelationStatus.MovedDestination);
    expect(atom2.moveName).toBe('move1');
  });
});

The Expected Outcome

The scenario asserts the mutated state of the source and destination atoms rather than a return value, because detectMovesInAtomList updates the supplied atom list in place.
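
The in-place update can be pictured with a small sketch. The MarkableAtom type and the markAsMove helper are hypothetical names introduced for this example; they mirror only the two fields the scenario asserts and are not the library's real types.

```typescript
// Simplified stand-in for ComparisonUnitAtom (assumption: only these fields matter here).
type MarkableStatus = 'Deleted' | 'Inserted' | 'Equal' | 'MovedSource' | 'MovedDestination';

interface MarkableAtom {
  text: string;
  correlationStatus: MarkableStatus;
  moveName?: string;
}

// Hypothetical helper: relabels both sides of a matched pair and returns nothing,
// so callers observe the result by reading the atoms they passed in.
function markAsMove(source: MarkableAtom[], destination: MarkableAtom[], moveIndex: number): void {
  const moveName = `move${moveIndex}`;
  for (const atom of source) {
    atom.correlationStatus = 'MovedSource';
    atom.moveName = moveName;
  }
  for (const atom of destination) {
    atom.correlationStatus = 'MovedDestination';
    atom.moveName = moveName;
  }
}
```

Because nothing is returned, a test built on this pattern has to inspect the original array after the call.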

Below is the mutated atom state asserted by this scenario.

{
  atom0: {
    correlationStatus: CorrelationStatus.MovedSource,
    moveName: 'move1',
  },
  atom2: {
    correlationStatus: CorrelationStatus.MovedDestination,
    moveName: 'move1',
  },
}

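Below is a description of the expected fields:

  • correlationStatus: the atom's classification after detection. The deleted atom becomes MovedSource and the inserted atom becomes MovedDestination.
  • moveName: the name shared by both sides of a move. Both atoms carry move1, which links the source to the destination.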

A Non-Obvious Detail

The unchanged atom in the middle stays outside the matched blocks because move detection groups only deleted and inserted atoms. The separator matters: it shows that the function finds a source block and a destination block on either side of stable content, then links the matching blocks without changing the equal atom.
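
The grouping behavior described above can be sketched as follows. This is an illustrative reconstruction, not the library's code: the GroupingAtom and ChangedBlock types and the groupChangedBlocks helper are hypothetical, and the sketch assumes that an equal atom simply ends the current block.

```typescript
// Simplified atom for illustration (assumption: status is one of three values).
interface GroupingAtom {
  text: string;
  status: 'Equal' | 'Deleted' | 'Inserted';
}

// A run of consecutive atoms that share the same changed status.
interface ChangedBlock {
  status: 'Deleted' | 'Inserted';
  atoms: GroupingAtom[];
}

// Hypothetical helper: collects consecutive deleted or inserted atoms into blocks.
// Equal atoms never join a block; they only close the block being built.
function groupChangedBlocks(atoms: GroupingAtom[]): ChangedBlock[] {
  const blocks: ChangedBlock[] = [];
  let current: ChangedBlock | undefined;

  for (const atom of atoms) {
    if (atom.status === 'Equal') {
      current = undefined; // stable content separates blocks
      continue;
    }
    if (current === undefined || current.status !== atom.status) {
      current = { status: atom.status, atoms: [] };
      blocks.push(current);
    }
    current.atoms.push(atom);
  }
  return blocks;
}
```

Applied to an atom list shaped like the fixture's, this sketch yields one deleted block and one inserted block, and the equal atom appears in neither.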