Core Concepts

This section takes a closer look at the architecture of the XML synchronization engine. The system uses a layered architecture that connects raw source text to application-level object models.

Architecture Overview

The system architecture is centered around the SyncEngine, which manages state transitions via Transactions.

Layer	Role	Characteristics
SyncEngine	Core Logic	Transaction processing, History, Event dispatching.
CST	Physical Layer	Exact source structure, validation, incremental parsing.
Model	Logical Layer	Source of Truth, persistent IDs, full fidelity.
Projection	View Layer	Schema-specific filtering (`SchemaView`), DOM API.

Transaction Architecture

xml-api employs a transactional state management model inspired by modern editors.

EditorState: An immutable object representing the state of the editor at a single point in time. It holds the source, model, and cst.
Transaction: Represents a unit of change. It encapsulates text patches and metadata.
SyncEngine: The processor that takes a Transaction, applies it to the current state, performs parsing and reconciliation, and produces a new EditorState.

This ensures that all updates are atomic, predictable, and historically trackable.
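The relationship between the three pieces can be sketched roughly as follows. This is a minimal illustration, not the xml-api implementation: the `TextPatch` shape, the `version` counter, the `meta` record and the `applyTransaction` function are hypothetical names, the state only carries `source` (the real EditorState also holds the model and cst), and patches are assumed not to overlap.

```typescript
// Hypothetical shapes for illustration; the real xml-api types may differ.
interface TextPatch {
  from: number;   // start offset in the old source (inclusive)
  to: number;     // end offset in the old source (exclusive)
  insert: string; // replacement text
}

interface Transaction {
  patches: readonly TextPatch[];
  meta: Readonly<Record<string, string>>;
}

interface EditorState {
  readonly source: string;
  readonly version: number;
}

// Apply patches from the highest offset down so that earlier offsets stay valid.
function applyPatches(source: string, patches: readonly TextPatch[]): string {
  const ordered = [...patches].sort((a, b) => b.from - a.from);
  let result = source;
  for (const p of ordered) {
    result = result.slice(0, p.from) + p.insert + result.slice(p.to);
  }
  return result;
}

// Stand-in for the SyncEngine step: builds a new frozen state, never mutates the old one.
function applyTransaction(state: EditorState, tr: Transaction): EditorState {
  return Object.freeze({
    source: applyPatches(state.source, tr.patches),
    version: state.version + 1,
  });
}

const prevState: EditorState = Object.freeze({ source: "<a>hi</a>", version: 0 });
const nextState = applyTransaction(prevState, {
  patches: [{ from: 3, to: 5, insert: "hello" }],
  meta: { origin: "user" },
});
console.log(prevState.source, "->", nextState.source);
```

Because every transaction yields a fresh state object, keeping a history amounts to keeping references to earlier states.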

Layers in Depth

1. CST (Concrete Syntax Tree)

The CST is the result of parsing the source code against a strict grammar definition based on the W3C XML 1.0 Specification.

Full Fidelity: Captures every character, including whitespace, comments, and attribute quote styles.
Validation: Enforces Well-Formedness Constraints (WFC) such as matching start/end tags and unique attributes during the parsing process.
Incremental Parsing: Supports efficient updates by re-parsing only specific branches of the tree when the input changes (Parser.parseAt).
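To make full fidelity and well-formedness checking concrete, here is a deliberately small sketch. It is not the library's parser or grammar: `CstToken`, `tokenize` and `checkWellFormed` are hypothetical names, the tokenizer is regex-based, and it ignores declarations, processing instructions, CDATA and attribute values that contain `>`. It shows only two ideas: every character of the input is retained, and tag matching and attribute uniqueness can be checked in the same pass over the tokens.

```typescript
// Hypothetical token type; not the library's CST node shape.
interface CstToken {
  kind: "open" | "close" | "selfClose" | "comment" | "text";
  raw: string;    // exact source slice, including whitespace and quote style
  offset: number; // start offset in the source
}

function classify(raw: string): CstToken["kind"] {
  if (raw.startsWith("<!--") && raw.endsWith("-->")) return "comment";
  if (!raw.startsWith("<") || !raw.endsWith(">")) return "text";
  if (raw.startsWith("</")) return "close";
  if (raw.endsWith("/>")) return "selfClose";
  return "open";
}

// Lossless tokenizer: concatenating every token's raw text reproduces the input.
function tokenize(source: string): CstToken[] {
  const tokens: CstToken[] = [];
  // Comment, then tag, then a run of text, then a stray "<" as a fallback.
  const pattern = /<!--[\s\S]*?-->|<[^>]*>|[^<]+|</g;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(source)) !== null) {
    tokens.push({ kind: classify(m[0]), raw: m[0], offset: m.index });
  }
  return tokens;
}

function tagName(raw: string): string {
  const m = /^<\/?([^\s\/>]+)/.exec(raw);
  return m ? m[1] : "";
}

function attributeNames(raw: string): string[] {
  const names: string[] = [];
  const attr = /\s([^\s=\/>]+)\s*=\s*(?:"[^"]*"|'[^']*')/g;
  let m: RegExpExecArray | null;
  while ((m = attr.exec(raw)) !== null) names.push(m[1]);
  return names;
}

// Reports duplicate attributes, mismatched end tags and unclosed elements.
function checkWellFormed(tokens: CstToken[]): string[] {
  const errors: string[] = [];
  const openTags: string[] = [];
  for (const t of tokens) {
    if (t.kind === "open" || t.kind === "selfClose") {
      const seen = new Set<string>();
      for (const a of attributeNames(t.raw)) {
        if (seen.has(a)) errors.push(`duplicate attribute "${a}" at offset ${t.offset}`);
        seen.add(a);
      }
    }
    if (t.kind === "open") openTags.push(tagName(t.raw));
    if (t.kind === "close") {
      const expected = openTags.pop();
      const actual = tagName(t.raw);
      if (expected !== actual) errors.push(`mismatched end tag </${actual}> at offset ${t.offset}`);
    }
  }
  for (const name of openTags) errors.push(`unclosed <${name}>`);
  return errors;
}

const sample = "<root a='1'  b=\"2\"><!-- c --><item/>\n</root>";
const sampleTokens = tokenize(sample);
console.log(sampleTokens.map((t) => t.raw).join("") === sample, checkWellFormed(sampleTokens));
```

Keeping the raw slice on every token is what allows the original text, including quote style and spacing, to be reproduced exactly.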

2. Model

The Model is the authoritative source of truth that connects the physical CST to the application.

Persistent Identity: Every node is assigned a unique, immutable ID upon creation. This allows external systems (like UI frameworks) to track nodes reliably even after re-parsing.
Binder Engine: The core logic that synchronizes data. It performs Reconciliation, updating the existing Model tree in place with new CST data so that as few objects as possible are replaced, and it calculates precise text patches for updates.
Linkage: Maintains direct references to CST nodes, enabling the retrieval of exact source code locations for every logical element.
Standard Operations: Provides built-in methods for data extraction (find, text) and formatting.
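The idea behind persistent identity and reconciliation can be illustrated with a small sketch. It is not the Binder Engine's algorithm: `ParsedNode`, `ModelNode` and `reconcile` are hypothetical names, the tree is flattened to a single list of children, and nodes are matched naively by position and tag name, whereas a real reconciler would presumably use a smarter matching strategy.

```typescript
// Hypothetical, simplified shapes.
interface ParsedNode {
  tag: string;
  text: string;
}

interface ModelNode {
  readonly id: number; // assigned once, never changes
  tag: string;
  text: string;
}

let nextId = 1;

// Reuse the existing ModelNode when the tag at the same position matches;
// otherwise create a new node with a fresh ID.
function reconcile(previous: ModelNode[], parsed: ParsedNode[]): ModelNode[] {
  return parsed.map((p, i) => {
    const old = previous[i];
    if (old !== undefined && old.tag === p.tag) {
      old.text = p.text; // update in place so the object identity survives
      return old;
    }
    return { id: nextId++, tag: p.tag, text: p.text };
  });
}

const firstPass = reconcile([], [
  { tag: "p", text: "a" },
  { tag: "p", text: "b" },
]);
const secondPass = reconcile(firstPass, [
  { tag: "p", text: "a!" },
  { tag: "div", text: "x" },
]);
console.log(secondPass.map((n) => `${n.id}:${n.tag}`));
```

After the second pass, the first node is the same object with the same ID and updated text, while the second node is replaced because its tag changed. External code holding a reference to the first node, or to its ID, keeps working across the re-parse.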

3. Projection Layer (SchemaView)

This layer provides a specialized view of the document tailored to specific application needs (e.g., an XHTML subset).

Filtering: Hides nodes that don't match the schema (e.g., comments, custom tags) while preserving them in the underlying Model.
SchemaView: The primary interface for accessing this filtered tree. It mimics the standard DOM API.
ViewBinder: Handles the complexity of reconciling changes from the filtered view back to the full Model, ensuring that "invisible" nodes are preserved and formatting is respected.
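A rough sketch of the filtering idea, and of mapping an edit in the filtered view back to the full Model, is shown below. The names `FullNode`, `visibleNodes` and `insertThroughView` are hypothetical and are not the SchemaView or ViewBinder API. The example assumes a flat list of nodes and a schema that shows elements and hides comments.

```typescript
// Hypothetical, simplified node shape.
interface FullNode {
  kind: "element" | "comment";
  label: string;
}

// The view exposes only nodes that match the schema; here, elements only.
function visibleNodes(model: FullNode[]): FullNode[] {
  return model.filter((n) => n.kind === "element");
}

// Insert before the visible node at viewIndex, translating the view index
// into an index in the full model so hidden nodes keep their positions.
function insertThroughView(model: FullNode[], viewIndex: number, node: FullNode): void {
  const target = visibleNodes(model)[viewIndex];
  const modelIndex = target === undefined ? model.length : model.indexOf(target);
  model.splice(modelIndex, 0, node);
}

const fullModel: FullNode[] = [
  { kind: "comment", label: "c1" },
  { kind: "element", label: "a" },
  { kind: "comment", label: "c2" },
  { kind: "element", label: "b" },
];

insertThroughView(fullModel, 1, { kind: "element", label: "x" });
console.log(fullModel.map((n) => n.label), visibleNodes(fullModel).map((n) => n.label));
```

The caller only ever sees and indexes the filtered list, yet both comments remain in the full model at their original relative positions.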

4. DOM Interface

This layer provides the primary interface for applications to interact with the document. It implements standard W3C interfaces (Document, Element) and acts as a wrapper around the Model (or SchemaView). Changes made here are observed and automatically synchronized with the source code.

Data Flow

The system maintains a bidirectional loop between the source code and the application.
