Commit 2d18b56

docs: Adding life of a query
1 parent ed5057b commit 2d18b56

6 files changed

+206
-35
lines changed

packages/firestore/devdocs/architecture.md

Lines changed: 10 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -11,18 +11,20 @@ The SDK is composed of several key components that work together to provide the
1111
* **API Layer**: The public-facing API surface that developers use to interact with the SDK. This layer is responsible for translating the public API calls into the internal data models and passing them to the appropriate core components.
1212
* **Core**:
1313
* **Event Manager**: Acts as a central hub for all eventing in the SDK. It is responsible for routing events between the API Layer and Sync Engine. It manages query listeners and is responsible for raising snapshot events, as well as handling connectivity changes and some query failures.
14-
* **Sync Engine**: The central controller of the SDK. It acts as the glue between the Event Manager, Local Store, and Remote Store. Its responsibilities include:
15-
* Coordinating and translating client requests and remote events from the backend.
16-
* Initiating responses to user code from both remote events (backend updates) and local events (e.g. garbage collection).
17-
* Managing a "view" for each query, which represents the unified view between the local and remote data stores. The Sync Engine builds the user-facing "View" using the formula: `View = Remote Document + Overlay`. A **Remote Document** is the authoritative state from the backend. An **Overlay** is A computed "delta" representing pending local mutations. Overlays are calculated immediately when a mutation is applied and persisted separately. This allows for zero-latency "Optimistic Updates."
18-
* Deciding whether a document is in a "limbo" state (e.g. its state is unknown) and needs to be fetched from the backend.
19-
* Notifying the Remote Store when the Local Store has new mutations that need to be sent to the backend.
14+
* **Sync Engine**: The central controller of the SDK. It acts as the glue between the Event Manager, Local Store, and Remote Store.
15+
* **Coordinator**: It bridges the **User World** (Query) and **System World** (Target), converting public API calls into internal `TargetIDs`.
16+
* **View Construction**: It manages the user-facing view using the formula: `View = Remote Document + Overlay`.
17+
* **Remote Document**: The authoritative state from the backend.
18+
* **Overlay**: A computed delta representing pending local mutations.
19+
* **Limbo Resolution**: It detects "Limbo" documents (local matches not confirmed by server) and initiates resolution flows to verify their existence.
20+
* **Lifecycle Management**: It controls the [Query Lifecycle](./query-lifecycle.md), managing the initialization of streams, the persistence of data, and garbage collection eligibility.
2021
* **Local Store**: A container for the components that manage persisted and in-memory data.
2122
* **Remote Table**: A cache of the most recent version of documents as known by the Firestore backend (A.K.A. Remote Documents).
2223
* **Mutation Queue**: A queue of all the user-initiated writes (set, update, delete) that have not yet been acknowledged by the Firestore backend.
2324
* **Local View**: A cache that represents the user's current view of the data, combining the Remote Table with the Mutation Queue.
2425
* **Query Engine**: Determines the most efficient strategy (Index vs. Scan) to identify documents matching a query in the local cache.
2526
* **Overlays**: A performance-optimizing cache that stores the calculated effect of pending mutations from the Mutation Queue on documents. Instead of re-applying mutations every time a document is read, the SDK computes this "overlay" once and caches it, allowing the Local View to be constructed more efficiently.
27+
* For a detailed breakdown of the IndexedDB structure and tables, see [Persistence Schema](./persistence-schema.md).
2628
* **Remote Store**: The component responsible for all network communication with the Firestore backend. It manages the gRPC streams for reading and writing data, and it abstracts away the complexities of the network protocol from the rest of the SDK.
2729
* **Persistence Layer**: The underlying storage mechanism used by the Local Store to persist data on the client. In the browser, this is implemented using IndexedDB.
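
The formula `View = Remote Document + Overlay` from the Sync Engine description can be sketched as follows. This is a minimal illustration, not the SDK's implementation: the `Fields`, `RemoteDocument`, and `Overlay` types and the `computeView` function are hypothetical names, and treating a patch against a missing document as a no-op is an assumption made for the sketch.

```typescript
// Hypothetical, simplified types; the SDK's real models are richer.
type Fields = Record<string, unknown>;

interface RemoteDocument {
  key: string; // document path
  fields: Fields | null; // null models "document does not exist on the backend"
}

// An overlay is modeled here as the net effect of the pending local writes.
type Overlay =
  | { kind: "set"; fields: Fields }
  | { kind: "patch"; fields: Fields }
  | { kind: "delete" };

// View = Remote Document + Overlay.
function computeView(remote: RemoteDocument, overlay?: Overlay): Fields | null {
  if (!overlay) {
    // No pending writes: the view is the server-confirmed state.
    return remote.fields;
  }
  switch (overlay.kind) {
    case "set":
      // A set replaces the whole document.
      return { ...overlay.fields };
    case "patch":
      // Assumption for this sketch: a patch on a missing document is a no-op.
      return remote.fields === null
        ? null
        : { ...remote.fields, ...overlay.fields };
    case "delete":
      return null;
  }
}
```

The remote document is never modified here, which mirrors the point that the overlay is kept separate from the authoritative state until the backend confirms the write.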
2830

@@ -89,3 +91,5 @@ Here's a step-by-step walkthrough of how data flows through the SDK for a write
8991
9. **Sync Engine**: The Sync Engine is notified of the updated documents. It re-calculates the query view by combining the new data from the Remote Table with any applicable pending mutations from the **Mutation Queue**.
9092
10. **API Layer**: If the query results have changed after this reconciliation, the new results are sent to the user's `onSnapshot` callback. This is why a listener may fire twice initially.
9193
11. **Real-time Updates**: From now on, any changes on the backend that affect the query are pushed to the Remote Store, which updates the Remote Table, triggering the Sync Engine to re-calculate the view and notify the listener.
94+
95+
**Note on Query Lifecycle:** The steps above describe the "Happy Path" of a query starting up. For details on how queries are deduplicated, how the data persists after a listener is removed, and how Garbage Collection eventually cleans it up, see the [Query Lifecycle](./query-lifecycle.md).

packages/firestore/devdocs/code-layout.md

Lines changed: 20 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -3,19 +3,26 @@
33
This document explains the code layout in this repository. It is closely related to the [architecture](./architecture.md).
44

55
* `src/`: Contains the source code for the main `@firebase/firestore` package.
6-
* `api/`: Implements the **API Layer** for the main SDK.
7-
* `lite-api/`: Contains the entry point of for the lite SDK.
8-
* `core/`: Contains logic for the **Sync Engine** and **Event Manager**.
9-
* `local/`: Contains the logic the **Local Store**, which includes the **Mutation Queue**, **Remote Table**, **Local View**, **Overlays**, and the **Persistence Layer**
10-
* `local_store.ts`: The main entry point for persistence operations.
11-
* `query_engine.ts`: Implements the strategy selection logic (Scan vs. Index).
12-
* `index_backfiller.ts`: The background task that updates Client-Side Indexes.
13-
* `remote_document_cache.ts`: Manages the `remote_documents` table (base truth).
14-
* `overlay_cache.ts`: Manages pending mutation queue.
15-
* `remote/`: Contains the logic for the **Remote Store**, handling all network communication.
16-
* `model/`: Defines the internal data models used throughout the SDK, such as `Document`, `DocumentKey`, and `Mutation`. These models are used to represent Firestore data and operations in a structured way.
17-
* `platform/`: Contains platform-specific code to abstract away the differences between the Node.js and browser environments. This includes things like networking, storage, and timers. This allows the core logic of the SDK to be platform-agnostic.
18-
* `protos/`: Contains the Protocol Buffer (`.proto`) definitions that describe the gRPC API surface of the Firestore backend. These files are used to generate the client-side networking code.
6+
* `api/`: **API Surface**. Implements the public API (e.g., `doc`, `collection`, `onSnapshot`).
7+
* `database.ts`: The entry point for the `Firestore` class.
8+
* `reference.ts`: Implements `DocumentReference` and `CollectionReference`.
9+
* `core/`: **Sync Engine**. Contains the high-level orchestration logic.
10+
* `sync_engine.ts`: The central coordinator. It manages the "User World" <-> "System World" bridge, `TargetID` allocation, and the main async queue.
11+
* `event_manager.ts`: Handles `QueryListener` registration, fan-out (deduplication of identical queries), and raising snapshot events to the user.
12+
* `query.ts`: Defines the internal `Query` and `Target` models.
13+
* `firestore_client.ts`: The initialization logic that wires up the components.
14+
* `local/`: **Storage and Query Execution**. Manages persistence, caching, and local execution.
15+
* `local_store.ts`: The main interface for the Core layer to interact with storage. It coordinates the components below.
16+
* `indexeddb_persistence.ts`: The implementation of the [Persistence Schema](./persistence-schema.md) using IndexedDB.
17+
* `local_documents_view.ts`: Implements the logic to assemble the user-facing view (`RemoteDoc` + `Mutation`).
18+
* `query_engine.ts`: The optimizer that decides how to scan the cache.
19+
* `lru_garbage_collector.ts` & `reference_delegate.ts`: Implement the Sequence Number logic to clean up old data.
20+
* `remote/`: **Network**. Handles gRPC/REST communication.
21+
* `remote_store.ts`: Manages the "Watch Stream" (listening to queries) and the "Commit Stream" (sending mutations).
22+
* `connection.ts`: Abstracts the underlying networking transport.
23+
* `serializer.ts`: Converts between internal model objects and the Protobuf format used by the backend.
24+
* `model/`: Defines the immutable data structures used throughout the SDK (e.g., `DocumentKey`, `FieldPath`, `Mutation`).
25+
* `util/`: General purpose utilities (AsyncQueue, Assertions, Types).
1926
* `lite/`: Defines the entrypoint code for the `@firebase/firestore/lite` package.
2027
* `test/`: Contains all unit and integration tests for the SDK. The tests are organized by component and feature, and they are essential for ensuring the quality and correctness of the code.
2128
* `scripts/`: Contains a collection of build and maintenance scripts used for tasks such as bundling the code, running tests, and generating documentation.

packages/firestore/devdocs/overview.md

Lines changed: 15 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -25,6 +25,7 @@ The primary goals of this SDK are:
2525
* **Overlay**: The computed result of applying a Mutation to a Document. We store these to show "Optimistic Updates" instantly without modifying the underlying "Remote Document" until the server confirms the write.
2626
* **Limbo**: A state where a document exists locally and matches a query, but the server hasn't explicitly confirmed it belongs to the current snapshot version. The SDK must perform "Limbo Resolution" to ensure these documents are valid.
2727

28+
For a detailed explanation of how these concepts interact during execution, see the [Query Lifecycle](./query-lifecycle.md) documentation.
2829

2930
## Artifacts
3031

@@ -33,23 +34,22 @@ The Firestore JavaScript SDK is divided into two main packages:
3334
* `@firebase/firestore`: The main, full-featured SDK that provides streaming and offline support.
3435
* `@firebase/firestore/lite`: A much lighter-weight (AKA "lite") version of the SDK for applications that do not require streaming or offline support.
3536

36-
For a detailed explanation of the architecture, components, and data flow, please see the [Architecture documentation](./architecture.md). Related, for a deailed overview of the source code layout, please see [Code layout](./code-layout.md).
3737

38+
## Documentation Map
3839

39-
## Build
40+
To navigate the internals of the SDK, use the following guide:
4041

41-
TODO: Add critical information about the build process including optimizations for code size, etc.
42+
### Core Concepts
43+
* **[Architecture](./architecture.md)**: The high-level block diagram of the system (API -> Core -> Local -> Remote).
44+
* **[Query Lifecycle](./query-lifecycle.md)**: The state machine of a query. **Read this** to understand how querying and offline capabilities work.
4245

43-
For information on how the artifacts are built, please see the [Build documentation](./build.md) file.
46+
### Subsystem Deep Dives
47+
* **[Persistence Schema](./persistence-schema.md)**: A reference guide for the IndexedDB tables (e.g., `remote_documents`, `mutation_queues`).
48+
* **[Query Execution](./query-execution.md)**: Details on the algorithms used by the Local Store to execute queries (Index Scans vs. Full Collection Scans).
49+
* **[Bundles](./bundles.md)**: How the SDK loads and processes data bundles.
4450

45-
## Testing
46-
47-
TODO: Add critical information about the tests harness, organization, spec tests, etc.
48-
49-
For information on how the tests are setup and organized [Testing documentation](./testing.md) file.
50-
51-
## Developer Workflow
52-
53-
TODO: Add list of common commands here.
54-
55-
For information on the developer workflow, including how to build, test, and format the code, please see the [CONTRIBUTING.md](../CONTRIBUTING.md) file.
51+
### Developer Guides
52+
* **[Code Layout](./code-layout.md)**: Maps the architectural components to specific source files and directories.
53+
* **[Build Process](./build.md)**: How to build the artifacts.
54+
* **[Testing](./testing.md)**: How to run unit and integration tests.
55+
* **[Contributing](../CONTRIBUTING.md)**: The developer workflow, including how to build, test, and format the code.
Lines changed: 55 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,55 @@
1+
# Persistence Schema (IndexedDB)
2+
3+
The Firestore JS SDK persists data to IndexedDB to support offline querying, latency compensation, and app restarts.
4+
5+
While the Android/iOS SDKs use SQLite, the JS SDK uses IndexedDB Object Stores. However, the logical schema is identical across platforms to ensure consistent behavior.
6+
7+
## Core Object Stores
8+
9+
### `remote_documents`
10+
* **Concept**: The client's cache of the backend's "Source of Truth."
11+
* **Key**: `DocumentKey` (Path to the document).
12+
* **Value**:
13+
* **Document Data**: The serialized Protobuf of the document.
14+
* **ReadTime**: The snapshot version at which this document was read.
15+
* **Note**: This store **never** contains local, unacknowledged writes. It only contains data confirmed by the server. To see what the developer sees, we overlay the `mutation_queues` on top of this.
16+
17+
### `mutation_queues`
18+
* **Concept**: The "Pending Writes" queue.
19+
* **Key**: `BatchID` (Integer, auto-incrementing).
20+
* **Grouping**: Queues are partitioned by **UID**. When the signed-in user changes (for example, on logout), the SDK switches to a different queue.
21+
* **Value**:
22+
* **Mutation**: The serialized operation (Set, Patch, Delete).
23+
* **Metadata**: Timestamp, offsets.
24+
* **Behavior**: When the network is available, the `RemoteStore` reads from this queue to send write batches to the backend. Once acknowledged, entries are removed.
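
The behavior of `mutation_queues` described above can be sketched in memory as follows. This is an illustrative model only: the `MutationQueues` class, its method names, and the string stand-ins for serialized mutations are hypothetical, and the single counter shared across users is an assumption about how the auto-incrementing `BatchID` is allocated.

```typescript
// Hypothetical in-memory model; the real store lives in IndexedDB.
interface MutationBatch {
  batchId: number;
  mutations: string[]; // stand-in for serialized Set/Patch/Delete operations
}

class MutationQueues {
  // Assumption: one auto-incrementing counter shared by all users.
  private nextBatchId = 1;
  // One queue per UID, so switching users switches queues.
  private queues: Record<string, MutationBatch[]> = {};

  // Appends a batch of pending writes to the given user's queue.
  enqueue(uid: string, mutations: string[]): number {
    const batch: MutationBatch = { batchId: this.nextBatchId++, mutations };
    const queue = this.queues[uid] || [];
    queue.push(batch);
    this.queues[uid] = queue;
    return batch.batchId;
  }

  // Batches that still need to be sent to the backend, oldest first.
  pending(uid: string): MutationBatch[] {
    return this.queues[uid] || [];
  }

  // Removes a batch once the backend has acknowledged it.
  acknowledge(uid: string, batchId: number): void {
    const queue = this.queues[uid] || [];
    this.queues[uid] = queue.filter(batch => batch.batchId !== batchId);
  }
}
```

Pending writes for one UID are invisible to another, which is the property the **Grouping** bullet relies on.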
25+
26+
### `targets`
27+
* **Concept**: Metadata about active and cached queries.
28+
* **Key**: `TargetID` (Internal Integer allocated by `SyncEngine`).
29+
* **Value**:
30+
* **Canonical ID**: A hash string representing the query (filters, sort order). Used for deduplication.
31+
* **Resume Token**: An opaque token from the backend used to resume a stream without re-downloading all data.
32+
* **Last Sequence Number**: Used for Garbage Collection (LRU).
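
The deduplication role of the Canonical ID can be sketched as follows. The `canonicalize` function and `TargetCache` class are hypothetical, as are the string encoding, the choice to treat filter order as irrelevant, and the sequential `TargetID` numbering; the sketch only shows how two equivalent queries can end up sharing one target entry.

```typescript
// Hypothetical shape of one entry in the `targets` store.
interface TargetData {
  targetId: number;
  canonicalId: string;
  resumeToken: string; // opaque token from the backend; empty until one arrives
  lastSequenceNumber: number;
}

// Assumption for this sketch: filter order does not matter, sort order does.
function canonicalize(path: string, filters: string[], orderBy: string[]): string {
  const sortedFilters = filters.slice().sort();
  return `${path}|f:${sortedFilters.join(",")}|ob:${orderBy.join(",")}`;
}

class TargetCache {
  private nextTargetId = 1; // illustrative numbering only
  private byCanonicalId: Record<string, TargetData> = {};

  // Returns the existing target for an equivalent query, or allocates a new one.
  listen(canonicalId: string, sequenceNumber: number): TargetData {
    let target = this.byCanonicalId[canonicalId];
    if (!target) {
      target = {
        targetId: this.nextTargetId++,
        canonicalId,
        resumeToken: "",
        lastSequenceNumber: sequenceNumber
      };
      this.byCanonicalId[canonicalId] = target;
    } else {
      // Re-listening refreshes the sequence number used for LRU ordering.
      target.lastSequenceNumber = sequenceNumber;
    }
    return target;
  }
}
```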
33+
34+
### `target_documents` (The Index)
35+
* **Concept**: A reverse index mapping `TargetID` $\leftrightarrow$ `DocumentKey`.
36+
* **Purpose**: Optimization. When a query is executed locally, the SDK uses this index to quickly identify which documents belong to a specific TargetID without scanning the entire `remote_documents` table.
37+
* **Maintenance**: This is updated whenever a remote snapshot adds/removes a document from a query view.
38+
39+
## Metadata & Garbage Collection Stores
40+
41+
### `target_globals`
42+
* **Concept**: A singleton store for global system state.
43+
* **Key**: Fixed singleton key.
44+
* **Value**:
45+
* **`last_sequence_number`**: A global integer counter incremented on every transaction.
46+
* **`target_count`**: Number of targets currently tracked.
47+
48+
### `remote_document_changes` (Ephemeral)
49+
* **Concept**: A temporary staging area used during `SyncEngine` processing.
50+
* **Purpose**: Used to track read-time updates for documents during a remote event application before they are committed to the main `remote_documents` store.
51+
52+
## Data Relationships
53+
54+
1. **The "View"**: To construct a document for the developer, the SDK reads `remote_documents[key]` and applies any mutations found in `mutation_queues` matching that key.
55+
2. **Garbage Collection**: The `LruGarbageCollector` uses `target_globals.last_sequence_number` and `targets.last_sequence_number` to determine which targets are old and can be evicted. It then uses `target_documents` to find which documents are no longer referenced by *any* target and deletes them from `remote_documents`.
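
The garbage collection relationship above can be sketched as a two-step pass. This is a simplified illustration, not the `LruGarbageCollector` itself: the `collectGarbage` function, its parameters, the `activeTargetIds` list, and the `upperBound` threshold are hypothetical, and the sketch ignores details such as documents that still have pending mutations.

```typescript
// Hypothetical, simplified row of the `targets` store.
interface Target {
  targetId: number;
  lastSequenceNumber: number;
}

// Step 1: evict targets that are old (sequence number at or below `upperBound`)
// and not currently listened to. Step 2: delete documents that no remaining
// target references. The passed-in maps are mutated in place.
function collectGarbage(
  targets: Target[],
  targetDocuments: Record<number, string[]>, // TargetID -> DocumentKeys
  remoteDocuments: Record<string, unknown>, // DocumentKey -> document data
  activeTargetIds: number[],
  upperBound: number
): { removedTargets: number[]; removedDocuments: string[] } {
  const removedTargets: number[] = [];
  const referenced: Record<string, boolean> = {};

  for (const target of targets) {
    const isActive = activeTargetIds.indexOf(target.targetId) !== -1;
    if (target.lastSequenceNumber <= upperBound && !isActive) {
      removedTargets.push(target.targetId);
    } else {
      // Surviving targets keep their documents alive.
      for (const key of targetDocuments[target.targetId] || []) {
        referenced[key] = true;
      }
    }
  }

  for (const targetId of removedTargets) {
    delete targetDocuments[targetId];
  }

  const removedDocuments = Object.keys(remoteDocuments).filter(
    key => !referenced[key]
  );
  for (const key of removedDocuments) {
    delete remoteDocuments[key];
  }
  return { removedTargets, removedDocuments };
}
```

A document shared by an evicted target and a surviving target stays in `remote_documents`, which is why the reverse index is consulted before anything is deleted.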

packages/firestore/devdocs/query-execution.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
# Query Execution & Indexing
22

3-
This document details how the Firestore SDK executes queries against the local cache. Understanding this is crucial for debugging performance issues and understanding offline behavior.
3+
*Note: This document details the internal algorithms used during **View Calculation** of the [Query Lifecycle](./query-lifecycle.md). It focuses on the performance and mechanics of the **Local Store**.*
44

55
## The Query Engine
66
