diff --git a/packages/firestore/CONTRIBUTING.md b/packages/firestore/CONTRIBUTING.md index 49780249ea..79f26f2684 100644 --- a/packages/firestore/CONTRIBUTING.md +++ b/packages/firestore/CONTRIBUTING.md @@ -5,6 +5,8 @@ contributing to the Firebase JavaScript SDK (including Cloud Firestore). Follow instructions there to install dependencies, build the SDK, and set up the testing environment. +For a deep dive into the testing strategy and architecture, see [Testing Strategy](devdocs/testing.md). + ## Integration Testing ### Setting up a project for testing diff --git a/packages/firestore/GEMINI.md b/packages/firestore/GEMINI.md new file mode 100644 index 0000000000..e4ce4c2123 --- /dev/null +++ b/packages/firestore/GEMINI.md @@ -0,0 +1,5 @@ +# Firestore JavaScript SDK +This project is the official JavaScript SDK for the [Google Cloud Firestore](https://firebase.google.com/docs/firestore) database. + +You are an expert in @devdocs/prerequisites.md +@devdocs/overview.md \ No newline at end of file diff --git a/packages/firestore/README.md b/packages/firestore/README.md index 504f6ab2b6..ac7017d289 100644 --- a/packages/firestore/README.md +++ b/packages/firestore/README.md @@ -17,6 +17,10 @@ Docs][reference-docs]. [reference-docs]: https://firebase.google.com/docs/reference/js/ +## Internal Documentation + +If you are a contributor or maintainer, please see the [Internal Developer Documentation](./devdocs/overview.md). + ## Contributing See [Contributing to the Firebase SDK](../../CONTRIBUTING.md) for general information about contributing to the firebase-js-sdk repo and diff --git a/packages/firestore/devdocs/GEMINI.md b/packages/firestore/devdocs/GEMINI.md new file mode 100644 index 0000000000..85212f0f48 --- /dev/null +++ b/packages/firestore/devdocs/GEMINI.md @@ -0,0 +1,47 @@ +# Firestore JavaScript SDK Developer Documentation + +This folder contains the developer documentation for the Firestore JavaScript SDK. + +**Audience:** +1. 
**Maintainers**: Engineers working on the SDK internals. +2. **AI Agents**: Automated assistants reading this documentation to understand the codebase. + +**NOT Audience:** +- 3rd Party Developers (App Developers). They should use the [official Firebase documentation](https://firebase.google.com/docs/firestore). + +# Entry Point +Start at [./overview.md](./overview.md). + +# Content Guidelines + +## Principles +- **High-Level Focus**: Explain architecture and data flow. Avoid duplicating code. +- **Why > How**: Explain *why* a design choice was made. The code shows *how*. +- **Reference by Name**: Use exact component/interface names (e.g., `Persistence`, `EventManager`). + +## Terminology +- **Concepts First**: **Aggressively favor** high-level English concepts over code identifiers. Only drop down to code identifiers when absolutely necessary for precise mapping. + * *Good*: "The Mutation Queue stores pending writes." + * *Bad*: "The `mutationQueues` store contains `DbMutationBatch` objects." + * *Acceptable (Mapping)*: "The Mutation Queue (implemented as `mutationQueues` store)..." +- **Avoid Over-Specification**: Generally, do not reference private/internal variable names unless documenting that specific module's internals. +- **Strict Casing**: When you *must* reference code, use the **exact casing** found in the codebase (e.g., `mutationQueues`). +- **No "Translations"**: Never convert code names into snake_case. Either use the English Concept ("Remote Documents") or the exact Code Name (`remoteDocuments`). + +## Diagramming +Use **Mermaid** for diagrams. +- Flowcharts for logic. +- Sequence diagrams for async protocols. +- Class diagrams for component relationships. + +# Style Guide +- **Syntax**: Markdown (GFM). +- **Voice**: Active voice ("The SDK does X"). +- **Tense**: Present tense ("The query executes..."). +- **Mood**: Imperative for instructions ("Run the build..."). +- **Conciseness**: Short sentences. Bullet points where possible.
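As a concrete instance of the diagramming guidance above, a minimal Mermaid sequence diagram (the component names follow the architecture docs; the exact message flow shown is illustrative, not normative):

```mermaid
sequenceDiagram
    participant API as API Layer
    participant Sync as Sync Engine
    participant Local as Local Store
    participant Remote as Remote Store
    API->>Sync: setDoc(...)
    Sync->>Local: enqueue mutation
    Sync->>Remote: send pending write
    Remote-->>Sync: backend acknowledgement
    Sync-->>API: snapshot event
```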
+ +# Maintenance +- **Co-location**: Keep documentation close to the code it describes (linked via `Code Layout`). +- **Atomic Updates**: Update documentation in the *same PR* as the feature or fix. +- **Freshness**: If you see stale docs, fix them immediately. diff --git a/packages/firestore/devdocs/architecture.md b/packages/firestore/devdocs/architecture.md new file mode 100644 index 0000000000..02d1702c23 --- /dev/null +++ b/packages/firestore/devdocs/architecture.md @@ -0,0 +1,76 @@ +# SDK Architecture + +This document provides a detailed explanation of the Firestore JavaScript SDK's architecture, its core components, and the flow of data through the system. + +## Core Components + +The SDK is composed of several key components that work together to provide the full range of Firestore features. + +![Architecture Diagram](./architecture.png) + +* **API Layer**: The public-facing API surface that developers use to interact with the SDK. This layer is responsible for translating the public API calls into the internal data models and passing them to the appropriate core components. +* **Core**: + * **Event Manager**: Acts as a central hub for all eventing in the SDK. It is responsible for routing events between the API Layer and Sync Engine. It manages query listeners and is responsible for raising snapshot events, as well as handling connectivity changes and some query failures. +* **Sync Engine**: The central controller of the SDK. It acts as the glue between the Event Manager, Local Store, and Remote Store. + * **Target**: The backend protocol's internal representation of a recurring Query. While a `Query` is a user-intent (e.g., "users where age > 18"), a `Target` is the allocated stream ID (`TargetID`) that the Watch implementation uses to track that query's state over the network. The **Coordinator** maps ephemeral user Queries to stable system Targets. 
+ * **Coordinator**: It bridges the **User World** (Query) and **System World** (Target), converting public API calls into internal `TargetIDs`. + * **View Construction**: It manages the user-facing view using the formula: `View = Remote Document + Overlay`. + * **Remote Document**: The authoritative state from the backend. + * **Overlay**: A computed delta representing pending local mutations. + * **Limbo Resolution**: It detects "Limbo" documents (local matches not confirmed by server) and initiates resolution flows to verify their existence. + * **Lifecycle Management**: It controls the [Query Lifecycle](./query-lifecycle.md), managing the initialization of streams, the persistence of data, and garbage collection eligibility. +* **Local Store**: A container for the components that manage persisted and in-memory data. + * **Remote Table**: A cache of the most recent version of documents as known by the Firestore backend (A.K.A. Remote Documents). + * **Mutation Queue**: A queue of all the user-initiated writes (set, update, delete) that have not yet been acknowledged by the Firestore backend. + * **Local View**: A cache that represents the user's current view of the data, combining the Remote Table with the Mutation Queue. + * **Query Engine**: Determines the most efficient strategy (Index vs. Scan) to identify documents matching a query in the local cache. + * **Overlays**: A performance-optimizing cache that stores the calculated effect of pending mutations from the Mutation Queue on documents. Instead of re-applying mutations every time a document is read, the SDK computes this "overlay" once and caches it, allowing the Local View to be constructed more efficiently. + * For a detailed breakdown of the IndexedDB structure and tables, see [Persistence Schema](./persistence-schema.md). +* **Remote Store**: The component responsible for all network communication with the Firestore backend. 
+ * It manages the **Watch Stream** (see **[The Watch System](./watch.md)**) for reading and listening to data. + * It manages the gRPC streams for writing data. + * It abstracts away the complexities of the network protocol from the rest of the SDK. +* **Persistence Layer**: The underlying storage mechanism used by the Local Store to persist data on the client. In the browser, this is implemented using IndexedDB. + +The architecture and systems within the SDK map closely to the directory structure, which helps developers navigate the codebase. Here is a mapping of the core components to their corresponding directories. + +* `src/`: + * `api/`: Implements the **API Layer** for the main SDK. + * `lite-api/`: Implements the **API Layer** for the lite SDK. + * `core/`: Implements the **Sync Engine** and **Event Manager**. + * `local/`: Implements the **Local Store**, which includes the **Mutation Queue**, **Remote Table**, **Local View**, and the **Persistence Layer**. + * `remote/`: Implements the **Remote Store**, handling all network communication. + +For a more detailed explanation of the contents of each directory, see the [Code Layout](./code-layout.md) documentation. + + +# Data Flow + +Here's a step-by-step walkthrough of how data flows through the SDK for a write operation, referencing the core components. + +## Write Data Flow + +1. **API Layer**: A user initiates a write operation (e.g., `setDoc`, `updateDoc`, `deleteDoc`). +2. **Sync Engine**: The call is routed to the Sync Engine, which wraps the operation in a "mutation". +3. **Mutation Queue (in Local Store)**: The Sync Engine adds this mutation to the Mutation Queue. The queue is persisted to the **Persistence Layer** (IndexedDB). At this point, the SDK "optimistically" considers the write successful locally. +4. **Local View (in Local Store)**: The change is immediately reflected in the Local View, making it available to any active listeners without waiting for backend confirmation. +5. 
**Remote Store**: The Sync Engine notifies the Remote Store that there are pending mutations. +6. **Backend**: The Remote Store sends the mutations from the queue to the Firestore backend. +7. **Acknowledgement**: The backend acknowledges the write. +8. **Mutation Queue (in Local Store)**: The Remote Store informs the Sync Engine, which then removes the acknowledged mutation from the Mutation Queue. + +## Read Data Flow (with a Real-Time Listener) + +1. **API Layer**: A user attaches a listener to a query (e.g., `onSnapshot`). +2. **Event Manager**: The Event Manager creates a listener and passes it to the Sync Engine. +3. **Sync Engine**: The Sync Engine creates a "view" for the query. +4. **Local View (in Local Store)**: The Sync Engine asks the Query Engine in the Local Store to execute the query. The Query Engine selects a strategy (e.g., Index Scan or Timestamp Optimization) to find matching keys. The Local Store then constructs the documents by applying cached Overlays on top of Remote Documents. +5. **API Layer**: The initial data from the Local View is sent back to the user's `onSnapshot` callback. This provides a fast, initial result. +6. **Remote Store**: Simultaneously, the Sync Engine instructs the Remote Store to listen to the query on the Firestore backend. +7. **Backend**: The backend returns the initial matching documents for the query. +8. **Remote Table (in Local Store)**: The Remote Store receives the documents and saves them to the Remote Table in the Local Store, overwriting any previously cached versions of those documents. +9. **Sync Engine**: The Sync Engine is notified of the updated documents. It re-calculates the query view by combining the new data from the Remote Table with any applicable pending mutations from the **Mutation Queue**. +10. **API Layer**: If the query results have changed after this reconciliation, the new results are sent to the user's `onSnapshot` callback. This is why a listener may fire twice initially. +11. 
**Real-time Updates**: From now on, any changes on the backend that affect the query are pushed to the Remote Store, which updates the Remote Table, triggering the Sync Engine to re-calculate the view and notify the listener. + +**Note on Query Lifecycle:** The steps above describe the "Happy Path" of a query starting up. For details on how queries are deduplicated, how the data persists after a listener is removed, and how Garbage Collection eventually cleans it up, see the [Query Lifecycle](query-lifecycle.md). \ No newline at end of file diff --git a/packages/firestore/devdocs/architecture.png b/packages/firestore/devdocs/architecture.png new file mode 100644 index 0000000000..ec1fc04010 Binary files /dev/null and b/packages/firestore/devdocs/architecture.png differ diff --git a/packages/firestore/devdocs/build.md b/packages/firestore/devdocs/build.md new file mode 100644 index 0000000000..14789a367a --- /dev/null +++ b/packages/firestore/devdocs/build.md @@ -0,0 +1,5 @@ +# Build Process + +> **Note**: This documentation is currently under construction. + +For current build instructions, test commands, and setup, please go to [CONTRIBUTING.md](../CONTRIBUTING.md). diff --git a/packages/firestore/devdocs/bundles.md b/packages/firestore/devdocs/bundles.md new file mode 100644 index 0000000000..4e4256df32 --- /dev/null +++ b/packages/firestore/devdocs/bundles.md @@ -0,0 +1,42 @@ +# Firestore Data Bundles + +This document provides an overview of Firestore data bundles, how they are processed, and how they are used within the SDK. + +## What is a Bundle? + +A Firestore data bundle is a serialized, read-only collection of documents and named query results. Bundles are created on a server using the Firebase Admin SDK and can be efficiently distributed to clients. + +While bundles can be used for several purposes, their primary design motivation is to optimize Server-Side Rendering (SSR) workflows. In an SSR setup, a server pre-renders a page with data from Firestore. 
This data can be packaged into a bundle and sent to the client along with the HTML. The client-side SDK can then load this bundle and "hydrate" a real-time query with the pre-existing data, avoiding the need to re-fetch the same documents from the backend. This results in a significant performance improvement and cost savings. + +## Primary Use Case: Server-Side Rendering (SSR) Hydration + +The main workflow for bundles is as follows: + +1. **Server-Side:** A server fetches data from Firestore to render a page. +2. **Bundling:** The server packages the fetched documents and the corresponding query into a bundle. +3. **Transmission:** The bundle is embedded in the HTML page sent to the client. +4. **Client-Side:** The client-side JavaScript calls `loadBundle()` to load the data from the bundle into the SDK's local cache. +5. **Hydration:** The client then attaches a real-time listener to the same query that was bundled. The SDK finds the query results in the local cache and immediately fires the listener with the initial data, avoiding a costly roundtrip to the backend. + +## Other Benefits and Use Cases + +Beyond SSR hydration, bundles offer several other advantages: + +* **Enhanced Offline Experience:** Bundles can be shipped with an application's initial assets, allowing users to have a more complete offline experience from the first launch, reducing the need to sync every document individually. +* **Efficient Data Distribution:** They provide an efficient way to deliver curated or static datasets to clients in a single package. For instance, an application could bundle a list of popular items or configuration data. + +## The Loading Process + +When an application provides a bundle to the SDK, a loading process is initiated. The SDK reads the bundle, which is a stream of documents and queries, and saves each item into its local cache. This process is asynchronous, allowing the application to continue running while the data is being loaded in the background. 
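The client-side hydration steps above can be sketched with the modular API. This is a minimal sketch, not the canonical implementation: the bundle payload location (`window.__FIRESTORE_BUNDLE__`) and the query name (`'latest-stories'`) are illustrative assumptions about what the server embedded in the page.

```typescript
import {
  getFirestore,
  loadBundle,
  namedQuery,
  onSnapshot
} from 'firebase/firestore';

async function hydrateFromBundle(): Promise<void> {
  const db = getFirestore();

  // Load the server-provided bundle into the local cache.
  // (Assumes the server embedded the payload as `window.__FIRESTORE_BUNDLE__`.)
  await loadBundle(db, (window as any).__FIRESTORE_BUNDLE__);

  // Look up the query that was saved under a name when the bundle was built.
  const query = await namedQuery(db, 'latest-stories');
  if (query) {
    // The first snapshot is served from the cache the bundle just populated,
    // so the initial render needs no server roundtrip.
    onSnapshot(query, snapshot => {
      console.log(`Rendered ${snapshot.size} docs from bundle-backed cache`);
    });
  }
}
```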
+ +To give developers visibility into this process, the SDK provides progress updates, including the number of documents and bytes loaded so far. Once all the data has been successfully loaded into the cache, the SDK signals that the process is complete. + +## Error Handling + +The bundle loading process is designed to be robust. If an error is encountered at any point—for example, if the bundle data is malformed or there is an issue writing to the local cache—the entire operation is aborted. The SDK ensures that the application is notified of the failure, allowing developers to catch the error and implement appropriate fallback or recovery logic. + +## Interacting with Bundled Data + +Once a bundle has been loaded, the data it contains is available for querying. If the bundle included named queries, you can use the `getNamedQuery()` method to retrieve a `Query` object, which can then be executed. + +When a named query is executed, the Firestore SDK will first attempt to fulfill the query from the local cache. If the results for the named query are available in the cache (because they were loaded from a bundle), they will be returned immediately, without a server roundtrip. diff --git a/packages/firestore/devdocs/code-layout.md b/packages/firestore/devdocs/code-layout.md new file mode 100644 index 0000000000..d46cb3a194 --- /dev/null +++ b/packages/firestore/devdocs/code-layout.md @@ -0,0 +1,35 @@ +# SDK Code Layout + +This document explains the code layout in this repository. It is closely related to the [architecture](./architecture.md). + +* `src/`: Contains the source code for the main `@firebase/firestore` package. + * `api/`: **API Surface**. Implements the public API (e.g., `doc`, `collection`, `onSnapshot`). + * `database.ts`: The entry point for the `Firestore` class. + * `reference.ts`: Implements `DocumentReference` and `CollectionReference`. + * `core/`: **Sync Engine**. Contains the high-level orchestration logic. 
+ * `sync_engine.ts`: The central coordinator. It manages the "User World" <-> "System World" bridge, `TargetID` allocation, and the main async queue. + * `event_manager.ts`: Handles `QueryListener` registration, fan-out (deduplication of identical queries), and raising snapshot events to the user. + * `query.ts`: Defines the internal `Query` and `Target` models. + * `firestore_client.ts`: The initialization logic that wires up the components. + * `local/`: **Storage and Query Execution**. Manages persistence, caching, and local execution. + * `local_store.ts`: The main interface for the Core layer to interact with storage. It coordinates the components below. + * `indexeddb_persistence.ts`: The implementation of the [Persistence Schema](./persistence-schema.md) using IndexedDB. + * `local_documents_view.ts`: Implements the logic to assemble the user-facing view (`RemoteDoc` + `Mutation`). + * `query_engine.ts`: The optimizer that decides how to scan the cache. + * `lru_garbage_collector.ts` & `reference_delegate.ts`: Implements the Sequence Number logic to clean up old data. + * `remote/`: **Network**. Handles gRPC/REST communication. + * `remote_store.ts`: Manages the "Watch Stream" (listening to queries) and the "Commit Stream" (sending mutations). + * `connection.ts`: Abstracts the underlying networking transport. + * `serializer.ts`: Converts between internal model objects and the Protobuf format used by the backend. + * `model/`: Defines the immutable data structures used throughout the SDK (e.g., `DocumentKey`, `FieldPath`, `Mutation`). + * `util/`: General purpose utilities (AsyncQueue, Assertions, Types). +* `lite/`: Defines the entrypoint code for the `@firebase/firestore/lite` package. +* `test/`: Contains all unit and integration tests for the SDK. The tests are organized by component and feature, and they are essential for ensuring the quality and correctness of the code. 
+* `scripts/`: Contains a collection of build and maintenance scripts used for tasks such as bundling the code, running tests, and generating documentation. + +TODO: Add more detailed information on each folder as appropriate + +TODO: Mention critical entry points + - `package.json` for packages and common commands. Go to [build.md](./build.md) for details + - Rollup configs for the main and lite SDKs. Go to [build.md](./build.md) for details + - Test entry points. Go to [testing.md](./testing.md) for details \ No newline at end of file diff --git a/packages/firestore/devdocs/garbage-collection.md b/packages/firestore/devdocs/garbage-collection.md new file mode 100644 index 0000000000..376915d0e5 --- /dev/null +++ b/packages/firestore/devdocs/garbage-collection.md @@ -0,0 +1,62 @@ +# Garbage Collection (LRU) + +This document details how the SDK manages local cache size to prevent unbounded growth. It explains the distinction between Eager and LRU collection, the criteria for deletion, and the sequence-number-based algorithm used to identify old data. + +## Strategies: Eager vs. LRU + +The SDK employs two different strategies depending on the persistence mode: + +1. **Eager GC (Memory Persistence)**: + * Used when persistence is disabled (in-memory only). + * **Behavior**: When a query is stopped (`unsubscribe()`), the SDK immediately releases the reference to the data. If no other active query references those documents, they are deleted from memory instantly. + * **Pros/Cons**: Extremely memory efficient, but offers no offline caching across app restarts. + +2. **LRU GC (Disk Persistence)**: + * Used when persistence is enabled (IndexedDB). + * **Behavior**: When a query is stopped, the data remains on disk. A background process periodically checks the total cache size. If it exceeds a threshold, the "Least Recently Used" data is purged.
+ * **Pros/Cons**: Supports offline apps and faster re-querying, but requires complex management of "Sequence Numbers" to track usage. + +*The rest of this document focuses on the LRU strategy.* + +## What is Collected? + +Garbage collection runs in the background. It does not indiscriminately delete data. It looks for **Eligible** items: + +### 1. Inactive Targets +A `Target` (internal query representation) is eligible for collection if it is no longer being listened to by the user. + +### 2. Orphaned Documents +A document is only eligible for collection if it is **Orphaned**. A document is Orphaned if: +* **No Active Targets**: It does not match *any* currently active query listener. +* **No Pending Mutations**: There are no local edits (Sets/Patches) waiting to be sent to the backend. + +> **Note**: Mutations are *never* garbage collected. They are only removed once the backend accepts or rejects them. + +## Key Concepts + +### Sequence Numbers (The Logical Clock) +To determine "recency," the SDK maintains a global `lastListenSequenceNumber` in the **Target Globals** store (`targetGlobal`). +* **Tick**: Every transaction (write, query listen, remote update) increments this number. +* **Tagging**: When a Target is actively listened to or updated, its `lastListenSequenceNumber` is updated to the current global tick. +* **Effect**: Higher numbers = More recently used. + +### The Reference Map (`targetDocuments`) +The **Target-Document Index** (`targetDocuments`) acts as a reference counter linking Documents to Targets. +* **Active Association**: If `targetId: 2` matches `documentKey: A`, a row exists. +* **Sentinel Rows (`targetId: 0`)**: If a document exists in the cache but is not matched by *any* specific target (perhaps previously downloaded, or part of a target that was deleted), it may have a row with `targetId: 0`. This marks the document as present but potentially orphaned. 
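The sequence-number bookkeeping described above can be modeled with a small, self-contained sketch. The row shape and function names here are illustrative only; the real logic lives in `lru_garbage_collector.ts` and operates over IndexedDB indexes rather than in-memory arrays.

```typescript
// Simplified model of a Target-Document Index row described above.
interface TargetDocumentRow {
  targetId: number; // 0 = sentinel row (document present but unreferenced)
  documentKey: string;
  sequenceNumber: number; // logical clock value when the row was last touched
}

// Find the sequence number at the given percentile: everything at or below
// this "Upper Bound" counts as least-recently-used and is eligible to sweep.
function upperBoundSequenceNumber(
  rows: TargetDocumentRow[],
  percentile: number
): number {
  const sorted = rows.map(r => r.sequenceNumber).sort((a, b) => a - b);
  const count = Math.floor(sorted.length * percentile);
  return count === 0 ? -1 : sorted[count - 1];
}

// A document is orphaned if it only has sentinel rows (targetId === 0),
// i.e. no active target still references it.
function orphanedKeys(rows: TargetDocumentRow[]): Set<string> {
  const referenced = new Set(
    rows.filter(r => r.targetId !== 0).map(r => r.documentKey)
  );
  return new Set(
    rows
      .filter(r => r.targetId === 0 && !referenced.has(r.documentKey))
      .map(r => r.documentKey)
  );
}
```

For example, a 25% cull over four rows selects the lowest sequence number as the Upper Bound, and only documents whose rows are all sentinels land in the orphaned set.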
+ +## The Collection Algorithm + +The `LruGarbageCollector` runs periodically (e.g., every few minutes). + +1. **Threshold Check**: It calculates the byte size of the current cache. If `CurrentSize < CacheSizeBytes` (default 40 MB in the JS SDK), the process aborts. +2. **Calculate Cutoff**: + * The GC decides how many items to cull (e.g., 10%). + * It queries the **Target-Document Index** (`targetDocuments`) table, ordered by `sequenceNumber` ASC. + * It finds the sequence number at the 10th percentile. This becomes the **Upper Bound**. +3. **Sweep Targets**: + * Any Target in the **Targets** table (`targets`) with a `lastListenSequenceNumber` <= **Upper Bound** is deleted. + * This removes the "Active" link for any documents associated with that target. +4. **Sweep Documents**: + * The GC scans for documents that have *no* rows in the **Target-Document Index** (or only sentinel rows) AND have a sequence number <= **Upper Bound**. + * These "Orphaned" documents are deleted from the **Remote Document Cache** (`remoteDocuments`). \ No newline at end of file diff --git a/packages/firestore/devdocs/limbo-resolution.md b/packages/firestore/devdocs/limbo-resolution.md new file mode 100644 index 0000000000..3cdaafc7b8 --- /dev/null +++ b/packages/firestore/devdocs/limbo-resolution.md @@ -0,0 +1,51 @@ +# Limbo Resolution & Consistency + +This document details how the Firestore Client SDKs ensure the local cache remains consistent with the server after connectivity interruptions, specifically focusing on **Limbo Resolution**, **Resume Tokens**, and **Existence Filters**. + +## The Problem: Offline Drift + +When a client is online and listening to a query, the backend sends specific `Change` events (Added, Modified, Removed). The SDK keeps the local cache in sync by applying these deltas. + +However, when a client goes offline or disconnects: +1. **Drift Occurs**: Documents may be deleted or modified on the server such that they no longer match the query. +2.
**Missing Negatives**: Upon reconnecting, the backend sends updates for documents that *changed*. However, if a document was deleted while the client was offline, the backend (using a Resume Token) might not send a specific "Delete" event because it doesn't track exactly which documents every offline client currently holds in their specific local cache. + +This creates a state where the local cache has "stale" documents that the server no longer considers part of the result set. + +## Mechanism 1: Resume Tokens (The Happy Path) + +To minimize bandwidth, the SDK and Backend use **Resume Tokens**. +* **Token**: An opaque binary blob (encoding a timestamp) received in every `ListenResponse`. +* **Resume**: When the SDK reconnects, it sends the last token it received. +* **Delta**: The backend sends only documents that have changed *since* that timestamp. +* **Gap**: This mechanism handles *adds* and *updates* perfectly. It struggles with *removes* if the server history is compacted or complex. + +## Mechanism 2: Existence Filters + +To detect inconsistency caused by the "Missing Negatives" problem, the backend includes an **Existence Filter** in the `ListenResponse`. + +* **The Count**: An Existence Filter is essentially a count of how many documents the server believes match the query. +* **The Check**: The SDK compares this count with the number of documents in its local cache for that specific target. +* **Mismatch**: If `LocalCount > ServerCount`, the SDK knows it is holding onto stale data. This triggers **Limbo Resolution**. + +## Mechanism 3: Bloom Filters (The Solution) + +Historically, upon detecting a mismatch, the SDK would drop the Resume Token and re-download the entire query result. This was expensive. The modern approach uses **Bloom Filters**. + +1. **Construction**: The backend constructs a Bloom Filter containing the IDs of all documents currently matching the query. +2. 
**Transmission**: This filter is sent to the client (probabilistic, highly compressed). +3. **Local Check**: The SDK checks every local document key against this Bloom Filter. + * **If NOT in filter**: The document is definitely gone from the server result set. + * **If IN filter**: The document *probably* exists (Bloom filters have false positives). + +## Limbo Resolution State Machine + +Documents that exist locally but failed the Bloom Filter check (or may be Bloom Filter false positives) enter **Limbo**. + +1. **Limbo State**: The document is present in the local cache, but the View cannot confirm if it is valid. We cannot simply delete it immediately because it might match *another* active query, or it might be a false positive from the Bloom Filter. +2. **Resolution Request**: The SDK spins up a dedicated, ephemeral `Watch` listener (effectively a `GetDocument` lookup) specifically for the key in Limbo. +3. **Outcome**: + * **Found**: The server confirms the document exists and sends the latest version. The SDK updates the cache. + * **Not Found**: The server returns "Not Found". The SDK deletes the document from the local cache (and removes it from the Query View). + +This process ensures that the client converges to a consistent state without re-downloading the entire result set, optimizing bandwidth usage. \ No newline at end of file diff --git a/packages/firestore/devdocs/multi-tab.md b/packages/firestore/devdocs/multi-tab.md new file mode 100644 index 0000000000..4f78d6621e --- /dev/null +++ b/packages/firestore/devdocs/multi-tab.md @@ -0,0 +1,7 @@ +# Multi-Tab Support + +This document outlines how the Firestore JS SDK works across multiple tabs in the browser. + +> **Note**: This documentation is currently under construction. + +Reminder to self on what to include: + - This feature is only relevant in the JS SDK. The mobile SDKs do not have a concept of tabs and therefore do not need to coordinate.
+ - End users might have multiple tabs open to the same website, which creates contention on IndexedDB and the backend. \ No newline at end of file diff --git a/packages/firestore/devdocs/overview.md b/packages/firestore/devdocs/overview.md new file mode 100644 index 0000000000..d7e74a4b33 --- /dev/null +++ b/packages/firestore/devdocs/overview.md @@ -0,0 +1,97 @@ +# Firestore JavaScript SDK Overview + +This document is the starting point for navigating the Firestore JavaScript SDK codebase documentation. It provides a high-level overview of the SDK, how it is built and tested, and the developer workflow. + +All contributors are expected to be familiar with the [prerequisites](./prerequisites.md) before working in this codebase. + +## Project Goals + +The Firestore JavaScript SDK is one of the official client-side libraries for interacting with [Google Cloud Firestore](https://firebase.google.com/docs/firestore). It is designed to be used in a variety of JavaScript environments, including web browsers (primary and common) and Node.js (secondary and rare). It is important to distinguish this SDK from the [Google Cloud Firestore server-side SDK for Node.js](https://github.com/googleapis/nodejs-firestore). While this SDK can run in Node.js, it is primarily designed for client-side use. The server-side SDK is intended for trusted environments and offers different capabilities. However, the two SDKs are designed to harmonize where helpful (e.g. data models) to facilitate easier full-stack application development. + +The primary goals of this SDK are: + +* Provide a simple and intuitive API for reading and writing data to Firestore. +* Support real-time data synchronization with streaming queries. +* Enable offline data access and query caching. +* Offer a lightweight version for applications that do not require advanced features.
+* Maintain API and architectural symmetry with the [Firestore Android SDK](https://github.com/firebase/firebase-android-sdk) and [Firestore iOS SDK](https://github.com/firebase/firebase-ios-sdk). This consistency simplifies maintenance and makes it easier to port features between platforms. The public API is intentionally consistent across platforms, even if it means being less idiomatic, to allow developers to more easily port their application code. + +## Designed for Flicker-Free Responsiveness + +Firestore is designed to help developers build responsive front-end applications that eliminate UI flicker. + +1. **Immediate Cache Results**: The SDK returns query results from the local cache immediately, while fetching the latest data from the server in the background. +2. **Optimistic Updates**: Writes are applied to the local cache *instantly*, allowing the UI to update without waiting for network confirmation. +3. **Background Synchronization**: The SDK handles the network communication to commit these changes to the backend asynchronously. + +*This means the "Happy Path" handles latency automatically. You don't write special code to manage loading states for every interaction; the SDK provides instant feedback by default.* + + + +## Operational Modes + +At a high level, all interactions with Firestore can be categorized as either reading or writing data. The SDK provides different mechanisms for these operations, each with distinct guarantees and performance characteristics. + +### Read Operations + +There are two fundamental ways to read data from Firestore: + +* **One-Time Reads**: This is for fetching a snapshot of data at a specific moment. It's a simple request-response model. You ask for a document or the results of a query, and the server sends back the data as it exists at that instant. + +* **Real-Time Listeners**: This allows you to subscribe to a document or a query. 
The server first sends you the initial data and then continues to push updates to your client in real time as the data changes. This is the foundation of Firestore's real-time capabilities. + +### Write Operations + +All data modifications—creates, updates, and deletes—are treated as "writes." The SDK is designed to make writes atomic and resilient. There are two fundamental ways to write data to Firestore: + +* **One-Time Writes**: When a user performs a write (create, update, or delete), the operation is not sent directly to the backend. Instead, it's treated as a "mutation" and added to the local **Mutation Queue**. The SDK "optimistically" assumes the write will succeed on the backend and immediately reflects the change in the local view of the data, making the change visible to local queries. The SDK then works to synchronize this queue with the backend. This design is crucial for supporting offline functionality, as pending writes can be retried automatically when network connectivity is restored. + +* **Transactions**: For grouping multiple write operations into a single atomic unit, the SDK provides `runTransaction`. Unlike standard writes, transactions do not use the optimistic, offline-capable write pipeline (Mutation Queue). Instead, they use an **Optimistic Concurrency Control** mechanism dependent on the backend. + * They are **Online-only**: Reads and writes communicate directly with the backend via RPCs. + * They are **Atomic**: All operations succeed or fail together. + * They are **Retriable**: The SDK automatically retries the transaction if the underlying data changes on the server during execution. + * For implementation details, see [Transactions](./transactions.md). + +## Key Concepts & Vocabulary + +* **Query**: The client-side representation of a data request (filters, order bys). + +* **Mutation**: A user-initiated change (Set, Update, Delete). Mutations are queued locally and sent to the backend. 
+* **Overlay**: The computed result of applying a Mutation to a Document. We store these to show "Optimistic Updates" instantly without modifying the underlying "Remote Document" until the server confirms the write. +* **Limbo**: A state where a document exists locally and matches a query, but the server hasn't explicitly confirmed it belongs to the current snapshot version. The SDK must perform "Limbo Resolution" to ensure these documents are valid. + +For a detailed explanation of how these concepts interact during execution, see the [Query Lifecycle](./query-lifecycle.md) documentation. + +## Artifacts + +The Firestore JavaScript SDK is divided into two main packages: + +* `@firebase/firestore`: The main, full-featured SDK that provides streaming and offline support. +* `@firebase/firestore/lite`: A much lighter-weight (AKA "lite") version of the SDK for applications that do not require streaming or offline support. + + +## Documentation Map + +To navigate the internals of the SDK, use the following guide: + +### Getting Started (Build & Run) +* **[Start Here: Build & Run](../CONTRIBUTING.md)**: How to set up the repo, build the SDK, and run tests. + +### Core Concepts +* **[Architecture](./architecture.md)**: The high-level block diagram of the system (API -> Core -> Local -> Remote). +* **[Query Lifecycle](./query-lifecycle.md)**: The state machine of a query. **Read this** to understand how querying and offline capabilities work. +* **[Write Lifecycle](./write-lifecycle.md)**: How writes work (Mutations, Batches, Overlays). + +### Subsystem Deep Dives +* **[Query Execution](./query-execution.md)**: Details on the algorithms used by the Local Store to execute queries (Index Scans vs. Full Collection Scans). +* **[Garbage Collection](./garbage-collection.md)**: Details the LRU algorithm, Sequence Numbers, and how the SDK manages cache size. +* **[Persistence Schema](./persistence-schema.md)**: A reference guide for the IndexedDB tables. 
+* **[Transactions](./transactions.md)**: How the SDK implements Optimistic Concurrency Control, retries, and the online-only write pipeline. +* **[Limbo Resolution](./limbo-resolution.md)**: How the SDK detects and cleans up stale data using Existence Filters and Bloom Filters. +* **[Bundles](./bundles.md)**: How the SDK loads and processes data bundles. + +### Developer Guides +* **[Code Layout](./code-layout.md)**: Maps the architectural components to specific source files and directories. +* **[Build Process](./build.md)**: How to build the artifacts. +* **[Testing](./testing.md)**: How to run unit and integration tests. +* **[Spec Tests](./spec-tests.md)**: Deep dive into the cross-platform JSON test suite. \ No newline at end of file diff --git a/packages/firestore/devdocs/persistence-schema.md b/packages/firestore/devdocs/persistence-schema.md new file mode 100644 index 0000000000..d018f84ec0 --- /dev/null +++ b/packages/firestore/devdocs/persistence-schema.md @@ -0,0 +1,68 @@ +# Persistence Schema (IndexedDB) + +The Firestore JS SDK persists data to IndexedDB to support offline querying, latency compensation, and app restarts. + +While the Android/iOS SDKs use SQLite, the JS SDK uses IndexedDB Object Stores. However, the logical schema is identical across platforms to ensure consistent behavior. + +## Core Object Stores + +### Remote Document Cache +* **Concept**: The client's cache of the backend's "Source of Truth." +* **Implementation**: Stored in `remoteDocuments` (legacy: `remoteDocumentsV14`). +* **Key**: `DocumentKey` (Path to the document). +* **Value**: + * **Document Data**: The serialized Protobuf of the document. + * **ReadTime**: The snapshot version at which this document was read. +* **Note**: This store **never** contains local, unacknowledged writes. It only contains data confirmed by the server. To see what the developer sees, we overlay the **Mutation Queue** on top of this. 
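The layering described in this note can be sketched as follows. This is a simplified model with hypothetical types, not the SDK's actual `remoteDocuments` store or mutation classes:

```typescript
// A minimal sketch of the overlay read path: the remote cache holds only
// server-confirmed data, and pending writes are layered on top at read time.
type Fields = Record<string, unknown>;

interface PatchMutation {
  key: string;              // document path, e.g. "users/alice"
  changedFields: Fields;    // fields the pending write modifies
}

const remoteDocuments = new Map<string, Fields>();   // server-confirmed state
const pendingMutations: PatchMutation[] = [];        // unacknowledged writes

// What the developer sees: the remote doc with pending mutations applied on top.
function viewOf(key: string): Fields | undefined {
  const base = remoteDocuments.get(key);
  const overlays = pendingMutations.filter(m => m.key === key);
  if (base === undefined && overlays.length === 0) return undefined;
  return overlays.reduce(
    (doc, m) => ({ ...doc, ...m.changedFields }),
    { ...(base ?? {}) }
  );
}

// The remote cache itself is never touched by the local write:
remoteDocuments.set('users/alice', { name: 'Alice', score: 10 });
pendingMutations.push({ key: 'users/alice', changedFields: { score: 11 } });
```

The cached remote document is only replaced once the server acknowledges the write; until then, reads see the composed result.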
+ +### Mutation Queue +* **Concept**: The "Pending Writes" queue. An ordered log of all local writes that have not yet been acknowledged by the server. +* **Implementation**: Split across `mutationQueues` (User Metadata) and `mutations` (Batch Data). +* **Key**: `(userId, batchId)`. Segregating by User ID ensures that if a user logs out, their pending writes do not leak to the next user. +* **Value**: A serialized `DbMutationBatch` containing one or more mutations (Set, Patch, Delete). +* **Behavior**: This is the "Single Source of Truth" for local changes. If the app restarts, the SDK replays these mutations to rebuild the **Document Overlays**. When the network is available, the `RemoteStore` reads from this queue to send write batches to the backend. Once acknowledged, entries are removed. + +### Document Overlays +* **Concept**: A cache of the *result* of applying pending mutations. +* **Implementation**: Stored in `documentOverlays`. +* **Key**: `(userId, documentKey)`. +* **Purpose**: Read Performance. Without this table, the SDK would have to read the Remote Document and re-apply every pending mutation from the queue every time a query ran. +* **Lifecycle**: Created immediately when a user writes. Deleted immediately when the backend acknowledges the write (or rejects it). +* **Priority**: When the `LocalStore` reads a document, it checks this table first. If an entry exists, it takes precedence over **Remote Document Cache**. + +### Targets +* **Concept**: Metadata about active and cached queries. +* **Implementation**: Stored in `targets`. +* **Key**: `TargetId` (Internal Integer allocated by `SyncEngine`). +* **Value**: + * **Canonical ID**: A hash string representing the query (filters, sort order). Used for deduplication. + * **Resume Token**: An opaque token from the backend used to resume a stream without re-downloading all data. + * **Last Sequence Number**: Used for Garbage Collection (LRU). 
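The user-keyed queue and acknowledgement flow described above can be sketched like this (hypothetical shapes, not the real `DbMutationBatch` schema):

```typescript
// Illustrative sketch of (userId, batchId) keying: one user's pending
// writes never leak to the next user, and acknowledged batches are removed.
interface MutationBatch {
  batchId: number;
  mutations: Array<{ key: string; data: Record<string, unknown> }>;
}

// Keyed by user, mirroring the per-user segregation described above.
const mutationQueue = new Map<string, MutationBatch[]>();

function enqueue(userId: string, batch: MutationBatch): void {
  const batches = mutationQueue.get(userId) ?? [];
  batches.push(batch);
  mutationQueue.set(userId, batches);
}

// Once the backend acknowledges a batch, its entry is removed.
function acknowledge(userId: string, batchId: number): void {
  const batches = mutationQueue.get(userId) ?? [];
  mutationQueue.set(userId, batches.filter(b => b.batchId !== batchId));
}

enqueue('userA', { batchId: 1, mutations: [{ key: 'rooms/1', data: { open: true } }] });
enqueue('userB', { batchId: 1, mutations: [{ key: 'rooms/2', data: { open: false } }] });
acknowledge('userA', 1);
```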
+ +### Target-Document Index +* **Concept**: A reverse index mapping `TargetId` $\leftrightarrow$ `DocumentKey`. +* **Implementation**: Stored in `targetDocuments`. +* **Purpose**: + 1. **Query Execution**: Quickly identify documents for a query. + 2. **Garbage Collection**: Acts as a reference counter. If a document has entries here with active `TargetId`s, it cannot be collected. +* **Sentinel Rows**: A row with `TargetId = 0` indicates the document exists in the cache but may not be attached to any active listener. These are primary candidates for Garbage Collection. +* **Maintenance**: This is updated whenever a remote snapshot adds/removes a document from a query view. + +## Metadata & Garbage Collection Stores + +### Target Globals +* **Concept**: A singleton store for global system state. +* **Implementation**: Stored in `targetGlobal`. +* **Key**: Fixed singleton key. +* **Value**: + * **`lastListenSequenceNumber`**: A global integer counter incremented on every transaction. + * **`targetCount`**: Number of targets currently tracked. + +### Remote Document Changes (Ephemeral) +* **Concept**: A temporary staging area used during `SyncEngine` processing. +* **Purpose**: Used to track read-time updates for documents during a remote event application before they are committed to the main **Remote Document Cache**. + +## Data Relationships + +1. **The "View"**: To construct a document for the developer, the SDK reads **Remote Document Cache** and applies any mutations found in **Mutation Queue** matching that key. +2. **Garbage Collection**: The `LruGarbageCollector` uses `TargetGlobal.lastListenSequenceNumber` and `Target.lastListenSequenceNumber` to determine which targets are old and can be evicted. It then uses **Target-Document Index** to find which documents are no longer referenced by *any* target and deletes them from **Remote Document Cache**. 
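The reference-counting walk in step 2 can be sketched as follows (illustrative structures, not the real object stores):

```typescript
// Simplified sketch: a document is an eviction candidate when its only
// entries in the target-document index are sentinel rows (TargetId = 0).
const SENTINEL_TARGET_ID = 0; // "in cache, but not attached to any target"

// Reverse index of TargetId -> DocumentKey pairs.
const targetDocuments: Array<{ targetId: number; key: string }> = [
  { targetId: 1, key: 'rooms/eros' },                   // held by a live target
  { targetId: SENTINEL_TARGET_ID, key: 'rooms/eros' },  // plus a sentinel row
  { targetId: SENTINEL_TARGET_ID, key: 'rooms/other' }, // sentinel only
];

// Documents whose sole references are sentinel rows are orphaned.
function orphanedDocuments(): string[] {
  const referenced = new Set(
    targetDocuments
      .filter(e => e.targetId !== SENTINEL_TARGET_ID)
      .map(e => e.key)
  );
  const all = new Set(targetDocuments.map(e => e.key));
  return Array.from(all).filter(key => !referenced.has(key));
}
```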
\ No newline at end of file diff --git a/packages/firestore/devdocs/prerequisites.md b/packages/firestore/devdocs/prerequisites.md new file mode 100644 index 0000000000..dc5f4962ae --- /dev/null +++ b/packages/firestore/devdocs/prerequisites.md @@ -0,0 +1,31 @@ +# Firestore JavaScript SDK Maintainer's Guide + +This document outlines the prerequisite knowledge for new maintainers of the Firestore JavaScript SDK. + +## Prerequisite Knowledge + +Before contributing to this codebase, you should have a strong understanding of the following technologies and concepts: + +### Core Technologies + +* **TypeScript:** The entire codebase is written in TypeScript. A deep understanding of TypeScript, including its type system, generics, and modules, is essential. +* **JavaScript (ES6+):** As a JavaScript SDK, a strong grasp of modern JavaScript features is required. +* **Node.js:** The SDK is isomorphic and runs in the Node.js environment. Familiarity with Node.js concepts, such as its module system and event loop, is important. +* **Browser Runtime Environment:** The SDK is also used in web browsers. A good understanding of the different browser execution contexts (e.g. main window, web/service workers) and subsystems (e.g. persistence like IndexedDB and Local Storage, networking) is necessary. + +### Build and Test Tooling + +* **Yarn:** We use Yarn for package management. You should be familiar with basic Yarn commands. +* **Rollup.js:** Our build process uses Rollup.js to bundle the code. Understanding Rollup's configuration and plugin system will be helpful. +* **Karma, Mocha, and Chai:** These are our testing frameworks. You should be comfortable writing and running tests using this stack. + + + +### Domain Knowledge + +* **[Google Cloud Firestore](https://firebase.google.com/docs/firestore):** A general understanding of Firestore's data model (documents, collections, subcollections), query language, and security rules is fundamental. 
+* **Databases:** A general understanding of databases, including key-value stores and relational databases, is helpful for understanding Firestore's design and trade-offs.
+* **Modern Web Application Architecture:** Familiarity with modern web application architecture, including server-side rendering (SSR), is beneficial for understanding how the SDK is used in practice.
+* **[Firebase](https://firebase.google.com/docs):** Familiarity with the Firebase platform is required, especially Firebase Auth and Firebase Functions.
+* **Protocol Buffers / gRPC:** The main SDK uses Protocol Buffers over gRPC to communicate with the Firestore backend. A basic understanding of these technologies is helpful.
+* **Firestore REST API:** The lite SDK uses the Firestore REST API. Familiarity with the REST API is useful when working on the lite version of the SDK.
diff --git a/packages/firestore/devdocs/query-execution.md b/packages/firestore/devdocs/query-execution.md
new file mode 100644
index 0000000000..9fd1cef364
--- /dev/null
+++ b/packages/firestore/devdocs/query-execution.md
@@ -0,0 +1,43 @@
+# Query Execution & Indexing
+
+*Note: This document details the internal algorithms used during **View Calculation** of the [Query Lifecycle](./query-lifecycle.md). It focuses on the performance and mechanics of the **Local Store**.*
+
+## The Query Engine
+
+The **Query Engine** is the component within the **Local Store** responsible for finding the set of document keys that match a given query. It does not load the full document data; it only identifies the keys. It employs a hierarchy of strategies, ordered by efficiency:
+
+1. **Full Index Scan (O(log N))**:
+   * Used when a Client-Side Index (CSI) exists that covers all filters and sort orders of the query.
+   * This is the most performant strategy.
+2. **Partial Index Scan**:
+   * Used when an index covers some filters (typically equality filters like `where('status', '==', 'published')`).
+ * The engine uses the index to narrow down the potential keys and then performs a scan on that smaller subset to verify the remaining conditions. +3. **Index-Free (Timestamp) Optimization**: + * **Concept**: If the client has been online and syncing, it knows the state of the collection up to a specific point in time (the `lastLimboFreeSnapshot`). + * **Mechanism**: The SDK assumes the "base state" (documents matching at the last snapshot) is correct. It then only scans the `remote_documents` table for documents modified *after* that snapshot version. + * This drastically reduces the work required for active listeners, changing the cost from *Collection Size* to *Recent Change Volume*. +4. **Full Collection Scan (O(N))**: + * The fallback strategy. The engine iterates through every document in the collection locally to check for matches. + +## Client-Side Indexing (CSI) + +To support efficient querying without blocking the main thread, the SDK utilizes a decoupled indexing architecture. + +* **Structure**: Index entries are stored in a dedicated `index_entries` table. An entry maps field values (e.g., `(coll/doc1, fieldA=1)`) to a document key. +* **The Index Backfiller**: Indexes are **not** updated synchronously when you write a document. Instead, a background task called the **Backfiller** runs periodically (when the SDK is idle). It reads new/modified documents and updates the index entries. +* **Hybrid Lookup**: Because the Backfiller is asynchronous, the index might be "stale" (behind the document cache). To guarantee consistency, the Query Engine performs a **Hybrid Lookup**: + 1. Query the **Index** for results up to the `IndexOffset` (the point where the Backfiller stopped). + 2. Query the **Remote Document Cache** for any documents modified *after* the `IndexOffset`. + 3. Merge the results. + +## Composite Queries (OR / IN) + +Queries using `OR` or `IN` are not executed as a single monolithic scan. 
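Instead, the engine rewrites such a composite filter into a list of AND-only sub-queries. A toy sketch of that rewrite, using hypothetical filter types rather than the SDK's real filter classes:

```typescript
// Toy sketch: turning a filter tree containing OR into Disjunctive Normal
// Form, i.e. a list of AND-only sub-queries.
type FieldFilter = { field: string; op: '==' | 'in'; value: unknown };
type Filter =
  | FieldFilter
  | { kind: 'and' | 'or'; filters: Filter[] };

// Returns the DNF terms: each inner array is one AND-only sub-query.
function toDnf(f: Filter): FieldFilter[][] {
  if (!('kind' in f)) return [[f]];
  const terms = f.filters.map(toDnf);
  if (f.kind === 'or') return terms.flat();
  // AND: cross-product of the operands' DNF terms.
  return terms.reduce<FieldFilter[][]>(
    (acc, t) => acc.flatMap(a => t.map(b => [...a, ...b])),
    [[]]
  );
}

// a == 1 AND (b == 2 OR c == 3) becomes two sub-queries:
//   [a == 1, b == 2] and [a == 1, c == 3]
const dnf = toDnf({
  kind: 'and',
  filters: [
    { field: 'a', op: '==', value: 1 },
    { kind: 'or', filters: [
      { field: 'b', op: '==', value: 2 },
      { field: 'c', op: '==', value: 3 },
    ]},
  ],
});
```

Each inner array can then run as an independent sub-query, with the resulting key sets unioned.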
+
+The SDK transforms these into **Disjunctive Normal Form (DNF)**—essentially breaking them into multiple sub-queries.
+
+> [!NOTE]
+> **Scalability & Watch**: While functionality exists to run these queries against the backend, the SDK implements the **DNF** transformation primarily to enable efficient **local** execution using simpler indexes (as seen in `IndexedDbIndexManager`). This allows the SDK to support complex queries offline or against the cache without requiring full table scans. [See Watch System](./watch.md) for more on the backend interaction.
+
+* **Execution**: Each sub-query is executed independently using the strategies above (Index vs. Scan).
+* **Union**: The resulting sets of document keys are unioned together in memory to produce the final result.
\ No newline at end of file
diff --git a/packages/firestore/devdocs/query-lifecycle.md b/packages/firestore/devdocs/query-lifecycle.md
new file mode 100644
index 0000000000..942127f8f0
--- /dev/null
+++ b/packages/firestore/devdocs/query-lifecycle.md
@@ -0,0 +1,101 @@
+# Query Lifecycle & Persistence State
+
+This document details the internal state machine of a Firestore query from creation to garbage collection. While [architecture.md](./architecture.md) covers the high-level flow of data, this document focuses on the management of internal resources, specifically the distinction between user-facing `Queries` and system-facing `Targets`.
+
+## Key Concepts
+
+* **Query**: The immutable public object representing the user's request (e.g., `coll.where("status", "==", "online")`).
+* **Target**: The internal representation of a Query sent to the backend. It is assigned a unique integer `TargetID`.
+* **TargetData**: Metadata persisted about a Target in the `targets` table, including its `resumeToken`, `snapshotVersion`, and `lastListenSequenceNumber`.
+* **Overlay**: A computed data structure representing a local mutation (Set/Patch) that has not yet been synced to the server.
The SDK "overlays" this on top of the Remote Document to calculate the user's view. +* **Limbo**: A synchronization state where a document exists locally and matches a query, but the backend hasn't explicitly confirmed it exists in the current snapshot version. + +--- + +## Phase 1: Fan-Out & Deduplication (Event Manager) + +When a user calls `onSnapshot(query, callback)`, the **Event Manager** does not immediately spawn a network request. + +1. **Deduplication**: The Event Manager calculates the "Canonical ID" of the query to check if an identical `Query` is already active in the system. +2. **Fan-Out**: + * **If Active**: The new listener is attached to the existing `QueryListener`. The current cached view is returned immediately via `fromCache: true`. + * **If New**: The Event Manager forwards the request to the **Sync Engine**. + +## Phase 2: Target Allocation (Sync Engine) + +The **Sync Engine** acts as the coordinator between the User World (Query) and the System World (Target). + +1. **Allocation**: The Sync Engine asks the **Local Store** to allocate a `TargetID` for the query. +2. **Persistence**: The `TargetData` is written to the `targets` table (IndexedDB) with an initial `sequence_number`. +3. **Watch Stream**: The Sync Engine instructs the **Remote Store** to begin listening to this `TargetID` over the network. + +## Phase 3: The "View" Calculation (Local Store) + +The "View" is what the user actually sees via the API. It is constructed purely from local data, allowing for offline access and optimistic updates. + +> **Formula:** `View = RemoteDocuments (Cache) + Overlays (Pending Writes)` + +When a query runs (initially or after a remote update), the **Local Store** performs the following: + +1. **Execution**: The `Query Engine` determines the most efficient strategy to find the set of matching `DocumentKeys`. + * *Deep Dive*: For details on Index Scans, Full Scans, and Optimization strategies, see [Query Execution & Indexing](./query-execution.md). 
+2. **Base State**: The store retrieves the confirmed server state from the `remote_documents` table. +3. **Overlay Application**: + * The store checks the `mutation_queue` for any pending writes associated with these keys. + * These mutations are converted into **Overlays**. + * The Overlay is applied on top of the Remote Document. +4. **Projection**: The final composed documents are sent to the Event Manager. + +*Note: This design ensures users always see their own pending writes immediately (Latency Compensation), even if the backend has not acknowledged them.* + +## Phase 4: Synchronization & Limbo Resolution + +As the **Remote Store** receives snapshot events from the backend, the Sync Engine reconciles the state. This usually follows a "Happy Path," but occasionally encounters "Limbo." + +### The Happy Path +1. Backend sends a `DocumentChange`. +2. Sync Engine updates `remote_documents` table. +3. Local Store recalculates the View. +4. If the view changes, Event Manager fires the user callback. + +### The Limbo State +Limbo occurs when the local cache holds a document that the server implies should not be there (usually detected via an **Existence Filter Mismatch**). + +1. **Detection**: The Sync Engine compares the server's count of matching documents against the local count. If they disagree, it initiates the resolution process. +2. **Resolution**: The SDK uses **Bloom Filters** to identify exactly which local documents are stale. These documents enter "Limbo." +3. **Verification**: The Sync Engine spins up a targeted listener for the Limbo documents. + * If the server returns the document: It is updated. + * If the server returns "Not Found": It is removed from the view. + +*For a detailed explanation of Resume Tokens, Bloom Filters, and the mechanics of this process, see **[Limbo Resolution](./limbo-resolution.md)**.* + + +## Phase 5: Teardown (Stop Listening) + +When a user calls `unsubscribe()`, data is **not** immediately deleted. + +1. 
**Event Manager**: Decrements the listener count for that Query. +2. **Zero-Count**: If the count hits 0, the Event Manager notifies the Sync Engine. +3. **Sync Engine**: + * Removes the `TargetID` from the Remote Store (stopping the network stream). + * **Crucial**: The data in `remote_documents` and the metadata in `targets` remains on disk. This allows for "Offline Query Acceleration" if the user restarts the app or re-issues the query later. + +## Phase 6: Garbage Collection (The "Death" of a Query) + +Since data persists after `unsubscribe()`, the SDK must actively manage disk usage. + +* **Eager GC**: If persistence is disabled, data is cleared from memory immediately when the listener count hits 0. +* **LRU GC**: If persistence is enabled, the data remains on disk for offline availability. + +The **LruGarbageCollector** runs periodically to keep the cache within the configured size (default 40MB/100MB). It uses a "Sequence Number" system to track when data was last used. + +For a detailed walkthrough of the algorithm, Sequence Numbers, and Orphaned Documents, see **[Garbage Collection](./garbage-collection.md)**. + + +## Debugging Tips + +If you are debugging a **"Zombie Document"** (data appearing that should be gone) or **"Missing Data"**: + +1. **Check `targets`**: Is there an active target (valid `resumeToken`) covering that document? +2. **Check `mutation_queue`**: Is there a pending mutation (BatchID) that hasn't been acknowledged? This creates an Overlay that persists even if the remote doc is deleted. +3. **Check `target_documents`**: Is the document explicitly linked to a TargetID that you thought was closed? 
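The membership check from Phase 4 can be sketched with a toy Bloom filter. This is not the SDK's real implementation (which hashes the document's full resource path with MD5); it only illustrates the "definitely stale vs. possibly present" decision:

```typescript
// Toy Bloom filter: the backend sends a filter built from the documents it
// believes are in the target. A local document NOT in the filter is
// definitely stale; false positives are possible, false negatives are not.
class ToyBloomFilter {
  private bits: Uint8Array;
  constructor(private size: number, private hashCount: number) {
    this.bits = new Uint8Array(size);
  }
  // Simple seeded string hash (illustrative only, not MD5).
  private hash(value: string, seed: number): number {
    let h = seed;
    for (let i = 0; i < value.length; i++) {
      h = (h * 31 + value.charCodeAt(i)) >>> 0;
    }
    return h % this.size;
  }
  insert(value: string): void {
    for (let s = 0; s < this.hashCount; s++) {
      this.bits[this.hash(value, s + 1)] = 1;
    }
  }
  mightContain(value: string): boolean {
    for (let s = 0; s < this.hashCount; s++) {
      if (!this.bits[this.hash(value, s + 1)]) return false;
    }
    return true;
  }
}

// Backend's view of the target:
const serverFilter = new ToyBloomFilter(256, 3);
serverFilter.insert('rooms/eros');

// Local cache holds an extra document the server no longer has;
// anything failing the filter check becomes a limbo candidate.
const localDocs = ['rooms/eros', 'rooms/deleted'];
const limboCandidates = localDocs.filter(k => !serverFilter.mightContain(k));
```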
\ No newline at end of file diff --git a/packages/firestore/devdocs/spec-tests.md b/packages/firestore/devdocs/spec-tests.md new file mode 100644 index 0000000000..8c66e2c112 --- /dev/null +++ b/packages/firestore/devdocs/spec-tests.md @@ -0,0 +1,69 @@ +# Spec Tests (Cross-Platform Verification) + +Spec Tests are the backbone of the Firestore SDK's correctness strategy. They are a suite of deterministic, data-driven tests that verify the behavior of the **Sync Engine**, **Local Store**, and **Event Manager** without connecting to a real backend. + +## The "Sandwich" Architecture + +Spec tests operate by mocking the edges of the SDK while running the core logic for real. + +1. **Mocked Top (API Layer)**: instead of user code calling `doc.set()`, the test runner injects "User Actions" directly into the Event Manager/Sync Engine. +2. **Real Middle (Core)**: The Sync Engine, Local Store, Query Engine, and Garbage Collector run exactly as they do in production. +3. **Mocked Bottom (Remote Store)**: The network layer (gRPC/REST) is mocked out. The test runner intercepts outgoing writes and injects incoming Watch Stream events (snapshots, acks). + +This allows us to simulate complex race conditions—such as a Write Acknowledgment arriving exactly when a Watch Snapshot is processing—that are impossible to reproduce reliably in a real integration test. + +## Cross-Platform Consistency + +A unique feature of the Firestore SDKs is that they share logic tests across platforms (JavaScript, Android, iOS). + +* **Source of Truth**: The tests are written in **TypeScript** within the JavaScript SDK repository. +* **Compilation**: A build script runs the TS tests in a special mode that exports the steps and expectations into **JSON files**. +* **Consumption**: The Android and iOS SDKs ingest these JSON files and run them using their own platform-specific Test Runners. 
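The compile-to-JSON flow can be sketched as follows (hypothetical step shapes; the real spec JSON schema is richer):

```typescript
// Toy sketch of exporting spec-test steps to platform-neutral JSON.
type SpecStep =
  | { userSet: [path: string, value: Record<string, unknown>] }
  | { writeAck: { version: number } };

interface SpecTestFile {
  describeName: string;
  steps: SpecStep[];
}

// A JS test run serializes its steps; other runners parse and replay them.
function exportSpec(name: string, steps: SpecStep[]): string {
  const file: SpecTestFile = { describeName: name, steps };
  return JSON.stringify(file, null, 2);
}

const json = exportSpec('Local writes are visible immediately', [
  { userSet: ['collection/key', { foo: 'bar' }] },
  { writeAck: { version: 1 } },
]);
```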
+
+This ensures that if a query behaves a certain way on the Web, it behaves *exactly* the same on an iPhone or Android device.
+
+## Anatomy of a Spec Test
+
+A spec test consists of a sequence of **Steps**. Each step performs an action or asserts a state.
+
+```typescript
+specTest('Local writes are visible immediately', [], () => {
+  // Step 1: User sets a document
+  userSets('collection/key', { foo: 'bar' });
+
+  // Step 2: Expect an event (Optimistic update, write not yet acknowledged)
+  expectEvents({
+    hasPendingWrites: true,
+    events: [{ type: 'added', doc: 'collection/key' }]
+  });
+
+  // Step 3: Network acknowledges the write
+  writeAcks(1); // Ack version 1
+
+  // Step 4: Watch stream sends the confirmed data
+  watchSends({ affects: ['collection/key'] }, ...);
+});
+```
+
+### Key Helpers
+* `userListens(query)`: Simulates a user calling `onSnapshot`.
+* `userSets(key, val)`: Simulates a user write.
+* `watchSends(snapshot)`: Simulates the backend sending data over the Watch stream.
+* `expectEvents(events)`: Asserts that the Event Manager emitted specific snapshots to the user.
+
+## Configuration & Tags
+
+Spec tests can be configured to run only in specific environments using **Tags**.
+
+* `multi-client`: Runs the test in a multi-tab environment (simulating IndexedDB concurrency).
+* `eager-gc`: Runs only when Garbage Collection is set to Eager (Memory persistence).
+* `durable-persistence`: Runs only when using IndexedDB/SQLite.
+* `exclusive`: **Debug Tool**. If you add this tag to a test, the runner will skip all other tests and only run this one. This is critical for debugging because the sheer volume of spec tests makes logs unreadable otherwise.
+
+## Debugging Spec Tests
+
+Debugging spec tests can be challenging because the code you are stepping through is often the *Test Runner* interpreting the JSON, rather than the test logic itself.
+
+1. **Use `exclusive`**: Always isolate the failing test.
+2. 
**Trace the Helper**: If `userListens` fails, set a breakpoint in the `spec_test_runner.ts` implementation of that step to see how it interacts with the Sync Engine.
+3. **Check Persistence**: Remember that spec tests usually run twice: once with Memory Persistence and once with IndexedDB. A failure might only happen in one mode.
+
diff --git a/packages/firestore/devdocs/testing.md b/packages/firestore/devdocs/testing.md
new file mode 100644
index 0000000000..3af4e7ac48
--- /dev/null
+++ b/packages/firestore/devdocs/testing.md
@@ -0,0 +1,31 @@
+# Testing Strategy
+
+This document provides a detailed explanation of the Firestore JavaScript SDK testing strategy, tech stack, and patterns and practices.
+
+## Tech Stack
+- Karma, Mocha, Chai
+
+The Firestore JS SDK employs a three-tiered testing strategy to ensure reliability, correctness, and cross-platform consistency.
+
+## 1. Unit Tests
+* **Scope**: Individual classes and functions.
+* **Location**: Co-located with source files (e.g., `src/core/query.test.ts`).
+* **Purpose**: Validating low-level logic, utility functions, and individual component behavior in isolation.
+
+## 2. Spec Tests (The Core Logic)
+* **Scope**: The interaction between the Sync Engine, Local Store, and Event Manager.
+* **Location**: `src/core/test/spec_test.ts` and `src/specs/*.ts`.
+* **Purpose**: Validating the complex state machine of Firestore without the flakiness of a real network. These tests mock the network layer to simulate specific protocol events (e.g., Watch stream updates, write handshakes).
+* **Cross-Platform**: These tests are exported as JSON and run by the Android and iOS SDKs to ensure consistent behavior across all platforms.
+* **Deep Dive**: See **[Spec Tests](./spec-tests.md)** for details on how to write and debug these.
+
+## 3. Integration Tests
+* **Scope**: End-to-End verification against a real Firestore backend (prod or emulator).
+* **Location**: `test/integration/`.
+* **Purpose**: Verifying that the client protocol actually matches what the real backend server expects. +* **Behavior**: These tests create real writes and listeners. They are slower and subject to network timing, but essential for catching protocol drifts. + +> **Note**: For instructions on how to run these tests, see **[CONTRIBUTING.md](../CONTRIBUTING.md)**. + diff --git a/packages/firestore/devdocs/transactions.md b/packages/firestore/devdocs/transactions.md new file mode 100644 index 0000000000..102321e072 --- /dev/null +++ b/packages/firestore/devdocs/transactions.md @@ -0,0 +1,72 @@ +# Transaction Lifecycle & Mechanics + +This document details the internal implementation of Transactions in the Firestore JavaScript SDK. Unlike standard writes, which use the [Write Lifecycle](./write-lifecycle.md) (Mutation Queue, Overlays, Sync Engine), Transactions are **online-only** operations that communicate directly with the backend. + +## Optimistic vs. Pessimistic Concurrency + +It is critical to distinguish how the Client-Side SDKs (JS, Android, iOS) handle transactions versus the Server-Side SDKs (Node.js, Java, Go, etc.). + +* **Server SDKs (Pessimistic Locking)**: + 1. Call `BeginTransaction` RPC. Backend returns a `TransactionID`. + 2. Reads hold a **Lock** on the documents on the server. + 3. Other clients cannot modify these documents until the transaction commits or times out. + 4. Writes are committed using the `TransactionID`. + +* **JavaScript / Mobile SDKs (Optimistic Concurrency)**: + 1. **No Locks**: The SDK reads documents without acquiring a server-side lock. + 2. **Preconditions**: When committing, the SDK sends the writes along with a **Precondition** (usually the `updateTime` of the document version that was read). + 3. **Verification**: The backend verifies that the documents have not changed since they were read. If they have, the transaction fails. + 4. 
**Retry**: The SDK automatically retries the transaction function (up to 5 times) with exponential backoff. + +### Why Optimistic? +Mobile and Web clients have unreliable network connectivity. If a client acquired a Pessimistic Lock and then lost connectivity, that lock would block all other writers to that document until it timed out. Optimistic concurrency ensures that a single flaky client cannot paralyze the system for others. + +## The Transaction Lifecycle + +Because transactions bypass the `LocalStore` and `SyncEngine`, they do not persist data to IndexedDB during execution. + +### 1. Execution +The `runTransaction` function accepts an `updateFunction`. This function is executed repeatedly until success or a non-retriable error occurs. + +### 2. Reads (`get`) +When a user reads a document inside a transaction: +* **Bypasses Cache**: The SDK does *not* look in `remote_documents` or `mutation_queue`. It forces a network fetch. +* **RPC**: It uses the `BatchGetDocuments` RPC. +* **No Transaction ID**: Unlike server SDKs, the `BatchGetDocuments` request does **not** include a Transaction ID. It is a standard read. +* **Versioning**: The SDK records the `updateTime` and `key` of every document read. These will be used later for verification. + +### 3. Writes (`set`, `update`, `delete`) +Writes are not sent to the network immediately. They are buffered in memory within the `Transaction` object. +* **Validation**: The SDK enforces the rule that **all reads must occur before any writes**. Once a write is buffered, the SDK throws an error if a subsequent read is attempted. + +### 4. Commit +When the `updateFunction` completes successfully, the SDK attempts to commit. +* **RPC**: It uses the `Commit` RPC (a single atomic batch). +* **Preconditions**: For every document that was read and is now being written to (or verified), the SDK attaches a `Precondition`. + * *Example*: "Only update `doc/A` if its `updateTime` is still `Timestamp(123)`." 
+* **Verify Mutations**: If a document was read but *not* modified, the SDK still needs to ensure it didn't change (as it might have influenced the business logic). The SDK adds a specific `VerifyMutation` to the commit batch, which performs a precondition check without modifying data.
+
+### 5. Backend Response
+* **Success**: The backend applies all writes atomically. The SDK returns the result to the user.
+* **Failure (Precondition Failed)**: This indicates contention (another client modified a document). The SDK captures this specific error code.
+* **Retry**: If the error is retriable (Precondition Failed, Aborted), the SDK waits for a backoff period and then **re-runs the `updateFunction` from the start**. This requires re-fetching the documents fresh.
+
+## Architectural Bypass
+
+Transactions utilize a dedicated pathway in the `RemoteStore`/`Datastore` layer.
+
+1. **API Layer**: `runTransaction` is called.
+2. **Core**: `TransactionRunner` manages the retry loop and backoff.
+3. **Remote Store**:
+    * Standard queries use the `WatchStream` (a long-lived connection).
+    * Standard writes use the `WriteStream` (fed from the persisted Mutation Queue).
+    * **Transactions** use direct unary RPCs (`BatchGetDocuments` and `Commit`) via the underlying `Datastore` helper.
+
+> [!WARNING]
+> **Consistency Warning**: Transactions use different RPCs (`BatchGetDocuments` and `Commit`) than the standard `Listen` (Watch) system. As a result, they **do not** guarantee consistency with the Watch stream. A write committed via Watch might not be immediately visible to a transaction, and vice versa, due to the distributed nature of the backend.
+
+## Constraints
+
+* **Online Only**: Because they bypass the local cache and require server-side verification, transactions require an active network connection. They will fail if the client is offline.
+* **Read-Your-Writes**: Within the scope of the transaction function, the SDK does not update the local cache with the pending writes. 
However, the `Transaction` object tracks its own buffered writes, so a read of `DocA` after a write to `DocA` (illegal in the public API, but conceptually relevant) would reflect the change.
+* **5 Retry Limit**: To prevent infinite loops during high contention, the SDK caps retries at 5 attempts.
\ No newline at end of file
diff --git a/packages/firestore/devdocs/watch.md b/packages/firestore/devdocs/watch.md
new file mode 100644
index 0000000000..4acaf7b84f
--- /dev/null
+++ b/packages/firestore/devdocs/watch.md
@@ -0,0 +1,41 @@
+# The Watch System
+
+This document explains the "Watch" backend system, which powers the real-time capabilities and standard read/write operations of the Firestore SDKs.
+
+## Overview
+
+"Watch" is the internal name for the high-scale system that the SDKs interact with. It serves two primary purposes:
+
+1. **Reads & Writes**: It is the main entry point for standard document operations.
+2. **Real-Time Listeners**: It powers the `onSnapshot` live updates by tracking database changes and pushing them to clients.
+
+## Architecture: The Reverse Proxy
+
+Watch functions as a massive **Reverse Proxy**.
+
+1. **Connection**: Vast numbers of end-user devices (phones, browsers) connect directly to Watch.
+2. **Routing**: Watch takes the incoming query or write and forwards it to the appropriate underlying storage backend (partition).
+3. **Observation**: For queries, Watch doesn't just fetch data once. It "watches" the backend for any writes that would affect the query results and pushes those changes to the subscribed client.
+
+This architecture allows Firestore to handle millions of concurrent connections while abstracting the complexity of sharding and storage from the client.
+
+## Consistency Guarantees
+
+The Watch system (exposed via the `Listen` endpoint) enables strong consistency between reads and writes.
+
+* **Consistency**: All reads and writes performed through this system are consistent with each other. 
If you write a document and then immediately read it (or listen to it) via Watch, you will see the latest version.
+* **Authentication**: Watch interacts directly with Firebase Auth to identify the user and enforces Firestore Security Rules for every operation.
+
+> [!IMPORTANT]
+> **Transactions** use different RPCs (`BatchGetDocuments` and `Commit`) and **do not** guarantee consistency with the Watch stream. See [Transactions](./transactions.md) for details.
+
+### Scalability and Client-Side Logic
+The Watch system operates at a massive scale. To maintain performance, the backend may rely on the SDK to handle certain query complexities, particularly for **local** execution and consistency.
+
+> [!NOTE]
+> **Composite Queries (OR/IN)**
+> While the Watch system supports complex queries (including `OR` and `IN`), the SDK performs significant client-side logic to support them efficiently **locally**.
+> * **Local Indexing**: For local cache execution, the SDK transforms composite queries into **Disjunctive Normal Form (DNF)** (breaking them into simpler sub-queries) to utilize simple field indexes.
+> * **Consistency**: The SDK merges results from the Watch stream and the local cache to ensure a consistent view.
+
+This architectural decision explains why you see complex logic like **Composite Query** execution in the [Query Engine](./query-execution.md). The SDK implements this logic to bridge the gap between user intent and Watch's scalability constraints.
diff --git a/packages/firestore/devdocs/write-lifecycle.md b/packages/firestore/devdocs/write-lifecycle.md
new file mode 100644
index 0000000000..a86c83d87d
--- /dev/null
+++ b/packages/firestore/devdocs/write-lifecycle.md
@@ -0,0 +1,54 @@
+# Write Lifecycle & Latency Compensation
+
+This document details the lifecycle of a write operation (Set, Update, Delete) from the moment the API is called to when it is committed to the backend. 
It focuses on **Mutations**, **Overlays**, and how the SDK achieves instant **Latency Compensation**.
+
+## Key Concepts
+
+* **Mutation**: An operation that modifies a document (e.g., `SetMutation`, `PatchMutation`, `DeleteMutation`).
+* **Mutation Batch**: A group of mutations that must be applied atomically. Every user write creates a new Batch with a unique `BatchID`.
+* **Overlay**: A "Materialized View" of the changes applied to a document. Instead of re-calculating the result of a mutation every time a query runs, the SDK calculates the result *once* at write time and saves it as an Overlay.
+
+## Phase 1: Mutation Creation & Batching
+
+When a user calls `setDoc` or `updateDoc`:
+1. **Validation**: The SDK validates the data locally.
+2. **Batching**: The operation is wrapped in a `MutationBatch`.
+3. **Persistence**: The batch is serialized and saved to the Mutation Queue (an IndexedDB object store).
+    * **Partitioning**: Queues are partitioned by User ID. If the user is offline, these batches accumulate in the queue.
+
+## Phase 2: Overlay Calculation (Optimization)
+
+To keep queries fast, the SDK does not apply raw mutations to remote documents during every query execution. Instead, it pre-calculates the result.
+
+1. **Base State**: The SDK retrieves the current state of the document (from the Remote Document cache).
+2. **Apply**: It applies the new `Mutation` to the base state to determine what the document *should* look like locally.
+3. **Persist Overlay**: This resulting state is saved to the Document Overlay cache.
+    * **Field Mask**: The overlay tracks specifically which fields were modified.
+4. **Latency Compensation**: The `EventManager` immediately triggers active listeners. The listeners read the `Overlay` instead of the `Remote Document`, giving the user the illusion of instant updates. 
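The overlay mechanics above can be sketched in a few lines. This is a conceptual illustration only: the `Doc` and `Overlay` shapes are hypothetical and are not the SDK's real types.

```typescript
// Conceptual sketch only: `Doc` and `Overlay` are hypothetical shapes,
// not the SDK's real types. An overlay stores the pre-computed local
// values for the fields touched by pending mutations; serving a document
// means laying those values over the cached remote version.
type Doc = Record<string, unknown>;

interface Overlay {
  mask: string[]; // fields the pending mutations modified
  data: Doc;      // locally computed values for those fields
}

function localView(remoteDoc: Doc, overlay: Overlay | null): Doc {
  if (overlay === null) {
    return remoteDoc; // nothing pending: serve the server state as-is
  }
  const view: Doc = { ...remoteDoc };
  for (const field of overlay.mask) {
    view[field] = overlay.data[field]; // the overlay wins for masked fields
  }
  return view;
}
```

For example, a pending `update({ title: "draft" })` would yield an overlay with `mask: ["title"]`; listeners then see the new title immediately, while untouched fields still come from the remote document.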
+
+> **Formula:** `Local View = Remote Document + Overlay`
+
+## Phase 3: Synchronization (The Write Pipeline)
+
+The `SyncEngine` manages the flow of data to the server:
+
+1. **Filling the Pipeline**: The `RemoteStore` reads the Mutation Queue in order of `BatchID` (FIFO).
+2. **Transmission**: Mutations are sent to the backend via gRPC (or REST in Lite).
+3. **Atomicity**: If a batch contains multiple writes, the backend guarantees they are applied together or not at all.
+
+## Phase 4: Acknowledgement & Cleanup
+
+When the backend responds:
+
+### Scenario A: Success (Ack)
+1. **Commit**: The backend commits the change and returns the authoritative version of the document (and transformation results, like server timestamps).
+2. **Update Remote**: The SDK updates the Remote Document cache with this new server version.
+3. **Cleanup**:
+    * The `MutationBatch` is removed from the Mutation Queue.
+    * The corresponding `Overlay` is removed from the Document Overlay cache (since the Remote Document now matches the desired state).
+4. **Re-Evaluation**: Active queries are re-run. Since the Overlay is gone, they now read the updated Remote Document.
+
+### Scenario B: Rejection
+1. The `MutationBatch` is removed.
+2. The `Overlay` is removed.
+3. The Local View reverts to the `Remote Document` state (rolling back the optimistic update).
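The two scenarios share the same cleanup path. A minimal sketch, with plain in-memory `Map`s standing in for the SDK's persistence layer (hypothetical shapes, not the real implementation):

```typescript
// Conceptual sketch only: plain Maps stand in for the SDK's persistence
// layer. On ack, the server's authoritative version replaces the cached
// remote document; on rejection, it does not. Either way, the batch and
// its overlay are dropped, so queries fall back to the remote document.
type Doc = Record<string, unknown>;

interface Cache {
  remoteDocuments: Map<string, Doc>;  // last known server versions
  mutationQueue: Map<number, Doc[]>;  // pending batches, keyed by BatchID
  documentOverlays: Map<string, Doc>; // pre-computed local views
}

function handleWriteResponse(
  cache: Cache,
  batchId: number,
  docKey: string,
  serverVersion: Doc | null // null => the batch was rejected
): void {
  if (serverVersion !== null) {
    // Ack: the backend's authoritative version becomes the cached state.
    cache.remoteDocuments.set(docKey, serverVersion);
  }
  // Both scenarios: drop the batch and its overlay. The local view now
  // equals the remote document (updated on ack, unchanged on rejection).
  cache.mutationQueue.delete(batchId);
  cache.documentOverlays.delete(docKey);
}
```

Note how rejection needs no explicit rollback step: deleting the overlay is the rollback, because the local view is always derived as remote document plus overlay.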