firebase · rafikhan · Aug 29, 2025 · Nov 4, 2025 · Nov 5, 2025 · Nov 5, 2025
@@ -5,6 +5,8 @@ contributing to the Firebase JavaScript SDK (including Cloud Firestore).
 Follow instructions there to install dependencies, build the SDK, and set up
 the testing environment.
 
+For a deep dive into the testing strategy and architecture, see [Testing Strategy](devdocs/testing.md).
+
 ## Integration Testing
 
 ### Setting up a project for testing

@@ -0,0 +1,5 @@
+# Firestore JavaScript SDK
+This project is the official JavaScript SDK for the [Google Cloud Firestore](https://firebase.google.com/docs/firestore) database.
+
+You are an expert in @devdocs/prerequisites.md
+@devdocs/overview.md
@@ -17,6 +17,10 @@ Docs][reference-docs].
 
 [reference-docs]: https://firebase.google.com/docs/reference/js/
 
+## Internal Documentation
+
+If you are a contributor or maintainer, please see the [Internal Developer Documentation](./devdocs/overview.md).
+
 ## Contributing
 See [Contributing to the Firebase SDK](../../CONTRIBUTING.md) for general
 information about contributing to the firebase-js-sdk repo and

@@ -0,0 +1,47 @@
+# Firestore JavaScript SDK Developer Documentation
+
+This folder contains the developer documentation for the Firestore JavaScript SDK. 
+
+**Audience:**
+1.  **Maintainers**: Engineers working on the SDK internals.
+2.  **AI Agents**: Automated assistants reading this documentation to understand the codebase.
+
+**Not the Audience:**
+-   Third-party developers (app developers). They should use the [official Firebase documentation](https://firebase.google.com/docs/firestore).
+
+# Entry Point
+Start at [./overview.md](./overview.md).
+
+# Content Guidelines
+
+## Principles
+-   **High-Level Focus**: Explain architecture and data flow. Avoid duplicating code.
+-   **Why > How**: Explain *why* a design choice was made. The code shows *how*.
+-   **Reference by Name**: Use exact component/interface names (e.g., `Persistence`, `EventManager`).
+
+## Terminology
+-   **Concepts First**: **Aggressively favor** high-level English concepts over code identifiers. Only drop down to code identifiers when absolutely necessary for precise mapping.
+    *   *Good*: "The Mutation Queue stores pending writes."
+    *   *Bad*: "The `mutationQueues` store contains `DbMutationBatch` objects."
+    *   *Acceptable (Mapping)*: "The Mutation Queue (implemented as `mutationQueues` store)..."
+-   **Avoid Over-Specification**: Do not generally reference private/internal variable names unless documenting that specific module's internals.
+-   **Strict Casing**: When you *must* reference code, use the **exact casing** found in the codebase (e.g., `mutationQueues`).
+-   **No "Translations"**: Never convert code names into snake_case. Either use the English Concept ("Remote Documents") or the exact Code Name (`remoteDocuments`).
+
+## Diagramming
+Use **Mermaid** for diagrams.
+-   Flowcharts for logic.
+-   Sequence diagrams for async protocols.
+-   Class diagrams for component relationships.
+
+# Style Guide
+-   **Syntax**: Markdown (GFM).
+-   **Voice**: Active voice ("The SDK does X").
+-   **Tense**: Present tense ("The query executes...").
+-   **Mood**: Imperative for instructions ("Run the build...").
+-   **Conciseness**: Short sentences. Bullet points where possible.
+
+# Maintenance
+-   **Co-location**: Keep documentation close to the code it describes (linked via `Code Layout`).
+-   **Atomic Updates**: Update documentation in the *same PR* as the feature or fix.
+-   **Freshness**: If you see stale docs, fix them immediately.
@@ -0,0 +1,76 @@
+# SDK Architecture
+
+This document provides a detailed explanation of the Firestore JavaScript SDK's architecture, its core components, and the flow of data through the system.
+
+## Core Components
+
+The SDK is composed of several key components that work together to provide the full range of Firestore features.
+
+![Architecture Diagram](./architecture.png)
+
+*   **API Layer**: The public-facing API surface that developers use to interact with the SDK. This layer is responsible for translating the public API calls into the internal data models and passing them to the appropriate core components.
+*   **Core**:
+    *   **Event Manager**: Acts as a central hub for all eventing in the SDK. It is responsible for routing events between the API Layer and Sync Engine. It manages query listeners and is responsible for raising snapshot events, as well as handling connectivity changes and some query failures.
+*   **Sync Engine**: The central controller of the SDK. It acts as the glue between the Event Manager, Local Store, and Remote Store.
+    *   **Target**: The backend protocol's internal representation of a recurring Query. While a `Query` is an expression of user intent (e.g., "users where age > 18"), a `Target` is the allocated stream ID (`TargetID`) that the Watch implementation uses to track that query's state over the network. The **Coordinator** maps ephemeral user Queries to stable system Targets.
+    *   **Coordinator**: It bridges the **User World** (Query) and **System World** (Target), converting public API calls into internal `TargetIDs`.
+    *   **View Construction**: It manages the user-facing view using the formula: `View = Remote Document + Overlay`.
+        *   **Remote Document**: The authoritative state from the backend.
+        *   **Overlay**: A computed delta representing pending local mutations.
+    *   **Limbo Resolution**: It detects "Limbo" documents (local matches not confirmed by server) and initiates resolution flows to verify their existence.
+    *   **Lifecycle Management**: It controls the [Query Lifecycle](./query-lifecycle.md), managing the initialization of streams, the persistence of data, and garbage collection eligibility.
+*   **Local Store**: A container for the components that manage persisted and in-memory data.
+    *   **Remote Table**: A cache of the most recent version of documents as known by the Firestore backend (A.K.A. Remote Documents).
+    *   **Mutation Queue**: A queue of all the user-initiated writes (set, update, delete) that have not yet been acknowledged by the Firestore backend.
+    *   **Local View**: A cache that represents the user's current view of the data, combining the Remote Table with the Mutation Queue.
+    *   **Query Engine**: Determines the most efficient strategy (Index vs. Scan) to identify documents matching a query in the local cache.
+    *   **Overlays**: A performance-optimizing cache that stores the calculated effect of pending mutations from the Mutation Queue on documents. Instead of re-applying mutations every time a document is read, the SDK computes this "overlay" once and caches it, allowing the Local View to be constructed more efficiently.
+    *   For a detailed breakdown of the IndexedDB structure and tables, see [Persistence Schema](./persistence-schema.md).
+*   **Remote Store**: The component responsible for all network communication with the Firestore backend.
+    *   It manages the **Watch Stream** (see **[The Watch System](./watch.md)**) for reading and listening to data.
+    *   It manages the gRPC streams for writing data.
+    *   It abstracts away the complexities of the network protocol from the rest of the SDK.
+*   **Persistence Layer**: The underlying storage mechanism used by the Local Store to persist data on the client. In the browser, this is implemented using IndexedDB.
+
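The View Construction formula above (`View = Remote Document + Overlay`) can be illustrated with a minimal sketch. All type and function names here (`RemoteDocumentSketch`, `OverlaySketch`, `applyOverlay`) are hypothetical and are not the SDK's real models; the treatment of a patch against a missing document is a simplifying assumption.

```typescript
// Hypothetical, simplified types for illustration only.
type Fields = Record<string, unknown>;

interface RemoteDocumentSketch {
  key: string;
  fields: Fields | null; // null: the backend has no such document
}

// The cached net effect of all pending mutations on one document.
type OverlaySketch =
  | { kind: 'set'; fields: Fields }
  | { kind: 'patch'; fields: Fields }
  | { kind: 'delete' };

// View = Remote Document + Overlay
function applyOverlay(
  remote: RemoteDocumentSketch,
  overlay?: OverlaySketch
): Fields | null {
  if (overlay === undefined) {
    return remote.fields; // no pending writes: the view is the remote state
  }
  switch (overlay.kind) {
    case 'set':
      return { ...overlay.fields };
    case 'patch':
      // Assumption: a patch against a missing document yields no document.
      return remote.fields === null
        ? null
        : { ...remote.fields, ...overlay.fields };
    case 'delete':
      return null;
  }
}

const remoteAlice: RemoteDocumentSketch = {
  key: 'users/alice',
  fields: { name: 'Alice', age: 30 }
};
// The view keeps the remote name and takes the locally patched age.
console.log(applyOverlay(remoteAlice, { kind: 'patch', fields: { age: 31 } }));
```

Because the overlay is computed once and stored, reading a document in this sketch is a single merge rather than a replay of every queued mutation.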
+The architecture and systems within the SDK map closely to the directory structure, which helps developers navigate the codebase. Here is a mapping of the core components to their corresponding directories.
+
+*   `src/`:
+    *   `api/`: Implements the **API Layer** for the main SDK.
+    *   `lite-api/`: Implements the **API Layer** for the lite SDK.
+    *   `core/`: Implements the **Sync Engine** and **Event Manager**.
+    *   `local/`: Implements the **Local Store**, which includes the **Mutation Queue**, **Remote Table**, **Local View**, and the **Persistence Layer**.
+    *   `remote/`: Implements the **Remote Store**, handling all network communication.
+
+For a more detailed explanation of the contents of each directory, see the [Code Layout](./code-layout.md) documentation.
+
+
+# Data Flow
+
+Here's a step-by-step walkthrough of how data flows through the SDK for write and read operations, referencing the core components.
+
+## Write Data Flow
+
+1.  **API Layer**: A user initiates a write operation (e.g., `setDoc`, `updateDoc`, `deleteDoc`).
+2.  **Sync Engine**: The call is routed to the Sync Engine, which wraps the operation in a "mutation".
+3.  **Mutation Queue (in Local Store)**: The Sync Engine adds this mutation to the Mutation Queue. The queue is persisted to the **Persistence Layer** (IndexedDB). At this point, the SDK "optimistically" considers the write successful locally.
+4.  **Local View (in Local Store)**: The change is immediately reflected in the Local View, making it available to any active listeners without waiting for backend confirmation.
+5.  **Remote Store**: The Sync Engine notifies the Remote Store that there are pending mutations.
+6.  **Backend**: The Remote Store sends the mutations from the queue to the Firestore backend.
+7.  **Acknowledgement**: The backend acknowledges the write.
+8.  **Mutation Queue (in Local Store)**: The Remote Store informs the Sync Engine, which then removes the acknowledged mutation from the Mutation Queue.
+
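The write steps above can be sketched with in-memory stand-ins. `LocalStoreSketch` and `MutationSketch` are hypothetical names, every write is treated as a whole-document set or delete, and recording the acknowledged value as the remote version is an assumption made so the example stays self-consistent.

```typescript
// Hypothetical in-memory stand-ins; a write is a whole-document set
// (fields) or a delete (null) to keep the sketch short.
type DocData = Record<string, unknown> | null;

interface MutationSketch {
  batchId: number;
  key: string;
  data: DocData;
}

class LocalStoreSketch {
  private remoteTable = new Map<string, DocData>();
  private mutationQueue: MutationSketch[] = [];
  private nextBatchId = 1;

  // Steps 2-3: wrap the write in a mutation and enqueue it.
  write(key: string, data: DocData): MutationSketch {
    const mutation: MutationSketch = { batchId: this.nextBatchId++, key, data };
    this.mutationQueue.push(mutation);
    return mutation;
  }

  // Step 4: the Local View is the remote version with pending mutations
  // for that key applied in queue order.
  read(key: string): DocData {
    let view: DocData = this.remoteTable.get(key) ?? null;
    for (const m of this.mutationQueue) {
      if (m.key === key) {
        view = m.data;
      }
    }
    return view;
  }

  // Steps 7-8: on acknowledgement, drop the mutation from the queue.
  // Assumption: the acknowledged value is also recorded as the remote version.
  acknowledge(batchId: number): void {
    const acked = this.mutationQueue.find(m => m.batchId === batchId);
    if (acked === undefined) {
      return;
    }
    this.remoteTable.set(acked.key, acked.data);
    this.mutationQueue = this.mutationQueue.filter(m => m.batchId !== batchId);
  }

  pendingCount(): number {
    return this.mutationQueue.length;
  }
}

const demoStore = new LocalStoreSketch();
const demoWrite = demoStore.write('users/alice', { name: 'Alice' });
// Visible locally before the backend confirms; one mutation pending.
console.log(demoStore.read('users/alice'), demoStore.pendingCount());
demoStore.acknowledge(demoWrite.batchId);
// Still visible after acknowledgement; the queue is empty.
console.log(demoStore.read('users/alice'), demoStore.pendingCount());
```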
+## Read Data Flow (with a Real-Time Listener)
+
+1.  **API Layer**: A user attaches a listener to a query (e.g., `onSnapshot`).
+2.  **Event Manager**: The Event Manager creates a listener and passes it to the Sync Engine.
+3.  **Sync Engine**: The Sync Engine creates a "view" for the query.
+4.  **Local View (in Local Store)**: The Sync Engine asks the Query Engine in the Local Store to execute the query. The Query Engine selects a strategy (e.g., Index Scan or Timestamp Optimization) to find matching keys. The Local Store then constructs the documents by applying cached Overlays on top of Remote Documents.
+5.  **API Layer**: The initial data from the Local View is sent back to the user's `onSnapshot` callback. This provides a fast, initial result.
+6.  **Remote Store**: Simultaneously, the Sync Engine instructs the Remote Store to listen to the query on the Firestore backend.
+7.  **Backend**: The backend returns the initial matching documents for the query.
+8.  **Remote Table (in Local Store)**: The Remote Store receives the documents and saves them to the Remote Table in the Local Store, overwriting any previously cached versions of those documents.
+9.  **Sync Engine**: The Sync Engine is notified of the updated documents. It re-calculates the query view by combining the new data from the Remote Table with any applicable pending mutations from the **Mutation Queue**.
+10. **API Layer**: If the query results have changed after this reconciliation, the new results are sent to the user's `onSnapshot` callback. This is why a listener may fire twice initially.
+11. **Real-time Updates**: From now on, any changes on the backend that affect the query are pushed to the Remote Store, which updates the Remote Table, triggering the Sync Engine to re-calculate the view and notify the listener.
+
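Steps 5 and 8 to 10 explain why a listener may fire twice at startup. The sketch below models only that behavior; `QueryListenerSketch`, `DocSketch`, and `SnapshotSketch` are hypothetical names, and the change detection is a crude, order-sensitive comparison rather than the SDK's actual view logic.

```typescript
interface DocSketch {
  key: string;
  value: number;
}

interface SnapshotSketch {
  docs: DocSketch[];
  fromCache: boolean;
}

class QueryListenerSketch {
  private current: DocSketch[] = [];
  private callback: (snapshot: SnapshotSketch) => void;

  constructor(callback: (snapshot: SnapshotSketch) => void) {
    this.callback = callback;
  }

  // Step 5: raise the initial snapshot from the Local View.
  raiseFromCache(cached: DocSketch[]): void {
    this.current = cached;
    this.callback({ docs: cached, fromCache: true });
  }

  // Steps 8-10: after the backend responds, raise again only on change.
  raiseFromBackend(serverDocs: DocSketch[]): void {
    // Crude, order-sensitive comparison; enough for this sketch.
    const changed = JSON.stringify(serverDocs) !== JSON.stringify(this.current);
    this.current = serverDocs;
    if (changed) {
      this.callback({ docs: serverDocs, fromCache: false });
    }
  }
}

const demoEvents: string[] = [];
const demoListener = new QueryListenerSketch(s => {
  demoEvents.push(`${s.fromCache ? 'cache' : 'server'}:${s.docs.length}`);
});
demoListener.raiseFromCache([{ key: 'users/alice', value: 1 }]);
demoListener.raiseFromBackend([
  { key: 'users/alice', value: 1 },
  { key: 'users/bob', value: 2 }
]);
console.log(demoEvents); // [ 'cache:1', 'server:2' ]
```

If the backend result had matched the cached result, the second callback would not have fired in this sketch.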
+**Note on Query Lifecycle:** The steps above describe the "Happy Path" of a query starting up. For details on how queries are deduplicated, how the data persists after a listener is removed, and how Garbage Collection eventually cleans it up, see the [Query Lifecycle](query-lifecycle.md).
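As a rough illustration of the deduplication mentioned in the note, and of how the Coordinator maps Queries to Targets, the sketch below hands out one target ID per distinct query. `TargetAllocatorSketch`, `QuerySketch`, and the canonical string format are hypothetical; the real SDK splits these responsibilities across several components.

```typescript
interface QuerySketch {
  path: string;
  filters: string[];
}

class TargetAllocatorSketch {
  private nextTargetId = 1;
  private targetIdByQuery = new Map<string, number>();
  private listenerCount = new Map<number, number>();

  // Two queries with the same path and filters share one canonical string.
  private canonicalId(query: QuerySketch): string {
    return `${query.path}|${query.filters.join('&')}`;
  }

  // Returns the existing target for an identical query, or allocates one.
  listen(query: QuerySketch): number {
    const id = this.canonicalId(query);
    let targetId = this.targetIdByQuery.get(id);
    if (targetId === undefined) {
      targetId = this.nextTargetId++;
      this.targetIdByQuery.set(id, targetId);
    }
    const count = this.listenerCount.get(targetId) ?? 0;
    this.listenerCount.set(targetId, count + 1);
    return targetId;
  }

  // Number of listeners currently sharing a target.
  listenersFor(targetId: number): number {
    return this.listenerCount.get(targetId) ?? 0;
  }
}

const demoAllocator = new TargetAllocatorSketch();
const firstAdults = demoAllocator.listen({ path: 'users', filters: ['age > 18'] });
const secondAdults = demoAllocator.listen({ path: 'users', filters: ['age > 18'] });
const rooms = demoAllocator.listen({ path: 'rooms', filters: [] });
console.log(firstAdults === secondAdults, firstAdults === rooms); // true false
```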
@@ -0,0 +1,5 @@
+# Build Process
+
+> **Note**: This documentation is currently under construction.
+
+For current build instructions, test commands, and setup, please go to [CONTRIBUTING.md](../CONTRIBUTING.md).
@@ -0,0 +1,42 @@
+# Firestore Data Bundles
+
+This document provides an overview of Firestore data bundles, how they are processed, and how they are used within the SDK.
+
+## What is a Bundle?
+
+A Firestore data bundle is a serialized, read-only collection of documents and named query results. Bundles are created on a server using the Firebase Admin SDK and can be efficiently distributed to clients.
+
+While bundles can be used for several purposes, their primary design motivation is to optimize Server-Side Rendering (SSR) workflows. In an SSR setup, a server pre-renders a page with data from Firestore. This data can be packaged into a bundle and sent to the client along with the HTML. The client-side SDK can then load this bundle and "hydrate" a real-time query with the pre-existing data, avoiding the need to re-fetch the same documents from the backend. This results in a significant performance improvement and cost savings.
+
+## Primary Use Case: Server-Side Rendering (SSR) Hydration
+
+The main workflow for bundles is as follows:
+
+1.  **Server-Side:** A server fetches data from Firestore to render a page.
+2.  **Bundling:** The server packages the fetched documents and the corresponding query into a bundle.
+3.  **Transmission:** The bundle is embedded in the HTML page sent to the client.
+4.  **Client-Side:** The client-side JavaScript calls `loadBundle()` to load the data from the bundle into the SDK's local cache.
+5.  **Hydration:** The client then attaches a real-time listener to the same query that was bundled. The SDK finds the query results in the local cache and immediately fires the listener with the initial data, avoiding a costly roundtrip to the backend.
+
+## Other Benefits and Use Cases
+
+Beyond SSR hydration, bundles offer several other advantages:
+
+*   **Enhanced Offline Experience:** Bundles can be shipped with an application's initial assets, allowing users to have a more complete offline experience from the first launch, reducing the need to sync every document individually.
+*   **Efficient Data Distribution:** They provide an efficient way to deliver curated or static datasets to clients in a single package. For instance, an application could bundle a list of popular items or configuration data.
+
+## The Loading Process
+
+When an application provides a bundle to the SDK, a loading process is initiated. The SDK reads the bundle, which is a stream of documents and queries, and saves each item into its local cache. This process is asynchronous, allowing the application to continue running while the data is being loaded in the background.
+
+To give developers visibility into this process, the SDK provides progress updates, including the number of documents and bytes loaded so far. Once all the data has been successfully loaded into the cache, the SDK signals that the process is complete.
+
+## Error Handling
+
+The bundle loading process is designed to be robust. If an error is encountered at any point—for example, if the bundle data is malformed or there is an issue writing to the local cache—the entire operation is aborted. The SDK ensures that the application is notified of the failure, allowing developers to catch the error and implement appropriate fallback or recovery logic.
+
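The loading and error-handling behavior described above can be sketched as follows. This is not the SDK's implementation: `loadBundleSketch` and its types are hypothetical, the progress fields are only loosely modeled on what the text describes, the load is synchronous for brevity, and staging elements before committing them is an assumption about how "the entire operation is aborted" could be honored.

```typescript
interface BundleElementSketch {
  kind: 'document' | 'namedQuery';
  name: string;
  bytes: number;
}

interface LoadProgressSketch {
  documentsLoaded: number;
  bytesLoaded: number;
  taskState: 'Running' | 'Success' | 'Error';
}

function loadBundleSketch(
  elements: BundleElementSketch[],
  cache: Map<string, BundleElementSketch>,
  onProgress: (progress: LoadProgressSketch) => void
): void {
  const staged: BundleElementSketch[] = [];
  const progress: LoadProgressSketch = {
    documentsLoaded: 0,
    bytesLoaded: 0,
    taskState: 'Running'
  };

  for (const element of elements) {
    if (element.name === '' || element.bytes <= 0) {
      // Abort the whole load: nothing staged so far reaches the cache.
      progress.taskState = 'Error';
      onProgress({ ...progress });
      throw new Error(`Malformed bundle element: "${element.name}"`);
    }
    staged.push(element);
    progress.bytesLoaded += element.bytes;
    if (element.kind === 'document') {
      progress.documentsLoaded += 1;
    }
    onProgress({ ...progress });
  }

  // Commit only after every element has been read successfully.
  for (const element of staged) {
    cache.set(element.name, element);
  }
  progress.taskState = 'Success';
  onProgress({ ...progress });
}

const demoCache = new Map<string, BundleElementSketch>();
loadBundleSketch(
  [
    { kind: 'document', name: 'items/1', bytes: 120 },
    { kind: 'namedQuery', name: 'popular-items', bytes: 40 }
  ],
  demoCache,
  p => console.log(p.taskState, p.documentsLoaded, p.bytesLoaded)
);
```

A caller would wrap the call in `try`/`catch` and fall back to a normal backend fetch when the load throws.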
+## Interacting with Bundled Data
+
+Once a bundle has been loaded, the data it contains is available for querying. If the bundle included named queries, you can use the `namedQuery()` function to retrieve a `Query` object, which can then be executed.
+
+When a named query is executed, the Firestore SDK will first attempt to fulfill the query from the local cache. If the results for the named query are available in the cache (because they were loaded from a bundle), they will be returned immediately, without a server roundtrip.
@@ -0,0 +1,35 @@
+# SDK Code Layout
+
+This document explains the code layout in this repository. It is closely related to the [architecture](./architecture.md).
+
+*   `src/`: Contains the source code for the main `@firebase/firestore` package.
+    *   `api/`: **API Surface**. Implements the public API (e.g., `doc`, `collection`, `onSnapshot`).
+        *   `database.ts`: The entry point for the `Firestore` class.
+        *   `reference.ts`: Implements `DocumentReference` and `CollectionReference`.
+    *   `core/`: **Sync Engine**. Contains the high-level orchestration logic.
+        *   `sync_engine.ts`: The central coordinator. It manages the "User World" <-> "System World" bridge, `TargetID` allocation, and the main async queue.
+        *   `event_manager.ts`: Handles `QueryListener` registration, fan-out (deduplication of identical queries), and raising snapshot events to the user.
+        *   `query.ts`: Defines the internal `Query` and `Target` models.
+        *   `firestore_client.ts`: The initialization logic that wires up the components.
+    *   `local/`: **Storage and Query Execution**. Manages persistence, caching, and local execution.
+        *   `local_store.ts`: The main interface for the Core layer to interact with storage. It coordinates the components below.
+        *   `indexeddb_persistence.ts`: The implementation of the [Persistence Schema](./persistence-schema.md) using IndexedDB.
+        *   `local_documents_view.ts`: Implements the logic to assemble the user-facing view (`RemoteDoc` + `Mutation`).
+        *   `query_engine.ts`: The optimizer that decides how to scan the cache.
+        *   `lru_garbage_collector.ts` & `reference_delegate.ts`: Implement the Sequence Number logic used to clean up old data.
+    *   `remote/`: **Network**. Handles gRPC/REST communication.
+        *   `remote_store.ts`: Manages the "Watch Stream" (listening to queries) and the "Commit Stream" (sending mutations).
+        *   `connection.ts`: Abstracts the underlying networking transport.
+        *   `serializer.ts`: Converts between internal model objects and the Protobuf format used by the backend.
+    *   `model/`: Defines the immutable data structures used throughout the SDK (e.g., `DocumentKey`, `FieldPath`, `Mutation`).
+    *   `util/`: General-purpose utilities (AsyncQueue, Assertions, Types).
+*   `lite/`: Defines the entrypoint code for the `@firebase/firestore/lite` package.
+*   `test/`: Contains all unit and integration tests for the SDK. The tests are organized by component and feature, and they are essential for ensuring the quality and correctness of the code.
+*   `scripts/`: Contains a collection of build and maintenance scripts used for tasks such as bundling the code, running tests, and generating documentation.
+
+TODO: Add more detailed information as appropriate on each folder
+
+TODO: Mention critical entry points
+    - `package.json` for packages and common commands. Go to [build.md](./build.md) for details
+    - rollup configs for main and lite sdks. Go to [build.md](./build.md) for details
+    - tests entry points. Go to [testing.md](./testing.md) for details