InMemoryVectorStore is a minimal command‑line tool written in C#/.NET that demonstrates how to build and query a vector database entirely in memory. The application allows you to process local documents, create vector embeddings for them and then ask questions using your choice of AI provider. Everything runs locally without requiring a paid search service such as Azure Cognitive Search or AWS Kendra.
- Works with the Azure OpenAI, OpenAI, or DeepSeek APIs
- Builds an in‑memory vector database from your own documents
- Supports Retrieval Augmented Generation (RAG) mode or Full Context mode
- Stores vector data on disk as a small cache file so it can be reloaded quickly
- Simple `.env` configuration for API keys
Before running the application, set the following environment variables (or populate a .env file):
- `OPENAI_API_KEY`
- `AZURE_OPENAI_API_KEY`
- `AZURE_OPENAI_ENDPOINT`
- `DEEPSEEK_API_KEY`
Copy .env.example to .env and fill in the values for the providers you want to use. The .env file will be loaded automatically when the app starts.
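For reference, a populated `.env` file might look like the example below. The values shown are placeholders; only fill in the entries for the providers you intend to use.

```
OPENAI_API_KEY=sk-your-openai-key
AZURE_OPENAI_API_KEY=your-azure-openai-key
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
DEEPSEEK_API_KEY=your-deepseek-key
```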
Run the project using the .NET CLI:
```bash
# restore dependencies and build
dotnet build

# start the program
dotnet run --project InMemoryVectorStore.csproj
```

When launched, you will be asked which AI provider to use, followed by a menu to select the desired mode:
- RAG Mode – query a trained vector database and get answers based on the most relevant chunks
- Context Mode – load whole documents into the chat context and ask questions against the full text
- Train Mode – process a folder of documents and create a new vector database
- List Databases – show all cached databases available on your machine
Provide a folder path containing your source files (PDF, DOCX, TXT, etc.) and a database identifier. The tool will parse the files, create embeddings, and store them in a local cache file. This cache is later loaded by RAG Mode.
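The exact chunking and storage code lives in the project itself, but the overall flow of Train Mode roughly corresponds to the sketch below. The type and method names (`VectorEntry`, `embedAsync`, the cache path) are illustrative assumptions, not the tool's actual API; the embedding call would go to whichever provider you selected.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical shape of one stored vector: the source file, the text chunk, and its embedding.
public record VectorEntry(string SourceFile, string Chunk, float[] Embedding);

public static class Trainer
{
    public static async Task TrainAsync(string folder, string databaseId,
                                        Func<string, Task<float[]>> embedAsync)
    {
        var entries = new List<VectorEntry>();

        foreach (var file in Directory.EnumerateFiles(folder))
        {
            // Plain text is assumed here; real PDF/DOCX parsing is omitted from this sketch.
            string text = await File.ReadAllTextAsync(file);

            // Naive fixed-size chunking; the actual tool may split by paragraph or token count.
            const int chunkSize = 1000;
            for (int i = 0; i < text.Length; i += chunkSize)
            {
                string chunk = text.Substring(i, Math.Min(chunkSize, text.Length - i));
                float[] embedding = await embedAsync(chunk); // call the selected embedding provider
                entries.Add(new VectorEntry(file, chunk, embedding));
            }
        }

        // Persist the in-memory vectors as a small JSON cache so RAG Mode can reload them quickly.
        string cachePath = $"{databaseId}.vectors.json";
        await File.WriteAllTextAsync(cachePath, JsonSerializer.Serialize(entries));
    }
}
```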
After training, choose RAG Mode and specify the same database identifier. The application searches the in‑memory vectors for the best matching chunks and passes them to your selected AI service to generate an answer.
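Conceptually, that retrieval step is a cosine-similarity search over the cached embeddings. The sketch below is an assumption about how such a search can be written (reusing the hypothetical `VectorEntry` record from the previous sketch); the project's actual ranking code may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Retriever
{
    // Return the top-k chunks whose embeddings are most similar to the query embedding.
    public static IEnumerable<VectorEntry> TopK(
        IReadOnlyList<VectorEntry> entries, float[] queryEmbedding, int k = 5)
    {
        return entries
            .OrderByDescending(e => CosineSimilarity(e.Embedding, queryEmbedding))
            .Take(k);
    }

    // Standard cosine similarity: dot(a, b) / (|a| * |b|).
    private static double CosineSimilarity(float[] a, float[] b)
    {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB) + 1e-10);
    }
}
```

The selected chunks are then passed, together with your question, to the chosen AI provider to generate the answer.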
Context Mode reads the contents of the provided files directly into the chat messages without using the vector database. This is useful for small sets of documents that fit within the model’s context window.
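In other words, Context Mode skips retrieval entirely and builds a single block of context from the raw files. A minimal sketch of that idea follows; the class name and message format are assumptions, not the tool's actual code.

```csharp
using System.IO;
using System.Text;

public static class ContextBuilder
{
    // Concatenate every file in the folder into one block of text
    // that can be placed directly into the chat messages.
    public static string BuildContext(string folder)
    {
        var sb = new StringBuilder();
        foreach (var file in Directory.EnumerateFiles(folder))
        {
            sb.AppendLine($"--- {Path.GetFileName(file)} ---");
            sb.AppendLine(File.ReadAllText(file));
        }
        return sb.ToString();
    }
}
```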
By storing vectors in memory and caching them locally, you can experiment with retrieval-based workflows without the cost of hosted search solutions. It’s ideal for prototypes, demos, or learning how RAG systems work.
This project is licensed under the MIT License. See LICENSE for details.