25 changes: 24 additions & 1 deletion README.md
@@ -8,7 +8,30 @@ Transform unstructured data (PDFs, DOCs, TXT, YouTube videos, web pages, etc.) i

This application allows you to upload files from various sources (local machine, GCS, S3 bucket, or web sources), choose your preferred LLM model, and generate a Knowledge Graph.

---
## Getting Started

### **Prerequisites**
- **Python 3.12 or higher** (for local/separate backend deployment)
- Neo4j Database **5.23 or later** with APOC installed.
- **Neo4j Aura** databases (including the free tier) are supported.
- If using **Neo4j Desktop**, you will need to deploy the backend and frontend separately (docker-compose is not supported).
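
The Python prerequisite above could be checked before any dependencies are installed. A minimal sketch, assuming a hypothetical helper named `meets_minimum` that is not part of this repository:

```python
import sys

# Minimum interpreter version, taken from the prerequisites above.
MINIMUM_PYTHON = (3, 12)


def meets_minimum(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    if version_info is None:
        # Default to the interpreter that is running this code.
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)
```

Calling `meets_minimum()` with no arguments checks the running interpreter; a setup script might exit early with a clear message when it returns `False`.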

#### **Backend Setup**
1. Create the `.env` file in the `backend` folder by copying `backend/example.env`.
2. Preconfigure user credentials in the `.env` file to bypass the login dialog:
```bash
NEO4J_URI=<your-neo4j-uri>
NEO4J_USERNAME=<your-username>
NEO4J_PASSWORD=<your-password>
NEO4J_DATABASE=<your-database-name>
```
3. Run:
```bash
cd backend
python3.12 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt -c constraints.txt

Copilot AI Dec 5, 2025

The command includes -c constraints.txt but there's no mention of a constraints.txt file in the repository or documentation. If this file doesn't exist or isn't provided, users will encounter an error. Either ensure the constraints.txt file is included in the repository or remove this flag from the command.

Suggested change
pip install -r requirements.txt -c constraints.txt
pip install -r requirements.txt

uvicorn score:app --reload
```

## Key Features

2 changes: 1 addition & 1 deletion backend/Dockerfile
@@ -1,4 +1,4 @@
FROM python:3.10-slim
FROM python:3.12-slim
WORKDIR /code
ENV PORT 8000
EXPOSE 8000
14 changes: 12 additions & 2 deletions backend/README.md
@@ -1,6 +1,11 @@
# Project Overview
Welcome to our project! This project is built with the FastAPI framework to create a fast, modern API in Python.

## Prerequisites

- Python 3.12 or higher
- pip (Python package manager)

## Feature
API Endpoints : This project provides various API endpoints to perform specific tasks.
Data Validation : Uses FastAPI's data validation and serialization features.
@@ -16,9 +21,14 @@ Follow these steps to set up and run the project locally:

> cd llm-graph-builder

2. Install Dependency :
2. Create a virtual environment (recommended):

> python3.12 -m venv venv
> source venv/bin/activate # On Windows: venv\Scripts\activate

3. Install Dependency :

> pip install -t requirements.txt
> pip install -r requirements.txt -c constraints.txt

Copilot AI Dec 5, 2025

The command includes -c constraints.txt but there's no mention of a constraints.txt file in the repository or documentation. If this file doesn't exist or isn't provided, users will encounter an error. Either ensure the constraints.txt file is included in the repository or remove this flag from the command.

Suggested change
> pip install -r requirements.txt -c constraints.txt
> pip install -r requirements.txt
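
One way to sidestep the missing-file error described in this comment is to add the `-c` flag only when the constraints file is actually present. A minimal sketch, assuming a hypothetical helper named `pip_install_args` (not part of the repository) and the default file names used in the command above:

```python
from pathlib import Path


def pip_install_args(requirements="requirements.txt", constraints="constraints.txt"):
    """Build arguments for `pip install`, adding -c only if the file exists."""
    args = ["install", "-r", str(requirements)]
    # Pass the constraints file only when it is present on disk.
    if constraints is not None and Path(constraints).is_file():
        args += ["-c", str(constraints)]
    return args
```

The result could be passed to pip with `subprocess.run([sys.executable, "-m", "pip", *pip_install_args()], check=True)`. Committing `constraints.txt` alongside `requirements.txt` remains the simpler fix if the file is meant to exist.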


## Run backend project using uvicorn
Run the server:
80 changes: 40 additions & 40 deletions backend/requirements.txt
@@ -1,65 +1,65 @@
accelerate==1.7.0
asyncio==3.4.3
boto3==1.38.36
botocore==1.38.36
certifi==2025.6.15
fastapi==0.115.12
accelerate==1.12.0
asyncio==4.0.0
boto3==1.40.23
botocore==1.40.23
certifi==2025.8.3
fastapi==0.116.1
fastapi-health==0.4.0
fireworks-ai==0.15.12
google-api-core==2.25.1
google-auth==2.40.3
google_auth_oauthlib==1.2.2
google-cloud-core==2.4.3
json-repair==0.39.1
json-repair==0.44.1
pip-install==1.3.5
langchain==0.3.25
langchain-aws==0.2.25
langchain-anthropic==0.3.15
langchain-fireworks==0.3.0
langchain-community==0.3.25
langchain-core==0.3.65
langchain-experimental==0.3.4
langchain-google-vertexai==2.0.25
langchain-groq==0.3.2
langchain-openai==0.3.23
langchain-text-splitters==0.3.8
langchain-huggingface==0.3.0
langchain==1.1.2
langchain-aws==1.1.0
langchain-anthropic==1.2.0
langchain-fireworks==1.1.0
langchain-community==0.4.1
langchain-core==1.1.1
langchain-experimental==0.4.0
langchain-google-vertexai==3.1.1
langchain-groq==1.1.0
langchain-openai==1.1.0
langchain-text-splitters==1.0.0
langchain-huggingface==1.1.0
langchain-classic==1.0.0
langdetect==1.0.9
langsmith==0.3.45
langserve==0.3.1
neo4j-rust-ext==5.28.1.0
langsmith==0.4.55
langserve==0.3.3
neo4j-rust-ext==5.28.2.1
nltk==3.9.1
openai==1.86.0
opencv-python==4.11.0.86
openai==2.9.0
psutil==7.0.0
pydantic==2.11.7
python-dotenv==1.1.0
pydantic==2.12.5
python-dotenv==1.1.1
python-magic==0.4.27
PyPDF2==3.0.1
PyMuPDF==1.26.1
starlette==0.46.2
sse-starlette==2.3.6
PyMuPDF==1.26.4
starlette==0.47.3
sse-starlette==3.0.2
starlette-session==0.4.3
tqdm==4.67.1
unstructured==0.18.14
unstructured[all-docs]

Copilot AI Dec 5, 2025

The unstructured package appears twice in the requirements (line 44 with version 0.18.14 and line 45 with [all-docs] extra). Consider consolidating these into a single line as unstructured[all-docs]==0.18.14 to avoid potential installation conflicts and improve clarity.

Suggested change
unstructured==0.18.14
unstructured[all-docs]
unstructured[all-docs]==0.18.14
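
Duplicate entries like this one could be caught automatically before review. A minimal sketch, assuming hypothetical helpers named `normalize` and `find_duplicates`; the parsing is a simplified heuristic for plain `name[extras]==version` lines, not a full requirements-file parser:

```python
import re
from collections import defaultdict

# Leading project name of a requirement line, before extras or version specifiers.
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def normalize(name):
    """Lowercase the name and collapse runs of '-', '_' and '.' into one '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_duplicates(lines):
    """Map each normalized project name listed more than once to its lines."""
    seen = defaultdict(list)
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):
            continue  # skip blanks and pip options such as -r / -c
        match = _NAME.match(line)
        if match:
            seen[normalize(match.group(1))].append(line)
    return {name: entries for name, entries in seen.items() if len(entries) > 1}
```

Run against the lines of `backend/requirements.txt`, this would report `unstructured` with both of its entries, while leaving `unstructured-client` and `unstructured-inference` alone because they are distinct projects.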

unstructured==0.17.2
unstructured-client==0.36.0
unstructured-client==0.42.3
unstructured-inference==1.0.5
urllib3==2.4.0
uvicorn==0.34.3
urllib3==2.5.0
uvicorn==0.35.0
gunicorn==23.0.0
wikipedia==1.4.0
wrapt==1.17.2
wrapt==1.17.3
yarl==1.20.1
youtube-transcript-api==1.1.0
youtube-transcript-api==1.2.2
zipp==3.23.0
sentence-transformers==5.0.0
sentence-transformers==5.1.0
google-cloud-logging==3.12.1
pypandoc==1.15
graphdatascience==1.15.1
Secweb==1.18.1
ragas==0.3.1
graphdatascience==1.18a1

Copilot AI Dec 5, 2025

Using an alpha version (1.18a1) of graphdatascience in production is not recommended. Alpha versions are pre-release and may contain bugs or breaking changes. Consider using a stable release version instead.

Suggested change
graphdatascience==1.18a1
graphdatascience==1.18.0
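
Pre-release pins such as `1.18a1` could be flagged automatically. A minimal sketch, assuming hypothetical helpers named `is_prerelease` and `find_prerelease_pins`; the regular expressions are a simplified approximation of PEP 440 markers, and `packaging.version.Version(...).is_prerelease` is the more robust choice when the `packaging` library is available:

```python
import re

# Release digits followed directly by an alpha/beta/rc marker (simplified PEP 440).
_PRE_SEGMENT = re.compile(
    r"^(\d+!)?\d+(\.\d+)*[._-]?(a|b|c|rc|alpha|beta|pre|preview)[._-]?\d*"
)
# Trailing development-release marker such as ".dev3".
_DEV_SEGMENT = re.compile(r"[._-]?dev\d*$")
# A "name==version" pin, with optional extras in square brackets.
_PIN = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)(?:\[[^\]]*\])?\s*==\s*([^\s;#]+)")


def is_prerelease(version):
    """Heuristically decide whether a version string is a pre-release."""
    v = version.strip().lower()
    return bool(_PRE_SEGMENT.match(v) or _DEV_SEGMENT.search(v))


def find_prerelease_pins(lines):
    """Return (name, version) for every '==' pin that looks like a pre-release."""
    hits = []
    for line in lines:
        match = _PIN.match(line)
        if match and is_prerelease(match.group(2)):
            hits.append((match.group(1), match.group(2)))
    return hits
```

Whether a stable release is available to replace a flagged pin still has to be confirmed on the package index before changing it.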

Secweb==1.25.2
ragas==0.3.2
rouge_score==0.1.2
langchain-neo4j==0.4.0
langchain-neo4j==0.6.0
pypandoc-binary==1.15
chardet==5.2.0
8 changes: 4 additions & 4 deletions backend/src/QA_integration.py
@@ -11,12 +11,12 @@
from langchain_neo4j import Neo4jVector
from langchain_neo4j import Neo4jChatMessageHistory
from langchain_neo4j import GraphCypherQAChain
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableBranch
from langchain.retrievers import ContextualCompressionRetriever
from langchain_community.document_transformers import EmbeddingsRedundantFilter
from langchain.retrievers.document_compressors import EmbeddingsFilter, DocumentCompressorPipeline
from langchain_classic.retrievers import ContextualCompressionRetriever
from langchain_classic.document_transformers import EmbeddingsRedundantFilter

Copilot AI Dec 5, 2025

Import of 'EmbeddingsRedundantFilter' is not used.

Suggested change
from langchain_classic.document_transformers import EmbeddingsRedundantFilter
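
Unused imports like this one can be found with the standard-library `ast` module. A minimal sketch, assuming a hypothetical helper named `unused_imports`; it ignores `__all__`, re-exports, and names referenced only inside strings, so a linter such as pyflakes is the better tool for real use:

```python
import ast


def unused_imports(source):
    """Return names bound by import statements that are never referenced."""
    tree = ast.parse(source)
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds only the top-level name `a`.
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name != "*":
                    imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name):
            # Covers plain uses and the base of attribute access (e.g. `os.path`).
            used.add(node.id)
    return sorted(imported - used)
```

The source is only parsed, never executed, so the check works even when the imported packages are not installed.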

from langchain_classic.retrievers.document_compressors import EmbeddingsFilter, DocumentCompressorPipeline
from langchain_text_splitters import TokenTextSplitter
from langchain_core.messages import HumanMessage, AIMessage
from langchain_community.chat_message_histories import ChatMessageHistory
2 changes: 1 addition & 1 deletion backend/src/create_chunks.py
@@ -1,5 +1,5 @@
from langchain_text_splitters import TokenTextSplitter
from langchain.docstore.document import Document
from langchain_core.documents import Document
from langchain_neo4j import Neo4jGraph
import logging
from src.document_sources.youtube import get_chunks_with_timestamps, get_calculated_timestamps
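
This hunk moves `Document` from `langchain.docstore.document` to `langchain_core.documents`. If code had to run against both the old and new dependency sets during the migration, one option would be to try candidate module paths in order. A minimal sketch, assuming a hypothetical helper named `import_first_available` that is not part of this PR:

```python
import importlib


def import_first_available(candidates, attribute):
    """Return `attribute` from the first module in `candidates` that provides it."""
    errors = []
    for module_name in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attribute)
        except (ImportError, AttributeError) as exc:
            # Remember why this candidate failed and try the next one.
            errors.append(f"{module_name}: {exc}")
    raise ImportError(f"could not import {attribute!r}; tried: " + "; ".join(errors))


# Hypothetical usage, preferring the new path and falling back to the old one:
# Document = import_first_available(
#     ["langchain_core.documents", "langchain.docstore.document"], "Document"
# )
```

With the versions pinned in `requirements.txt`, the direct import used in this PR is simpler and should be preferred; the fallback is only useful while both layouts need to be supported.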
2 changes: 1 addition & 1 deletion backend/src/document_sources/youtube.py
@@ -1,4 +1,4 @@
from langchain.docstore.document import Document
from langchain_core.documents import Document
from src.shared.llm_graph_builder_exception import LLMGraphBuilderException
from youtube_transcript_api import YouTubeTranscriptApi
from youtube_transcript_api.proxies import GenericProxyConfig
2 changes: 1 addition & 1 deletion backend/src/llm.py
@@ -1,5 +1,5 @@
import logging
from langchain.docstore.document import Document
from langchain_core.documents import Document
import os
from langchain_openai import ChatOpenAI, AzureChatOpenAI
from langchain_google_vertexai import ChatVertexAI
2 changes: 1 addition & 1 deletion backend/src/make_relationships.py
@@ -1,5 +1,5 @@
from langchain_neo4j import Neo4jGraph
from langchain.docstore.document import Document
from langchain_core.documents import Document
from src.shared.common_fn import load_embedding_model,execute_graph_query
from src.shared.common_fn import load_embedding_model,execute_graph_query

Copilot AI Dec 5, 2025

Duplicate import statement detected. This import appears twice on consecutive lines and should be removed.

Suggested change
from src.shared.common_fn import load_embedding_model,execute_graph_query
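
Repeated import statements like this one can be detected by counting module-level imports. A minimal sketch, assuming a hypothetical helper named `duplicate_imports`; it only inspects top-level statements and does not distinguish relative import levels:

```python
import ast
from collections import Counter


def duplicate_imports(source):
    """Return (module, name) pairs imported more than once at module level."""
    counts = Counter()
    for node in ast.parse(source).body:
        if isinstance(node, ast.Import):
            for alias in node.names:
                counts[(None, alias.name)] += 1
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                counts[(node.module, alias.name)] += 1
    # Sort by string form so pairs with module=None sort without a TypeError.
    return sorted((pair for pair, n in counts.items() if n > 1), key=str)
```

For the two identical lines flagged here, this would report both `load_embedding_model` and `execute_graph_query` under `src.shared.common_fn`.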

import logging
9 changes: 7 additions & 2 deletions docs/project_docs.adoc
@@ -21,6 +21,11 @@ This document provides comprehensive documentation for the Neo4j llm-graph-build

== Local Setup and Execution

Prerequisites:
- Python 3.12 or higher
- Node.js 20 or higher
- Docker (optional, for containerized deployment)

Run Docker Compose to build and start all components:
....
docker-compose up --build
@@ -38,8 +43,8 @@ yarn run dev
** For backend
....
cd backend
python -m venv envName
source envName/bin/activate
python3.12 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
uvicorn score:app --reload
....