TensorLake agent example #91
Walkthrough
This pull request introduces a new TensorLake Document AI agent module within the Agentuity framework. It includes configuration files, comprehensive documentation, project setup files, and a complete agent implementation supporting document parsing, status polling, schema introspection, and text analysis with structured data extraction.
Sequence Diagram
sequenceDiagram
actor Client
participant Agent as TensorLake Agent
participant DocumentAI as DocumentAI API
participant Storage as Agent Storage
Client->>Agent: POST /run (action: "parse")
activate Agent
alt Demo Mode
Agent->>Agent: Generate sample data
Agent-->>Client: Return demo results
else Production
Agent->>DocumentAI: Submit document parse job
DocumentAI-->>Agent: parse_id
Agent->>DocumentAI: Poll for completion
DocumentAI-->>Agent: Status (processing/complete)
alt Parse Complete
Agent->>DocumentAI: Fetch structured extraction results
DocumentAI-->>Agent: Extracted data + chunks
Agent->>Storage: Store results (optional)
Agent-->>Client: Return structured data + chunks
else Error/Timeout
Agent-->>Client: Return error details
end
end
deactivate Agent
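The production branch of the diagram (submit, poll, then fetch or fail on timeout) can be sketched as a small polling helper. This is a minimal sketch, not the agent's actual code: `poll_until_complete`, `get_status`, and the `"timeout"` return value are hypothetical names chosen for illustration, and only the `processing` status string is taken from the diagram.

```python
import time
from typing import Callable


def poll_until_complete(
    get_status: Callable[[], str],
    timeout_s: float = 60.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call get_status() until it stops returning "processing" or time runs out.

    get_status is a hypothetical callable that wraps whatever status lookup
    the agent uses; sleep and clock are injectable so the loop can be tested
    without real waiting.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status != "processing":
            # Any terminal status (for example "complete") ends the loop.
            return status
        if clock() >= deadline:
            # Mirrors the Error/Timeout branch of the diagram.
            return "timeout"
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the loop deterministic under test and makes the timeout behaviour easy to reason about.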
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~30 minutes
Pre-merge checks and finishing touches
✅ Passed checks (3 passed)
Actionable comments posted: 3
🧹 Nitpick comments (5)
agents/tensorlake/tensorlake-agent/pyproject.toml (1)
1-10: Project metadata and dependencies look reasonable; confirm supported versions
The manifest is minimal but sufficient for this example agent, and the dependency set matches what you’d expect. It would be good to double-check that the agent has been run/tests have passed on all Python versions in the `>=3.10, <3.13` range and with the chosen minimum versions of `agentuity`, `tensorlake`, and `pydantic`, or narrow the constraints if you only intend to support a subset.
agents/tensorlake/tensorlake-agent/main.py (1)
1-6: Simple CLI entry point is fine for now
A minimal `main()` that prints a greeting is acceptable as a placeholder; you can later evolve this to invoke the actual agent/server startup if you want a richer CLI experience.
agents/tensorlake/tensorlake-agent/server.py (1)
20-27: Consider using yellow for warning messages instead of red.
The warning messages on lines 22-26 use red ANSI codes (`\033[31m`) but are labeled as `[WARN]`. Red is typically reserved for errors, while yellow (`\033[33m`) is more conventional for warnings, improving visual distinction.
 # Check if AGENTUITY_TRANSPORT_URL is set
 if not os.environ.get("AGENTUITY_TRANSPORT_URL"):
     print(
-        "\033[31m[WARN] You are running this agent outside of the Agentuity environment. Any automatic Agentuity features will be disabled.\033[0m"
+        "\033[33m[WARN] You are running this agent outside of the Agentuity environment. Any automatic Agentuity features will be disabled.\033[0m"
     )
     print(
-        "\033[31m[WARN] Recommend running `agentuity dev` to run your project locally instead of `python script`.\033[0m"
+        "\033[33m[WARN] Recommend running `agentuity dev` to run your project locally instead of `python script`.\033[0m"
     )
agents/tensorlake/tensorlake-agent/agentuity_agents/tensorlake_agent/agent.py (2)
10-13: Remove unused import.
The `json` module is imported but not used. The JSON parsing is handled by `request.data.json()`.
 import os
 import re
-import json
 from typing import List, Tuple, Optional
458-463: Use `logging.exception` to capture traceback and consider narrowing the exception type.
Per static analysis, catching bare `Exception` is broad. While acceptable for a top-level handler, using `context.logger.exception()` instead of `context.logger.error()` will automatically include the traceback, which aids debugging.
     except Exception as e:
-        context.logger.error("Error in TensorLake agent: %s", str(e))
+        context.logger.exception("Error in TensorLake agent: %s", str(e))
         return response.json({
             "error": str(e),
             "type": type(e).__name__
         })
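The difference between the two logging calls can be seen with the standard library alone. The sketch below assumes `context.logger` behaves like a standard `logging.Logger`, which the review does not confirm; `capture_log` and the logger names are hypothetical and exist only for this illustration.

```python
import io
import logging


def capture_log(use_exception: bool) -> str:
    """Log a caught error with .exception() or .error() and return the output."""
    stream = io.StringIO()
    logger = logging.getLogger(f"demo.traceback.{use_exception}")
    logger.setLevel(logging.ERROR)
    logger.propagate = False  # keep the demo output out of the root logger
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    try:
        raise ValueError("bad input")
    except Exception as e:
        if use_exception:
            # Logs at ERROR level and appends the active traceback.
            logger.exception("Error in agent: %s", e)
        else:
            # Logs at ERROR level with the message only.
            logger.error("Error in agent: %s", e)
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()
```

Both calls emit the message, but only the `exception()` variant includes the `Traceback (most recent call last):` section, which is the information the TRY400 hint is after.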
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
agents/tensorlake/tensorlake-agent/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (12)
agents/tensorlake/tensorlake-agent/.editorconfig (1 hunks)
agents/tensorlake/tensorlake-agent/.gitignore (1 hunks)
agents/tensorlake/tensorlake-agent/.python-version (1 hunks)
agents/tensorlake/tensorlake-agent/AGENTS.md (1 hunks)
agents/tensorlake/tensorlake-agent/README.md (1 hunks)
agents/tensorlake/tensorlake-agent/agentuity.yaml (1 hunks)
agents/tensorlake/tensorlake-agent/agentuity_agents/__init__.py (1 hunks)
agents/tensorlake/tensorlake-agent/agentuity_agents/tensorlake_agent/__init__.py (1 hunks)
agents/tensorlake/tensorlake-agent/agentuity_agents/tensorlake_agent/agent.py (1 hunks)
agents/tensorlake/tensorlake-agent/main.py (1 hunks)
agents/tensorlake/tensorlake-agent/pyproject.toml (1 hunks)
agents/tensorlake/tensorlake-agent/server.py (1 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
agents/tensorlake/tensorlake-agent/AGENTS.md
8-8: Images should have alternate text (alt text)
(MD045, no-alt-text)
40-40: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
48-48: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🪛 Ruff (0.14.7)
agents/tensorlake/tensorlake-agent/agentuity_agents/tensorlake_agent/agent.py
458-458: Do not catch blind exception: Exception
(BLE001)
459-459: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
🔇 Additional comments (13)
agents/tensorlake/tensorlake-agent/agentuity_agents/__init__.py (1)
1-1: Package initializer is fine as a no-op stub
Keeping this `__init__` intentionally empty is a reasonable way to declare the namespace without side effects.
agents/tensorlake/tensorlake-agent/.python-version (1)
1-1: Python version pin is consistent with pyproject range
Pinning to 3.11 here is compatible with the `>=3.10, <3.13` constraint in `pyproject.toml` and gives a clear default for local dev.
agents/tensorlake/tensorlake-agent/agentuity_agents/tensorlake_agent/__init__.py (1)
1-1: Minimal package initializer is acceptable
An empty `__init__` (aside from the comment) is fine for now; you can add explicit re-exports later if you want a curated public API.
agents/tensorlake/tensorlake-agent/.editorconfig (1)
1-12: EditorConfig settings look coherent for this subproject
The root flag and basic formatting options are consistent and should give predictable editor behavior across contributors.
agents/tensorlake/tensorlake-agent/.gitignore (1)
1-180: Comprehensive ignore rules with useful Agentuity additions
The Python/IDE patterns are thorough, and the Agentuity-specific entries at the end ensure local agent state and crash reports stay out of git. Keeping `.python-version` unignored matches the committed version file.
agents/tensorlake/tensorlake-agent/AGENTS.md (1)
1-110: Agentuity Python guide is clear and well-structured
The doc gives a concise but complete overview of handler signatures, request/response/context APIs, storage, and logging, which should be enough for someone to get started with Python agents in this repo.
agents/tensorlake/tensorlake-agent/server.py (1)
29-36: LGTM!
Logging configuration and autostart invocation are properly placed after environment validation.
agents/tensorlake/tensorlake-agent/agentuity.yaml (1)
1-70: LGTM!
The configuration is well-documented with clear comments explaining each section. The watch patterns, bundler settings, and agent definitions are properly structured.
agents/tensorlake/tensorlake-agent/README.md (1)
1-238: Well-structured documentation.
The README provides comprehensive coverage of installation, configuration, usage examples, and troubleshooting. The action/schema tables and JSON examples are helpful for users.
agents/tensorlake/tensorlake-agent/agentuity_agents/tensorlake_agent/agent.py (4)
26-84: LGTM!
The Pydantic schema definitions are well-structured with clear docstrings and appropriate typing. The schema registry pattern provides a clean lookup mechanism.
98-106: LGTM!
The `topk_words` function is a clean implementation with proper stopword filtering and efficient counting.
152-181: LGTM!
The handler provides a helpful default response when no data is provided, documenting available actions with an example.
388-410: The "status" action blocks by calling `wait_for_completion`, which defeats the purpose of non-blocking status retrieval.
The TensorLake Python SDK does not expose a non-blocking status check method; `wait_for_completion()` is the documented way to retrieve results. However, the underlying REST API supports polling via `GET /documents/v2/parse/{parse_id}`. For a status endpoint, consider either:
- Making direct HTTP calls to poll the REST API without blocking, or
- Documenting that this action blocks until parse completion and clarifying the expected use cases.
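The first option, a single non-blocking status lookup against the REST path quoted above, might look like the following. This is a minimal sketch under stated assumptions: the base URL, the bearer-token header, and the `status` field in the response body are not confirmed by this review, and `build_status_request`, `check_status`, and `fetch` are hypothetical names.

```python
import json
import urllib.request
from typing import Callable

# Assumption: illustrative base URL; only the path comes from the review comment.
BASE_URL = "https://api.tensorlake.ai"


def build_status_request(parse_id: str, api_key: str) -> urllib.request.Request:
    """Build a GET request for /documents/v2/parse/{parse_id}."""
    url = f"{BASE_URL}/documents/v2/parse/{parse_id}"
    # Assumption: bearer-token authentication.
    headers = {"Authorization": f"Bearer {api_key}"}
    return urllib.request.Request(url, headers=headers, method="GET")


def _http_fetch(req: urllib.request.Request) -> bytes:
    """Perform the request once and return the raw body."""
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read()


def check_status(
    parse_id: str,
    api_key: str,
    fetch: Callable[[urllib.request.Request], bytes] = _http_fetch,
) -> dict:
    """Return the current status without waiting for the parse to finish."""
    payload = json.loads(fetch(build_status_request(parse_id, api_key)))
    # Assumption: the response body carries a top-level "status" field.
    return {"parse_id": parse_id, "status": payload.get("status", "unknown")}
```

Because `check_status` makes exactly one request and returns, the caller decides whether and how often to poll.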
resources:
  # The memory requirements
  memory: 250Mi
  # The CPU requirements
  cpu: 500M
  # The disk size requirements
  disk: 300Mi
🧩 Analysis chain
🌐 Web query:
Agentuity CPU resource units format millicores
💡 Result:
Agentuity uses standard Kubernetes CPU quantity format: millicores (m). Key points:
- Syntax: integer CPU (e.g., "1") or millicores with "m" suffix (e.g., "250m" = 0.25 CPU). [1]
- Conversion: 1000m = 1 CPU (1 vCPU/core); 1m = 0.001 CPU (minimum precision). [1][2]
- Usage examples: cpu: "500m" → 0.5 CPU; cpu: "100m" → 0.1 CPU. [1][3]
Sources: Kubernetes docs on resource units and community guides. [1][2][3]
References
[1] Kubernetes — Resource Management for Pods and Containers.
[2] VSHN / Kubernetes resource units explanation.
[3] Kubernetes resources examples (config snippets).
Fix the CPU resource unit to use lowercase 'm'.
The CPU value 500M is incorrect. Agentuity follows standard Kubernetes resource format where CPU must use lowercase m for millicores (e.g., 500m = 0.5 CPU cores). Change cpu: 500M to cpu: 500m on line 46.
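Under the Kubernetes quantity convention cited in the query result, the case of the suffix changes the value by nine orders of magnitude: `m` is milli and `M` is mega. The sketch below illustrates that convention only; whether Agentuity's own parser follows it is not confirmed here, and `cpu_cores` is a hypothetical helper covering a small subset of decimal suffixes.

```python
from decimal import Decimal

# Subset of the decimal SI suffixes used in Kubernetes-style quantities.
_SUFFIXES = {
    "m": Decimal("0.001"),  # milli: 500m is half a core
    "": Decimal(1),         # no suffix: whole cores
    "k": Decimal(10) ** 3,
    "M": Decimal(10) ** 6,  # mega: 500M is 500 million cores
    "G": Decimal(10) ** 9,
}


def cpu_cores(quantity: str) -> Decimal:
    """Convert a quantity string such as "500m" into a number of cores."""
    number = quantity.rstrip("mkMG")
    suffix = quantity[len(number):]
    if not number or suffix not in _SUFFIXES:
        raise ValueError(f"unsupported CPU quantity: {quantity!r}")
    return Decimal(number) * _SUFFIXES[suffix]
```

Read this way, `cpu_cores("500m")` is half a core while `cpu_cores("500M")` is five hundred million cores, which is why the comment treats the uppercase suffix as a mistake.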
🤖 Prompt for AI Agents
In agents/tensorlake/tensorlake-agent/agentuity.yaml around lines 42 to 48, the
CPU resource uses an incorrect unit `500M`; change it to the Kubernetes-standard
lowercase millicore unit `500m` so the CPU entry reads `cpu: 500m`. Ensure no
other CPU entries use uppercase units.
1. Clone the repo:
```bash
git clone https://github.com/agentuity/tensorlake-agent.git
cd tensorlake-agent
```
Clone URL may be incorrect.
The clone URL references https://github.com/agentuity/tensorlake-agent.git, but this agent appears to be part of the agentuity/examples repository based on the PR context. Consider updating to reflect the correct repository path or providing instructions for navigating to the correct subdirectory.
1. Clone the repo:
```bash
- git clone https://github.com/agentuity/tensorlake-agent.git
- cd tensorlake-agent
+ git clone https://github.com/agentuity/examples.git
+ cd examples/agents/tensorlake/tensorlake-agent
```
🤖 Prompt for AI Agents
In agents/tensorlake/tensorlake-agent/README.md around lines 25 to 29, the
repository clone instructions point to
https://github.com/agentuity/tensorlake-agent.git which is incorrect for this
PR; update the instructions to clone the main examples repo and change the cd
step to the agent subdirectory (e.g., git clone
https://github.com/agentuity/examples.git and cd
examples/agents/tensorlake/tensorlake-agent) so users land in the correct
project folder.
## Configuration
1. Open `.env` and set:
```
TENSORLAKE_API_KEY=your_tensorlake_api_key
```
Missing step to create .env file.
Step 1 under Configuration instructs users to "Open .env", but there's no prior instruction to create this file. Consider adding a step to copy from a template or create the file.
## Configuration
-1. Open `.env` and set:
+1. Create a `.env` file and set:
```
TENSORLAKE_API_KEY=your_tensorlake_api_key
```
🤖 Prompt for AI Agents
In agents/tensorlake/tensorlake-agent/README.md around lines 36 to 43, the
Configuration section tells users to "Open `.env`" but omits how to create it;
add a preceding step instructing users to create the file (for example by
copying a provided template like `.env.example` or creating a new `.env`), and
show the exact command to run (e.g., copy or touch) and mention where to place
it before setting TENSORLAKE_API_KEY so the instructions are complete.
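A missing `.env` file tends to surface later as an opaque authentication failure, so a fail-fast check at startup can make the gap obvious. The following is a minimal sketch, not the agent's actual loading logic: `parse_dotenv` and `require_key` are hypothetical helpers, and the parser handles only simple `KEY=VALUE` lines.

```python
def parse_dotenv(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_key(values: dict[str, str], name: str = "TENSORLAKE_API_KEY") -> str:
    """Return the configured value or raise with an actionable message."""
    value = values.get(name, "")
    if not value:
        raise RuntimeError(f"{name} is not set; create a .env file and add it")
    return value
```

For example, `require_key(parse_dotenv(open(".env").read()))` would raise a clear error when the key is absent, assuming the file itself exists.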