-
Notifications
You must be signed in to change notification settings - Fork 4
refactor(common): add PhyTableUrl new-type for table base URL
#1432
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Introduce invariant-preserving newtype wrappers for physical table URLs to enforce type safety at compile time rather than relying on runtime string validation. - Add `PhyTableUrl` in common with `FromStr`, `Display`, and conversion traits - Add `TableUrl`/`TableUrlOwned` in metadata-db following the `Name` borrowing pattern - Update `physical_table` module API to use typed URL parameters - Convert tests to use the new URL types Signed-off-by: Lorenzo Delgado <lorenzo@edgeandnode.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Summary
This PR successfully introduces type-safe URL handling for physical table locations, replacing string-based URLs with dedicated newtype wrappers. The implementation follows good Rust patterns and improves type safety throughout the codebase.
Strengths
- Clear separation of concerns: The two-layer approach with
PhyTableUrlin common andTableUrl/TableUrlOwnedin metadata-db is well-designed - Good documentation: Clear explanations of the validation strategy and URL format
- Improved test quality: Test assertions now include descriptive messages that make failures easier to debug
- Type safety: The newtype pattern prevents accidental misuse of URL strings
Suggestions for Improvement
- Validation: Consider adding more comprehensive validation in the
FromStrimplementation to ensure URLs meet object store requirements (valid schemes, proper format) - Error handling: The
expect("database URL should be valid")pattern could be replaced with more graceful error handling - Safety documentation: Some safety comments mention "undefined behavior" which seems too strong for string validation issues
Minor Notes
- The borrowing pattern implementation in metadata-db following the
Namepattern is consistent and well-executed - The conversion traits between the two newtype layers are properly implemented with appropriate safety comments
Overall, this is a solid refactoring that improves type safety and code maintainability. The suggestions above are mostly minor improvements that could enhance the robustness of the implementation.
| /// ## Example | ||
| /// | ||
| /// ```text | ||
| /// s3://my-bucket/ethereum_mainnet/logs/01234567-89ab-cdef-0123-456789abcdef/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good documentation! The format description and example clearly explain the URL structure, which helps future maintainers understand what this type represents.
| /// Create a new [`PhyTableUrl`] from a [`url::Url`] without validation | ||
| /// | ||
| /// # Safety | ||
| /// The caller must ensure the provided URL is a valid object store URL. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider documenting what constitutes a "valid object store URL" here. While the comment mentions safety, it would be helpful to specify the validation requirements (e.g., must be parseable by url::Url, must have a supported scheme like s3/gs/azure/file).
| } | ||
|
|
||
| impl From<metadata_db::physical_table::TableUrlOwned> for PhyTableUrl { | ||
| fn from(value: metadata_db::physical_table::TableUrlOwned) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error handling here with expect("database URL should be valid") could be improved. Consider using a more graceful error handling approach that provides context about which URL failed to parse, perhaps by returning a Result or using a custom error type that includes the URL string in the error message.
| //! Types that convert into [`Url`] are responsible for ensuring invariants are met: | ||
| //! - URLs must be valid object store URLs (parseable by `url::Url`) | ||
| //! - URLs must have a supported scheme (s3, gs, azure, file) | ||
| //! - URLs should represent directory locations (typically ending with `/`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent documentation of the validation strategy! The clear separation between "maintaining invariants" vs "validating input" and the explanation of where validation occurs (at system boundaries) follows best practices for type safety.
| /// | ||
| /// # Safety | ||
| /// The caller must ensure the provided URL upholds the physical table URL invariants. | ||
| /// This method does not perform validation. Failure to uphold the invariants may |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The # Safety documentation mentions "undefined behavior" but this seems overly strong for a URL validation issue. Since this is Rust safe code working with strings, you won't get memory unsafety. Consider rephrasing to something like "Failure to uphold the invariants may cause logical errors or panics in downstream code."
|
|
||
| //* Then | ||
| assert!(*location_id > 0); | ||
| assert!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great improvement to the test assertions! The descriptive messages make test failures much more informative and easier to debug. This pattern should be encouraged throughout the test suite.
| type Err = PhyTableUrlParseError; | ||
|
|
||
| fn from_str(s: &str) -> Result<Self, Self::Err> { | ||
| let url = s.parse().map_err(|err| PhyTableUrlParseError { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding validation in the FromStr implementation to ensure the URL meets the requirements (valid scheme, proper format, etc.). Currently it only checks if it's parseable as a URL, but doesn't validate it's a valid object store URL.
| manifest_hash: impl Into<ManifestHash<'_>> + std::fmt::Debug, | ||
| table_name: impl Into<TableName<'_>> + std::fmt::Debug, | ||
| url: &str, | ||
| url: impl Into<TableUrl<'_>> + std::fmt::Debug, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good API design using the Into trait for the url parameter. This provides flexibility for callers while maintaining type safety.
JohnSwan1503
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
Can similarly replace url::Url in the structs/functions operating on the file_metadata table. Also, #1407 can be updated to reflect the new Url and its patterns for the gc_manifest table.
Introduce invariant-preserving newtype wrappers for physical table URLs to enforce type safety at compile time rather than relying on runtime string validation.
PhyTableUrlin common withFromStr,Display, and conversion traitsTableUrl/TableUrlOwnedin metadata-db following theNameborrowing patternphysical_tablemodule API to use typed URL parameters