Description
Problem
When using Pushcart for metadata-driven data ingestion, the source data sometimes lacks columns that are described in the metadata table. Currently, this causes the ingestion to fail or to process the data incorrectly, because the expected columns are absent.
Suggested Enhancement
It would be beneficial to introduce a mechanism to handle such discrepancies gracefully. One possible solution is to add an optional `nullable` field to the metadata table. This field would specify whether a missing source column can be treated as null during the data ingestion process.
Proposed Behavior
- Metadata Table Adjustment: Add a `nullable` column in the metadata table where each entry specifies if the corresponding data column can default to null when absent.
- Ingestion Logic: Modify the ingestion process to check for the `nullable` flag. If the source data column is missing and `nullable` is set to `true`, the column should be assumed as null in the ingested data.
- User Notification: Optionally, the system could log or notify when data columns are set to null due to the absence of source data, keeping the user informed of these adjustments.
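To make the proposed behavior concrete, here is a minimal sketch in plain Python. It does not use Pushcart's actual API or metadata schema: the function names `resolve_columns` and `ingest_rows`, the metadata key `column_name`, and the representation of rows as dictionaries are all assumptions made for illustration. Only the `nullable` flag comes from the proposal above.

```python
import logging
from typing import Any

logger = logging.getLogger("ingestion_sketch")


def resolve_columns(
    source_columns: list[str], metadata: list[dict[str, Any]]
) -> tuple[list[str], list[str]]:
    """Split metadata columns into those present in the source and those
    that are missing but may be filled with null.

    Raises ValueError if a missing column is not marked nullable.
    """
    available = set(source_columns)
    present, null_filled, missing_required = [], [], []
    for entry in metadata:
        name = entry["column_name"]  # hypothetical metadata key
        if name in available:
            present.append(name)
        elif entry.get("nullable", False):  # optional flag, defaults to False
            null_filled.append(name)
        else:
            missing_required.append(name)

    if missing_required:
        raise ValueError(f"Missing non-nullable source columns: {missing_required}")

    # User notification: report every column that will be filled with null.
    for name in null_filled:
        logger.warning("Source column %r is absent; ingesting it as null", name)
    return present, null_filled


def ingest_rows(
    rows: list[dict[str, Any]],
    source_columns: list[str],
    metadata: list[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Return rows shaped according to the metadata, in metadata column order."""
    _, null_filled = resolve_columns(source_columns, metadata)
    null_set = set(null_filled)
    ordered = [entry["column_name"] for entry in metadata]
    return [
        {col: (None if col in null_set else row.get(col)) for col in ordered}
        for row in rows
    ]


if __name__ == "__main__":
    metadata = [
        {"column_name": "id"},
        {"column_name": "email", "nullable": True},
    ]
    print(ingest_rows([{"id": 1}], ["id"], metadata))
```

Treating an omitted `nullable` value as `False` keeps the current strict behavior as the default, so existing metadata tables would not change meaning if the field were added.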
Benefits
Implementing this feature would make the framework more robust and flexible in handling various real-world data scenarios, reducing the need for manual data pre-processing and enhancing the overall usability of the Pushcart framework for data ingestion tasks.
Please consider this feature for upcoming releases, as it would significantly enhance error handling and data integrity in the ingestion process.