Commit cbfb904

touch up docstrings
1 parent 5bcc9ab commit cbfb904

1 file changed: +17 -1 lines changed

datafusion/expr/src/logical_plan/plan.rs

Lines changed: 17 additions & 1 deletion
@@ -2609,13 +2609,29 @@ impl PartialOrd for Window {
 }
 
 /// Communicates the desired ordering of the output of a scan operation.
-/// This can be used by implementers of [`TableProvider`] to optimize the order in which data is output from the scan.
+///
+/// Preferred orderings can potentially help DataFusion optimize queries, even in cases
+/// when the output does not completely follow that order. This is information passed
+/// to the scan about what might help.
+///
+/// For example, for a query with `ORDER BY time DESC LIMIT 10`, DataFusion's dynamic
+/// predicates and TopK operator will work better if the data is roughly ordered by descending
+/// time (more recent data first).
+///
+/// Implementers of [`TableProvider`] should use this information to optimize the order in which data is output from the scan.
+///
 /// It is a hint and not a requirement:
 /// - If this information is completely ignored, e.g. data is scanned randomly, the query will still be correct because a sort will be applied to the data.
 /// - Partially ordered data will also be re-sorted but this may result in optimizations like early stopping, additional data pruning, reduced memory usage during the sort, etc.
 /// - If the scan produces exactly the requested ordering, and sets its properties to reflect this, upstream sorts may be optimized away.
+///
+/// Actually removing unnecessary sorts is done at the physical plan level: logical operators like a join may or may not preserve ordering
+/// depending on what physical operator is chosen (e.g. HashJoin vs. SortMergeJoin).
+/// If you, as a [`TableProvider`] implementer, would like to eliminate unnecessary sorts, you should make sure the [`ExecutionPlan`]
+/// you produce reflects the ordering in its properties.
 ///
 /// [`TableProvider`]: https://docs.rs/datafusion/latest/datafusion/catalog/trait.TableProvider.html
+/// [`ExecutionPlan`]: https://docs.rs/datafusion/latest/datafusion/physical_plan/trait.ExecutionPlan.html
 #[derive(Clone, PartialEq, Eq, Hash, PartialOrd, Default)]
 pub struct ScanOrdering {
     /// Optional preferred ordering for the scan that matches the output order of upstream query nodes.
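
Note on the `ORDER BY time DESC LIMIT 10` example in the docstring: the sketch below is a toy model of why roughly descending input can help a TopK-style operator. It uses only the Rust standard library and is not DataFusion's TopK or dynamic filter implementation. The function name `top_k_desc` and the heap-update counter are hypothetical and exist only for illustration. The current minimum of the heap plays the role of a dynamic threshold: rows at or below it are skipped without touching the heap.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Toy TopK (hypothetical helper, not a DataFusion API).
/// Returns the `k` largest values in descending order, plus the number of
/// times the heap had to be modified while scanning `rows`.
fn top_k_desc(rows: &[i64], k: usize) -> (Vec<i64>, usize) {
    // Min-heap of the best `k` values seen so far; its top is the threshold.
    let mut heap: BinaryHeap<Reverse<i64>> = BinaryHeap::with_capacity(k + 1);
    let mut heap_updates = 0;

    for &value in rows {
        if heap.len() < k {
            // Still filling the heap: every row is kept.
            heap.push(Reverse(value));
            heap_updates += 1;
        } else if let Some(&Reverse(threshold)) = heap.peek() {
            // Only rows strictly above the current threshold can enter the top k.
            if value > threshold {
                heap.pop();
                heap.push(Reverse(value));
                heap_updates += 1;
            }
        }
    }

    // Final sort of at most `k` values; needed regardless of input order.
    let mut out: Vec<i64> = heap.into_iter().map(|Reverse(v)| v).collect();
    out.sort_unstable_by(|a, b| b.cmp(a));
    (out, heap_updates)
}
```

With 100 distinct rows and `k = 10`, fully descending input modifies the heap only 10 times, while fully ascending input modifies it 100 times. The returned top 10 is identical in both cases, which mirrors the docstring's point that the ordering is a hint and not a requirement.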
