Description
Currently col_paths is implemented as
function (x)
{
    if (!is(coltree(x), "LayoutColTree")) {
        stop("I don't know how to extract the column paths from an object of class ",
            class(x))
    }
    make_col_df(x, visible_only = TRUE)$path
}
The problem is that make_col_df does a lot of work unrelated to column paths. Pruning or scoring functions, if implemented naively, may call col_paths for every row of a table. On large tables we have seen repeated col_paths calls take up to 50% of the total pruning/sorting time. Each call in those contexts is guaranteed to return the same set of paths, so that time is entirely wasted.
I propose we extend the InstantiatedColumnInfo class to cache its set of column paths, the way it already does for column subset expressions. This would make repeated col_paths calls acceptable, since each one would be effectively free.
In fact, the result of make_col_df doesn't depend on font the way make_row_df does, so I think we could consider caching the full result of make_col_df rather than just the col_paths...
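To make the caching proposal concrete, here is a minimal sketch using a toy S4 class. It does not use the real rtables internals: ToyColInfo, col_paths_cache, compute_paths, toy_col_info, and toy_col_paths are all hypothetical names standing in for InstantiatedColumnInfo, a new cache slot, the make_col_df traversal, the constructor, and col_paths respectively.

```r
library(methods)

# Hypothetical stand-in for InstantiatedColumnInfo; the real class has
# different slots. col_paths_cache is the proposed (assumed) cache slot.
setClass("ToyColInfo", slots = c(tree = "list", col_paths_cache = "list"))

# Counter used only to show how often the expensive traversal runs.
counter <- new.env()
counter$n <- 0L

# Stand-in for the expensive make_col_df()-style walk over the column tree.
compute_paths <- function(tree) {
  counter$n <- counter$n + 1L
  unname(lapply(tree, function(leaf) leaf$path))
}

# The constructor computes the paths once and stores them on the object.
toy_col_info <- function(tree) {
  new("ToyColInfo", tree = tree, col_paths_cache = compute_paths(tree))
}

# The accessor is now a plain slot read, so repeated calls are cheap.
toy_col_paths <- function(ci) ci@col_paths_cache

# Usage: a per-row loop, as a naive pruning function might do.
tree <- list(list(path = c("ARM", "A")), list(path = c("ARM", "B")))
ci <- toy_col_info(tree)
for (i in seq_len(1000)) {
  paths <- toy_col_paths(ci)
}
counter$n  # number of times the expensive walk actually ran
```

One design point this sketch glosses over: anything that changes the column structure would need to recompute the cache (or rebuild the object through the constructor), otherwise the stored paths go stale.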