Dear authors,
I recently came across your work and find it very impressive. I have a few questions about the paper, detailed below:
Octree Mask
You mention that the octree mask comes from image segmentation, and Figure 2 shows the architecture. The gray area represents the parent queries, and the white area represents the leaf queries. The octree mask is then passed through the octree encoder, where it is adjusted via Iterative Structure Rectification. I notice that you first sort the octree mask to identify high- and low-confidence areas.
Why are there white grid cells in the high-confidence areas and gray grid cells in the low-confidence areas? As I understand it, a gray cell indicates high confidence that the cell will be divided, yet here gray cells appear in the low-confidence areas.
Query Selection and Confidence Adjustment
You mention that you select the queries with low confidence and then use an MLP to predict a new confidence. However, the paper also says that you only retain queries that do not require splitting. If only leaf queries are retained at the beginning, the corresponding queries may no longer be available later for the confidence adjustment. How is this case handled?
Temporal Self-Attention
If the previous frame's octree query structure differs from the current frame's, can Temporal Self-Attention still be performed between them?
Octree Query to Dense Query Transformation
In Section 3.3 of your paper, you mention that when a dense query is needed, you apply an inverse operation to map the octree query back into a dense query. Could you explain this operation in more detail? It relates to the same issue as my second question (Query Selection and Confidence Adjustment).
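To make this question concrete, here is a minimal sketch of how I imagine such an inverse operation could work. This is only my guess and is not taken from the paper or the code: the function name `octree_to_dense`, the `(level, x, y, z)` key layout, and the rule of copying each leaf's feature to every fine voxel it covers are all my own assumptions.

```python
from itertools import product


def octree_to_dense(leaves, depth):
    """Scatter leaf queries onto a dense grid of side 2**depth (hypothetical helper).

    `leaves` maps (level, x, y, z) -> feature. A leaf at `level` is assumed
    to cover a cube of side 2**(depth - level) voxels at the finest level,
    and its feature is simply copied to each of those voxels.
    """
    side = 2 ** depth
    dense = {}
    for (level, x, y, z), feat in leaves.items():
        span = 2 ** (depth - level)
        for dx, dy, dz in product(range(span), repeat=3):
            voxel = (x * span + dx, y * span + dy, z * span + dz)
            if voxel in dense:
                raise ValueError(f"voxel {voxel} is covered by two leaves")
            dense[voxel] = feat
    if len(dense) != side ** 3:
        # This is the situation I am asking about: a region with no leaf query.
        raise ValueError("leaf queries do not cover the whole grid")
    return dense


# Toy octree: a 2x2x2 coarse grid (level 1) in which only cell (0, 0, 0)
# has been split into eight level-2 leaves.
cells = list(product(range(2), repeat=3))
leaves = {(1, x, y, z): "coarse" for (x, y, z) in cells if (x, y, z) != (0, 0, 0)}
leaves.update({(2, x, y, z): "fine" for (x, y, z) in cells})

dense = octree_to_dense(leaves, depth=2)
```

In this sketch the mapping is only well defined when the retained leaves tile the whole volume exactly once. If rectification changes which cells are split after some queries have already been discarded, I do not see where the missing features would come from, which is why I am asking.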
Thanks!!