Skip to content

Conversation

@owen-mc
Copy link
Contributor

@owen-mc owen-mc commented Nov 27, 2025

This PR contains wo small tweaks to the go library to remove some results from the data flow consistency queries.

  • The "reverse reads" query wasn't happy. Consider a SelectorExpr a.f which is reading a promoted method, and it's equivalent to a.B.C.f if we add all the implicit reads of embedded fields. Currently we only create a post-update node for a. This PR also adds them for the implicit field read nodes corresponding to a.B and a.B.C, which should allow the "reverse read" feature of the data flow library to work.
  • The "identical step" query highlighted some steps from a node to itself. This turned out to be when a variable is read once in a for loop. I have added a condition that steps between adjacent uses of an SSA variable to exclude this case.

QA has been run and shows no performance problems or alert changes (after I fixed a mistake in how I got rid of the identical steps).

Commit-by-commit review is recommended.

Copilot AI review requested due to automatic review settings November 27, 2025 16:44
@owen-mc owen-mc requested review from a team as code owners November 27, 2025 16:44
Copilot finished reviewing on behalf of owen-mc November 27, 2025 16:46
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses small issues identified by data flow consistency checks in the Go language implementation. The main focus is on fixing the reverseRead consistency check violations, which occur when a read step exists from node a to a.b, node a.b has a post-update node, but node a is missing its post-update node.

Key Changes

  • Extended post-update node generation for method call receivers to include all implicit field reads in promoted field access chains
  • Refactored skipImplicitFieldReads in FlowSummaryImpl.qll to use a new utility predicate lookThroughImplicitFieldRead
  • Added exclusion predicates to the Go consistency checks for known acceptable edge cases
  • Fixed an SSA implementation issue where adjacent variable references could incorrectly include self-loops

Reviewed changes

Copilot reviewed 63 out of 64 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
shared/dataflow/codeql/dataflow/internal/DataFlowImplConsistency.qll Added localFlowIsLocalExclude exclusion predicate and documentation for reverseRead check
go/ql/lib/semmle/go/dataflow/internal/DataFlowImplConsistency.qll Implemented Go-specific exclusion predicates for consistency checks
go/ql/lib/semmle/go/dataflow/internal/DataFlowNodes.qll Extended post-update node creation to handle promoted field access chains using transitive closure
go/ql/lib/semmle/go/dataflow/internal/FlowSummaryImpl.qll Refactored to use new IR::lookThroughImplicitFieldRead utility predicate
go/ql/lib/semmle/go/dataflow/SsaImpl.qll Added condition to prevent self-loops in adjacent variable reference detection
go/ql/lib/semmle/go/controlflow/IR.qll Added lookThroughImplicitFieldRead utility predicate for consistent field read navigation
go/ql/consistency-queries/UnexpectedFrontendErrors.ql Moved query to consistency-queries directory
go/ql/lib/change-notes/2025-11-26-unexpected-frontend-errors-query-moved.md Change note documenting the query move
go/Makefile Updated consistency queries path references
Multiple .expected files Updated test expectations with reverseRead consistency check results

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@owen-mc owen-mc marked this pull request as draft November 27, 2025 16:59
@owen-mc owen-mc force-pushed the go/fix-data-flow-consistency-checks branch from 62e254e to ac90f7a Compare December 2, 2025 13:33
@owen-mc owen-mc force-pushed the go/fix-data-flow-consistency-checks branch from ac90f7a to a20c8cf Compare December 4, 2025 12:37
@owen-mc
Copy link
Contributor Author

owen-mc commented Dec 4, 2025

QA showed a few alert changes which helped me find a bug. I had previously tracked the identity steps down to adjacentVarRefs in SsaImpl.qll, which can hold from an expression to itself (i.e. b1=b2 and i1=i2), for example when in a variable is updated once in a for loop. I added a line to remove those rows, but that turned out to be wrong, because we want to know if the post-update node of x flows to x, for example. So the fix is to remove the identity steps in the disjunct of basicLocalFlowStep which deals with flow between adjacent uses of the same SSA variable, because at that point we can distinguish post-update nodes.

@owen-mc
Copy link
Contributor Author

owen-mc commented Dec 4, 2025

I ran DCA on the 16 repos which had alert differences in the QA run. This confirmed that the fix has worked and there are now no alert differences. There are still no noticeable performance changes. So I think this is ready for review.

@owen-mc owen-mc added the no-change-note-required This PR does not need a change note label Dec 4, 2025
@owen-mc owen-mc marked this pull request as ready for review December 4, 2025 14:03
// then mri.getReceiver() will give us the implicit field read a.b.c
// and we want to have post-update nodes for a, the implicit field
// read a.b and the implicit field read a.b.c.
insn = IR::lookThroughImplicitFieldRead*(mri.getReceiver())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this instead be a general rule saying: If we have a post-update node for a.b then we also need one for a?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking more closely, I see that actually we already deal with reads of promoted fields (in insn = getAWrittenInsn()), and this change is actually about receivers of calls to promoted methods. (I've just updated the description.) To your point, doing that would make the predicate recursive and less transparent. Since I've determined exactly what was missing from the previous definition, I feel like the most straightforward thing is to just add it to the predicate.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, as long as the logic covers satisfies the implication I mention above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Go no-change-note-required This PR does not need a change note

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants