-
Notifications
You must be signed in to change notification settings - Fork 365
Description
I'm having an issue where relationships are causing the wrong object to be deleted. Unfortunately this is in a very complex code path, so I can't give any steps to reproduce the behaviour, but I can report my observations. While I can't guarantee that there are no bugs in our code, the behaviour I'm observing from Yap does seem inconsistent.
This is a pretty serious bug because it can cause seemingly random data deletions in a Yap app that uses relationships.
Mechanism
I tracked this down to [YapDatabaseTransaction removeObjectForCollectionKey:withRowId:] being called with the wrong row ID. I added the following code block to the start of the method and put a breakpoint in it.
YapCollectionKey *storedKey = [self collectionKeyForRowid:rowid];
if (![storedKey isEqual:cacheKey]) {
int64_t correctRowID = 0;
BOOL result = [self getRowid:&correctRowID forCollectionKey:cacheKey];
NSLog(@"====rssr yap is removing rowid %ld. Caller thinks the key is %@ but it's actually %@. The correct rowid to delete is most likely %ld (fetch rowid result was %@)", rowid, cacheKey, storedKey, correctRowID, result ? @"TRUE": @"FALSE");
}
When the issue happens I get the following log:
2021-01-14 18:14:12.558989-0500 Daylite[2351:1444308] ====rssr yap is removing rowid 9691. Caller thinks the key is <YapCollectionKey collection(RemoteMutationsReferences) key(72D89780-77D6-436F-BEEE-5024AA600D2F_1)> but it's actually <YapCollectionKey collection(accounts) key(6)>. The correct rowid to delete is most likely 0 (fetch rowid result was FALSE)
So it looks like the wrong row id is stored in the edge (it doesn't match the destination key)
The call to remove the object is in YapDatabaseRelationshipTransaction.m, at the end of "Enumerate all edges where source node is the deleted node" around line 3727:
[databaseRwTransaction removeObjectForKey:edge->destinationKey
inCollection:edge->destinationCollection
withRowid:edge->destinationRowid];
My Steps To Reproduce
The workflow in our app is so complicated that it isn't possible to glean the real requirements for this issue from them, but I think there is some information about the preconditions.
- View the messages list (it's backed by a YapDatabaseAutoView).
- Start swiping to delete messages from the list.
- Each deletion will trigger a server sync that will push the deletion up to the server and then fetch any new data (There does not need to be any new data to trigger the bug however).
- After a certain number of messages are deleted one of them will come back. We do have code that will read back a deleted object if we fail to push the deletion to the server, but I haven't verified that this is the cause of the object coming back (due to some of the issues listed below).
- Swipe to delete the same message again. This second deletion succeeds locally, but when pushed to the server it causes the entire account object to be deleted while cleaning up some unrelated relationships.
Notes & Observations
- In my debug code above I try to fetch the correct rowID from the CollectionKey and it comes back as 0, so I'm guessing that the intended target of the edge has already been deleted.
- I don't think that the bug is caused when the edge is firing in step 5, I believe that it's actually in step 4 when the object comes back. If I quit the application between steps 4 and 5 the message that 'comes back' is not there, so it looks like it wasn't fetched back by our application and persisted, it's an in memory issue.
- If you look at the output of the debug log I put above it tries to resolve the real rowid of the destination, and it comes back as 0. I believe that the 'real' target of the relationship has already been deleted, but the machinery in edge processing doesn't realize this because of the rowid issue. (That's just a guess at this point). I have stepped through the call to fetch the rowid and the zero doesn't come from the cache and it does do a database query to attempt to retrieve the rowid.
- I have code that performs a rollback in some cases, but I did not see the log from this code at all in some cases where this issue has occurred.
- The edge that is firing was not created in the same transaction as the one that is deleting everything.
- While investigaing this we have noticed a few other inconsistencies where some of our objects were in an unexpected state. Broken references between objects that are updated in a single transaction. We weren't able to fully rule out a bug in our code, so I don't know if that issue is related to this one.
- If the message that 'came back' was caused by our sync code, the new message would have been assigned a new id, since we assign ids to objects sequentially and don't re-use them.
- We upgraded our version of Yap recently, and commits b3d4599 and dda1ece are both new to our repo. I don't believe that we had this problem before, but it is a bit hard to tell if this issue is happening or not.
- This issue is heavily timing dependant. I added a lot of logs to both our code and to Yap and the issue stopped happening. Clearly this is exposing an issue in some special case for concurrent code that doesn't get hit all the time.
- I can reproduce this issue with a debug build of Yap.
- The object that is incorrectly targeted by the edge is the root of our object hierarchy. Not only does this trigger the deletion of everything, I suspect that it is the first row in our database, which may be a clue to the mechanism of operation.
Workaround
As a workaround I'm going to try removing the edge that is firing and deleting the wrong objects. We don't need this particular relationship anymore, so hopefully it will prevent this issue. But that seems like a band-aid solution, since the underlying problem is likely to remain. I'm a bit worried about trusting the relationship extension.