Data tracking
The Wikilink tool monitors the page-links-change EventStream (https://wikitech.wikimedia.org/wiki/Event_Platform/EventStreams) for additions and removals of links that match tracked URL patterns.
The linkevents_collect.py management command collects this data.
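As a rough illustration of what one stream event carries, the sketch below pulls the external link changes out of a single JSON event. The field names (added_links, removed_links, external, link, performer.user_text) are assumptions about the page-links-change schema, the function name external_link_changes is hypothetical, and the sample event is invented for the example; this is not the code in linkevents_collect.py.

```python
import json


def external_link_changes(raw_event):
    """Yield (change, url, username) for each external link in one event.

    Assumption: the event is a JSON object using the field names below.
    """
    event = json.loads(raw_event)
    username = event.get("performer", {}).get("user_text")
    for key, change in (("added_links", "added"), ("removed_links", "removed")):
        for link in event.get(key, []):
            # Internal wiki links are ignored; only external URLs matter here.
            if link.get("external"):
                yield change, link["link"], username


# Invented sample event, shaped according to the assumptions above.
sample_event = json.dumps(
    {
        "performer": {"user_text": "ExampleUser"},
        "added_links": [
            {"link": "https://www.newspapers.com/clip/1", "external": True},
            {"link": "/wiki/Foo", "external": False},
        ],
        "removed_links": [
            {"link": "https://example.org/old", "external": True},
        ],
    }
)

changes = list(external_link_changes(sample_event))
```

Each tuple produced this way would then go through the per-event checks described in this section.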
For each event in the stream, the script:
- Checks whether the event concerns a URL the tool is tracking (i.e. we have a URLPattern object which matches the URL in the event)
- If so, gets or creates a User object for the user who triggered the event
- Finds all URL patterns matching this event (we might, for example, be tracking clipping.newspapers.com in addition to newspapers.com)
- Per the comment at https://github.com/WikipediaLibrary/externallinks/blob/41afcc8bfbc37edd8fe38428244a9807684d66f5/extlinks/links/management/commands/linkevents_collect.py#L137, we assume this link event will only relate to a single Organisation object, and check that organisation's username list (see below) to see if we have a match. If so, on_user_list is set to True.
- Then we save the LinkEvent!
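The steps above can be sketched in plain Python. This is a simplified stand-in, not the actual Django code: the dataclasses only mimic the URLPattern, Organisation, and LinkEvent models, the function process_link is hypothetical, a set stands in for the User table, and pattern matching is reduced to a substring test as an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Organisation:
    name: str
    usernames: set = field(default_factory=set)  # the organisation's username list


@dataclass
class URLPattern:
    url: str
    organisation: Organisation


@dataclass
class LinkEvent:
    link: str
    username: str
    patterns: list
    on_user_list: bool


def process_link(link, username, tracked_patterns, users, saved_events):
    """Apply the per-event steps to one link; return the saved event or None."""
    # Steps 1 and 3: find every tracked pattern that matches the URL
    # (substring matching is a simplifying assumption).
    matching = [p for p in tracked_patterns if p.url in link]
    if not matching:
        return None
    # Step 2: "get or create" the user (a set stands in for the User table).
    users.add(username)
    # Step 4: assume the event relates to a single organisation.
    organisation = matching[0].organisation
    on_user_list = username in organisation.usernames
    # Step 5: save the link event.
    event = LinkEvent(link, username, matching, on_user_list)
    saved_events.append(event)
    return event


newspapers = Organisation("Newspapers.com", {"Alice"})
patterns = [
    URLPattern("newspapers.com", newspapers),
    URLPattern("clipping.newspapers.com", newspapers),
]
users, saved = set(), []
event = process_link(
    "https://clipping.newspapers.com/clip/123", "Alice", patterns, users, saved
)
```

Here the example link matches both patterns, and because "Alice" is on the organisation's username list, the saved event has on_user_list set to True.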
For each link event, we cross-reference the user with a list of users from the Library Card platform. Ideally this would be destination-agnostic, so the tool can support other use cases, but for now we've implemented this in a way that only supports The Wikipedia Library.
users_update_lists.py runs on a regular basis to update user lists. For each organisation, it reads the username list URL field and requests the list from the API expected at that URL.
The script assumes the response follows the (not hugely helpful) format defined by the user serializer and exposed through the AuthorizedUsers view.
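A minimal sketch of such an update loop follows, under stated assumptions: the function update_user_lists is hypothetical, the HTTP call is injected as a fetch callable so the example runs offline, and the response is assumed to be a JSON list of objects with a wp_username key. The real payload is whatever the user serializer produces and may differ.

```python
import json


def update_user_lists(org_urls, fetch, username_key="wp_username"):
    """Return {organisation name: set of usernames} for organisations with a URL.

    org_urls maps organisation names to their username list URL (or None).
    fetch takes a URL and returns the response body as text.
    username_key is an assumed field name, not confirmed by the source.
    """
    user_lists = {}
    for org_name, url in org_urls.items():
        if not url:
            # Organisations without a username list URL are skipped.
            continue
        records = json.loads(fetch(url))
        user_lists[org_name] = {record[username_key] for record in records}
    return user_lists


def fake_fetch(url):
    # Stand-in for a real HTTP request; returns an invented payload.
    return json.dumps([{"wp_username": "Alice"}, {"wp_username": "Bob"}])


result = update_user_lists(
    {"Example Org": "https://example.org/api/users", "No List Org": None},
    fake_fetch,
)
```

Injecting fetch keeps the parsing logic testable without network access; a real run could pass a function that wraps urllib.request.urlopen and decodes the body.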