@mkovalua
Contributor

Ticket

https://openscience.atlassian.net/browse/ENG-6733

Purpose

When GitLab data is rendered on the OSF frontend, WaterButler does not return a value for the ‘modified’ attribute, so sorting by modification date is not available.

Changes

Implemented async calls to the GitLab API endpoint https://gitlab.com/api/v4/projects/{}/commits?path= to gather the modified datetime for each item. The related changes for this issue are in CenterForOpenScience/osf.io#10956.
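
The per-item lookups can be sketched roughly as follows. This is a minimal illustration, not the code in this PR. It assumes GitLab's documented `repository/commits` endpoint with the `path`, `ref_name` and `per_page` parameters, and reads the `committed_date` field of the newest commit. `fetch_json` is a hypothetical coroutine (URL in, decoded JSON out) standing in for whatever HTTP client the provider uses.

```python
import asyncio
from urllib.parse import quote, urlencode

GITLAB_API = "https://gitlab.com/api/v4"


def last_commit_url(project_id, path, ref):
    """Build the URL that asks for the newest commit touching `path`."""
    query = urlencode({"path": path, "ref_name": ref, "per_page": 1})
    project = quote(str(project_id), safe="")
    return f"{GITLAB_API}/projects/{project}/repository/commits?{query}"


async def fetch_modified(paths, project_id, ref, fetch_json, max_concurrency=10):
    """Return {path: committed_date or None}, fetching the paths concurrently.

    `fetch_json` is a hypothetical helper supplied by the caller.
    """
    # Cap the number of in-flight requests so large folders do not flood the API.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def one(path):
        async with semaphore:
            commits = await fetch_json(last_commit_url(project_id, path, ref))
        # The endpoint returns newest-first; an empty list means no commit found.
        return path, (commits[0]["committed_date"] if commits else None)

    return dict(await asyncio.gather(*(one(p) for p in paths)))
```

Bounding concurrency with a semaphore is a design choice of this sketch; it keeps one request per folder item, as described above, while limiting how many run at once.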

Side effects

This requires an additional API call for each folder item to get its datetime, because the https://gitlab.com/api/v4/projects/{}/repository/tree?ref= API call does not return a last-modified date.


A GraphQL approach could also be investigated, to make fewer API calls and extract only the needed GitLab data. I tried it and did not see a difference in timing compared with the calls above.

So far I have also not found an API call that combines both https://gitlab.com/api/v4/projects/{}/repository/tree?ref= and https://gitlab.com/api/v4/projects/{}/commits?path= and returns only the needed data.


I also thought about how to improve performance. For example, we could avoid the additional calls if the last commit saved in the OSF ORM is the same as the one in WaterButler, but I did not find a repository item in the OSF ORM (database) that stores the last commit hash and related items as attributes.

Commit hashes are saved for files in BaseFileNode, in _history = DateTimeAwareJSONField(default=list, blank=True), which makes good performance harder to achieve. The following steps would be needed:

  1. determine all files related to one repository
  2. get the last commit in _history for each file
  3. compare the datetimes
  4. get the specific commit hash
  5. check whether that commit hash is the latest using the GitLab API (if yes, no additional calls to https://gitlab.com/api/v4/projects/{}/commits?path= are needed; otherwise, make the calls to update the datetime).
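
The idea behind these steps, skipping the per-file calls when the repository head has not moved, can be sketched as below. This is only an illustration of the caching idea, not an OSF or WaterButler implementation. `ModifiedCache`, `get_modified` and `fetch_modified_for` are hypothetical names, and the sketch assumes the current head commit hash is already known from a single GitLab API call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ModifiedCache:
    """Hypothetical per-repository cache of path -> modified datetime."""
    head_sha: Optional[str] = None
    modified: Dict[str, Optional[str]] = field(default_factory=dict)


def get_modified(
    cache: ModifiedCache,
    current_head_sha: str,
    paths: List[str],
    fetch_modified_for: Callable[[List[str]], Dict[str, Optional[str]]],
) -> Dict[str, Optional[str]]:
    """Return modified datetimes, calling the API only for uncached paths."""
    # A new head commit may have touched any file, so drop everything cached.
    if cache.head_sha != current_head_sha:
        cache.modified.clear()
        cache.head_sha = current_head_sha

    # Only paths not seen since the last head change need extra API calls.
    missing = [p for p in paths if p not in cache.modified]
    if missing:
        cache.modified.update(fetch_modified_for(missing))

    return {p: cache.modified.get(p) for p in paths}
```

Invalidating the whole cache on any new head commit is the simplest safe policy; a finer-grained scheme would need to know which paths each commit touched.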

The aim of these changes is to get the datetime. We need to decide whether it is needed at all and, if so, how to speed it up, for example with the approach I mentioned above (adding caching) or any other ideas.

Although it was not the purpose of the ticket, I suppose the created time of a file may also be needed in other workflows.

QA Notes

Deployment Notes

…m WaterButler service on sorting attempt issue