Skip to content

Conversation

@mjiori
Copy link
Contributor

@mjiori mjiori commented Sep 17, 2025

This PR is a draft. There are known issues and tests may not pass.

This PR provides a combination of DB migrations and code changes to help address the rising size and cost of the Registry RDS database.

  • column removed: created_at
  • column removed: updated_at
  • premis_events.event_type has been fully converted from a varchar type to an int type. A new lookup table has been added to match ints to strings, so we don’t lose any data.
  • premis_events.object
  • premis_events.agent
  • storage URLs

The migrations are separated simply to make them easier to run on the large production database, and also in the service of separation of concerns.

Numerous changes were necessary in Registry and Preservation Services to accommodate these database changes.

Manual testing:

  • Ingest bag
  • Restore bag
  • Delete bag
  • View events (type, object, agent)

@diamondap
Copy link
Member

This looks good so far. One question: in the migration 013_shrink_db_size.sql, you add three lookup tables for event type, agent and object, and you create foreign key constraints in premis_events pointing to the lookup tables. Where/when do you actually populate the lookup tables? If the lookup tables aren't populated, it seems the migration will fail as the foreign keys aren't there.

@mjiori
Copy link
Contributor Author

mjiori commented Sep 19, 2025

This looks good so far. One question: in the migration 013_shrink_db_size.sql, you add three lookup tables for event type, agent and object, and you create foreign key constraints in premis_events pointing to the lookup tables. Where/when do you actually populate the lookup tables? If the lookup tables aren't populated, it seems the migration will fail as the foreign keys aren't there.

Correct, I have not added those INSERT statements to populate the lookup tables just yet. Will do.

@diamondap
Copy link
Member

This looks good. Two minor questions/issues:

  1. In constants.go, all the event types are currently set to zero. I assume you'll change that before merging?
  2. In premis_event.go, the Validate() method complains if event.Agent and event.Object are less than zero. It should probably complain if they are less than or equal to zero. By default, Go sets int values to zero, and zero doesn't match any of the event type values defined in db/migrations/013_shrink_db_size.sql. That means Validate() could easily accept empty Agent and Object values that don't map to anything in the DB.

@mjiori mjiori changed the title DRAFT: DB Optimizations DB Optimizations Dec 16, 2025
@mjiori mjiori marked this pull request as ready for review December 16, 2025 07:34
@mjiori mjiori requested a review from diamondap December 16, 2025 07:37
@mjiori mjiori changed the base branch from master to qa December 16, 2025 07:47
Copy link
Member

@diamondap diamondap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The db migration starts with number 013. There's another db migration in the authy removal or MFA branch that also has a migration starting with 013. Registry applies the migrations in order, so you may want to renumber one of these 013 migrations and be sure they don't conflict with each other. Hopefully, they can be applied in any order without stepping on each other.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants