Skip to content

Conversation

@curranosaurus
Copy link

@curranosaurus curranosaurus commented Dec 30, 2024

Addresses #93 by adding schema support to Persistent. You specify a Persistent model's schema using syntax like schema=foo.
I worked on this together with @benjonesy.

Design challenges and decisions

"Schema" means many things to many backends

In this PR, we use the word "schema" to refer to various devices that backends offer for namespacing tables.

  • ✔️ In PostgreSQL, a schema exists in between the level of a table and a database.
  • ✔️ In MySQL and SQLite, schemas and databases are synonymous concepts.
  • ❌ MongoDB and Redis do not have a concept analogous to schemas. This PR does not extend support to these backends.

Schemas may or may not exist

We considered designing this PR so that Persistent would create schemas for the user if they did not already exist. But we chose not to do that. This means that local development with Persistent using schemas will require some manual SQL migrations to create the schemas that the user wants. But for conventional use-cases, schema creation will be a rare event, so this is not very onerous for developers. Below are some issues one would have to solve to make Persistent automate schema creation for the developer:

Three-stage migrations

RIght now, the CREATe TABLE DDL statements are sorted to the top of the migration output, so that the (potentially recursive) foreign-key constraints are created only after all of the tables they reference exist. If we also wanted Persistent to ensure the existence of schemas, we would need to sort backend-appropriate versions of CREATE SCHEMA statements above the CREATe TABLE statements.

Granting permissions in MySQL

Before a connection can use a newly created database in MySQL, one must run a GRANT statement to give the current user the requisite DDL operations to run CREATe TABLE statements. We would somehow need to make Persistent aware of the GRANT statement required.

File name and directory in SQLite

SQLite's invocation for creating a new database also specifies the file name and location for that database's .db file. Persistent would need to be opinionated about this directory scheme, or we would have to add a configuration option, if we wanted to support schema creation in SQLite.

PR Template checklist

Before submitting your PR, check that you've:

  • Documented new APIs with Haddock markup
  • Added @since declarations to the Haddock
  • Ran stylish-haskell on any changed files.
  • Adhered to the code style (see the .editorconfig file for details)

After submitting your PR:

  • Update the Changelog.md file with a link to your PR
  • Bumped the version number if there isn't an (unreleased) on the Changelog
  • Check that CI passes (or if it fails, for reasons unrelated to your change, like CI timeouts)

curranosaurus and others added 30 commits September 30, 2024 15:56
Add new `entitySchema` field and refactor callsites of `entityDB` to use `getEntityDBName` instead
[INT-844] prepend schema to entity db name
…ntityDBName

revert change to `getEntityDBName`
…tters-for-entitySchema

add setters and getters for `entitySchema`
…b.com:curranosaurus/persistent into curran/update-entitySchema-to-its-own-newtype
…o-its-own-newtype

Create newtype for schema name
…t-schema

update sqlQQ test to consider a case with a schema
…s-migrations

Use new entitySchema field for Postgres migrations
…sql-queries

Start using table schema in queries with Postgres
@curranosaurus
Copy link
Author

Alright, I think I made it through all of your requests, @parsonsmatt. This should be ready for another round of review. Thanks for your advice!

I think there's further work to support this on redis and mong, even if it's just mconcat [schemaName, "_", tableName]. But I'll defer that decision to someone who is using those backends.

Yeah, I suspect you're right that people will want to apply this feature to those backends. But I don't know what namespacing conventions people will expect for those data stores. So I agree that we should leave that decision to a future contributor.

@curranosaurus
Copy link
Author

Looks like the MySQL schema setup failed:


Run mysql -utest -ptest -h127.0.0.1 --port=33306 test -e "CREATE DATABASE foo; GRANT ALL ON foo.* to 'test'@'localhost';"
  mysql -utest -ptest -h127.0.0.1 --port=33306 test -e "CREATE DATABASE foo; GRANT ALL ON foo.* to 'test'@'localhost';"
  shell: /usr/bin/bash -e {0}
  env:
    CONFIG: --enable-tests --enable-benchmarks
Warning: arning] Using a password on the command line interface can be insecure.
ERROR 1044 (42000) at line 1: Access denied for user 'test'@'%' to database 'foo'
Error: Process completed with exit code 1.

I am guessing it's because the test user doesn't have rights to create databases or grant itself access to those databases. I'll probably need to rewrite the CLI invocation to log in as root user?

- name: Check MySQL connection
run: mysql -utest -ptest -h127.0.0.1 --port=33306 test -e "SELECT 1;"
- name: Create 'foo' schema in MySQL
run: mysql -uroot -ptest -h127.0.0.1 --port=33306 test -e "CREATE DATABASE foo; GRANT ALL ON foo.* to 'test'@'localhost';"
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think -uroot -ptest will work because near the top of this file there's a declaration MYSQL_ROOT_PASSWORD: test.

@curranosaurus
Copy link
Author

mysql -uroot -ptest -h127.0.0.1 --port=33306 test -e "CREATE DATABASE foo; GRANT ALL ON foo.* to 'test'@'localhost';"
ERROR 1410 (42000) at line 1: You are not allowed to create a user with GRANT

Maybe the host isn't called localhost?

@parsonsmatt
Copy link
Collaborator

This may be relevant: https://stackoverflow.com/questions/50177216/how-to-grant-all-privileges-to-root-user-in-mysql-8-0

@curranosaurus
Copy link
Author

My current theory is that the user test@<?> exists because of the MYSQL_USER: test declaration on line 32. But that instead of <?> being localhost, it should be 127.0.0.1, matching the way that we log in to the db using the mysql command.

Nothing -> rs
(Just r) -> (unFieldNameDB $ cName c, r) : rs
vals = [ PersistText $ pack $ MySQL.connectDatabase connectInfo
vals = [ PersistText $ fromMaybe (pack $ MySQL.connectDatabase connectInfo) $ fmap unSchemaNameDB $ getEntitySchema def
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
vals = [ PersistText $ fromMaybe (pack $ MySQL.connectDatabase connectInfo) $ fmap unSchemaNameDB $ getEntitySchema def
vals = [ PersistText $ maybe (pack $ MySQL.connectDatabase connectInfo) unSchemaNameDB $ getEntitySchema def

]
let vars =
[ PersistText $ pack $ MySQL.connectDatabase connectInfo
[ PersistText $ fromMaybe (pack $ MySQL.connectDatabase connectInfo) $ fmap unSchemaNameDB tschema
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same maybe thing

[] ->
Nothing
[[PersistText tab, PersistText ref, PersistInt64 pos, PersistText onDel, PersistText onUpd]] ->
[] -> Nothing
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would arguably be easier to review if it weren't so inlined. A comment at the top or breaking it out into a named function would make it much clearer what the intended semantics are

then Just $
let colSchema =
case schema of
PersistNull -> Nothing
Copy link

@rjmholt rjmholt Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible comment: We didn't get a schema name at all, so there's no schema

Comment on lines +818 to +821
PersistText schemaName -> do
guard $ not (T.null schemaName)
guard $ schemaName /= (pack $ MySQL.connectDatabase connectInfo)
pure $ SchemaNameDB schemaName
Copy link

@rjmholt rjmholt Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible comment: We got a schema name. If it's empty or the default, we return nothing, otherwise we return the name.

Comment on lines +685 to +689
, -- It's not possible for SQLite to maintain foreign key
-- constraints across databases (which Persistent calls "schemas").
-- In fact, it's a syntax error to use a schema qualifier in this
-- part of the SQL expression. So we omit the schema here, and throw
-- an error if this is a cross-schema reference.
Copy link

@rjmholt rjmholt Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this bit feels a bit weird to me. Does sqlite actually support what we've referred to elsewhere as "schemas" (i.e. namespaces within the same database)? (Looks like no?) Or are we grasping at some other sqlite functionality that we imagine could be made analogous to schemas? What happens if sqlite adds support for true schemas later?

Should sqlite maybe instead fall into the same category as MongoDB, where the actual backend just doesn't support them?

parseEntityFields fieldLines

schemaName =
fmap SchemaNameDB $ listToMaybe $ mapMaybe (T.stripPrefix "schema=") entAttribs
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on what packages are available, Data.List.Extra.firstJust does this nicely

EntityNameDB $ psToDBName ps refTableName
, foreignRefSchemaDBName =
Nothing
-- ^ This will be determined in the TH phase ('fixForeignRefSchemaDBName').
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though this is written as a haddock comment, because it's in actual implementation code, people will have to read the internal module to see it.

If this is information about the lifecycle contract of the values on the record type, it might be worth hoisting this to something like the type definition so it's apparent in docs

ConstraintNameDB $ Data.Monoid.mconcat [table, "_", column, "_fkey"]

resolveTableName :: [EntityDef] -> EntityNameHS -> EntityNameDB
resolveTableName :: [EntityDef] -> EntityNameHS -> (EntityNameDB, Maybe SchemaNameDB)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think a record would be nicer

data ColumnReference = ColumnReference
{ crTableName :: !EntityNameDB
-- ^ The table name that the
-- ^ The foreign table's name.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not necessarily foreign right? You can have a self-referential FK. I guess it's not totally clear by what's meant by foreign in this context

@arrowd arrowd mentioned this pull request May 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants