-
Notifications
You must be signed in to change notification settings - Fork 41
Create a new setup page #6671
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Create a new setup page #6671
Conversation
|
Any INSERT INTO spuserpolicy (id, resource, action, collection_id, specifyuser_id) VALUES (1, '%', '%', null, 1); |
|
NOTES:
|
63728c3 to
31211cc
Compare
Triggered by 31211cc on branch refs/heads/issue-2931-1
grantfitzsimmons
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testing instructions
- Use a blank DB to test this PR.
New DB was created!
- Fill out the setup forms
- Make sure the forms look good in light mode and dark mode (It uses your system settings.)
- You can only progress to the next form if you filled out all required fields.
- Make sure your choices are shown in the Overview sidebar.
- Make sure you can submit at the end.
Some database restrictions (e.g. max field length) are not accounted for, say for instance when you enter more text than is able to fit in a field (e.g. institution.Name)

- Make sure you can log into the database.
There is still a reported schema mismatch (it expects 6.8.03, returning 7)

- Make sure the schema config defaults were applied correctly.
- Make sure default picklists were created correctly.
- Make sure prep types were created correctly.
- Make sure all tree viewer pages load.
When creating a 'Herbarium' collection, my Taxon tree looks like this:
See Geography as well:
Same story for Storage...
- Local testing:
- Use an empty DB again (restore it or use a different one).
- Stop the specify worker and try to start the setup. You should get a clear error on the frontend.
It just pauses on loading Institution?
- [X] Make sure API endpoints work. Look at the stetup_tool section in http://localhost/documentation/api/operations/all/ for more information. You should be able to manually set up a blank database by creating these resources in order. - [X] `setup_tool/institution/create` - [X] `setup_tool/division/create` - [X] `setup_tool/discipline/create` - [X] `setup_tool/collection/create` - [X] `setup_tool/specifyuser/create`
|
@grantfitzsimmons thanks for the review!
Fixed for institution name and password 👍
Once the tree PR is merged I can create a PR to properly create the trees. EDIT: #7593 <- to be addressed once the tree creation PR is merged
Hm I'm getting an error (it's not descriptive, though). I might need to look at your set up to see if the check isn't working or if the error isn't being reported.
EDIT: Hopefully addressed by my new changes |
| 'insert into specifyuser_spprincipal(specifyuserid, spprincipalid) values (%s, %s)', | ||
| [user.id, gp.id] | ||
| ) | ||
| # TODO: UNCOMMENT THIS. Commented specifically for testing PR https://github.com/specify/specify7/pull/6671 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note
| WHERE Name = 'Administrator' | ||
| ) | ||
| """, [self.id]) | ||
| # TODO: UNCOMMENT THIS. Commented specifically for testing PR https://github.com/specify/specify7/pull/6671 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note
melton-jason
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work so far! 🚀
I've only reviewed the functionality and backend code, I've yet to finish a review on the frontend.
A lot of the "required" changes I've noticed should be fairly easy to undertake and were relatively small, or related to database structure from #6389.
My larger comments were mainly related to code maintainability, compatibility, and security.
Let me know if you have any questions with these comments!
specifyweb/specify/views.py
Outdated
| if not spmodels.Institution.objects.exists(): | ||
| return view(request, *args, **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, as long as there is not an Institution, all requests are unauthenticated and allowed.
Practically, this would mean an attacker could trivially do something like:
- Create their own Institution, Division, Discipline, Collection, and SpecifyUser
And theoretically:
- Create any other record in the database without going through the setup tool
- Including their own
SpecifyUserwith Institution Admin permissions
For the current setup process to be secure with no other changes, an administrator would need to perform the setup tool locally and not expose the server to the internet until the setup process is complete.
This may be infeasible for some setups where the Specify instance is only accessible remotely; configuring networking to allow a specific machine to connect (and removing that rule after setup is complete) might be unreasonably complex and place too much of the burden on the user to setup Specify.
Here are two more secure approaches which might be sufficient:
- Have a password or similar setting that can be set before the container is started that an administrator can use to "login" to the Setup page
- Similar to feat: add
ALLOW_SUPPORT_LOGINto Dockerfile #7399, generate a token at container startup which gets emitted (in the logs?) that the user can use to create an authenticated session
It would be very tricky to have authentication entirely handled by the browser. Though a user would only need to authenticate once for a secure session (unless the session expires, e.g., via closing the browser).
With Django's authentication systems, we can authenticate a user without having a corresponding SpecifyUser.
- https://docs.djangoproject.com/en/6.0/topics/auth/default/#how-to-log-a-user-in
- https://docs.djangoproject.com/en/6.0/topics/auth/customizing/
This would involve setting up a SetupTool Authentication Backend (and maybe a middleware, that can be disabled once the setup tool process is complete) that would handle whichever method we choose to verify the identity of the admin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted the changes to login_maybe_required 👍
cf48ece
This leaves only the new setup endpoints exposed, still not the best. Leaving this open for discussion to see what everyone thinks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(optional)
The complexity of update_table_schema_config_with_defaults and update_table_field_schema_config_with_defaults have increased quite a bit in this branch.
Especially with calls to update_table_schema_config_with_defaults, this added complexity can make it somewhat difficult to parse and understand what could be happening, so I think it might be time to reconsider refactoring these helper functions.
Below is something like a Control Flow Graph (CFG) depicting the work done compared to main (left) and this branch (right) for creating the description of a table when calling update_table_schema_config_with_defaults.
flowchart TD
A("description: str | None")
A -- == None --> B(Default Table Name)
A -- != None --> C
B --> C(Container Desc)
D("defaults: dict | None")
D -- == None --> E(Empty Dict)
D -- != None --> G("defaults.get(tablename)")
G --> F
E --> F(Table Defaults)
F --has desc--> H("default.get(desc)")
F -- missing desc --> I(Default Table Name)
H --> J(Container Desc)
I --> J
Similarly, below is a CFG for the process of creating the field labels when calling update_table_schema_config_with_defaults in main (left) and this branch (right).
Nodes in green depict the work performed in update_table_schema_config_with_defaults, and nodes in orange depict the work performed in update_table_field_schema_config_with_defaults.
flowchart TD
classDef fieldLvl stroke:#ffa500
classDef tblLvl stroke:#0f0
A(Get table from datamodel)
A --> B(Collect fields from table)
B -- for each field --> C(Convert name to Default Field Label)
C --> CC(Create field label)
class A,B tblLvl;
class C,CC fieldLvl;
D("default:s dict | None")
E("overrides: dict | None")
D --> AA(Get table from datamodel)
E --> AA
AA -- table in defaults --> G(field_overrides += defaults)
G -- table in overrides --> H(Get fields from overrides)
H -- for field in overrides --> I(field_overrides += field)
I --> BB(Collect fields from table)
AA -- table not in defaults --> BB
G -- table not in overrides --> BB
BB -- for each field --> J(Label = Convert name to Default Field Label)
J -- field in field_overrides --> K("Label = field_override")
K --> L(Create field label)
J -- field not in field_overrides --> L
class D,E,AA,G,H,I,BB tblLvl;
class J,K,L fieldLvl;
It looks to me like our current functions are solely responsible for a lot of work.
I think adopting the Single responsibility principle with the Chain of responsibility pattern might be beneficial here.
Dividing the required functionalities into the smallest unit of work, I think there are four main responsibilities with our current functions:
- Creating the SpLocaleContainer (table in schemaconfig)
- Creating the table label and description (the SpLocaleItemStr for each)
- Creating the SpLocaleContainerItem (field in schemaconfig)
- Creating the field label and description (the SpLocaleItemStr for each)
Each of these can be converted to a function as the "basic building blocks" and used as needed: either in helper functions for common use cases or directly by the caller if more fine-grained behavior is desired.
For example, the following might be how I would refactor our update_table_schema_config_with_defaults
Suggested Implementation in this dropdown
The overall design/CFG:
flowchart TD
classDef main stroke:#FF00FF
classDef table stroke:#f00
classDef tableStr stroke:#FFA500
classDef field stroke:#228B22
classDef fieldStr stroke:#4169E1
A("Get table from datamodel") --> B("Create SpLocaleContainer")
B --> C("Create table label and descriptions")
C --> D("Gather fields from table")
D -- for each field --> E("Create SpLocaleContainerItem")
E --> F("Create field label and description")
class A,D main;
class B table
class C tableStr;
class E field;
class F fieldStr;
A basic implementation might look like the following:
def create_default_table_schema_config(
table_name: str,
discipline_id: int,
apps
):
table = datamodel.get_table(table_name)
if table is None:
logger.warning(
f"Table does not exist in latest state of the datamodel, skipping Schema Config entry for: {table_name}"
)
return
# If we need a more complicated datatype- like TableSchemaConfig- we can define/create it here
# and pass it to the basic functions
# creates and returns the SplocaleContainer
container = create_table_config(apps, table, discipline_id)
create_table_strings(apps, container, table)
# It'd be very easy to define another function that's just a wrapper for creating the Container or Item
# and corresponding strings
# e.g., container = create_table_locale(apps, table, discipline_id)
# And for an Item and it's strings:
# item = create_field_locale(apps, field, container)
for field in table.all_fields:
# creates and returns the SpLocaleContainerItem
item = create_field_config(apps, field, container)
create_field_strings(apps, item, field)The above implementation would be only for creating with the "defaults" .
If a caller has a non-default table or field description, we can easily define another helper which accepts a TableSchemaConfig object or similar and handles creating the object(s) (we could then even simplify the above implementation even further by using said helper).
Here's how a more general implementation mirroring the current implementation might look like:
def create_table_schema_config(
config: TableSchemaConfig,
discipline_id: int,
apps
):
table = datamodel.get_table(config.table_name)
if table is None:
logger.warning(
f"Table does not exist in latest state of the datamodel, skipping Schema Config entry for: {config.table_name}"
)
return
container = create_table_config(apps, config, discipline_id)
create_table_strings(apps, container, config)
for field in table.all_fields:
field_config = config.fields.get(field.name, None)
item = create_field_config(apps, field, container, field_config)
create_field_strings(apps, item, field, field_config)
# This can be used like the following:
# Everything not defined here is assumed to be the "default" value
my_config = TableSchemaConfig(
table_name="SomeTable",
label="Custom Name",
fields={
"text1": FieldSchemaConfig(
description="Use this as needed!"
)
}
)
create_table_schema_config(my_config, discipline_id, apps)Essentially, the goal is to keep implementations simple with a single responsibility. With the basic building blocks and helpers for common use cases, the caller has a clean interface regardless of their required complexity.
I could also absolutely see this taken with an OOP approach (blame it on my Java background 😅).
The following is a conceptual example of an OOP implementation using something similar to the Builder pattern:
my_schema = TableSchemaConfig(label="My table", description="Hello")
# nearly all methods of TableSchemaConfig are focused on manipulating the
# underlying data structure, and explicitly doesn't touch the database
my_schema.add_field("text1", FieldSchemaConfig(label="Method"))
# this call is what would actually create the container, the items, and the strings
my_schema.create()Instead of single responsibility and building up from the smallest building blocks, the intuition and focus behind this approach would be separation of data and concrete instantiation. We have a flexible, complicated data structure (the schema config tables) and a clear process/methodology of creating it. Essentially, we let the caller build out how it wants the schema to look like before committing to its creation.
This is all just how I would refactor these functions to make them more usable and maintainable.
There's no "right" way of doing this, so if you have other ideas about how to simplify or refactor these functions, I'd love to hear them and absolutely go for it 🚀
| Apply schema config localization defaults for this discipline. | ||
| """ | ||
| # Get default schema localization | ||
| defaults = load_json_from_file(Path(__file__).parent.parent.parent.parent / 'config' / 'common' / 'schema_localization_en.json') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(optional)
Could we simplify these defaults? I imagine the vast majority of these items already fall within the heuristic used by update_table_schema_config_with_defaults to determine the defaults.
It might be easier from a maintainability and performance standpoint if the defaults from the file contained only the elements which should explicitly not follow these defaults.
Provided is a link to the original PR for convenience: #7444
You can use the following query to get the schema config information for all fields in a Specify database:
SELECT its.Language,
spc.name AS 'Table Name',
itm.name AS 'Field Name',
its.Text AS 'Table Label',
itd.Text AS 'Table Desc',
itms.Text AS 'Field Label',
itmd.Text AS 'Field Desc',
itm.type AS 'Field Type',
CASE
WHEN itm.IsRequired = 1 THEN 'True'
WHEN itm.IsRequired = 0 THEN 'False'
ELSE itm.IsRequired
END AS 'Field Is Required',
CASE
WHEN itm.IsHidden = 1 THEN 'True'
WHEN itm.IsHidden = 0 THEN 'False'
ELSE itm.IsHidden
END AS 'Field Is Hidden',
itm.Format AS 'Field Format',
itm.PickListName AS 'Field PickList',
itm.WebLinkName AS 'Field WebLink'
FROM splocalecontainer AS spc
LEFT OUTER JOIN splocaleitemstr AS its ON its.SpLocaleContainerNameID = spc.splocalecontainerid
LEFT OUTER JOIN splocaleitemstr itd ON itd.SpLocaleContainerDescID = spc.splocalecontainerid
LEFT OUTER JOIN splocalecontaineritem AS itm ON itm.SpLocaleContainerID = spc.splocalecontainerid
LEFT OUTER JOIN splocaleitemstr itms ON itms.SpLocaleContainerItemNameID = itm.SpLocaleContainerItemID
LEFT OUTER JOIN splocaleitemstr itmd ON itmd.SpLocaleContainerItemDescID = itm.SpLocaleContainerItemID
WHERE spc.SchemaType = 0
ORDER BY spc.DisciplineID,
its.Language,
spc.name,
itm.name;(Or alternatively use the /context/schema_localization.json endpoint).
And if you want to export the Schema from SQL as a CSV (this will output as schema_output.csv, but you can change the file path after the OUTFILE keyword):
(Note that NULL values are interpreted as the literal \N).
SELECT its.Language,
spc.name AS 'Table Name',
itm.name AS 'Field Name',
its.Text AS 'Table Label',
itd.Text AS 'Table Desc',
itms.Text AS 'Field Label',
itmd.Text AS 'Field Desc',
itm.type AS 'Field Type',
CASE
WHEN itm.IsRequired = 1 THEN 'True'
WHEN itm.IsRequired = 0 THEN 'False'
ELSE itm.IsRequired
END AS 'Field Is Required',
CASE
WHEN itm.IsHidden = 1 THEN 'True'
WHEN itm.IsHidden = 0 THEN 'False'
ELSE itm.IsHidden
END AS 'Field Is Hidden',
itm.Format AS 'Field Format',
itm.PickListName AS 'Field PickList',
itm.WebLinkName AS 'Field WebLink'
FROM splocalecontainer AS spc
LEFT OUTER JOIN splocaleitemstr AS its ON its.SpLocaleContainerNameID = spc.splocalecontainerid
LEFT OUTER JOIN splocaleitemstr itd ON itd.SpLocaleContainerDescID = spc.splocalecontainerid
LEFT OUTER JOIN splocalecontaineritem AS itm ON itm.SpLocaleContainerID = spc.splocalecontainerid
LEFT OUTER JOIN splocaleitemstr itms ON itms.SpLocaleContainerItemNameID = itm.SpLocaleContainerItemID
LEFT OUTER JOIN splocaleitemstr itmd ON itmd.SpLocaleContainerItemDescID = itm.SpLocaleContainerItemID
WHERE spc.SchemaType = 0
ORDER BY spc.DisciplineID,
its.Language,
spc.name,
itm.name INTO OUTFILE 'schema_output.csv' FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n';There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When this is ready to be merged we'll have to do some setup in Weblate to make these strings accessible to translators. You largely only have to do this when creating a new localization directory - which corresponds to a new Component on Weblate.
There's no instructions for adding a new component for in our https://github.com/specify/specify7/blob/main/specifyweb/frontend/js_src/lib/localization/README.md currently, so I'll modify the file to include that documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're not creating any default App Resources in the database at the moment.
This is probably fine for stuff like View Sets, TypeSearches, UIFormatters, etc., which exist as defaults on disk, but there is no default Remote or Global Preferences.
Do you know if these are these going to be addressed in something like #7501, or is generally expected that the preference files exist on data creation?
If these changes are going go to be included in a release before being addressed, I think we at least need to create the Remote Preferences so Specify has access to settings like the Date Format (ui.formatting.scrdateformat), Tree Count thresholds (TreeEditor.Rank.Threshold.<TREE>, related #6844), and allow users to configure things like automatically creating resources and other Remote Preferences
Remove unused urls
Refactor worker status code
Triggered by 64879ec on branch refs/heads/issue-2931-1
Remove print statement





Fixes #2931
Fixes #4832
Fixes #6210
Adds an initial set up page for configuring your database much like the Spwizard for Specify 6.
TODO:
specifyuser_spprincipaltable needs to exist before the setup process works on a new sp7 database.It also looks like the admin user isn't actually a proper admin.(fixed)Future TODO:
Checklist
self-explanatory (or properly documented)
Testing instructions