Pseudonymisation and FTPS #25
…end up in the docker image
evaluated together.
really under the right dir
files at the end of each (eg) day
meantime. Couple of other fixes.
which was triggering a mypy warning
thompson318
left a comment
I've had a go at running this. I like the structure with the controller, hasher, and exporter in separate containers. I'll keep playing around with it.
It needs some instructions on installing PIXL (or maybe add the PIXL installation to docker compose?).
Also, the config files had to be in the parent directory (../config, not ./config). I also couldn't get the exporter crontab working.
src/pseudon/pseudon.py
Outdated
("location", pa.string()),
# decimal32 can have a maximum of 9 significant digits.
# We can go to 64 if needed but let's try and keep it compact.
# But they're not exposed?? Use 128 instead.
I don't understand this comment.
Thanks. Clarified
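For anyone else puzzled by the decimal comment in this thread, here is a rough sketch of the precision trade-off it describes. It uses only the standard library `decimal` module rather than pyarrow, and the helper names (`required_precision`, `smallest_decimal_width`) and sample values are hypothetical, not taken from `pseudon.py`. The width limits assumed are 9 digits for decimal32 (as the comment says), 18 for decimal64 and 38 for decimal128.

```python
from decimal import Decimal

# Assumed maximum precision (total significant digits) per decimal bit width.
MAX_PRECISION = {32: 9, 64: 18, 128: 38}


def required_precision(value: Decimal, scale: int) -> int:
    """Digits needed to store `value` as an unscaled integer at a fixed scale."""
    unscaled = int(abs(value).scaleb(scale).to_integral_value())
    return len(str(unscaled))


def smallest_decimal_width(precision: int) -> int:
    """Return the narrowest bit width whose precision limit fits `precision`."""
    for width in sorted(MAX_PRECISION):
        if precision <= MAX_PRECISION[width]:
            return width
    raise ValueError(f"precision {precision} exceeds decimal128")


# Hypothetical values: a small reading and a timestamp-like number.
small = required_precision(Decimal("123.456"), scale=3)
large = required_precision(Decimal("1700000000.123456"), scale=6)
print(small, smallest_decimal_width(small))
print(large, smallest_decimal_width(large))
```

If the narrower types are not available in the library in use, as the comment suggests, falling back to the 128-bit type costs space but never loses digits.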
Co-authored-by: Stephen Thompson <s.thompson@ucl.ac.uk>
thompson318
left a comment
Looks good, thank you.
Split into three containers, each with their own config files:
- waveform-controller: the Emap -> CSV bit of the pipeline
- waveform-exporter: converts to parquet (original and pseudon), and uploads via FTPS.
- waveform-hasher: from PIXL. Currently not used.

There is a toy hasher built into our code just to get the full pipeline going. This will be replaced with waveform-hasher when we get our credentials for the keyvault.

The exporter runs cron, which doesn't currently run anything, but the individual functions are runnable as individual commands.
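To illustrate what a stand-in hasher of this kind might look like, here is a minimal sketch. It is not the code in this PR: the function name `toy_hash`, the placeholder key and the choice of HMAC-SHA256 are all assumptions made for the example, and a hard-coded key like this would not be suitable outside of testing.

```python
import hashlib
import hmac

# Placeholder key for illustration only; a real deployment would fetch
# its secret from a key vault rather than hard-coding it.
TOY_KEY = b"not-a-real-secret"


def toy_hash(identifier: str, key: bytes = TOY_KEY) -> str:
    """Return a deterministic keyed hash of an identifier (hypothetical helper)."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# The same input always maps to the same pseudonym, so records can still
# be linked after pseudonymisation.
print(toy_hash("12345") == toy_hash("12345"))
```

Keeping the hasher behind a single function like this should make it straightforward to swap in a call to the real service later.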
All data files are accessed through the mounted directory, with careful separation between pseudon and originals.
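One way to enforce that kind of separation is to route every output path through a single helper. The sketch below is an assumption about how this could be done, not a description of the code in this PR: the subdirectory names `original` and `pseudon`, the function `output_path` and the `/data` mount point are hypothetical. It needs Python 3.9 or later for `Path.is_relative_to`.

```python
from pathlib import Path

# Hypothetical subdirectory names under the mounted data directory.
ORIGINAL_SUBDIR = "original"
PSEUDON_SUBDIR = "pseudon"


def output_path(mount_root: Path, filename: str, pseudonymised: bool) -> Path:
    """Build an output path, refusing any name that escapes its subdirectory."""
    subdir = PSEUDON_SUBDIR if pseudonymised else ORIGINAL_SUBDIR
    base = (mount_root / subdir).resolve()
    candidate = (base / filename).resolve()
    # Reject names such as "../original/x.parquet" that would cross over
    # from one tree into the other.
    if not candidate.is_relative_to(base):
        raise ValueError(f"{filename!r} escapes {base}")
    return candidate


print(output_path(Path("/data"), "2024-01-01.parquet", pseudonymised=True))
```

With this shape, the upload step only ever needs to be pointed at the pseudonymised tree.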