-
Notifications
You must be signed in to change notification settings - Fork 0
WIP: Prototype of minimal codebase #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
CentofantiEze
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Emma and I gave a good read to all the modules. The implementation is well structured and seems functional. We left a few comments and questions, specially in the parts that might need further refinement. We haven't tried to run the code yet.
Main comments
- MAP: the main concern is the way the MAP is handled. In this code it seems that the MAP is a standalone inference option (we can do MCMC or MAP). However, we want the map to be an (optional) extra step to be performed prior to running the chains, to tune the initial starting points.
- Ellipticities: the intrinsic ellipticity variables are missing.
- Modularity. PSF modelling and galaxy morphology should be encapsulated in separate submodules/subpackages.
- All the galaxies are being generated at the centre of the postage stamp (overlapped).
- It seems that the synthetic observations are individual postage stamps but the scene modelling creates a single field.
| type: Gaussian | ||
| sigma: 0.1 | ||
|
|
||
| gal: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This parameters are for the data generation or the generative forward modelling?
Missing ellipticity prior information.
| type: str = "Exponential" # Changed default from Sersic to Exponential | ||
| n: Optional[Union[float, DistributionConfig]] = None # Make optional for Exponential | ||
| flux: Union[float, DistributionConfig] | ||
| half_light_radius: Union[float, DistributionConfig] = Field(..., alias="half_light_radius") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does Field do? If it is for adding an alias to half_light_radius shouldn't the alias be hlr? And why is Field only used here?
| def generate_synthetic(config: ShineConfig) -> Observation: | ||
| import galsim | ||
|
|
||
| # 1. Define PSF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eventually this will be a single line:
psf = psf_utils.get_psf(config.psf)
and all the code should be written in the psf_utils sub package/module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same applies to the following lines.
| g1 = get_mean(config.gal.shear.g1) | ||
| g2 = get_mean(config.gal.shear.g2) | ||
| shear = galsim.Shear(g1=g1, g2=g2) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing intrinsic ellipticity.
| g2 = get_mean(config.gal.shear.g2) | ||
| shear = galsim.Shear(g1=g1, g2=g2) | ||
|
|
||
| # Create Galaxy Object - Use Exponential (Sersic n=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In principle if we follow the previous logic it should check the galaxy type:
if config.gal.type == "exponential":
...
| sigma = self.config.image.noise.sigma | ||
| numpyro.sample("obs", dist.Normal(model_image, sigma), obs=observed_data) | ||
|
|
||
| return model |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't return a log likelihood function but the forward model function.
| if extra_args is None: | ||
| extra_args = {} | ||
|
|
||
| kernel = NUTS(self.model, dense_mass=self.dense_mass) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do NUTS and MCMC deal with initial samples?
| from typing import Dict, Any | ||
| import arviz as az | ||
|
|
||
| class HMCInference: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be named simply Inference because it could implement various inference methods (HMC, MCMC, etc).
| # Print summary | ||
| print(results.posterior) | ||
|
|
||
| elif args.mode == "map": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are using map estimation for getting the initial samples closer to the solution, thus it should be run before the inference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MAP shouldn't be an inference mode option. It should be an optional step to run prior to inference (mcmc) to improve the starting point of the chains.
| # We want 'model_image'. | ||
| # We didn't expose 'model_image' as a deterministic site in the builder. | ||
| # Let's modify the builder to expose it, OR we can just inspect the 'obs' distribution mean in the trace. | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The lines below might not be useful given the comments above (MAP shouldn't be an independent inference method).
This pull request introduces the initial implementation of the SHINE pipeline, providing a complete configuration-driven workflow for Bayesian shear inference using JAX, NumPyro, and GalSim. The changes include new configuration models, data loading and synthetic generation, scene/model construction, inference engines, and a main entrypoint script. These updates establish a modular, extensible foundation for both synthetic and (future) real data analysis.
Configuration and Workflow Foundation:
shine/config.py, supporting flexible priors, image, PSF, galaxy, and inference options, with a YAML-based config handler.configs/test_run.yaml) specifying all parameters for a test run, including galaxy, PSF, image, and inference settings.Data Handling and Synthetic Generation:
shine/data.pywith aDataLoaderclass that generates synthetic galaxy image data using GalSim when no real data is provided, including noise and PSF handling.Model and Inference Engines:
shine/scene.pyfeaturing aSceneBuilderclass that constructs a NumPyro model for the scene, supporting flexible priors and differentiable rendering with JAX-GalSim.shine/inference.pywith HMC (NUTS) and MAP inference engines, supporting ArviZ output for posterior analysis and MAP parameter extraction.Entrypoint and Workflow Integration:
shine/main.py, a CLI script to run the full pipeline: configuration loading, data generation, model building, inference execution (HMC or MAP), and result saving (including residuals and reduced chi-squared for MAP).DESIGN.mdto reflect the new modular workflow and clarify component relationships.