I have a brief question regarding the dimensions in the PyGEM outputs. The output files define three dimensions: glac, time, and year. As I understand it, glac corresponds to the glacier index and should uniquely identify each glacier, right? In that case, the maximum value of this field should equal the total number of glaciers in each region. In my regional simulations for Scandinavia, this does not match. I found in the code where it is defined as […]. In the reference RGI table, the column 'name' does not have unique names for each glacier, so there are many blank cells. Am I missing something here, or did I misunderstand the intention of defining a glacier index dimension? I would really appreciate your clarification on this.
Hi @yelizy, There may be some confusion depending on which output files you are looking at. Are they the output from a single glacier (produced by running run_simulation.py and named with the RGIId in the filename), or the compiled results for your entire region (produced by running postproc_compile_simulations.py and named with the region in the filename)? I ask because the compiled regional output should not have a […]. The single-glacier output for a given run will just be the results for one glacier. When you compile all glaciers using the postprocessing script, you should have N […]. Here's an example of an output regional run for Iceland showing that all glaciers are contained within the file:
Hi @btobers, Thanks so much for your prompt reply. You are right, I didn't specify which outputs I am looking at. I was referring to the individual glacier outputs. There, the dimension glac is not uniquely defined for each glacier. The other dimensions (time and year) seem correctly defined. Here are a few examples:
In the postprocessing script, the number of glaciers is carefully checked before creating batches, which is great. But in that case:
It seems to me that when the np.array is defined based on glacier names in the RGI table (in my case between 0, ..., 213), it does not match the total number of glaciers (3417). Somehow the array seems to conform to the glacier length when producing outputs.
Hi @yelizy, Great questions. The short answer is that this index is not used after storing the simulations, and we should probably modify this structure. Each individual simulation output will only have a single […]. A bit more detail: the reason the […]. What becomes the 'name' key in our resulting series as we loop through each glacier is the index in […]. If you run an entire region, the […]. Again, in summary, the […].
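To illustrate the point about the 'name' key, here is a minimal pandas sketch. It is not PyGEM code: the table contents, the column names `RGIId` and `Area`, and the variable names are hypothetical stand-ins. It only shows the general pandas behaviour that a row selected from a DataFrame is a Series whose `.name` attribute is the row's index label, independent of any column called "Name".

```python
import pandas as pd

# Hypothetical stand-in for a glacier table with no "Name" column.
# The IDs and areas below are made up for illustration only.
rgi_table = pd.DataFrame(
    {
        "RGIId": ["RGI60-08.00001", "RGI60-08.00002", "RGI60-08.00003"],
        "Area": [0.5, 1.2, 3.4],
    }
)

row_names = []
for glac in range(len(rgi_table)):
    # Selecting one row returns a Series; pandas sets its .name
    # to the row's index label (here the default 0, 1, 2, ...).
    glacier_row = rgi_table.loc[glac, :]
    row_names.append(glacier_row.name)

print(row_names)
```

If the table's index is reset or restarted for a subset of glaciers, the `.name` values restart too, so they should not be read as region-wide unique identifiers. The RGIId is the safer key for that.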
Thanks so much for your detailed explanation, @btobers. It was important to clarify that the "Name" column was already dropped from the original RGI data, and the […]. Anyway, if these values are not used in other parts of the workflow, I believe it should be fine to keep it the way it is.



That mismatch may be due to multiprocessing. Did you run your simulations across multiple cores? If so, the index may only go up to roughly the total number of glaciers//Njobs. Did all your 3416 simulations get exported?
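As a rough sketch of why a per-job index would top out near glaciers//Njobs, the snippet below splits a list of glacier positions into near-equal batches and looks at the largest index inside any one batch. It is not PyGEM's batching code: `split_into_batches` is a hypothetical helper, and the job count of 16 is purely an assumption for illustration, not a value reported in this thread. Only the glacier count (3417) comes from the discussion above.

```python
def split_into_batches(n_glaciers, n_jobs):
    """Split positions 0..n_glaciers-1 into n_jobs contiguous, near-equal batches."""
    base, extra = divmod(n_glaciers, n_jobs)
    batches = []
    start = 0
    for job in range(n_jobs):
        # The first `extra` batches take one additional glacier each.
        size = base + (1 if job < extra else 0)
        batches.append(list(range(start, start + size)))
        start += size
    return batches


n_glaciers = 3417  # glacier count mentioned in the thread
n_jobs = 16        # ASSUMPTION: hypothetical number of parallel jobs

batches = split_into_batches(n_glaciers, n_jobs)

# If each job numbers its own glaciers from 0, the largest index
# it can produce is (batch length - 1).
max_local_index = max(len(batch) - 1 for batch in batches)
total_glaciers = sum(len(batch) for batch in batches)

print(max_local_index, n_glaciers // n_jobs, total_glaciers)
```

Under these assumptions every glacier is still processed exactly once; only the per-job numbering restarts, which is why the largest index can be far smaller than the regional total.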