Skip to content

Conversation

@tprocter46
Copy link
Collaborator

This PR updates the version of Rivet from 3.1.5 to 3.1.8; and Contur from 2.1.1 to 2.4.1.
The main improvement this brings us is a lot more rivet analyses (and hence measurements for contur to examine), as well as some updates to how contur runs.

There are a few other changes that needed to be made as a result of this:

  • fastjet updated from 3.3.2 to 3.4.0 (Rivet 3.1.8 requires 3.4.0)
  • Contur_output class has been redesigned to handle the new way contur returns results.
  • Contur_get_analyses_from_beam convenience function rewritten completely due to redesign of the contur database.

Some remaining questions:

  • I've left in place the patch that comments out the contur progress bar: but contur now explicitly imports the multiprocessing library elsewhere. Do I recall correctly that this causes a problem on some OS? Should I try and remove this also?
  • Do we still need the Multi_contur... functions? 2.4.1 by default already calculates both theory and SMBG likelihoods, which was our original motivation for creating these. I guess there are still situations where the multi versions do things we couldn't do otherwise, but I struggle to see where this'd be useful.

Everything seems to run fine on my system, but given recent experience, a cross-check on a mac probably wouldn't hurt.

Also, just to note, there's a slightly hacky thing done to the python path for rivet in backends.cmake because 3.1.8 behaves slightly unpredictable in where it installs the python libraries. Hopefully it's fixed in 3.1.9/3.2.0.

Still needs testing and possibly ironing out rivet config issues
@agbuckley
Copy link
Collaborator

cp: /Users/gambitremotecontrol/GAMBIT/Runners/Organisation_level/actions-runner/_work/gambit/gambit/contrib/YODA-1.9.9/pyext/yoda1: unable to copy extended attributes to /usr/lib/python3.10/site-packages: No such file or directory
cp: /usr/lib/python3.10/site-packages/__init__.py: No such file or directory

This seems to be the same basic issue that is affecting PR #462 too -- see comment here: #462 (comment) For some reason, on the Mac x64 CI job, the YODA install step doesn't seem to respect the --prefix option we pass in during the configure step, and then ends up trying to install to some directories it doesn't have permission for. Have you encountered this before, @tprocter46?

Not something I've ever seen with YODA externally: getting the Python dir structure correct (for the particular Python install) within the given prefix has proven challenging to do portably, though I think the most recent releases without distutils do it as well as is possible, but the path prefix should always have been respected. Is this using the most recent YODA 1.9.x?

@tprocter46
Copy link
Collaborator Author

cp: /Users/gambitremotecontrol/GAMBIT/Runners/Organisation_level/actions-runner/_work/gambit/gambit/contrib/YODA-1.9.9/pyext/yoda1: unable to copy extended attributes to /usr/lib/python3.10/site-packages: No such file or directory
cp: /usr/lib/python3.10/site-packages/__init__.py: No such file or directory

This seems to be the same basic issue that is affecting PR #462 too -- see comment here: #462 (comment) For some reason, on the Mac x64 CI job, the YODA install step doesn't seem to respect the --prefix option we pass in during the configure step, and then ends up trying to install to some directories it doesn't have permission for. Have you encountered this before, @tprocter46?

Not something I've ever seen with YODA externally: getting the Python dir structure correct (for the particular Python install) within the given prefix has proven challenging to do portably, though I think the most recent releases without distutils do it as well as is possible, but the path prefix should always have been respected. Is this using the most recent YODA 1.9.x?

It is using 1.9.9

@tprocter46
Copy link
Collaborator Author

It is using 1.9.9

In fact, we upgraded from 1.9.7 to fix a different issue, but this one seems to have been introduced by going from 1.9.7 -> 1.9.9

@ChrisJChang
Copy link
Collaborator

The YODA configure command:

YODA_PYTHONPATH=$PYTHON -c "from __future__ import print_function; import sysconfig; print(sysconfig.get_path('platlib', 'posix_user', vars={'userbase' : '$prefix'}))"

Was changed in 1.9.7 to

YODA_PYTHONPATH=$PYTHON -c "from __future__ import print_function; import sysconfig; print(sysconfig.get_path('platlib', vars={'platbase': 'XXX', 'base': 'XXX'}).replace('/local', '').replace('XXX', '$prefix'))"

in 1.9.9. This looks like this is the reason why it is trying to access the wrong location. If I just open python on the Mac and run these commands (just setting '${prefix}' to 'PREFIX'):

First Case:

PREFIX/lib/python3.10/site-packages

Second Case:

/usr/lib/python3.10/site-packages

So in the updated yoda case, nothing is being replaced with the prefix.

@agbuckley
Copy link
Collaborator

Thanks @ChrisJChang . The extra complexities in that command were to account for YODA doing some silly things with the path-completion reporting, such that we needed to strip out misattributed additional local/ path elements. But I think the real kicker has been the switch from userbase to platbase: that was supplied to us as the "correct" way to do things, and it seemed to be working, including for Mac users in our dev team, but not on the CI, apparently. Are you able to try hacking to also add a userbase entry in that dict, and see if that helps? We'll obviously feed back upstream if we can find a works-for-everyone patch.

@ChrisJChang
Copy link
Collaborator

@agbuckley , a little playing around suggests that its actually the 'posix_user' option that is having the effect here. The 'userbase' option won't appear in the path unless the scheme is supplied (a few work, not only posix_user). And if they are specified, the 'userbase' option isn't necessary.

i.e.

print(sysconfig.get_path('platlib', vars={'platbase': 'XXX', 'base': 'XXX'}))
/usr/local/lib/python3.11/site-packages

but

print(sysconfig.get_path('platlib', 'posix_prefix', vars={'platbase': 'XXX', 'base': 'XXX'}))
XXX/lib/python3.11/site-packages

Looking at the documentation for this (https://docs.python.org/3/library/sysconfig.html), it seems to me like the default scheme should be one of the posix ones. However, for me it is the 'osx_framework_library' one. Perhaps the mac testing others have done had a different default scheme than my machine?

@agbuckley
Copy link
Collaborator

Thanks for the digging, @ChrisJChang : I'm not sure exactly what's going on with the inconsistent Python behaviours re. this on different platforms, but I've added the 'posix_prefix' argument to the sysconfig query on the dev heads of Rivet and YODA so hopefully this will behave out of the box with the next releases.

@ChrisJChang
Copy link
Collaborator

May I ask what the status of this PR is?

@tprocter46
Copy link
Collaborator Author

May I ask what the status of this PR is?

Other than the mac issues you identified above, everything is ready to go.
If I understand correctly we can either just add an extra patch with your fix to make the mac install work properly; or we could do the slightly more rigorous thing and do the full extra upgrade from rivet 3.1.8 to the brand-new-and-shiny rivet 3.1.10 which should have the fix built in, as well as coming with some other useful analyses.

If you give me 1-2 weeks, I can probably do the 3.1.10 upgrade, the extra analyses would be nice. It would be good to get it ready for susyrun2 post-processing, I guess.

@ChrisJChang
Copy link
Collaborator

Other than the mac issues you identified above, everything is ready to go.

Awesome. Based on some discussions in the core group, and that the state of the Mac CIs has been changing a lot (e.g. I need to go and rebuild a bunch of stuff on them), I would suggest not bothering too much to fix it for the Macs right now, and if it is ready to go for Linux, then feel happy to merge if Anders has approved it for merging.

If you give me 1-2 weeks, I can probably do the 3.1.10 upgrade, the extra analyses would be nice. It would be good to get it ready for susyrun2 post-processing, I guess.

This is another option. I would suggest deciding on whether to do this based on the susyrun2 post-processing, rather than Mac builds.

@tprocter46
Copy link
Collaborator Author

Other than the mac issues you identified above, everything is ready to go.

Awesome. Based on some discussions in the core group, and that the state of the Mac CIs has been changing a lot (e.g. I need to go and rebuild a bunch of stuff on them), I would suggest not bothering too much to fix it for the Macs right now, and if it is ready to go for Linux, then feel happy to merge if Anders has approved it for merging.

If you give me 1-2 weeks, I can probably do the 3.1.10 upgrade, the extra analyses would be nice. It would be good to get it ready for susyrun2 post-processing, I guess.

This is another option. I would suggest deciding on whether to do this based on the susyrun2 post-processing, rather than Mac builds.

If the mac's not a blocker, then I would merge this. Then I can make a 3.1.10 PR when/if required (should hopefully be less complicated than this one)

@anderkve
Copy link
Collaborator

Sounds good! I justed merged the latest master into this branch, so I'll merge it once the Ubuntu CI jobs have completed successfully.

@anderkve
Copy link
Collaborator

Sorry, one more thing, @tprocter46: I tried to build this branch now on my Ubuntu machine and I get a some compilation errors -- see below for a list of the first few that show up. (This was starting from a clean state, after make nuke-all and deleting the build dir.)

Does the branch build and run OK for you?

The first few errors seem to just be related to the Contur_output --> Contur_subOutput change, so they are probably trivial to fix. I haven't looked into the rest. Do you have time to take a quick look?

[ 34%] Building CXX object Utils/CMakeFiles/Utils.dir/src/mpiwrapper.cpp.o
/home/anders/physics/GAMBIT/gambit_3/ColliderBit_backup/src/ColliderBit_measurements.cpp: In function ‘void Gambit::ColliderBit::Contur_LHC_measurements_LogLike(double&)’:
/home/anders/physics/GAMBIT/gambit_3/ColliderBit_backup/src/ColliderBit_measurements.cpp:454:45: error: ‘class Gambit::Contur_output’ has no member named ‘LLR’
  454 |           result = contur_likelihood_object.LLR;
      |                                             ^~~
/home/anders/physics/GAMBIT/gambit_3/ColliderBit_backup/src/ColliderBit_measurements.cpp: In function ‘void Gambit::ColliderBit::Multi_Contur_LHC_measurements_LogLike_all(Gambit::map_str_dbl&)’:
/home/anders/physics/GAMBIT/gambit_3/ColliderBit_backup/src/ColliderBit_measurements.cpp:466:69: error: ‘class Gambit::Contur_output’ has no member named ‘LLR’
  466 |             result[Contur_name.first + "_LLR"] = Contur_name.second.LLR;
      |                                                                     ^~~
/home/anders/physics/GAMBIT/gambit_3/ColliderBit_backup/src/ColliderBit_measurements.cpp: In function ‘void Gambit::ColliderBit::Multi_Contur_LHC_measurements_LogLike_single(double&)’:
/home/anders/physics/GAMBIT/gambit_3/ColliderBit_backup/src/ColliderBit_measurements.cpp:476:59: error: ‘std::map<std::__cxx11::basic_string<char>, Gambit::Contur_output>::mapped_type’ {aka ‘class Gambit::Contur_output’} has no member named ‘LLR’
  476 |           result = contur_likelihood_object[which_as_LLR].LLR;
      |                                                           ^~~
/home/anders/physics/GAMBIT/gambit_3/ColliderBit_backup/src/ColliderBit_measurements.cpp: In function ‘void Gambit::ColliderBit::Contur_LHC_measurements_LogLike_perPool(Gambit::map_str_dbl&)’:
/home/anders/physics/GAMBIT/gambit_3/ColliderBit_backup/src/ColliderBit_measurements.cpp:485:45: error: invalid use of non-static member function ‘Gambit::map_str_dbl Gambit::Contur_output::pool_LLR() const’
  485 |           result = (*Dep::LHC_measurements).pool_LLR;
      |                                             ^~~~~~~~
In file included from /home/anders/physics/GAMBIT/gambit_3/Backends/include/gambit/Backends/backend_types_rollcall.hpp:54,
                 from /home/anders/physics/GAMBIT/gambit_3/Elements/include/gambit/Elements/shared_types.hpp:62,
                 from /home/anders/physics/GAMBIT/gambit_3/Elements/include/gambit/Elements/types_rollcall.hpp:95,
                 from /home/anders/physics/GAMBIT/gambit_3/Elements/include/gambit/Elements/printable_types.hpp:22,
                 from /home/anders/physics/GAMBIT/gambit_3/Printers/include/gambit/Printers/baseprinter.hpp:45,
                 from /home/anders/physics/GAMBIT/gambit_3/Elements/include/gambit/Elements/elements_extras.hpp:38,
                 from /home/anders/physics/GAMBIT/gambit_3/Elements/include/gambit/Elements/module_macros_inmodule_defs.hpp:43,
                 from /home/anders/physics/GAMBIT/gambit_3/Elements/include/gambit/Elements/module_macros_inmodule.hpp:40,
                 from /home/anders/physics/GAMBIT/gambit_3/Elements/include/gambit/Elements/gambit_module_headers.hpp:37,
                 from /home/anders/physics/GAMBIT/gambit_3/ColliderBit_backup/src/ColliderBit_measurements.cpp:21:
/home/anders/physics/GAMBIT/gambit_3/Backends/include/gambit/Backends/backend_types/Contur.hpp:109:21: note: declared here
  109 |         map_str_dbl pool_LLR() const {
      |                     ^~~~~~~~
/home/anders/physics/GAMBIT/gambit_3/ColliderBit_backup/src/ColliderBit_measurements.cpp: In function ‘void Gambit::ColliderBit::Multi_Contur_LHC_measurements_LogLike_perPool(Gambit::map_str_dbl&)’:
/home/anders/physics/GAMBIT/gambit_3/ColliderBit_backup/src/ColliderBit_measurements.cpp:504:77: error: invalid initialization of reference of type ‘std::map<std::__cxx11::basic_string<char>, double> (Gambit::Contur_output::*&&)() const’ from expression of type ‘<unresolved overloaded function type>’
  504 |             for (auto const& pool_LLR_entry : contur_output_instance.second.pool_LLR)
      |                                               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~
/home/anders/physics/GAMBIT/gambit_3/ColliderBit_backup/src/ColliderBit_measurements.cpp:504:77: error: no matching function for call to ‘begin(std::map<std::__cxx11::basic_string<char>, double> (Gambit::Contur_output::*&)() const)’
  504 |             for (auto const& pool_LLR_entry : contur_output_instance.second.pool_LLR)
      |                                                                             ^~~~~~~~
In file included from /usr/include/c++/11/bits/range_access.h:36,
                 from /usr/include/c++/11/string:54,
                 from /usr/include/c++/11/bits/locale_classes.h:40,
                 from /usr/include/c++/11/bits/ios_base.h:41,
                 from /usr/include/c++/11/ios:42,
                 from /usr/include/c++/11/ostream:38,
                 from /usr/include/c++/11/iostream:39,
                 from /home/anders/physics/GAMBIT/gambit_3/Elements/include/gambit/Elements/safety_bucket.hpp:25,
                 from /home/anders/physics/GAMBIT/gambit_3/Elements/include/gambit/Elements/module_macros_inmodule_defs.hpp:40,
                 from /home/anders/physics/GAMBIT/gambit_3/Elements/include/gambit/Elements/module_macros_inmodule.hpp:40,
                 from /home/anders/physics/GAMBIT/gambit_3/Elements/include/gambit/Elements/gambit_module_headers.hpp:37,
                 from /home/anders/physics/GAMBIT/gambit_3/ColliderBit_backup/src/ColliderBit_measurements.cpp:21:
/usr/include/c++/11/initializer_list:90:5: note: candidate: ‘template<class _Tp> constexpr const _Tp* std::begin(std::initializer_list<_Tp>)’
   90 |     begin(initializer_list<_Tp> __ils) noexcept

@tprocter46
Copy link
Collaborator Author

sics/GAMBIT/gambit_3/ColliderBit_backup/src/ColliderBit_measurements.cpp: In function ‘void Gambit::ColliderBit::Multi_Contur_LHC_measurements_LogLike_single(double&)’:
/home/anders/phy

Hmmm... seems to be compiling fine on my laptop. Both the setup I was working with before, and I also did a clean build in a newly cloned repo.

Is it possible you have some stuff from old contur versions kicking about? As we updated the cmake for the new contur version, then it's possible it wouldn't get cleaned out by nuke-all?

If its still not working I can try rebuilding on the glasgow cluster, to see if I can reproduce the problem there.

…ade_try3

Conflicts:
	Backends/patches/rivet/3.1.8/patch_rivet_3.1.8.dif
@tprocter46 tprocter46 changed the base branch from master to SUSYRun2 April 24, 2024 17:07
@tprocter46
Copy link
Collaborator Author

Hi @anderkve -- I've just retried this in a completely new repo (on the Glasgow cluster this time, not my laptop) and I can't reproduce the error you saw -- it seems to compile fine.
I've taken the opportunity to get this branch up to date with SUSYRun2, and resolve the couple of merge conflicts that came up. Can you try again with the latest version? I'm pretty hopeful it should work...

If we do any postprocessing for SusyRun2 it would be good to use a more up-to-date contur version -- with that in mind, I've changed the PR target to SusyRun2, but am happy to change it back if that complicates things further.

(Separately, in trying to rebuild gambit on the glasgow cluster I noticed something weird is going on with cmake and finding GSL -- if I get to the bottom of it I may try and make another PR)

@ChrisJChang
Copy link
Collaborator

I just did a fresh build test, and I can confirm that it is successfully building for me too. @anderkve , would you have time to test again? Alternatively, we could say that this is ready to merge in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants