The CAST macro generates unaligned memory accesses. This breaks compatibility on architectures that don't support unaligned access, such as some ARM cores.
I'm looking into this issue, and it seems that I would have to rewrite the CAST macro as an inline function that uses byte loads.
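
A minimal sketch of the byte-load approach, assuming CAST currently dereferences a cast pointer (something like `*(uint32_t *)(ptr)`) and that the value is 32 bits wide. The names `load_u32_le` and `load_u32_native` are hypothetical and not part of the existing code, and the little-endian layout in the first helper is an assumption about the data format.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical replacement for a CAST-style 32-bit read.
 * Assembles the value from four single-byte loads, so the pointer
 * may have any alignment. Assumes the bytes are stored little-endian. */
static inline uint32_t load_u32_le(const void *p)
{
    const unsigned char *b = (const unsigned char *)p;
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

/* Alternative sketch: copy the bytes into an aligned local with memcpy.
 * This keeps the host's native byte order, like the original cast would. */
static inline uint32_t load_u32_native(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

The explicit-shift version fixes the byte order regardless of the host, while the `memcpy` version behaves like the original cast minus the alignment requirement. Which one fits depends on whether the existing callers rely on native byte order.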