Process hangs in slot, failing to compute new incoming message #436

@jax-cn

Description

Env

  1. edge branch
  2. HB_PRINT=compute_short,push_short rebar3 as genesis_wasm shell
  3. Operator: lpJ5Edz_8DbNnVDL0XdbsY9vCOs45NACzfI4jvo4Ba8
  4. Process ID: ZIc9924GI_wMzPayOZAgVjaxasNq1rIQwdSZseGoh7M
  5. Scheduler: GQ33BkPtZrqxA84vM8Zk-N2aO0toNNu_C-l-rawrBA (legacy SU)
  6. halted slot: 1268

Bug

Normally, an incoming message results in

pushed_message_to, process: Bf6JJ..hTRzo, slot: 15
starting_compute, proc_id: Bf6JJ..hTRzo, current: 15, target: 15
done, process: Bf6JJR2tl2Wr38O2-H6VctqtduxHgKF-NzRB9HhTRzo, slot: 15

But for my process, an incoming message only results in

pushed_message_to, process: ZIc99..Goh7M, slot: 1298
starting_compute, proc_id: ZIc99..Goh7M, current: 1268, target: 1298

Expected results

starting_compute, proc_id: ZIc99..Goh7M, current: 1269, target: 1298
starting_compute, proc_id: ZIc99..Goh7M, current: 1270, target: 1298
...
starting_compute, proc_id: ZIc99..Goh7M, current: 1298, target: 1298
done, process: ZIc9924GI_wMzPayOZAgVjaxasNq1rIQwdSZseGoh7M, slot: 1298

Logs

When trying to reproduce the issue on another HyperBEAM node, I got this error:

=ERROR REPORT==== 19-Aug-2025::10:14:03.535724 ===
Error in process <0.9374.0> with exit value:
{{nocatch,
     {necessary_message_not_found,<<"body/tags">>,
         <<"Link (to link): 1KXso6h8Hsi1FWXRYrR8sLU8kyBCSvavfFvdCyNM7j8">>}},
 [{hb_cache,report_ensure_loaded_not_found,3,
      [{file,"/root/HyperBEAM/src/hb_cache.erl"},{line,132}]},
  {hb_cache,ensure_all_loaded,3,
      [{file,"/root/HyperBEAM/src/hb_cache.erl"},{line,149}]},
  {maps,map_1,3,[{file,"maps.erl"},{line,942}]},
  {maps,map_1,3,[{file,"maps.erl"},{line,942}]},
  {maps,map,2,[{file,"maps.erl"},{line,927}]}, 
  {maps,map_1,3,[{file,"maps.erl"},{line,942}]},
  {maps,map_1,3,[{file,"maps.erl"},{line,942}]},
  {maps,map,2,[{file,"maps.erl"},{line,927}]}, 
  {dev_scheduler_cache,write,2,
      [{file,"/root/HyperBEAM/src/dev_scheduler_cache.erl"},{line,25}]},
  {lists,foreach_1,2,[{file,"lists.erl"},{line,2641}]},
  {dev_scheduler,'-cache_remote_schedule/2-fun-1-',2,
      [{file,"/root/HyperBEAM/src/dev_scheduler.erl"},{line,1238}]}]}

How to reproduce

  1. Spawn a process on the legacy network (specifying the legacy SU as scheduler): aos process_1
  2. Reconnect to the process via the HyperBEAM node on the edge branch: aos process_1 --mainnet xxx
  3. Write a handler that Sends a nested data structure to the patch device on each call:
Handlers.add("Cron", "Cron", function(msg)
  Send({
    device = "patch@1.0",
    cache = {
      taskStats = {
        succeeded = 59,
        timestamp = os.time(),
      },
      tasks = {
        ["sNWrdfUcR9kBpRPPPnJKFlel4j_z2rJ89PStNXITMto-1755453195384"] = {
          timestamp = os.time(),
          status = "succeeded",
        },
      },
      timestamp = os.time(),
      workerTasks = {
        timestamp = os.time(),
      },
    },
  })
end)
  4. Call the handler every 5 seconds.
  5. After about 500 messages, stop the node.
  6. Clear the cache-mainnet folder.
  7. Restart the node.
  8. Send another message to the process; with very high probability you will see the error log above.
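
The node-side steps above can be sketched as shell commands. This is a hedged outline, not an exact recipe: it assumes a HyperBEAM checkout on the edge branch, run from the repository root; the HB_PRINT flags and rebar3 invocation are taken from the Env section, and cache-mainnet is the cache folder named in step 6.

```shell
# After ~500 messages, stop the node (step 5), then clear the local cache
# folder (step 6). Assumes cache-mainnet lives in the current directory.
rm -rf cache-mainnet

# Restart the node (step 7) with compute/push tracing enabled, as in the
# Env section, then send another message to the process (step 8).
HB_PRINT=compute_short,push_short rebar3 as genesis_wasm shell
```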

Is this a known issue?
I've hit it a couple of times before; a previous affected process was D0na6AspYVzZnZNa7lQHnBt_J92EldK_oFtEPLjIexo

If you need any further logs or a cache.tar to debug, please let me know.
