Copilot AI commented Dec 27, 2025

The pod_kill chaos experiment used the Chaos Mesh pod-kill action, which deletes the target pod. The Kubernetes controller then recreates the pod immediately, so the injected fault did not persist.

Changes

  • inject_symp.py: Changed inject_pod_kill() to use pod-failure action instead of pod-kill

    • pod-kill forcefully deletes pods → controller recreates immediately
    • pod-failure makes pods unavailable for duration without deletion → prevents recreation
  • chaos-yaml/pod-kill.yaml: Updated the template to use the pod-failure action

The fault now persists for the specified duration (e.g., 100s) as intended.

chaos_experiment = {
    "spec": {
-       "action": "pod-kill",
+       "action": "pod-failure",
        "mode": "one",
        "duration": duration,
        # ...
    }
}
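For reviewers who want to see the full shape of the experiment, here is a minimal sketch of a PodChaos manifest built with the pod-failure action. It is not the code in inject_symp.py: the helper name `build_pod_failure_manifest`, the experiment name, and the `io.kompose.service: user` label are assumptions made for illustration. Only the `action`, `mode`, and `duration` fields come from the diff above.

```python
from typing import Any


def build_pod_failure_manifest(
    name: str,
    namespace: str,
    label_selectors: dict[str, str],
    duration: str = "100s",
) -> dict[str, Any]:
    """Return a Chaos Mesh PodChaos manifest that uses the pod-failure action.

    Hypothetical helper for illustration; the real inject_pod_kill() may
    build its experiment differently.
    """
    return {
        "apiVersion": "chaos-mesh.org/v1alpha1",
        "kind": "PodChaos",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # pod-failure keeps the pod object in place, so the controller
            # has nothing to recreate while the fault is active.
            "action": "pod-failure",
            "mode": "one",
            "duration": duration,
            "selector": {
                "namespaces": [namespace],
                # Copy so later edits to the caller's dict do not leak in.
                "labelSelectors": dict(label_selectors),
            },
        },
    }


# Example values: the namespace is taken from the issue, the label is assumed.
manifest = build_pod_failure_manifest(
    name="user-pod-failure",
    namespace="test-hotel-reservation",
    label_selectors={"io.kompose.service": "user"},
)
```

The resulting dict can be dumped to YAML or passed to whatever client the injector already uses to create custom resources.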
Original prompt

This section details the original issue you should resolve.

<issue_title>Pod killing does not work</issue_title>
<issue_description>Problem id: pod_kill_hotel_res-localization-1

test-hotel-reservation   search-5c64c8f5bc-cz8hr                                    1/1     Running      0              91s
test-hotel-reservation   user-66d5d6f874-9gh86                                      1/1     Running      0              40s

The user service should get killed by chaos-mesh but k8s recreates the pod immediately. Is this expected?</issue_description>

Comments on the Issue (you are @copilot in this section)



Co-authored-by: gaganso <27358592+gaganso@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix pod killing issue in user service" to "Fix pod_kill to use pod-failure action to prevent immediate pod recreation" on Dec 27, 2025
Copilot AI requested a review from gaganso December 27, 2025 01:15
@vitrixLab left a comment

Code Review Summary on PR #120: Pod Kill Fault Injection

✅ The change updates the pod-kill chaos experiment to use pod-failure instead of pod-kill, aligning with the intended behavior of temporarily disrupting pods without immediate deletion.

✅ Documentation in inject_pod_kill() clearly explains the rationale, making the fault injection logic easy to understand and maintain.

✅ YAML and Python changes are consistent; action field update is correct and aligns with the docstring explanation.

Minor Note (non-blocking):

  • None; the code and configuration reflect the intended behavior accurately.

Overall, this improves safety and clarity of pod fault injection. Approving. ✅

Development

Successfully merging this pull request may close these issues.

Pod killing does not work

3 participants