| 1 | +# Mutation Testing Setup |
| 2 | + |
| 3 | +This document provides instructions for setting up and running mutation testing for the pandas validation functions. |
| 4 | + |
| 5 | +--- |
| 6 | + |
| 7 | +## Tool Used |
| 8 | + |
| 9 | +**Mutatest 3.1.0** |
| 10 | + |
| 11 | +Mutatest is a Python mutation testing tool that makes small changes (mutations) to the source code and reruns your test suite to check whether the tests detect each change. |
| 12 | + |
| 13 | +**Installation:** |
| 14 | +```bash |
| 15 | +pip install mutatest==3.1.0 |
| 16 | +``` |
| 17 | + |
| 18 | +**Key Features:** |
| 19 | +- Several running modes; these runs use `s` (stop at a location once a mutation there survives) |
| 20 | +- Random sampling of mutations for large codebases |
| 21 | +- Detailed reporting of detected, survived, and unknown mutations |
| 22 | + |
| 23 | +**Documentation:** https://mutatest.readthedocs.io/ |
| 24 | + |
| 25 | +--- |
| 26 | + |
| 27 | +## How to Run Mutation Tests |
| 28 | + |
| 29 | +### Prerequisites |
| 30 | + |
| 31 | +1. **Navigate to the repository root** (the path below is from the authors' machine; substitute the location of your own clone): |
| 32 | + ```bash |
| 33 | + cd /Volumes/T7Shield/SWEN777/SWEN_777_Pandas |
| 34 | + ``` |
| 35 | + |
| 36 | +2. **Activate the virtual environment:** |
| 37 | + ```bash |
| 38 | + source venv/bin/activate |
| 39 | + ``` |
| 40 | + |
| 41 | +3. **Verify installations:** |
| 42 | + ```bash |
| 43 | + python --version # Should show Python 3.13.5 |
| 44 | + pytest --version # Should show pytest 8.4.2 |
| 45 | + venv/bin/mutatest --version # Should show mutatest 3.1.0 |
| 46 | + ``` |
| 47 | + |
| 48 | +### Step 1: Run Tests First |
| 49 | + |
| 50 | +Before running mutation testing, verify all tests pass: |
| 51 | + |
| 52 | +```bash |
| 53 | +# Run all validation tests together |
| 54 | +python -m pytest pandas/tests/util/test_validate_endpoints.py \ |
| 55 | + pandas/tests/util/test_validate_percentile.py \ |
| 56 | + pandas/tests/util/test_validate_bool_kwarg.py -v |
| 57 | +``` |
| 58 | + |
| 59 | +**Expected Output:** All 35 tests should pass (9 + 14 + 12) |
| 60 | + |
| 61 | +### Step 2: Run Mutation Testing |
| 62 | + |
| 63 | +#### Student 1 (Sandeep Ramavath) - validate_endpoints |
| 64 | + |
| 65 | +```bash |
| 66 | +# Final run with n=40 samples |
| 67 | +venv/bin/mutatest -s pandas/util/_validators.py \ |
| 68 | + -t "python -m pytest pandas/tests/util/test_validate_endpoints.py -x" \ |
| 69 | + -m s -n 40 --nocov |
| 70 | +``` |
| 71 | + |
| 72 | +#### Student 2 (Nithikesh Bobbili) - validate_percentile |
| 73 | + |
| 74 | +```bash |
| 75 | +# Final run with n=40 samples |
| 76 | +venv/bin/mutatest -s pandas/util/_validators.py \ |
| 77 | + -t "python -m pytest pandas/tests/util/test_validate_percentile.py -x" \ |
| 78 | + -m s -n 40 --nocov |
| 79 | +``` |
| 80 | + |
| 81 | +#### Student 3 (Mallikarjuna) - validate_bool_kwarg |
| 82 | + |
| 83 | +```bash |
| 84 | +# Final run with n=40 samples |
| 85 | +venv/bin/mutatest -s pandas/util/_validators.py \ |
| 86 | + -t "python -m pytest pandas/tests/util/test_validate_bool_kwarg.py -x" \ |
| 87 | + -m s -n 40 --nocov |
| 88 | +``` |
| 89 | + |
| 90 | +### Command Parameters Explained |
| 91 | + |
| 92 | +- `-s`: Source file to mutate (pandas/util/_validators.py) |
| 93 | +- `-t`: Test command to run (pytest with specific test file) |
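| 94 | +- `-m s`: Running mode `s`, which stops mutating a location as soon as one mutation there survives and moves on to the next location (it does not stand for "substitution") |
| 95 | +- `-n 40`: Number of code locations to sample for mutation; one location can yield several mutation runs, so total runs can exceed 40 |
| 96 | +- `--nocov`: Ignore any existing `.coverage` file, so mutation locations are not restricted to covered lines |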
| 97 | +- `-x`: pytest flag to stop on first test failure |
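
The three commands in Step 2 differ only in the test file, so they could be generated from one place. The following is a minimal sketch, assuming it is run from the repository root with mutatest installed at `venv/bin/mutatest` as in the commands above. `build_mutatest_command` and `run_all` are hypothetical helper names, not part of mutatest or of this repository.

```python
import subprocess

SOURCE_FILE = "pandas/util/_validators.py"

# Test file per target function, as listed in Step 2.
TEST_FILES = {
    "validate_endpoints": "pandas/tests/util/test_validate_endpoints.py",
    "validate_percentile": "pandas/tests/util/test_validate_percentile.py",
    "validate_bool_kwarg": "pandas/tests/util/test_validate_bool_kwarg.py",
}


def build_mutatest_command(test_file: str, n_locations: int = 40) -> list[str]:
    """Build the argument list for one mutatest run (hypothetical helper)."""
    test_cmd = f"python -m pytest {test_file} -x"
    return [
        "venv/bin/mutatest",
        "-s", SOURCE_FILE,
        "-t", test_cmd,
        "-m", "s",
        "-n", str(n_locations),
        "--nocov",
    ]


def run_all(n_locations: int = 40) -> None:
    """Run mutatest once per target function; requires the venv to exist."""
    for name, test_file in TEST_FILES.items():
        print(f"== {name} ==")
        subprocess.run(build_mutatest_command(test_file, n_locations), check=False)
```

Calling `run_all()` executes the three final runs in sequence; `run_all(20)` would correspond to the initial n=20 runs.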
| 98 | + |
| 99 | +--- |
| 100 | + |
| 101 | +## Target Files |
| 102 | + |
| 103 | +### Source File Under Test |
| 104 | + |
| 105 | +**File:** `pandas/util/_validators.py` |
| 106 | +- **Total Lines:** 483 |
| 107 | +- **Total Mutation Targets:** 138 identified by mutatest |
| 108 | + |
| 109 | +### Target Functions |
| 110 | + |
| 111 | +| Function | Line Range | Lines | Purpose | Student | |
| 112 | +|----------|-----------|-------|---------|---------| |
| 113 | +| validate_endpoints(closed) | 391-420 | 30 | Validates "closed" parameter for interval boundaries | Sandeep Ramavath | |
| 114 | +| validate_percentile(q) | 339-368 | 30 | Validates percentile values in range [0, 1] | Nithikesh Bobbili | |
| 115 | +| validate_bool_kwarg(value, arg_name) | 228-270 | 43 | Validates boolean keyword arguments | Mallikarjuna | |
| 116 | + |
| 117 | +**Total Lines Tested:** 103 lines across 3 functions |
| 118 | + |
| 119 | +--- |
| 120 | + |
| 121 | +## Test Files |
| 122 | + |
| 123 | +All test files are located in `pandas/tests/util/` |
| 124 | + |
| 125 | +### Student 1: Sandeep Ramavath - test_validate_endpoints.py |
| 126 | + |
| 127 | +**Function Tested:** `validate_endpoints(closed)` (lines 391-420) |
| 128 | + |
| 129 | +**Total Tests:** 9 |
| 130 | +- Initial tests: 7 |
| 131 | +- Improvement tests: 2 |
| 132 | + |
| 133 | +**Test Coverage:** |
| 134 | +- Valid inputs: None, "left", "right" |
| 135 | +- Invalid inputs: empty string, uppercase, integers, invalid strings |
| 136 | +- Return type validation (tuple) |
| 137 | +- Mutual exclusivity of left/right flags |
| 138 | + |
| 139 | +### Student 2: Nithikesh Bobbili - test_validate_percentile.py |
| 140 | + |
| 141 | +**Function Tested:** `validate_percentile(q)` (lines 339-368) |
| 142 | + |
| 143 | +**Total Tests:** 14 |
| 144 | +- Initial tests: 11 |
| 145 | +- Improvement tests: 3 |
| 146 | + |
| 147 | +**Test Coverage:** |
| 148 | +- Valid single values: 0.0, 0.5, 1.0 |
| 149 | +- Valid collections: lists, tuples, numpy arrays |
| 150 | +- Boundary values: 0.0 and 1.0 |
| 151 | +- Invalid values: below 0, above 1 |
| 152 | +- Mixed valid/invalid in collections |
| 153 | +- Return type validation (ndarray) |
| 154 | +- Precise edge cases near boundaries |
| 155 | + |
| 156 | +### Student 3: Mallikarjuna - test_validate_bool_kwarg.py |
| 157 | + |
| 158 | +**Function Tested:** `validate_bool_kwarg(value, arg_name)` (lines 228-270) |
| 159 | + |
| 160 | +**Total Tests:** 12 |
| 161 | +- Initial tests: 9 |
| 162 | +- Improvement tests: 3 |
| 163 | + |
| 164 | +**Test Coverage:** |
| 165 | +- Valid boolean values: True, False |
| 166 | +- None handling: allowed by default, disallowed when specified |
| 167 | +- Integer handling: disallowed by default, allowed when specified |
| 168 | +- Invalid types: strings, lists, floats |
| 169 | +- Parameter combinations |
| 170 | +- Edge case: zero as integer |
| 171 | + |
| 172 | +**Total Tests Across All Students:** 35 tests |
| 173 | + |
| 174 | +--- |
| 175 | + |
| 176 | +## Notes |
| 177 | + |
| 178 | +### Important Limitations |
| 179 | + |
| 180 | +#### Random Sampling Challenge |
| 181 | + |
| 182 | +Mutatest samples mutations randomly from the **entire source file** (pandas/util/_validators.py, 483 lines with 138 mutation targets), not just the target functions. |
| 183 | + |
| 184 | +**Impact:** |
| 185 | +- Target functions cover only 103 lines (~21% of file) |
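| 186 | +- With n=40 sampled locations, expect only ~8 to land in the three target functions combined, and about 2-4 in any single function (each run tests one function), assuming mutation targets are spread evenly across lines |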
| 187 | +- Most sampled mutations fall outside target function ranges |
| 188 | +- This causes low overall detection percentages (5-31%) |
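
The estimate above can be reproduced with a few lines of arithmetic. This is a rough sketch that uses line share as a proxy: it assumes the sampled locations are drawn uniformly and that mutation targets are spread evenly across lines, neither of which is guaranteed. `expected_in_range` is a hypothetical helper name.

```python
FILE_LINES = 483   # total lines in pandas/util/_validators.py
N_SAMPLED = 40     # value passed to -n

# Line counts per target function, from the Target Functions table.
FUNCTION_LINES = {
    "validate_endpoints": 30,
    "validate_percentile": 30,
    "validate_bool_kwarg": 43,
}


def expected_in_range(lines: int, file_lines: int = FILE_LINES, n: int = N_SAMPLED) -> float:
    """Expected number of sampled locations inside a span of `lines` lines,
    assuming uniform sampling by line (an approximation)."""
    return n * lines / file_lines


total_lines = sum(FUNCTION_LINES.values())
print(f"all targets: {total_lines} lines, "
      f"{total_lines / FILE_LINES:.1%} of file, "
      f"expected {expected_in_range(total_lines):.1f} locations")

for name, lines in FUNCTION_LINES.items():
    print(f"{name}: {lines / FILE_LINES:.1%} of file, "
          f"expected {expected_in_range(lines):.1f} locations")
```

Because each student's run exercises only one test file, the per-function figures are the relevant ones for a single run.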
| 189 | + |
| 190 | +**Key Insight:** Every sampled mutation that fell inside a target function's line range was detected (16/16, 2/2, and 4/4), which indicates that the tests cover their target functions well, although the in-range samples are small. |
| 191 | + |
| 192 | +#### Interpreting Mutation Scores |
| 193 | + |
| 194 | +**Overall Scores (appear low):** |
| 195 | +- Student 1: 16/51 detected (31%) |
| 196 | +- Student 2: 2/41 detected (5%) |
| 197 | +- Student 3: 4/42 detected (10%) |
| 198 | + |
| 199 | +**Within-Function Detection (actual quality):** |
| 200 | +- Student 1: 16/16 detected (100%) - All sampled mutations in lines 391-420 caught |
| 201 | +- Student 2: 2/2 detected (100%) - All sampled mutations in lines 339-368 caught |
| 202 | +- Student 3: 4/4 detected (100%) - All sampled mutations in lines 228-270 caught |
| 203 | + |
| 204 | +**Conclusion:** Low overall percentages reflect how the runs were sampled (random locations across the whole file), not poor test quality. |
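
The two sets of scores can be recomputed from the counts reported above. This is a minimal sketch: the `RunResult` class is a hypothetical container, and the overall rate uses total runs (including unknown and timeout results) as the denominator, which is how the percentages above appear to have been derived.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    """Counts from one final mutatest run (hypothetical container)."""
    student: str
    detected: int           # mutations detected anywhere in the file
    total_runs: int         # all mutation runs, incl. unknown/timeout
    in_range_detected: int  # detected mutations inside the target function
    in_range_total: int     # sampled mutations inside the target function

    @property
    def overall_rate(self) -> float:
        return self.detected / self.total_runs

    @property
    def in_range_rate(self) -> float:
        return self.in_range_detected / self.in_range_total


# Final-run counts as reported in this document.
RESULTS = [
    RunResult("Student 1", 16, 51, 16, 16),
    RunResult("Student 2", 2, 41, 2, 2),
    RunResult("Student 3", 4, 42, 4, 4),
]

for r in RESULTS:
    print(f"{r.student}: overall {r.overall_rate:.0%}, "
          f"within function {r.in_range_rate:.0%} "
          f"({r.in_range_detected}/{r.in_range_total})")
```

Printing the in-range counts next to the rate keeps the small sample sizes visible when reading the 100% figures.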
| 205 | + |
| 206 | +### Mutation Types Detected |
| 207 | + |
| 208 | +The test suites successfully detect: |
| 209 | +1. **Boolean constant mutations:** True ↔ False ↔ None |
| 210 | +2. **Comparison operator mutations:** == ↔ != ↔ < ↔ > ↔ <= ↔ >= |
| 211 | +3. **If statement mutations:** If_Statement ↔ If_True ↔ If_False |
| 212 | + |
| 213 | +--- |
| 214 | + |
| 215 | +## Group Contributions |
| 216 | + |
| 217 | +### Student 1: Sandeep Ramavath |
| 218 | +**Function:** `validate_endpoints(closed)` (lines 391-420) |
| 219 | + |
| 220 | +**Contributions:** |
| 221 | +- Created initial test suite with 7 comprehensive tests |
| 222 | +- Covered all valid values (None, "left", "right") and invalid input scenarios |
| 223 | +- Added 2 improvement tests targeting return type validation and mutual exclusivity |
| 224 | +- Ran initial mutation testing (n=20) and final testing (n=40) |
| 225 | +- Analyzed mutation results and identified patterns |
| 226 | + |
| 227 | +**Results:** |
| 228 | +- Initial: 4 detected, 16 survived, 1 unknown, 1 timeout (22 total runs) |
| 229 | +- Final: 16 detected, 34 survived, 1 unknown (51 total runs) |
| 230 | +- **Achievement:** 100% detection rate for mutations within function lines 391-420 |
| 231 | + |
| 232 | +### Student 2: Nithikesh Bobbili |
| 233 | +**Function:** `validate_percentile(q)` (lines 339-368) |
| 234 | + |
| 235 | +**Contributions:** |
| 236 | +- Created comprehensive initial test suite with 11 tests |
| 237 | +- Covered single values, collections (lists, tuples, arrays), and boundary cases |
| 238 | +- Added 3 improvement tests targeting return type and precise boundary edges |
| 239 | +- Ran initial mutation testing (n=20) and final testing (n=40) |
| 240 | +- Documented edge case testing strategies |
| 241 | + |
| 242 | +**Results:** |
| 243 | +- Initial: 3 detected, 18 survived, 1 unknown (22 total runs) |
| 244 | +- Final: 2 detected, 38 survived, 1 timeout (41 total runs) |
| 245 | +- **Achievement:** 100% detection rate for mutations within function lines 339-368 |
| 246 | + |
| 247 | +### Student 3: Mallikarjuna |
| 248 | +**Function:** `validate_bool_kwarg(value, arg_name)` (lines 228-270) |
| 249 | + |
| 250 | +**Contributions:** |
| 251 | +- Created initial test suite with 9 tests covering boolean validation |
| 252 | +- Tested None handling, integer handling, and invalid type scenarios |
| 253 | +- Added 3 improvement tests targeting parameter combinations and edge cases |
| 254 | +- Ran initial mutation testing (n=20) and final testing (n=40) |
| 255 | +- Analyzed sampling variance effects |
| 256 | + |
| 257 | +**Results:** |
| 258 | +- Initial: 0 detected, 20 survived (20 total runs) - no mutations sampled in target range |
| 259 | +- Final: 4 detected, 38 survived (42 total runs) |
| 260 | +- **Achievement:** 100% detection rate for mutations within function lines 228-270 |
| 261 | + |
| 262 | +### Collaborative Efforts |
| 263 | + |
| 264 | +All team members collaborated on: |
| 265 | +- Consistent test structure using pytest class-based organization |
| 266 | +- Following pandas testing conventions and style guidelines |
| 267 | +- Comprehensive documentation of findings in report.md |
| 268 | +- Analysis of mutation testing limitations and interpretation |
| 269 | +- Understanding the impact of random sampling on results |
| 270 | + |
| 271 | +--- |