Fix is_not_offered_this_year for multi-course entries #268
Summary
Fixes #267
When a catalog entry contains multiple courses (e.g., HST.010 and HST.011 sharing the same content block), the `is_not_offered_this_year` check was applied to the entire shared HTML. If any "not offered" indicator was found, all courses in that entry were incorrectly skipped.
The Problem
The example from the issue (course 12.S492) demonstrates this: when one course in a multi-course entry has a "not offered" indicator, the current logic hides both courses from Hydrant — even though one of them might actually be offered.
The Fix
For multi-course entries (`len(course_nums) > 1`), we now skip the `is_not_offered_this_year` check entirely and include all courses. This is because we cannot reliably determine which specific course the indicator applies to.
For single-course entries, the behavior remains unchanged.
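A minimal sketch of the new skip condition, for illustration only. The helper `should_skip_entry` is hypothetical, and the body of `is_not_offered_this_year` here is a stand-in that assumes a simple marker-string check; the real function in `scrapers/catalog.py` may take different arguments and detect the indicator differently.

```python
def is_not_offered_this_year(html: str) -> bool:
    """Stand-in for the real check (assumption): look for a marker string."""
    return "Not offered" in html


def should_skip_entry(course_nums: list[str], html: str) -> bool:
    """Hypothetical helper mirroring the condition described above."""
    # Multi-course entries share one content block, so the indicator cannot
    # be attributed to a specific course: never skip them.
    if len(course_nums) > 1:
        return False
    # Single-course entries keep the original behavior.
    return is_not_offered_this_year(html)


shared_html = "<p>Not offered regularly; consult department</p>"
print(should_skip_entry(["HST.010", "HST.011"], shared_html))
print(should_skip_entry(["HST.010"], shared_html))
```

With this stand-in, the multi-course entry is kept even though its shared HTML contains the marker, while a single-course entry with the same HTML is still skipped.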
Changes
`scrapers/catalog.py`: Modified the condition in `scrape_courses_from_page` to only use `is_not_offered_this_year` as a skip criterion for single-course entries.