@lenn-arts

Sharing the fix for a bug that cost me a lot of time to track down during fine-tuning. High severity, since it makes fine-tuning results unusable.

Issue

Location:
scripts/preprocess_dataset.py > scripts/process_videos.py > MediaDataset > _filter_valid_videos()

Relevant for:
Fine-tuning on any dataset that has conditions and in which some videos may get filtered out.

TLDR:
On instantiation, MediaDataset filters its videos by removing from self.video_paths the paths of videos that are, for instance, too short relative to the bucket size. MediaDataset also keeps self.main_media_paths, which in many cases is equal to self.video_paths and determines the file path under which the computed VAE embedding is saved. However, filtering only changes self.video_paths, NOT self.main_media_paths, so the two lists fall out of alignment and embeddings of kept videos get saved under the names of filtered-out videos. Kept videos are then loaded with the conditioning of other videos, and the model may get confused by the lack of correlation between conditioning and video.

Simple example:
Raw videos in the fine-tuning dataset:

  1. a.mp4
  2. b.mp4
  3. c.mp4

Precomputed text condition embeddings:

  1. .precomputed/conditions/a.pt
  2. .precomputed/conditions/b.pt
  3. .precomputed/conditions/c.pt

self.video_paths before filtering = [a.mp4,b.mp4,c.mp4]
self.main_media_paths before filtering = [a.mp4,b.mp4,c.mp4]

self.video_paths AFTER filtering (remove b because too short) = [a.mp4,c.mp4]
self.main_media_paths AFTER filtering (doesn't get changed) = [a.mp4,b.mp4,c.mp4]

MediaDataset.__getitem__(index=1) returns
{"video": video("c.mp4"), "relative_path": "c.mp4", "main_media_relative_path": "b.mp4"}

As a result, the embedding for c.mp4 gets saved as .precomputed/latents/b.pt. In src/ltxv_trainer/datasets.py:PrecomputedDataset.__getitem__, video c is then loaded with the caption for video b (.precomputed/conditions/b.pt).
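
The misalignment in the example can be reproduced with a toy stand-in for the dataset. This is a minimal sketch, not the repository's code: `ToyMediaDataset`, `durations`, and `min_duration` are hypothetical names used only for illustration, and the duration-based filter is an assumed stand-in for the real validity check. Only the attribute names `video_paths` and `main_media_paths` and the returned keys come from the description above.

```python
class ToyMediaDataset:
    """Hypothetical stand-in that mimics the indexing bug described above."""

    def __init__(self, paths, durations, min_duration):
        self.video_paths = list(paths)
        self.main_media_paths = list(paths)  # starts out identical to video_paths
        self._filter_valid_videos(durations, min_duration)

    def _filter_valid_videos(self, durations, min_duration):
        # Buggy on purpose: only video_paths is filtered,
        # main_media_paths keeps its original length and order.
        self.video_paths = [
            p for p in self.video_paths if durations[p] >= min_duration
        ]

    def __len__(self):
        return len(self.video_paths)

    def __getitem__(self, index):
        # Both lists are indexed with the same index, which is only
        # correct while they stay aligned.
        return {
            "relative_path": self.video_paths[index],
            "main_media_relative_path": self.main_media_paths[index],
        }


# Made-up durations in seconds; b.mp4 is "too short" and gets filtered out.
durations = {"a.mp4": 5.0, "b.mp4": 0.5, "c.mp4": 5.0}
dataset = ToyMediaDataset(["a.mp4", "b.mp4", "c.mp4"], durations, min_duration=1.0)

item = dataset[1]
print(item)  # c.mp4 is paired with b.mp4's save name
```

Index 1 now refers to c.mp4 in one list and b.mp4 in the other, which is exactly the pairing shown in the example.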

Solution

Keep self.main_media_paths in sync with self.video_paths during filtering in scripts/process_videos.py:MediaDataset._filter_valid_videos().
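
One way to keep the two lists in sync is to filter them as pairs, so that an entry is either kept in both or dropped from both. The following is a hedged sketch under that assumption and not the actual diff of this PR: `filter_in_sync` and `is_valid` are hypothetical names, and the duration check again stands in for whatever validity test `_filter_valid_videos()` really applies.

```python
def filter_in_sync(video_paths, main_media_paths, is_valid):
    """Keep only entries whose video passes is_valid, in both lists at once.

    Hypothetical helper for illustration; not part of the repository.
    """
    if len(video_paths) != len(main_media_paths):
        raise ValueError("video_paths and main_media_paths must have equal length")

    # Decide per pair, so the two output lists cannot drift apart.
    kept = [
        (video, main)
        for video, main in zip(video_paths, main_media_paths)
        if is_valid(video)
    ]
    kept_videos = [video for video, _ in kept]
    kept_main = [main for _, main in kept]
    return kept_videos, kept_main


# Made-up durations in seconds; b.mp4 is "too short" and gets filtered out.
durations = {"a.mp4": 5.0, "b.mp4": 0.5, "c.mp4": 5.0}
paths = ["a.mp4", "b.mp4", "c.mp4"]

videos, main_media = filter_in_sync(paths, paths, lambda p: durations[p] >= 1.0)
print(videos, main_media)
```

With both lists filtered together, index 1 refers to c.mp4 in each of them, so the embedding for c.mp4 would be saved under c's own name.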

@lenn-arts lenn-arts requested a review from matanby as a code owner October 8, 2025 21:40
@lenn-arts
Author

@matanby Have time to review this?
