Modern36/jdc_reader

Journal Digital Corpus Reader

A web-based search interface for the Journal Digital Corpus, a collection of transcripts from Swedish historical newsreels (SF Veckorevy).

Features

  • Full-text search across ~6,800 transcript files
  • Fuzzy search for finding matches despite OCR/ASR errors
  • Filter by transcript type (speech/intertitle), collection, and year
  • Side-by-side viewer showing speech and intertitle transcripts with timestamps
  • Shareable URLs for bookmarking searches and specific videos
  • Client-side only: the corpus is loaded directly from Zenodo, so no backend is required
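
To illustrate the idea behind fuzzy search, here is a minimal sketch of word-level fuzzy matching. It is not the reader's actual implementation, which this README does not describe. The function name `fuzzy_find`, the `threshold` parameter, and the use of Python's standard-library `difflib` are all assumptions made for the example; the sample sentence is invented to mimic a one-letter transcription error.

```python
from difflib import SequenceMatcher


def fuzzy_find(query, text, threshold=0.8):
    """Return (word, score) pairs for words in `text` that resemble `query`.

    Hypothetical helper for illustration only: a word counts as a match when
    its similarity ratio to the query is at least `threshold`.
    """
    query = query.lower()
    hits = []
    for word in text.lower().split():
        # Drop surrounding punctuation so "Stockholm." compares as "stockholm".
        token = word.strip(".,;:!?\"'()")
        if not token:
            continue
        score = SequenceMatcher(None, query, token).ratio()
        if score >= threshold:
            hits.append((token, score))
    return hits


# A misrecognised "Kanungen" is still found when searching for "konungen".
print(fuzzy_find("konungen", "Kanungen anländer till Stockholm."))
```

Lowering `threshold` tolerates more transcription noise at the cost of more false positives, which is the usual trade-off when searching OCR or ASR output.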

Usage

Visit the hosted version at: https://[username].github.io/jdc_browser/

Or run locally:

git clone https://github.com/[username]/jdc_browser.git
cd jdc_browser
python3 -m http.server 8000
# Open http://localhost:8000

Deployment to GitHub Pages

  1. Push the repository to GitHub
  2. Go to Settings > Pages
  3. Set source to "Deploy from a branch" and select main / root
  4. The site will be available at https://[username].github.io/jdc_browser/

Data Source

The corpus is loaded directly from Zenodo at runtime (~13 MB download). It contains:

  • Speech transcripts: Automatic speech recognition via SweScribe
  • Intertitle transcripts: OCR from silent film text cards via stum

DOI: 10.5281/zenodo.15596191

Source repository: Modern36/journal_digital_corpus
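
For readers who want to inspect the Zenodo record outside the browser, the following is a rough sketch of how the record behind the DOI above could be listed with Python's standard library. It assumes Zenodo's public REST API at `https://zenodo.org/api/records/<id>` and that the JSON response carries a `files` list whose entries have `key` and `size` fields. The helper names are hypothetical, and the reader itself may load the data differently.

```python
import json
import urllib.request

# Assumed base URL of Zenodo's public records API.
ZENODO_API = "https://zenodo.org/api/records/"
DOI_PREFIX = "10.5281/zenodo."


def record_id_from_doi(doi):
    """Extract the numeric record ID from a Zenodo DOI (hypothetical helper)."""
    if not doi.startswith(DOI_PREFIX):
        raise ValueError(f"not a Zenodo DOI: {doi}")
    return doi[len(DOI_PREFIX):]


def record_api_url(doi):
    """Build the API URL for the record that a Zenodo DOI points to."""
    return ZENODO_API + record_id_from_doi(doi)


def list_record_files(doi):
    """Fetch the record metadata and return (file name, size in bytes) pairs.

    Requires network access; the `files`, `key`, and `size` field names are
    assumptions about the API response.
    """
    with urllib.request.urlopen(record_api_url(doi)) as response:
        record = json.load(response)
    return [(entry["key"], entry["size"]) for entry in record.get("files", [])]
```

Calling `list_record_files("10.5281/zenodo.15596191")` should then return the names and sizes of the files attached to the record, provided the assumptions above hold.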

Credits

Developed for the Modern Times 1936 research project at Lund University, Sweden. The project investigates what software "sees," "hears," and "perceives" when pattern recognition technologies such as 'AI' are applied to media historical sources. The project is funded by Riksbankens Jubileumsfond.

License

The Journal Digital Corpus is licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.

References

@article{aspenskog2025journal,
  title={Journal Digital Corpus: Swedish Newsreel Transcriptions},
  author={Aspenskog, Robert and Johansson, Mathias and Snickars, Pelle},
  journal={Journal of Open Humanities Data},
  volume={11},
  number={1},
  year={2025}
}

About

A reader for browsing and searching in Journal Digital Corpus
