Generate stats from MARVA's backend database BSON files.

    pip install -r requirements.txt

Place your BSON export files in the bson_data/ directory.
Each file should be the resourcesProduction.bson from a backup.
You can put multiple resourcesProduction.bson files there even if their records
overlap; the program dedupes based on e number. Include enough backups to
cover the entire desired date range.
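
As an illustration of why overlapping backups are harmless, here is a minimal sketch of key-based deduplication. It is not the script's actual code: the `eid` field name, the `dedupe_records` helper, and the choice to keep the first copy seen are all assumptions made for the example.

```python
def dedupe_records(batches, key="eid"):
    """Merge records from several exports, keeping one record per key.

    `key` is a hypothetical field name standing in for the record identifier.
    """
    seen = {}
    for batch in batches:
        for record in batch:
            # First occurrence wins (an assumption); later duplicates are skipped.
            seen.setdefault(record[key], record)
    return list(seen.values())


# Two made-up backups whose contents overlap on "e2".
backup_a = [{"eid": "e1"}, {"eid": "e2"}]
backup_b = [{"eid": "e2"}, {"eid": "e3"}]

unique = dedupe_records([backup_a, backup_b])
print([r["eid"] for r in unique])  # ['e1', 'e2', 'e3']
```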

    python3 extract_published.py <start_date> <end_date>

Date format: yyyy-mm-dd

Example:

    python3 extract_published.py 2024-10-01 2025-09-30

This script:
- Scans all BSON files in bson_data/
- Filters for published records within the date range
- Deduplicates records by URI across files
- Extracts NAR IDs (n20255...) from XML content (needs to be modified for 2026+)
- Outputs JSON files to output/:
  - records_by_day.json - record counts per day
  - records_by_month.json - record counts per month
  - records_by_user_month.json - record counts by user by month
  - nars_by_day.json - new NAR counts per day
  - nars_by_month.json - new NAR counts per month
  - nars_by_user_month.json - new NAR counts by user by month
  - nars_list.json - list of all NAR URLs found
- Saves sample XML files containing NARs to xml_data/
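
The filtering, counting, and NAR extraction steps can be sketched roughly as follows. This is a hedged illustration and not the contents of extract_published.py: the record fields (`published`, `user`, `xml`), the `nar_pattern` helper, the inclusive end date, and the assumed ID shape ("n" followed by a four-digit year and six more digits) are all assumptions. Passing the years as a parameter shows one way the 2026+ limitation might be handled.

```python
import re
from collections import Counter
from datetime import date


def nar_pattern(years):
    """Build a regex for IDs shaped like n<year><6 digits> (assumed shape)."""
    alternatives = "|".join(str(y) for y in years)
    return re.compile(rf"\bn(?:{alternatives})\d{{6}}\b")


# Made-up records; the field names are hypothetical.
records = [
    {"published": date(2024, 10, 1), "user": "alice", "xml": "<a>n2025512345</a>"},
    {"published": date(2024, 10, 1), "user": "bob", "xml": "<a>no authority</a>"},
    {"published": date(2024, 11, 15), "user": "alice", "xml": "<a>n2025500001 n2025500001</a>"},
    {"published": date(2025, 10, 2), "user": "bob", "xml": "<a>n2025599999</a>"},
]

start, end = date(2024, 10, 1), date(2025, 9, 30)

# Keep records inside the range (end date treated as inclusive here).
selected = [r for r in records if start <= r["published"] <= end]

# Count records per day, per month, and per (user, month).
by_day = Counter(r["published"].isoformat() for r in selected)
by_month = Counter(r["published"].strftime("%Y-%m") for r in selected)
by_user_month = Counter(
    (r["user"], r["published"].strftime("%Y-%m")) for r in selected
)

# Collect each distinct NAR ID once across the selected records.
pattern = nar_pattern([2025])
nars = sorted({m for r in selected for m in pattern.findall(r["xml"])})

print(dict(by_month))  # {'2024-10': 2, '2024-11': 1}
print(nars)            # ['n2025500001', 'n2025512345']
```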
After extracting data, generate CSV and PNG reports:

    python3 generate_reports.py

This creates reports in reports/:
- marva_records_by_day_<dates>.csv - daily record counts
- marva_records_by_day_<dates>.png - histogram of daily records
- marva_records_by_month_<dates>.csv - monthly record counts
- marva_records_by_user_month_<dates>.csv - records by user by month
- marva_nars_by_day_<dates>.csv - daily NAR counts
- marva_nars_by_day_<dates>.png - histogram of daily NARs
- marva_nars_by_month_<dates>.csv - monthly NAR counts
- marva_nars_by_user_month_<dates>.csv - NARs by user by month

All reports include totals and use the date range in filenames.
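
For a sense of what a counts report with a totals row could look like, here is a small sketch using only the standard library. The `write_counts_csv` helper, the column names, and the `TOTAL` label are assumptions for illustration; the real generate_reports.py may lay its files out differently.

```python
import csv
import io


def write_counts_csv(counts, out, label="date"):
    """Write key/count rows in sorted order, followed by a totals row."""
    writer = csv.writer(out)
    writer.writerow([label, "count"])
    for key in sorted(counts):
        writer.writerow([key, counts[key]])
    # The "TOTAL" label is a hypothetical choice for the summary row.
    writer.writerow(["TOTAL", sum(counts.values())])


# Made-up daily counts, shaped like a records_by_day.json mapping might be.
daily_counts = {"2024-11-15": 1, "2024-10-01": 2}

buffer = io.StringIO()
write_counts_csv(daily_counts, buffer)
rows = buffer.getvalue().splitlines()
print(rows)  # ['date,count', '2024-10-01,2', '2024-11-15,1', 'TOTAL,3']
```

Writing to any file-like object keeps the helper easy to test; passing an open file in reports/ instead of the `StringIO` buffer would produce the file on disk.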
marva-stats/
├── bson_data/ # Place BSON export files here
├── output/ # JSON intermediate files
├── reports/ # CSV and PNG reports
├── xml_data/ # Sample XML files (gitignored)
├── extract_published.py
├── generate_reports.py
└── requirements.txt