Hi @adamzammit,
Thanks for your great work on queXML. We’re exploring it at The Social Research Centre!
I asked GitHub Copilot to generate a high-level summary of the codebase (note: no data was used to train models). I thought you might like to take a look and let me know if it's accurate.
In particular, do you see anything missing or incorrect in the file structure and navigation section? If you think it's useful, you’re welcome to adapt it for the README.
Thanks for your time!
Core Architecture
queXML is fundamentally an XML transformation engine for questionnaire processing. The system operates on a multi-layered architecture:
- XML Schema Foundation: The core is built around a custom XML schema that defines questionnaire structure (sections, questions, responses, skip logic, etc.)
- XSLT Transformation Engine: Multiple XSLT stylesheets handle transformations to different output formats:
  - to_form.xslt - Converts to XSL-FO for PDF generation
  - to_phpsurveyor.xslt - Exports to LimeSurvey CSV format
  - to_q.xslt - Generates Q language for CASES software
  - to_ddi.xslt - Creates DDI (Data Documentation Initiative) metadata
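As a rough illustration of the one-source, many-targets idea, the sketch below maps output formats to the stylesheets listed above and builds an `xsltproc` command line for each. The format keys, the helper name `xslt_command`, and the example file names are invented for this sketch. It also assumes the stylesheets can be run standalone with `xsltproc` and need no extra parameters, which is an assumption rather than something confirmed by the repository.

```python
from pathlib import Path

# Stylesheet file names come from the summary above; the dictionary keys are
# hypothetical labels chosen for this sketch only.
STYLESHEETS = {
    "xsl-fo": "to_form.xslt",
    "limesurvey": "to_phpsurveyor.xslt",
    "cases-q": "to_q.xslt",
    "ddi": "to_ddi.xslt",
}


def xslt_command(fmt: str, questionnaire: str, output: str, xslt_dir: str = ".") -> list[str]:
    """Build (but do not run) an xsltproc command line for one output format."""
    try:
        stylesheet = Path(xslt_dir) / STYLESHEETS[fmt]
    except KeyError:
        raise ValueError(f"unknown output format: {fmt!r}") from None
    # xsltproc usage: xsltproc [options] stylesheet file
    return ["xsltproc", "--output", output, str(stylesheet), questionnaire]


if __name__ == "__main__":
    # Print the command for each format instead of executing it.
    for fmt in STYLESHEETS:
        print(" ".join(xslt_command(fmt, "survey.xml", f"survey.{fmt}.out")))
```

The returned list can be passed to `subprocess.run` once the paths point at a real checkout and a real queXML file.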
Key Technical Components
PDF Generation Engine (quexmlpdf.php):
- Extends TCPDF library to create scannable paper forms
- Implements sophisticated layout management with column tracking, page breaks, and background color zones
- Generates barcodes and corner detection marks for automated scanning (queXF integration)
- Handles complex questionnaire elements: skip logic, response grids, and multi-column layouts
XML Processing Pipeline:
The system parses queXML documents and extracts structured data into arrays containing:
- Questionnaire metadata (ID, title, sections)
- Question hierarchies with response types and validation rules
- Skip logic and conditional branching instructions
- Layout specifications for different output modes
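To make the "extracts structured data into arrays" step more concrete, here is a minimal sketch in Python rather than the project's PHP. The sample document is a simplified, hypothetical approximation of a queXML file, and the element and attribute names (`section`, `question`, `response`, `varName`, `fixed`, `category`, `skipTo`) should be checked against the actual schema before relying on them. The function `parse_questionnaire` and its output shape are likewise invented for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical queXML-like document (not validated against the real schema).
SAMPLE = """\
<questionnaire id="1">
  <title>Demo survey</title>
  <section id="s1">
    <question>
      <text>Do you own a car?</text>
      <response varName="car">
        <fixed>
          <category><label>Yes</label><value>1</value></category>
          <category><label>No</label><value>2</value><skipTo>Q3</skipTo></category>
        </fixed>
      </response>
    </question>
  </section>
</questionnaire>
"""


def parse_questionnaire(xml_text: str) -> dict:
    """Flatten the XML tree into nested dicts and lists."""
    root = ET.fromstring(xml_text)
    sections = []
    for section in root.findall("section"):
        questions = []
        for question in section.findall("question"):
            response = question.find("response")
            categories = [
                {
                    "label": cat.findtext("label"),
                    "value": cat.findtext("value"),
                    "skip_to": cat.findtext("skipTo"),  # None when no skip is defined
                }
                for cat in question.findall("response/fixed/category")
            ]
            questions.append({
                "text": question.findtext("text"),
                "variable": response.get("varName") if response is not None else None,
                "categories": categories,
            })
        sections.append({"id": section.get("id"), "questions": questions})
    return {"id": root.get("id"), "title": root.findtext("title"), "sections": sections}


if __name__ == "__main__":
    print(parse_questionnaire(SAMPLE))
```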
Multi-Modal Output Generation:
- Paper forms: Creates PDF with precise positioning for optical mark recognition
- Web surveys: Generates database insertion scripts for LimeSurvey
- CATI systems: Produces Q language scripts with interviewer prompts and response validation
- Data documentation: Exports DDI-compliant metadata for statistical analysis
Advanced Features
Layout Management: The PDF engine implements a sophisticated coordinate system tracking sections, box groups, columns, and individual response boxes for precise form layout.
Skip Logic Processing: The system maintains registries (skipToRegistry, skipToIndex) to handle complex conditional branching and question dependencies across multiple output formats.
Response Type Handling: Supports diverse question types (single/multiple choice, text input, numeric, arrays) with format-specific rendering logic.
Template System: Uses property sets and layout templates to maintain consistent styling across different questionnaire types and output modes.
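For the skip logic point above, one plausible reading of a "registry plus index" design is sketched below. It is not taken from the queXML source: the function names, the `skips` key, and the data shape are all hypothetical. It only illustrates the general idea of indexing questions by variable name and resolving each skip target to a position.

```python
def build_skip_index(questions):
    """Map each question's variable name to its position in the questionnaire."""
    return {q["variable"]: i for i, q in enumerate(questions)}


def resolve_skips(questions):
    """Return (source_variable, answer_value, target_position) triples.

    Raises KeyError if a skip points at a variable that does not exist,
    which is the kind of error an index lookup makes easy to catch early.
    """
    index = build_skip_index(questions)
    resolved = []
    for q in questions:
        for value, target in q.get("skips", {}).items():
            if target not in index:
                raise KeyError(f"skip target {target!r} is not a known question")
            resolved.append((q["variable"], value, index[target]))
    return resolved


if __name__ == "__main__":
    # Hypothetical questionnaire: answering "2" to Q1 jumps straight to Q3.
    demo = [
        {"variable": "Q1", "skips": {"2": "Q3"}},
        {"variable": "Q2"},
        {"variable": "Q3"},
    ]
    print(resolve_skips(demo))
```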
The codebase essentially functions as a questionnaire compiler - taking a single XML specification and generating multiple executable survey instruments for different data collection environments while maintaining data structure consistency across all modes.
I'll analyze the codebase structure to provide a comprehensive file breakdown for navigation purposes.

Based on my analysis of the queXML codebase, here's a file breakdown for navigation that you can add to the README:
File Structure and Navigation
Core Processing Engine
- quexmlpdf.php - Main PDF generation engine extending TCPDF. Handles layout management, barcode generation, and corner detection marks for scanning, and converts queXML to scannable paper forms with precise coordinate tracking.
- index.php - Web interface for PDF generation. Provides form controls for page format, orientation, and styling options, plus file upload handling for queXML documents and style files.
- quexmlpdf_example.php - Example implementation showing how to use the queXMLPDF class programmatically, including ZIP archive generation with banding XML.
XSLT Transformation Stylesheets
Primary Output Formats
- to_form.xslt - Converts queXML to XSL-FO format for PDF generation. Handles page layout, form design principles, and paper-based questionnaire formatting with corner guides and barcodes.
- to_phpsurveyor.xslt - Transforms queXML to LimeSurvey-compatible CSV format with database insertion scripts for web-based survey deployment.
- to_q.xslt - Generates Q language scripts for CASES (Computer Assisted Survey Execution System) software, handling CATI questionnaire logic and interviewer prompts.
- to_ddi.xslt - Creates DDI (Data Documentation Initiative) compliant metadata descriptions for statistical analysis and data documentation.
Supporting Stylesheets
- to_form_page_layout.xslt - Layout templates and page structure definitions imported by the main form stylesheet.
- to_form_property_sets.xslt - Style property sets and formatting rules imported by the main form stylesheet.
- to_sda_varlist.xslt - Generates variable lists for SDA (Survey Documentation and Analysis) archive creation.
Data Processing Utilities
- ddi_to_spss.xslt - Converts DDI metadata to SPSS syntax files for statistical software integration (also GNU PSPP compatible).
- ddi_to.xslt - Reverse transformation from DDI back to basic queXML structure for document reconstruction.
Schema Definition
- quexfbanding.xsd - XML Schema defining the structure for queXF banding documents that describe physical page elements, box coordinates, and form layout specifications for automated scanning.