@norberttech
Member

@norberttech norberttech commented Jan 6, 2026

Resolves: #2125

Change Log


Added

  • Schema::isSame, used to exit the merge operation early when both schemas are the same
  • TypesMap::flowRowTypes() method for Entry-based type detection in DbalLoader
  • EnumType to TypesMap FLOW_TYPES mapping (maps to DBAL StringType)
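
The early-exit idea behind Schema::isSame can be illustrated with a minimal, self-contained sketch. `SketchSchema` and its methods are hypothetical stand-ins, not Flow's actual classes, and the "widen conflicting types to mixed" rule is an assumption made only so the example has a non-trivial merge path.

```php
<?php

declare(strict_types=1);

// Hypothetical, simplified schema: entry name => type name.
// This is NOT Flow's Schema class; only the isSame idea comes from the change log.
final class SketchSchema
{
    /** @param array<string, string> $definitions */
    public function __construct(private array $definitions)
    {
    }

    /** @return array<string, string> */
    public function definitions(): array
    {
        return $this->definitions;
    }

    // Strict array comparison: same keys, same values, same order.
    public function isSame(self $other): bool
    {
        return $this->definitions === $other->definitions;
    }

    public function merge(self $other): self
    {
        // Early exit: merging an identical schema cannot change anything,
        // so the per-definition loop below is skipped entirely.
        if ($this->isSame($other)) {
            return $this;
        }

        $merged = $this->definitions;

        foreach ($other->definitions as $name => $type) {
            if (!array_key_exists($name, $merged)) {
                $merged[$name] = $type;
            } elseif ($merged[$name] !== $type) {
                // Assumption for this sketch: conflicting types widen to "mixed".
                $merged[$name] = 'mixed';
            }
        }

        return new self($merged);
    }
}
```

When most rows share one schema, as in the identical-rows benchmark, the comparison replaces the full merge for nearly every row.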

Changed

  • Run benchmarks on CI/CD synchronously
  • Improve loader benchmark performance
  • Increase benchmark iterations on CI/CD to 3
  • Entry type is now extracted directly from Definition
  • DbalLoader now uses Entry-based type detection instead of Schema-based detection for better performance
  • Replaced DbalLoader::withTypesDetector() with withTypesMap() for simpler API
  • DbalLoader now skips processing when rows are empty

Removed

  • DbalTypesDetector class - functionality consolidated into TypesMap
  • DbalLoader::withColumnTypes() method - use withTypesMap() instead

- added Schema::isSame, used to exit the merge operation early when both
schemas are the same
- use memoization for Schema in Rows / Row
- create Schema Definition while entry is initialized
- Entry type is now extracted directly from Definition
- added: TypesMap::flowRowTypes() method for Entry-based type detection
in DbalLoader
- added: EnumType to TypesMap FLOW_TYPES mapping (maps to DBAL
StringType)
- changed: DbalLoader now uses Entry-based type detection instead of
Schema-based detection for better performance
- changed: Replaced DbalLoader::withTypesDetector() with withTypesMap()
for simpler API
- changed: DbalLoader now skips processing when rows are empty
- removed: DbalTypesDetector class - functionality consolidated into
TypesMap
- removed: DbalLoader::withSchema() method - no longer needed with
Entry-based type detection
- removed: DbalLoader::withColumnTypes() method - use withTypesMap()
instead
- removed: $schema parameter from to_dbal_table_insert() and
to_dbal_table_update() DSL functions
@norberttech norberttech added this to the 0.30.0 milestone Jan 6, 2026
@norberttech norberttech moved this from Todo to In Progress in Roadmap Jan 6, 2026
@codecov

codecov bot commented Jan 6, 2026

Codecov Report

❌ Patch coverage is 83.43195% with 28 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.10%. Comparing base (bb7b23b) to head (af77904).
⚠️ Report is 2 commits behind head on 1.x.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##              1.x    #2129      +/-   ##
==========================================
+ Coverage   82.87%   83.10%   +0.23%     
==========================================
  Files        1178     1177       -1     
  Lines       41731    41697      -34     
==========================================
+ Hits        34586    34654      +68     
+ Misses       7145     7043     -102     
+------------------------------+-----------------------------+
| Components                   | Coverage Δ                  |
+------------------------------+-----------------------------+
| etl                          | 90.55% <82.78%> (+0.82%) ⬆️ |
| cli                          | 85.76% <ø> (ø)              |
| lib-array-dot                | 95.00% <ø> (ø)              |
| lib-azure-sdk                | 60.01% <ø> (ø)              |
| lib-doctrine-dbal-bulk       | 95.14% <ø> (ø)              |
| lib-filesystem               | 80.44% <ø> (ø)              |
| lib-types                    | 88.98% <ø> (+1.51%) ⬆️      |
| lib-parquet                  | 68.25% <ø> (ø)              |
| lib-parquet-viewer           | 83.04% <ø> (ø)              |
| lib-snappy                   | 90.18% <ø> (ø)              |
| bridge-filesystem-async-aws  | 90.95% <ø> (ø)              |
| bridge-filesystem-azure      | 89.38% <ø> (ø)              |
| bridge-monolog-http          | 96.89% <ø> (ø)              |
| bridge-openapi-specification | 91.50% <ø> (ø)              |
| symfony-http-foundation      | 74.11% <ø> (ø)              |
| adapter-chartjs              | 86.33% <ø> (ø)              |
| adapter-csv                  | 89.30% <ø> (ø)              |
| adapter-doctrine             | 92.00% <88.88%> (+0.50%) ⬆️ |
| adapter-elasticsearch        | 97.02% <ø> (ø)              |
| adapter-google-sheet         | 97.05% <ø> (ø)              |
| adapter-http                 | 65.94% <ø> (ø)              |
| adapter-json                 | 89.69% <ø> (ø)              |
| adapter-logger               | 83.33% <ø> (ø)              |
| adapter-meilisearch          | 97.77% <ø> (ø)              |
| adapter-parquet              | 79.92% <ø> (ø)              |
| adapter-text                 | 86.84% <ø> (ø)              |
| adapter-xml                  | 82.66% <ø> (ø)              |
+------------------------------+-----------------------------+

- run benchmarks synchronously so each one gets the same amount of resources
- run 3 iterations of each benchmark to get more reliable values
- simplified loader benchmarks by removing redundant code
@github-actions
Contributor

github-actions bot commented Jan 6, 2026

Flow PHP - Benchmarks

Results of the benchmarks from this PR are compared with the results from 1.x branch.

Extractors
+--------------------------+--------------------------------+------+-----+------------------+-------------------+-----------------+
| benchmark                | subject                        | revs | its | mem_peak         | mode              | rstdev          |
+--------------------------+--------------------------------+------+-----+------------------+-------------------+-----------------+
| CSVExtractorBench        | bench_extract_10k              | 1    | 2   | 5.808mb +3.42%   | 324.697ms +10.11% | ±0.26% -72.37%  |
| DbalExtractorBench       | bench_extract_10k_keyset       | 1    | 2   | 35.135mb +18.38% | 246.852ms +20.55% | ±0.71% -38.94%  |
| DbalExtractorBench       | bench_extract_10k_limit_offset | 1    | 2   | 35.120mb +18.50% | 212.500ms +19.11% | ±2.69% +37.62%  |
| ExcelExtractorBench      | bench_extract_10k_ods          | 1    | 2   | 48.760mb +37.69% | 979.569ms +24.09% | ±0.60% +3.65%   |
| ExcelExtractorBench      | bench_extract_10k_xlsx         | 1    | 2   | 49.623mb +36.90% | 1.605s +15.21%    | ±0.17% -84.71%  |
| JsonExtractorBench       | bench_extract_10k              | 1    | 2   | 6.533mb +2.04%   | 1.163s +29.80%    | ±0.32% -46.91%  |
| ParquetExtractorBench    | bench_extract_10k              | 1    | 2   | 11.852mb -0.71%  | 10.031s +1.84%    | ±0.11% -46.30%  |
| PostgreSqlExtractorBench | bench_extract_10k_cursor       | 1    | 2   | 37.632mb +15.04% | 267.635ms +20.63% | ±0.35% -79.65%  |
| PostgreSqlExtractorBench | bench_extract_10k_keyset       | 1    | 2   | 37.662mb +14.94% | 450.355ms +8.45%  | ±0.68% -32.06%  |
| PostgreSqlExtractorBench | bench_extract_10k_limit_offset | 1    | 2   | 37.654mb +14.96% | 406.491ms +5.65%  | ±0.12% -91.25%  |
| TextExtractorBench       | bench_extract_10k              | 1    | 2   | 5.807mb +3.27%   | 59.864ms +8.03%   | ±0.30% -26.78%  |
| XmlExtractorBench        | bench_extract_10k              | 1    | 2   | 5.827mb +3.27%   | 673.999ms +17.75% | ±1.91% +106.47% |
+--------------------------+--------------------------------+------+-----+------------------+-------------------+-----------------+
Transformers
+---------------------------------+--------------------------+------+-----+-------------------+------------------+----------------+
| benchmark                       | subject                  | revs | its | mem_peak          | mode             | rstdev         |
+---------------------------------+--------------------------+------+-----+-------------------+------------------+----------------+
| RenameEntryTransformerBench     | bench_transform_10k_rows | 1    | 2   | 124.679mb +46.02% | 88.840ms -31.38% | ±0.44% -63.20% |
| RenameEachEntryTransformerBench | bench_transform_10k_rows | 1    | 2   | 19.874mb +34.83%  | 77.624ms -5.51%  | ±0.35% -69.88% |
+---------------------------------+--------------------------+------+-----+-------------------+------------------+----------------+
Loaders
+-----------------------+---------------------+------+-----+-------------------+-------------------+-----------------+
| benchmark             | subject             | revs | its | mem_peak          | mode              | rstdev          |
+-----------------------+---------------------+------+-----+-------------------+-------------------+-----------------+
| CSVLoaderBench        | bench_load_10k      | 1    | 2   | 47.962mb +112.86% | 113.705ms +45.46% | ±0.72% +132.91% |
| DbalLoaderBench       | bench_load_10k      | 1    | 2   | 35.105mb +208.93% | 414.732ms -62.88% | ±1.27% +0.90%   |
| ExcelLoaderBench      | bench_load_10k_ods  | 1    | 2   | 6.741mb +1420.99% | 269.765ms -77.98% | ±0.99% +99.95%  |
| ExcelLoaderBench      | bench_load_10k_xlsx | 1    | 2   | 11.409mb +800.51% | 275.518ms -86.49% | ±0.03% -92.26%  |
| JsonLoaderBench       | bench_load_10k      | 1    | 2   | 82.794mb +23.31%  | 96.855ms -20.89%  | ±0.74% +11.02%  |
| ParquetLoaderBench    | bench_load_10k      | 1    | 2   | 794.702mb -83.63% | 2.158s -91.31%    | ±0.37% -62.24%  |
| PostgreSqlLoaderBench | bench_load_10k      | 1    | 2   | 37.642mb +192.64% | 380.536ms -55.38% | ±0.11% -93.30%  |
| TextLoaderBench       | bench_load_10k      | 1    | 2   | 19.264mb +456.83% | 33.695ms -28.78%  | ±0.55% -54.80%  |
+-----------------------+---------------------+------+-----+-------------------+-------------------+-----------------+
Building Blocks
+-------------------+-----------------------------------+------+-----+-------------------+-------------------+-----------------+
| benchmark         | subject                           | revs | its | mem_peak          | mode              | rstdev          |
+-------------------+-----------------------------------+------+-----+-------------------+-------------------+-----------------+
| EntryFactoryBench | bench_entry_factory               | 1    | 2   | 107.276mb +19.16% | 577.323ms +21.17% | ±0.95% +42.61%  |
| EntryFactoryBench | bench_entry_factory               | 1    | 2   | 56.673mb +18.25%  | 289.971ms +20.06% | ±0.75% +8.96%   |
| EntryFactoryBench | bench_entry_factory               | 1    | 2   | 16.355mb +10.68%  | 63.899ms +20.53%  | ±0.15% -55.86%  |
| RowsBench         | bench_chunk_1_000_on_10k          | 1    | 2   | ERR               | 237.500μs +0.00%  | ±0.21% +0.00%   |
| RowsBench         | bench_diff_left_100_on_1k         | 1    | 2   | ERR               | 84.347ms +0.00%   | ±0.27% +0.00%   |
| RowsBench         | bench_diff_right_100_on_1k        | 1    | 2   | ERR               | 82.329ms +0.00%   | ±2.46% +0.00%   |
| RowsBench         | bench_drop_100_on_1k              | 1    | 2   | ERR               | 16.000μs +0.00%   | ±6.25% +0.00%   |
| RowsBench         | bench_drop_right_10_on_1k         | 1    | 2   | ERR               | 16.000μs +0.00%   | ±0.00% 0.00%    |
| RowsBench         | bench_entries_on_1k               | 1    | 2   | ERR               | 64.500μs +0.00%   | ±8.53% +0.00%   |
| RowsBench         | bench_filter_on_1k                | 1    | 2   | ERR               | 305.495μs +0.00%  | ±0.82% +0.00%   |
| RowsBench         | bench_find_on_1k                  | 1    | 2   | ERR               | 305.000μs +0.00%  | ±3.93% +0.00%   |
| RowsBench         | bench_find_one_on_1k              | 1    | 2   | ERR               | 297.499μs +0.00%  | ±0.17% +0.00%   |
| RowsBench         | bench_first_on_1k                 | 1    | 2   | ERR               | 2.000μs +0.00%    | ±0.00% 0.00%    |
| RowsBench         | bench_merge_100_on_1k             | 1    | 2   | ERR               | 13.500μs +0.00%   | ±3.70% +0.00%   |
| RowsBench         | bench_partition_by_on_1k          | 1    | 2   | ERR               | 1.535ms +0.00%    | ±4.36% +0.00%   |
| RowsBench         | bench_schema_on_1k_identical_rows | 1    | 2   | ERR               | 221.979ms +0.00%  | ±0.05% +0.00%   |
| RowsBench         | bench_sort_asc_on_1k              | 1    | 2   | 93.812mb +18.14%  | 4.974ms -88.30%   | ±0.41% -85.56%  |
| RowsBench         | bench_sort_by_on_1k               | 1    | 2   | 93.813mb +18.16%  | 5.014ms -87.84%   | ±0.03% -96.69%  |
| RowsBench         | bench_sort_desc_on_1k             | 1    | 2   | 93.812mb +18.17%  | 5.015ms -87.88%   | ±0.79% -45.47%  |
| RowsBench         | bench_sort_entries_on_1k          | 1    | 2   | 95.871mb +16.46%  | 5.023ms -49.67%   | ±1.22% -2.62%   |
| RowsBench         | bench_unique_on_1k                | 1    | 2   | 112.569mb -1.46%  | 457.064ms +30.77% | ±3.17% +108.37% |
| TypeDetectorBench | bench_type_detector               | 1    | 2   | 43.534mb -0.00%   | 332.762ms -0.98%  | ±0.17% -75.84%  |
| TypeDetectorBench | bench_type_detector               | 1    | 2   | 12.592mb -0.01%   | 67.152ms +2.01%   | ±0.70% +15.77%  |
+-------------------+-----------------------------------+------+-----+-------------------+-------------------+-----------------+
Parquet Library
+--------------------+---------------------------------+------+-----+------------------+-------------------+-----------------+
| benchmark          | subject                         | revs | its | mem_peak         | mode              | rstdev          |
+--------------------+---------------------------------+------+-----+------------------+-------------------+-----------------+
| ParquetWriterBench | bench_write_batch               | 1    | 2   | 12.704mb -12.72% | 193.990ms -6.30%  | ±0.29% -44.95%  |
| ParquetWriterBench | bench_write_gzip                | 1    | 2   | 11.364mb +0.00%  | 219.070ms +6.93%  | ±0.62% -27.28%  |
| ParquetWriterBench | bench_write_row_by_row          | 1    | 2   | 12.704mb -12.72% | 195.834ms -5.42%  | ±1.51% +362.87% |
| ParquetWriterBench | bench_write_snappy              | 1    | 2   | 12.704mb -12.72% | 192.496ms -7.47%  | ±0.20% -71.25%  |
| ParquetWriterBench | bench_write_uncompressed        | 1    | 2   | 11.088mb +0.00%  | 190.914ms +11.56% | ±0.99% +79.26%  |
| ParquetReaderBench | bench_page_headers              | 1    | 2   | 7.897mb -0.02%   | 1.883s +1.95%     | ±1.55% -8.33%   |
| ParquetReaderBench | bench_read_metadata             | 1    | 2   | 6.526mb -0.02%   | 8.277ms +0.06%    | ±0.57% -51.98%  |
| ParquetReaderBench | bench_read_schema               | 1    | 2   | 6.526mb -0.02%   | 8.269ms +0.51%    | ±0.23% -90.89%  |
| ParquetReaderBench | bench_read_values_all_columns   | 1    | 2   | 10.102mb -0.20%  | 5.558s -25.60%    | ±0.06% -87.93%  |
| ParquetReaderBench | bench_read_values_single_column | 1    | 2   | 7.399mb -0.27%   | 215.170ms -52.86% | ±0.46% -40.63%  |
| ParquetReaderBench | bench_read_values_with_limit    | 1    | 2   | 7.927mb -0.41%   | 18.716ms -19.96%  | ±0.04% -97.52%  |
+--------------------+---------------------------------+------+-----+------------------+-------------------+-----------------+

@norberttech norberttech merged commit 6d3ab28 into 1.x Jan 6, 2026
23 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in Roadmap Jan 6, 2026
@norberttech norberttech deleted the schema-performance branch January 6, 2026 12:55
Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

[Proposal]: possibly cache `Flow\ETL\Rows::schema`

2 participants