* September bulletin
Signed-off-by: Adam Gutglick <adam@spiraldb.com>
* Improve identation
Signed-off-by: Adam Gutglick <adam@spiraldb.com>
---------
Signed-off-by: Adam Gutglick <adam@spiraldb.com>
excerpt: "Overview of all work happening in Vortex"
published: true
---
It's been a month since Vortex was contributed to the Linux Foundation, and it's been amazing to see the response and community interest. More and more people are taking a deep look at Vortex for their various use cases, and some are even moving real production workloads to it!

This month we accepted 232 commits from 16 different (mostly) human contributors. Together we’ve done a few cool things:

1. Added a new Ray Datasource, allowing users to read directories of Vortex files [#4339](https://github.com/vortex-data/vortex/pull/4339)
1. Improved DuckDB support: list arrays can now be passed to and from Vortex [#4169](https://github.com/vortex-data/vortex/pull/4169)
1. Improved Arrow and Apache DataFusion support [#4157](https://github.com/vortex-data/vortex/pull/4157) [#4227](https://github.com/vortex-data/vortex/pull/4227) [#4237](https://github.com/vortex-data/vortex/pull/4237) [#4180](https://github.com/vortex-data/vortex/pull/4180)
1. Improved Spark support, including expanding Java platform support to Ubuntu 20.04+ and Amazon Linux 2023 [#4246](https://github.com/vortex-data/vortex/pull/4246) [#4335](https://github.com/vortex-data/vortex/pull/4335)
1. Added a new scalar type and array: FixedSizeList [#4385](https://github.com/vortex-data/vortex/pull/4385) [#4405](https://github.com/vortex-data/vortex/pull/4405) [#4428](https://github.com/vortex-data/vortex/pull/4428)
1. Started laying the groundwork for major changes to the IO system and better ways of supporting different async runtimes. More details to come soon!
1. Significantly improved the Rust API by increasing test and documentation coverage, making APIs more consistent, and fixing more bugs and other small issues than we can count.
1. Added two new benchmarks to our benchmarking suite:
   - TPC-DS, comparing DuckDB and Apache DataFusion, running on Vortex, Parquet and DuckDB’s native format [#4155](https://github.com/vortex-data/vortex/pull/4155)
   - StatPopGen, our own collection of [Statistical Genetics](https://en.wikipedia.org/wiki/Statistical_genetics) queries rendered in SQL over a [VCF-like](https://en.wikipedia.org/wiki/Variant_Call_Format) schema. Our [benchmarks website](https://bench.vortex.dev/?group=Statistical+and+Population+Genetics) presents the results, per commit, of executing these queries in DuckDB on 100,000 rows of [gnomAD's 1kg+HGDP dataset](https://gnomad.broadinstitute.org/news/2020-10-gnomad-v3-1-new-content-methods-annotations-and-data-availability/#the-gnomad-hgdp-and-1000-genomes-callset) stored in Parquet, Vortex, and Vortex-Compact (a version of Vortex with the [pco](https://github.com/pcodec/pcodec) and zstd encodings). This is the only benchmark in our suite that includes queries on list-typed columns. [#4175](https://github.com/vortex-data/vortex/pull/4175)

We intend to start yanking some older Vortex versions that have known bugs, in order to improve the experience for new users while we improve overall stability and correctness.

We want to thank everyone who has tried Vortex, provided feedback, asked questions, and filed issues.

Special thanks go to all the contributors who took the time and care to contribute to Vortex this month (in descending order of commit count):