Replies: 15 comments 2 replies
- The current favorite API for me is the SQL API, but I would love to have a great native Spark API; currently I use sqlframe for this. The relational API feels to me like a mixture of magic strings (like the SQL API) and some Python-native API.
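A minimal sketch of the sqlframe pattern mentioned above, assuming the sqlframe package with its DuckDB backend; the data and column names are made up for illustration:

```python
# sqlframe exposes a PySpark-style DataFrame API that compiles to SQL
# and executes it on DuckDB, so no Spark cluster is required.
from sqlframe.duckdb import DuckDBSession
from sqlframe.duckdb import functions as F

session = DuckDBSession.builder.getOrCreate()

# Hypothetical toy data.
df = session.createDataFrame(
    [{"id": 1, "grp": "a"}, {"id": 2, "grp": "b"}, {"id": 3, "grp": "a"}]
)

# Familiar PySpark-style transformations, executed by DuckDB underneath.
df.groupBy("grp").agg(F.count("*").alias("n")).show()
```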
- Honestly, my vote comes from not having been aware of the other options. The Spark API has my antenna up.
- I prefer Ibis, as it aligns a lot more with the dataframe way of thinking and the code is significantly more compact than SQL. Any DuckDB feature not supported by Ibis can be called using inline SQL or a UDF.
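A minimal sketch of that escape hatch, assuming the Ibis DuckDB backend; the file, table, and column names are illustrative:

```python
import ibis

con = ibis.duckdb.connect()  # in-memory DuckDB
t = con.read_csv("events.csv", table_name="events")  # hypothetical file

# Compact dataframe-style code instead of a SQL string:
daily = t.group_by("user_id").agg(n_events=t.count())

# Escape hatch: anything Ibis doesn't cover can be written as inline SQL
# against the same connection (UDFs are the other option mentioned above).
raw = con.sql("SELECT user_id, count(*) AS n_events FROM events GROUP BY 1")
```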
- We used the DuckDB Python API to run our SQL worker layer for processing Snowflake BI queries without turning on a warehouse. This heavily leverages the SQL API.
- SQL API or Ibis, depends on the actual use case.
- Mostly SQL with dbt; some Ibis. Edit: SQLMesh soon.
- Both my colleagues and I have run into difficult-to-track-down bugs using the relational API, so we try to do everything with explicit SQL, which results in fewer surprises. In particular, we've had issues when passing DuckDBPyRelation objects into functions. In summary: I like the idea of the relational API, and use it for simple scripts, but steer clear of it for library/prod code.
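A minimal sketch of that "explicit SQL only" style, keeping relation objects out of function signatures; the database file, table, and column names are made up:

```python
import duckdb

con = duckdb.connect("data.db")  # hypothetical database file

def top_customers(con: duckdb.DuckDBPyConnection, min_total: float):
    # Pass the connection and plain values between functions; the query
    # is an explicit SQL string, and only materialized results come back,
    # so no DuckDBPyRelation crosses a function boundary.
    return con.execute(
        """
        SELECT customer_id, sum(amount) AS total
        FROM orders
        GROUP BY customer_id
        HAVING sum(amount) >= ?
        ORDER BY total DESC
        """,
        [min_total],
    ).fetchall()
```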
- I'm mainly using DuckDB through Ibis. We have quite a bit of data in BigQuery, and Ibis lets me run some operations on either BigQuery or DuckDB without changing much code at all. It's great when I do dev and experimentation with DuckDB and then move to prod in BigQuery. It's also personal preference: I really like R's tidyverse syntax, and Ibis is the closest there is in Python.
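A minimal sketch of that dev-on-DuckDB / prod-on-BigQuery pattern, assuming both Ibis backends are installed; the project, table, and column names are hypothetical:

```python
import ibis
from ibis import _  # deferred expression helper

def connect(env: str):
    if env == "dev":
        return ibis.duckdb.connect()  # local, in-memory
    return ibis.bigquery.connect(project_id="my-project")  # hypothetical project

con = connect("dev")
t = con.table("events")

# The same expression compiles to either backend's SQL dialect.
daily = (
    t.filter(_.status == "ok")
     .group_by("day")
     .agg(n=_.count())
)
print(daily.to_pandas())
```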
- I use the relational API pretty heavily, and TBH it feels really messy and inconsistent at the moment. I'm never quite sure if I'm supposed to provide a Python …
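A minimal sketch of one such string-vs-object ambiguity in the relational API, assuming the duckdb package's Expression API (`ColumnExpression`/`ConstantExpression`):

```python
import duckdb

rel = duckdb.sql("SELECT * FROM range(10) AS t(i)")

# Both of these are accepted, which is convenient but easy to mix up:
a = rel.filter("i > 5")  # a SQL snippet as a plain string
b = rel.filter(
    duckdb.ColumnExpression("i") > duckdb.ConstantExpression(5)
)  # Python expression objects
```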
- Previously I used the SQL API in conjunction with Pandas and pyodbc within JupyterLab to mix and match queries against various databases and to query dataframes in memory. As I'm exploring Marimo now, I'm starting to rely on DuckDB more directly where DuckDB and Pandas overlap, and Marimo's SQL cells make it much more pleasant to query in almost an IDE, with autocomplete and linting, instead of simply wrangling text like I'm used to doing in JupyterLab. Love this stuff, can't get enough of the SQL API, and I'm trying to encourage Python and CLI usage of DuckDB throughout my company where it makes sense for automation and reporting analytics 💙😎🙌
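A minimal sketch of the dataframes-in-memory part: DuckDB's replacement scans let a SQL query refer to a local pandas DataFrame by its variable name (the data here is illustrative).

```python
import duckdb
import pandas as pd

df = pd.DataFrame({"region": ["eu", "us", "eu"], "sales": [10, 20, 30]})

# `df` is found by name in the enclosing Python scope.
result = duckdb.sql("SELECT region, sum(sales) AS total FROM df GROUP BY region")
print(result.df())  # back to a pandas DataFrame
```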
- The SQL API. I would prefer to see an official SQLAlchemy dialect/adapter maintained rather than DuckDB developing its own Relational API.
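For what it's worth, a community-maintained dialect already exists: the duckdb_engine package registers a "duckdb" URL scheme with SQLAlchemy. A minimal sketch, assuming that package is installed:

```python
import sqlalchemy as sa

# duckdb_engine only needs to be installed; SQLAlchemy discovers the
# dialect through entry points when it sees the "duckdb" scheme.
engine = sa.create_engine("duckdb:///:memory:")

with engine.connect() as conn:
    print(conn.execute(sa.text("SELECT 42")).fetchone())
```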
- I would love to use the Spark API, but it lacked too many features. I settled on the Relational API, but it is messy. I would prefer to avoid SQL string manipulation as much as possible.
- The SQL API. I just love SQL... I'm familiar with it, and it's waaaay simpler than having to learn a new library. I use it in Amphi (https://github.com/amphi-ai/amphi-etl) to do some SQL on a pandas dataframe, but also as an execution engine for some tools (such as Join or Compare Dataframe). Best regards, Simon
- I've started using Marimo for all queries with any degree of complexity. Autocompletion and autoformatting are very helpful.
- I integrated DuckDB's Spark-compatible API so the AI agent can execute and validate generated PySpark logic locally, catching syntax and semantic issues early while avoiding the overhead of spinning up a real Spark environment. So my vote goes to DuckDB's Spark API.
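A minimal sketch of that local-validation setup, using DuckDB's experimental Spark-compatible API (module paths as documented for recent duckdb releases; the data is illustrative):

```python
import pandas as pd
from duckdb.experimental.spark.sql import SparkSession
from duckdb.experimental.spark.sql.functions import col, lit

spark = SparkSession.builder.getOrCreate()

# PySpark-style code, executed by DuckDB locally: no cluster needed,
# so generated logic can be sanity-checked quickly.
df = spark.createDataFrame(pd.DataFrame({"id": [1, 2], "grp": ["a", "b"]}))
df = df.withColumn("id_plus_one", col("id") + lit(1))
df.show()
```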
-
Maybe you come from the SQL world and are just looking to automate some queries against your warehouse...
Maybe you come from the Spark world and want to have a more `DataFrame -> func() -> DataFrame` kind of workflow...
We want to know which API you use the most (or would love to use more if it had better support)! The options are:
- The SQL API
- The Relational API
- The Spark API
It can also be the case that you like none of these, in which case we also want to know!
PS. If you like using projects like ibis or narwhals, let us know why in the comments too!
124 votes