🦆 DuckDB

DuckDB provides an embedded OLAP database in the style of SQLite. It's fast, robust, and conceptually small, so you can get started with analytics easily. Many languages are supported (Node.js, Python, etc.), so you can get started quickly!

DuckDB is an in-process SQL OLAP database management system. Simple, feature-rich, fast & open source.

DuckDB (code) is the SQLite of OLAP queries. That's high praise, since SQLite is the most widely deployed database on the planet and is legendary for its stability, performance (in use cases that fit), and testing.

DuckDB has client support for lots of languages:

  • Node.js
  • Python
  • R
  • Java
  • Julia
  • C++

And it's only growing in popularity. Check out the DuckDB documentation for yourself – it's a well-organized read and should make it easy to understand how DuckDB works.

Much like SQLite, DuckDB has an excellent "why DuckDB" page that you should read.

Getting started with DuckDB

If you've got some data (for example in a CSV), here's how to get started with DuckDB.

First, import the data:

CREATE TABLE your_table AS SELECT * FROM read_csv_auto('your-input.csv');

(yep, it's that easy!)
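
After an import like the one above, it can be worth checking what DuckDB inferred from the file. The following is a minimal sketch, assuming the same placeholder names (your_table, your-input.csv) as the import statement; the row_count alias is only illustrative.

```sql
-- show the column names and types that read_csv_auto inferred
DESCRIBE your_table;

-- count the imported rows (row_count is an illustrative alias)
SELECT COUNT(*) AS row_count FROM your_table;

-- peek at the CSV directly, without creating a table first
SELECT * FROM read_csv_auto('your-input.csv') LIMIT 5;
```

If a column comes back with an unexpected type, that is usually the first thing to fix before running any analysis on the table.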

Then you can select and start to do normal queries from that table:

SELECT * FROM your_table;

Of course, a wide range of SQL-standard statements is supported. You can read more about SELECT in the documentation:

-- select all columns from the table "tbl"
SELECT * FROM tbl;
-- select column "j" from the rows of tbl where "i" equals 3
SELECT j FROM tbl WHERE i=3;
-- perform an aggregate grouped by the column "i"
SELECT i, SUM(j) FROM tbl GROUP BY i;
-- select only the 3 rows with the largest values of "i"
SELECT * FROM tbl ORDER BY i DESC LIMIT 3;
-- join two tables together using the USING clause
SELECT * FROM t1 JOIN t2 USING(a, b);
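
To try these queries without any data of your own, a small self-contained sketch like the one below should work. The table definitions, the sample values, and the x, y and total_j names are all made up for illustration; only the column names i, j, a and b come from the queries above.

```sql
-- hypothetical sample table with the columns "i" and "j"
CREATE TABLE tbl (i INTEGER, j INTEGER);
INSERT INTO tbl VALUES (1, 10), (3, 20), (3, 30), (5, 40);

-- filter: returns the j values 20 and 30
SELECT j FROM tbl WHERE i = 3;

-- aggregate: one row per distinct i, giving (1, 10), (3, 50), (5, 40)
SELECT i, SUM(j) AS total_j FROM tbl GROUP BY i ORDER BY i;

-- hypothetical tables for the join; they share the columns "a" and "b"
CREATE TABLE t1 (a INTEGER, b INTEGER, x VARCHAR);
CREATE TABLE t2 (a INTEGER, b INTEGER, y VARCHAR);
INSERT INTO t1 VALUES (1, 1, 'left-1'), (2, 2, 'left-2');
INSERT INTO t2 VALUES (1, 1, 'right-1'), (3, 3, 'right-3');

-- USING (a, b) matches rows whose a and b are equal in both tables
-- and outputs a and b only once: (1, 1, 'left-1', 'right-1')
SELECT * FROM t1 JOIN t2 USING (a, b);
```

The ORDER BY on the aggregate is there only to make the output order predictable; without it, the groups may come back in any order.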