ADBC (Arrow Database Connectivity) is a database access API designed around Apache Arrow, enabling efficient columnar data interchange between applications and databases. Unlike traditional row-oriented interfaces, ADBC allows you to retrieve query results directly as Arrow tables, avoiding costly row-to-column conversions.
This notebook walks through connecting to ClickHouse from Python using ADBC with the standard DBAPI interface. You'll learn how to execute queries and fetch results as Arrow tables.
Requirements:
- Python 3
- ClickHouse or Docker
Setup¶
Install the required dependencies:
%pip install -q dbc adbc-driver-manager pyarrow
Note: you may need to restart the kernel to use updated packages.
Install the ClickHouse ADBC driver:
!dbc install -q --pre clickhouse
If you don't already have a ClickHouse instance running, start an instance with Docker:
!docker run -d --rm --name some-clickhouse-server \
-p 8123:8123 \
-e CLICKHOUSE_USER=user \
-e CLICKHOUSE_PASSWORD=pass \
clickhouse/clickhouse-server
1d1feb7c837387e9ee097e238eee1e81023aa4646703cf48c2a7703411aa6777
Import PyArrow and the ADBC driver manager:
import pyarrow as pa
from adbc_driver_manager import dbapi
Connection and Cursor¶
Create a DBAPI-style connection with the ClickHouse database:
connection = dbapi.connect(
driver="clickhouse",
db_kwargs={"uri": "http://localhost:8123", "username": "user", "password": "pass"},
autocommit=True,
)
Create a cursor:
cursor = connection.cursor()
Query Execution¶
Execute a SQL query and get the results via the normal, row-oriented DBAPI interface:
cursor.execute("SELECT 1 AS id, 'Alice' AS name")
cursor.fetchone()
(1, 'Alice')
Alternatively, get the results as a PyArrow Table:
cursor.execute("SELECT 1 AS id, 'Alice' AS name")
cursor.fetch_arrow_table()
pyarrow.Table id: uint8 not null name: string not null ---- id: [[1]] name: [["Alice"]]
Parameters can be bound to queries by passing a single-row PyArrow Table where column names match the {name: Type} placeholders in the query:
cursor.execute(
"SELECT {num: UInt64} + 1 AS favorite_num",
pa.table({"num": pa.array([2], type=pa.uint64())}),
)
cursor.fetch_arrow_table()
pyarrow.Table favorite_num: uint64 not null ---- favorite_num: [[3]]
Cleanup¶
Close the cursor and connection:
cursor.close()
connection.close()
If you ran ClickHouse with Docker earlier, stop and remove the container:
!docker stop some-clickhouse-server
some-clickhouse-server