# qroissant

qroissant is a minimal q/kdb+ IPC client library with first-class support for the Apache Arrow ecosystem.

- **Lightweight**: qroissant is a minimal library weighing in at less than 4 MiB, with no required dependencies.
- **Fast**: qroissant is written in Rust, a safe, high-performance systems programming language, and makes the most of your hardware through zero-copy data handling, multithreading, and SIMD vectorization.
- **Modular**: qroissant relies heavily on the [Apache Arrow PyCapsule Interface](https://arrow.apache.org/docs/format/CDataInterface/PyCapsuleInterface.html) to exchange data with other libraries in the Apache Arrow ecosystem without copying. This includes pyarrow, polars, duckdb, pandas, datafusion, and more.
- **Type hints**: qroissant provides type annotations for all of its functionality.

---

## Installation

```bash
pip install qroissant
```

Requires Python 3.10+. Wheels are available for Linux (x86_64, aarch64), macOS (universal2), and Windows (x86_64).

---

## Quick start

### Connect and query

```python
import qroissant as q

endpoint = q.Endpoint.tcp("localhost", 5000)

with q.Connection(endpoint) as conn:
    result = conn.query("select from trade where date = .z.d")
    print(result)  # Table
```

### To Arrow / Polars / PyArrow

Decoded values implement the Arrow PyCapsule protocol, so you can pass them straight to any Arrow-aware library:

```python
import polars as pl
import pyarrow as pa

with q.Connection(endpoint) as conn:
    table = conn.query("select from trade")

# zero-copy: no intermediate Python objects
df = pl.from_arrow(table)
pa_table = pa.table(table)  # consumes the table's __arrow_c_stream__ export
```
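
Arrow-aware consumers typically choose an import path by checking which PyCapsule method an object exposes. The sketch below illustrates that dispatch with hypothetical stand-in classes (`FakeTable`, `FakeVector`) rather than real qroissant values, so it runs without a q process. `arrow_export_kind` is an illustrative helper, not part of qroissant or of any Arrow library.

```python
def arrow_export_kind(obj: object) -> str:
    """Classify an object by the Arrow PyCapsule method it exposes."""
    if hasattr(obj, "__arrow_c_stream__"):
        return "stream"  # tabular data, consumed batch by batch
    if hasattr(obj, "__arrow_c_array__"):
        return "array"   # a single array
    return "none"


class FakeTable:
    """Hypothetical stand-in for a value exported as an Arrow stream."""

    def __arrow_c_stream__(self, requested_schema=None):
        raise NotImplementedError("stand-in only")


class FakeVector:
    """Hypothetical stand-in for a value exported as an Arrow array."""

    def __arrow_c_array__(self, requested_schema=None):
        raise NotImplementedError("stand-in only")


print(arrow_export_kind(FakeTable()))   # stream
print(arrow_export_kind(FakeVector()))  # array
print(arrow_export_kind(42))            # none
```

With pyarrow, a stream-exporting object is usually passed to `pa.table(...)` and an array-exporting one to `pa.array(...)`.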

### Async

```python
import asyncio
import qroissant as q

async def main():
    endpoint = q.Endpoint.tcp("localhost", 5000)
    async with q.AsyncConnection(endpoint) as conn:
        result = await conn.query("1 + 1")
        print(result)  # Atom → 2

asyncio.run(main())
```

### Connection pool

```python
pool_opts = q.PoolOptions(
    max_size=10,
    min_idle=2,
    checkout_timeout_ms=5_000,
    test_on_checkout=True,
)

with q.Pool(endpoint, pool=pool_opts) as pool:
    pool.prewarm()                      # open idle connections eagerly
    result = pool.query("count trade")  # checked out and returned automatically
    print(pool.metrics())               # PoolMetrics(connections=2, idle=2, …)
```

### Streaming raw response

For large results, you can stream the raw IPC bytes before decoding:

```python
with q.Connection(endpoint) as conn:
    with conn.query("select from trade", raw=True) as resp:
        print(resp.header)     # MessageHeader(size=…, compression=…)
        value = resp.decode()  # decode on demand
```
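
The header printed above corresponds to the fixed 8-byte prefix of a q IPC message: one byte each for endianness, message type, and a compression flag, one reserved byte, then a 4-byte total message length. A minimal sketch of parsing that prefix by hand, assuming this layout, is shown below. `RawHeader` and `parse_header` are illustrative names, not qroissant API, and the sample frame is built by hand.

```python
import struct
from typing import NamedTuple


class RawHeader(NamedTuple):
    little_endian: bool
    message_type: int  # 0 = async, 1 = sync, 2 = response
    compressed: bool
    size: int          # total message length in bytes, header included


def parse_header(frame: bytes) -> RawHeader:
    """Parse the 8-byte q IPC header at the start of `frame`."""
    if len(frame) < 8:
        raise ValueError("a q IPC header is 8 bytes long")
    little_endian = frame[0] == 1
    # The length field uses the byte order declared in the first byte.
    (size,) = struct.unpack("<I" if little_endian else ">I", frame[4:8])
    return RawHeader(little_endian, frame[1], frame[2] == 1, size)


# Hand-built frame: a little-endian async message carrying the int atom 1i.
frame = bytes.fromhex("010000000d000000fa01000000")
header = parse_header(frame)
print(header)
```

Reading the size first lets a client know how many bytes remain on the socket before it decides whether to decode them.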

### Standalone encode / decode

```python
# decode an IPC payload you already have
payload: bytes = ...
value = q.decode(payload)

# encode a value back to IPC bytes
frame = q.encode(value, message_type=q.MessageType.SYNCHRONOUS)
```

---

## Value types

Every `conn.query()` call returns a `Value` subclass:

| q type | Python type | Arrow export |
|--------|-------------|--------------|
| scalar (atom) | `Atom` | `__arrow_c_array__` |
| typed list | `Vector` | `__arrow_c_array__` |
| mixed list | `List` | `__arrow_c_array__` |
| dictionary | `Dictionary` | `__arrow_c_array__` (StructArray) |
| table | `Table` | `__arrow_c_stream__` |

---

## Decode options

Control how IPC data is projected into Arrow:

```python
opts = (
    q.DecodeOptions.builder()
    .with_symbol_interpretation(q.SymbolInterpretation.DICTIONARY)  # dict-encode symbols
    .with_temporal_nulls(True)          # map q null sentinels → None
    .with_treat_infinity_as_null(True)  # map ±∞ → None
    .with_parallel(True)                # decode table columns in parallel
    .build()
)

with q.Connection(endpoint, options=opts) as conn:
    result = conn.query("select from trade")
```
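
To make the null and infinity options concrete: q has no separate validity mask, so a null long (`0N`) is stored as the minimum 64-bit integer and the long infinities (`0W`, `-0W`) as the maximum value and its negation. Timestamps are stored as 64-bit integers and use the same null sentinel. The pure-Python sketch below mimics the kind of mapping these options describe for long values. It is an illustration under those assumptions, not qroissant's decoder, and `project_longs` is a hypothetical helper.

```python
Q_NULL_LONG = -(2**63)   # 0N
Q_INF_LONG = 2**63 - 1   # 0W; -0W is its negation


def project_longs(values, treat_infinity_as_null=False):
    """Replace q sentinel values with None, leaving ordinary values untouched."""
    out = []
    for v in values:
        if v == Q_NULL_LONG:
            out.append(None)
        elif treat_infinity_as_null and abs(v) == Q_INF_LONG:
            out.append(None)
        else:
            out.append(v)
    return out


raw = [1, Q_NULL_LONG, Q_INF_LONG, -Q_INF_LONG, 42]
print(project_longs(raw))  # only the null becomes None
print(project_longs(raw, treat_infinity_as_null=True))  # [1, None, None, None, 42]
```
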

---

## Endpoints

```python
# TCP
endpoint = q.Endpoint.tcp(
    "localhost", 5000,
    username="user",
    password="pass",
    timeout_ms=3_000,
)

# Unix domain socket
endpoint = q.Endpoint.unix(
    "/tmp/qroissant.sock",
    username="user",
    password="pass",
)
```

---

## Error handling

```python
from qroissant import (
    QroissantError,   # base class
    DecodeError,      # malformed IPC payload
    ProtocolError,    # bad frame header
    TransportError,   # socket / IO failure
    QRuntimeError,    # q process returned an error
    PoolError,        # pool management failure
    PoolClosedError,  # operation on a closed pool
)

# `conn` is an open connection, as in the Quick start examples
try:
    result = conn.query("invalid expression")
except QRuntimeError as e:
    print(f"q error: {e}")
except TransportError as e:
    print(f"connection lost: {e}")
```
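
One common pattern on top of this hierarchy is to retry transient transport failures while letting q-side errors propagate immediately. The sketch below shows the control flow with a stand-in exception class so that it runs without a q process; in real code you would catch `qroissant.TransportError` instead. `with_retry` and `flaky_query` are hypothetical helpers, not part of the library.

```python
import time


class TransportError(Exception):
    """Stand-in for qroissant.TransportError."""


def with_retry(run, attempts=3, delay_s=0.0):
    """Call `run()`, retrying on TransportError up to `attempts` tries in total."""
    for attempt in range(1, attempts + 1):
        try:
            return run()
        except TransportError:
            if attempt == attempts:
                raise  # out of attempts: surface the last failure
            time.sleep(delay_s * attempt)  # linear backoff between tries


calls = {"n": 0}


def flaky_query():
    """Fail twice, then succeed, to simulate a dropped connection."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransportError("connection reset")
    return 2


result = with_retry(flaky_query)
print(result, calls["n"])  # 2 3
```
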

---

## Architecture

qroissant is organized as a Rust workspace with strict crate boundaries:

```
crates/
├── qroissant-core        # q protocol, value types, encode/decode
├── qroissant-transport   # sync & async TCP/Unix socket connections
├── qroissant-arrow       # zero-copy Arrow projection
├── qroissant-kernels     # SIMD / nightly-sensitive hot paths
└── qroissant-python      # PyO3 bindings (the _native extension module)
```

The Python package at `python/qroissant/` re-exports everything from the compiled `_native` extension. The `.pyi` stub files in that directory define the public API contract.

---

## Development

```bash
# Install Python dependencies
uv sync --group dev --group docs

# Build the Rust extension (required before running Python tests)
uv run maturin develop

# Run tests
uv run pytest
cargo test --workspace

# Lint and format
uv run ruff check python/ tests/
cargo fmt --all
```

Transport integration tests require a q binary. Set `Q_BIN` to the path of your q executable before running `pytest`.

---

## License

Apache 2.0. See [LICENSE](LICENSE).