attempt 3
parent de214336a9
commit 7c3fbd73fd
12 changed files with 594832 additions and 64277 deletions
114
BUILD.md
Normal file
@@ -0,0 +1,114 @@
# Building qroissant wheels

This project ships as a [maturin](https://www.maturin.rs/) / PyO3 mixed
Rust+Python package. The native extension uses `#![feature(portable_simd)]` so
**nightly Rust is required**, selected via `rust-toolchain.toml` (which tracks
the floating `nightly` channel, not a dated one).

All builds run inside a Docker image (`scripts/Dockerfile.build`) so the host
needs only Docker and Python 3: no Rust, no MSVC SDK, no mingw to install.

## Prerequisites

- Docker (tested with 29.x)
- Python 3.11+ on the host, **only** if you want to install/test the produced
  Linux wheel locally via `scripts/build.sh check`.

## One-time setup

```sh
scripts/build.sh image
```

Builds `qroissant-build:latest`: pulls `ghcr.io/rust-cross/cargo-xwin` and
installs the nightly toolchain, the `x86_64-pc-windows-msvc` target, and maturin.
Takes about 5 minutes on a cold cache; reruns reuse the Docker layer cache and
finish quickly.

## Build commands

```sh
scripts/build.sh linux        # -> dist-linux/qroissant-*-manylinux_*_x86_64.whl
scripts/build.sh windows      # -> dist-windows/qroissant-*-win_amd64.whl
scripts/build.sh all          # image + linux + windows
scripts/build.sh check        # install latest linux wheel into .venv, import-smoke,
                              # then `zipfile -l` the windows wheel
scripts/build.sh clean        # rm dist-linux dist-windows
scripts/build.sh clean-cache  # also drop the cargo/target/xwin Docker volumes
```

The wheels target **abi3-py311**, so a single artifact per platform covers
CPython 3.11, 3.12, 3.13, and onward.
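
As a rough illustration of what the abi3 target means in practice, the wheel filename itself encodes the Python, ABI, and platform tags. The sketch below splits a filename into those parts. The helper name `wheel_tags` is hypothetical (it is not part of `scripts/build.sh`), and it assumes the `name-version-python-abi-platform.whl` layout with no build tag.

```sh
# Hypothetical helper (not part of scripts/build.sh): print the python, ABI,
# and platform tags encoded in a wheel filename.
# Assumes the layout name-version-pythontag-abitag-platformtag.whl (no build tag).
# The body runs in a subshell so the temporary variables do not leak.
wheel_tags() (
  base=${1##*/}               # strip any leading directory
  base=${base%.whl}           # strip the extension
  platform=${base##*-}        # last dash-separated field
  base=${base%-*}
  abi=${base##*-}             # second-to-last field
  base=${base%-*}
  python=${base##*-}          # third-to-last field
  printf 'python=%s abi=%s platform=%s\n' "$python" "$abi" "$platform"
)

wheel_tags dist-linux/qroissant-0.3.1-cp311-abi3-manylinux_2_34_x86_64.whl
```

A `cp311` / `abi3` pair is what marks the wheel as usable on CPython 3.11 and later rather than on one minor version only.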

## What lives where

| Artifact | Path |
| --- | --- |
| Build image definition | `scripts/Dockerfile.build` |
| Wrapper script | `scripts/build.sh` |
| Toolchain pin | `rust-toolchain.toml` |
| Linux wheels | `dist-linux/` (gitignored) |
| Windows wheels | `dist-windows/` (gitignored) |
| Cargo registry cache | Docker volume `qroissant-cargo-registry` |
| Cargo target dir | Docker volume `qroissant-target` |
| cargo-xwin MSVC cache | Docker volume `qroissant-xwin` |

The three Docker volumes persist between runs so reruns only rebuild changed
crates; nuke them with `scripts/build.sh clean-cache` if anything looks stale.

## Debug symbols on Windows

The release profile sets `debug = "line-tables-only"`, so the Windows build
also produces a `_native.pdb` in the Cargo target directory.
`scripts/build.sh windows` copies it into `dist-windows/`. To get resolved
frames in a Rust panic backtrace on a Windows machine:

1. `pip install qroissant-*-win_amd64.whl`
2. Locate the installed extension, e.g.
   `Lib\site-packages\qroissant\_native.pyd`
3. Drop `_native.pdb` next to it (same directory, same basename).

Symbol info is line-tables-only, so frames resolve to `file:line` but local
variables and types aren't included. That's enough for backtraces and is
~30 MB; full debug info would be much larger.
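
The three steps above could be scripted roughly as follows, for example from a POSIX-style shell such as Git Bash on the Windows machine. This is a sketch under assumptions: the helper name `place_pdb` is hypothetical, and it expects you to pass the path of the installed `_native.pyd` yourself.

```sh
# Hypothetical helper: copy a PDB next to an installed extension module so the
# two share a directory and a basename (foo.pyd -> foo.pdb).
# Usage: place_pdb <path/to/_native.pdb> <path/to/installed/_native.pyd>
# The body runs in a subshell, so `exit` only ends the helper, not your shell.
place_pdb() (
  pdb=$1
  pyd=$2
  [ -f "$pdb" ] || { echo "error: no such PDB: $pdb" >&2; exit 1; }
  [ -f "$pyd" ] || { echo "error: no such extension: $pyd" >&2; exit 1; }
  dest="${pyd%.pyd}.pdb"   # same directory, same basename, .pdb extension
  cp -- "$pdb" "$dest"
  echo "$dest"             # report where the PDB ended up
)

# Example call (the site-packages path is an assumption; adjust to your install):
# place_pdb dist-windows/_native.pdb "$VIRTUAL_ENV/Lib/site-packages/qroissant/_native.pyd"
```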

## How Windows cross-compile works

The Windows path uses [cargo-xwin](https://github.com/rust-cross/cargo-xwin),
which downloads the Microsoft CRT and Windows SDK headers and libraries (cached
in the xwin volume; license auto-accepted via `XWIN_ACCEPT_LICENSE=1`). PyO3's
`generate-import-lib` feature (enabled in
`crates/qroissant-python/Cargo.toml`) lets us produce the `python3.dll`
import library at build time, so no Windows Python install is required on the
host.

## Quick wheel sanity checks

```sh
# Listing
python3 -m zipfile -l dist-windows/qroissant-*-win_amd64.whl

# Validate the dist-info
pipx run twine check dist-linux/*.whl dist-windows/*.whl
```

## Iterating on Python code

For Python-side changes (no Rust touched), the fastest loop is still
`maturin develop` against a local rustup install. The Docker flow above is
optimized for producing wheels, not for inner-loop iteration. If you have a
nightly rustup toolchain on the host, you can do:

```sh
python3 -m venv .venv && .venv/bin/pip install 'maturin>=1.8,<2.0'
.venv/bin/maturin develop --release
.venv/bin/python -c 'import qroissant'
```

## Troubleshooting

- **"manylinux_2_34" wheel won't install on old distros**: the image base is
  Debian 13 (glibc 2.41), and the wheel requires glibc 2.34 or newer. For
  broader compatibility, build inside a manylinux2014 image; running
  `auditwheel repair` on the finished wheel cannot lower its glibc requirement.
- **"failed to generate python3.dll import library"**: `llvm-dlltool` is
  missing inside the image. Rebuild it with `scripts/build.sh image`.
- **xwin download is slow / fails**: Microsoft's CDN occasionally rate-limits.
  Drop the cache (`docker volume rm qroissant-xwin`) and retry.
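
For the first item, one way to tell in advance whether a target machine is too old is to compare the glibc floor in the wheel's platform tag with the glibc on that machine. A minimal sketch, assuming GNU `sort -V` is available; both function names are hypothetical and not part of `scripts/build.sh`:

```sh
# Hypothetical helpers, not part of scripts/build.sh.

# Extract the glibc floor (e.g. "2.34") from a manylinux_<major>_<minor>_<arch>
# wheel filename. Prints nothing if the name carries no such tag.
wheel_glibc_floor() {
  printf '%s\n' "$1" |
    sed -n 's/.*manylinux_\([0-9][0-9]*\)_\([0-9][0-9]*\)_.*/\1.\2/p'
}

# Succeed if version $2 (host glibc) is at least version $1 (wheel floor).
# Relies on GNU `sort -V` for version ordering.
glibc_at_least() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

floor=$(wheel_glibc_floor qroissant-0.3.1-cp311-abi3-manylinux_2_34_x86_64.whl)
# On a glibc-based Linux host, `getconf GNU_LIBC_VERSION` prints e.g. "glibc 2.36".
host=$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')
if [ -n "$host" ] && glibc_at_least "$floor" "$host"; then
  echo "host glibc $host satisfies the wheel's floor of $floor"
else
  echo "host glibc ${host:-unknown} may be too old for a manylinux floor of $floor"
fi
```

Run this on the machine where the wheel will be installed, not on the build host.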
10
Cargo.lock
generated
@@ -868,6 +868,7 @@ version = "0.28.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8bf94ee265674bf76c09fa430b0e99c26e319c945d96ca0d5a8215f31bf81cf7"
dependencies = [
 "python3-dll-a",
 "target-lexicon",
]

@@ -906,6 +907,15 @@ dependencies = [
 "syn 2.0.117",
]

[[package]]
name = "python3-dll-a"
version = "0.2.15"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d80ba7540edb18890d444c5aa8e1f1f99b1bdf26fb26ae383135325f4a36042b"
dependencies = [
 "cc",
]

[[package]]
name = "qroissant-arrow"
version = "0.3.1"
@@ -12,6 +12,11 @@ repository = "https://github.com/qroissant/qroissant"
lto = "fat"
codegen-units = 1
opt-level = 3
# Line tables only: keeps optimisations identical to a normal release build
# but emits enough info for Rust panic backtraces to resolve to file:line.
# On windows-msvc this produces a PDB next to _native.pyd (see BUILD.md).
debug = "line-tables-only"
strip = "none"

[workspace.dependencies]
pyo3 = "0.28.2"
@@ -14,7 +14,7 @@ path = "src/lib.rs"
bb8 = "0.9.0"
bytes = "1.11.1"
chrono = "0.4.44"
-pyo3 = { workspace = true, features = ["extension-module", "abi3-py311"] }
+pyo3 = { workspace = true, features = ["extension-module", "abi3-py311", "generate-import-lib"] }
pyo3-arrow = { version = "0.17.0", default-features = false }
pyo3-async-runtimes = { version = "0.28.0", features = ["tokio-runtime"] }
qroissant-arrow = { path = "../qroissant-arrow" }
BIN
dist-linux/qroissant-0.3.1-cp311-abi3-manylinux_2_34_x86_64.whl
Normal file
Binary file not shown.
BIN
dist-windows/_native.pdb
Normal file
Binary file not shown.
530253
dist-windows/_native.pdb.txt
Normal file
File diff suppressed because it is too large
Binary file not shown.
File diff suppressed because it is too large
4
rust-toolchain.toml
Normal file
@@ -0,0 +1,4 @@
[toolchain]
channel = "nightly"
components = ["rustfmt", "clippy", "rust-src"]
targets = ["x86_64-unknown-linux-gnu"]
25
scripts/Dockerfile.build
Normal file
@@ -0,0 +1,25 @@
# Build environment for qroissant.
#
# Base: ghcr.io/rust-cross/cargo-xwin (Debian 13), which ships with cargo-xwin,
# llvm-dlltool, and the x86_64-pc-windows-msvc rustup target preinstalled.
# We layer on nightly Rust (project uses #![feature(portable_simd)]),
# lld (so cargo-xwin can invoke lld-link), and maturin.
FROM ghcr.io/rust-cross/cargo-xwin:latest

ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update \
    && apt-get install -y --no-install-recommends lld python3-pip python3-venv \
    && rm -rf /var/lib/apt/lists/* \
    && ln -sf /usr/bin/lld /usr/bin/lld-link \
    && ln -sf /usr/bin/clang /usr/bin/clang-cl

RUN rustup toolchain install nightly --profile minimal \
        --component rustfmt --component clippy --component rust-src \
        --component llvm-tools-preview \
    && rustup target add --toolchain nightly x86_64-pc-windows-msvc \
    && rustup default nightly

RUN pip3 install --no-cache-dir --break-system-packages 'maturin>=1.8,<2.0'

ENV PATH="/usr/local/cargo/bin:${PATH}"
158
scripts/build.sh
Executable file
@@ -0,0 +1,158 @@
#!/usr/bin/env bash
# Build qroissant wheels via Docker. Both Linux and Windows builds run inside
# the qroissant-build:latest image (defined in scripts/Dockerfile.build) so the
# host needs only Docker + Python.
#
# Usage:
#   scripts/build.sh image        Build the qroissant-build Docker image.
#   scripts/build.sh linux        Build a manylinux x86_64 wheel -> dist-linux/
#   scripts/build.sh windows      Build a win_amd64 wheel -> dist-windows/
#   scripts/build.sh all          image + linux + windows.
#   scripts/build.sh check        Install the latest linux wheel into .venv
#                                 and import qroissant as a smoke test.
#   scripts/build.sh clean        Remove dist-*/ output dirs (volumes kept).
#   scripts/build.sh clean-cache  Also drop the Docker volumes (cargo
#                                 registry, target dir, xwin SDK cache).

set -euo pipefail

SCRIPT_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &>/dev/null && pwd)
ROOT_DIR=$(cd -- "$SCRIPT_DIR/.." &>/dev/null && pwd)
cd "$ROOT_DIR"

IMAGE=qroissant-build:latest
VOL_CARGO=qroissant-cargo-registry
VOL_TARGET=qroissant-target
VOL_XWIN=qroissant-xwin

need_docker() {
  command -v docker >/dev/null 2>&1 || {
    echo "error: docker is required but not on PATH" >&2
    exit 1
  }
}

ensure_volumes() {
  local uid gid
  uid=$(id -u); gid=$(id -g)
  for v in "$VOL_CARGO" "$VOL_TARGET" "$VOL_XWIN"; do
    if ! docker volume inspect "$v" >/dev/null 2>&1; then
      docker volume create "$v" >/dev/null
    fi
    # Re-chown the volume to the invoking user every run; cheap and idempotent.
    docker run --rm -v "$v":/v --entrypoint=sh "$IMAGE" -c "chown -R $uid:$gid /v" >/dev/null
  done
}

ensure_image() {
  if ! docker image inspect "$IMAGE" >/dev/null 2>&1; then
    echo ">>> building $IMAGE"
    docker build -t "$IMAGE" -f "$SCRIPT_DIR/Dockerfile.build" "$SCRIPT_DIR"
  fi
}

run_in_image() {
  # Run as the invoking user so produced wheels are owned by them, not root.
  # CARGO_HOME / XDG_CACHE_HOME are redirected into the persistent volumes,
  # which must therefore be owned by the same uid (handled by ensure_volumes).
  docker run --rm \
    --user "$(id -u):$(id -g)" \
    -e CARGO_HOME=/cargo \
    -e XDG_CACHE_HOME=/xdg-cache \
    -e HOME=/tmp \
    -e XWIN_ACCEPT_LICENSE=1 \
    -v "$ROOT_DIR":/io -w /io \
    -v "$VOL_CARGO":/cargo \
    -v "$VOL_TARGET":/io/target \
    -v "$VOL_XWIN":/xdg-cache/cargo-xwin \
    --entrypoint=sh \
    "$IMAGE" -c "$1"
}

cmd_image() {
  need_docker
  echo ">>> (re)building $IMAGE"
  docker build -t "$IMAGE" -f "$SCRIPT_DIR/Dockerfile.build" "$SCRIPT_DIR"
}

cmd_linux() {
  need_docker
  ensure_image
  ensure_volumes
  echo ">>> building Linux wheel -> dist-linux/"
  run_in_image 'maturin build --release --out /io/dist-linux'
  ls -1 dist-linux/*.whl
}

cmd_windows() {
  need_docker
  ensure_image
  ensure_volumes
  echo ">>> building Windows wheel -> dist-windows/"
  run_in_image '
    maturin build --release --target x86_64-pc-windows-msvc --out /io/dist-windows
    # Copy the PDB next to the wheel when one was produced (release profile
    # has debug = "line-tables-only"). Place it next to _native.pyd at
    # runtime on Windows to get resolved panic backtraces.
    pdb=$(find /io/target/x86_64-pc-windows-msvc/release -maxdepth 2 -name "_native.pdb" 2>/dev/null | head -1)
    if [ -n "$pdb" ]; then
      cp "$pdb" /io/dist-windows/
      echo "+ copied $(basename "$pdb") -> dist-windows/"
    fi
  '
  ls -1 dist-windows/
}

cmd_check() {
  local venv="$ROOT_DIR/.venv"
  if [[ ! -x "$venv/bin/python" ]]; then
    echo ">>> creating venv at $venv"
    python3 -m venv "$venv"
    "$venv/bin/pip" install --quiet --upgrade pip
  fi
  local wheel
  wheel=$(ls -t dist-linux/qroissant-*-linux*.whl dist-linux/qroissant-*-manylinux*.whl 2>/dev/null | head -1 || true)
  if [[ -z "$wheel" ]]; then
    echo "error: no Linux wheel in dist-linux/; run '$0 linux' first" >&2
    exit 1
  fi
  echo ">>> installing $wheel"
  "$venv/bin/pip" install --quiet --force-reinstall "$wheel"
  "$venv/bin/python" -c 'import qroissant; print("ok: qroissant", qroissant.__version__ if hasattr(qroissant, "__version__") else "(no __version__)")'
  if compgen -G 'dist-windows/qroissant-*-win_amd64.whl' >/dev/null; then
    echo ">>> inspecting Windows wheel"
    for w in dist-windows/qroissant-*-win_amd64.whl; do
      python3 -m zipfile -l "$w" | grep -E '_native\.pyd|WHEEL' || true
    done
  fi
}

cmd_clean() {
  rm -rf dist-linux dist-windows
  echo ">>> removed dist-linux/ dist-windows/"
}

cmd_clean_cache() {
  cmd_clean
  for v in "$VOL_CARGO" "$VOL_TARGET" "$VOL_XWIN"; do
    docker volume rm "$v" >/dev/null 2>&1 && echo ">>> removed volume $v" || true
  done
}

case "${1:-}" in
  image)        cmd_image ;;
  linux)        cmd_linux ;;
  windows)      cmd_windows ;;
  all)          cmd_image; cmd_linux; cmd_windows ;;
  check)        cmd_check ;;
  clean)        cmd_clean ;;
  clean-cache)  cmd_clean_cache ;;
  ""|-h|--help)
    sed -n '2,/^set -euo/p' "${BASH_SOURCE[0]}" | sed 's/^# \{0,1\}//' | sed '/^set -euo/d'
    ;;
  *)
    echo "error: unknown command: $1" >&2
    echo "run '$0 --help' for usage" >&2
    exit 2
    ;;
esac