# Installation

## Docker Compose (Recommended)

TODO: just reference the existing compose file in the repo

=== "CUDA"

```yaml
# https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
services:
  faster-whisper-server:
    image: fedirz/faster-whisper-server:latest-cuda
    container_name: faster-whisper-server
    restart: unless-stopped
    ports:
      - 8000:8000
    volumes:
      - hf-hub-cache:/home/ubuntu/.cache/huggingface/hub
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: ["gpu"]
volumes:
  hf-hub-cache:
```

=== "CUDA (with CDI feature enabled)"

```yaml
# https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
services:
  faster-whisper-server:
    image: fedirz/faster-whisper-server:latest-cuda
    container_name: faster-whisper-server
    restart: unless-stopped
    ports:
      - 8000:8000
    volumes:
      - hf-hub-cache:/home/ubuntu/.cache/huggingface/hub
    deploy:
      resources:
        reservations:
          # https://docs.docker.com/reference/cli/dockerd/#enable-cdi-devices
          # https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/cdi-support.html
          devices:
            - driver: cdi
              device_ids:
              - nvidia.com/gpu=all
volumes:
  hf-hub-cache:
```

=== "CPU"

```yaml
services:
  faster-whisper-server:
    image: fedirz/faster-whisper-server:latest-cpu
    container_name: faster-whisper-server
    restart: unless-stopped
    ports:
      - 8000:8000
    volumes:
      - hf-hub-cache:/home/ubuntu/.cache/huggingface/hub
volumes:
  hf-hub-cache:
```
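
With either variant saved as a Compose file, the service can be started and inspected with the standard Compose commands. A minimal sketch, assuming the file is saved as `compose.yaml` in the current directory:

```bash
# Start the service in the background
docker compose up --detach

# Follow the logs to confirm the model downloads and the server starts listening on port 8000
docker compose logs --follow faster-whisper-server
```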

## Docker

=== "CUDA"

```bash
docker run --rm --detach --publish 8000:8000 --name faster-whisper-server --volume hf-hub-cache:/home/ubuntu/.cache/huggingface/hub --gpus=all fedirz/faster-whisper-server:latest-cuda
```

=== "CUDA (with CDI feature enabled)"

```bash
docker run --rm --detach --publish 8000:8000 --name faster-whisper-server --volume hf-hub-cache:/home/ubuntu/.cache/huggingface/hub --device=nvidia.com/gpu=all fedirz/faster-whisper-server:latest-cuda
```

=== "CPU"

```bash
docker run --rm --detach --publish 8000:8000 --name faster-whisper-server --volume hf-hub-cache:/home/ubuntu/.cache/huggingface/hub fedirz/faster-whisper-server:latest-cpu
```
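
Once a container is running (via Compose or `docker run`), you can sanity-check it with a request. A minimal sketch, assuming the server exposes the OpenAI-compatible transcription endpoint on port 8000 and that `audio.wav` is a placeholder for a local recording of your own:

```bash
# Transcribe a local file via the OpenAI-compatible endpoint (assumed path)
curl http://localhost:8000/v1/audio/transcriptions -F "file=@audio.wav"
```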

## Kubernetes

WARNING: this section was written a few months ago and may be outdated. Please refer to this blog post
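
For a quick experiment, the CPU image can also be deployed imperatively with `kubectl`. This is only a rough sketch, not a replacement for the manifests in the blog post; GPU scheduling additionally requires the NVIDIA device plugin and an `nvidia.com/gpu` resource request, which these commands do not set:

```bash
# Create a Deployment running the CPU image (object names are illustrative)
kubectl create deployment faster-whisper-server --image=fedirz/faster-whisper-server:latest-cpu --port=8000

# Expose it inside the cluster as a ClusterIP Service on port 8000
kubectl expose deployment faster-whisper-server --port=8000 --target-port=8000

# Port-forward for local testing
kubectl port-forward service/faster-whisper-server 8000:8000
```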

## Python (requires Python 3.12+)

```bash
git clone https://github.com/fedirz/faster-whisper-server.git
cd faster-whisper-server
uv venv
source .venv/bin/activate
uv sync --all-extras
uvicorn --factory --host 0.0.0.0 faster_whisper_server.main:create_app
```
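
Uvicorn listens on port 8000 by default. A quick way to verify the server came up, assuming it exposes the OpenAI-compatible `/v1/models` path:

```bash
# Should return a JSON listing of available models if the server is up
curl http://localhost:8000/v1/models
```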