# 🎮 NVIDIA Jetson Setup

Deploy the SeaClip spoke on NVIDIA Jetson devices with GPU acceleration.

## Supported Devices

| Device | GPU | RAM | LLM Capability |
|--------|-----|-----|----------------|
| Jetson AGX Orin | 2048 CUDA cores | 32-64 GB | ✅ 70B models |
| Jetson Orin NX | 1024 CUDA cores | 8-16 GB | ✅ 13B models |
| Jetson Orin Nano | 512 CUDA cores | 4-8 GB | ✅ 7B models |
| Jetson AGX Xavier | 512 CUDA cores | 16-32 GB | ✅ 13B models |
| Jetson Xavier NX | 384 CUDA cores | 8 GB | ✅ 7B models |
| Jetson Nano | 128 CUDA cores | 4 GB | ⚠️ Small models only |
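To confirm which module you have, read the device-tree model string. The sketch below matches against a hard-coded sample value so it runs anywhere; on an actual Jetson, substitute `model=$(tr -d '\0' < /proc/device-tree/model)`. The tier labels mirror the table above.

```shell
# Sample model string; on a real Jetson read it with:
#   model=$(tr -d '\0' < /proc/device-tree/model)
model="NVIDIA Jetson AGX Orin Developer Kit"

# Map the module name to the capability tier from the table above
case "$model" in
  *"AGX Orin"*)   tier="up to 70B models" ;;
  *"Orin NX"*)    tier="up to 13B models" ;;
  *"Orin Nano"*)  tier="up to 7B models" ;;
  *"AGX Xavier"*) tier="up to 13B models" ;;
  *"Xavier NX"*)  tier="up to 7B models" ;;
  *"Nano"*)       tier="small models only" ;;
  *)              tier="unknown - check the table" ;;
esac
echo "$model -> $tier"
```

Note the `Orin Nano` pattern is checked before the bare `Nano` pattern, since the latter would also match Orin Nano strings.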

## Prerequisites

### JetPack SDK

Ensure JetPack 5.0 or later is installed (note that the original Jetson Nano tops out at JetPack 4.6). Check the installed version:

```bash
cat /etc/nv_tegra_release
# or
dpkg -l | grep nvidia-jetpack
```
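The `dpkg` line carries the JetPack version in its third column. Here is one way to gate on JetPack 5+, shown against a sample line so it is self-contained; on the device, feed it the real `dpkg -l | grep nvidia-jetpack` output instead (the version shown is illustrative):

```shell
# Sample line from `dpkg -l | grep nvidia-jetpack` (version is illustrative)
dpkg_line="ii  nvidia-jetpack  5.1.2-b104  arm64  NVIDIA Jetpack Meta Package"

# Third column holds the version; strip the build suffix after the dash
version=$(echo "$dpkg_line" | awk '{print $3}' | cut -d- -f1)
major=${version%%.*}

if [ "$major" -ge 5 ]; then
  echo "JetPack $version: supported"
else
  echo "JetPack $version: upgrade required"
fi
```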

## One-Line Install

```bash
curl -sSL https://raw.githubusercontent.com/t4tarzan/seaclip/main/scripts/spoke-install.sh | bash -s -- --hub http://YOUR_HUB_IP:51842 --type jetson
```

## Manual Installation

### Step 1: System Update

```bash
sudo apt update && sudo apt upgrade -y
```

### Step 2: Install Node.js 20

```bash
# NodeSource repository for ARM64
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt-get install -y nodejs

# Install pnpm
sudo npm install -g pnpm
```

### Step 3: Install Ollama with CUDA

```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Verify GPU detection
ollama run llama3 "test" --verbose 2>&1 | grep -i gpu
```

### Step 4: Clone and Configure

```bash
sudo mkdir -p /opt/seaclip-spoke
sudo chown $USER:$USER /opt/seaclip-spoke

git clone --depth 1 https://github.com/t4tarzan/seaclip.git /opt/seaclip-spoke
cd /opt/seaclip-spoke
pnpm install --filter @seaclip/spoke --filter @seaclip/shared

# Configure
cat > .env << EOF
SEACLIP_HUB_URL=http://YOUR_HUB_IP:51842
SEACLIP_DEVICE_NAME=$(hostname)
SEACLIP_DEVICE_TYPE=jetson
SEACLIP_TELEMETRY_INTERVAL=30
OLLAMA_BASE_URL=http://localhost:11434
EOF
```
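Before registering, it is worth checking that the `.env` actually contains the settings the spoke needs. A minimal sketch follows; it writes a temporary copy so it runs anywhere, and the required-variable list is an assumption based on the file written above. On the device, point `ENV_FILE` at `/opt/seaclip-spoke/.env` instead.

```shell
# Temporary .env stand-in; use ENV_FILE=/opt/seaclip-spoke/.env on a real spoke
ENV_FILE=$(mktemp)
cat > "$ENV_FILE" << 'EOF'
SEACLIP_HUB_URL=http://192.168.1.10:51842
SEACLIP_DEVICE_TYPE=jetson
OLLAMA_BASE_URL=http://localhost:11434
EOF

# Flag any required setting that is missing or empty
missing=0
for var in SEACLIP_HUB_URL SEACLIP_DEVICE_TYPE OLLAMA_BASE_URL; do
  grep -q "^${var}=." "$ENV_FILE" || { echo "missing: $var"; missing=1; }
done
[ "$missing" -eq 0 ] && echo "env ok"
rm -f "$ENV_FILE"
```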

### Step 5: Register and Start

```bash
# Register with the hub
pnpm spoke register

# Create systemd service
sudo tee /etc/systemd/system/seaclip-spoke.service > /dev/null << 'EOF'
[Unit]
Description=SeaClip Spoke Agent
After=network.target

[Service]
Type=simple
# Set User= to the account that owns /opt/seaclip-spoke
User=nvidia
WorkingDirectory=/opt/seaclip-spoke
ExecStart=/usr/bin/node /opt/seaclip-spoke/spoke/dist/index.js
Restart=always
RestartSec=10
Environment=NODE_ENV=production
EnvironmentFile=/opt/seaclip-spoke/.env

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable seaclip-spoke
sudo systemctl start seaclip-spoke
```

## GPU Telemetry

Jetson spokes report additional GPU metrics:

| Metric | Source |
|--------|--------|
| GPU Utilization | `tegrastats` |
| GPU Memory | `tegrastats` |
| GPU Temperature | `/sys/devices/gpu.0/temp` |
| Power Draw | `/sys/bus/i2c/drivers/ina3221x` |
| CUDA Version | `nvcc --version` |
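`tegrastats` emits one line per sampling interval, with the GPU figures in the `GR3D_FREQ` and `gpu@` tokens. A parsing sketch against a sample line (the exact field layout varies between JetPack releases, so treat the sample as illustrative):

```shell
# One sample tegrastats line (field layout varies by JetPack release)
line='RAM 3162/7772MB (lfb 4x1MB) CPU [12%@1420,8%@1420] GR3D_FREQ 45% gpu@42.5C'

# GPU utilization: the percentage following GR3D_FREQ
gpu_util=$(echo "$line" | grep -o 'GR3D_FREQ [0-9]*%' | tr -dc '0-9')

# GPU temperature: the number inside the gpu@<temp>C token
gpu_temp=$(echo "$line" | grep -o 'gpu@[0-9.]*C' | tr -d 'gpu@C')

echo "util=${gpu_util}% temp=${gpu_temp}C"
```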

## Recommended Models by Device

| Device | Recommended Models |
|--------|--------------------|
| AGX Orin 64GB | `llama3:70b`, `mixtral:8x7b` |
| AGX Orin 32GB | `llama3:13b`, `codellama:34b` |
| Orin NX 16GB | `llama3:8b`, `mistral:7b` |
| Orin Nano 8GB | `llama3:8b`, `phi3` |
| Xavier NX 8GB | `mistral:7b`, `gemma:7b` |
| Nano 4GB | `tinyllama`, `phi` |
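The table boils down to "pick the largest model that fits in RAM". A rough helper capturing that heuristic; the tags and cutoffs are illustrative, taken from the table above, and should be adjusted to your deployment:

```shell
# Map available RAM (GB) to a model tag, following the table above.
# Tags and cutoffs are illustrative; adjust to your deployment.
pick_model() {
  ram_gb=$1
  if   [ "$ram_gb" -ge 32 ]; then echo "llama3:70b"
  elif [ "$ram_gb" -ge 16 ]; then echo "llama3:8b"
  elif [ "$ram_gb" -ge 8 ];  then echo "mistral:7b"
  else                            echo "tinyllama"
  fi
}

pick_model 64   # llama3:70b
pick_model 8    # mistral:7b
```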

## Power Modes

Optimize for performance or power efficiency:

```bash
# List available power modes (mode IDs vary by device; check this output first)
sudo nvpmodel -q --verbose

# Set max performance (mode 0 on Orin)
sudo nvpmodel -m 0
sudo jetson_clocks

# Set a power-saving mode
sudo nvpmodel -m 2
```

## Verify GPU Acceleration

```bash
# Check CUDA (add /usr/local/cuda/bin to PATH if nvcc is not found)
nvcc --version

# Check GPU utilization and memory
tegrastats  # nvidia-smi is not available for Jetson's integrated GPU

# Test Ollama with GPU
ollama run llama3 "What GPU am I running on?" --verbose
```

## Troubleshooting

### Ollama not using GPU

```bash
# Check that the CUDA libraries are visible to the linker
ldconfig -p | grep cuda

# Reinstall Ollama
curl -fsSL https://ollama.com/install.sh | sh
```

### Out of GPU memory

Use a smaller model, or a more heavily quantized build that fits in GPU memory:

```bash
# Pull a 4-bit quantized build
ollama pull llama3:8b-q4_0
```

### Thermal throttling

Improve cooling or switch to a lower power mode:

```bash
# Check temperatures (values are in millidegrees Celsius)
cat /sys/devices/virtual/thermal/thermal_zone*/temp

# Drop to a lower power mode
sudo nvpmodel -m 1
```
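The thermal zone files report millidegrees Celsius, so a reading of `56500` means 56.5 °C. A small conversion sketch with a demo value baked in; on the device, read a real zone file as shown in the comment:

```shell
# Demo value; on a Jetson use:
#   millideg=$(cat /sys/devices/virtual/thermal/thermal_zone0/temp)
millideg=56500

# Convert millidegrees C to degrees with one decimal place
temp_c="$(( millideg / 1000 )).$(( (millideg % 1000) / 100 ))"
echo "${temp_c}C"
```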