2024-11-21 一站式 AI 平台生日大派对!2024-11-21 一站式 AI 平台生日大派对! 无问芯穹特别推出多项超值福利!立即参与
Skip to content
回到全部文章

在开发机上科学安装 Ollama

Ollama 是一个开源项目,允许用户在本地运行和部署大型语言模型(LLMs),Ollama 简单易用,是不少开发者的首选。

环境信息

  • 开发机使用 Ubuntu Ubuntu 22.04 为基础镜像:cr.infini-ai.com/infini-ai/ubuntu:22.04-20240429

官方原始安装方式

Ollama 官方网站提供了Linux 平台安装指南,仅需要一行命令即可在智算云平台的开发机上安装 Ollama:

shell
curl -fsSL https://ollama.com/install.sh | sh

此外,也可以使用官方的的Linux 平台手动安装指南

shell
# 此处仅贴出部分命令,省略了其他安装步骤
curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz
sudo tar -C /usr -xzf ollama-linux-amd64.tgz

然而无论使用哪一种安装方式,都有可能遇到网络慢或不可用,造成安装失败。通过在 CURL 命令中添加 -v 参数,我们发现 Ollama 安装过程中需要从 GitHub 下载数据:

shell
curl -L -v https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz

从下面输出中可以发现,对 ollama.com 的请求经过 307 重定向到了 github.com。由于众所周知的原因,可能无法顺利地从 GitHub 下载数据:

shell
(base) yinghaozhao@YinghaodeMacBook-Air ~ % curl -L -v https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 34.120.132.20:443...
* Connected to ollama.com (34.120.132.20) port 443
* ALPN: curl offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
} [315 bytes data]
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* (304) (IN), TLS handshake, Server hello (2):
{ [122 bytes data]
* (304) (IN), TLS handshake, Unknown (8):
{ [15 bytes data]
* (304) (IN), TLS handshake, Certificate (11):
{ [4052 bytes data]
* (304) (IN), TLS handshake, CERT verify (15):
{ [264 bytes data]
* (304) (IN), TLS handshake, Finished (20):
{ [36 bytes data]
* (304) (OUT), TLS handshake, Finished (20):
} [36 bytes data]
* SSL connection using TLSv1.3 / AEAD-CHACHA20-POLY1305-SHA256
* ALPN: server accepted h2
* Server certificate:
*  subject: CN=ollama.com
*  start date: Aug 16 08:01:55 2024 GMT
*  expire date: Nov 14 08:56:29 2024 GMT
*  subjectAltName: host "ollama.com" matched cert's "ollama.com"
*  issuer: C=US; O=Google Trust Services; CN=WR3
*  SSL certificate verify ok.
* using HTTP/2
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0* [HTTP/2] [1] OPENED stream for https://ollama.com/download/ollama-linux-amd64.tgz
* [HTTP/2] [1] [:method: GET]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: ollama.com]
* [HTTP/2] [1] [:path: /download/ollama-linux-amd64.tgz]
* [HTTP/2] [1] [user-agent: curl/8.4.0]
* [HTTP/2] [1] [accept: */*]
> GET /download/ollama-linux-amd64.tgz HTTP/2
> Host: ollama.com
> User-Agent: curl/8.4.0
> Accept: */*
> 
< HTTP/2 307 
< content-type: text/html; charset=utf-8
< location: https://github.com/ollama/ollama/releases/download/v0.3.12/ollama-linux-amd64.tgz
< referrer-policy: same-origin
< set-cookie: aid=d397e7ab-aca8-4b02-9ea5-d28a532f2e8f; Path=/; HttpOnly
< x-frame-options: DENY
< date: Wed, 09 Oct 2024 06:55:43 GMT
< content-length: 117
< via: 1.1 google
< alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
< 
* Ignoring the response-body
{ [117 bytes data]
100   117  100   117    0     0    171      0 --:--:-- --:--:-- --:--:--   171
* Connection #0 to host ollama.com left intact
* Issue another request to this URL: 'https://github.com/ollama/ollama/releases/download/v0.3.12/ollama-linux-amd64.tgz'
*   Trying 20.205.243.166:443...

科学安装 Ollama

为了科学安装 Ollama,我们对 Ollama 安装脚本进行了改造,使用了第三方加速服务处理了下载速度慢的问题,

NOTE

以下步骤使用了第三方提供的学术加速服务,用于在中国大陆加速访问 GitHub 等服务。无问芯穹并不提供这些加速服务,也不对其可靠性负责。详见第三方学术加速服务指南

环境准备

首先,需要确保开发机上已经安装了 lspiclshw。在 Ollama 的安装脚本中,这些工具主要用于检测 NVIDIA 或 AMD GPU 的存在。如果开发机上没有安装这些工具,则可能造成安装失败。

shell
sudo apt update && sudo apt install pciutils -y && sudo apt install lshw -y

验证安装是否成功:

shell
lspci --version
lshw -version

改造 Ollama 安装脚本

为了科学地安装 Ollama,我们需要处理 curl 请求的响应体,从中获取 github.com 的 URL,并代理到第三方学术加速地址上。核心逻辑在 download_with_proxy() 函数中。另外,由于容器环境内 systemctl 不可用,我们新增了 start_ollama 函数,用于在安装完成后直接启动 ollama 服务。

NOTE

由于第三方学术加速服务稳定性无法保障,您可能需要多次尝试,或更换不同的加速地址。无问芯穹不对其可靠性负责。

shell
#!/bin/sh
# This script installs Ollama on Linux.
# It detects the current operating system architecture and installs the appropriate version of Ollama.

# Enable auto-export of variables to ensure proper function behavior
# This resolves issues related to variable scope in some environments
set -a

set -eu

status() { echo ">>> $*" >&2; }
error() { echo "ERROR $*"; exit 1; }
warning() { echo "WARNING: $*"; }

TEMP_DIR=$(mktemp -d)
cleanup() { rm -rf $TEMP_DIR; }
trap cleanup EXIT

available() { command -v $1 >/dev/null; }
require() {
    local MISSING=''
    for TOOL in $*; do
        if ! available $TOOL; then
            MISSING="$MISSING $TOOL"
        fi
    done

    echo $MISSING
}

# 改进的代理下载函数
# 如果下载失败,可尝试替换 https://ghfast.top 为 https://gh-proxy.com,或其他类似 github 加速服务
download_with_proxy() {
    local url="$1"
    local output_file="$2"
    local github_url=$(curl -s -I "$url" | grep -i Location | awk '{print $2}' | tr -d '\r')
    local proxy_url="https://ghfast.top/$github_url"
    local max_retries=3
    local retry_count=0

    while [ $retry_count -lt $max_retries ]; do
        if curl -L --http1.1 "$proxy_url" -o "$output_file"; then
            return 0
        else
            retry_count=$((retry_count + 1))
            echo "Download failed. Retrying in 5 seconds... (Attempt $retry_count of $max_retries)"
            sleep 5
        fi
    done

    echo "Download failed after $max_retries attempts."
    return 1
}

[ "$(uname -s)" = "Linux" ] || error 'This script is intended to run on Linux only.'

ARCH=$(uname -m)
case "$ARCH" in
    x86_64) ARCH="amd64" ;;
    aarch64|arm64) ARCH="arm64" ;;
    *) error "Unsupported architecture: $ARCH" ;;
esac

IS_WSL2=false

KERN=$(uname -r)
case "$KERN" in
    *icrosoft*WSL2 | *icrosoft*wsl2) IS_WSL2=true;;
    *icrosoft) error "Microsoft WSL1 is not currently supported. Please use WSL2 with 'wsl --set-version <distro> 2'" ;;
    *) ;;
esac

VER_PARAM="${OLLAMA_VERSION:+?version=$OLLAMA_VERSION}"

SUDO=
if [ "$(id -u)" -ne 0 ]; then
    # Running as root, no need for sudo
    if ! available sudo; then
        error "This script requires superuser permissions. Please re-run as root."
    fi

    SUDO="sudo"
fi

NEEDS=$(require curl awk grep sed tee xargs)
if [ -n "$NEEDS" ]; then
    status "ERROR: The following tools are required but missing:"
    for NEED in $NEEDS; do
        echo "  - $NEED"
    done
    exit 1
fi

for BINDIR in /usr/local/bin /usr/bin /bin; do
    echo $PATH | grep -q $BINDIR && break || continue
done
OLLAMA_INSTALL_DIR=$(dirname ${BINDIR})

status "Installing ollama to $OLLAMA_INSTALL_DIR"
$SUDO install -o0 -g0 -m755 -d $BINDIR
$SUDO install -o0 -g0 -m755 -d "$OLLAMA_INSTALL_DIR"
if curl -I --silent --fail --location "https://ollama.com/download/ollama-linux-${ARCH}.tgz${VER_PARAM}" >/dev/null ; then
    status "Downloading Linux ${ARCH} bundle"
    if download_with_proxy "https://ollama.com/download/ollama-linux-${ARCH}.tgz${VER_PARAM}" "$TEMP_DIR/ollama.tgz"; then
        $SUDO tar -xzf "$TEMP_DIR/ollama.tgz" -C "$OLLAMA_INSTALL_DIR"
        BUNDLE=1
        if [ "$OLLAMA_INSTALL_DIR/bin/ollama" != "$BINDIR/ollama" ] ; then
            status "Making ollama accessible in the PATH in $BINDIR"
            $SUDO ln -sf "$OLLAMA_INSTALL_DIR/ollama" "$BINDIR/ollama"
        fi
    else
        error "Failed to download Ollama bundle"
    fi
else
    status "Downloading Linux ${ARCH} CLI"
    if download_with_proxy "https://ollama.com/download/ollama-linux-${ARCH}${VER_PARAM}" "$TEMP_DIR/ollama"; then
        $SUDO install -o0 -g0 -m755 $TEMP_DIR/ollama $OLLAMA_INSTALL_DIR/ollama
        BUNDLE=0
        if [ "$OLLAMA_INSTALL_DIR/ollama" != "$BINDIR/ollama" ] ; then
            status "Making ollama accessible in the PATH in $BINDIR"
            $SUDO ln -sf "$OLLAMA_INSTALL_DIR/ollama" "$BINDIR/ollama"
        fi
    else
        error "Failed to download Ollama CLI"
    fi
fi

install_success() {
    status 'The Ollama API is now available at 127.0.0.1:11434.'
    status 'Install complete. Run "ollama" from the command line.'
}
trap install_success EXIT

# Everything from this point onwards is optional.

configure_systemd() {
    if ! id ollama >/dev/null 2>&1; then
        status "Creating ollama user..."
        $SUDO useradd -r -s /bin/false -U -m -d /usr/share/ollama ollama
    fi
    if getent group render >/dev/null 2>&1; then
        status "Adding ollama user to render group..."
        $SUDO usermod -a -G render ollama
    fi
    if getent group video >/dev/null 2>&1; then
        status "Adding ollama user to video group..."
        $SUDO usermod -a -G video ollama
    fi

    status "Adding current user to ollama group..."
    $SUDO usermod -a -G ollama $(whoami)

    status "Creating ollama systemd service..."
    cat <<EOF | $SUDO tee /etc/systemd/system/ollama.service >/dev/null
[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=$BINDIR/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=$PATH"

[Install]
WantedBy=default.target
EOF
}

start_ollama() {
    if [ -f "$BINDIR/ollama" ]; then
        status "Starting Ollama..."
        nohup $BINDIR/ollama serve > /var/log/ollama.log 2>&1 &
        PID=$!
        status "Ollama started with PID $PID"
    else
        error "Ollama binary not found at $BINDIR/ollama"
    fi
}

if available systemctl; then
    configure_systemd
    SYSTEMCTL_RUNNING="$(systemctl is-system-running || true)"
    case $SYSTEMCTL_RUNNING in
        running|degraded)
            status "Enabling and starting ollama service..."
            $SUDO systemctl daemon-reload
            $SUDO systemctl enable ollama

            start_service() { $SUDO systemctl restart ollama; }
            trap start_service EXIT
            ;;
        *)
            status "systemd not available or not in running state. Starting Ollama directly..."
            start_ollama
            ;;
    esac
else
    status "systemd not available. Starting Ollama directly..."
    start_ollama
fi

# WSL2 only supports GPUs via nvidia passthrough
# so check for nvidia-smi to determine if GPU is available
if [ "$IS_WSL2" = true ]; then
    if available nvidia-smi && [ -n "$(nvidia-smi | grep -o "CUDA Version: [0-9]*\.[0-9]*")" ]; then
        status "Nvidia GPU detected."
    fi
    install_success
    exit 0
fi

# Install GPU dependencies on Linux
if ! available lspci && ! available lshw; then
    warning "Unable to detect NVIDIA/AMD GPU. Install lspci or lshw to automatically detect and install GPU dependencies."
    exit 0
fi

check_gpu() {
    # Look for devices based on vendor ID for NVIDIA and AMD
    case $1 in
        lspci)
            case $2 in
                nvidia) available lspci && lspci -d '10de:' | grep -q 'NVIDIA' || return 1 ;;
                amdgpu) available lspci && lspci -d '1002:' | grep -q 'AMD' || return 1 ;;
            esac ;;
        lshw)
            case $2 in
                nvidia) available lshw && $SUDO lshw -c display -numeric -disable network | grep -q 'vendor: .* \[10DE\]' || return 1 ;;
                amdgpu) available lshw && $SUDO lshw -c display -numeric -disable network | grep -q 'vendor: .* \[1002\]' || return 1 ;;
            esac ;;
        nvidia-smi) available nvidia-smi || return 1 ;;
    esac
}

if check_gpu nvidia-smi; then
    status "NVIDIA GPU installed."
    exit 0
fi

if ! check_gpu lspci nvidia && ! check_gpu lshw nvidia && ! check_gpu lspci amdgpu && ! check_gpu lshw amdgpu; then
    install_success
    warning "No NVIDIA/AMD GPU detected. Ollama will run in CPU-only mode."
    exit 0
fi

if check_gpu lspci amdgpu || check_gpu lshw amdgpu; then
    if [ $BUNDLE -ne 0 ]; then
        status "Downloading Linux ROCm ${ARCH} bundle"
        if download_with_proxy "https://ollama.com/download/ollama-linux-${ARCH}-rocm.tgz${VER_PARAM}" "$TEMP_DIR/ollama-rocm.tgz"; then
            $SUDO tar -xzf "$TEMP_DIR/ollama-rocm.tgz" -C "$OLLAMA_INSTALL_DIR"
            install_success
            status "AMD GPU ready."
            exit 0
        else
            error "Failed to download AMD GPU bundle"
        fi
    fi
    # Look for pre-existing ROCm v6 before downloading the dependencies
    for search in "${HIP_PATH:-''}" "${ROCM_PATH:-''}" "/opt/rocm" "/usr/lib64"; do
        if [ -n "${search}" ] && [ -e "${search}/libhipblas.so.2" -o -e "${search}/lib/libhipblas.so.2" ]; then
            status "Compatible AMD GPU ROCm library detected at ${search}"
            install_success
            exit 0
        fi
    done

    status "Downloading AMD GPU dependencies..."
    $SUDO rm -rf /usr/share/ollama/lib
    $SUDO chmod o+x /usr/share/ollama
    $SUDO install -o ollama -g ollama -m 755 -d /usr/share/ollama/lib/rocm
    if download_with_proxy "https://ollama.com/download/ollama-linux-amd64-rocm.tgz${VER_PARAM}" "$TEMP_DIR/ollama-rocm.tgz"; then
        $SUDO tar zx --owner ollama --group ollama -f "$TEMP_DIR/ollama-rocm.tgz" -C /usr/share/ollama/lib/rocm
        install_success
        status "AMD GPU ready."
        exit 0
    else
        error "Failed to download AMD GPU dependencies"
    fi
fi

CUDA_REPO_ERR_MSG="NVIDIA GPU detected, but your OS and Architecture are not supported by NVIDIA.  Please install the CUDA driver manually https://docs.nvidia.com/cuda/cuda-installation-guide-linux/"
# ref: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#rhel-7-centos-7
# ref: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#rhel-8-rocky-8
# ref: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#rhel-9-rocky-9
# ref: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#fedora
install_cuda_driver_yum() {
    status 'Installing NVIDIA repository...'
    
    case $PACKAGE_MANAGER in
        yum)
            $SUDO $PACKAGE_MANAGER -y install yum-utils
            if curl -I --silent --fail --location "https://developer.download.nvidia.com/compute/cuda/repos/$1$2/$(uname -m | sed -e 's/aarch64/sbsa/')/cuda-$1$2.repo" >/dev/null ; then
                $SUDO $PACKAGE_MANAGER-config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/$1$2/$(uname -m | sed -e 's/aarch64/sbsa/')/cuda-$1$2.repo
            else
                error $CUDA_REPO_ERR_MSG
            fi
            ;;
        dnf)
            if curl -I --silent --fail --location "https://developer.download.nvidia.com/compute/cuda/repos/$1$2/$(uname -m | sed -e 's/aarch64/sbsa/')/cuda-$1$2.repo" >/dev/null ; then
                $SUDO $PACKAGE_MANAGER config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/$1$2/$(uname -m | sed -e 's/aarch64/sbsa/')/cuda-$1$2.repo
            else
                error $CUDA_REPO_ERR_MSG
            fi
            ;;
    esac

    case $1 in
        rhel)
            status 'Installing EPEL repository...'
            # EPEL is required for third-party dependencies such as dkms and libvdpau
            $SUDO $PACKAGE_MANAGER -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-$2.noarch.rpm || true
            ;;
    esac

    status 'Installing CUDA driver...'

    if [ "$1" = 'centos' ] || [ "$1$2" = 'rhel7' ]; then
        $SUDO $PACKAGE_MANAGER -y install nvidia-driver-latest-dkms
    fi

    $SUDO $PACKAGE_MANAGER -y install cuda-drivers
}

# ref: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#ubuntu
# ref: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#debian
install_cuda_driver_apt() {
    status 'Installing NVIDIA repository...'
    if curl -I --silent --fail --location "https://developer.download.nvidia.com/compute/cuda/repos/$1$2/$(uname -m | sed -e 's/aarch64/sbsa/')/cuda-keyring_1.1-1_all.deb" >/dev/null ; then
        curl -fsSL -o $TEMP_DIR/cuda-keyring.deb https://developer.download.nvidia.com/compute/cuda/repos/$1$2/$(uname -m | sed -e 's/aarch64/sbsa/')/cuda-keyring_1.1-1_all.deb
    else
        error $CUDA_REPO_ERR_MSG
    fi

    case $1 in
        debian)
            status 'Enabling contrib sources...'
            $SUDO sed 's/main/contrib/' < /etc/apt/sources.list | $SUDO tee /etc/apt/sources.list.d/contrib.list > /dev/null
            if [ -f "/etc/apt/sources.list.d/debian.sources" ]; then
                $SUDO sed 's/main/contrib/' < /etc/apt/sources.list.d/debian.sources | $SUDO tee /etc/apt/sources.list.d/contrib.sources > /dev/null
            fi
            ;;
    esac

    status 'Installing CUDA driver...'
    $SUDO dpkg -i $TEMP_DIR/cuda-keyring.deb
    $SUDO apt-get update

    [ -n "$SUDO" ] && SUDO_E="$SUDO -E" || SUDO_E=
    DEBIAN_FRONTEND=noninteractive $SUDO_E apt-get -y install cuda-drivers -q
}

if [ ! -f "/etc/os-release" ]; then
    error "Unknown distribution. Skipping CUDA installation."
fi

. /etc/os-release

OS_NAME=$ID
OS_VERSION=$VERSION_ID

PACKAGE_MANAGER=
for PACKAGE_MANAGER in dnf yum apt-get; do
    if available $PACKAGE_MANAGER; then
        break
    fi
done

if [ -z "$PACKAGE_MANAGER" ]; then
    error "Unknown package manager. Skipping CUDA installation."
fi

if ! check_gpu nvidia-smi || [ -z "$(nvidia-smi | grep -o "CUDA Version: [0-9]*\.[0-9]*")" ]; then
    case $OS_NAME in
        centos|rhel) install_cuda_driver_yum 'rhel' $(echo $OS_VERSION | cut -d '.' -f 1) ;;
        rocky) install_cuda_driver_yum 'rhel' $(echo $OS_VERSION | cut -c1) ;;
        fedora) [ $OS_VERSION -lt '39' ] && install_cuda_driver_yum $OS_NAME $OS_VERSION || install_cuda_driver_yum $OS_NAME '39';;
        amzn) install_cuda_driver_yum 'fedora' '37' ;;
        debian) install_cuda_driver_apt $OS_NAME $OS_VERSION ;;
        ubuntu) install_cuda_driver_apt $OS_NAME $(echo $OS_VERSION | sed 's/\.//') ;;
        *) exit ;;
    esac
fi

if ! lsmod | grep -q nvidia || ! lsmod | grep -q nvidia_uvm; then
    KERNEL_RELEASE="$(uname -r)"
    case $OS_NAME in
        rocky) $SUDO $PACKAGE_MANAGER -y install kernel-devel kernel-headers ;;
        centos|rhel|amzn) $SUDO $PACKAGE_MANAGER -y install kernel-devel-$KERNEL_RELEASE kernel-headers-$KERNEL_RELEASE ;;
        fedora) $SUDO $PACKAGE_MANAGER -y install kernel-devel-$KERNEL_RELEASE ;;
        debian|ubuntu) $SUDO apt-get -y install linux-headers-$KERNEL_RELEASE ;;
        *) exit ;;
    esac

    NVIDIA_CUDA_VERSION=$($SUDO dkms status | awk -F: '/added/ { print $1 }')
    if [ -n "$NVIDIA_CUDA_VERSION" ]; then
        $SUDO dkms install $NVIDIA_CUDA_VERSION
    fi

    if lsmod | grep -q nouveau; then
        status 'Reboot to complete NVIDIA CUDA driver install.'
        exit 0
    fi

    $SUDO modprobe nvidia
    $SUDO modprobe nvidia_uvm
fi

# make sure the NVIDIA modules are loaded on boot with nvidia-persistenced
if available nvidia-persistenced; then
    $SUDO touch /etc/modules-load.d/nvidia.conf
    MODULES="nvidia nvidia-uvm"
    for MODULE in $MODULES; do
        if ! grep -qxF "$MODULE" /etc/modules-load.d/nvidia.conf; then
            echo "$MODULE" | $SUDO tee -a /etc/modules-load.d/nvidia.conf > /dev/null
        fi
    done
fi

status "NVIDIA GPU ready."
install_success

运行脚本安装 Ollama

shell
# 在 install.sh 写入改造后的安装脚本
root@is-c76hhq6otrme4ygb-devmachine-0:~# vim install.sh 
root@is-c76hhq6otrme4ygb-devmachine-0:~# chmod +x install.sh 
root@is-c76hhq6otrme4ygb-devmachine-0:~# ./install.sh 
>>> Installing ollama to /usr/local
>>> Downloading Linux amd64 bundle
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 65 1583M   65 1042M    0     0  6860k      0  0:03:56  0:02:35  0:01:21 7594k
curl: (18) transfer closed with 567067073 bytes remaining to read
Download failed. Retrying in 5 seconds... (Attempt 1 of 3)
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1583M  100 1583M    0     0  6562k      0  0:04:07  0:04:07 --:--:-- 14.3M
>>> Creating ollama user...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> NVIDIA GPU installed.
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.

本地推理

执行 ollama run gemma2:2b,ollama 会直接下载模型,并运行本地推理服务。

shell
root@is-c76hhq6otrme4ygb-devmachine-0:~# ollama run gemma2:2b
pulling manifest 
pulling 7462734796d6... 100% ▕███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 1.6 GB                         
pulling e0a42594d802... 100% ▕███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏  358 B                         
pulling 097a36493f71... 100% ▕███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 8.4 KB                         
pulling 2490e7468436... 100% ▕███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏   65 B                         
pulling e18ad7af7efb... 100% ▕███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏  487 B                         
verifying sha256 digest 
writing manifest 
success 
>>> who are you?
I am Gemma, an open-weights AI assistant. I'm a large language model created by the Gemma team at Google DeepMind.

NOTE

如果 ollama 服务未启动,您可以使用 tmux 工具新开会话,执行 ollama serve 运行 Ollama 推理服务。回到原来的会话,执行 ollama run gemma2:2b 即可。

资源与支持

  • 公网访问开发机内 HTTP 服务:在使用过程中,可能会遇到需要从公网访问开发机内部服务的情况(例如推理 API 服务、带 UI 的服务等),但是暂时开发机仅提供了云平台内网访问地址,无法直接从公网访问。这时,您可以利用开发机提供的 SSH 服务和 SSH 端口转发功能,将开发机内网端口映射到本地电脑,从而实现从公网直接访问开发机内部服务。