
Installing Ollama on a Dev Machine Behind Network Restrictions

Ollama is a lightweight large language model (LLM) runtime built for local development and rapid prototyping. Known for its minimal configuration and ease of use, it helps developers run and manage mainstream open-source models such as Llama 3 in a local environment. While dedicated inference engines such as vLLM handle large-scale, high-concurrency production workloads far better, Ollama's single-user optimizations and out-of-the-box experience make it an ideal starting point for individual developers experimenting, debugging, and prototyping AI applications.

Environment

  • The dev machine uses Ubuntu 22.04 as its base image: cr.infini-ai.com/infini-ai/ubuntu:22.04-20240429

Hardware Support

Ollama only supports NVIDIA, AMD Radeon, and Metal (Apple GPUs). See Ollama Hardware Support for details.

Why the Official Ollama Install Method Fails

If you have already set up a reliable proxy service inside your dev machine, you can simply use the official Ollama install method.

The official Linux install command:

shell
curl -fsSL https://ollama.com/install.sh | sh

From the official Linux manual installation guide:

shell
# Only part of the commands are shown here; other installation steps are omitted
curl -fsSL https://ollama.com/download/ollama-linux-amd64.tgz \
    | sudo tar zx -C /usr

Although the download URL in the official install command points to ollama.com, the server actually returns a 307 Temporary Redirect that sends the request to a GitHub Releases download link. In a network environment where GitHub access is restricted, the download therefore fails even though you appear to be fetching from the official site.

We can verify this redirect by adding the -v flag to the curl command:

shell
curl -L -v https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz

The output below shows the request to ollama.com being 307-redirected to github.com. For well-known reasons, downloading from GitHub may not go smoothly:

shell
(base) yinghaozhao@YinghaodeMacBook-Air ~ % curl -L -v https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 34.120.132.20:443...
* Connected to ollama.com (34.120.132.20) port 443
* ALPN: curl offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
} [315 bytes data]
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* (304) (IN), TLS handshake, Server hello (2):
{ [122 bytes data]
* (304) (IN), TLS handshake, Unknown (8):
{ [15 bytes data]
* (304) (IN), TLS handshake, Certificate (11):
{ [4052 bytes data]
* (304) (IN), TLS handshake, CERT verify (15):
{ [264 bytes data]
* (304) (IN), TLS handshake, Finished (20):
{ [36 bytes data]
* (304) (OUT), TLS handshake, Finished (20):
} [36 bytes data]
* SSL connection using TLSv1.3 / AEAD-CHACHA20-POLY1305-SHA256
* ALPN: server accepted h2
* Server certificate:
*  subject: CN=ollama.com
*  start date: Aug 16 08:01:55 2024 GMT
*  expire date: Nov 14 08:56:29 2024 GMT
*  subjectAltName: host "ollama.com" matched cert's "ollama.com"
*  issuer: C=US; O=Google Trust Services; CN=WR3
*  SSL certificate verify ok.
* using HTTP/2
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0* [HTTP/2] [1] OPENED stream for https://ollama.com/download/ollama-linux-amd64.tgz
* [HTTP/2] [1] [:method: GET]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: ollama.com]
* [HTTP/2] [1] [:path: /download/ollama-linux-amd64.tgz]
* [HTTP/2] [1] [user-agent: curl/8.4.0]
* [HTTP/2] [1] [accept: */*]
> GET /download/ollama-linux-amd64.tgz HTTP/2
> Host: ollama.com
> User-Agent: curl/8.4.0
> Accept: */*
> 
< HTTP/2 307 
< content-type: text/html; charset=utf-8
< location: https://github.com/ollama/ollama/releases/download/v0.3.12/ollama-linux-amd64.tgz
< referrer-policy: same-origin
< set-cookie: aid=d397e7ab-aca8-4b02-9ea5-d28a532f2e8f; Path=/; HttpOnly
< x-frame-options: DENY
< date: Wed, 09 Oct 2024 06:55:43 GMT
< content-length: 117
< via: 1.1 google
< alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
< 
* Ignoring the response-body
{ [117 bytes data]
100   117  100   117    0     0    171      0 --:--:-- --:--:-- --:--:--   171
* Connection #0 to host ollama.com left intact
* Issue another request to this URL: 'https://github.com/ollama/ollama/releases/download/v0.3.12/ollama-linux-amd64.tgz'
*   Trying 20.205.243.166:443...
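
If you only need the redirect target, a header-only request skips the TLS chatter. This is the same technique the modified script below relies on to resolve the GitHub URL:

shell
# Print only the Location header; its value is the GitHub Releases URL
curl -s -I https://ollama.com/download/ollama-linux-amd64.tgz | grep -i '^location'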

Installing Ollama via a Mirror Service

To get Ollama installed despite these restrictions, we modified the official install script to route downloads through a third-party mirror service, which also addresses slow download speeds.

Note

The following steps use third-party academic acceleration services, which speed up access to GitHub and similar sites from mainland China. The platform does not provide these services and is not responsible for their reliability. See the third-party academic acceleration services guide for details.

Manually Installing Ollama (Recommended)

If no proxy service is installed inside your dev machine, the official install method will not work. We recommend installing manually as follows:

  1. If upgrading from an older version of Ollama, remove the old version first:

    shell
    sudo rm -rf /usr/lib/ollama
  2. Fetch the Ollama offline install package through an academic acceleration service. Package URLs are listed on the Ollama Releases page.

    shell
    # Mirror 1: https://ghfast.top/
    wget -c "https://ghfast.top/https://github.com/ollama/ollama/releases/download/v0.13.0/ollama-linux-amd64.tgz"
    # Mirror 2: https://gh-proxy.org/
    wget -c "https://gh-proxy.org/https://github.com/ollama/ollama/releases/download/v0.13.0/ollama-linux-amd64.tgz"
  3. Install Ollama to the target location (a quick verification sketch follows this list).

    shell
    sudo tar zxf ollama-linux-amd64.tgz -C /usr
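
After extracting, you can verify the binary and start the server by hand. Since systemd is unavailable inside the container, a minimal sketch is to run ollama serve in the background (the log path here is an arbitrary choice):

shell
# Confirm the installed binary is on PATH
ollama -v
# Start the server in the background; /var/log/ollama.log is an arbitrary location
nohup ollama serve > /var/log/ollama.log 2>&1 &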

Modifying the Official One-Click Install Script

The official one-click install script runs a series of OS and GPU checks that do not apply to the AI cloud platform, so it is generally not recommended. If you need to distribute a one-click install script on the platform, the functions that download the install package must be modified.

  1. First, make sure lspci and lshw are installed on the dev machine. In the Ollama install script, these tools are used mainly to detect the presence of NVIDIA or AMD GPUs; if they are missing, installation may fail.

    shell
    sudo apt update && sudo apt install -y pciutils lshw

    Verify the installation succeeded:

    shell
    lspci --version
    lshw -version
  2. To route the download through a mirror, we process the response to the curl request against ollama.com, extract the github.com URL from its Location header, and prefix it with a third-party acceleration address. The core logic lives in the download_with_proxy() function. In addition, because systemctl is unavailable inside the container environment, we added a start_ollama function that launches the ollama service directly once installation completes.

    Note

    • Because the stability of third-party acceleration services cannot be guaranteed, you may need to retry several times or switch to a different mirror address. The platform is not responsible for their reliability.
    • The official Ollama install script may have been updated since; the platform does not guarantee that the modified script below still works.
    shell
    #!/bin/sh
    # This script installs Ollama on Linux.
    # It detects the current operating system architecture and installs the appropriate version of Ollama.
    
    # Enable auto-export of variables to ensure proper function behavior
    # This resolves issues related to variable scope in some environments
    set -a
    
    set -eu
    
    red="$( (/usr/bin/tput bold || :; /usr/bin/tput setaf 1 || :) 2>&-)"
    plain="$( (/usr/bin/tput sgr0 || :) 2>&-)"
    
    status() { echo ">>> $*" >&2; }
    error() { echo "${red}ERROR:${plain} $*"; exit 1; }
    warning() { echo "${red}WARNING:${plain} $*"; }
    
    TEMP_DIR=$(mktemp -d)
    cleanup() { rm -rf $TEMP_DIR; }
    trap cleanup EXIT
    
    available() { command -v $1 >/dev/null; }
    require() {
        local MISSING=''
        for TOOL in $*; do
            if ! available $TOOL; then
                MISSING="$MISSING $TOOL"
            fi
        done
    
        echo $MISSING
    }
    
    # Modified download function that routes through a mirror
    # If the download fails, try swapping https://gh-proxy.org for https://ghfast.top or a similar GitHub mirror
    download_with_proxy() {
        local url="$1"
        local output_file="$2"
        local github_url=$(curl -s -I "$url" | grep -i Location | awk '{print $2}' | tr -d '\r')
        local proxy_url="https://gh-proxy.org/$github_url"
        local max_retries=3
        local retry_count=0
    
        while [ $retry_count -lt $max_retries ]; do
            if curl -L --http1.1 "$proxy_url" -o "$output_file"; then
                return 0
            else
                retry_count=$((retry_count + 1))
                echo "Download failed. Retrying in 5 seconds... (Attempt $retry_count of $max_retries)"
                sleep 5
            fi
        done
    
        echo "Download failed after $max_retries attempts."
        return 1
    }
    
    [ "$(uname -s)" = "Linux" ] || error 'This script is intended to run on Linux only.'
    
    ARCH=$(uname -m)
    case "$ARCH" in
        x86_64) ARCH="amd64" ;;
        aarch64|arm64) ARCH="arm64" ;;
        *) error "Unsupported architecture: $ARCH" ;;
    esac
    
    IS_WSL2=false
    
    KERN=$(uname -r)
    case "$KERN" in
        *icrosoft*WSL2 | *icrosoft*wsl2) IS_WSL2=true;;
        *icrosoft) error "Microsoft WSL1 is not currently supported. Please use WSL2 with 'wsl --set-version <distro> 2'" ;;
        *) ;;
    esac
    
    VER_PARAM="${OLLAMA_VERSION:+?version=$OLLAMA_VERSION}"
    
    SUDO=
    if [ "$(id -u)" -ne 0 ]; then
        # Not running as root: sudo is required
        if ! available sudo; then
            error "This script requires superuser permissions. Please re-run as root."
        fi
    
        SUDO="sudo"
    fi
    
    NEEDS=$(require curl awk grep sed tee xargs)
    if [ -n "$NEEDS" ]; then
        status "ERROR: The following tools are required but missing:"
        for NEED in $NEEDS; do
            echo "  - $NEED"
        done
        exit 1
    fi
    
    for BINDIR in /usr/local/bin /usr/bin /bin; do
        echo $PATH | grep -q $BINDIR && break || continue
    done
    OLLAMA_INSTALL_DIR=$(dirname ${BINDIR})
    
    if [ -d "$OLLAMA_INSTALL_DIR/lib/ollama" ] ; then
        status "Cleaning up old version at $OLLAMA_INSTALL_DIR/lib/ollama"
        $SUDO rm -rf "$OLLAMA_INSTALL_DIR/lib/ollama"
    fi
    
    status "Installing ollama to $OLLAMA_INSTALL_DIR"
    $SUDO install -o0 -g0 -m755 -d $BINDIR
    $SUDO install -o0 -g0 -m755 -d "$OLLAMA_INSTALL_DIR/lib/ollama"
    
    status "Downloading Linux ${ARCH} bundle"
    if download_with_proxy "https://ollama.com/download/ollama-linux-${ARCH}.tgz${VER_PARAM}" "$TEMP_DIR/ollama.tgz"; then
        $SUDO tar -xzf "$TEMP_DIR/ollama.tgz" -C "$OLLAMA_INSTALL_DIR"
        
        if [ "$OLLAMA_INSTALL_DIR/bin/ollama" != "$BINDIR/ollama" ] ; then
            status "Making ollama accessible in the PATH in $BINDIR"
            $SUDO ln -sf "$OLLAMA_INSTALL_DIR/ollama" "$BINDIR/ollama"
        fi
    else
        error "Failed to download Ollama bundle"
    fi
    
    # Check for NVIDIA JetPack systems with additional downloads
    if [ -f /etc/nv_tegra_release ] ; then
        if grep R36 /etc/nv_tegra_release > /dev/null ; then
            status "Downloading JetPack 6 components"
            if download_with_proxy "https://ollama.com/download/ollama-linux-${ARCH}-jetpack6.tgz${VER_PARAM}" "$TEMP_DIR/ollama-jetpack.tgz"; then
                $SUDO tar -xzf "$TEMP_DIR/ollama-jetpack.tgz" -C "$OLLAMA_INSTALL_DIR"
            else
                error "Failed to download JetPack 6 components"
            fi
        elif grep R35 /etc/nv_tegra_release > /dev/null ; then
            status "Downloading JetPack 5 components"
            if download_with_proxy "https://ollama.com/download/ollama-linux-${ARCH}-jetpack5.tgz${VER_PARAM}" "$TEMP_DIR/ollama-jetpack.tgz"; then
                $SUDO tar -xzf "$TEMP_DIR/ollama-jetpack.tgz" -C "$OLLAMA_INSTALL_DIR"
            else
                error "Failed to download JetPack 5 components"
            fi
        else
            warning "Unsupported JetPack version detected. GPU may not be supported"
        fi
    fi
    
    install_success() {
        status 'The Ollama API is now available at 127.0.0.1:11434.'
        status 'Install complete. Run "ollama" from the command line.'
    }
    trap install_success EXIT
    
    start_ollama() {
        if [ -f "$BINDIR/ollama" ]; then
            status "Starting Ollama..."
            nohup $BINDIR/ollama serve > /var/log/ollama.log 2>&1 &
            PID=$!
            status "Ollama started with PID $PID"
        else
            error "Ollama binary not found at $BINDIR/ollama"
        fi
    }
    
    # Since we're in a container, skip systemd setup and directly start ollama
    status "systemd not available in container. Starting Ollama directly..."
    start_ollama
    
    # WSL2 only supports GPUs via nvidia passthrough
    # so check for nvidia-smi to determine if GPU is available
    if [ "$IS_WSL2" = true ]; then
        if available nvidia-smi && [ -n "$(nvidia-smi | grep -o "CUDA Version: [0-9]*\.[0-9]*")" ]; then
            status "Nvidia GPU detected."
        fi
        install_success
        exit 0
    fi
    
    # Don't attempt to install drivers on Jetson systems
    if [ -f /etc/nv_tegra_release ] ; then
        status "NVIDIA JetPack ready."
        install_success
        exit 0
    fi
    
    # Install GPU dependencies on Linux
    if ! available lspci && ! available lshw; then
        warning "Unable to detect NVIDIA/AMD GPU. Install lspci or lshw to automatically detect and install GPU dependencies."
        exit 0
    fi
    
    check_gpu() {
        # Look for devices based on vendor ID for NVIDIA and AMD
        case $1 in
            lspci)
                case $2 in
                    nvidia) available lspci && lspci -d '10de:' | grep -q 'NVIDIA' || return 1 ;;
                    amdgpu) available lspci && lspci -d '1002:' | grep -q 'AMD' || return 1 ;;
                esac ;;
            lshw)
                case $2 in
                    nvidia) available lshw && $SUDO lshw -c display -numeric -disable network | grep -q 'vendor: .* \[10DE\]' || return 1 ;;
                    amdgpu) available lshw && $SUDO lshw -c display -numeric -disable network | grep -q 'vendor: .* \[1002\]' || return 1 ;;
                esac ;;
            nvidia-smi) available nvidia-smi || return 1 ;;
        esac
    }
    
    if check_gpu nvidia-smi; then
        status "NVIDIA GPU installed."
        exit 0
    fi
    
    if ! check_gpu lspci nvidia && ! check_gpu lshw nvidia && ! check_gpu lspci amdgpu && ! check_gpu lshw amdgpu; then
        install_success
        warning "No NVIDIA/AMD GPU detected. Ollama will run in CPU-only mode."
        exit 0
    fi
    
    if check_gpu lspci amdgpu || check_gpu lshw amdgpu; then
        status "Downloading Linux ROCm ${ARCH} bundle"
        if download_with_proxy "https://ollama.com/download/ollama-linux-${ARCH}-rocm.tgz${VER_PARAM}" "$TEMP_DIR/ollama-rocm.tgz"; then
            $SUDO tar -xzf "$TEMP_DIR/ollama-rocm.tgz" -C "$OLLAMA_INSTALL_DIR"
            
            install_success
            status "AMD GPU ready."
            exit 0
        else
            error "Failed to download AMD GPU bundle"
        fi
    fi
    
    CUDA_REPO_ERR_MSG="NVIDIA GPU detected, but your OS and Architecture are not supported by NVIDIA. Please install the CUDA driver manually https://docs.nvidia.com/cuda/cuda-installation-guide-linux/"
    # ref: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#rhel-7-centos-7
    # ref: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#rhel-8-rocky-8
    # ref: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#rhel-9-rocky-9
    # ref: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#fedora
    install_cuda_driver_yum() {
        status 'Installing NVIDIA repository...'
        
        case $PACKAGE_MANAGER in
            yum)
                $SUDO $PACKAGE_MANAGER -y install yum-utils
                if curl -I --silent --fail --location "https://developer.download.nvidia.com/compute/cuda/repos/$1$2/$(uname -m | sed -e 's/aarch64/sbsa/')/cuda-$1$2.repo" >/dev/null ; then
                    $SUDO $PACKAGE_MANAGER-config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/$1$2/$(uname -m | sed -e 's/aarch64/sbsa/')/cuda-$1$2.repo
                else
                    error $CUDA_REPO_ERR_MSG
                fi
                ;;
            dnf)
                if curl -I --silent --fail --location "https://developer.download.nvidia.com/compute/cuda/repos/$1$2/$(uname -m | sed -e 's/aarch64/sbsa/')/cuda-$1$2.repo" >/dev/null ; then
                    $SUDO $PACKAGE_MANAGER config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/$1$2/$(uname -m | sed -e 's/aarch64/sbsa/')/cuda-$1$2.repo
                else
                    error $CUDA_REPO_ERR_MSG
                fi
                ;;
        esac
    
        case $1 in
            rhel)
                status 'Installing EPEL repository...'
                # EPEL is required for third-party dependencies such as dkms and libvdpau
                $SUDO $PACKAGE_MANAGER -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-$2.noarch.rpm || true
                ;;
        esac
    
        status 'Installing CUDA driver...'
    
        if [ "$1" = 'centos' ] || [ "$1$2" = 'rhel7' ]; then
            $SUDO $PACKAGE_MANAGER -y install nvidia-driver-latest-dkms
        fi
    
        $SUDO $PACKAGE_MANAGER -y install cuda-drivers
    }
    
    # ref: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#ubuntu
    # ref: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#debian
    install_cuda_driver_apt() {
        status 'Installing NVIDIA repository...'
        if curl -I --silent --fail --location "https://developer.download.nvidia.com/compute/cuda/repos/$1$2/$(uname -m | sed -e 's/aarch64/sbsa/')/cuda-keyring_1.1-1_all.deb" >/dev/null ; then
            curl -fsSL -o $TEMP_DIR/cuda-keyring.deb https://developer.download.nvidia.com/compute/cuda/repos/$1$2/$(uname -m | sed -e 's/aarch64/sbsa/')/cuda-keyring_1.1-1_all.deb
        else
            error $CUDA_REPO_ERR_MSG
        fi
    
        case $1 in
            debian)
                status 'Enabling contrib sources...'
                $SUDO sed 's/main/contrib/' < /etc/apt/sources.list | $SUDO tee /etc/apt/sources.list.d/contrib.list > /dev/null
                if [ -f "/etc/apt/sources.list.d/debian.sources" ]; then
                    $SUDO sed 's/main/contrib/' < /etc/apt/sources.list.d/debian.sources | $SUDO tee /etc/apt/sources.list.d/contrib.sources > /dev/null
                fi
                ;;
        esac
    
        status 'Installing CUDA driver...'
        $SUDO dpkg -i $TEMP_DIR/cuda-keyring.deb
        $SUDO apt-get update
    
        [ -n "$SUDO" ] && SUDO_E="$SUDO -E" || SUDO_E=
        DEBIAN_FRONTEND=noninteractive $SUDO_E apt-get -y install cuda-drivers -q
    }
    
    if [ ! -f "/etc/os-release" ]; then
        error "Unknown distribution. Skipping CUDA installation."
    fi
    
    . /etc/os-release
    
    OS_NAME=$ID
    OS_VERSION=$VERSION_ID
    
    PACKAGE_MANAGER=
    for PACKAGE_MANAGER in dnf yum apt-get; do
        if available $PACKAGE_MANAGER; then
            break
        fi
    done
    
    if [ -z "$PACKAGE_MANAGER" ]; then
        error "Unknown package manager. Skipping CUDA installation."
    fi
    
    if ! check_gpu nvidia-smi || [ -z "$(nvidia-smi | grep -o "CUDA Version: [0-9]*\.[0-9]*")" ]; then
        case $OS_NAME in
            centos|rhel) install_cuda_driver_yum 'rhel' $(echo $OS_VERSION | cut -d '.' -f 1) ;;
            rocky) install_cuda_driver_yum 'rhel' $(echo $OS_VERSION | cut -c1) ;;
            fedora) [ $OS_VERSION -lt '39' ] && install_cuda_driver_yum $OS_NAME $OS_VERSION || install_cuda_driver_yum $OS_NAME '39';;
            amzn) install_cuda_driver_yum 'fedora' '37' ;;
            debian) install_cuda_driver_apt $OS_NAME $OS_VERSION ;;
            ubuntu) install_cuda_driver_apt $OS_NAME $(echo $OS_VERSION | sed 's/\.//') ;;
            *) exit ;;
        esac
    fi
    
    if ! lsmod | grep -q nvidia || ! lsmod | grep -q nvidia_uvm; then
        KERNEL_RELEASE="$(uname -r)"
        case $OS_NAME in
            rocky) $SUDO $PACKAGE_MANAGER -y install kernel-devel kernel-headers ;;
            centos|rhel|amzn) $SUDO $PACKAGE_MANAGER -y install kernel-devel-$KERNEL_RELEASE kernel-headers-$KERNEL_RELEASE ;;
            fedora) $SUDO $PACKAGE_MANAGER -y install kernel-devel-$KERNEL_RELEASE ;;
            debian|ubuntu) $SUDO apt-get -y install linux-headers-$KERNEL_RELEASE ;;
            *) exit ;;
        esac
    
        NVIDIA_CUDA_VERSION=$($SUDO dkms status | awk -F: '/added/ { print $1 }')
        if [ -n "$NVIDIA_CUDA_VERSION" ]; then
            $SUDO dkms install $NVIDIA_CUDA_VERSION
        fi
    
        if lsmod | grep -q nouveau; then
            status 'Reboot to complete NVIDIA CUDA driver install.'
            exit 0
        fi
    
        $SUDO modprobe nvidia
        $SUDO modprobe nvidia_uvm
    fi
    
    # make sure the NVIDIA modules are loaded on boot with nvidia-persistenced
    if available nvidia-persistenced; then
        $SUDO touch /etc/modules-load.d/nvidia.conf
        MODULES="nvidia nvidia-uvm"
        for MODULE in $MODULES; do
            if ! grep -qxF "$MODULE" /etc/modules-load.d/nvidia.conf; then
                echo "$MODULE" | $SUDO tee -a /etc/modules-load.d/nvidia.conf > /dev/null
            fi
        done
    fi
    
    status "NVIDIA GPU ready."
    install_success
  3. Run the script to install Ollama (a quick health check follows this list):

    shell
    # Write the modified install script into install.sh
    root@is-c76hhq6otrme4ygb-devmachine-0:~# vim install.sh 
    root@is-c76hhq6otrme4ygb-devmachine-0:~# chmod +x install.sh 
    root@is-c76hhq6otrme4ygb-devmachine-0:~# ./install.sh 
    >>> Installing ollama to /usr/local
    >>> Downloading Linux amd64 bundle
    % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                    Dload  Upload   Total   Spent    Left  Speed
    65 1583M   65 1042M    0     0  6860k      0  0:03:56  0:02:35  0:01:21 7594k
    curl: (18) transfer closed with 567067073 bytes remaining to read
    Download failed. Retrying in 5 seconds... (Attempt 1 of 3)
    % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                    Dload  Upload   Total   Spent    Left  Speed
    100 1583M  100 1583M    0     0  6562k      0  0:04:07  0:04:07 --:--:-- 14.3M
    >>> Creating ollama user...
    >>> Adding ollama user to video group...
    >>> Adding current user to ollama group...
    >>> Creating ollama systemd service...
    >>> NVIDIA GPU installed.
    >>> The Ollama API is now available at 127.0.0.1:11434.
    >>> Install complete. Run "ollama" from the command line.
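
Once the script finishes, a quick request to the local port confirms the server is up; per the final status message, the API listens on 127.0.0.1:11434:

shell
curl http://127.0.0.1:11434
# Expected response: Ollama is running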

Local Inference

Run ollama run gemma2:2b: ollama downloads the model directly and starts the local inference service.

shell
root@is-c76hhq6otrme4ygb-devmachine-0:~# ollama run gemma2:2b
pulling manifest 
pulling 7462734796d6... 100% ▕███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 1.6 GB                         
pulling e0a42594d802... 100% ▕███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏  358 B                         
pulling 097a36493f71... 100% ▕███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 8.4 KB                         
pulling 2490e7468436... 100% ▕███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏   65 B                         
pulling e18ad7af7efb... 100% ▕███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏  487 B                         
verifying sha256 digest 
writing manifest 
success 
>>> who are you?
I am Gemma, an open-weights AI assistant. I'm a large language model created by the Gemma team at Google DeepMind.
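
Besides the interactive CLI, the running server can be queried over Ollama's local REST API. A minimal sketch against the /api/generate endpoint, with streaming disabled so the response arrives as a single JSON object:

shell
curl http://127.0.0.1:11434/api/generate -d '{
  "model": "gemma2:2b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'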

Note

If the ollama service is not running, open a new session with the tmux tool and run ollama serve to start the Ollama inference service. Then switch back to the original session and run ollama run gemma2:2b.
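
A minimal sketch of that workflow (the session name is arbitrary):

shell
# In a new tmux session, start the server in the foreground
tmux new -s ollama
ollama serve
# Detach with Ctrl-b d, then, back in the original session:
ollama run gemma2:2b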

Resources and Support

  • Accessing HTTP services inside the dev machine from the public internet: you may need to reach services running inside the dev machine (for example an inference API or a service with a UI) from the public internet, but for now the dev machine only exposes an address on the cloud platform's internal network and cannot be reached from outside directly. In that case, you can use the dev machine's SSH service and SSH port forwarding to map an internal port of the dev machine to your local computer, giving you direct access to the internal service; see the sketch below.
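
As a minimal sketch, assuming Ollama listens on its default port 11434 inside the dev machine (user and host below are placeholders for your dev machine's SSH endpoint):

shell
# Map the dev machine's 127.0.0.1:11434 to localhost:11434 on your own computer
ssh -N -L 11434:127.0.0.1:11434 <user>@<devmachine-ssh-host>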