How to Build a Docker Image That Provides NLP Services
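This article describes how to build a Docker image that serves NLP functionality. The image does the following:
- Builds on hanlp 2.x and jionlp to wrap common NLP operations, such as named entity extraction, extraction of QQ numbers, WeChat IDs, and ID card numbers, and text similarity comparison
- Exposes these operations as services through fastapi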
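To make the kind of helper listed above concrete, here is a minimal, dependency-free sketch. It does not call hanlp or jionlp: the regular expressions, the function names, and the use of `difflib` for similarity are all assumptions chosen for illustration, and they are much looser than what those libraries actually implement.

```python
import re
from difflib import SequenceMatcher

# Simplified patterns, assumed for illustration only; real libraries apply stricter rules.
ID_CARD_RE = re.compile(r"(?<!\d)\d{17}[\dXx](?!\d)")
QQ_RE = re.compile(r"(?<!\d)[1-9]\d{4,10}(?!\d)")


def extract_id_cards(text: str) -> list[str]:
    """Return every 18-character ID card candidate (17 digits plus digit or X)."""
    return ID_CARD_RE.findall(text)


def extract_qq(text: str) -> list[str]:
    """Return every standalone run of 5 to 11 digits that could be a QQ number."""
    return QQ_RE.findall(text)


def similarity(a: str, b: str) -> float:
    """Character-level similarity ratio in [0, 1]; this is not a semantic measure."""
    return SequenceMatcher(None, a, b).ratio()


if __name__ == "__main__":
    print(extract_qq("联系QQ:12345678"))
    print(similarity("abcd", "abef"))
```

In the real service, each of these functions would presumably delegate to the corresponding hanlp or jionlp call and be exposed as a fastapi route.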
Dockerfile contents
# syntax=docker/dockerfile:1
# use build args in the docker build command with --build-arg="BUILDARG=true"
# Override at your own risk - non-root configurations are untested
ARG UID=0
ARG GID=0

FROM python:3.11-slim-bookworm AS base

# Use args
ARG UID
ARG GID

## Basis ##
ENV ENV=prod \
    PORT=10068

WORKDIR /app/backend

ENV HOME=/root

# Create user and group if not root
RUN if [ $UID -ne 0 ]; then \
      if [ $GID -ne 0 ]; then \
        addgroup --gid $GID app; \
      fi; \
      adduser --uid $UID --gid $GID --home $HOME --disabled-password --no-create-home app; \
    fi

# Make sure the user has access to the app and root directory
RUN chown -R $UID:$GID /app $HOME

# Install curl, jq and procps, then clean up the apt lists
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl jq procps && \
    rm -rf /var/lib/apt/lists/*

# install python dependencies
COPY --chown=$UID:$GID ./backend/requirements.txt ./requirements.txt

RUN pip install numpy==1.26.4 --no-cache-dir && \
    pip3 install torch --index-url https://download.pytorch.org/whl/cpu --no-cache-dir

RUN pip3 install uv && \
    uv pip install --system -r requirements.txt --no-cache-dir && \
    chown -R $UID:$GID /app/backend

# copy backend files
COPY --chown=$UID:$GID ./backend .

EXPOSE 10068

HEALTHCHECK CMD curl --silent --fail http://localhost:${PORT:-10068}/health | jq -ne 'input.status == true' || exit 1

USER $UID:$GID

CMD ["bash", "start.sh"]
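The HEALTHCHECK instruction defines a small contract: `GET /health` must succeed and return JSON whose `status` field is the boolean `true`. As a rough sketch of that same check outside the container, the Python below mirrors the `curl | jq` pipeline using only the standard library. The function names `is_healthy` and `probe` are hypothetical, and the default URL simply assumes the port 10068 from the Dockerfile.

```python
import json
import urllib.request


def is_healthy(body: bytes) -> bool:
    """Mirror jq's `input.status == true`: only a JSON boolean true passes."""
    try:
        payload = json.loads(body)
    except ValueError:
        return False  # body is not valid JSON
    return isinstance(payload, dict) and payload.get("status") is True


def probe(url: str = "http://localhost:10068/health", timeout: float = 5.0) -> bool:
    """Fetch the health endpoint; connection failures and HTTP error statuses count as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_healthy(resp.read())
    except OSError:  # URLError and HTTPError are both OSError subclasses
        return False


if __name__ == "__main__":
    print("healthy" if probe() else "unhealthy")
```

Note that a response such as `{"status": "true"}` or `{"status": 1}` fails the check, just as it would with the `jq` expression, so the endpoint has to return a real JSON boolean.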
requirements.txt contents
# fastapi
fastapi==0.111.0
uvicorn[standard]==0.30.1
pydantic==2.8.2
python-multipart==0.0.9

requests==2.32.3
aiohttp==3.10.2

# config
Jinja2==3.1.4
alembic==1.13.2

# hanlp
hanlp==2.1.0b60

# jionlp
jiojio==1.2.5
jionlp==1.5.15
zipfile36==0.1.3
start.sh contents
#!/usr/bin/env bash

SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
cd "$SCRIPT_DIR" || exit

PORT="${PORT:-10068}"
HOST="${HOST:-0.0.0.0}"

exec uvicorn main:app --host "$HOST" --port "$PORT" --forwarded-allow-ips '*'
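The script starts `main:app`, but the article does not show `main.py`. As a hedged illustration of the minimum that module has to provide, the sketch below answers `GET /health` with `{"status": true}`. It is written as a bare ASGI callable so that it stays dependency-free; in the actual project this route would presumably be declared on a fastapi application instead. The `call` helper is hypothetical and exists only to exercise the app in-process.

```python
import asyncio
import json


async def app(scope, receive, send):
    """Bare ASGI callable: GET /health returns {"status": true}, anything else 404."""
    if scope["type"] != "http":
        return  # lifespan and websocket scopes are ignored in this sketch
    ok = scope["method"] == "GET" and scope["path"] == "/health"
    body = json.dumps({"status": True} if ok else {"detail": "Not Found"}).encode()
    await send({
        "type": "http.response.start",
        "status": 200 if ok else 404,
        "headers": [(b"content-type", b"application/json")],
    })
    await send({"type": "http.response.body", "body": body})


async def call(path: str, method: str = "GET"):
    """Drive the app in-process, without a server, and return (status, parsed body)."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await app({"type": "http", "method": method, "path": path}, receive, send)
    return sent[0]["status"], json.loads(sent[1]["body"])


if __name__ == "__main__":
    print(asyncio.run(call("/health")))
```

Saved as `main.py` next to `start.sh`, a module like this can be served by the same `uvicorn main:app` command, since uvicorn accepts any ASGI callable.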