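Author: Gustavo Llermaly, Elastic
Use Ollama to build a RAG application in Go that takes advantage of local models.
There is a lot to say about the many open models available. Some are well known, such as the Mixtral family, which comes in several sizes; a lesser-known one is openbiollm, an adaptation of Llama 3 for the medical domain. Testing all of these models by implementing each of their APIs would take a lot of work. Ollama, however, lets us try them through a friendly interface and a simple command line.
In this article, we will build a RAG application in Golang, using Ollama as the LLM server and Elasticsearch as the vector database.
Steps
- Install Ollama
- Ingest data
- RAG application in Go
Install Ollama
What is Ollama?
Ollama is a framework that lets you download and access models locally through a CLI. With simple commands, we can download a model, chat with it, and set up a server that exposes the models we want to use from our application.
Download the Ollama installer here: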
https://ollama.com/
The library of available models is here:
https://ollama.com/library
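Once Ollama is installed, we can check that everything works by running one of the available models. Let's install llama3.2 with 3B parameters. The library page lists the command needed to download and run each model.
We will run the command for the 3B version: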
ollama run llama3.2:latest
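The first time, it downloads the model and then opens a chat in the terminal.
Now we can type /exit to leave the chat and use the server, which is available at http://localhost:11434. Let's test the endpoints to make sure everything works as expected.
Ollama offers two answer modes: generate, which returns a single answer, and chat, which holds a conversation with the model:
Generate
We use generate when we want a single answer to a single question, with no follow-up.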
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "stream": false,
  "prompt": "Why Elastic is so cool?"
}'
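As a rough illustration of how the two modes differ on the wire, the sketch below builds the JSON bodies for both endpoints. The struct and function names are ours and are not part of any Ollama client library; only the JSON field names (model, prompt, messages, stream) follow the Ollama REST API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// GenerateRequest mirrors the body sent to /api/generate: one prompt, one answer.
type GenerateRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
	Stream bool   `json:"stream"`
}

// ChatMessage is one turn of a conversation.
type ChatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// ChatRequest mirrors the body sent to /api/chat: a list of messages.
type ChatRequest struct {
	Model    string        `json:"model"`
	Messages []ChatMessage `json:"messages"`
	Stream   bool          `json:"stream"`
}

// encode marshals a request body and wraps any error with context.
func encode(v any) (string, error) {
	b, err := json.Marshal(v)
	if err != nil {
		return "", fmt.Errorf("encoding request: %w", err)
	}
	return string(b), nil
}

func main() {
	gen, err := encode(GenerateRequest{Model: "llama3.2", Prompt: "Why Elastic is so cool?"})
	if err != nil {
		panic(err)
	}
	chat, err := encode(ChatRequest{
		Model:    "llama3.2",
		Messages: []ChatMessage{{Role: "user", Content: "Why Elastic is so cool?"}},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println("POST /api/generate", gen)
	fmt.Println("POST /api/chat", chat)
}
```

The chat body carries the whole conversation on every call, which is why it suits multi-turn exchanges, while the generate body carries a single prompt.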
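By default, answers are generated with stream: true, but we will use stream: false so that the answer arrives in a single message and is easier to read. stream: true is useful in UI applications because tokens are sent as they are generated, instead of blocking until the whole response is complete.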
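Let's move on to the data.
Ingest data
Let's index some medical documents in Elasticsearch as both text and vectors. We will use them to compare the answer quality of a medicine-oriented model such as openbiollm against a general-purpose model.
Before we start, make sure we have created an inference endpoint that uses ELSER as our embedding model: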
PUT _inference/sparse_embedding/my-elser-model
{
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  }
}
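Now let's create the index using the semantic_text field type, which manages chunking and the vector configuration for us. This lets our index support full-text, semantic, and hybrid search.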
PUT rag-ollama
{
  "mappings": {
    "properties": {
      "semantic_field": {
        "type": "semantic_text",
        "inference_id": "my-elser-model"
      },
      "content": {
        "type": "text",
        "copy_to": "semantic_field"
      }
    }
  }
}
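Once the index exists, the semantic field can be searched with a semantic query. As a sketch, the helper below builds the JSON body for such a search; the function name semanticQueryBody is ours, and the field name assumes the semantic_field mapping defined above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// semanticQueryBody builds a search body that runs a semantic query
// against the given field and limits the number of hits.
func semanticQueryBody(field, query string, size int) (string, error) {
	body := map[string]any{
		"size": size,
		"query": map[string]any{
			"semantic": map[string]any{
				"field": field,
				"query": query,
			},
		},
	}
	b, err := json.Marshal(body)
	if err != nil {
		return "", fmt.Errorf("encoding search body: %w", err)
	}
	return string(b), nil
}

func main() {
	body, err := semanticQueryBody("semantic_field", "resistant hypertension", 3)
	if err != nil {
		panic(err)
	}
	// The body can be sent as: GET rag-ollama/_search
	fmt.Println(body)
}
```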
Now, let's index the documents:
POST _bulk
{"index":{"_index":"rag-ollama"}}
{"title":"JAK Inhibitors vs. Monoclonal Antibodies in Rheumatoid Arthritis Treatment","content":"This article compares the mechanisms of action, efficacy, and safety profiles of JAK inhibitors and monoclonal antibodies in rheumatoid arthritis treatment, including recent clinical trial data and real-world evidence. It discusses the intracellular signaling pathways targeted by JAK inhibitors, their rapid onset of action, and oral administration advantages. The article also covers the specific targets of various monoclonal antibodies, their long-term safety profiles, and the criteria for choosing between these two classes of drugs based on patient characteristics and disease severity."}
{"index":{"_index":"rag-ollama"}}
{"title":"Diagnostic Approach to Resistant Hypertension: Focus on Primary Aldosteronism","content":"This guide outlines the step-by-step diagnostic process for resistant hypertension, with a particular emphasis on screening and confirming primary aldosteronism. It details the use of aldosterone-renin ratio (ARR) testing as an initial screening tool, explaining proper patient preparation and interpretation of results. The guide also covers confirmatory tests such as the saline infusion test and captopril challenge test, their protocols, and diagnostic criteria. Additionally, it discusses the role of imaging studies in localizing aldosterone-producing adenomas and the importance of adrenal vein sampling in subtype classification of primary aldosteronism."}
{"index":{"_index":"rag-ollama"}}
{"title":"Gut Microbiota Diversity and Inflammatory Cytokine Production in IBD","content":"This study examines the relationship between gut microbiota diversity and the production of pro-inflammatory cytokines in inflammatory bowel diseases (IBD). It explores how reduced microbial diversity correlates with increased levels of cytokines such as TNF-α, IL-1β, and IL-6 in both Crohn's disease and ulcerative colitis. The research discusses specific bacterial species associated with anti-inflammatory effects and their mechanisms of action. Furthermore, it delves into potential therapeutic implications, including the use of prebiotics, probiotics, and fecal microbiota transplantation to modulate the gut microbiome and influence cytokine production. The study also touches on emerging microbiome-based interventions and their potential to complement existing IBD treatments."}
{"index":{"_index":"rag-ollama"}}
{"title":"Biological Therapy Selection in Rheumatoid Arthritis After csDMARD Failure","content":"This article provides a comprehensive framework for selecting appropriate biological therapy in rheumatoid arthritis patients who have not responded adequately to conventional synthetic Disease-Modifying Antirheumatic Drugs (csDMARDs). It discusses the various classes of biologics available, including TNF inhibitors, IL-6 inhibitors, B-cell depleting agents, and T-cell costimulation modulators. The article outlines key factors to consider in the decision-making process, such as disease activity scores, extra-articular manifestations, comorbidities, and patient preferences. It also addresses the importance of biomarkers and predictors of treatment response in guiding therapy selection. The piece concludes with a discussion on cycling versus switching mechanisms of action when faced with inadequate response to an initial biologic agent."}
{"index":{"_index":"rag-ollama"}}
{"title":"Hypertension Management in Chronic Kidney Disease: Special Considerations","content":"This review discusses the unique challenges in managing hypertension in patients with chronic kidney disease (CKD). It outlines the current recommendations for blood pressure targets in CKD patients, explaining how these differ based on the presence and degree of albuminuria. The article explores the preferred classes of antihypertensive medications in CKD, with a focus on renin-angiotensin system blockers and their renoprotective effects. It addresses the complexities of managing volume status in CKD and the role of diuretics. The review also covers the impact of proteinuria on treatment decisions and the need for more aggressive blood pressure control in heavily proteinuric patients. Finally, it discusses considerations for patients on dialysis and the phenomenon of reverse epidemiology in end-stage renal disease."}
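The bulk body above alternates an action line with a document line, each terminated by a newline. If the documents live in Go structs, a body of the same shape could be generated with a small helper like the one below. This is only a sketch: the Doc type and the bulkBody function are our own names, and the sample documents are placeholders.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Doc is a hypothetical document with the two fields used in this article.
type Doc struct {
	Title   string `json:"title"`
	Content string `json:"content"`
}

// bulkBody renders documents as newline-delimited JSON for POST _bulk:
// one "index" action line followed by one source line per document.
func bulkBody(index string, docs []Doc) (string, error) {
	var sb strings.Builder
	for i, d := range docs {
		action, err := json.Marshal(map[string]any{
			"index": map[string]any{"_index": index},
		})
		if err != nil {
			return "", fmt.Errorf("encoding action %d: %w", i, err)
		}
		source, err := json.Marshal(d)
		if err != nil {
			return "", fmt.Errorf("encoding document %d: %w", i, err)
		}
		sb.Write(action)
		sb.WriteByte('\n')
		sb.Write(source)
		sb.WriteByte('\n') // the bulk API requires a trailing newline
	}
	return sb.String(), nil
}

func main() {
	body, err := bulkBody("rag-ollama", []Doc{
		{Title: "First title", Content: "First content"},
		{Title: "Second title", Content: "Second content"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(body)
}
```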
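Done! Now that the model and the data are ready, we can bring everything together in our Go application.
RAG application in Go
Our Go application could call the Ollama server directly, but I decided to use parakeet instead. Parakeet is a Go library for building text-based GenAI applications. It provides Go interfaces that abstract the HTTP communication, plus helpers for embeddings, chunking, memory, and more, so building an application becomes very easy.
We will start by creating the working folder and setting up the dependencies: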
mkdir ollama-rag
cd ollama-rag
go mod init ollama-rag
go get github.com/parakeet-nest/parakeet
go get github.com/elastic/go-elasticsearch/v8@latest
Now, create a main.go file with the minimum needed to test that everything is configured correctly:
main.go
package main

import (
	"fmt"
	"log"

	"github.com/parakeet-nest/parakeet/completion"
	"github.com/parakeet-nest/parakeet/enums/option"
	"github.com/parakeet-nest/parakeet/llm"
)

func main() {
	ollamaUrl := "http://localhost:11434"
	model := "llama3.2:latest"

	options := llm.SetOptions(map[string]interface{}{
		option.Temperature: 0.5,
	})

	question := llm.GenQuery{
		Model:   model,
		Prompt:  "Why Elastic is so cool?, answer in one sentence",
		Options: options,
	}

	// We use generate because we are going to run this script to ask a single question
	answer, err := completion.Generate(ollamaUrl, question)
	if err != nil {
		log.Fatal("😡:", err)
	}
	fmt.Println(answer.Response)
}
Run it:
go run main.go
In the terminal, you should see an answer similar to this:
Elastic, a company known for its innovative and user-friendly software solutions, has disrupted the traditional IT industry by empowering businesses to create, deploy, and manage applications quickly and reliably.
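This answer is based on the LLM's training data, which we can neither supply nor control, and that has some drawbacks:
- The information may be wrong
- The information may be outdated
- There is no way to cite sources
Now let's create a file named elasticsearch/elasticsearch.go that connects to Elasticsearch with the official Go client, so that we can use the information in our documents to generate answers grounded in our own data.
Note: for more on how to connect to Elasticsearch, ingest data, and search, see the article “Elasticsearch:运用 Go 语言实现 Elasticsearch 搜索 - 8.x” ("Elasticsearch: Implementing Elasticsearch search in Go - 8.x").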
elasticsearch/elasticsearch.go
package elasticsearch

import (
	"context"
	"encoding/json"
	"fmt"
	"strings"

	"github.com/elastic/go-elasticsearch/v8"
	"github.com/elastic/go-elasticsearch/v8/typedapi/types"
)

// Initializing elasticsearch client
func EsClient() (*elasticsearch.TypedClient, error) {
	var cloudID = "" // your Elastic Cloud ID Here
	var apiKey = ""  // your Elastic ApiKey Here

	es, err := elasticsearch.NewTypedClient(elasticsearch.Config{
		CloudID: cloudID,
		APIKey:  apiKey,
	})
	if err != nil {
		return nil, fmt.Errorf("unable to connect: %w", err)
	}
	return es, nil
}

// Searching for documents and building the context
func SemanticRetriever(client *elasticsearch.TypedClient, query string, size int) (string, error) {
	// Perform the semantic search
	res, err := client.Search().
		Index("rag-ollama").
		Query(&types.Query{
			Semantic: &types.SemanticQuery{
				Field: "semantic_field",
				Query: query,
			},
		}).
		Size(size).
		Do(context.Background())
	if err != nil {
		return "", fmt.Errorf("semantic search failed: %w", err)
	}

	// Prepare to format the results
	var output strings.Builder
	output.WriteString("Documents found\n\n")

	// Iterate through the search hits
	for i, hit := range res.Hits.Hits {
		// Define a struct to unmarshal each document
		var doc struct {
			Title   string `json:"title"`
			Content string `json:"content"`
		}

		// Unmarshal the document source into our struct
		if err := json.Unmarshal(hit.Source_, &doc); err != nil {
			return "", fmt.Errorf("failed to unmarshal document %d: %w", i, err)
		}

		// Append the formatted document to our output
		output.WriteString(fmt.Sprintf("Title\n%s\n\nContent\n%s\n", doc.Title, doc.Content))

		// Add a separator between documents, except for the last one
		if i < len(res.Hits.Hits)-1 {
			output.WriteString("\n-----\n\n")
		}
	}

	// Return the formatted output as a string
	return output.String(), nil
}
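The EsClient function initializes the Elasticsearch client with the provided cloud credentials, and SemanticRetriever runs a semantic query to build the context the LLM needs to answer the question.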
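The context string that SemanticRetriever returns has a simple layout: a header, then a Title and Content block per document, with a separator between documents. One way to check that layout without a running cluster is to isolate the formatting in a pure function, as in the sketch below. The Doc type and the formatContext name are our own and do not appear in the article's code.

```go
package main

import (
	"fmt"
	"strings"
)

// Doc is a hypothetical, already-decoded search hit.
type Doc struct {
	Title   string
	Content string
}

// formatContext lays documents out the same way SemanticRetriever does:
// a header, one block per document, and a separator between blocks.
func formatContext(docs []Doc) string {
	var sb strings.Builder
	sb.WriteString("Documents found\n\n")
	for i, d := range docs {
		fmt.Fprintf(&sb, "Title\n%s\n\nContent\n%s\n", d.Title, d.Content)
		if i < len(docs)-1 {
			sb.WriteString("\n-----\n\n")
		}
	}
	return sb.String()
}

func main() {
	fmt.Print(formatContext([]Doc{
		{Title: "First title", Content: "First content"},
		{Title: "Second title", Content: "Second content"},
	}))
}
```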
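To find your Cloud ID and API key, follow the link in the original article.
Let's go back to our main.go file and update it to call Elasticsearch and run the semantic query using the functions above. This builds the LLM context: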
main.go
package main

import (
	"fmt"
	"log"

	"ollama-rag/elasticsearch"

	"github.com/parakeet-nest/parakeet/completion"
	"github.com/parakeet-nest/parakeet/enums/option"
	"github.com/parakeet-nest/parakeet/llm"
)

func main() {
	ollamaUrl := "http://localhost:11434"
	chatModel := "llama3.2:latest"
	question := `Summarize document: JAK Inhibitors vs. Monoclonal Antibodies in Rheumatoid Arthritis Treatment`
	size := 3

	esClient, err := elasticsearch.EsClient()
	if err != nil {
		log.Fatalln("😡:", err)
	}

	// Retrieve documents from semantic query to build context
	documentsContent, err := elasticsearch.SemanticRetriever(esClient, question, size)
	if err != nil {
		log.Fatalln("😡:", err)
	}

	systemContent := `You are a helpful medical assistant. Only answer the questions based on found documents.
Add references to the base document titles and be succinct in your answers.`

	options := llm.SetOptions(map[string]interface{}{
		option.Temperature: 0.0,
	})

	queryChat := llm.Query{
		Model: chatModel,
		Messages: []llm.Message{
			{Role: "system", Content: systemContent},
			{Role: "system", Content: documentsContent},
			{Role: "user", Content: question},
		},
		Options: options,
	}

	fmt.Println()
	fmt.Println("🤖 answer:")

	// Answer the question, printing tokens as they are streamed
	_, err = completion.ChatStream(ollamaUrl, queryChat,
		func(answer llm.Answer) error {
			fmt.Print(answer.Message.Content)
			return nil
		})
	if err != nil {
		log.Fatal("😡:", err)
	}
	fmt.Println()
}
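As you can see, we send the user's question together with all the documents related to it. This is how we get an answer grounded in the documents stored in Elasticsearch.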
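The order of the messages matters: instructions first, retrieved documents second, and the user's question last. The sketch below isolates that assembly step using a local Message type as a stand-in for the library's message type. The buildMessages and render names, and the choice to skip the context message when retrieval returns nothing, are our own assumptions rather than something the article's code does.

```go
package main

import (
	"fmt"
	"strings"
)

// Message is a hypothetical stand-in for a chat message (role plus content).
type Message struct {
	Role    string
	Content string
}

// buildMessages orders the prompt: instructions, retrieved documents, question.
// The documents message is skipped when retrieval returned nothing.
func buildMessages(system, docs, question string) []Message {
	msgs := []Message{{Role: "system", Content: system}}
	if strings.TrimSpace(docs) != "" {
		msgs = append(msgs, Message{Role: "system", Content: docs})
	}
	return append(msgs, Message{Role: "user", Content: question})
}

// render lists each message as "[role] content" for inspection.
func render(msgs []Message) string {
	var sb strings.Builder
	for _, m := range msgs {
		fmt.Fprintf(&sb, "[%s] %s\n", m.Role, m.Content)
	}
	return sb.String()
}

func main() {
	msgs := buildMessages(
		"Only answer the questions based on found documents.",
		"Documents found ...",
		"Summarize document: ...",
	)
	fmt.Print(render(msgs))
}
```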
We can test it by running the code:
go run .
You should see something like this:
According to the article "JAK Inhibitors vs. Monoclonal Antibodies in Rheumatoid Arthritis Treatment", JAK inhibitors and monoclonal antibodies are two classes of drugs used to treat rheumatoid arthritis (RA). The main difference between them lies in their mechanisms of action:
JAK inhibitors target intracellular signaling pathways, specifically the Janus kinase (JAK) pathway, which is involved in inflammation and immune response. They have a rapid onset of action and are administered orally.
Monoclonal antibodies target specific proteins involved in the inflammatory process, such as tumor necrosis factor-alpha (TNF-α), interleukin-6 (IL-6), and interleukin-17 (IL-17).
The article highlights that JAK inhibitors have a more favorable safety profile compared to monoclonal antibodies, with fewer gastrointestinal side effects. However, the choice between these two classes of drugs depends on patient characteristics and disease severity.
"References:"
"JAK Inhibitors vs. Monoclonal Antibodies in Rheumatoid Arthritis Treatment" (document title)
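Parakeet handles the Ollama interaction for us, including token streaming! From here on, we can test different models very easily without changing the rest of the code.
Besides the models in the main library, we also have access to models uploaded by community members.
To use one of them, we just need to download it into Ollama first. For example, let's test openbiollm: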
ollama run taozhiyuai/openbiollm-llama-3:8b_q8_0
Once it is installed, we can use it from our Go code:
chatModel := "taozhiyuai/openbiollm-llama-3:8b_q8_0"
Let's run the same question again. Do you notice anything different?
In rheumatoid arthritis treatment, JAK inhibitors and monoclonal antibodies are commonly used. This article discusses the benefits and drawbacks of both therapies. JAK inhibitors work by targeting intracellular signaling pathways involved in the immune response. They have a rapid onset of action and can be administered orally, making them convenient for patients. Recent clinical trial data has shown that JAK inhibitors are effective at reducing inflammation and slowing joint damage progression in rheumatoid arthritis. However, there is still ongoing research to fully understand their long-term safety profile. Monoclonal antibodies, on the other hand, specifically target molecules involved in the immune system. These drugs have been found to be highly effective in managing symptoms of rheumatoid arthritis and improving joint function. They can provide prolonged symptom control and are often used as first-line treatment options. However, due to their complexity and unique administration requirements, monoclonal antibodies may not be suitable for all patients. In conclusion, both J AK inhibitors and monoclonal antibodies have their own advantages and disadvantages in treating rheumatoid arthritis. The choice of therapy depends on individual patient characteristics and disease severity. Ongoing research will contribute to a deeper understanding of the efficacy and safety profiles of these treatments, ultimately leading to improved care for patients with rheumatoid arthritis.
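The openbiollm model seems to provide more detail on the technical terms, but it did not follow the instructions to cite the documents provided in the context and to keep the answer short. Llama3.2, by contrast, followed the instructions better.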
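A comparison like this can be made a little less subjective with a crude check: does the answer mention any of the indexed document titles, and how long is it? The sketch below is only a rough heuristic of our own, not a real evaluation method, and the function names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// citedTitles returns the known document titles that appear in the answer,
// ignoring case.
func citedTitles(answer string, titles []string) []string {
	var found []string
	lower := strings.ToLower(answer)
	for _, t := range titles {
		if strings.Contains(lower, strings.ToLower(t)) {
			found = append(found, t)
		}
	}
	return found
}

// report summarizes how many titles were cited and the answer length in words.
func report(model, answer string, titles []string) string {
	return fmt.Sprintf("%s: %d cited title(s), %d words",
		model, len(citedTitles(answer, titles)), len(strings.Fields(answer)))
}

func main() {
	titles := []string{"JAK Inhibitors vs. Monoclonal Antibodies in Rheumatoid Arthritis Treatment"}
	// Shortened placeholder answers, not real model output.
	fmt.Println(report("model-a", `According to "JAK Inhibitors vs. Monoclonal Antibodies in Rheumatoid Arthritis Treatment", both are used.`, titles))
	fmt.Println(report("model-b", "Both therapies are commonly used.", titles))
}
```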
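The complete working example is linked from the original article.
Conclusion
Ollama offers a very direct and simple way to download and test different open models, from well-known ones to models fine-tuned by community members. Pairing it with Parakeet and the official Elasticsearch Go client makes it very easy to build a RAG application. In addition, by using the semantic_text field type, you can create an index that is ready for semantic queries with ELSER (Elastic's sparse embedding model) without any additional configuration, which simplifies chunking, indexing, and vector querying.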
Original article: Using Ollama and Go for RAG applications - Elasticsearch Labs