Source: https://blog.csdn.net/iamjingong/article/details/145730821 (2025/2/28)

Reference: https://www.datacamp.com/tutorial/fine-tuning-deepseek-r1-reasoning-model

Video reference: "How to fine-tune the DeepSeek-R1-8B model locally" (Bilibili)

1. Retrain and fine-tune the model

You must use a Python 3.11 environment; otherwise some of the packages Unsloth depends on have problems:

conda create -n py311-llm python=3.11

pip install unsloth  # this also pulls in torch, CUDA, cuDNN and other packages; switch pip to the Tsinghua mirror beforehand (one way is shown below)
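For reference, one way to point pip at the Tsinghua PyPI mirror before installing (this particular configuration is my own suggestion, not part of the original write-up; any mirror setup works):

# Make the Tsinghua mirror the default PyPI index for subsequent pip installs
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple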

The full code follows. (During training it is best to download the model and dataset from Hugging Face in advance; if the server has no internet access and you have not downloaded them beforehand, the run will fail. A download sketch is given after the two links below.)

Model: https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-8B/tree/main

Dataset: https://huggingface.co/datasets/FreedomIntelligence/medical-o1-reasoning-SFT/tree/main
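One way to pre-download both (a sketch, assuming the huggingface_hub CLI is installed; the target directories match the local paths used in the training script below):

# Download the model weights into ./ds_r1_8b_wts (the local path used in the script)
huggingface-cli download unsloth/DeepSeek-R1-Distill-Llama-8B --local-dir ./ds_r1_8b_wts
# Download the dataset into ./medical-o1-reasoning-SFT
huggingface-cli download FreedomIntelligence/medical-o1-reasoning-SFT --repo-type dataset --local-dir ./medical-o1-reasoning-SFT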

Train the model and save it in Hugging Face format:

from unsloth import FastLanguageModel
import torch
from huggingface_hub import login
import wandb
from datasets import load_dataset

max_seq_length = 2048
dtype = None
load_in_4bit = True

wandb.login(key="1b1e316624ce8b380a5bae13598a1d31536676f1")  # replace this with your own key
run = wandb.init(project='my fine-tune on deepseek r1 with medical data', job_type="training", anonymous="allow")

# Option 1: download the model from Hugging Face with a token
# hf_token = "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"
# login(hf_token)
# model, tokenizer = FastLanguageModel.from_pretrained(
#     model_name="unsloth/DeepSeek-R1-Distill-Llama-8B",
#     max_seq_length=max_seq_length, dtype=dtype, load_in_4bit=load_in_4bit,
# )

# Option 2: load the model from a local directory (the Hugging Face files were downloaded in advance)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="./ds_r1_8b_wts",  # change this to your own local model path
    max_seq_length=max_seq_length, dtype=dtype, load_in_4bit=load_in_4bit,
)

prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context. 
Write a response that appropriately completes the request. 
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.

### Instruction:
You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning. 
Please answer the following medical question. 

### Question:
{}

### Response:
<think>{}"""

# Sanity-check the base model before fine-tuning
question = "A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions?"
FastLanguageModel.for_inference(model)
inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")
outputs = model.generate(input_ids=inputs.input_ids, attention_mask=inputs.attention_mask,
                         max_new_tokens=1200, use_cache=True)
response = tokenizer.batch_decode(outputs)
print(response[0].split("### Response:")[1])

train_prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context. 
Write a response that appropriately completes the request. 
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.

### Instruction:
You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning. 
Please answer the following medical question. 

### Question:
{}

### Response:
<think>
{}
</think>
{}"""

EOS_TOKEN = tokenizer.eos_token  # Must add EOS_TOKEN

def formatting_prompts_func(examples):
    inputs = examples["Question"]
    cots = examples["Complex_CoT"]
    outputs = examples["Response"]
    texts = []
    for input, cot, output in zip(inputs, cots, outputs):
        text = train_prompt_style.format(input, cot, output) + EOS_TOKEN
        texts.append(text)
    return {"text": texts}

dataset = load_dataset("./medical-o1-reasoning-SFT", "en", split="train[0:500]", trust_remote_code=True)
dataset = dataset.map(formatting_prompts_func, batched=True)
dataset["text"][0]

model = FastLanguageModel.get_peft_model(
    model, r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16, lora_dropout=0, bias="none",
    use_gradient_checkpointing="unsloth",  # True or "unsloth" for very long context
    random_state=3407, use_rslora=False, loftq_config=None,
)

from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

trainer = SFTTrainer(
    model=model, tokenizer=tokenizer, train_dataset=dataset,
    dataset_text_field="text", max_seq_length=max_seq_length, dataset_num_proc=2,
    args=TrainingArguments(
        per_device_train_batch_size=2, gradient_accumulation_steps=4,
        # Use num_train_epochs = 1, warmup_ratio for full training runs!
        warmup_steps=5, max_steps=60, learning_rate=2e-4,
        fp16=not is_bfloat16_supported(), bf16=is_bfloat16_supported(),
        logging_steps=10, optim="adamw_8bit", weight_decay=0.01,
        lr_scheduler_type="linear", seed=3407, output_dir="outputs",
    ),
)

trainer_stats = trainer.train()
wandb.finish()

input("Training finished, press Enter to run test 1 >>>")
question = "A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions?"
FastLanguageModel.for_inference(model)  # Unsloth has 2x faster inference!
inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")
outputs = model.generate(input_ids=inputs.input_ids, attention_mask=inputs.attention_mask,
                         max_new_tokens=1200, use_cache=True)
response = tokenizer.batch_decode(outputs)
print(response[0].split("### Response:")[1])

input("Press Enter to run test 2 >>>")
question = "A 59-year-old man presents with a fever, chills, night sweats, and generalized fatigue, and is found to have a 12 mm vegetation on the aortic valve. Blood cultures indicate gram-positive, catalase-negative, gamma-hemolytic cocci in chains that do not grow in a 6.5% NaCl medium. What is the most likely predisposing factor for this patient's condition?"
inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")
outputs = model.generate(input_ids=inputs.input_ids, attention_mask=inputs.attention_mask,
                         max_new_tokens=1200, use_cache=True)
response = tokenizer.batch_decode(outputs)
print(response[0].split("### Response:")[1])

input("Save the model locally, press Enter >>>>")
new_model_local = "DeepSeek-R1-Medical-COT"
model.save_pretrained(new_model_local)      # Local saving
tokenizer.save_pretrained(new_model_local)

input("Save a merged model in Hugging Face format, press Enter >>>>")
model.save_pretrained_merged("merged_models", tokenizer, save_method="merged_16bit")

At this point the following error may appear:

AttributeError: 'PeftModelForCausalLM' object has no attribute '_unwrapped_old_generate'

Workaround:

Open /home/jjg/anaconda3/envs/py311-llm/lib/python3.11/site-packages/unsloth/models/ollama.py (adjust the path to your own conda environment), find the line that triggers the error, and comment it out.
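A quick way to locate that line (a sketch; the path depends on your environment):

# Show every reference to the missing attribute inside the installed unsloth package
grep -n "_unwrapped_old_generate" /home/jjg/anaconda3/envs/py311-llm/lib/python3.11/site-packages/unsloth/models/*.py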

Training is now complete.

2. Build llama.cpp and use it to convert the model to GGUF format (Ollama can run GGUF models)

# Get llama.cpp:
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
# Configure and build (cmake -B build only configures; the second command actually compiles)
cmake -B build
cmake --build build --config Release
# Install the Python dependencies
pip install -r requirements.txt  # pulling torch from the PyTorch index can be very slow here; if needed, comment out the torch entry and the extra index URL in the requirements files

Convert the model to GGUF format:

 python convert_hf_to_gguf.py ./merged_models --outtype f16 
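This writes an f16 .gguf file into ./merged_models; its path is what goes after FROM in the Modelfile below. Optionally (my own addition, assuming the build step above produced the llama-quantize binary), the f16 file can be quantized to shrink it before serving:

# Hypothetical file names: use whatever convert_hf_to_gguf.py actually produced
./build/bin/llama-quantize ./merged_models/<model>-F16.gguf ./merged_models/<model>-Q4_K_M.gguf Q4_K_M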

3. Create the Ollama model:

ollama create my_deepseek_r1_8b -f Modelfile

Note: you need to create the Modelfile first.

Run touch Modelfile, then write the complete DeepSeek-R1-8B Modelfile content below into it:

# Modelfile generated by "ollama show"
# To build a new Modelfile based on this, replace FROM with:
# FROM deepseek-r1:8b
FROM /home/jjg/codes/mydeepseek/merged_models/ds_r1-8.0B-_8b_wts-F16.gguf
TEMPLATE """{{- if .System }}{{ .System }}{{ end }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1}}
{{- if eq .Role "user" }}<|User|>{{ .Content }}
{{- else if eq .Role "assistant" }}<|Assistant|>{{ .Content }}{{- if not $last }}<|end▁of▁sentence|>{{- end }}
{{- end }}
{{- if and $last (ne .Role "assistant") }}<|Assistant|>{{- end }}
{{- end }}"""
PARAMETER stop <|begin▁of▁sentence|>
PARAMETER stop <|end▁of▁sentence|>
PARAMETER stop <|User|>
PARAMETER stop <|Assistant|>
LICENSE """MIT License

Copyright (c) 2023 DeepSeek

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
"""

Note: if the Modelfile contains only the FROM ... line, the model may answer only the first question and then return empty responses. The fix is to dump the complete Modelfile of the stock deepseek-r1:8b model, copy it into the Modelfile you created, and replace the path after FROM with your own GGUF file:

ollama show deepseek-r1:8b --modelfile
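For example (assuming the stock deepseek-r1:8b has already been pulled into Ollama), you can dump it straight into your Modelfile and then edit only the FROM line:

# Dump the stock Modelfile, then replace its FROM line with the path to your own GGUF file
ollama show deepseek-r1:8b --modelfile > Modelfile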

4. Run your fine-tuned model

List the models that are already available:

ollama list

Run the model:

ollama run my_deepseek_r1_8b
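Besides the interactive CLI, the model can also be queried through Ollama's local HTTP API (a sketch, assuming Ollama is listening on its default port 11434; the prompt is only an example):

# Send a single non-streaming generation request to the local Ollama server
curl http://localhost:11434/api/generate -d '{
  "model": "my_deepseek_r1_8b",
  "prompt": "Briefly, what does a positive Q-tip test suggest?",
  "stream": false
}'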

5. Push the model to Hugging Face

# Log in to Hugging Face first (e.g. login(hf_token)) and set your own repo id; the name below is a placeholder
new_model_online = "<your-hf-username>/DeepSeek-R1-Medical-COT"
model.push_to_hub(new_model_online)      # Online saving (LoRA adapters)
tokenizer.push_to_hub(new_model_online)  # Online saving

model.save_pretrained_merged(new_model_local, tokenizer, save_method = "merged_16bit",)
model.push_to_hub_merged(new_model_online, tokenizer, save_method = "merged_16bit")


 
