BUG: AttributeError: 'GLMChineseTokenizer' object has no attribute 'sp_model'
Environment
Python 3.10
torch 2.0.1
transformers 4.37.0
Details
This error appears when running the glm-large-chinese model. The exact cause is unclear, but it is most likely a transformers version change: newer releases changed how some tokenizer interfaces are initialized, while the code shipped with glm-large-chinese is still written against the old version.
Solution
Open the tokenization_glm.py file that ships with the model and modify the __init__ method of the GLMChineseTokenizer class:
# Original
def __init__(self, vocab_file, **kwargs):
    super().__init__(**kwargs)
    self.vocab_file = vocab_file
    self.sp_model = spm.SentencePieceProcessor()
    self.sp_model.Load(vocab_file)

# Modified
def __init__(self, vocab_file, **kwargs):
    self.vocab_file = vocab_file
    self.sp_model = spm.SentencePieceProcessor()
    self.sp_model.Load(vocab_file)
    super().__init__(**kwargs)  # moved to the end
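Why the order matters: in newer transformers versions, the base tokenizer's __init__ calls subclass hooks (such as vocab_size / get_vocab) that read self.sp_model, so calling super().__init__() before sp_model is assigned triggers the AttributeError. A minimal sketch of this failure mode, using illustrative classes rather than the real transformers API:

```python
class Base:
    def __init__(self):
        # Like the newer base tokenizer: __init__ calls a subclass
        # hook during setup.
        self.size = self.vocab_size()

class BrokenTokenizer(Base):
    def __init__(self, vocab):
        super().__init__()           # hook runs before sp_model exists
        self.sp_model = list(vocab)

    def vocab_size(self):
        return len(self.sp_model)    # AttributeError when called too early

class FixedTokenizer(Base):
    def __init__(self, vocab):
        self.sp_model = list(vocab)  # assign first...
        super().__init__()           # ...then let the base class call hooks

    def vocab_size(self):
        return len(self.sp_model)

try:
    BrokenTokenizer("abc")
except AttributeError as e:
    print("broken:", e)

print("fixed size:", FixedTokenizer("abc").size)
```

The same reasoning explains why simply reordering the two statements in tokenization_glm.py is enough, with no need to reinstall transformers or torch.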
References
https://github.com/baichuan-inc/Baichuan2/issues/204
Fixing 'BaichuanTokenizer' object has no attribute 'sp_model' without reinstalling transformers or torch (CSDN blog)