Question: using images to interact with or query @azure/openai
Background:
On chat.openai.com I can upload an image and ask ChatGPT a question about it, but with the existing openai and @azure/openai APIs there doesn't seem to be a way to do this. The ChatCompletion object in both cases only takes text prompts.
Is this feature supported at the API level?
Solution:
With OpenAI you just include your image as part of the message that you supply. Here is a piece from the code I use, which works whether you have an image or not:
if image != '':
    # Get base64 string
    base64_image = encode_image(image)
    content = [
        {"type": "text", "text": your_prompt},
        {
            "type": "image_url",
            "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"},
        },
    ]
else:
    content = your_prompt

messages.append({"role": "user", "content": content})
And then
payload = {
    "model": model_name,
    "temperature": temperature,
    "max_tokens": tokens,
    "messages": messages,
}
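For completeness, here is a minimal sketch of how that payload could be POSTed to the OpenAI Chat Completions REST endpoint with the requests library. The OPENAI_API_KEY environment variable, and the model_name, messages, temperature and tokens variables, are assumptions carried over from the snippets above rather than part of the original answer:

import os
import requests

# Minimal sketch: send the payload built above to the Chat Completions endpoint.
# Assumes OPENAI_API_KEY is set in the environment and `payload` is defined as shown.
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
}
response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers=headers,
    json=payload,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])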
The encode_image() helper used above is defined as follows:
import base64

def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')
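As a quick usage check (the file name photo.jpg below is hypothetical, not from the original answer), the helper returns a plain base64 string, which the earlier snippet then wraps into a data URL:

base64_image = encode_image("photo.jpg")  # hypothetical local file
image_url = f"data:image/jpeg;base64,{base64_image}"  # data URL passed to the API
print(len(base64_image))  # size of the base64 payload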
Currently you need to target the OpenAI model gpt-4-vision-preview. Update: as @Michael suggests, it also works with gpt-4o.
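The same request can also be made through the official openai Python SDK (v1.x). This is a sketch under the assumption that OPENAI_API_KEY is set in the environment and that the encode_image() helper and a local image file (the hypothetical photo.jpg) are available as above:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

base64_image = encode_image("photo.jpg")  # hypothetical local file

response = client.chat.completions.create(
    model="gpt-4o",  # or "gpt-4-vision-preview"
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"},
                },
            ],
        }
    ],
    max_tokens=300,
)
print(response.choices[0].message.content)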