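DSPy.Image: Vision Model Support
DSPy recently added beta support for vision-language models (VLMs). This article shows how to use DSPy to extract attributes from images. As the example, we will extract useful attributes from a website screenshot.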
1. Define the signature
Define the DSPy signature. Note the dspy.Image input field:
import dspy


class WebsiteDataExtractionSignature(dspy.Signature):
    """Website data extraction"""

    website_screenshot: dspy.Image = dspy.InputField(
        desc="A screenshot of the website"
    )
    hero_text: str = dspy.OutputField(
        desc="The hero text of the website"
    )
    website_description: str = dspy.OutputField(
        desc="A description of the website"
    )
    call_to_action: str = dspy.OutputField(
        desc="The call to action of the website"
    )
    color_palette: list[str] = dspy.OutputField(
        desc="The color palette of the website"
    )
    font_palette: list[str] = dspy.OutputField(
        desc="The font palette of the website"
    )
2. Define the module
Next, define a simple program using the ChainOfThought module and the signature from the previous step:
class WebsiteDataExtraction(dspy.Module):
    """Module for extracting structured data from website screenshots."""

    def __init__(self):
        super().__init__()
        self.website_data_extraction = dspy.ChainOfThought(
            WebsiteDataExtractionSignature
        )

    def forward(self, website_screenshot: dspy.Image):
        """Run the extraction on a single website screenshot."""
        # DSPy predictors take their inputs as keyword arguments
        # named after the signature's input fields.
        website_data = self.website_data_extraction(
            website_screenshot=website_screenshot
        )
        return website_data
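The object that forward returns exposes each output field of the signature as an attribute, for example website_data.hero_text. The sketch below flattens such an object into a plain dict. To keep it runnable without a model call, a types.SimpleNamespace with invented values stands in for the real prediction. prediction_to_dict and OUTPUT_FIELDS are hypothetical names used for illustration, not part of DSPy.

```python
from types import SimpleNamespace

# Output field names copied from WebsiteDataExtractionSignature.
OUTPUT_FIELDS = (
    "hero_text",
    "website_description",
    "call_to_action",
    "color_palette",
    "font_palette",
)


def prediction_to_dict(prediction, fields=OUTPUT_FIELDS):
    """Collect the named attributes of a prediction-like object into a dict.

    Fields that are missing on the object are returned as None.
    """
    return {name: getattr(prediction, name, None) for name in fields}


# Stand-in for a real prediction; these values are invented for illustration.
fake_prediction = SimpleNamespace(
    hero_text="Example hero text",
    website_description="Example description",
    call_to_action="Sign up",
    color_palette=["#000000", "#ffffff"],
    font_palette=["Inter"],
)

extracted = prediction_to_dict(fake_prediction)
print(extracted["call_to_action"])
```

With a real run, the same helper could be applied to the value returned by the module instead of fake_prediction.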
3. Final code
Finally, write a function that reads the image and extracts the attributes by calling the program from the previous step:
import base64


def extract_website_data(website_screenshot_path: str):
    """Extract data from a website screenshot.

    Args:
        website_screenshot_path (str): Path to the website screenshot image

    Returns:
        dspy.Prediction: Extracted website data
    """
    # Load the image and encode it as a base64 data URI
    with open(website_screenshot_path, "rb") as image_file:
        base64_data = base64.b64encode(image_file.read()).decode("utf-8")
    image_data_uri = f"data:image/png;base64,{base64_data}"
    # Wrap the data URI in dspy.Image to match the signature's input field
    website_screenshot = dspy.Image(url=image_data_uri)
    website_data_extraction = WebsiteDataExtraction()
    website_data = website_data_extraction(website_screenshot)
    return website_data
if __name__ == "__main__":
    dspy_lm = dspy.LM(model="openai/gpt-4o-mini")
    dspy.configure(lm=dspy_lm)
    result = extract_website_data(
        "src/vision_lm/data/langtrace-screenshot.png"
    )
    print(result)
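The function above hardcodes image/png in the data URI, which is only accurate for PNG screenshots. As a minimal sketch, the encoding step could be pulled into a helper that guesses the MIME type from the file extension using the standard library. image_file_to_data_uri and its default_mime parameter are hypothetical names introduced here for illustration.

```python
import base64
import mimetypes


def image_file_to_data_uri(path: str, default_mime: str = "image/png") -> str:
    """Read an image file and return it as a base64 data URI.

    The MIME type is guessed from the file extension; default_mime is
    used when the extension is not recognized.
    """
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None:
        mime_type = default_mime
    with open(path, "rb") as image_file:
        encoded = base64.b64encode(image_file.read()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

The returned string could then take the place of image_data_uri in extract_website_data, so that JPEG or WebP screenshots are labeled with their own type.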
4. Observability
That's it! If your project needs observability, just add langtrace.init() to get deeper insights from the traces.
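A minimal sketch of wiring that call in, assuming the langtrace-python-sdk package is installed and that the API key is supplied through a LANGTRACE_API_KEY environment variable. The init_tracing helper is a hypothetical name used for illustration; when the SDK is missing it simply reports that tracing was not started.

```python
import os


def init_tracing() -> bool:
    """Start Langtrace tracing if the SDK is available.

    Returns True when langtrace.init() was called, and False when the
    langtrace-python-sdk package is not installed.
    """
    try:
        from langtrace_python_sdk import langtrace
    except ImportError:
        return False
    # Assumption: the key is provided via the LANGTRACE_API_KEY env variable.
    langtrace.init(api_key=os.environ.get("LANGTRACE_API_KEY"))
    return True
```

Langtrace's setup instructions ask for init() to run before the LLM libraries are imported, so a helper like this would likely be called at the very top of the script, ahead of import dspy.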
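5. Source code
You can find the complete source code for this example here.
Original article: Attribute Extraction from Images using DSPy
Translated and compiled by 汇智网. Please credit the source when reposting.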