Table of Contents
- 1. About python-docx
- About OpenXML
- Installation
- 2. Usage Examples
1. About python-docx
python-docx is a Python library for reading, writing, and updating Microsoft Word 2007+ (.docx) files.
- GitHub: https://github.com/python-openxml/python-docx
- Official documentation: https://python-docx.readthedocs.io/en/latest/
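About OpenXML
OpenXML is the storage format for documents in Microsoft Office 2007 and later. It stores document content as XML, so programs can read and modify that content directly.
In Python, the main library for working with Word documents in this format is python-docx.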
- Tips and practices for processing Word documents efficiently with Python and OpenXML
https://www.oryoy.com/news/shi-yong-python-he-openxml-gao-xiao-chu-li-word-wen-dang-de-ji-qiao-yu-shi-jian.html
- About the Open XML SDK for Office
https://learn.microsoft.com/zh-cn/office/open-xml/about-the-open-xml-sdk
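To make the "content stored as XML" point concrete, here is a minimal sketch that opens a .docx with only the standard library, without python-docx. It assumes the main document part lives at `word/document.xml`, which is the conventional location; the helper names `list_parts` and `paragraph_texts` are made up for this illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def list_parts(path):
    """Return the names of all parts (files) inside the .docx package."""
    with zipfile.ZipFile(path) as package:
        return package.namelist()


def paragraph_texts(path):
    """Return the text of each <w:p> paragraph in the main document part."""
    with zipfile.ZipFile(path) as package:
        # Assumption: the main part is at the conventional path word/document.xml.
        root = ET.fromstring(package.read("word/document.xml"))
    texts = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's visible text is spread over one or more <w:t> elements.
        texts.append("".join(t.text or "" for t in para.iter(f"{{{W_NS}}}t")))
    return texts
```

Calling `paragraph_texts("dark-and-stormy.docx")` on a file written by python-docx should return the paragraph strings. For anything beyond inspection, python-docx is the more convenient route, since it hides the namespace handling.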
Note: .doc files are not supported.
.doc and .docx are fundamentally different: .doc is a binary format, while .docx is a ZIP package of XML files.
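Because the two formats differ at the byte level, a file can be checked before it is handed to python-docx. The sketch below is one possible approach, not part of python-docx: it looks at the first bytes of the file. The function name `sniff_word_format` and its return values are invented for this example, and the check for `word/document.xml` relies on the conventional part name.

```python
import zipfile

# Header of an OLE2 compound file, the container used by legacy .doc files.
# Other legacy Office formats (.xls, .ppt) share it, so "doc" is only a guess.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Local file header that starts a typical non-empty ZIP archive.
ZIP_MAGIC = b"PK\x03\x04"


def sniff_word_format(path):
    """Guess whether a file is a legacy .doc, a .docx package, or neither."""
    with open(path, "rb") as f:
        header = f.read(8)
    if header == OLE2_MAGIC:
        return "doc"
    if header.startswith(ZIP_MAGIC) and zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as package:
            # Assumption: a Word package keeps its body in word/document.xml.
            if "word/document.xml" in package.namelist():
                return "docx"
    return "unknown"
```

A caller could use the result to skip "doc" files or to convert them to .docx with another tool first, since python-docx cannot open them.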
Installation
pip install python-docx
2. Usage Examples
>>> from docx import Document
>>> document = Document()
>>> document.add_paragraph("It was a dark and stormy night.")
<docx.text.paragraph.Paragraph object at 0x10f19e760>
>>> document.save("dark-and-stormy.docx")
>>> document = Document("dark-and-stormy.docx")
>>> document.paragraphs[0].text
'It was a dark and stormy night.'
from docx import Document
from docx.shared import Inches

document = Document()

document.add_heading('Document Title', 0)

p = document.add_paragraph('A plain paragraph having some ')
p.add_run('bold').bold = True
p.add_run(' and some ')
p.add_run('italic.').italic = True

document.add_heading('Heading, level 1', level=1)
document.add_paragraph('Intense quote', style='Intense Quote')

document.add_paragraph(
    'first item in unordered list', style='List Bullet'
)
document.add_paragraph(
    'first item in ordered list', style='List Number'
)

document.add_picture('monty-truth.png', width=Inches(1.25))

records = (
    (3, '101', 'Spam'),
    (7, '422', 'Eggs'),
    (4, '631', 'Spam, spam, eggs, and spam')
)

table = document.add_table(rows=1, cols=3)
hdr_cells = table.rows[0].cells
hdr_cells[0].text = 'Qty'
hdr_cells[1].text = 'Id'
hdr_cells[2].text = 'Desc'
for qty, id, desc in records:
    row_cells = table.add_row().cells
    row_cells[0].text = str(qty)
    row_cells[1].text = id
    row_cells[2].text = desc

document.add_page_break()

document.save('demo.docx')
2025-03-03