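Following the documentation, test the call.
Set the API Key and Secret Key you registered.
The calling class (it is in the official documentation).
Change the input file path here.
Problems hit during testing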
1.{"error_code":110,"error_msg":"Access token invalid or no longer valid"}
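I looked it up, and the explanation was:
It turned out I had misread the first step, the method that fetches the AccessToken: the returned result is a whole collection of fields, and the AccessToken is just one item in it...
It has to be converted first and then picked out (a whole river of fields, and you only need one damn ladle of it):
A self-built class, for reference: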
public class AccessTokenInfo
{
    public string refresh_token { get; set; }
    public string expires_in { get; set; }
    public string session_key { get; set; }
    public string access_token { get; set; }
    public string scope { get; set; }
    public string session_secret { get; set; }
}
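The conversion step described above can be sketched roughly as follows. This is a minimal illustration, not the author's code: it uses `System.Text.Json` instead of the class above, the helper name `ExtractAccessToken` is hypothetical, and the JSON string is a made-up placeholder rather than a real response.

```csharp
using System;
using System.Text.Json;

public static class TokenDemo
{
    // Hypothetical helper: parse the whole token response and return only access_token.
    public static string ExtractAccessToken(string tokenResponseJson)
    {
        using (JsonDocument doc = JsonDocument.Parse(tokenResponseJson))
        {
            // The response is a JSON object; access_token is just one of its properties.
            return doc.RootElement.GetProperty("access_token").GetString();
        }
    }

    public static void Main()
    {
        // Placeholder response body, not real data.
        string result = "{\"access_token\":\"PLACEHOLDER_TOKEN\",\"expires_in\":2592000,\"scope\":\"placeholder\"}";
        Console.WriteLine(ExtractAccessToken(result));
    }
}
```

Passing the whole response string as the token is what produces error 110; only the extracted `access_token` value goes into the later request.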
2. Passing a PDF through the previous class is not recognized
{"log_id":1901887988395845459,"error_msg":"image format error","error_code":216201}
Cause: the provided sample only supports image, so for PDF you have to adjust it yourself:
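A rough sketch of what that adjustment might look like, under the assumption that the endpoint takes the PDF as a Base64-encoded, URL-encoded `pdf_file` form parameter in place of `image` (check the parameter name against the official documentation for the endpoint you call). The class and method names here are hypothetical.

```csharp
using System;
using System.IO;
using System.Net;

public static class PdfRequestBody
{
    // Hypothetical helper: build the form-encoded POST body for a PDF.
    // "pdf_file" is an assumed parameter name; the image sample uses "image" instead.
    public static string Build(byte[] pdfBytes)
    {
        string base64 = Convert.ToBase64String(pdfBytes);
        // Base64 can contain '+', '/' and '=', so it must be URL-encoded.
        return "pdf_file=" + WebUtility.UrlEncode(base64);
    }

    public static void Main(string[] args)
    {
        // Usage: pass the path of a PDF file as the first argument.
        byte[] bytes = File.ReadAllBytes(args[0]);
        string body = Build(bytes);
        Console.WriteLine("Request body length: " + body.Length);
    }
}
```

The rest of the request (URL with the access token, POST, reading the response) stays the same as in the image sample; only the body changes.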
At this point the call succeeds.
3. Parsing the response string
Self-built classes:
public class OcrData
{
    public string log_id { get; set; }
    public string pdf_file_size { get; set; }
    public string words_result_num { get; set; }
    public InvoiceData words_result { get; set; }
}
public class InvoiceData
{
    /// <summary>
    /// Invoice type, e.g. electronic invoice (ordinary invoice)
    /// </summary>
    public string InvoiceTypeOrg { get; set; }
    /// <summary>
    /// Invoice number
    /// </summary>
    public string InvoiceNum { get; set; }
    /// <summary>
    /// Invoice date
    /// </summary>
    public string InvoiceDate { get; set; }
    /// <summary>
    /// Purchaser name
    /// </summary>
    public string PurchaserName { get; set; }
    /// <summary>
    /// Purchaser unified social credit code / taxpayer identification number
    /// </summary>
    public string PurchaserRegisterNum { get; set; }
    /// <summary>
    /// Seller name
    /// </summary>
    public string SellerName { get; set; }
    /// <summary>
    /// Seller unified social credit code / taxpayer identification number
    /// </summary>
    public string SellerRegisterNum { get; set; }
    /// <summary>
    /// Total of price and tax (in figures)
    /// </summary>
    public string AmountInFiguers { get; set; }
    /// <summary>
    /// Tax rate, one entry per line item
    /// </summary>
    public List<CommodityData> CommodityTaxRate { get; set; }
    /// <summary>
    /// Tax amount, one entry per line item
    /// </summary>
    public List<CommodityData> CommodityTax { get; set; }
    /// <summary>
    /// Total tax
    /// </summary>
    public string TotalTax { get; set; }
    /// <summary>
    /// Remarks
    /// </summary>
    public string Remarks { get; set; }
    /// <summary>
    /// Issuer (person who drew up the invoice)
    /// </summary>
    public string NoteDrawer { get; set; }
    /// <summary>
    /// Total amount
    /// </summary>
    public string TotalAmount { get; set; }
}
public class CommodityData
{
    public string row { get; set; }
    public string word { get; set; }
}
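For illustration, parsing could look roughly like the sketch below. It is not the author's code: it uses `System.Text.Json` with trimmed-down stand-in types (`OcrResponse`, `InvoiceFields`, `CommodityRow` are hypothetical names), and the JSON is a made-up placeholder. `log_id` is declared as `long` here because `System.Text.Json` does not convert a JSON number into a `string` property by default; with another JSON library the all-string classes above may work as they are.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Trimmed-down stand-ins for the classes above (hypothetical names).
public class CommodityRow
{
    public string row { get; set; }
    public string word { get; set; }
}

public class InvoiceFields
{
    public string InvoiceNum { get; set; }
    public string TotalTax { get; set; }
    public List<CommodityRow> CommodityTaxRate { get; set; }
}

public class OcrResponse
{
    public long log_id { get; set; }
    public InvoiceFields words_result { get; set; }
}

public static class OcrParseDemo
{
    // Properties not declared on the types (e.g. words_result_num) are ignored.
    public static OcrResponse Parse(string json)
    {
        return JsonSerializer.Deserialize<OcrResponse>(json);
    }

    public static void Main()
    {
        // Placeholder response, not real invoice data.
        string json = "{\"log_id\":1,\"words_result_num\":3,\"words_result\":{"
            + "\"InvoiceNum\":\"00000000\",\"TotalTax\":\"0.00\","
            + "\"CommodityTaxRate\":[{\"row\":\"1\",\"word\":\"0%\"}]}}";

        OcrResponse data = Parse(json);
        Console.WriteLine(data.words_result.InvoiceNum);
        foreach (CommodityRow item in data.words_result.CommodityTaxRate)
        {
            Console.WriteLine("row " + item.row + ": " + item.word);
        }
    }
}
```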
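4. Multiple invoices in one PDF
I did not find an interface that reads several invoices in one go. The clumsy workaround is to split the file into multiple PDFs and read each one separately. Below is the method for splitting the PDF: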
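using System.IO;
using iText.Kernel.Pdf; // PdfReader, PdfWriter and PdfDocument come from iText 7

string inputPdfPath = "path/to/your/input.pdf";
string outputDir = "path/to/output/directory";
// Make sure the output directory exists
Directory.CreateDirectory(outputDir);
using (PdfReader reader = new PdfReader(inputPdfPath))
{
    using (PdfDocument pdfDoc = new PdfDocument(reader))
    {
        int numberOfPages = pdfDoc.GetNumberOfPages();
        // Walk through every page (page numbers start at 1)
        for (int i = 1; i <= numberOfPages; i++)
        {
            // Path of the new single-page file
            string outputPath = Path.Combine(outputDir, $"page_{i}.pdf");
            // Create a new PDF document that contains only the current page
            PdfDocument singlePageDoc = new PdfDocument(new PdfWriter(outputPath));
            pdfDoc.CopyPagesTo(i, i, singlePageDoc);
            singlePageDoc.Close();
            // The file at outputPath is complete now; run the recognition on it here
        }
    }
}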