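Verifying and deduplicating input VAT invoices is a core compliance module in day-to-day corporate financial management and outsourced bookkeeping.
This matters even more now that electronic invoices (including fully digitalized e-invoices, 数电发票) are widespread: an electronic invoice can be printed any number of times, so the same invoice being reimbursed by different employees, or booked again in a different month, has become one of the most common tax compliance risks.
To address this risk, 三函代码 has built a two-stage check-and-block engine into its automation system.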
I. Invoice Fingerprint Hash (MD5) Algorithm for Preventing Duplicate Reimbursement
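To keep any invoice from being reimbursed twice, the system does not simply compare file names. It builds a deterministic hash from the core fields extracted from the invoice:
invoice hash = MD5(invoice code + "_" + invoice number + "_" + issue date + "_" + amount excluding tax)
However the image file is named or cropped, the computed hash is identical as long as the core business fields match. The Python implementation of the duplicate-prevention fingerprint store follows: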
import hashlib
import sqlite3


def init_fingerprint_db():
    conn = sqlite3.connect("invoice_fingerprints.db")
    cursor = conn.cursor()
    # Create the invoice fingerprint table, with the invoice hash as the primary key
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS invoice_fingerprints (
            invoice_hash TEXT PRIMARY KEY,
            invoice_code TEXT NOT NULL,
            invoice_num TEXT NOT NULL,
            amount REAL NOT NULL,
            created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )
    """)
    conn.commit()
    conn.close()

def check_and_register_invoice(invoice_code, invoice_num, date_str, amount):
    """
    Check the invoice fingerprint and register it; block the invoice if it is a duplicate.
    """
    # Compute the unique hash over the core fields
    input_str = f"{invoice_code}_{invoice_num}_{date_str}_{amount:.2f}"
    invoice_hash = hashlib.md5(input_str.encode('utf-8')).hexdigest()
    conn = sqlite3.connect("invoice_fingerprints.db")
    cursor = conn.cursor()
    try:
        cursor.execute(
            "INSERT INTO invoice_fingerprints (invoice_hash, invoice_code, invoice_num, amount) VALUES (?, ?, ?, ?)",
            (invoice_hash, invoice_code, invoice_num, amount)
        )
        conn.commit()
        return "SUCCESS", invoice_hash
    except sqlite3.IntegrityError:
        # Primary-key conflict: this invoice has already been reimbursed
        cursor.execute("SELECT created_at FROM invoice_fingerprints WHERE invoice_hash = ?", (invoice_hash,))
        recorded_time = cursor.fetchone()[0]  # SQLite stores CURRENT_TIMESTAMP in UTC
        return "DUPLICATE_ERROR", f"Invoice was already reimbursed at {recorded_time}; duplicate entry is not allowed!"
    finally:
        conn.close()

II. Automated Integration of the National Tax Invoice Verification API
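Once an invoice passes the duplicate check, the agent goes on to call the verification API and automatically retrieves the invoice's authenticity data: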
import requests


def verify_invoice_via_tax_api(invoice_code, invoice_num, date_str, check_code, amount):
    """
    Verify the invoice through the national tax verification API (mock request format).
    """
    api_url = "https://api.tax-checking.example.gov.cn/v1/invoice/verify"
    payload = {
        "invoiceCode": invoice_code,
        "invoiceNumber": invoice_num,
        "billDate": date_str,
        "checkCode": check_code,  # last 6 digits of the check code
        "amount": amount
    }
    headers = {"Authorization": "Bearer YOUR_TAX_CLIENT_TOKEN"}
    try:
        response = requests.post(api_url, json=payload, headers=headers, timeout=10)
        if response.status_code == 200:
            result = response.json()
            if result.get("status") == "01":  # "01" means the invoice is genuine
                return "GENUINE", result.get("invoiceDetails")
            else:
                return "FAKE_OR_VOID", "Warning: the verification result shows the invoice is fake or has been voided!"
        return "API_ERROR", f"API returned an unexpected status code: {response.status_code}"
    except requests.exceptions.Timeout as e:
        return "API_TIMEOUT", f"Connection timed out: {e}"
    except requests.exceptions.RequestException as e:
        # Other network failures (e.g. DNS failure, refused connection) are not timeouts
        return "API_ERROR", f"Request failed: {e}"

III. Conclusion
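By placing this combined defence of “invoice fingerprint deduplication + national tax API verification” at the very front of the bookkeeping workflow, a company can block fake invoices, cross-period entries, and duplicate reimbursement documents at the source. This substantially reduces tax compliance risk and removes much of the manual cross-checking otherwise needed during year-end tax audits.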
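
The order of the two checks described above (duplicate check first, authenticity check second) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the system's actual code: the helper names `fingerprint`, `open_store`, `process_invoice`, and `stub_verifier` are hypothetical, the table is reduced to a single column, the store is an in-memory SQLite database, the sample invoice values are made up, and the remote verification call is replaced by a local stub so the example runs without network access.

```python
import hashlib
import sqlite3


def fingerprint(invoice_code, invoice_num, date_str, amount):
    """MD5 over the four core fields, joined with "_" as in the hash formula."""
    key = f"{invoice_code}_{invoice_num}_{date_str}_{amount:.2f}"
    return hashlib.md5(key.encode("utf-8")).hexdigest()


def open_store(path=":memory:"):
    """Open a fingerprint store (hypothetical helper; in-memory by default)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS invoice_fingerprints "
        "(invoice_hash TEXT PRIMARY KEY)"
    )
    return conn


def process_invoice(conn, invoice, verifier):
    """Step 1: duplicate check. Step 2: authenticity check via `verifier`."""
    digest = fingerprint(
        invoice["code"], invoice["num"], invoice["date"], invoice["amount"]
    )
    try:
        with conn:  # commits on success, rolls back if the INSERT fails
            conn.execute(
                "INSERT INTO invoice_fingerprints (invoice_hash) VALUES (?)",
                (digest,),
            )
    except sqlite3.IntegrityError:
        # Primary-key conflict: the same core fields were registered before.
        return "DUPLICATE_ERROR", digest
    # Only invoices that pass the duplicate check reach the verifier.
    return verifier(invoice), digest


def stub_verifier(invoice):
    """Hypothetical stand-in for the remote verification call."""
    return "GENUINE"


if __name__ == "__main__":
    conn = open_store()
    # Made-up sample values, for illustration only.
    invoice = {"code": "044001900111", "num": "12345678",
               "date": "2024-03-15", "amount": 1000.00}
    print(process_invoice(conn, invoice, stub_verifier)[0])  # GENUINE
    print(process_invoice(conn, invoice, stub_verifier)[0])  # DUPLICATE_ERROR
```

In this sketch the fingerprint stays registered even when the verifier reports a failure, which mirrors the order of the two steps described in the article. Whether a fingerprint should be released after an API error, so that the invoice can be resubmitted, is a policy choice the article does not address.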