feat(api): PDF + DOCX extraction in folder indexer

Add pypdf/python-docx deps, _extract_pdf_text/_extract_docx_text helpers,
and summarize_pdf/summarize_docx wrappers that delegate to summarize_text.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Author: Roberto
Date:   2026-05-12 11:15:17 +02:00
Parent: b7a4edac90
Commit: 2aeb453229
3 changed files with 86 additions and 0 deletions
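
For orientation, here is a minimal sketch of how the helpers named in the commit message might fit together. Only the function names come from the commit. The bodies, the signatures, the `extract` parameter, and the stand-in `summarize_text` are assumptions for illustration, not the code in this change.

```python
from pathlib import Path
from typing import Callable


def _extract_pdf_text(path: Path) -> str:
    """Join the text of every page of a PDF (assumed implementation)."""
    # Imported lazily so the module still loads when pypdf is not installed.
    from pypdf import PdfReader

    reader = PdfReader(str(path))
    # extract_text() can return an empty string for image-only pages.
    return "\n".join((page.extract_text() or "") for page in reader.pages)


def _extract_docx_text(path: Path) -> str:
    """Join the text of every paragraph of a DOCX file (assumed implementation)."""
    # The python-docx package is imported under the name "docx".
    from docx import Document

    document = Document(str(path))
    return "\n".join(paragraph.text for paragraph in document.paragraphs)


def summarize_text(text: str, max_chars: int = 200) -> str:
    """Hypothetical stand-in: the real summarize_text is not shown in this commit."""
    collapsed = " ".join(text.split())  # collapse runs of whitespace
    return collapsed[:max_chars]


def summarize_pdf(path, extract: Callable[[Path], str] = _extract_pdf_text) -> str:
    """Extract text from a PDF, then delegate to summarize_text."""
    return summarize_text(extract(Path(path)))


def summarize_docx(path, extract: Callable[[Path], str] = _extract_docx_text) -> str:
    """Extract text from a DOCX file, then delegate to summarize_text."""
    return summarize_text(extract(Path(path)))
```

The injectable `extract` argument is a choice made for this sketch so the wrappers can be exercised without real files. The actual wrappers may simply call the helpers directly.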

@@ -39,3 +39,5 @@ lxml>=5.0.0
PyYAML>=6.0.0
apscheduler>=3.10.0
ruff>=0.8.0
pypdf>=4.0
python-docx>=1.1