pdf Agent Skills - Data & Analytics

pdf

Toolkit for PDF text and table extraction and editing

Data & Analytics

About This Skill

Detailed Description

🎯 What is the pdf skill?

A toolkit for manipulating PDF files: text and table extraction, document creation, merging, splitting, and form handling. It is designed for automated PDF processing at scale.

⚡ Key Features

  • Extract text and tables with layout retention using pdfplumber
  • Create and edit PDFs with reportlab and pypdf
  • Merge, split, rotate, and watermark PDF pages
  • Command-line tool support (qpdf, pdftotext, pdftk) for versatile operations
  • OCR integration for scanned document text extraction
  • PDF form filling and password protection

🛠️ Tech Stack

Uses the Python libraries pypdf, pdfplumber, and reportlab, plus pytesseract for OCR, together with the command-line utilities qpdf, pdftotext, and pdftk.

📋 Installation

Install the Python libraries with pip (`pypdf`, `pdfplumber`, `reportlab`, `pytesseract`, `pdf2image`). Set up poppler-utils, which provides the pdftotext and pdfimages commands. Refer to LICENSE for terms.

⚠️ Important Notes

A proprietary license applies. Handling scanned documents requires an OCR setup. Form filling is covered in separate documentation (forms.md).
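For scanned documents, one possible OCR flow is to render each page to an image with pdf2image and pass it to pytesseract. The sketch below assumes that flow. The function name `ocr_pdf` and its `render` and `recognize` parameters are hypothetical, added so the heavy dependencies load only when needed and can be swapped out.

```python
def ocr_pdf(pdf_path, render=None, recognize=None, dpi=300):
    """OCR each page of a scanned PDF and return a list of page texts.

    render and recognize are injectable; by default they are
    pdf2image.convert_from_path and pytesseract.image_to_string.
    """
    if render is None:
        # Imported lazily: pdf2image needs poppler installed on the system
        from pdf2image import convert_from_path

        def render(path):
            return convert_from_path(path, dpi=dpi)

    if recognize is None:
        # Imported lazily: pytesseract needs the Tesseract binary installed
        import pytesseract

        recognize = pytesseract.image_to_string

    return [recognize(image).strip() for image in render(pdf_path)]
```

Calling `ocr_pdf("scan.pdf")` with the defaults requires both poppler and Tesseract to be installed; the file name here is only a placeholder.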

❓ Common Issues

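Install poppler and Tesseract OCR to use every feature: the pdftotext and pdfimages commands come from poppler, and OCR depends on Tesseract. When a workflow mixes the Python libraries with the CLI tools, coordinate the handoff between them carefully for best performance.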
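A quick way to spot a missing dependency is to check for the Python modules and the command-line binaries before running a job. The sketch below uses only the standard library. The function name `missing_dependencies` is hypothetical, and the module and tool lists are assumptions drawn from the names mentioned on this page.

```python
import importlib.util
import shutil

# Import names of the Python libraries mentioned in the installation notes
PYTHON_MODULES = ["pypdf", "pdfplumber", "reportlab", "pytesseract", "pdf2image"]
# Executables expected on PATH (pdftotext and pdfimages ship with poppler-utils)
CLI_TOOLS = ["qpdf", "pdftotext", "pdfimages", "pdftk", "tesseract"]


def missing_dependencies(modules=PYTHON_MODULES, tools=CLI_TOOLS):
    """Return the modules that cannot be imported and the tools not on PATH."""
    missing_modules = [m for m in modules if importlib.util.find_spec(m) is None]
    missing_tools = [t for t in tools if shutil.which(t) is None]
    return {"modules": missing_modules, "tools": missing_tools}


if __name__ == "__main__":
    report = missing_dependencies()
    print("Missing Python modules:", report["modules"] or "none")
    print("Missing CLI tools:", report["tools"] or "none")
```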

Author Information

Published: October 26, 2025
Updated: June 19, 2026

Status: published
