Processing Pipeline
When you upload a document, Shelv runs an automated pipeline that converts the source PDF into a structured Markdown filesystem.
Stages
1. Parsing
The uploaded PDF is converted into normalized Markdown so the rest of the pipeline can reason over consistent text and headings. This stage also records metadata needed for status reporting and retries. Output: parsed page content ready for structuring.
2. Structuring
Shelv analyzes parsed content and proposes a filesystem layout for files and directories. The layout takes into account:
- Document headings and their hierarchy
- Logical content boundaries (chapters, sections, clauses)
- The selected template’s conventions (if any)
- Practical file sizing for agent workflows
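To make the heading-driven part of this concrete, here is a minimal sketch of how a layout could be derived from heading hierarchy. It is an illustration under stated assumptions, not Shelv's actual structuring logic: the function names `slugify` and `propose_layout`, the `index.md` convention, and the rule "level-1 headings become directories, level-2 headings become files" are all hypothetical. It also ignores templates and file sizing.

```python
import re

# ATX-style Markdown heading: one to six '#' characters, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def slugify(title: str) -> str:
    """Turn a heading title into a lowercase, hyphenated path segment."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "untitled"


def propose_layout(markdown: str) -> dict[str, str]:
    """Map relative file paths to Markdown content (hypothetical rules).

    Level-1 headings open a directory with an index.md, level-2 headings
    open a file inside the current directory, and deeper headings stay in
    whichever file is currently open. Text before the first heading goes
    to README.md.
    """
    files: dict[str, list[str]] = {"README.md": []}
    current = "README.md"
    chapter_dir = ""
    for line in markdown.splitlines():
        match = HEADING.match(line)
        level = len(match.group(1)) if match else 0
        if level == 1:
            chapter_dir = slugify(match.group(2))
            current = f"{chapter_dir}/index.md"
        elif level == 2:
            slug = slugify(match.group(2))
            current = f"{chapter_dir}/{slug}.md" if chapter_dir else f"{slug}.md"
        files.setdefault(current, []).append(line)
    # Drop files that ended up with no visible content.
    return {
        path: "\n".join(lines).strip() + "\n"
        for path, lines in files.items()
        if any(l.strip() for l in lines)
    }
```

A real structuring step would also need to handle duplicate headings, headings inside code fences, and oversized sections, all of which this sketch leaves out.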
3. Verification
Automated checks validate output quality before finalization:
- Required artifacts: expected files (such as README.md) are present
- Content sanity: output content size is compared against source content
- Naming/path safety: file paths follow expected conventions
- File sizing sanity: suspiciously small or large files are flagged
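The four checks above can be sketched as one function that returns a list of problems. This is only an illustration: the thresholds, the path pattern, and the `verify` function itself are assumptions made for the example and do not reflect Shelv's real limits or implementation.

```python
import re

# Every constant below is an illustrative assumption, not a Shelv value.
REQUIRED_FILES = {"README.md"}
# Relative path whose segments each start with a letter or digit; this
# rejects absolute paths, empty segments, and ".." traversal.
SAFE_PATH = re.compile(r"^[a-z0-9][a-z0-9._-]*(/[a-z0-9][a-z0-9._-]*)*$", re.IGNORECASE)
MIN_FILE_CHARS = 50
MAX_FILE_CHARS = 50_000
MIN_CONTENT_RATIO = 0.8


def verify(files: dict[str, str], source_chars: int) -> list[str]:
    """Return human-readable problems; an empty list means all checks passed."""
    problems: list[str] = []

    # Required artifacts: expected files are present.
    for name in sorted(REQUIRED_FILES - set(files)):
        problems.append(f"missing required file: {name}")

    # Content sanity: compare total output size against source size.
    total = sum(len(content) for content in files.values())
    if source_chars > 0 and total / source_chars < MIN_CONTENT_RATIO:
        problems.append(f"output too small: {total} of {source_chars} source chars")

    for path, content in files.items():
        # Naming/path safety.
        if not SAFE_PATH.match(path):
            problems.append(f"unsafe path: {path}")
        # File sizing sanity.
        if not MIN_FILE_CHARS <= len(content) <= MAX_FILE_CHARS:
            problems.append(f"suspicious size: {path} ({len(content)} chars)")

    return problems
```

Collecting every problem instead of stopping at the first one makes it easier to report a complete picture when a shelf needs review.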
4. Storage
Structured files are written to S3-compatible object storage scoped to your account and shelf ID. Availability:
- File tree and file reads are available in the ready and review statuses
- Temporary S3 credentials are available in the ready status
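The availability rules can be encoded as a small client-side lookup, for example to decide which calls are worth attempting for a given shelf. The status strings come from this page; the helper names are hypothetical and not part of any Shelv SDK.

```python
# Statuses in which each capability is available, per the rules above.
FILE_ACCESS_STATUSES = frozenset({"ready", "review"})
S3_CREDENTIAL_STATUSES = frozenset({"ready"})


def can_read_files(status: str) -> bool:
    """File tree and file reads are available in ready and review."""
    return status in FILE_ACCESS_STATUSES


def can_issue_s3_credentials(status: str) -> bool:
    """Temporary S3 credentials are available only in ready."""
    return status in S3_CREDENTIAL_STATUSES
```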
5. Webhooks
When processing completes (or fails), Shelv dispatches webhook notifications to all registered endpoints for the relevant events (shelf.ready, shelf.failed, shelf.review).
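A receiving endpoint typically branches on the event name. The sketch below shows one way to do that. The three event names come from this page, but the payload shape (a JSON object with `type` and `shelfId` fields) is an assumption made for illustration, so check the actual webhook payload before relying on it.

```python
import json


def handle_webhook(body: bytes) -> str:
    """Decide what to do with a webhook delivery.

    Assumes (hypothetically) a JSON body with "type" and "shelfId" keys.
    Returns a short description of the action instead of performing it.
    """
    payload = json.loads(body)
    event = payload.get("type")
    shelf_id = payload.get("shelfId")

    if event == "shelf.ready":
        return f"shelf {shelf_id}: ready, fetch files"
    if event == "shelf.review":
        return f"shelf {shelf_id}: needs review"
    if event == "shelf.failed":
        return f"shelf {shelf_id}: failed, inspect error"
    # Unknown events are ignored so new event types do not break the handler.
    return "ignored"
```

Ignoring unrecognized events rather than raising keeps the endpoint tolerant of event types added later.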
Error Handling
If any stage fails, the shelf transitions to failed status with a descriptive error message and the name of the failed step. You can retry processing with: