NotebookLM supports PDFs up to 200MB and 500,000 words per source. For larger documents, the standard workaround is splitting by chapter or section and uploading as separate sources within the same notebook. Scanned image-only PDFs require OCR preprocessing to extract a text layer before NotebookLM can read them. Both issues are solvable — the key is knowing which problem you have before trying workarounds.

PDFs are the most common source type in serious research workflows, and they are also the source type most likely to cause problems. A 400-page dissertation, a scanned legal document, or a dense technical manual will hit one of NotebookLM's limits if you do not prepare it first.

This guide covers the specific limits, how to diagnose which problem you have, and the practical workarounds that actually work.

For the complete NotebookLM overview, see the Complete NotebookLM Guide.

NotebookLM's PDF Limits

Understanding the four distinct limits prevents confusion:

| Limit | Value | What Happens When Exceeded |
|---|---|---|
| File size per source | 200MB | Upload rejected with error |
| Words per source | 500,000 | Upload may fail or content truncated silently |
| Sources per notebook | 50 | Cannot add more until existing sources are removed |
| Total notebook capacity | ~25M words (approx.) | Degraded performance |

Most academic papers (5,000-30,000 words) are well within limits. The problems arise with:

- Full books or dissertations
- Multi-year annual reports bundled into one PDF
- Court documents, transcripts, or legal filings
- Technical documentation and specification sets

Diagnosing Your Problem

Before trying workarounds, identify which specific problem you have:

Problem 1: File is Too Large (>200MB)

Symptoms: Upload fails with a file size error.

Cause: The PDF contains large images, embedded fonts, or high-resolution scans.

Diagnosis: Check the file size in your operating system. On Windows: right-click → Properties. On Mac: right-click → Get Info.
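
When many files need checking, the same test can be scripted. This is a minimal sketch that assumes the 200MB limit means 200 × 1024 × 1024 bytes (the guide does not say whether the limit is binary or decimal); the function names are illustrative.

```python
import os

# Assumption: "200MB" is interpreted as 200 * 1024 * 1024 bytes.
SIZE_LIMIT_BYTES = 200 * 1024 * 1024


def file_size_mb(path: str) -> float:
    """Return the file size in megabytes (1 MB = 1024 * 1024 bytes)."""
    return os.path.getsize(path) / (1024 * 1024)


def exceeds_size_limit(path: str, limit_bytes: int = SIZE_LIMIT_BYTES) -> bool:
    """True when the file is larger than the per-source size limit."""
    return os.path.getsize(path) > limit_bytes
```

A file that sits close to the limit is worth compressing anyway, since the exact byte threshold is an assumption here.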

Fix: Compress the PDF first (see below).

Problem 2: PDF is Scanned / Image-Only

Symptoms: PDF uploads without error, but the AI cannot answer questions about its content. Queries return "I don't see this in the provided sources" for content you know is in the document.

Diagnosis: Open the PDF, try to select text with your cursor. If you cannot select any text, it is image-only.
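
The manual selection test can be approximated in code once per-page text has been extracted with a PDF library of your choice. The sketch below only covers the decision step: it takes a list of page strings and applies a heuristic. The 20-character and 10% thresholds are assumptions chosen for illustration, not values used by NotebookLM.

```python
from typing import Iterable


def looks_image_only(page_texts: Iterable[str], min_chars_per_page: int = 20) -> bool:
    """Heuristic: True when almost no page yields extractable text.

    page_texts holds the text extracted from each page (an empty string
    when nothing could be extracted). Both thresholds are arbitrary
    choices for this sketch.
    """
    pages = list(page_texts)
    if not pages:
        return True  # nothing to read at all
    pages_with_text = sum(
        1 for text in pages if len((text or "").strip()) >= min_chars_per_page
    )
    # Treat the file as image-only when fewer than 10% of pages have text.
    return pages_with_text / len(pages) < 0.10
```

A mixed result (some pages with text, some without) usually means a partly scanned document, which still benefits from OCR.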

Fix: Run OCR first (see below).

Problem 3: Document is Too Long (>500,000 words)

Symptoms: Upload fails with a word count error, or upload succeeds but AI queries only reference content from the beginning of the document.

Diagnosis: Word count can be checked in Word or by converting to text and using a word counter. A 500,000-word document is approximately 1,000 pages of dense academic text.
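
If the document is already available as plain text, a rough count takes a few lines. This sketch treats any whitespace-separated token as a word, which will not match NotebookLM's own counting exactly; the helper names are illustrative.

```python
import math

WORD_LIMIT = 500_000  # per-source word limit quoted in this guide


def count_words(text: str) -> int:
    """Count whitespace-separated tokens, a rough proxy for word count."""
    return len(text.split())


def words_over_limit(text: str, limit: int = WORD_LIMIT) -> int:
    """Return how many words the text exceeds the limit by (0 if within)."""
    return max(0, count_words(text) - limit)


def parts_needed(word_count: int, limit: int = WORD_LIMIT) -> int:
    """Minimum number of sources needed if words were split evenly."""
    return max(1, math.ceil(word_count / limit))
```

Because the count is approximate, leave some margin rather than splitting into parts that sit right at the limit.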

Fix: Split the document (see below).

Problem 4: Content Not Extracting Correctly

Symptoms: Upload succeeds, but the AI gives garbled or incorrect information. Common with heavily formatted documents (textbooks with sidebars, columns, tables).

Diagnosis: Ask the AI to quote a specific paragraph you know exists. If the quote is incorrect or mangled, extraction failed.
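
Comparing the AI's quote against the original by eye is error-prone for long passages. A small helper can do the comparison, assuming you have the true passage as text. It ignores case and whitespace differences, since line breaks rarely survive extraction; the function names are illustrative.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse all whitespace runs to single spaces."""
    return re.sub(r"\s+", " ", text).strip().lower()


def quote_found(quote: str, source_text: str) -> bool:
    """True when the quoted passage appears verbatim in the source,
    ignoring case and differences in line breaks or spacing."""
    needle = normalize(quote)
    return bool(needle) and needle in normalize(source_text)
```

A `False` result does not prove extraction failed (the model may have paraphrased), but repeated mismatches on passages you know exist are a strong hint.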

Fix: Try a text-only export (see below).

Fix 1: Compress Large PDFs

For PDFs that exceed 200MB due to images or embedded assets:

Free tools:

- Smallpdf (smallpdf.com): drag, compress, download. Works for most cases.
- ilovepdf (ilovepdf.com): similar, slightly more compression options
- Adobe Acrobat Reader (File → Print → Save as PDF, then adjust quality): reduces size by removing some embedded data

Settings to use:

- Target "Screen" or "Web" quality (72 DPI) if images are not critical
- "Print" quality (150 DPI) if you need to preserve image clarity
- Remove embedded fonts if the tool provides this option

After compression, verify the PDF is still readable before uploading.

Fix 2: OCR Scanned PDFs

For image-only PDFs with no text layer:

Google Drive (free, best quality for most documents):

1. Upload the PDF to Google Drive
2. Right-click → Open with Google Docs
3. Google automatically OCRs the document when opening as Docs
4. File → Download → PDF Document to get a text-layer PDF
5. Upload the downloaded PDF to NotebookLM

Adobe Acrobat (best quality, subscription required):

1. Open PDF in Acrobat
2. Tools → Scan & OCR → Recognize Text
3. Save the file
4. Upload to NotebookLM

Free online tools:

- OCR.space: good for short documents
- PDF24 OCR: handles longer documents, free
- Adobe Acrobat online (limited free OCR): acrobat.adobe.com

OCR accuracy varies by document quality. Handwritten text, unusual fonts, low-resolution scans, and heavily damaged documents will produce OCR errors. Always spot-check the extracted text by asking the AI about specific known content before relying on the results.

Fix 3: Split Long Documents

For documents exceeding 500,000 words or 200MB:

Split by chapter or section (recommended):

1. Open the PDF in Adobe Acrobat Reader, Foxit, or a browser PDF viewer
2. Check the table of contents for chapter page ranges
3. Use a split tool to extract each chapter as a separate file:
   - Smallpdf → Split → enter page ranges
   - ilovepdf → Split PDF → "Custom ranges"
   - PDF24 → Split PDF → "Specific page ranges"
4. Name each file clearly: Dissertation-Ch1-Introduction.pdf, Dissertation-Ch2-Literature.pdf
5. Upload all parts as separate sources to the same notebook
6. Add a text note to the notebook explaining the split structure

Split by content type (for textbooks):

- Part 1: Theoretical chapters
- Part 2: Case studies and examples
- Part 3: Appendices and references (often can be omitted)

When splitting, be generous with overlap at chapter boundaries. Include the last page of one chapter in the start of the next split — important conclusions and transitions at chapter ends are easy to lose at split points.
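
The page ranges to type into a split tool, including the one-page overlap, can be worked out from the chapter start pages. This is a sketch under stated assumptions: pages are 1-based, `chapter_starts` comes from the table of contents, and `split_ranges` is an illustrative helper, not part of any split tool.

```python
from typing import List, Tuple


def split_ranges(chapter_starts: List[int], total_pages: int,
                 overlap: int = 1) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (first_page, last_page) ranges per chapter.

    Each range after the first begins `overlap` pages early, so the end
    of the previous chapter is repeated at the start of the next part.
    Pages before the first chapter start are not included.
    """
    if not chapter_starts or chapter_starts != sorted(set(chapter_starts)):
        raise ValueError("chapter_starts must be non-empty, sorted, and unique")
    if chapter_starts[0] < 1 or chapter_starts[-1] > total_pages:
        raise ValueError("chapter starts must lie within the document")
    ranges = []
    for i, start in enumerate(chapter_starts):
        is_last = i + 1 == len(chapter_starts)
        end = total_pages if is_last else chapter_starts[i + 1] - 1
        first = start if i == 0 else max(1, start - overlap)
        ranges.append((first, end))
    return ranges
```

For a 260-page document with parts starting at pages 1, 86, 141, and 211, `split_ranges([1, 86, 141, 211], 260)` gives `[(1, 85), (85, 140), (140, 210), (210, 260)]`, so each later part repeats the final page of the one before it.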

Fix 4: Text-Only Export for Formatting Issues

For documents with heavy formatting (multi-column layouts, textbooks with sidebars):

From Word/Google Docs:

1. Open the source document in Word or Docs
2. File → Download/Save As → Plain text (.txt)
3. Upload the .txt file to NotebookLM instead of the PDF

From PDF (when you have the source file):

1. Use the original Word/InDesign file if available
2. Export as plain text or simple PDF (single-column, no sidebars)

Last resort — copy-paste into a text note: For short documents (under 20,000 words), select all text in the PDF (Ctrl+A), copy, and paste into a NotebookLM text note. This strips all formatting but preserves all content.

Working with Multiple PDF Parts in One Notebook

When you have split a long document into multiple parts, help the AI understand the structure:

Add a "navigator" text note:

```
DOCUMENT STRUCTURE NOTE: This notebook contains a split version of: [Document title]

Part 1 (Ch. 1-3): Background and literature review, pages 1-85
Part 2 (Ch. 4-5): Methodology and data collection, pages 86-140
Part 3 (Ch. 6-7): Results and analysis, pages 141-210
Part 4 (Ch. 8-9): Discussion and conclusions, pages 211-260

When answering questions about conclusions or implications, prioritize Part 4. For methodology questions, prioritize Part 2.
```

This context note tells the AI which part covers which topic, which helps it give coherent answers about a split document.
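
If you split several documents, the part list in a navigator note can be generated rather than typed by hand. This is a minimal sketch; `build_navigator_note` and the tuple layout are illustrative assumptions, and the prioritization sentence is left for you to write per document.

```python
from typing import List, Tuple

# (chapter label, description, first page, last page); values are illustrative.
Part = Tuple[str, str, int, int]


def build_navigator_note(title: str, parts: List[Part]) -> str:
    """Build the header and part list of a navigator note as plain text."""
    lines = [
        "DOCUMENT STRUCTURE NOTE: This notebook contains a split version of: " + title,
        "",
    ]
    for number, (label, description, first, last) in enumerate(parts, start=1):
        lines.append(f"Part {number} ({label}): {description}, pages {first}-{last}")
    return "\n".join(lines)


note = build_navigator_note(
    "Example Dissertation",  # hypothetical title
    [
        ("Ch. 1-3", "Background and literature review", 1, 85),
        ("Ch. 4-5", "Methodology and data collection", 86, 140),
    ],
)
print(note)
```

Paste the printed text into a NotebookLM text note, then add a line saying which part to prioritize for which kind of question.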

Effective prompts for split documents:

Format-Specific Tips

Academic Papers

Most academic papers are well within limits. The common issue is multi-paper collections (literature compilations, edited volumes). Split these into individual papers and label clearly.

Legal Documents

Legal documents often contain extensive headers, footers, page numbers, and formatting that confuse text extraction. The plain text export approach works well here — paste text into a NotebookLM note, stripping the formatting noise.

Technical Documentation

Documentation sets (e.g., an entire API reference, a multi-chapter technical manual) benefit from topical splitting. Rather than splitting by page count, split by what you are trying to learn: "Authentication docs", "Data models", "Error handling" as separate sources.

Textbooks

Textbooks vary widely. Recent textbooks are often text-searchable PDFs within size limits. Older textbooks from scanned sources need OCR. Highly illustrated textbooks may need compression. The most useful split for textbooks is by chapter or major unit.

What These Workarounds Cannot Fix

- Password-protected PDFs: NotebookLM cannot read encrypted PDFs. Remove the password protection first.
- Right-to-left text (Arabic, Hebrew): Extraction quality is variable; verify before relying on results
- Mathematical notation: Equations may not extract correctly; OCR and PDF text extraction both struggle with LaTeX-rendered math
- Complex tables: Multi-row, multi-column tables often extract as garbled text; consider copying the table content manually into a text note

Summary

The troubleshooting flow: file size problem → compress first; scanned PDF → OCR first; too long → split by chapter; bad formatting → use text export. Most large-document problems are solvable with these four fixes. The key is diagnosing which problem you have before trying workarounds.
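
The flow can also be written as a small decision function. This is a sketch under stated assumptions: the limits are the ones quoted in this guide, the inputs are things you have measured yourself, and `recommend_fixes` is an illustrative name.

```python
from typing import List

SIZE_LIMIT_MB = 200   # per-source file size limit quoted in this guide
WORD_LIMIT = 500_000  # per-source word limit quoted in this guide


def recommend_fixes(size_mb: float, has_text_layer: bool,
                    word_count: int, extraction_ok: bool = True) -> List[str]:
    """Map measured properties of a PDF to the fixes described above."""
    fixes = []
    if size_mb > SIZE_LIMIT_MB:
        fixes.append("compress")
    if not has_text_layer:
        fixes.append("ocr")
    if word_count > WORD_LIMIT:
        fixes.append("split")
    # Garbled answers only point to a formatting problem when text exists.
    if has_text_layer and not extraction_ok:
        fixes.append("text export")
    return fixes
```

A document can need more than one fix: a large scanned book, for example, may come back as `["compress", "ocr"]`, and its word count is only known after OCR.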

For research workflows using these sources: NotebookLM Research Workflows
For exporting findings from your notebook: Getting Your Data Out of NotebookLM
For the complete NotebookLM reference: Complete NotebookLM Guide