Status: Active
Last Updated: 2026-01-15
This guide covers everything you need to contribute to notebooklm-py: architecture overview, testing, and releasing.
src/notebooklm/
├── __init__.py # Public exports
├── client.py # NotebookLMClient main class
├── auth.py # Authentication handling
├── types.py # Dataclasses and type definitions
├── _core.py # Core HTTP/RPC infrastructure
├── _notebooks.py # NotebooksAPI implementation
├── _sources.py # SourcesAPI implementation
├── _artifacts.py # ArtifactsAPI implementation
├── _chat.py # ChatAPI implementation
├── _research.py # ResearchAPI implementation
├── _notes.py # NotesAPI implementation
├── rpc/ # RPC protocol layer
│   ├── __init__.py
│   ├── types.py # RPCMethod enum and constants
│   ├── encoder.py # Request encoding
│   └── decoder.py # Response parsing
└── cli/ # CLI implementation
    ├── __init__.py # CLI package exports
    ├── helpers.py # Shared utilities
    ├── session.py # login, use, status, clear
    ├── notebook.py # list, create, delete, rename
    ├── source.py # source add, list, delete
    ├── artifact.py # artifact list, get, delete
    ├── generate.py # generate audio, video, etc.
    ├── download.py # download audio, video, etc.
    ├── chat.py # ask, configure, history
    └── ...
┌─────────────────────────────────────────────────────────────┐
│                          CLI Layer                          │
│  cli/session.py, cli/notebook.py, cli/generate.py, etc.     │
└───────────────────────────┬─────────────────────────────────┘
                            │
┌───────────────────────────▼─────────────────────────────────┐
│                        Client Layer                         │
│  NotebookLMClient → NotebooksAPI, SourcesAPI, ArtifactsAPI  │
└───────────────────────────┬─────────────────────────────────┘
                            │
┌───────────────────────────▼─────────────────────────────────┐
│                         Core Layer                          │
│  ClientCore → _rpc_call(), HTTP client                      │
└───────────────────────────┬─────────────────────────────────┘
                            │
┌───────────────────────────▼─────────────────────────────────┐
│                          RPC Layer                          │
│  encoder.py, decoder.py, types.py (RPCMethod)               │
└─────────────────────────────────────────────────────────────┘
| Layer | Files | Responsibility |
|---|---|---|
| CLI | `cli/*.py` | User commands, input validation, Rich output |
| Client | `client.py`, `_*.py` | High-level Python API, returns typed dataclasses |
| Core | `_core.py` | HTTP client, request counter, RPC abstraction |
| RPC | `rpc/*.py` | Protocol encoding/decoding, method IDs |
Why underscore prefixes? Files like _notebooks.py are internal implementation details. Keeping them private leaves the public API clean (from notebooklm import NotebookLMClient).
Why namespaced APIs? client.notebooks.list() instead of client.list_notebooks() - better organization, scales well, tab-completion friendly.
Why async? Google's API can be slow. Async enables concurrent operations and non-blocking downloads.
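To make the async rationale concrete, here is a minimal sketch of concurrent calls with `asyncio.gather`. The `fetch_notebook` coroutine is a hypothetical stand-in that simulates a slow request with a sleep; it is not a notebooklm-py function, and the real client methods may look different.

```python
import asyncio


async def fetch_notebook(notebook_id: str, delay: float = 0.1) -> dict:
    """Hypothetical stand-in for a slow API call (not part of notebooklm-py)."""
    await asyncio.sleep(delay)  # simulates network latency
    return {"id": notebook_id}


async def fetch_all(notebook_ids: list[str]) -> list[dict]:
    # gather() schedules all coroutines concurrently and returns results
    # in the same order as the inputs.
    return await asyncio.gather(*(fetch_notebook(n) for n in notebook_ids))


def run_demo() -> list[dict]:
    return asyncio.run(fetch_all(["nb-1", "nb-2", "nb-3"]))
```

With three simulated requests the waits overlap, so the total time is close to one delay rather than the sum of all three.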
New RPC Method:
- Capture traffic (see RPC Development Guide)
- Add to `rpc/types.py`: `NEW_METHOD = "AbCdEf"`
- Implement in appropriate `_*.py` API class
- Add dataclass to `types.py` if needed
- Add CLI command if user-facing
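The steps above might look roughly like this in code. Everything here is a self-contained sketch under stated assumptions: `NEW_METHOD`, the `"AbCdEf"` ID, `NewThing`, `NewThingAPI`, `FakeCore`, and the `rpc_call` signature are placeholders for illustration, not the library's real definitions.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum


class RPCMethod(str, Enum):
    """Stand-in for the enum in rpc/types.py; the ID is a placeholder."""
    NEW_METHOD = "AbCdEf"


@dataclass
class NewThing:
    """Hypothetical dataclass that would live in types.py."""
    id: str
    title: str


class FakeCore:
    """Test double for the core layer; returns a canned payload."""

    async def rpc_call(self, method: RPCMethod, params: list) -> list:
        return [params[0], "Example title"]


class NewThingAPI:
    """Hypothetical API class that would live in a _*.py module."""

    def __init__(self, core) -> None:
        self._core = core

    async def get(self, thing_id: str) -> NewThing:
        # Send the RPC, then convert the raw list into a typed dataclass.
        raw = await self._core.rpc_call(RPCMethod.NEW_METHOD, [thing_id])
        return NewThing(id=raw[0], title=raw[1])
```

Injecting the core object keeps the API class testable without network access, which matches the unit-test layer described later in this guide.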
New API Class:
- Create `_newfeature.py` with `NewFeatureAPI` class
- Add to `client.py`: `self.newfeature = NewFeatureAPI(self._core)`
- Export types from `__init__.py`
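As a rough illustration of the wiring step, the sketch below attaches a namespace object to a client and lets it share the client's core, including a request counter. `DemoCore`, `DemoClient`, the `call` method, and `request_count` are hypothetical names chosen for this example; the real `ClientCore` and `NotebookLMClient` may be structured differently.

```python
import asyncio


class DemoCore:
    """Hypothetical stand-in for the shared core object."""

    def __init__(self) -> None:
        self.request_count = 0

    async def call(self, name: str) -> str:
        self.request_count += 1  # every namespace increments the same counter
        return f"ok:{name}"


class NewFeatureAPI:
    def __init__(self, core: DemoCore) -> None:
        self._core = core

    async def list(self) -> str:
        return await self._core.call("newfeature.list")


class DemoClient:
    """Mimics the wiring step: each namespace receives the same core."""

    def __init__(self) -> None:
        self._core = DemoCore()
        self.newfeature = NewFeatureAPI(self._core)


async def main() -> tuple:
    client = DemoClient()
    result = await client.newfeature.list()
    return result, client._core.request_count
```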
- Install dependencies: `uv pip install -e ".[dev]"`
- Authenticate: `notebooklm login`
- Create read-only test notebook (required for E2E tests):
  - Create notebook at NotebookLM
  - Add multiple sources (text, URL, etc.)
  - Generate artifacts (audio, quiz, etc.)
  - Set env var: `export NOTEBOOKLM_READ_ONLY_NOTEBOOK_ID="your-id"`
# Unit + integration tests (no auth needed)
pytest
# E2E tests (requires auth + test notebook)
pytest tests/e2e -m readonly # Read-only tests only
pytest tests/e2e -m "not variants" # Skip parameter variants
pytest tests/e2e --include-variants # All tests including variants
tests/
├── unit/ # No network, fast, mock everything
├── integration/ # Mocked HTTP responses + VCR cassettes
└── e2e/ # Real API calls (requires auth)
| Fixture | Use Case |
|---|---|
| `read_only_notebook_id` | List/download existing artifacts |
| `temp_notebook` | Add/delete sources (auto-cleanup) |
| `generation_notebook_id` | Generate artifacts (CI-aware cleanup) |
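A fixture such as `read_only_notebook_id` presumably has to read the notebook ID from the environment. The helper below is a small sketch of that lookup only; the function name and the treat-blank-as-unset behavior are assumptions, and the project's actual fixtures may work differently.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "NOTEBOOKLM_READ_ONLY_NOTEBOOK_ID"


def resolve_read_only_notebook_id(
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the configured notebook ID, or None if unset or blank.

    Hypothetical helper for illustration; not part of notebooklm-py.
    """
    source = os.environ if env is None else env
    value = source.get(ENV_VAR, "").strip()
    return value or None
```

A pytest fixture could call this helper and invoke `pytest.skip(...)` when it returns `None`, so E2E tests are skipped instead of failing on machines without a test notebook.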
NotebookLM has undocumented rate limits. Generation tests may be skipped when rate limited:
- Use `pytest tests/e2e -m readonly` for quick validation
- Wait a few minutes between full test runs
- `SKIPPED (Rate limited by API)` is expected behavior, not failure
Record HTTP interactions for offline/deterministic replay:
# Record new cassettes (committed to repo with sensitive data scrubbed)
NOTEBOOKLM_VCR_RECORD=1 pytest tests/integration/test_vcr_*.py -v
# Run with recorded responses
pytest tests/integration/test_vcr_*.py
Sensitive data (cookies, tokens, emails) is automatically scrubbed.
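For readers curious what scrubbing can involve, here is a minimal sketch of redacting sensitive headers and email addresses before a recording is saved. It is not the project's actual scrubbing code: the header list, the regex, and the placeholder strings are assumptions for illustration. If the suite is built on vcrpy, functions like these could be plugged into its `filter_headers` or `before_record_response` options.

```python
import re

# Simple email pattern; good enough for a sketch, not RFC-complete.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Assumed set of headers that can carry credentials.
SENSITIVE_HEADERS = {"cookie", "set-cookie", "authorization"}


def scrub_headers(headers: dict) -> dict:
    """Replace values of credential-bearing headers with a placeholder."""
    return {
        name: ("SCRUBBED" if name.lower() in SENSITIVE_HEADERS else value)
        for name, value in headers.items()
    }


def scrub_body(text: str) -> str:
    """Replace anything that looks like an email address."""
    return EMAIL_RE.sub("SCRUBBED_EMAIL", text)
```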
Need network?
├── No → tests/unit/
├── Mocked → tests/integration/
└── Real API → tests/e2e/
    └── What notebook?
        ├── Read-only → read_only_notebook_id + @pytest.mark.readonly
        ├── CRUD → temp_notebook
        └── Generation → generation_notebook_id
            └── Parameter variant? → add @pytest.mark.variants
| Workflow | Trigger | Purpose |
|---|---|---|
| `test.yml` | Push/PR | Unit tests, linting, type checking |
| `nightly.yml` | Daily 6 AM UTC | E2E tests with real API |
| `rpc-health.yml` | Daily 7 AM UTC | RPC method ID monitoring (see stability.md) |
| `testpypi-publish.yml` | Manual dispatch | Publish to TestPyPI |
| `verify-package.yml` | Manual dispatch | Verify TestPyPI or PyPI install + E2E |
| `publish.yml` | Tag push | Publish to PyPI |
- Get storage state: `cat ~/.notebooklm/storage_state.json`
- Add GitHub secrets:
  - `NOTEBOOKLM_AUTH_JSON`: Storage state JSON
  - `NOTEBOOKLM_READ_ONLY_NOTEBOOK_ID`: Your test notebook ID
| Task | Frequency |
|---|---|
| Refresh credentials | Every 1-2 weeks |
| Check nightly results | Daily |
First step: Run notebooklm auth check --json in your workflow to diagnose issues.
Cause: The NOTEBOOKLM_AUTH_JSON env var is set to an empty string.
Solution:
- Ensure the GitHub secret is properly configured
- Check the secret isn't empty or whitespace-only
- Verify the workflow syntax: `${{ secrets.NOTEBOOKLM_AUTH_JSON }}`
Cause: The JSON in NOTEBOOKLM_AUTH_JSON is missing the required structure.
Solution: Ensure your secret contains valid Playwright storage state JSON:
{
  "cookies": [
    {"name": "SID", "value": "...", "domain": ".google.com", ...},
    ...
  ],
  "origins": []
}
Cause: You're trying to run notebooklm login in CI/CD where NOTEBOOKLM_AUTH_JSON is set.
Why: The login command saves to a file, which conflicts with environment-based auth.
Solution:
- Don't run `login` in CI/CD; use the env var for auth instead
- If you need to refresh auth, do it locally and update the secret
Cause: Google sessions expire periodically (typically every 1-2 weeks).
Solution:
- Re-run `notebooklm login` locally
- Copy the contents of `~/.notebooklm/storage_state.json`
- Update your GitHub secret
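Before pasting a refreshed storage state into the secret, a quick local check can catch the empty-value and missing-structure cases described above. The function below is a sketch, not the logic behind `notebooklm auth check`; it only verifies the basic shape (a JSON object with a non-empty `cookies` list whose entries have `name`, `value`, and `domain`) and makes no claim about which specific cookies the library requires.

```python
import json


def validate_storage_state(raw: str) -> list:
    """Return a list of problems; an empty list means the basic shape is fine.

    Hypothetical helper for illustration; not part of notebooklm-py.
    """
    if not raw or not raw.strip():
        return ["value is empty or whitespace-only"]
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]

    cookies = data.get("cookies")
    if not isinstance(cookies, list) or not cookies:
        return ['"cookies" must be a non-empty list']

    problems = []
    required = {"name", "value", "domain"}
    for index, cookie in enumerate(cookies):
        if not isinstance(cookie, dict) or not required <= set(cookie):
            problems.append(f"cookie {index} lacks name/value/domain")
    return problems
```

For example, calling it on the contents of `~/.notebooklm/storage_state.json` and printing the returned list gives a fast sanity check before updating the GitHub secret.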
Use separate secrets and set NOTEBOOKLM_AUTH_JSON per job:
jobs:
  account-1:
    env:
      NOTEBOOKLM_AUTH_JSON: ${{ secrets.NOTEBOOKLM_AUTH_ACCOUNT1 }}
    steps:
      - run: notebooklm list
  account-2:
    env:
      NOTEBOOKLM_AUTH_JSON: ${{ secrets.NOTEBOOKLM_AUTH_ACCOUNT2 }}
    steps:
      - run: notebooklm list
Add diagnostic steps to your workflow:
- name: Debug auth
  run: |
    # Comprehensive auth check (preferred)
    notebooklm auth check --json
    # Check if env var is set (without revealing content)
    if [ -n "$NOTEBOOKLM_AUTH_JSON" ]; then
      echo "NOTEBOOKLM_AUTH_JSON is set (length: ${#NOTEBOOKLM_AUTH_JSON})"
    else
      echo "NOTEBOOKLM_AUTH_JSON is not set"
    fi
The `auth check --json` output shows:
- Whether storage/env var is being used
- Which cookies are present
- Cookie domains (important for regional users)
- Any validation errors
- Check existing implementations in `_*.py` files
- Look at test files for expected structures
- See RPC Development Guide for protocol details
- Open an issue with captured request/response (sanitized)