Project

SPECTRA

Security Policy, Emerging Cyber Threats, Research & AI — an automated monthly cybersecurity digest. 17 public sources, AI curation, branded PDF delivered to email and Discord.

Live · Running monthly since March 2026

What it is

An automated monthly digest pipeline. On the 1st of each month at 5 AM, a Claude Code scheduled task fans out across 17 public RSS and API sources (NIST, CISA, FedRAMP, Federal Register, Congress, industry news outlets, AI research blogs), pulls the new items, deduplicates them, and runs them through Claude for categorization, scoring, and summarization. The output is a seven-section branded PDF, delivered via email and Discord. No subscription services, no API keys — the whole pipeline runs on a Claude Code subscription and a scheduled task runner.

Why this exists

DoD and federal cybersecurity practitioners need to stay current on policy, threats, and AI developments. The available options are too narrow (vendor newsletters that only cover their own products), too noisy (every CISA advisory in the inbox), or too expensive (paid intelligence platforms). None of them are tuned to the practitioner who needs signal, not coverage.

SPECTRA scratched a personal itch first — I wanted one PDF on the 1st of each month that told me what changed and why I should care. The architecture (scheduled task + multi-source aggregation + AI curation + branded delivery) generalizes to any digest workflow. The cybersecurity policy slice is just the first instance.

Architecture

Five stages, fanned in from 17 sources and out to two delivery channels. The whole pipeline runs as a single Claude Code scheduled task — no orchestrator, no queue, no API key.

SPECTRA architecture: 17 sources fan into a 5-stage pipeline (Collect, Prep, Curate, Render, Deliver) driven by a Claude Code scheduled task; output is a monthly PDF delivered via email and Discord.

Collect

RSS and HTTP fetchers pull from 17 public sources in parallel. Each source has its own fetcher module (Federal Register API, Congress.gov API, CISA KEV, etc.). A per-source health report is written for every run.
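A minimal sketch of the fan-out, assuming a hypothetical fetcher registry (the real fetcher modules live under src/collect/ and the names here are invented): each source runs in its own thread, and a failing feed is recorded in the health report instead of aborting the run.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from dataclasses import dataclass


@dataclass
class SourceHealth:
    source: str
    rows: int = 0
    error: str = ""


# Illustrative stand-ins for per-source fetcher modules.
FETCHERS = {
    "cisa_kev": lambda: [{"title": "KEV catalog update", "url": "https://example.gov/kev"}],
    "federal_register": lambda: [],
}


def collect_all(fetchers):
    """Run every fetcher concurrently; return (items, per-source health report)."""
    items, health = [], []
    with ThreadPoolExecutor(max_workers=len(fetchers)) as pool:
        futures = {pool.submit(fn): name for name, fn in fetchers.items()}
        for fut in as_completed(futures):
            name = futures[fut]
            try:
                rows = fut.result()
                items.extend(rows)
                health.append(SourceHealth(name, rows=len(rows)))
            except Exception as exc:  # one bad feed must not sink the whole run
                health.append(SourceHealth(name, error=str(exc)))
    return items, health
```

The health report then only needs to be serialized per run (rows fetched, last item date, errors) for the "Health report written for each run" behavior described above.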

Prep

Loads the collected items, deduplicates them by URL and title similarity, and normalizes dates and source labels into a single intermediate JSON for the curator.
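The two-pass dedup could look like this sketch (the 0.9 similarity threshold and the item shape are assumptions, not the repo's actual values): exact URL matches are dropped first, then titles that are near-duplicates of an already-kept item.

```python
from difflib import SequenceMatcher


def dedup(items, threshold=0.9):
    """Drop exact-URL duplicates, then near-duplicate titles (illustrative threshold)."""
    seen_urls, kept = set(), []
    for item in items:
        if item["url"] in seen_urls:
            continue  # same story syndicated at the same URL
        if any(
            SequenceMatcher(None, item["title"].lower(), k["title"].lower()).ratio() >= threshold
            for k in kept
        ):
            continue  # same story under a slightly different headline
        seen_urls.add(item["url"])
        kept.append(item)
    return kept
```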

Curate

Claude reads each item, picks one of seven sections, scores relevance 1-10 against the federal-cyber-practitioner reader, writes a 2-3 sentence summary, and consolidates duplicate stories into single entries.

Render

ReportLab assembles the curated markdown into a branded PDF: seven sections, the top 20 items per section by score, and provenance links back to every source.

Deliver

SMTP relay sends the PDF to subscriber email; Discord webhook posts the same PDF to a channel for quick-scan reading on mobile.
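Both channels can ship the same bytes; a hedged sketch with invented addresses and parameter names (the real relay host, sender, and webhook live in the repo's config):

```python
import smtplib
from email.message import EmailMessage


def build_email(pdf_bytes, to, sender="spectra@example.com"):
    """Wrap the finished PDF in a MIME message (sender address is a placeholder)."""
    msg = EmailMessage()
    msg["Subject"] = "SPECTRA monthly digest"
    msg["From"] = sender
    msg["To"] = to
    msg.set_content("This month's SPECTRA digest is attached.")
    msg.add_attachment(pdf_bytes, maintype="application", subtype="pdf",
                       filename="SPECTRA.pdf")
    return msg


def deliver(pdf_path, email_to, webhook_url, smtp_host="localhost"):
    """Send one PDF to an SMTP relay and a Discord webhook (illustrative)."""
    import requests  # third-party; any HTTP client that does multipart works

    data = open(pdf_path, "rb").read()
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(build_email(data, email_to))
    # Discord webhooks accept multipart file uploads up to the channel's size limit.
    requests.post(
        webhook_url,
        files={"file": ("SPECTRA.pdf", data, "application/pdf")},
        data={"content": "SPECTRA monthly digest"},
        timeout=30,
    )
```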

How it works

One run, end-to-end:

The scheduled task fires at 5 AM on the 1st of the month. Collect runs all 17 fetchers concurrently and writes a per-source health report (rows fetched, last item date, errors). Prep loads everything, dedups, and produces a single normalized JSON, typically a few hundred items for a month.

Curate is the only stage that calls Claude. The prompt asks Claude to read each item, decide which of the seven sections it belongs in (Policy & Compliance, Publications & Standards, Threats & Incidents, AI & Agentic Developments, Legislative Highlights, Upcoming Conferences, plus an Executive Summary built from the top scoring items across sections), assign a 1-10 relevance score, write a short summary, and consolidate duplicates. The output is a markdown draft.
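The curation instruction described above might be assembled like this (the wording and function name are illustrative, not the repo's actual prompt):

```python
SECTIONS = [
    "Policy & Compliance", "Publications & Standards", "Threats & Incidents",
    "AI & Agentic Developments", "Legislative Highlights", "Upcoming Conferences",
]


def curation_prompt(items):
    """Build the per-batch instruction Claude sees (wording is a sketch)."""
    header = (
        "You are curating a monthly digest for federal cybersecurity practitioners.\n"
        "For each item below: pick exactly one section from "
        + "; ".join(SECTIONS)
        + ". Assign a 1-10 relevance score for that reader, write a 2-3 sentence "
        "summary, and consolidate items covering the same story into one entry.\n"
        "Return a markdown draft, keeping each item's source URL."
    )
    body = "\n".join(f"- [{i['title']}]({i['url']}) ({i['source']})" for i in items)
    return header + "\n\n" + body
```

The Executive Summary is then derived from the top-scoring items across sections rather than curated as its own input category.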

Render takes the draft and produces a branded PDF using ReportLab. Deliver sends the PDF to email and Discord. The whole run from fetch to inbox is typically under 10 minutes.

What this demonstrates

Scheduled-task automation

End-to-end pipeline driven by one cron-style scheduled task. No orchestration framework, no message queue, no manual intervention.
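The actual trigger is a Claude Code scheduled task, but the cadence is equivalent to a single crontab entry ("at 5 AM on the 1st of every month"); the script path here is hypothetical:

```shell
# min hour day-of-month month day-of-week   command
0 5 1 * *   cd ~/SPECTRA && ./run_monthly.sh
```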

Multi-source aggregation

17 sources spanning RSS, government APIs, and industry news, with per-source fetcher modules and a health report covering every run.

AI curation with provenance

Claude categorizes, scores, and summarizes — but every item in the final PDF carries a link back to its source. The model curates; the source is authoritative.

Branded PDF generation

ReportLab pipeline produces a multi-section, multi-page PDF that reads like a real briefing — not a markdown export, not an email blast.

Multi-channel delivery

Same artifact, two channels: email (for archival) and Discord (for mobile scan). Delivery is the last 5% of the pipeline; getting the artifact to where the reader actually is matters.

See the output

SPECTRA's artifact is a monthly PDF. Two pages from the April 2026 edition:

SPECTRA sample PDF page 1: executive summary with the top items from across all sections.
SPECTRA sample PDF mid-document: one of the seven content sections with multiple items, each showing source, score, and summary.

Download the Sample PDF ↓

Or run it yourself

The collect, prep, and render stages run from the command line; the curate stage is driven by a Claude Code scheduled task that you'd configure separately:

git clone https://github.com/JAKSecurity/SPECTRA.git
cd SPECTRA
pip install -r requirements.txt

# Collect from all sources
PYTHONPATH="." python -m src.collect.runner src/collect/config.yaml data/sources

# Prep (load + dedup)
PYTHONPATH="." python -m src.curate.curate data/sources/2026-04 data/drafts/prepped_2026-04.json

# Render to PDF (after Claude Code curation produces the draft)
PYTHONPATH="." python -m src.render.render data/drafts/2026-04-SPECTRA.md output/2026-04-SPECTRA.pdf

Source on GitHub. Full source list in src/collect/config.yaml.

Stack & scale

Stack

Python 3.10+ · ReportLab · Claude Code · RSS / HTTP · SMTP · Discord webhooks

Scale

984 lines · 59 tests · 17 sources · 7 sections · Monthly cadence · ~10 min/run

What's not here

This is a personal pipeline, not a publishing platform. All sources are public and unclassified — no authenticated feeds, no proprietary intelligence streams, no paywalled content. Delivery is to a single recipient stream (Jeff's inbox + a private Discord channel), not a subscriber list. The 17-source set is curated for federal cyber practitioners; it could expand, but adding sources is a deliberate design decision, not a default. The architecture is the point; the specific scope is intentionally small.