Data Structures & Relationships for European Parliament Intelligence
📊 Entity Models • 🔗 Data Relationships • 📋 Schema Documentation
📋 Document Owner: CEO | 📄 Version: 1.2 | 📅 Last Updated: 2026-04-20 (UTC) | 📦 Release: v0.8.40
🔄 Review Cycle: Quarterly | ⏰ Next Review: 2026-07-20
This document defines the data structures and relationships used in the EU Parliament Monitor platform for news generation, storage, and delivery.
The article data flow has shifted from "AI authors HTML" to "AI authors markdown artifacts, and the aggregator renders HTML deterministically". The canonical on-disk schema is now:
- analysis/daily/<YYYY-MM-DD>/<article-type-slug>-run<NN>/: the authoritative unit of a news run. Contains every artifact listed in the manifest.json (see § Manifest Schema below) plus the rendered article.html (produced by npm run generate-article).
- manifest.json at the root of each run directory: the aggregator's index, with a top-level articleType and a files object listing every artifact. Stage-C enforces the schema at self-review time.
- news/<YYYY-MM-DD>-<slug>-run<NN>-<lang>.html: 14 language variants emitted by the aggregator + translation flush.
The following sections still document the pre-migration entity model (article HTML as the authoritative artifact, strategy-per-type render). The src/utils/content-validator.ts gates documented in § Dual Economic Context Gate and the src/generators/strategies/ mapping documented in § Article Type Definitions were removed in April 2026. A follow-up PR rewrites those sections against the aggregator data model. Until then, read them in conjunction with this banner and with ARCHITECTURE.md § Key Characteristics.
Every analysis run under analysis/daily/<date>/<slug>-run<NN>/ carries a manifest.json that the aggregator reads to decide what to render.
Validation rules (enforced by the Stage-C agent-side review):
| Rule | Rationale |
|---|---|
| Top-level articleType present and matches one of the 8 slugs | The aggregator uses this to pick the right shared-chrome variant |
| files present as an object (nested category → string[] or flat path → description) | Walked in canonical order by src/aggregator/artifact-order.ts |
| Every files.* entry resolves to an existing file under the run directory | Broken links fail the render |
| Latest history[] entry with a non-PENDING gateResult is carried forward on re-runs | Preserves the last GREEN / ANALYSIS_ONLY stamp |
| At least one of the per-type required artifacts from .github/prompts/05-analysis-to-article-contract.md § 4 present | Prevents a thin run from publishing |
The aggregator reads the manifest via src/aggregator/analysis-aggregator.ts
and walks artifacts in the order defined by src/aggregator/artifact-order.ts.
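The first three validation rules above can be sketched as a pure function. This is an illustrative sketch, not the Stage-C implementation: `ManifestLike`, `validateManifest`, `listArtifactPaths`, the injected `fileExists` callback, and the `ARTICLE_SLUGS` list (copied from the article type codes documented later in this file) are assumptions made for the example. The history carry-forward and per-type required-artifact rules are left out.

```typescript
// Hypothetical sketch of the manifest rules; all names here are illustrative.
const ARTICLE_SLUGS = [
  "breaking", "week-ahead", "week-in-review", "month-ahead",
  "month-in-review", "committee-reports", "motions", "propositions",
] as const;

// files may be nested (category -> string[]) or flat (path -> description).
type ManifestFiles = Record<string, string[] | string>;

interface ManifestLike {
  articleType?: unknown;
  files?: unknown;
}

// Collect every artifact path from either the nested or the flat layout.
function listArtifactPaths(files: ManifestFiles): string[] {
  const paths: string[] = [];
  for (const [key, value] of Object.entries(files)) {
    if (Array.isArray(value)) paths.push(...value); // nested: category -> paths
    else paths.push(key); // flat: the key itself is the path
  }
  return paths;
}

// Returns a list of human-readable problems; an empty list means "valid".
function validateManifest(
  manifest: ManifestLike,
  fileExists: (relativePath: string) => boolean,
): string[] {
  const errors: string[] = [];
  const slugs: readonly string[] = ARTICLE_SLUGS;
  if (typeof manifest.articleType !== "string" || !slugs.includes(manifest.articleType)) {
    errors.push("articleType missing or not a known slug");
  }
  const files = manifest.files;
  if (typeof files !== "object" || files === null || Array.isArray(files)) {
    errors.push("files must be an object");
    return errors;
  }
  for (const path of listArtifactPaths(files as ManifestFiles)) {
    if (!fileExists(path)) errors.push(`missing artifact: ${path}`);
  }
  return errors;
}
```

Passing the existence check in as a callback keeps the sketch free of filesystem access; a caller could supply a function that resolves each path against the run directory.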
- Simplicity: Flat file structure, no databases
- Immutability: Generated articles never modified after creation
- Traceability: Generation metadata tracks provenance
- Multi-language: Language-specific content with shared structure
- Public Data: All data from European Parliament open sources
| Document | Focus | Description | Documentation Link |
|---|---|---|---|
| Architecture | 🏛️ Architecture | C4 model showing current system structure | View Source |
| Future Architecture | 🏛️ Architecture | C4 model showing future system structure | View Source |
| Mindmaps | 🧠 Concept | Current system component relationships | View Source |
| Future Mindmaps | 🧠 Concept | Future capability evolution | View Source |
| SWOT Analysis | 💼 Business | Current strategic assessment | View Source |
| Future SWOT Analysis | 💼 Business | Future strategic opportunities | View Source |
| Data Model | 📊 Data | Current data structures and relationships | View Source |
| Future Data Model | 📊 Data | Enhanced European Parliament data architecture | View Source |
| Flowcharts | 🔄 Process | Current data processing workflows | View Source |
| Future Flowcharts | 🔄 Process | Enhanced AI-driven workflows | View Source |
| State Diagrams | 🔄 Behavior | Current system state transitions | View Source |
| Future State Diagrams | 🔄 Behavior | Enhanced adaptive state transitions | View Source |
| Security Architecture | 🛡️ Security | Current security implementation | View Source |
| Future Security Architecture | 🛡️ Security | Security enhancement roadmap | View Source |
| Threat Model | 🎯 Security | STRIDE threat analysis | View Source |
| Classification | 🏷️ Governance | CIA classification & BCP | View Source |
| CRA Assessment | 🛡️ Compliance | Cyber Resilience Act | View Source |
| Workflows | ⚙️ DevOps | CI/CD documentation | View Source |
| Future Workflows | 🚀 DevOps | Planned CI/CD enhancements | View Source |
| Business Continuity Plan | 🔄 Resilience | Recovery planning | View Source |
| Financial Security Plan | 💰 Financial | Cost & security analysis | View Source |
| End-of-Life Strategy | 📦 Lifecycle | Technology EOL planning | View Source |
| Unit Test Plan | 🧪 Testing | Unit testing strategy | View Source |
| E2E Test Plan | 🔍 Testing | End-to-end testing | View Source |
| Performance Testing | ⚡ Performance | Performance benchmarks | View Source |
| Security Policy | 🔒 Security | Vulnerability reporting & security policy | View Source |
This data model aligns with Hack23 ISMS policies to ensure secure data handling, classification, and development practices:
| Policy | Relevance | Implementation in Data Model |
|---|---|---|
| Data Classification Policy | High | All data classified as Public (Level 1) per CLASSIFICATION.md. European Parliament data is publicly available open data. No PII or sensitive information processed. |
| Cryptography Policy | Medium | TLS 1.3 for data in transit from European Parliament API. At-rest encryption via GitHub repository storage. Planned SHA-256 hashes for data integrity verification in future generator updates. |
| Secure Development Policy | High | Planned schema validation for EP API responses and planned HTML sanitization (e.g., DOMPurify) in future generator/client updates. Input validation for external data where implemented. Git-based audit trail for all changes. |
ISO 27001:2022 Controls:
- A.5.12: Classification of information — Public data classification documented
- A.8.8: Management of technical vulnerabilities — Planned schema validation to prevent malformed data in future iterations
- A.8.24: Use of cryptography — TLS 1.3 for API communication
- A.8.28: Secure coding — Planned enhancements for input validation and HTML sanitization in the generator/client code
GDPR Compliance:
- Article 5(1)(c): Data minimization — No personal data collected beyond publicly available MEP information
- Article 5(1)(e): Storage limitation — Articles immutable, no unnecessary data retention
- Article 5(1)(f): Integrity and confidentiality — TLS 1.3 encryption, planned SHA-256 checksums
NIST CSF 2.0:
- ID.AM-5: Resources are prioritized based on classification — Public data classification
- PR.DS-2: Data-in-transit is protected — TLS 1.3 encryption
- PR.DS-5: Protections against data leaks — No sensitive data to leak (public data only)
erDiagram
NEWS_ARTICLE ||--o{ METADATA : has
NEWS_ARTICLE ||--o{ SOURCE : references
NEWS_ARTICLE }o--|| ARTICLE_TYPE : "belongs to"
NEWS_ARTICLE }o--|| LANGUAGE : "written in"
PLENARY_SESSION ||--o{ NEWS_ARTICLE : "mentioned in"
COMMITTEE_MEETING ||--o{ NEWS_ARTICLE : "mentioned in"
PARLIAMENTARY_QUESTION ||--o{ NEWS_ARTICLE : "mentioned in"
DOCUMENT ||--o{ NEWS_ARTICLE : "referenced in"
NEWS_ARTICLE {
string slug PK "Unique article identifier"
string category "ArticleCategory enum value"
string language "en, sv, da, no, fi, de, fr, es, nl, ar, he, ja, ko, zh"
string date "Publication date string"
string title "Article title"
string subtitle "Article subtitle"
string content "Full HTML content"
int readTime "Estimated read time (minutes)"
array keywords "SEO keywords (optional)"
array sources "ArticleSource references (optional)"
}
METADATA {
string filename "Article filename"
string date "Publication date"
string slug "Article slug"
string lang "Language code"
string title "Article title"
string type "ArticleCategory value (optional)"
}
SOURCE {
string title "Source title"
string url "Source URL"
}
ARTICLE_TYPE {
string code PK "ArticleCategory enum value"
string perspective "ArticlePerspective (prospective, retrospective, real-time, analytical)"
string label_en "English label"
string label_de "German label"
string label_fr "French label"
}
LANGUAGE {
string code PK "ISO 639-1 code"
string name "Language name"
string direction "ltr or rtl"
}
PLENARY_SESSION {
string session_id PK "EP session identifier"
date session_date "Session date"
string title "Session title"
array agenda_items "Agenda item IDs"
}
COMMITTEE_MEETING {
string meeting_id PK "EP meeting identifier"
string committee_code "Committee code (LIBE, ECON, etc.)"
date meeting_date "Meeting date"
string title "Meeting title"
}
PARLIAMENTARY_QUESTION {
string question_id PK "EP question identifier"
date submission_date "Date submitted"
string question_type "Written, Oral, Priority"
string author_mep "MEP name"
}
DOCUMENT {
string document_id PK "EP document identifier"
string document_type "Report, Resolution, Opinion"
date publication_date "Publication date"
string title "Document title"
}
File Location: news/YYYY-MM-DD-{slug}-{lang}.html
HTML Structure:
<!DOCTYPE html>
<html lang="en" dir="ltr">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Article Title - EU Parliament Monitor</title>
<!-- SEO Meta Tags -->
<meta name="description" content="Article subtitle" />
<meta name="keywords" content="european parliament, keyword1, keyword2" />
<meta name="author" content="EU Parliament Monitor" />
<meta name="publication-date" content="2026-03-01" />
<meta name="article-type" content="prospective" />
<meta name="language" content="en" />
<!-- Open Graph -->
<meta property="og:title" content="Article Title" />
<meta property="og:description" content="Article subtitle" />
<meta property="og:type" content="article" />
<meta
property="og:url"
content="https://euparliamentmonitor.com/news/2026-03-01-week-ahead-en.html"
/>
<!-- Schema.org structured data -->
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "NewsArticle",
"headline": "Article Title",
"description": "Article subtitle",
"datePublished": "2026-03-01T06:15:32Z",
"author": {
"@type": "Organization",
"name": "EU Parliament Monitor"
},
"publisher": {
"@type": "Organization",
"name": "EU Parliament Monitor",
"logo": {
"@type": "ImageObject",
"url": "https://euparliamentmonitor.com/logo.png"
}
}
}
</script>
<link rel="stylesheet" href="../styles.css" />
</head>
<body>
<article class="news-article">
<header>
<span class="article-type">Week Ahead</span>
<h1>Article Title</h1>
<p class="subtitle">Article subtitle</p>
<div class="meta">
<time datetime="2026-03-01">March 1, 2026</time>
<span class="read-time">5 min read</span>
</div>
</header>
<main class="content">
<!-- Generated HTML content -->
</main>
<footer>
<section class="sources">
<h3>Sources</h3>
<ul>
<li>
<a href="https://data.europarl.europa.eu/...">EP Source 1</a>
</li>
<li>
<a href="https://data.europarl.europa.eu/...">EP Source 2</a>
</li>
</ul>
</section>
<section class="languages">
<h3>Available Languages</h3>
<ul>
<li><a href="2026-03-01-week-ahead-de.html">Deutsch</a></li>
<li><a href="2026-03-01-week-ahead-fr.html">Français</a></li>
</ul>
</section>
</footer>
</article>
</body>
</html>
File Location: articles-metadata.json
TypeScript Interface (NewsMetadataDatabase):
{
"lastUpdated": "2026-03-01T06:15:32Z",
"articles": [
{
"filename": "2026-03-01-week-ahead-en.html",
"date": "2026-03-01",
"slug": "week-ahead",
"lang": "en",
"title": "Week Ahead: European Parliament March Session",
"type": "week-ahead"
},
{
"filename": "2026-03-01-week-ahead-de.html",
"date": "2026-03-01",
"slug": "week-ahead",
"lang": "de",
"title": "Woche Voraus: Europäisches Parlament Märzsitzung",
"type": "week-ahead"
}
]
}
File Location: src/types/index.ts (ArticleCategory enum + ARTICLE_TYPES catalogue in src/constants/language-articles.ts)
EU Parliament Monitor ships 8 production article types driven by 9 strategy modules (src/generators/strategies/ — 1 generic article-strategy + 8 type-specific strategies):
| Code | Perspective | Strategy Module | Source gh-aw Workflow Pair (analysis schedule) |
|---|---|---|---|
| breaking | real-time | breaking-news-strategy.ts | news-breaking-analysis.md + news-breaking-article.md (every 6h) |
| week-ahead | prospective | week-ahead-strategy.ts | news-week-ahead-analysis.md + news-week-ahead-article.md (Fri 07:00) |
| week-in-review | retrospective | weekly-review-strategy.ts | news-weekly-review-analysis.md + news-weekly-review-article.md (Sat 09:00) |
| month-ahead | prospective | month-ahead-strategy.ts | news-month-ahead-analysis.md + news-month-ahead-article.md (1st 08:00) |
| month-in-review | retrospective | monthly-review-strategy.ts | news-monthly-review-analysis.md + news-monthly-review-article.md (28th 10:00) |
| committee-reports | analytical | committee-reports-strategy.ts | news-committee-reports-analysis.md + news-committee-reports-article.md (Mon–Fri 04:00) |
| motions | analytical | motions-strategy.ts | news-motions-analysis.md + news-motions-article.md (Mon–Fri 06:00) |
| propositions | analytical | propositions-strategy.ts | news-propositions-analysis.md + news-propositions-article.md (Mon–Fri 05:00) |
Plus the generic article-strategy.ts used by manual news-article-generator.md dispatches, and news-translate.md for EN → 13-language fan-out.
{
"article_types": [
{
"code": "week-ahead",
"perspective": "prospective",
"labels": {
"en": "Week Ahead",
"sv": "Vecka Framåt",
"da": "Ugen Fremover",
"no": "Uken Fremover",
"fi": "Viikko Eteenpäin",
"de": "Woche Voraus",
"fr": "Semaine à Venir",
"es": "Semana Próxima",
"nl": "Week Vooruit",
"ar": "الأسبوع القادم",
"he": "השבוע הקרוב",
"ja": "今週の展望",
"ko": "주간 전망",
"zh": "一周展望"
},
"description": "Preview of upcoming parliamentary events and committee meetings"
},
{
"code": "week-in-review",
"perspective": "retrospective",
"description": "Weekly retrospective — voting, coalition dynamics, policy deliveries"
},
{
"code": "month-ahead",
"perspective": "prospective",
"description": "Monthly strategic outlook — legislative calendar, policy momentum"
},
{
"code": "month-in-review",
"perspective": "retrospective",
"description": "Comprehensive monthly trends and synthesis"
},
{
"code": "committee-reports",
"perspective": "analytical",
"description": "Per-committee deep analysis (rapporteur influence, amendments, trilogue)"
},
{
"code": "motions",
"perspective": "analytical",
"description": "Per-resolution voting breakdown, abstention analysis"
},
{
"code": "propositions",
"perspective": "analytical",
"description": "Legislative pipeline tracking (procedure stages, timeline forecast)"
},
{
"code": "breaking",
"perspective": "real-time",
"description": "Rapid-response coverage of significant developments — TODAY-only items, 6h cadence"
}
]
}
File Location: src/constants/language-core.ts (defines ALL_LANGUAGES and LANGUAGE_PRESETS)
14 supported languages (LTR + RTL), defined in src/constants/language-core.ts::ALL_LANGUAGES. Per-language UI strings live in src/constants/language-ui.ts; per-article-type localized labels live in src/constants/language-articles.ts.
{
"languages": [
{ "code": "en", "name": "English", "native_name": "English", "direction": "ltr", "flag": "🇬🇧" },
{ "code": "sv", "name": "Swedish", "native_name": "Svenska", "direction": "ltr", "flag": "🇸🇪" },
{ "code": "da", "name": "Danish", "native_name": "Dansk", "direction": "ltr", "flag": "🇩🇰" },
{ "code": "no", "name": "Norwegian", "native_name": "Norsk", "direction": "ltr", "flag": "🇳🇴" },
{ "code": "fi", "name": "Finnish", "native_name": "Suomi", "direction": "ltr", "flag": "🇫🇮" },
{ "code": "de", "name": "German", "native_name": "Deutsch", "direction": "ltr", "flag": "🇩🇪" },
{ "code": "fr", "name": "French", "native_name": "Français", "direction": "ltr", "flag": "🇫🇷" },
{ "code": "es", "name": "Spanish", "native_name": "Español", "direction": "ltr", "flag": "🇪🇸" },
{ "code": "nl", "name": "Dutch", "native_name": "Nederlands", "direction": "ltr", "flag": "🇳🇱" },
{ "code": "ar", "name": "Arabic", "native_name": "العربية", "direction": "rtl", "flag": "🇸🇦" },
{ "code": "he", "name": "Hebrew", "native_name": "עברית", "direction": "rtl", "flag": "🇮🇱" },
{ "code": "ja", "name": "Japanese", "native_name": "日本語", "direction": "ltr", "flag": "🇯🇵" },
{ "code": "ko", "name": "Korean", "native_name": "한국어", "direction": "ltr", "flag": "🇰🇷" },
{ "code": "zh", "name": "Chinese", "native_name": "中文", "direction": "ltr", "flag": "🇨🇳" }
],
"language_presets": {
"all": ["en", "sv", "da", "no", "fi", "de", "fr", "es", "nl", "ar", "he", "ja", "ko", "zh"],
"eu-core": ["en", "de", "fr", "es", "nl"],
"nordic": ["en", "sv", "da", "no", "fi"]
}
}
Footer source-of-truth: All 14 language variants of the site footer are rendered by buildSiteFooter() in src/templates/section-builders.ts. This is the single source of truth; no article template or generator should render footer markup inline.
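As a small illustration of how the LTR/RTL split in the language catalogue above might be consumed, the sketch below derives the `dir` attribute for a language code. `RTL_LANGUAGES`, `textDirection`, and `htmlOpenTag` are hypothetical names for this example and are not exports of src/constants/language-core.ts.

```typescript
// Hypothetical helper; the real constants live in src/constants/language-core.ts.
type LanguageCode =
  | "en" | "sv" | "da" | "no" | "fi" | "de" | "fr"
  | "es" | "nl" | "ar" | "he" | "ja" | "ko" | "zh";

// Only Arabic and Hebrew are marked "rtl" in the catalogue above.
const RTL_LANGUAGES: ReadonlySet<LanguageCode> = new Set<LanguageCode>(["ar", "he"]);

function textDirection(lang: LanguageCode): "ltr" | "rtl" {
  return RTL_LANGUAGES.has(lang) ? "rtl" : "ltr";
}

// Build the opening <html> tag for a page rendered in the given language.
function htmlOpenTag(lang: LanguageCode): string {
  return `<html lang="${lang}" dir="${textDirection(lang)}">`;
}
```

Keeping the direction lookup in one place mirrors the single-source-of-truth approach the footer note describes.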
All domain types are strongly typed in src/types/*.ts (strict mode, ESM). 15 type modules organised by bounded context:
| Module | Purpose | Key exports |
|---|---|---|
| common.ts | Shared primitives | LanguageCode, ISODate, ConfidenceLevel (🟢/🟡/🔴), Probability (likely/possible/unlikely), Significance |
| parliament.ts | EP domain entities | MEP, PlenarySession, Committee, Procedure, Vote, VotingRecord, AdoptedText, CommitteeMeeting, ParliamentaryQuestion, Document |
| mcp.ts | MCP transport contracts | FeedBaseOptions (sliding-window: timeframe, optional startDate), FixedWindowFeedOptions (fixed-window: limit, offset), MCPResponse<T>, MCPUnavailableEnvelope |
| imf.ts | IMF SDMX types | IMFDataflow, IMFObservation, IMFSeriesKey, IMFWEOForecast, IMFFMForecast |
| world-bank.ts | World Bank WDI types | WBIndicator, WBObservation, WBCountry |
| generation.ts | Generation pipeline | ArticleContext, PipelineStageInput/Output, StrategyResult, RenderContext |
| analysis.ts | Analysis pipeline | AnalysisContext, AnalysisManifest, AnalysisRunFiles, ClassificationResult, RiskScoring |
| intelligence.ts | Intelligence synthesis | IntelligenceArtifact, CrossArticleSynthesis, ReferenceThresholds |
| political-classification.ts | 7-dimension taxonomy | PoliticalClassification, Actor, Force, ImpactMatrix |
| political-risk.ts | Risk matrix 5×5 | RiskMatrix, LikelihoodLevel, ImpactLevel, CapitalAtRisk, Velocity |
| political-threats.ts | 6-dimension threat landscape | ThreatLandscape, ThreatDimension (coalition/transparency/reversal/institutional/obstruction/erosion), ActorThreat, DisruptionVector |
| quality.ts | AI-First quality gates | QualityReport, SWOTItem, StakeholderPerspective, ProseRatio |
| significance.ts | Publication priority | SignificanceScore, SignificanceFactors |
| stakeholder.ts | 6 stakeholder perspectives | StakeholderPerspective, ImpactDirection (positive/negative/neutral/mixed), ImpactSeverity (high/medium/low) |
| visualization.ts | Chart.js + Mermaid types | ChartDataset, DashboardConfig, MindmapNode |
| index.ts | Re-exports barrel | All of the above |
The EP MCP Server v1.2.13 exposes two distinct feed-option schemas. This split was finalised in the 2026-04-20 release (fixes Hack23/european-parliament-mcp#377 / #378) and is reflected in src/types/mcp.ts:
// Sliding-window feeds (6 tools)
// Applied to: adopted_texts_feed, events_feed, procedures_feed, meps_feed,
// questions_feed, mep_declarations_feed
export interface FeedBaseOptions {
timeframe: "today" | "one-day" | "one-week" | "one-month" | "custom";
startDate?: string; // ISO 8601 — REQUIRED when timeframe === "custom"
workType?: string; // optional filter (adopted_texts, external_documents)
processType?: string; // procedures feed only
activityType?: string; // events feed only
}
// Fixed-window feeds (7 tools)
// Applied to: documents, plenary_documents, committee_documents,
// plenary_session_documents, parliamentary_questions,
// corporate_bodies, controlled_vocabularies
export interface FixedWindowFeedOptions {
limit?: number; // default 50, max 100
offset?: number; // default 0
// NO timeframe/startDate — these feeds ignore those parameters
}
// Uniform unavailable envelope — EP v1.2.13
export interface MCPUnavailableEnvelope<T> {
status: "unavailable";
items: T[]; // ALWAYS empty array — never null or undefined
reason?: string; // optional diagnostic string
retryAfterSeconds?: number;
}
export type MCPResponse<T> =
| { status: "ok"; items: T[] }
| MCPUnavailableEnvelope<T>;
Breaking-change note: Prior to v1.2.13, fixed-window feeds silently accepted (and ignored) timeframe/startDate. As of v1.2.13 those parameters are rejected at the schema level. The FALLBACK_TEMPLATE_PATTERNS detector scans article HTML for fragments like "unavailable" leaking into prose; any detection blocks PR creation via scanHtmlForFallbackLeaks() in src/utils/content-validator.ts.
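To make the contract above concrete, here is a minimal sketch of how a caller might guard feed options and consume an `MCPResponse<T>`. The types are restated (trimmed to the fields used) so the example is self-contained. `assertValidFeedOptions` and `itemsOrEmpty` are hypothetical helpers written for this illustration, not functions from src/mcp/.

```typescript
// Types restated from the listing above, trimmed to the fields used here.
interface FeedBaseOptions {
  timeframe: "today" | "one-day" | "one-week" | "one-month" | "custom";
  startDate?: string;
}
interface MCPUnavailableEnvelope<T> {
  status: "unavailable";
  items: T[];
  reason?: string;
  retryAfterSeconds?: number;
}
type MCPResponse<T> = { status: "ok"; items: T[] } | MCPUnavailableEnvelope<T>;

// Hypothetical guard: a "custom" timeframe needs a startDate.
function assertValidFeedOptions(options: FeedBaseOptions): void {
  if (options.timeframe === "custom" && !options.startDate) {
    throw new Error('startDate is required when timeframe is "custom"');
  }
}

// Hypothetical consumer: narrow on the status discriminant before using items.
function itemsOrEmpty<T>(response: MCPResponse<T>, log: (msg: string) => void): T[] {
  if (response.status === "unavailable") {
    // Record the diagnostic out of band so it cannot leak into article prose.
    log(`feed unavailable: ${response.reason ?? "no reason given"}`);
    return [];
  }
  return response.items;
}
```

Because the unavailable envelope always carries an empty `items` array, callers that only iterate `items` behave the same in both branches; the explicit branch exists to capture the diagnostic.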
Each MCP client is meant to export a canonical tool list that an integration contract test asserts; the EP client does not yet do so:
| Client | Canonical list | Contract test |
|---|---|---|
| src/mcp/ep-mcp-client.ts | (no EP_MCP_TOOLS export yet, a known gap) | test/integration/mcp-integration.test.js |
| src/mcp/imf-mcp-client.ts (class IMFMCPClient) | IMF_MCP_TOOLS | test/integration/mcp/imf-mcp.test.js |
| src/mcp/wb-mcp-client.ts | WORLD_BANK_MCP_TOOLS | test/integration/mcp/worldbank-mcp.test.js |
IMFMCPClient is a native TypeScript fetch client against IMF SDMX 3.0 — NOT an MCP server. Env configuration: IMF_API_BASE_URL (defaults to https://dataservices.imf.org/REST/SDMX_3.0/), IMF_API_TIMEOUT_MS. Provides monthly World Economic Outlook (WEO) and Fiscal Monitor (FM) forecasts up to five years ahead.
src/utils/content-validator.ts exports three content-policy gates used by the validator:
// Original single-source gate (kept for backward compatibility, legacy soft check)
export function articlePolicyHasWorldBank(articleHtml: string, articleType: string): boolean;
// Wave-2 OR-gate (default): accepts World Bank OR IMF evidence
export function articlePolicyHasEconomicContext(articleHtml: string, articleType: string): boolean;
// Wave-3 strict gate (dark-launched): IMF evidence only — World Bank does not satisfy
export function articlePolicyHasIMFEconomicEvidence(articleHtml: string, articleType: string): boolean;
// Flag parser — controls which gate is enforced at Stage-C
export function isWave3IMFStrictEnabled(env?: Record<string, string | undefined>): boolean;
Policy articles (motions, propositions, committee-reports, month-ahead, month-in-review) MUST pass the active gate. Under default settings the OR-gate applies (Wave-2 back-compat). When WAVE3_IMF_STRICT=true is set in the environment, validate-articles.ts switches to articlePolicyHasIMFEconomicEvidence, and World Bank citations alone no longer satisfy the gate. The strict gate is intended for opt-in dark-launch runs and dashboard telemetry; Wave-4 will flip the default. Breaking news and week-ahead have lighter economic-context requirements, configured per strategy in src/generators/strategies/.
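The gate selection described above can be sketched as follows. This is an assumption-laden illustration rather than the removed validator code: `isStrict`, `passesEconomicGate`, and `POLICY_TYPES` are hypothetical names, the two gates are injected as plain functions, only the literal string "true" is assumed to enable strict mode, and non-policy article types are simply let through.

```typescript
// Hypothetical re-statement of the gate selection; the gates are passed in.
type Gate = (articleHtml: string, articleType: string) => boolean;

// Policy article types named in the paragraph above.
const POLICY_TYPES = new Set([
  "motions", "propositions", "committee-reports", "month-ahead", "month-in-review",
]);

// Assumed flag semantics: only the literal string "true" enables strict mode.
function isStrict(env: Record<string, string | undefined>): boolean {
  return env["WAVE3_IMF_STRICT"] === "true";
}

function passesEconomicGate(
  articleHtml: string,
  articleType: string,
  env: Record<string, string | undefined>,
  gates: { orGate: Gate; imfStrictGate: Gate },
): boolean {
  // Simplification: non-policy types are not checked in this sketch.
  if (!POLICY_TYPES.has(articleType)) return true;
  const active = isStrict(env) ? gates.imfStrictGate : gates.orGate;
  return active(articleHtml, articleType);
}
```

A caller would pass `process.env` as `env`; taking it as a parameter keeps the selection logic testable without touching the real environment.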
Authoritative thresholds live in analysis/methodologies/reference-quality-thresholds.json:
| Artifact | Minimum words | Breaking-news threshold |
|---|---|---|
| intelligence/mcp-reliability-audit.md | 200 | 385 |
| intelligence/reference-analysis-quality.md | 140 | 190 |
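A word-count check against these thresholds might look like the sketch below. The numbers are copied from the table; everything else is an assumption for illustration: the `THRESHOLDS` object shape, the whitespace-based `countWords`, and the reading that the breaking-news column applies when the run is a breaking-news run. The actual JSON layout of reference-quality-thresholds.json is not shown in this document.

```typescript
interface ArtifactThreshold {
  minimumWords: number;
  breakingNews: number;
}

// Values copied from the table above; the object shape is an assumption.
const THRESHOLDS: Record<string, ArtifactThreshold> = {
  "intelligence/mcp-reliability-audit.md": { minimumWords: 200, breakingNews: 385 },
  "intelligence/reference-analysis-quality.md": { minimumWords: 140, breakingNews: 190 },
};

// Count whitespace-separated tokens; an empty or blank string has zero words.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

function meetsThreshold(artifact: string, text: string, isBreaking: boolean): boolean {
  const threshold = THRESHOLDS[artifact];
  if (!threshold) return true; // no documented threshold for this artifact
  const required = isBreaking ? threshold.breakingNews : threshold.minimumWords;
  return countWords(text) >= required;
}
```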
Every agentic workflow emits an analysis/daily/YYYY-MM-DD/{article-type}/manifest.json that acts as the generation provenance record:
// src/types/analysis.ts
export interface AnalysisManifest {
articleType: ArticleCategory; // "breaking" | "week-ahead" | ...
runId: string; // gh-aw run identifier
generatedAt: string; // ISO 8601 UTC
sourceCommit: string; // Git SHA of source code
epMcpVersion: "1.2.13"; // Pinned EP MCP Server version
ghAwVersion: "v0.69.0"; // Pinned gh-aw CLI
files: AnalysisRunFiles; // Emitted artifact catalogue
qualityReport: QualityReport; // AI-First 2-pass metrics
dataSourcesUsed: Array<"EP" | "WB" | "IMF">;
languagesProduced: LanguageCode[]; // e.g. ["en"] for content runs, ["sv","de",...] for translate
}
export interface AnalysisRunFiles {
classification?: string[]; // paths relative to manifest
threatAssessment?: string[];
riskScoring?: string[];
data?: {
epFeeds?: string[]; // raw EP MCP payloads
worldBank?: string[];
imf?: string[];
osint?: string[];
};
articleHtml?: string[]; // generated article paths (news/*.html)
}
analysis/daily/2026-04-20/
├── ai-daily-synthesis.md ← Cross-article synthesis (date root)
├── breaking/
│ ├── manifest.json ← Generation provenance
│ ├── classification/
│ ├── threat-assessment/
│ ├── risk-scoring/
│ └── data/
├── committee-reports/
│ ├── manifest.json
│ ├── classification/
│ └── data/
├── motions/{manifest.json, data/}
├── propositions/{manifest.json, data/}
├── week-ahead/{manifest.json, data/} ← Fridays only
├── weekly-review/ ← Saturdays only
├── month-ahead/ ← 1st of month only
└── monthly-review/ ← 28th of month only
🚨 Isolation Rule: Each workflow writes ONLY to its own {article-type-slug}/ subdirectory. Cross-workflow overwrites are prohibited. The ai-*.md synthesis files at the date root aggregate across all workflows and are authored by the news-weekly-review-analysis.md / news-monthly-review-analysis.md workflows.
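One way to enforce the isolation rule in code is a path check before each write. The sketch below is a hypothetical guard, not part of the repository: `normalizeSegments` and `isWriteAllowed` are illustrative names, paths are treated as POSIX-style and relative to the repository root, and the date-root ai-*.md exception is not modelled.

```typescript
// Hypothetical guard for the isolation rule; paths are POSIX-style and relative.
function normalizeSegments(path: string): string[] | null {
  const out: string[] = [];
  for (const segment of path.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      if (out.length === 0) return null; // would escape the repository root
      out.pop();
    } else {
      out.push(segment);
    }
  }
  return out;
}

// True only when target resolves to a path strictly inside dateRoot/slug/.
function isWriteAllowed(dateRoot: string, slug: string, target: string): boolean {
  const root = normalizeSegments(`${dateRoot}/${slug}`);
  const resolved = normalizeSegments(target);
  if (!root || !resolved || resolved.length <= root.length) return false;
  return root.every((segment, i) => resolved[i] === segment);
}
```

Comparing whole path segments, rather than using a string prefix test, avoids treating a sibling such as breaking-extra/ as part of breaking/.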
articles-metadata.json (maintained by src/utils/news-metadata.ts) is the language-indexed metadata layer powering per-language index pages and the sitemap:
interface NewsMetadataDatabase {
lastUpdated: string; // ISO 8601 UTC
articles: ArticleMetadata[];
}
interface ArticleMetadata {
filename: string; // e.g. "2026-04-20-week-ahead-en.html"
date: string; // "2026-04-20"
slug: string; // "week-ahead"
lang: LanguageCode; // 14 possible values
title: string; // localised title
type: ArticleCategory; // 8 production types
articleRunId?: string; // cross-reference to analysis manifest
correction?: { // immutability exception
correctsArticle: string; // filename of article being corrected
correctionReason: string;
};
}
As of 2026-04-20: 1,894 HTML articles live under news/ (~135 article runs × 14 languages).
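The `filename`, `date`, `slug`, and `lang` fields of `ArticleMetadata` are redundant by construction, so they can be cross-checked by parsing the filename. The sketch below assumes the pre-migration pattern YYYY-MM-DD-{slug}-{lang}.html with a two-letter language code; `parseArticleFilename` and `FILENAME_PATTERN` are hypothetical names, not functions from src/utils/news-metadata.ts.

```typescript
// Hypothetical parser for filenames shaped like YYYY-MM-DD-{slug}-{lang}.html.
interface ParsedFilename {
  date: string;
  slug: string;
  lang: string;
}

// The slug may contain hyphens; the final two-letter segment is the language.
const FILENAME_PATTERN = /^(\d{4}-\d{2}-\d{2})-([a-z0-9-]+)-([a-z]{2})\.html$/;

function parseArticleFilename(filename: string): ParsedFilename | null {
  const match = FILENAME_PATTERN.exec(filename);
  if (!match) return null;
  const [, date, slug, lang] = match;
  return { date, slug, lang };
}

// Check that a metadata entry agrees with its own filename.
function isConsistent(entry: { filename: string; date: string; slug: string; lang: string }): boolean {
  const parsed = parseArticleFilename(entry.filename);
  return parsed !== null
    && parsed.date === entry.date
    && parsed.slug === entry.slug
    && parsed.lang === entry.lang;
}
```

Under this pattern a post-migration name with a run suffix would parse with the suffix folded into the slug, so a real implementation would need to treat that segment separately.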
EP API Endpoint:
https://data.europarl.europa.eu/api/v2/sessions/{session_id}
{
"session_id": "PS-2026-03-01",
"session_date": "2026-03-01",
"session_type": "Plenary",
"title": "March 2026 Plenary Session I",
"location": "Strasbourg",
"agenda": [
{
"item_id": "AGI-2026-03-001",
"order": 1,
"title": "Commission statement: European Green Deal progress",
"speaker": "European Commission",
"duration_minutes": 60,
"voting_required": false
},
{
"item_id": "AGI-2026-03-002",
"order": 2,
"title": "Vote: Digital Services Act amendments",
"rapporteur": "MEP Name",
"duration_minutes": 30,
"voting_required": true
}
],
"attendees": 705,
"status": "scheduled"
}
EP API Endpoint:
https://data.europarl.europa.eu/api/v2/committees/{committee_code}/meetings/{meeting_id}
{
"meeting_id": "LIBE-2026-02-25",
"committee_code": "LIBE",
"committee_name": "Committee on Civil Liberties, Justice and Home Affairs",
"meeting_date": "2026-02-25",
"meeting_time": "14:00:00",
"location": "Brussels",
"agenda": [
{
"item_id": "LIBE-AGI-001",
"title": "Artificial Intelligence Act implementation review",
"type": "Discussion",
"documents": ["DOC-2026-001", "DOC-2026-002"]
}
],
"chair": "MEP Name",
"status": "completed"
}
EP API Endpoint:
https://data.europarl.europa.eu/api/v2/questions/{question_id}
{
"question_id": "PQ-2026-000123",
"question_type": "Written",
"priority": false,
"submission_date": "2026-02-20",
"author": {
"mep_id": "MEP-12345",
"name": "MEP Name",
"political_group": "EPP",
"country": "Germany"
},
"addressee": "European Commission",
"subject": "Implementation of GDPR enforcement",
"question_text": "What measures is the Commission taking to...",
"answer": {
"answer_date": "2026-03-05",
"answer_text": "The Commission has undertaken the following actions...",
"answered_by": "Commissioner Name"
},
"languages": ["en", "de"]
}
EP API Endpoint:
https://data.europarl.europa.eu/api/v2/documents/{document_id}
{
"document_id": "DOC-2026-001",
"document_type": "Report",
"title": "Report on the implementation of the Digital Services Act",
"publication_date": "2026-02-15",
"rapporteur": {
"mep_id": "MEP-67890",
"name": "MEP Name",
"political_group": "S&D"
},
"committee": "LIBE",
"procedure": "INI",
"languages": ["en", "de", "fr", "es", "it"],
"documents": [
{
"language": "en",
"format": "PDF",
"url": "https://data.europarl.europa.eu/documents/DOC-2026-001-EN.pdf"
}
],
"status": "published"
}
erDiagram
MEP ||--o{ COMMITTEE_MEMBERSHIP : "serves on"
MEP ||--o{ VOTING_RECORD : "casts"
MEP ||--o{ PARLIAMENTARY_QUESTION : "authors"
MEP }o--|| POLITICAL_GROUP : "belongs to"
MEP }o--|| COUNTRY : "represents"
MEP }o--|| NATIONAL_PARTY : "member of"
POLITICAL_GROUP ||--o{ MEP : "has members"
COUNTRY ||--o{ MEP : "has representatives"
COMMITTEE ||--o{ COMMITTEE_MEMBERSHIP : "has members"
MEP {
string id PK "MEP-xxxxx"
string name "Full name"
string email "Contact email"
string photoUrl "Photo URL"
date termStart "Term start date"
date termEnd "Term end date"
boolean active "Active status"
}
POLITICAL_GROUP {
string code PK "PPE, S&D, Renew, Greens/EFA, ECR, etc."
string name "Full group name"
string abbreviation "Short name"
int memberCount "Number of MEPs"
string politicalOrientation "Left, Center, Right"
}
COUNTRY {
string code PK "ISO 3166-1 alpha-2"
string name "Country name"
int seatCount "EP seats allocated"
string region "EU region"
}
NATIONAL_PARTY {
string id PK "Party identifier"
string name "Party name"
string country FK "Country code"
string europeanAffiliation FK "Political group code"
}
COMMITTEE_MEMBERSHIP {
string mepId FK
string committeeCode FK
string role "Member, Chair, Vice-Chair"
date joinDate
date leaveDate
}
VOTING_RECORD {
string id PK
string mepId FK
string documentReference
string vote "FOR, AGAINST, ABSTAIN"
date voteDate
string sessionId
}
PARLIAMENTARY_QUESTION {
string questionId PK
string authorMepId FK
string questionType "Written, Oral, Priority"
date submissionDate
string subject
string addressee
}
erDiagram
MCP_SERVER ||--o{ MCP_TOOL : "provides"
MCP_TOOL ||--o{ API_ENDPOINT : "calls"
API_ENDPOINT }o--|| EP_API : "endpoint of"
MCP_TOOL ||--o{ TOOL_RESPONSE : "returns"
TOOL_RESPONSE ||--o{ CACHED_RESPONSE : "cached as"
NEWS_GENERATOR ||--o{ MCP_CLIENT : "uses"
MCP_CLIENT ||--o{ MCP_TOOL : "invokes"
MCP_CLIENT ||--o{ RESPONSE_VALIDATOR : "validates with"
MCP_SERVER {
string version "1.2.13"
string connectionType "stdio, SSE"
string status "running, stopped"
datetime lastHealthCheck
}
MCP_TOOL {
string name PK "get_meps, get_plenary_sessions"
string description "Tool description"
json inputSchema "JSON Schema for parameters"
json outputSchema "JSON Schema for response"
string endpoint FK "EP API endpoint"
}
API_ENDPOINT {
string url PK "https://data.europarl.europa.eu/..."
string method "GET, POST"
json parameters "Query parameters"
int rateLimitPerMinute
int cacheTTL "Seconds"
}
EP_API {
string baseUrl "https://data.europarl.europa.eu"
string version "v2"
string authentication "None (public API; field reserved for future use such as API key, OAuth)"
boolean requiresAuth "false for current EP MCP; reserved for future use"
}
TOOL_RESPONSE {
string id PK
string toolName FK
json data "Response data"
datetime timestamp
string dataHash "SHA-256 hash"
int statusCode
}
CACHED_RESPONSE {
string cacheKey PK
string toolName FK
json cachedData
datetime cachedAt
datetime expiresAt
int hitCount
}
MCP_CLIENT {
string clientId PK
string version
string connectionType
int timeoutSeconds
int retryAttempts
}
RESPONSE_VALIDATOR {
string toolName FK
json schema "JSON Schema"
array requiredFields
boolean strictMode
}
NEWS_GENERATOR {
string version
string mode "daily, manual"
array supportedLanguages
}
erDiagram
ARTICLE ||--o{ TRANSLATION : "has"
TRANSLATION }o--|| LANGUAGE : "written in"
ARTICLE ||--o{ ARTICLE_METADATA : "has"
TRANSLATION ||--o{ SEO_METADATA : "has"
LANGUAGE ||--o{ TRANSLATION : "used for"
LANGUAGE ||--o{ INDEX_PAGE : "has"
ARTICLE {
string slug PK "2026-01-01-week-ahead"
string category "ArticleCategory enum value"
datetime generatedAt
string commitSha "Git commit hash"
array sourceIds "EP data source IDs"
}
TRANSLATION {
string id PK
string articleSlug FK
string languageCode FK
string title "Translated title"
string subtitle "Translated subtitle"
string contentHtml "Full HTML content"
int wordCount
int readTimeMinutes
array keywords
}
LANGUAGE {
string code PK "ISO 639-1"
string name "Native language name"
string flag "Flag emoji"
string direction "ltr or rtl"
string preset "all, eu-core, nordic"
}
ARTICLE_METADATA {
string articleSlug FK
string generatorVersion
string workflowRunId
string mcpServerVersion
json sources "Array of source data"
json statistics "Word counts, read times"
}
SEO_METADATA {
string translationId FK
string metaDescription
array metaKeywords
string ogTitle "Open Graph title"
string ogDescription
string ogImage
string canonicalUrl
array hreflangLinks
}
INDEX_PAGE {
string languageCode FK
string filename "index-{lang}.html"
array articleList "Ordered article references"
datetime lastUpdated
int articleCount
}
erDiagram
SITEMAP ||--o{ SITEMAP_ENTRY : "contains"
SITEMAP_ENTRY }o--|| TRANSLATION : "references"
SITEMAP_ENTRY ||--o{ HREFLANG_LINK : "has"
INDEX_PAGE ||--o{ INDEX_ENTRY : "lists"
INDEX_ENTRY }o--|| TRANSLATION : "links to"
SITEMAP {
string filename "sitemap.xml"
datetime lastModified
int urlCount
string xmlns "XML namespace"
}
SITEMAP_ENTRY {
string loc PK "Full URL"
datetime lastmod "Last modified"
string changefreq "always, daily, weekly"
float priority "0.0 to 1.0"
string translationId FK
}
HREFLANG_LINK {
string sourceUrl FK
string targetUrl "Alternate language URL"
string hreflang "Language code or x-default"
string rel "alternate"
}
INDEX_PAGE {
string languageCode PK
string filename "index-{lang}.html"
string title "Page title"
string metaDescription
datetime lastUpdated
}
INDEX_ENTRY {
string indexLanguage FK
string articleUrl "Relative URL"
string articleTitle
string articleSubtitle
string articleType
date publicationDate
int displayOrder
}
TRANSLATION {
string id PK
string articleSlug
string languageCode
string title
string filename
}
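The HREFLANG_LINK entity can be illustrated with a small sketch that derives one alternate link per language plus an `x-default` entry. It assumes article URLs follow `<baseUrl>/news/<slug>-<lang>.html`; the function names and the `baseUrl` parameter are hypothetical and not taken from the generator.

```typescript
// Mirrors the HREFLANG_LINK entity above.
interface HreflangLink {
  sourceUrl: string;
  targetUrl: string;
  hreflang: string; // language code or "x-default"
  rel: 'alternate';
}

// Assumption: article URLs follow <baseUrl>/news/<slug>-<lang>.html.
function articleUrl(baseUrl: string, slug: string, lang: string): string {
  return `${baseUrl.replace(/\/+$/, '')}/news/${slug}-${lang}.html`;
}

function buildHreflangLinks(
  baseUrl: string,
  slug: string,
  sourceLang: string,
  languages: string[],
  defaultLang = 'en',
): HreflangLink[] {
  const sourceUrl = articleUrl(baseUrl, slug, sourceLang);
  const links = languages.map((lang): HreflangLink => ({
    sourceUrl,
    targetUrl: articleUrl(baseUrl, slug, lang),
    hreflang: lang,
    rel: 'alternate',
  }));
  // x-default points crawlers at the fallback variant when no language matches.
  links.push({
    sourceUrl,
    targetUrl: articleUrl(baseUrl, slug, defaultLang),
    hreflang: 'x-default',
    rel: 'alternate',
  });
  return links;
}
```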
flowchart TB
subgraph "European Parliament"
EP_API["European Parliament<br/>Open Data API"]
EP_PLENARY["Plenary Sessions<br/>API Endpoint"]
EP_COMMITTEE["Committee Meetings<br/>API Endpoint"]
EP_MEP["MEPs Data<br/>API Endpoint"]
EP_DOCUMENTS["Documents<br/>API Endpoint"]
end
subgraph "MCP Server Layer"
MCP_SERVER["European Parliament<br/>MCP Server"]
TOOL_GET_MEPS["Tool: get_meps"]
TOOL_PLENARY["Tool: get_plenary_sessions"]
TOOL_COMMITTEE["Tool: get_committee_info"]
TOOL_DOCUMENTS["Tool: search_documents"]
MCP_CACHE["LRU Response Cache<br/>TTL: 24h"]
end
subgraph "Generator Layer"
GENERATOR["News Generator<br/>TypeScript Script"]
MCP_CLIENT["MCP Client<br/>stdio connection"]
VALIDATOR["Schema Validator<br/>(Planned)"]
SANITIZER["HTML Sanitizer<br/>(Planned: DOMPurify)"]
end
subgraph "Template Layer"
TEMPLATE_ENGINE["Template Module<br/>src/templates/article-template.ts"]
TEMPLATE_WEEK["Article Template<br/>(TS-based)"]
TEMPLATE_COMMITTEE["Committee Reports Template<br/>(TS-based)"]
LANGUAGE_PROCESSOR["Multi-Language<br/>Processor"]
end
subgraph "Output Layer"
ARTICLE_HTML["Article HTML<br/>news/*.html"]
METADATA_JSON["Metadata JSON<br/>articles-metadata.json"]
INDEX_HTML["Index Pages<br/>index-*.html"]
SITEMAP_XML["sitemap.xml"]
end
subgraph "Deployment"
GIT_COMMIT["Git Commit<br/>& Push"]
GHA_DEPLOY["GitHub Actions<br/>Deploy Workflow"]
GH_PAGES["GitHub Pages<br/>Static Hosting"]
end
EP_API --> EP_PLENARY
EP_API --> EP_COMMITTEE
EP_API --> EP_MEP
EP_API --> EP_DOCUMENTS
EP_PLENARY -->|"HTTPS GET<br/>TLS 1.3"| MCP_SERVER
EP_COMMITTEE -->|"HTTPS GET<br/>TLS 1.3"| MCP_SERVER
EP_MEP -->|"HTTPS GET<br/>TLS 1.3"| MCP_SERVER
EP_DOCUMENTS -->|"HTTPS GET<br/>TLS 1.3"| MCP_SERVER
MCP_SERVER --> TOOL_GET_MEPS
MCP_SERVER --> TOOL_PLENARY
MCP_SERVER --> TOOL_COMMITTEE
MCP_SERVER --> TOOL_DOCUMENTS
TOOL_GET_MEPS --> MCP_CACHE
TOOL_PLENARY --> MCP_CACHE
TOOL_COMMITTEE --> MCP_CACHE
TOOL_DOCUMENTS --> MCP_CACHE
MCP_CACHE -->|"stdio protocol"| MCP_CLIENT
MCP_CLIENT --> GENERATOR
GENERATOR --> VALIDATOR
VALIDATOR -->|"Valid JSON"| SANITIZER
VALIDATOR -->|"Invalid"| GENERATOR
SANITIZER --> TEMPLATE_ENGINE
TEMPLATE_ENGINE --> TEMPLATE_WEEK
TEMPLATE_ENGINE --> TEMPLATE_COMMITTEE
TEMPLATE_WEEK --> LANGUAGE_PROCESSOR
TEMPLATE_COMMITTEE --> LANGUAGE_PROCESSOR
LANGUAGE_PROCESSOR -->|"14 languages"| ARTICLE_HTML
LANGUAGE_PROCESSOR --> METADATA_JSON
LANGUAGE_PROCESSOR --> INDEX_HTML
LANGUAGE_PROCESSOR --> SITEMAP_XML
ARTICLE_HTML --> GIT_COMMIT
METADATA_JSON --> GIT_COMMIT
INDEX_HTML --> GIT_COMMIT
SITEMAP_XML --> GIT_COMMIT
GIT_COMMIT -->|"Push triggers<br/>deploy workflow"| GHA_DEPLOY
GHA_DEPLOY -->|"Deploy to<br/>GitHub Pages"| GH_PAGES
style EP_API fill:#fff4e1
style MCP_SERVER fill:#e8f5e9
style GENERATOR fill:#e1f5ff
style TEMPLATE_ENGINE fill:#f3e5f5
style ARTICLE_HTML fill:#e3f2fd
style GH_PAGES fill:#e0f2f1
euparliamentmonitor/
├── news/ # Generated articles
│ ├── 2026-01-01-week-ahead-en.html
│ ├── 2026-01-01-week-ahead-de.html
│ ├── 2026-01-01-week-ahead-fr.html
│ └── ...
│
├── articles-metadata.json # News metadata database
│
├── index-{lang}.html # Language-specific indexes
│ ├── index.html
│ ├── index-de.html
│ └── index-fr.html
│
├── sitemap.xml # SEO sitemap
├── robots.txt # Crawler rules
├── styles.css # Global styles
└── favicon.ico # Site icon
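The naming convention in the tree above can be read back into its parts. The following is a minimal sketch, assuming filenames always start with an ISO date and end with a two-letter language code; `parseArticleFilename` is a hypothetical helper, not an existing function in the repository.

```typescript
interface ParsedArticleFile {
  slug: string; // e.g. "2026-01-01-week-ahead"
  date: string; // e.g. "2026-01-01"
  languageCode: string; // e.g. "en"
}

// Returns null for files that do not follow <date>-<name>-<lang>.html.
function parseArticleFilename(filename: string): ParsedArticleFile | null {
  const match = /^(\d{4}-\d{2}-\d{2})-(.+)-([a-z]{2})\.html$/.exec(filename);
  if (!match) return null;
  const date = match[1] ?? '';
  const name = match[2] ?? '';
  const languageCode = match[3] ?? '';
  return { slug: `${date}-${name}`, date, languageCode };
}
```

Because the name part is matched greedily, the language code is always taken from the last hyphen-separated segment, so multi-word names such as `week-ahead` stay intact.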
flowchart LR
subgraph "External Sources"
EP_API[European Parliament<br/>Open Data API]
end
subgraph "MCP Layer"
MCP[MCP Server]
CACHE[Response Cache]
end
subgraph "Generation Layer"
CLIENT[MCP Client]
VALIDATE[Data Validator]
SANITIZE[HTML Sanitizer]
end
subgraph "Template Layer"
TEMPLATE[Article Template]
META[Metadata Generator]
HTML[HTML Builder]
end
subgraph "Storage Layer"
FS[File System]
ARTICLE[Article HTML]
METADATA[Metadata JSON]
end
EP_API -->|JSON Response| MCP
MCP --> CACHE
CACHE --> CLIENT
CLIENT --> VALIDATE
VALIDATE --> SANITIZE
SANITIZE --> TEMPLATE
TEMPLATE --> META
TEMPLATE --> HTML
HTML --> ARTICLE
META --> METADATA
ARTICLE --> FS
METADATA --> FS
style EP_API fill:#fff4e1
style MCP fill:#e8f5e9
style CLIENT fill:#e8f5e9
style VALIDATE fill:#e1f5ff
style SANITIZE fill:#e1f5ff
style TEMPLATE fill:#e8f5e9
style ARTICLE fill:#f0f0f0
style METADATA fill:#f0f0f0
flowchart LR
subgraph "Input"
ARTICLES[Generated Articles<br/>news/*.html]
end
subgraph "Scanner"
SCAN[File Scanner]
PARSE[Metadata Parser]
end
subgraph "Processor"
GROUP[Group by Language]
SORT[Sort by Date]
FILTER[Filter by Type]
end
subgraph "Generator"
TEMPLATE[Index Template]
HTML[HTML Builder]
end
subgraph "Output"
INDEX["index-LANG.html"]
end
ARTICLES --> SCAN
SCAN --> PARSE
PARSE --> GROUP
GROUP --> SORT
SORT --> FILTER
FILTER --> TEMPLATE
TEMPLATE --> HTML
HTML --> INDEX
style ARTICLES fill:#f0f0f0
style SCAN fill:#e8f5e9
style PARSE fill:#e8f5e9
style GROUP fill:#e1f5ff
style SORT fill:#e1f5ff
style FILTER fill:#e1f5ff
style TEMPLATE fill:#e8f5e9
style INDEX fill:#f0f0f0
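The group, sort and filter stages of the index flow above could look roughly like this. The `ArticleEntry` shape and the newest-first ordering are assumptions made for illustration; the actual index generator may differ.

```typescript
// Hypothetical record produced by the metadata parser for one article file.
interface ArticleEntry {
  filename: string;
  languageCode: string;
  articleType: string;
  publicationDate: string; // ISO date, YYYY-MM-DD
}

// Group by language, sort newest first, then optionally filter by type.
function buildIndexEntries(
  articles: ArticleEntry[],
  articleType?: string,
): Map<string, ArticleEntry[]> {
  const grouped = new Map<string, ArticleEntry[]>();
  for (const article of articles) {
    const group = grouped.get(article.languageCode) ?? [];
    group.push(article);
    grouped.set(article.languageCode, group);
  }
  const result = new Map<string, ArticleEntry[]>();
  grouped.forEach((group, lang) => {
    // ISO dates sort correctly as plain strings.
    const sorted = group.slice().sort((a, b) => {
      if (a.publicationDate < b.publicationDate) return 1;
      if (a.publicationDate > b.publicationDate) return -1;
      return 0;
    });
    result.set(
      lang,
      articleType ? sorted.filter((a) => a.articleType === articleType) : sorted,
    );
  });
  return result;
}
```

Each key of the returned map corresponds to one `index-LANG.html` page.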
- Cardinality: One-to-One
- Foreign Key: Article slug
- Purpose: Track generation provenance and source data
- Cardinality: One-to-Many
- Foreign Key: Article slug
- Purpose: Link articles to European Parliament data sources
- Cardinality: Many-to-One
- Foreign Key: Language code
- Purpose: Multi-language support with shared metadata
| Data Type | Classification | Storage | Encryption |
|---|---|---|---|
| News Articles | Public | Git repository | At-rest (GitHub) |
| Metadata | Public | Git repository | At-rest (GitHub) |
| EP API Responses | Public | Ephemeral (runtime) | In-transit (TLS 1.3) |
| Generation Logs | Internal | GitHub Actions | At-rest (GitHub) |
- Immutability: Articles never modified after generation
- Checksums: SHA-256 hashes for verification (future)
- Audit Trail: Git commit history provides complete provenance
- Validation: Schema validation on all EP API responses (planned)
All data in EU Parliament Monitor is classified according to CLASSIFICATION.md and the Hack23 ISMS Classification Policy:
| Data Type | Classification | Confidentiality | Integrity | Availability | Rationale |
|---|---|---|---|---|---|
| News Articles | Public (Level 1) | Public | Medium | Medium | Derived from public EP data, accuracy critical for democratic transparency |
| Generation Metadata | Public (Level 1) | Public | Medium | Low | Technical provenance data, publicly accessible |
| EP API Responses | Public (Level 1) | Public | Medium | Medium | Public European Parliament data, temporary runtime storage |
| MCP Tool Responses | Public (Level 1) | Public | Medium | Medium | Cached EP data, integrity critical |
| GitHub Actions Logs | Public (Level 1) | Public | Low | Low | Actions logs are visible to anyone with read access to this public repo and contain technical build details but no secrets |
PII Status: No User/Customer PII Collected
EU Parliament Monitor processes publicly available European Parliament data only. MEP names, affiliations, and official contact details are publicly available personal data about public officials in their official capacity:
- MEP Information: Names, political affiliations, committee memberships (publicly available official data)
- Contact Information: Official MEP email addresses (publicly available official contact data)
- No User Data: No user accounts, no tracking, no analytics
- No Cookies: Static HTML site, no client-side tracking
- No Private Communications: No private messages, no personal correspondence
Note: Publicly available personal data about public officials (MEP names, affiliations, official emails) processed in their official capacity is handled under GDPR Article 6 lawful basis (e.g., Art. 6(1)(e) public task and/or Art. 6(1)(f) legitimate interests). No special category data under Article 9 is processed. No user or private personal data is collected.
GDPR Article 5 Alignment:
| GDPR Principle | Implementation | Status |
|---|---|---|
| Art. 5(1)(a) - Lawfulness | Processing of publicly available personal data of MEPs from official EP sources under GDPR Art. 6 lawful basis (public task/legitimate interests); no user/customer personal data processed | ✅ Compliant |
| Art. 5(1)(b) - Purpose Limitation | Data used only for news generation about parliamentary activities | ✅ Compliant |
| Art. 5(1)(c) - Data Minimization | Only necessary public EP data collected, no excessive data | ✅ Compliant |
| Art. 5(1)(d) - Accuracy | EP data used as-is from official sources; planned schema validation and HTML sanitization to ensure accurate representation | ✅ Compliant |
| Art. 5(1)(e) - Storage Limitation | Articles immutable, no unnecessary retention, git history for audit | ✅ Compliant |
| Art. 5(1)(f) - Integrity & Confidentiality | TLS 1.3 encryption in transit, GitHub encryption at rest; SHA-256 source-data hashes planned | ✅ Compliant |
Control Statement: Information shall be classified in terms of legal requirements, value, criticality, and sensitivity to unauthorized disclosure or modification.
Implementation:
- Classification Labels:
  - All data marked as Public (Level 1) in metadata
  - No confidential, restricted, or secret information processed
  - Classification documented in CLASSIFICATION.md
- Handling Requirements:
  - Public data: No access controls required
  - Repository logs: GitHub Actions and repository logs accessible to all users with repository read access (public repository, logs contain only data classified as Public)
  - No encryption requirements beyond standard TLS 1.3
- Review Process:
  - Quarterly classification review (per document control)
  - Annual ISMS audit includes data classification verification
  - Classification changes trigger security impact assessment
Evidence:
- CLASSIFICATION.md - Full classification framework
- SECURITY_ARCHITECTURE.md - Security controls
- GitHub Repository Settings - Access controls
| Control | Implementation | Purpose |
|---|---|---|
| TLS 1.3 Encryption | All EP API calls use HTTPS | Protect data in transit |
| At-Rest Encryption | GitHub repository encryption | Protect stored data |
| Schema Validation | Planned: JSON Schema validation for EP API responses | Prevent malformed data |
| HTML Sanitization | Planned: DOMPurify-based sanitization for rendered HTML | Prevent XSS attacks |
| Input Validation | Planned: Whitelist-based validation for all configurable inputs | Prevent injection attacks |
| SHA-256 Hashing | Planned: SHA-256 checksums for source data integrity | Detect data tampering |
| Git Audit Trail | Complete commit history | Track all changes |
| Immutable Articles | Articles never modified post-generation | Preserve integrity |
The EU Parliament Monitor data model has evolved through multiple phases to support enhanced functionality and multi-language content:
timeline
title Data Model Evolution Timeline
section v1.0 - Foundation (2026-Q1)
Basic Article Schema : Simple HTML generation
: Single language (English)
: Manual EP data entry
File Storage : Git repository
: Static HTML files
No Metadata : No generation tracking
section v1.1 - Multi-Language (2026-Q1)
14 Languages : en, sv, da, no, fi, de, fr, es, nl, ar, he, ja, ko, zh
: Language-specific index pages
: Hreflang SEO optimization
MCP Integration : European Parliament MCP Server
: Automated data fetching
: Tool-based API access
Generation Metadata : Provenance tracking
: Source data hashing
: Workflow run IDs
section v1.2 - Current (2026-Q1)
Enhanced ER Diagrams : MEP entity model
: MCP integration model
: Multi-language content model
: Sitemap & SEO model
ISMS Alignment : Data classification documented
: GDPR compliance verified
: ISO 27001 controls mapped
Data Flow : Comprehensive data flow diagrams
: European Parliament to GitHub Pages
section v2.0 - Future (2026-Q3)
Real-Time Updates : WebSocket data streams
: Live plenary session updates
: Instant breaking news
Enhanced Analytics : Article performance metrics
: Reader engagement tracking
: SEO optimization insights
AI-Driven Content : LLM-based content generation
: Automated fact-checking
: Sentiment analysis
Database Backend : PostgreSQL for metadata
: Elasticsearch for search
: Redis for caching
| Version | Release Date | Key Changes | Diagrams Added |
|---|---|---|---|
| v1.0 | 2026-02-01 | Initial release, basic article generation | 1 (Main ER diagram) |
| v1.1 | 2026-03-19 | Multi-language support, MCP integration, ISMS alignment | 4 (MEP, MCP, Multi-language, Sitemap models) + 1 (EP data flow) |
| v1.2 | 2026-04-20 | TypeScript type system coverage, FeedBaseOptions vs FixedWindowFeedOptions split (EP MCP v1.2.13), IMF/WB dual economic context, AnalysisManifest schema, 8 article types correctly enumerated, 14 languages from language-core.ts::ALL_LANGUAGES, buildSiteFooter() single source of truth, reference quality thresholds | Same set (content updates) |
| v2.0 | 2026-Q4 (Planned) | Real-time updates, expanded intelligence types | TBD |
No breaking changes to date. All schema changes backward-compatible.
Planned enhancement: responses from the European Parliament API will be validated against JSON Schemas before processing:
MEP Data Schema:
{
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"required": ["id", "name", "country", "politicalGroup"],
"properties": {
"id": { "type": "string", "pattern": "^MEP-[0-9]+$" },
"name": { "type": "string", "minLength": 1, "maxLength": 200 },
"country": { "type": "string", "pattern": "^[A-Z]{2}$" },
"party": { "type": "string", "maxLength": 200 },
"politicalGroup": { "type": "string", "enum": ["PPE", "S&D", "Renew", "Greens/EFA", "ID", "ECR", "The Left", "NI"] },
"committees": { "type": "array", "items": { "type": "string" } },
"email": { "type": "string", "format": "email" },
"photoUrl": { "type": "string", "format": "uri", "pattern": "^https://" }
}
}
Plenary Session Schema:
{
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"required": ["session_id", "session_date", "title"],
"properties": {
"session_id": { "type": "string", "pattern": "^PS-[0-9]{4}-[0-9]{2}-[0-9]{2}$" },
"session_date": { "type": "string", "format": "date" },
"title": { "type": "string", "minLength": 5, "maxLength": 500 },
"location": { "type": "string", "enum": ["Strasbourg", "Brussels"] },
"agenda": { "type": "array", "items": { "type": "object" } },
"status": { "type": "string", "enum": ["scheduled", "ongoing", "completed", "cancelled"] }
}
}
Validation Process (Planned Enhancements):
- Pre-Processing Validation: Planned JSON Schema–based validation before any data transformation
- Type Checking: Planned strict type enforcement (no implicit coercion) for EP MCP responses
- Range Validation: Planned validation for string length, number ranges, and array size limits
- Format Validation: Planned checks for email, URL, date, and ISO codes
- Enum Validation: Planned fixed vocabularies (political groups, committee codes)
- Error Handling (Current Behavior): Invalid or missing EP MCP data causes generation to fail or fall back to minimal/placeholder content; JSON Schema validation and cache/manual fallback for EP API responses are planned enhancements
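To make the planned checks concrete, here is a hand-rolled sketch that mirrors the required-field, pattern, length and enum rules of the MEP schema above. It is an illustration only: it does not use a JSON Schema library, and `validateMep` is a hypothetical name rather than an existing function in the generator.

```typescript
// Fixed vocabulary copied from the politicalGroup enum in the MEP schema above.
const POLITICAL_GROUPS = ['PPE', 'S&D', 'Renew', 'Greens/EFA', 'ID', 'ECR', 'The Left', 'NI'];

// Returns a list of human-readable problems; an empty list means the record passed.
function validateMep(input: unknown): string[] {
  if (typeof input !== 'object' || input === null || Array.isArray(input)) {
    return ['expected an object'];
  }
  const mep = input as Record<string, unknown>;
  const errors: string[] = [];

  // Required fields with strict type checking (no implicit coercion).
  for (const field of ['id', 'name', 'country', 'politicalGroup']) {
    if (typeof mep[field] !== 'string') errors.push(`${field}: required string`);
  }

  const id = mep['id'];
  if (typeof id === 'string' && !/^MEP-[0-9]+$/.test(id)) {
    errors.push('id: must match ^MEP-[0-9]+$');
  }
  const name = mep['name'];
  if (typeof name === 'string' && (name.length < 1 || name.length > 200)) {
    errors.push('name: length must be between 1 and 200');
  }
  const country = mep['country'];
  if (typeof country === 'string' && !/^[A-Z]{2}$/.test(country)) {
    errors.push('country: must be a two-letter uppercase code');
  }
  const group = mep['politicalGroup'];
  if (typeof group === 'string' && POLITICAL_GROUPS.indexOf(group) === -1) {
    errors.push('politicalGroup: not in the allowed vocabulary');
  }
  return errors;
}
```

Collecting every problem instead of stopping at the first one makes failed generation runs easier to diagnose.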
Generated Article Validation (Planned Enhancements):
| Field | Validation Rule | Error Handling |
|---|---|---|
| slug | Alphanumeric + hyphens, max 100 chars | Planned: generation fails and alert is sent |
| title | Min 10 chars, max 200 chars | Planned: generation retries with adjusted prompt |
| subtitle | Min 20 chars, max 500 chars | Planned: optional, can be empty |
| content_html | Valid HTML5, no <script> tags | Planned: HTML sanitization with DOMPurify |
| language | ISO 639-1 code, must be in supported list | Planned: generation fails for that language |
| keywords | Array of strings, max 10 keywords | Planned: truncated to 10 if exceeded |
| read_time | Integer >= 1, <= 60 minutes | Planned: calculated from word count |
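A few of the planned rules for generated articles can be sketched as plain functions. The reading speed of 200 words per minute is an assumption chosen for illustration, and the function names are hypothetical; neither comes from the current generator.

```typescript
// Assumption: average reading speed used to derive read_time.
const WORDS_PER_MINUTE = 200;

interface ArticleFields {
  slug: string;
  title: string;
  subtitle?: string;
}

// Returns the names of fields that break the length and character rules.
function validateArticleFields(article: ArticleFields): string[] {
  const errors: string[] = [];
  if (!/^[A-Za-z0-9-]{1,100}$/.test(article.slug)) errors.push('slug');
  if (article.title.length < 10 || article.title.length > 200) errors.push('title');
  // subtitle is optional and may be empty, but if present it must be 20 to 500 chars.
  const subtitle = article.subtitle ?? '';
  if (subtitle !== '' && (subtitle.length < 20 || subtitle.length > 500)) {
    errors.push('subtitle');
  }
  return errors;
}

// Keywords are truncated to the first 10 rather than rejected.
function normalizeKeywords(keywords: string[]): string[] {
  return keywords.slice(0, 10);
}

// read_time is derived from the word count and clamped to 1..60 minutes.
function readTimeMinutes(wordCount: number): number {
  const minutes = Math.ceil(wordCount / WORDS_PER_MINUTE);
  return Math.min(60, Math.max(1, minutes));
}
```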
Note: HTML sanitization via DOMPurify is a planned security enhancement. The current generator (src/templates/article-template.ts) produces HTML from EP API data. The configuration below documents the intended future implementation.
Planned DOMPurify Configuration:
const clean = DOMPurify.sanitize(dirtyHtml, {
ALLOWED_TAGS: ['p', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'ul', 'ol', 'li', 'a', 'strong', 'em', 'blockquote', 'code', 'pre'],
ALLOWED_ATTR: ['href', 'title', 'class', 'id'],
ALLOWED_URI_REGEXP: /^https?:\/\/(data\.europarl\.europa\.eu|europarl\.europa\.eu|www\.europarl\.europa\.eu)\/.*/,
ALLOW_DATA_ATTR: false,
KEEP_CONTENT: true,
RETURN_DOM: false,
RETURN_DOM_FRAGMENT: false
});
Sanitization Rules:
- Allowed Tags: Only semantic HTML5 tags (no styling, no scripting)
- Allowed Attributes: Limited to href, title, class, id
- URL Whitelist: Only European Parliament domains allowed in links
- No JavaScript: All <script>, <style>, onclick, etc. removed
- No Iframes: No embedded content
- No Forms: No user input elements
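The URL whitelist rule can be checked in isolation, without DOMPurify, by reusing the same pattern as the planned `ALLOWED_URI_REGEXP`. The helper names below are hypothetical and serve only to show which links the pattern accepts.

```typescript
// Same pattern as the planned ALLOWED_URI_REGEXP above.
const EP_LINK_PATTERN =
  /^https?:\/\/(data\.europarl\.europa\.eu|europarl\.europa\.eu|www\.europarl\.europa\.eu)\/.*/;

// True only for links that point at a European Parliament domain.
function isAllowedLink(href: string): boolean {
  return EP_LINK_PATTERN.test(href.trim());
}

// Keeps the whitelisted links and drops everything else.
function filterAllowedLinks(hrefs: string[]): string[] {
  return hrefs.filter((href) => isAllowedLink(href));
}
```

Because the pattern requires a `/` directly after the host name, look-alike hosts such as `europarl.europa.eu.example.com` do not match.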
Policy: Once generated, articles are never modified.
- Implementation: Read-only file permissions (conceptual), no update functionality in generator
- Exceptions: Security vulnerabilities, factual errors (manual correction with audit trail)
- Enforcement: Git commit history provides complete audit trail
Note: Source data hashing is a planned integrity enhancement. The metadata structure below shows the intended future implementation; SHA-256 hashing of EP/MCP responses is not yet implemented in the current generator code.
Planned Source Data Hashing Pattern:
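import * as crypto from 'node:crypto';

// epApiResponse is the parsed JSON payload returned by the EP API / MCP tool.
const sourceHash = crypto.createHash('sha256')
  .update(JSON.stringify(epApiResponse))
  .digest('hex');
Metadata Storage: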
{
"sources": [
{
"type": "plenary_session",
"id": "PS-2026-03-01",
"data_hash": "a1b2c3d4e5f6...",
"timestamp": "2026-03-01T06:00:00Z"
}
]
}
Integrity Verification (future):
- Hash comparison to detect data tampering
- Periodic integrity audits via GitHub Actions
- Alert on hash mismatch
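A hash comparison along these lines could be sketched as follows. Unlike the plain `JSON.stringify` pattern above, this sketch serializes objects with sorted keys so that key order does not change the hash; that canonicalization step is an assumption added here, and the function names are hypothetical.

```typescript
import { createHash } from 'node:crypto';

// Serialize with sorted object keys so equal data always yields the same string.
function canonicalJson(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map((item) => canonicalJson(item)).join(',')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonicalJson(obj[key])}`)
      .join(',');
    return `{${body}}`;
  }
  const primitive = JSON.stringify(value);
  return primitive === undefined ? 'null' : primitive;
}

// SHA-256 over the canonical form, hex encoded (64 characters).
function hashSourceData(data: unknown): string {
  return createHash('sha256').update(canonicalJson(data)).digest('hex');
}

// True when the data still matches the hash stored in the article metadata.
function verifySourceData(data: unknown, expectedHash: string): boolean {
  return hashSourceData(data) === expectedHash;
}
```

A periodic audit job could call `verifySourceData` for each stored source entry and raise an alert when it returns `false`.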
Every change tracked:
- Commit SHA: Unique identifier for every generation
- Author: GitHub Actions bot (github-actions[bot])
- Timestamp: UTC timestamp of commit
- Diff: Exact changes made (new files, modified files)
- Workflow Run ID: Link to GitHub Actions run for full logs
Example Metadata:
{
"generator": {
"version": "0.8.40",
"commit_sha": "abc123def456...",
"workflow_run_id": "12345678",
"workflow_url": "https://github.com/Hack23/euparliamentmonitor/actions/runs/12345678"
}
}
Audit Capabilities:
- git log shows complete history
- git blame identifies when each line was added
- git diff shows exact changes between versions
- GitHub UI provides web-based audit interface
- SECURITY_ARCHITECTURE.md
- FLOWCHART.md
- European Parliament Open Data Portal
- MCP Protocol Specification
Document Status: Active
Last Updated: 2026-04-20 (EU Parliament Monitor v0.8.40)
Next Review: 2026-07-20
Owner: Development Team, Hack23 AB
{
  "articleType": "motions", // one of the 8 canonical slugs
  "runId": "motions-run46", // <slug>-run<NN>
  "date": "2026-04-20", // ISO date (run subdirectory)
  "history": [ // append-only gate history
    { "at": "2026-04-20T06:00:00Z", "gateResult": "PENDING", "pass": 1 },
    { "at": "2026-04-20T06:22:00Z", "gateResult": "GREEN", "pass": 2 }
  ],
  "files": { // canonical artifact index
    "intelligence": [
      "intelligence/synthesis-summary.md",
      "intelligence/analysis-index.md",
      "intelligence/stakeholder-map.md",
      "intelligence/economic-context.md"
      // …
    ],
    "classification": [
      "classification/significance-classification.md",
      "classification/impact-matrix.md",
      "classification/actor-mapping.md"
    ],
    "risk-scoring": [
      "risk-scoring/risk-matrix.md",
      "risk-scoring/quantitative-swot.md"
    ],
    "threat-assessment": [
      "threat-assessment/political-threat-landscape.md"
    ],
    "existing": [
      "existing/stakeholder-impact.md",
      "existing/deep-analysis.md"
    ],
    "documents": []
  }
}
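A structural check of such a manifest could look like the sketch below. It is a hypothetical illustration, not the Stage-C implementation: the list of canonical slugs is passed in by the caller because it is not enumerated here, and the rule that the `runId` prefix must equal `articleType` is an assumption inferred from the example.

```typescript
// Returns a list of problems; an empty list means the manifest passed these checks.
function validateManifest(input: unknown, knownSlugs: string[]): string[] {
  if (typeof input !== 'object' || input === null || Array.isArray(input)) {
    return ['manifest must be a JSON object'];
  }
  const manifest = input as Record<string, unknown>;
  const errors: string[] = [];

  const articleType = manifest['articleType'];
  if (typeof articleType !== 'string' || knownSlugs.indexOf(articleType) === -1) {
    errors.push('articleType: missing or not a known slug');
  }

  const runId = manifest['runId'];
  if (typeof runId !== 'string' || !/-run\d+$/.test(runId)) {
    errors.push('runId: expected <slug>-run<NN>');
  } else if (runId.slice(0, runId.lastIndexOf('-run')) !== articleType) {
    // Assumption: the slug part of runId repeats articleType.
    errors.push('runId: slug prefix does not match articleType');
  }

  const date = manifest['date'];
  if (typeof date !== 'string' || !/^\d{4}-\d{2}-\d{2}$/.test(date)) {
    errors.push('date: expected YYYY-MM-DD');
  }

  const files = manifest['files'];
  if (typeof files !== 'object' || files === null || Array.isArray(files)) {
    errors.push('files: expected an object of artifact lists');
  } else {
    const groups = files as Record<string, unknown>;
    for (const key of Object.keys(groups)) {
      const list = groups[key];
      if (!Array.isArray(list) || list.some((path) => typeof path !== 'string')) {
        errors.push(`files.${key}: expected an array of paths`);
      }
    }
  }
  return errors;
}
```

Since the example manifest carries `//` comments, it would need those stripped (or a JSONC-aware parser) before being passed to a check like this.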