SFMIX Participant Portal

Django-based dashboard for SFMIX participants. Users authenticate via Authentik SSO (OIDC) at login.sfmix.org, and PeeringDB-sourced ASN group memberships control which networks each user can view.

Architecture

Host: web.sfmix.org (Ansible group sfmix_website), DNS CNAME portal.sfmix.org → web
Runtime: Django + Gunicorn in Docker, port 8000, behind Nginx with Let's Encrypt TLS
Auth: mozilla-django-oidc → Authentik OIDC at login.sfmix.org
Data: NetBox API with proactive in-process cache (background thread per Gunicorn worker, 4-hour refresh cycle with exponential backoff on failure)
Monitoring: Prometheus metrics at /metrics/ (restricted to trusted networks), admin health dashboard at /admin/netbox-status/
ASN gating: Authentik groups like as64500 (sourced from PeeringDB federation) are extracted from the OIDC groups claim and stored in the Django session. Each user sees only their own networks.
Install dir: /opt/ixp_portal/
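
The ASN gating item above can be illustrated with a short sketch. It assumes group names are exactly "as" followed by digits, as in the as64500 example; extract_asns is a hypothetical helper written for this README, not a function taken from the portal source.

```python
import re

# Assumption: ASN groups look exactly like "as64500" (lowercase prefix, digits only).
ASN_GROUP_RE = re.compile(r"as(\d+)")


def extract_asns(groups):
    """Return the sorted, de-duplicated ASNs found in an OIDC groups claim."""
    asns = set()
    for name in groups or []:
        match = ASN_GROUP_RE.fullmatch(name)
        if match:
            asns.add(int(match.group(1)))
    return sorted(asns)


if __name__ == "__main__":
    # Non-ASN groups are ignored; duplicates collapse to one entry.
    print(extract_asns(["as64500", "some-other-group", "as64501", "as64500"]))
```

A view could then compare the ASN of the requested network against this list (kept in the session) before rendering anything.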

Authentik Setup

Create an OIDC provider + application in Authentik for the portal:

Parameter	Value
Client ID	`portal`
Client type	Confidential
Redirect URI	`https://portal.sfmix.org/oidc/callback/`
Scopes	`openid profile email groups`
Authorization flow	`default-provider-authorization-implicit-consent`

The groups scope must use the existing SFMIX: OpenID 'groups' ScopeMapping (the same one Grafana uses).

Set the resulting client secret as OIDC_RP_CLIENT_SECRET in the portal's environment.

Local Development

cd portal
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# For local dev without real OIDC, create a superuser:
python manage.py migrate
python manage.py createsuperuser

# Run dev server
python manage.py runserver

For full OIDC testing against login.sfmix.org, set these env vars:

export OIDC_RP_CLIENT_ID=portal
export OIDC_RP_CLIENT_SECRET=<from-authentik>
export OIDC_PROVIDER_URL=https://login.sfmix.org/application/o/portal
export IXP_NETBOX_URL=https://netbox.sfmix.org
export IXP_NETBOX_TOKEN=<read-only-netbox-token>
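
For orientation, here is one way the variables above could be turned into mozilla-django-oidc settings. This is a sketch under assumptions, not the portal's actual settings.py: it assumes the provider publishes a standard OIDC discovery document under OIDC_PROVIDER_URL and that tokens are signed with RS256. The helper names discovery_url and oidc_settings are hypothetical; the OIDC_* setting names are the ones mozilla-django-oidc reads.

```python
def discovery_url(provider_url):
    """Build the standard OIDC discovery URL for a provider base URL."""
    return provider_url.rstrip("/") + "/.well-known/openid-configuration"


def oidc_settings(discovery, client_id, client_secret):
    """Map a parsed discovery document (a dict) to mozilla-django-oidc settings."""
    return {
        "OIDC_RP_CLIENT_ID": client_id,
        "OIDC_RP_CLIENT_SECRET": client_secret,
        # Assumption: the provider signs ID tokens with RS256.
        "OIDC_RP_SIGN_ALGO": "RS256",
        "OIDC_RP_SCOPES": "openid profile email groups",
        "OIDC_OP_AUTHORIZATION_ENDPOINT": discovery["authorization_endpoint"],
        "OIDC_OP_TOKEN_ENDPOINT": discovery["token_endpoint"],
        "OIDC_OP_USER_ENDPOINT": discovery["userinfo_endpoint"],
        "OIDC_OP_JWKS_ENDPOINT": discovery["jwks_uri"],
    }
```

Fetching the discovery document (for example with urllib.request) is left out so the sketch stays free of network access.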

Production Deployment

Quick Deploy

cd ansible
pipenv run ansible-playbook deploy_portal.playbook.yml --vault-password-file ~/.sfmix_ansible_vault

Or as part of the full push:

pipenv run ansible-playbook push_servers.playbook.yml --tags ixp_portal --vault-password-file ~/.sfmix_ansible_vault

What the Ansible Role Does

The ixp_portal role (ansible/roles/ixp_portal/) performs these steps:

Installs Docker and docker-compose
Rsyncs portal/ source to /opt/ixp_portal/ (excludes .venv, __pycache__, db.sqlite3, staticfiles, .env)
Templates .env from vault-encrypted secrets (ansible/roles/ixp_portal/templates/dotenv.j2)
Builds and starts the Docker container (docker-compose up -d --build)
Runs Django migrations
Ensures Let's Encrypt cert for portal.sfmix.org via certbot
Deploys and enables Nginx reverse proxy vhost

Secrets

Stored vault-encrypted in ansible/inventory/host_vars/web.sfmix.org.yml:

Variable	Description
`ixp_portal_django_secret_key`	Django secret key
`ixp_portal_oidc_client_secret`	Authentik OIDC client secret
`ixp_portal_netbox_token`	NetBox API read-only token

Verification

After deploying, check the container logs for a successful NetBox cache refresh:

ssh web.sfmix.org "sudo docker-compose -f /opt/ixp_portal/docker-compose.yml logs --tail 20 2>&1 | grep -iE 'netbox|error'"

Expected output, once per Gunicorn worker: NetBox cache refreshed in N.Ns: NN tenants, NN IPs, NN ports
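
If you want to check that log line from a script rather than by eye, a small parser is enough. The sketch below assumes the message has exactly the shape quoted above; parse_refresh_line is a hypothetical helper and the sample line in the usage note is made up for illustration.

```python
import re

# Pattern derived from the expected message shape quoted in this README.
REFRESH_RE = re.compile(
    r"NetBox cache refreshed in (?P<seconds>\d+(?:\.\d+)?)s: "
    r"(?P<tenants>\d+) tenants, (?P<ips>\d+) IPs, (?P<ports>\d+) ports"
)


def parse_refresh_line(line):
    """Return the refresh duration and counts from a log line, or None if absent."""
    match = REFRESH_RE.search(line)
    if match is None:
        return None
    return {
        "seconds": float(match["seconds"]),
        "tenants": int(match["tenants"]),
        "ips": int(match["ips"]),
        "ports": int(match["ports"]),
    }
```

Feeding it a line such as "NetBox cache refreshed in 2.5s: 10 tenants, 20 IPs, 30 ports" yields a dict of the four values; a result with zero tenants points at the token problem described under Troubleshooting.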

IX Administrators can also check health in the browser at https://portal.sfmix.org/admin/netbox-status/ after logging in.

Smoke test:

ssh web.sfmix.org "curl -s -o /dev/null -w '%{http_code}' -H 'Host: portal.sfmix.org' http://localhost:8000/login/"
# Expected: 200

Troubleshooting

SSH rate limiting / connection resets during deploy: web.sfmix.org has PerSourceMaxStartups 10 in sshd. The Ansible rsync step opens extra SSH connections, which can trip this limit. Admin IPs should be listed in PerSourcePenaltyExemptList (managed by ansible/roles/sfmix_server/tasks/sshd.yml).

NetBox cache empty (0 tenants, 0 IPs): Check that IXP_NETBOX_TOKEN in /opt/ixp_portal/.env is a valid NetBox API token. Look for HTTP 403 errors in the container logs. The background cache thread starts when the first HTTP request hits each Gunicorn worker (via dashboard.middleware.NetBoxCacheMiddleware). On failure, the refresh loop retries with exponential backoff (30s → 60s → ... → 1h cap). IX Administrators can view the current error and clear/force-refresh from /admin/netbox-status/.
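
The "thread starts on first request" behavior explains why a freshly started worker shows an empty cache until it has served traffic. A minimal sketch of that pattern follows. StartOnceMiddleware and its refresh_loop parameter are hypothetical stand-ins, not the real dashboard.middleware.NetBoxCacheMiddleware; the sketch relies only on the fact that Django middleware is a callable built from get_response.

```python
import threading


class StartOnceMiddleware:
    """Start a background thread the first time this process handles a request."""

    _lock = threading.Lock()
    _started = False  # class-level, so it is shared by all instances in one worker

    def __init__(self, get_response, refresh_loop=None):
        self.get_response = get_response
        # Assumption: refresh_loop is whatever function runs the periodic refresh.
        self.refresh_loop = refresh_loop or (lambda: None)

    def __call__(self, request):
        cls = type(self)
        if not cls._started:
            with cls._lock:
                # Re-check under the lock so concurrent requests start one thread only.
                if not cls._started:
                    threading.Thread(
                        target=self.refresh_loop, name="netbox-refresh", daemon=True
                    ).start()
                    cls._started = True
        return self.get_response(request)
```

Because the flag lives in process memory, each Gunicorn worker starts its own thread, which matches the per-worker cache described under NetBox Cache.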

500 errors on network detail pages: Check container logs for tracebacks. Common cause: NetBox objects with null nested fields (e.g. custom_fields.participant is null on some peering ports).
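
A defensive accessor avoids this class of traceback. The sketch below is illustrative: dig is a hypothetical helper, and the "name" key used in the usage note is an assumed field, not something confirmed by the NetBox schema used here.

```python
def dig(obj, *keys):
    """Follow keys through nested dicts; return None if any level is missing or null."""
    for key in keys:
        if not isinstance(obj, dict):
            return None
        obj = obj.get(key)
    return obj


if __name__ == "__main__":
    port = {"custom_fields": {"participant": None}}
    # A chained port["custom_fields"]["participant"]["name"] would raise TypeError here.
    print(dig(port, "custom_fields", "participant", "name"))
```

Templates and views can then treat None as "no participant assigned" instead of failing the whole page.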

Health check DisallowedHost warnings: The docker-compose health check requests localhost:8000, which isn't in ALLOWED_HOSTS. These log warnings are harmless.

Key Files

File	Purpose
`ansible/deploy_portal.playbook.yml`	Deployment playbook
`ansible/roles/ixp_portal/tasks/main.yml`	Ansible tasks (rsync, docker, nginx, certbot)
`ansible/roles/ixp_portal/defaults/main.yml`	Default variables (domain, ports, OIDC, NetBox URL)
`ansible/roles/ixp_portal/templates/dotenv.j2`	Environment variable template
`ansible/roles/ixp_portal/templates/nginx-portal.conf.j2`	Nginx vhost template
`ansible/inventory/host_vars/web.sfmix.org.yml`	Host secrets (vault-encrypted)
`portal/dashboard/services.py`	NetBox data fetching, proactive cache, health tracking, Prometheus metrics
`portal/dashboard/middleware.py`	Starts background refresh thread per Gunicorn worker
`portal/ixp_portal/settings.py`	Django settings (OIDC, logging, middleware)
`portal/Dockerfile`	Container image (Python 3.12 + Gunicorn)
`portal/docker-compose.yml`	Docker Compose config

Environment Variables

Variable	Default	Description
`DJANGO_SECRET_KEY`	`insecure-dev-key-change-me`	Django secret key
`DJANGO_DEBUG`	`true`	Debug mode
`DJANGO_ALLOWED_HOSTS`	`*`	Comma-separated allowed hosts
`OIDC_RP_CLIENT_ID`	`portal`	Authentik OIDC client ID
`OIDC_RP_CLIENT_SECRET`	(empty)	Authentik OIDC client secret
`OIDC_PROVIDER_URL`	`https://login.sfmix.org/application/o/portal`	OIDC provider base URL
`IXP_NETBOX_URL`	`https://netbox.sfmix.org`	NetBox API base URL
`IXP_NETBOX_TOKEN`	(empty)	NetBox API read-only token
`PROMETHEUS_TRUSTED_NETWORKS`	`127.0.0.0/8,::1/128`	Comma-separated CIDRs allowed to scrape `/metrics/`
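
To make the PROMETHEUS_TRUSTED_NETWORKS format concrete, here is a sketch of how a comma-separated CIDR list can be parsed and checked with the standard ipaddress module. parse_networks and is_trusted are hypothetical names; the portal's own check may differ, for instance in how it determines the client address behind Nginx.

```python
import ipaddress


def parse_networks(value):
    """Parse a comma-separated CIDR string into network objects."""
    return [ipaddress.ip_network(part.strip()) for part in value.split(",") if part.strip()]


def is_trusted(client_ip, networks):
    """Return True if client_ip falls inside any of the given networks."""
    addr = ipaddress.ip_address(client_ip)
    # An IPv4 address is never "in" an IPv6 network (and vice versa), so mixing is safe.
    return any(addr in net for net in networks)


if __name__ == "__main__":
    nets = parse_networks("127.0.0.0/8,::1/128")  # the documented default
    print(is_trusted("127.0.0.1", nets), is_trusted("192.0.2.10", nets))
```

Note that ip_network rejects values with host bits set (such as 10.0.0.1/8), so a typo in the variable surfaces as a ValueError rather than a silently wrong filter.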

NetBox Cache

The portal maintains an in-process cache of participant data from NetBox, refreshed by a background thread in each Gunicorn worker.

Refresh interval: 4 hours after a successful fetch
On failure: exponential backoff starting at 30s, doubling on each attempt, capped at 1 hour. Once the backoff reaches the cap, the thread waits until the next normal 4-hour cycle. Stale data continues to be served in the meantime.
Startup: the cache starts empty; the first refresh fires immediately when the background thread starts (triggered by the first HTTP request via NetBoxCacheMiddleware)
Per-worker isolation: each Gunicorn worker has its own independent cache and refresh thread (no shared Redis/memcached)
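
The retry timing above can be written down as a small function. This is a sketch of the documented numbers only (30s start, doubling, 1-hour cap, 4-hour normal interval); next_wait and the constant names are hypothetical, and the hand-off back to the 4-hour cycle after the cap is reached is not modeled.

```python
REFRESH_INTERVAL = 4 * 60 * 60  # seconds between refreshes after a success
BACKOFF_START = 30              # first retry delay after a failure
BACKOFF_CAP = 60 * 60           # retry delay never exceeds one hour


def next_wait(consecutive_failures):
    """Seconds to sleep before the next refresh attempt."""
    if consecutive_failures <= 0:
        return REFRESH_INTERVAL
    return min(BACKOFF_START * 2 ** (consecutive_failures - 1), BACKOFF_CAP)


if __name__ == "__main__":
    # Delays for the first eight consecutive failures.
    print([next_wait(n) for n in range(1, 9)])
```

With these constants the cap is reached on the eighth consecutive failure, roughly an hour after the first one.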

Health & Observability

Admin dashboard: IX Administrators see a "NetBox Status" link in the nav bar, leading to /admin/netbox-status/. The page shows a health badge (Healthy / Degraded / No data), cache age, item counts, refresh duration, success/failure counts, and the last error message. It also includes a "Clear Cache & Refresh Now" button.
Prometheus metrics at /metrics/, restricted to PROMETHEUS_TRUSTED_NETWORKS:

Metric	Type	Description
`netbox_refresh_duration_seconds`	Histogram	Time spent fetching data from NetBox
`netbox_refresh_success_total`	Counter	Successful cache refreshes
`netbox_refresh_failure_total`	Counter	Failed cache refreshes
`netbox_cache_age_seconds`	Gauge	Seconds since last successful refresh
`netbox_cache_items{type}`	Gauge	Cached items by type (tenants, ip_addresses, peering_ports)
