dantp-ai

Daniel Plop dantp-ai

dantp-ai/README.md

Howdy, I'm Daniel 👋.

I am a Senior Machine Learning Software Engineer focused on reinforcement learning, AI infrastructure, and building reliable, scalable software for AI systems.

Projects

Reinforcement Learning & Robotics

gym-puddle: Off-policy PAC algorithm implemented on the Puddle World Gymnasium environment using TorchRL
proprio: Unsupervised, uncertainty-aware perception for a 7-DOF robot arm; classifies each lidar reading as self, background, or anomaly, without any geometry or kinematics.

Tooling

AlphaEx: Sweep parameters and dispatch thousands of Slurm jobs from one Python script.
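
A minimal sketch of the sweep-and-dispatch pattern, not AlphaEx's actual API: the helper names `expand_grid` and `build_sbatch_commands` and the script name `train.py` are hypothetical. It expands a parameter grid into one `sbatch --wrap` command per configuration and only builds the command strings, without submitting anything.

```python
import itertools
import shlex


def expand_grid(grid):
    """Return one dict per point in the Cartesian product of the grid values."""
    keys = sorted(grid)  # fixed key order keeps the output deterministic
    value_lists = [grid[k] for k in keys]
    return [dict(zip(keys, values)) for values in itertools.product(*value_lists)]


def build_sbatch_commands(grid, script="train.py"):
    """Build one `sbatch --wrap=...` command string per configuration.

    `script` is a hypothetical training entry point that accepts --key=value flags.
    """
    commands = []
    for config in expand_grid(grid):
        args = " ".join(f"--{k}={shlex.quote(str(v))}" for k, v in config.items())
        inner = f"python {script} {args}"
        commands.append("sbatch --wrap=" + shlex.quote(inner))
    return commands


if __name__ == "__main__":
    for cmd in build_sbatch_commands({"lr": [0.1, 0.01], "seed": [0, 1, 2]}):
        print(cmd)
```

Passing each string to `subprocess.run(cmd, shell=True)` on a cluster login node would submit the jobs. Building the strings first makes the sweep easy to inspect before anything is dispatched.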

Educational

internals: Interactive, first-principles tutorials for modern AI systems & system components.
- Speculative Decoding: Interactive walkthrough of how LLMs emit several tokens per forward pass.
nabla: Educational NumPy implementations of 15 optimizers (SGD → Muon), animated on a 2D saddle & benchmarked on matrix least squares (LS).
priori: Interactive marimo benchmark of TabPFN v2 — a tabular foundation model that predicts in-context, with no training — against tuned XGBoost & AutoGluon on churn and credit tables.
minitorch: Minimalistic deep learning framework rebuilt from scratch: autodiff, tensors & a neural-net stack across NumPy, Numba-parallel CPU, & CUDA backends.
kairos: Reinforcement learning when the environment won't wait. A bounded-compute PPO agent acts on a stale policy while each update computes and drops the experience it is too busy to process, testing whether extra compute is better spent on more epochs or on more fresh data.
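
To give a flavor of the reverse-mode autodiff that minitorch rebuilds, here is a minimal scalar-only sketch. The `Scalar` class and its attribute names are illustrative assumptions, not minitorch's actual API, and only addition and multiplication are supported.

```python
class Scalar:
    """A value that remembers its parents and the local derivative w.r.t. each."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents          # nodes this value was computed from
        self._local_grads = local_grads  # d(self)/d(parent) for each parent

    @staticmethod
    def _wrap(value):
        return value if isinstance(value, Scalar) else Scalar(value)

    def __add__(self, other):
        other = Scalar._wrap(other)
        return Scalar(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        other = Scalar._wrap(other)
        return Scalar(self.data * other.data, (self, other), (other.data, self.data))

    def backward(self):
        """Accumulate d(self)/d(node) into node.grad for every upstream node."""
        order, seen = [], set()

        def visit(node):
            # Post-order DFS: parents are appended before the node itself.
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        # Walk from the output back to the inputs, applying the chain rule.
        for node in reversed(order):
            for parent, local in zip(node._parents, node._local_grads):
                parent.grad += local * node.grad


if __name__ == "__main__":
    x, y = Scalar(2.0), Scalar(3.0)
    z = x * y + x * x
    z.backward()
    print(z.data, x.grad, y.grad)
```

The reverse topological order matters: a node's gradient must be fully accumulated from all of its consumers before it is pushed on to its parents.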

GitHub Activity

Latest Blog Posts

Review on the Technical Report: Gemini Robotics 1.5

Pinned

tianshou Public

Forked from thu-ml/tianshou

An elegant PyTorch deep reinforcement learning library.

Python
clawloop Public

Forked from aganthos/clawloop

Make your agents learn from experience. One protocol for weights, harness and routing.

Python
deep-rl-algos-methods Public

Jupyter Notebook
minitorch Public template

Forked from minitorch/minitorch

A PyTorch-style deep learning framework built from scratch: reverse-mode autodiff, n-dimensional tensors, and a neural-net stack on NumPy, Numba-parallel CPU & CUDA backends.

Python
proprio Public

Forked from georgosgeorgos/DLRC_2018

Statistical Models for Robotic Perception

Jupyter Notebook
gym-puddle Public

Forked from EhsanEI/gym-puddle

Continuous grid-world environment for RL using Gymnasium

Python