Sudarsh

A quiet research notebook on AI systems, inference optimization, reasoning architectures, and multimodal AI.
https://sudarsh.dev/

Optimizing a Layer Normalization Kernel with CUDA: a Worklog
https://sudarsh.dev/blog/cuda-layernorm-worklog/
An iterative guide to writing and optimizing a CUDA layer normalization kernel, from a naive single-thread implementation to vectorized loads, benchmarked against PyTorch.
Mon, 17 Feb 2025 00:00:00 GMT