
nanoGPT C# Port


nanoGPT C# Port is a spike that answered a single question: is TorchSharp a viable path to GPU-accelerated GPT training and inference in C#, without dropping to Python? It ports Andrej Karpathy's nanoGPT in full: the architecture, tokenizer, causal self-attention, transformer blocks, weight tying, and GPT-2-paper initialization. A real training run on tiny Shakespeare validated the port, GPU-accelerated on an RTX 5090 under WSL2, with loss dropping from 3.88 to 1.63 in under two minutes.

The spike also surfaced two real defects in how .NET's garbage collector interacts with native GPU tensors: undisposed intermediate tensors that either segfaulted the process or exhausted GPU memory. Both were fixed by explicit dispose-scoping rather than relying on the GC. The result is parked as a validated, reusable starting point for any future .NET-hosted GPU training work, documented in ADR-052.
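
For readers unfamiliar with dispose-scoping, here is a minimal sketch of the pattern using TorchSharp's `torch.NewDisposeScope()`. It is not the port's actual code. The tensor shapes, the loop, and the stand-in "loss" are hypothetical and only show how intermediates created inside a scope are released deterministically at the end of each step rather than whenever a finalizer happens to run.

```csharp
using System;
using TorchSharp;

// Illustrative only: a hypothetical stand-in for a training step.
var device = torch.cuda.is_available() ? torch.CUDA : torch.CPU;

// Long-lived tensor, created outside any dispose scope and disposed explicitly.
using var weights = torch.randn(new long[] { 64, 64 }, device: device);

for (var step = 0; step < 100; step++)
{
    // Tensors created while this scope is active are registered with it
    // and freed when the scope is disposed at the end of the iteration.
    using var scope = torch.NewDisposeScope();

    var input = torch.randn(new long[] { 32, 64 }, device: device);
    var hidden = input.matmul(weights).relu();   // intermediates live in the scope
    var loss = hidden.mean();                    // stand-in scalar, not a real loss

    // Copy the scalar out to managed memory before the scope frees the tensor.
    var lossValue = loss.item<float>();
    if (step % 25 == 0)
    {
        Console.WriteLine($"step {step}: {lossValue:F4}");
    }
}
```

A tensor that has to outlive its scope can be handed to the enclosing scope with `MoveToOuterDisposeScope()`. Everything else is released with the scope, which keeps native GPU memory bounded per step.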

