Liu ZhuoQi

AI Agent Developer · Integrating AI into real products

Recent

Building a Personal Site with Hugo and Dual-Stack CDN

Why Hugo # When picking a framework for a personal blog, my top criterion was low maintenance cost — I didn’t want to abandon writing three months later because of npm dependency hell. Hugo is a single binary, requires no Node.js, builds thousands of posts in 1-2 seconds, and the PaperMod theme comes with dark mode, full-text search, RSS, Open Graph, and reading time estimates out of the box. Day-to-day writing only requires touching Markdown files.

Why LLMs Have No Memory — A Cross-Validated Research Report with 67 Primary Sources

1. Why LLMs Are Stateless # Four independent constraints, each manageable on its own, together leave "stateless" as the only viable engineering solution, a conclusion cross-validated across 67 primary sources. Architecture: O(n²) Attention # Self-attention compute scales as O(n²) in sequence length, and the KV cache grows with every token in context. A single 4096-token sequence needs 2 GB of VRAM for its KV cache; 32 concurrent sessions hit 64 GB, more than the model weights themselves. Llama 3.1 at a 100M-token context would require 638 H100 GPUs ($5,400/hour) for the KV cache alone.
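The KV-cache arithmetic behind figures like these can be sketched as follows. The layer count, KV-head count, and head dimension below are illustrative assumptions for a Llama-3.1-70B-like model with grouped-query attention, not the report's exact configuration:

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, dtype_bytes: int = 2) -> int:
    """Per-sequence KV cache size: 2 tensors (K and V) per layer,
    each of shape [n_kv_heads, seq_len, head_dim]."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * dtype_bytes

# Assumed config: 80 layers, 8 KV heads (GQA), head_dim 128, fp16 (2 bytes).
per_seq = kv_cache_bytes(n_layers=80, n_kv_heads=8, head_dim=128, seq_len=4096)
print(f"{per_seq / 2**30:.2f} GiB per 4096-token sequence")   # → 1.25 GiB
print(f"{32 * per_seq / 2**30:.1f} GiB for 32 sessions")      # → 40.0 GiB
```

Note how the cache grows linearly in both sequence length and concurrent sessions; the quadratic term in the article refers to attention compute, which touches every pair of cached positions.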