Marcobisky

My story

My interest in Electrical and Computer Engineering first took shape after the college entrance exam, when I learned that Communication Engineering is a discipline combining theory and practice. I chose it as my major without hesitation. In my freshman year, many experiences left deep impressions and gave me a sense of intellectual joy: when my C program for Tic-tac-toe finally beat me; when the electronic piano I built with a C51 microcontroller produced sound on its first power-on; or when the memory module I built on a breadboard worked correctly after I studied Yale Patt's computer architecture textbook. These moments taught me that abstract principles could be made tangible through circuits and code, and that ECE is the art of turning theory into reality.

As I delved deeper into the field, I realized, almost with surprise, that it was all mathematics underneath. I was thrilled when Tristan Needham revealed the connection between electrostatic equipotential surfaces and conformal mappings. I started attending advanced math courses in abstract algebra, differential manifolds, graph theory, and more. In Paolo's category theory discussion group, I learned about universal properties, which completely changed how I thought about groups, matrices, and the very idea that "everything is a set." These experiences profoundly shaped my view: I no longer see my ECE courses as isolated subjects, but as interconnected pieces of mathematics and computation. For instance, Signals and Systems becomes complex, harmonic, and functional analysis; Stochastic Signal Processing is measure theory and ergodic theory; Information Theory and Electromagnetism relate to differential geometry; Digital Circuit Analysis reflects computability theory. This unified framework makes learning effortless and enjoyable!

With a growing understanding of the mathematical foundations of intelligence, I began to wonder whether similar structures could exist in machines, and I became intrigued by AI and machine learning (ML). The pace of AI's progress astonished me: I remember watching a TV host demonstrate natural-language formatting in Word, and seemingly overnight ChatGPT could write stories and generate striking images. At university, I enrolled in an Artificial Intelligence and Machine Learning course and learned the universal approximation theorem, backpropagation, how raising dimensionality can reveal structure in data, and more. The whole experience was fantastic!

The question that fascinated me initially was whether intelligence could exist within a few kilobytes of memory. In an embedded systems course, we set out to build a wearable device that uses accelerometer data to predict whether a patient has fallen, with an STM32 microcontroller as the platform. Initially I relied on hard acceleration thresholds, but a few days before submission I had a bold idea: implement a neural network directly on the STM32! Since I could find no similar projects on GitHub, I decided to build it from first principles. I began by designing a network that takes acceleration values as input and outputs a fall probability. Then came the tedious work of collecting data, so I configured a Bluetooth module to stream real-time readings to my computer and used two buttons for labeling. To obtain enough training samples, I literally performed roughly 200 simulated falls in the lab. After that, I wrote a Python script to assemble the recordings into a dataset, trained a simple CNN on my computer, exported the trained parameters as a C header file, and translated the Python CNN into C for the STM32. After four to five days of late nights in the lab, the accuracy improved by 80%, far exceeding the course requirements. I later open-sourced this project on GitHub. The project deepened my understanding of how abstract AI models can be implemented on an MCU.
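The parameter-export step of that pipeline can be sketched in a few lines. This is a minimal illustration rather than the actual project code; the function and array names (`export_weights_to_header`, `dense_w`, `dense_b`) are hypothetical, and a real deployment would likely quantize to fixed-point rather than emit floats:

```python
def flatten(x):
    # Recursively flatten nested lists of numbers into one flat list.
    if isinstance(x, (int, float)):
        return [float(x)]
    out = []
    for item in x:
        out.extend(flatten(item))
    return out

def export_weights_to_header(weights, path):
    # Emit each named parameter tensor as a static C float array,
    # wrapped in a standard include guard.
    lines = ["#ifndef MODEL_PARAMS_H", "#define MODEL_PARAMS_H", ""]
    for name, tensor in weights.items():
        flat = flatten(tensor)
        vals = ", ".join(f"{v:.6f}f" for v in flat)
        lines.append(f"static const float {name}[{len(flat)}] = {{{vals}}};")
    lines += ["", "#endif  // MODEL_PARAMS_H"]
    with open(path, "w") as f:
        f.write("\n".join(lines))

# Hypothetical example: a tiny 3-input, 2-unit dense layer.
params = {
    "dense_w": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    "dense_b": [0.0, 0.0],
}
export_weights_to_header(params, "model_params.h")
```

The generated header can then be `#include`d by the hand-written C inference code on the MCU, so the firmware never needs a file system or a model loader.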

Since then, my growing fascination with the intersection of ML and hardware has led me to pursue two complementary projects that strengthened my skills in both domains. Over one summer, I built a single-cycle RISC-V CPU from scratch in Verilog, capable of booting assembly code and running user C programs, with every execution stage visualized in the open-source logic simulator Digital; the project deepened my understanding of Yale Patt's textbook from the gate level upward. In another project, I tackled analog circuit parameter optimization across multiple PVT corners, proposing a method based on Graph Attention Networks (GATs) and Reinforcement Learning (RL) to resolve conflicting gradients between corners. Reframing the problem as a graph-learning task significantly improved optimization stability. These experiences strengthened my foundation in both RTL design and applied ML, while sharpening my object-oriented programming, debugging, and collaborative development skills with Git.

With my understanding of hardware and ML deepened, I started my graduation project and chose a topic in edge computing without hesitation: accelerating Apple's MobileViT on an FPGA using RISC-V Vector Extension (RVV) assembly. My goal is to develop full-stack deployment capability, not just software running on a PC. The project is still ongoing; I am using Google's Coral NPU framework and am currently reproducing a paper on sparse DNN acceleration.

As for career goals, in the near term I hope to become an ML engineer, building on my strengths in AI and hardware acceleration during graduate study, with a focus on LLMs and generative AI. In the long run, I aspire to become a research scientist at a company like NVIDIA, where I can contribute to cutting-edge software-hardware co-design for AI. To be even more "unrealistic": having been deeply impressed by emergent-behavior projects such as the Game of Life and Artificial Life, and by 3Blue1Brown's videos on phase transitions and percolation, I am also excited about research toward AGI, which I suspect will look quite different from the transformer architecture; it may be more decentralized and neuromorphic.

There is a little story behind my switch to AI. After I passed the selection interview at CUHK at the end of 2025, I was initially in contact with a prospective mentor there whose research focuses on EDA; upon further reflection, however, I realized that my long-term interests align more closely with AI research. Studying AI often led me to reflect on underlying mechanisms and open problems, whereas I felt less intrinsic intellectual engagement when working on EDA-related topics. I took time to carefully reconsider my direction, because I want to fully commit to the field I pursue.

In conclusion, my academic journey to date, enriched by interdisciplinary experiences, has equipped me well for the challenges of graduate study. I am excited to focus on LLMs and generative AI as I further my education and contribute meaningfully to the evolving field of Computer Science.

© Copyright 2025 Marcobisky.