Disclaimer: A mere mortal’s attempt to understand the cosmos.
Thread I: First PC
When I was around 9 years old, I built my first PC. To be fair, my elder brother helped me quite a bit. But I retain my bragging rights at family get-togethers for being the first to turn it on. It ran an Intel Pentium I processor, clocked @1.9 GHz, with 8MB of RAM and 1.9 GB of storage.
Once it was powered up, it’s not hard to imagine what we did first. Wolf 3D and NFS: we lost ourselves in the digital realm of PC gaming. As impressed as we were by our creation, we hungered for more. We saved our pocket money and eventually bought an extra 8MB of RAM to run our games more smoothly.
Well, that was decades ago. And after having studied Electronics and Microprocessor Design and working for a HW company, it’s fair to say that I have built hundreds of machines.
It is surreal to look back on these memories as I type on my Daily Driver, fitted with an Intel i9-9980XE 18-core CPU @4.5 GHz, 128GB of RAM, dual Titan RTX graphics cards, 16TB of storage, an Optane 3D XPoint 905P SSD (plugged directly into PCIe x4 lanes), an HDDL (High Density Deep Learning) Mustang card (PCIe), and an Arria 10 FPGA card.
Why elaborate on the specs of this machine? I want to showcase the delta between my first and current PC. The difference in architecture is extreme.
Thread II: Scalar, Vector, Matrix, and Spatial
Let’s talk about architecture. In my last couple of years at Intel, I stretched my knowledge of the various kinds of architecture in use. I was also fortunate enough to cross paths with the architects behind the cores of the iPhone, Xbox, PS4, and Tesla’s Autopilot.
The architecture of the modern computing world comes in four variants.
Scalar: CPU Architecture
Vector: GPU Architecture
Matrix: AI Architecture
Spatial: FPGAs, ASICs, etc.
The difference between these architectures comes down to the dimensionality of the arrays being operated upon, and to which kind of workload operates on which kind of data array. Theoretically, any Turing-complete core can compute on any kind of data array, but the “elegance” with which it does so will vary.
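To make the idea of dimensionality a little more concrete, here is a minimal sketch in plain Python. It is only a software analogy, not a description of how any real CPU, GPU, or AI accelerator executes work. The function names (scalar_mac, vector_dot, matrix_multiply) are hypothetical and chosen purely for illustration. The spatial case is left out, since laying logic out in hardware has no direct equivalent in a short script.

```python
# Illustrative only: the same multiply-accumulate idea applied to
# data of increasing dimensionality. All names here are hypothetical.

def scalar_mac(a, b, acc):
    """Scalar: one multiply-accumulate on single values."""
    return acc + a * b

def vector_dot(xs, ys):
    """Vector: the multiply-accumulate applied across a 1-D array."""
    acc = 0
    for x, y in zip(xs, ys):
        acc = scalar_mac(x, y, acc)
    return acc

def matrix_multiply(A, B):
    """Matrix: one dot product per (row of A, column of B) pair."""
    cols = list(zip(*B))  # transpose B so its columns can be iterated
    return [[vector_dot(row, col) for col in cols] for row in A]

print(scalar_mac(2, 3, 0))                                # 6
print(vector_dot([1, 2, 3], [4, 5, 6]))                   # 32
print(matrix_multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

Here the matrix case is built entirely out of scalar steps, which is the "any core can compute it" point. Hardware specialized for vectors or matrices would, roughly speaking, perform many of those inner steps at once rather than one at a time.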
Thread III: SKUs
I have built SKUs of Binary Engines for systems ranging from hand-held platforms to server racks. To be more precise, they can be articulated as:
FPGA (Field Programmable Gate Arrays): Logic is hard-coded and synthesized down to its Configurable Logic Blocks, with no middle layers of abstraction in between.
Wearables: Low-power, battery-operated x86 SKUs. Intended for mobile-friendly workloads.
High-end Workstations: No compromise on Specs. EinRig is an example of that.
Server Racks: Building server racks is like building a skyscraper. It’s about channeling the workloads to each “apartment” in an elegant way. Whether you scale-in or scale-out, or take the route of an external NIC, or spread it internally via the PCIe lanes, it’s all a function of the workload’s expectations. Last year, I built and named my server rack “Gothmog” (if there are any Silmarillion fans out there…).
In my experience, I have realized that all computers are, in essence, the same. It’s the scale and point of reference that vary. Computers have four primary, complementary components:
Cores
Memory
Interconnects
Peripherals
Thread IV: Indie
Many of my experiments stemmed from my own personal needs, and over time this approach turned into a research methodology called “Participant Observation.” My needs were a product of my interests in graphics, prototyping, media editing, sound design, etc. By understanding these needs, I was able to empathize with the possible end users.
Over the last year, the indie retro gaming scene has also caught my attention (I wonder if my recent affection towards Synthwave music has triggered that).
So, here I was, looking at diverse ends of the spectrum of computing: HPC (High Performance Computing) and indie core-driven retro.
There is beauty in both (plus it’s fun to get your hands dirty sometimes).
Thread V: Note to future self
What would you do if you had one ExaFlop at your disposal?