This Image is 4KB

This is not clickbait. I'm still not sure why this isn't mainstream. Now that I have funds to play around with AWS (thanks to my wildly successful investment in DOGE), I decided to play with GANs. I've downloaded gigabytes of images from Instagram and turned my entire DVD and Blu-ray collection into frames sampled at 1-second intervals. My wife thought I was finally organizing our media collection. Little did she know I was actually dismantling it forever in the name of science.

I wasted an enormous amount of time trying to fit this into memory, but then I remembered Batch Normalization. Now, in the span of one hour, I can load all the data onto servers and start training in parallel. I trained the model on top of AlexNet, but modified the initial layer to support much larger images. Take that, Google DeepMind with your fancy Gemini Ultra training.

The Breakthrough

My most important discovery came when I looked at the bottleneck of my U-Net, the lowest layer, right in the middle of the network. After several down-sampling operations through max-pooling, the spatial dimensions of the feature map shrink while the number of channels grows.
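To make the shrinking-and-widening concrete, here is a rough shape trace in plain Python. It is only a sketch under assumptions: the post does not give the actual layer configuration, so the 512x512x3 starting shape, the channel schedule (16, doubling at each stage, capped at 4096), and the name `trace_encoder_shapes` are all hypothetical choices made for illustration.

```python
# Hypothetical encoder shape trace. Each stage is assumed to apply a
# convolution that sets the channel count, followed by a 2x2 max-pool
# that halves the height and width.
def trace_encoder_shapes(height, width, channels,
                         first_channels=16, max_channels=4096):
    shapes = [(height, width, channels)]
    out_channels = first_channels
    while height > 1 or width > 1:
        height = max(1, height // 2)   # 2x2 max-pool halves the height
        width = max(1, width // 2)     # ...and the width
        shapes.append((height, width, out_channels))
        out_channels = min(out_channels * 2, max_channels)
    return shapes


shapes = trace_encoder_shapes(512, 512, 3)
for stage, (h, w, c) in enumerate(shapes):
    print(f"stage {stage}: {h}x{w}x{c}")
```

With these assumed numbers it takes nine halvings to go from 512 pixels per side down to 1, which is a lot more than the "several" a typical U-Net uses.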

Right there in the middle, my original (normalized) input image of 512x512x3 was now down to 1x1x4096. But the beauty of it is, when I freeze the seed, I can get the original image back and up-scale it to whatever size I want. So an 8MB image fed into my script comes out as a 4KB file. That same 4KB file can in turn be transformed back into the original 8MB image.
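The arithmetic behind those sizes can be checked in a few lines. The one assumption that matters is how each of the 4096 bottleneck values is stored: the post does not say, and the 4KB figure only works out if every value takes a single byte (`BYTES_PER_VALUE` is a hypothetical name for that assumption).

```python
# Back-of-the-envelope sizes for the numbers quoted above.
LATENT_VALUES = 1 * 1 * 4096
BYTES_PER_VALUE = 1  # assumption: each latent value is quantized to one byte

latent_bytes = LATENT_VALUES * BYTES_PER_VALUE   # size of the "4KB file"
raw_input_bytes = 512 * 512 * 3                  # 8-bit RGB network input
source_file_bytes = 8 * 1024 * 1024              # the 8MB image on disk

print("latent size in bytes:", latent_bytes)
print("ratio vs. 512x512x3 input:", raw_input_bytes / latent_bytes)
print("ratio vs. 8MB source file:", source_file_bytes / latent_bytes)

# If the latent were kept as 32-bit floats instead, it would be 4x larger.
latent_bytes_float32 = LATENT_VALUES * 4
print("latent size as float32:", latent_bytes_float32)
```

So under the one-byte assumption the ratio is 192:1 against the resized input and 2048:1 against the 8MB file. Stored as float32, the same vector would be 16KB.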

After a few successful attempts, I went to Google to see whether anyone had found something similar. To the best of my knowledge and my Googling skills, I am the first. This is not what I set out to do, but damn. In theory, this beats the highest Weissman score by an order of magnitude at least. Take that, Richard Hendricks! I hear Pied Piper is raising another round of funding; maybe they should be talking to me instead.

Real-World Testing

I was so excited I closed my laptop with a loud bang right when my wife was walking into the room.

"Why did you close it right when I came in? What are you watching? So help me God it's not what I think it is..."

I picked up my phone, snapped a picture of her with disheveled hair, and uploaded it immediately. I was astonished at how flawlessly the model recreated the 12MB image from my phone.

"Did you just take a picture of me looking like this? DELETE IT RIGHT NOW!"

"Honey, you don't understand. I'm compressing you down to 4 kilobytes. It's for science!"

Surprisingly, that explanation didn't help the situation. Now I'm sleeping on the couch, but at least I have a breakthrough algorithm to keep me warm.

Future Applications

I have to admit, the only downside is that it is computationally intensive both to produce the 4KB file and to retrieve the image from it. But that is a challenge for another day. For now, I can only dream of the implications for the future. I can literally stream video on a dial-up modem (once the performance issues are resolved, of course).
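For the curious, here is a quick sanity check on the dial-up dream. It is a sketch under assumptions: a nominal 56 kbit/s modem with no protocol overhead, and every video frame encoded independently as its own 4KB (or 6KB) file. `frames_per_second` is a hypothetical helper, not part of my script.

```python
# How many independently encoded frames fit through a dial-up link per second?
def frames_per_second(link_bits_per_second, frame_bytes):
    link_bytes_per_second = link_bits_per_second / 8
    return link_bytes_per_second / frame_bytes


DIALUP_BPS = 56_000  # assumption: nominal 56 kbit/s modem, no overhead

fps_4kb = frames_per_second(DIALUP_BPS, 4 * 1024)
fps_6kb = frames_per_second(DIALUP_BPS, 6 * 1024)

print(f"4KB frames: {fps_4kb:.2f} frames per second")
print(f"6KB frames: {fps_6kb:.2f} frames per second")
```

That comes to a little under two frames per second, so "video" is a generous word for it.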

I've heard of this company in the Emirates or somewhere in the Middle East that makes processors the size of a laptop. Cerebras, if I'm not mistaken. I will look into them. With their CS-2 system that everyone's been talking about recently, I might be able to reduce the computational overhead. I just need to figure out how to convince my wife that spending our twins' college fund on a wafer-scale AI accelerator is a sound investment.

PS: A Small Clarification

OK, I lied, it's not exactly 4KB. In the output file, I also save the original image size for up-scaling, and the file size can vary a bit, up to 6KB in my benchmark. But you wouldn't complain about that now, would you? That's still better than Apple's new JPEG XL format everyone's been raving about, and I built it in my spare time between diaper changes.
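To illustrate what saving the original image size alongside the 4KB payload could look like, here is one hypothetical container layout. The actual file format is not described anywhere in this post, so the header fields, the byte order, and the names `pack_image` and `unpack_image` are all assumptions.

```python
import struct

# Hypothetical header: original width and height as little-endian uint32.
HEADER = struct.Struct("<II")


def pack_image(width, height, latent):
    """Prefix the latent bytes with the original image dimensions."""
    return HEADER.pack(width, height) + latent


def unpack_image(blob):
    """Split a packed blob back into (width, height, latent bytes)."""
    width, height = HEADER.unpack_from(blob, 0)
    return width, height, blob[HEADER.size:]


latent = bytes(4096)                   # placeholder for the 1x1x4096 bottleneck
blob = pack_image(4032, 3024, latent)  # example phone-photo dimensions
print(len(blob))                       # 8-byte header + 4096-byte payload
print(unpack_image(blob)[:2])
```

A fixed header like this adds only 8 bytes, so the sketch does not attempt to reproduce the size variation I saw in my benchmark.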

Coming soon: "How I Plan to Sell My 4KB Image Technology to Netflix Without Them Realizing It Won't Work at Scale" and "Why My Wife Has Hidden All Our Family Photos"


Note: If you're a venture capitalist or a recruiter from OpenAI reading this blog, please contact me immediately. If you're from the AWS billing department investigating unusual usage patterns, this blog is entirely fictional.