Training a small, cheap model to imitate a big, expensive one. The big lab gets mad about this roughly every other week.
We distilled their ten-billion-dollar model down to something that runs on a laptop. They're suing the laptop.
+1443
Think they got it wrong?
Add your own definition of Distillation