Chipmakers Nvidia and Groq entered into a non-exclusive tech licensing agreement last week aimed at speeding up and lowering the cost of running pre-trained large language models.

Why it matters: Groq's language processing unit (LPU) chips power real-time chatbot queries, as opposed to model training, potentially giving Nvidia an edge in the AI race.

The big picture: Currently, Nvidia's chips power much of the AI training phase, but inference is a bottleneck that Nvidia doesn't fully control yet.
- Groq's chips are purpose-built for inference, the stage at which AI models use what they've learned in the training process to produce real-world results.
- Inference is where AI companies take their models from the lab to the bank. And with the soaring costs to train AI, those models better get to the bank soon.
- Cheap and efficient inference is essential for AI use at scale.
- Investors are pouring money into inference startups, hoping to find the missing link between AI experimentation and everyday use, Axios' Chris Metinko reported earlier this year.
- Better inference could push more companies to launch more and bigger enterprise AI initiatives. That, in turn, could drive more training and boost demand for Nvidia's training chips.

How it works: AI models operate in two distinct phases: training and inference.
- In the training phase, models ingest vast datasets of text, images and video and then use that data to build internal knowledge.
- In the inference phase, the model recognizes patterns in data it's never seen before and generates responses to prompts based on those patterns.
- Think of the phases like a student studying for a test and then taking the test.
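The training/inference split can be illustrated with a toy sketch. This is not how production LLMs work; it uses a simple character-bigram counter purely to show the two phases, with hypothetical function names:

```python
from collections import Counter, defaultdict

def train(corpus: str) -> dict:
    """Training phase: ingest data and build internal 'knowledge'
    (here, counts of which character tends to follow which)."""
    counts = defaultdict(Counter)
    for a, b in zip(corpus, corpus[1:]):
        counts[a][b] += 1
    return counts

def infer(model: dict, prompt: str, length: int = 2) -> str:
    """Inference phase: apply what was learned to a new prompt,
    greedily predicting the most likely next character."""
    out = prompt
    for _ in range(length):
        followers = model.get(out[-1])
        if not followers:
            break  # never saw this character during training
        out += followers.most_common(1)[0][0]
    return out

model = train("the cat sat on the mat ")  # training: learn from data once
print(infer(model, "th", length=2))       # inference: answer a new prompt -> "the "
```

Training is done once, up front, over the whole dataset; inference runs every time a user sends a prompt, which is why its per-query cost dominates at scale.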
Between the lines: Groq, founded in 2016 by Jonathan Ross, is unrelated to Elon Musk's xAI chatbot, also called Grok.
- Ross, along with Groq's president Sunny Madra and other employees, will join Nvidia, according to a statement on Groq's website.
- Groq will continue to operate independently.
- The Nvidia/Groq deal is a "non-exclusive inference technology licensing agreement" that looks suspiciously like an acquisition, or at least an acquihire. The structure could "keep the fiction of competition alive," Bernstein Research analyst Stacy Rasgon wrote in a note to clients, according to CNBC.
- This kind of deal is a familiar one, used not only to try to skirt antitrust scrutiny but also to bring on sought-after AI talent. In similar deals, Microsoft lured Mustafa Suleyman, co-founder of DeepMind, and Google brought back Noam Shazeer, a co-inventor of the Transformer, the T in GPT.
- Aside from founding Groq, new Nvidia employee Ross invented Google's Tensor Processing Unit (TPU).

What we're watching: AI's future hinges on whether companies can afford to deploy what they've already built.