
blackle on May 6, 2024 | on: Using a LLM to compress text

My understanding is that minimizing perplexity (what LLMs are generally optimized for) is equivalent to finding a good probability distribution over the next token.
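
To make that concrete, here is a minimal Python sketch (my own illustration, not code from the linked post). It assumes we already have the probability a model assigned to each token that actually came next; the function names `perplexity` and `ideal_bits` and the example probabilities are made up for the demo. Perplexity is the exponential of the average negative log-probability, and the same log-probabilities give the size an ideal entropy coder (e.g. arithmetic coding) would approach.

```python
import math

def perplexity(probs):
    """Perplexity from the probabilities assigned to the tokens that actually occurred."""
    # Average negative log-likelihood (cross-entropy in nats), then exponentiate.
    avg_nll = -sum(math.log(p) for p in probs) / len(probs)
    return math.exp(avg_nll)

def ideal_bits(probs):
    """Total code length in bits that an ideal entropy coder would approach."""
    return -sum(math.log2(p) for p in probs)

# Hypothetical per-token probabilities from two models on the same 4-token text.
sharp = [0.5, 0.5, 0.5, 0.5]      # better next-token distribution
vague = [0.25, 0.25, 0.25, 0.25]  # worse next-token distribution

for name, probs in [("sharp", sharp), ("vague", vague)]:
    print(name, perplexity(probs), ideal_bits(probs))
```

The model with the lower perplexity is also the one that needs fewer bits (4 versus 8 here), since the ideal code length equals the token count times log2 of the perplexity. That is why a better next-token distribution should translate directly into better compression.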