Name: Multimodal in Practice
Availability: In stock

Adding images and audio to a system is not a parameter change. It is a new set of failure modes. This deep dive covers what multimodal models actually perceive, where they hallucinate differently, and the evaluation that catches those failures.

Vision and audio change the failure modes, not just the inputs. What breaks when the model has to see and hear.

This edition is free to read on the site. Each chapter has its own URL, so readers can bookmark, share, and return to the exact section they need.