lux-tts is a voice-cloning text-to-speech model that generates natural-sounding 48 kHz speech from input text and a reference voice sample. It uses a distilled 4-step architecture for fast inference, which makes it practical for real-time applications.
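
Whether inference is fast enough for real-time use on a given machine can be checked with the real-time factor (RTF): synthesis time divided by the duration of the generated audio, where a value below 1.0 means speech is produced faster than it plays back. The sketch below is only the arithmetic at a 48 kHz sample rate. It does not call lux-tts; the function names are hypothetical and the example numbers are illustrative, not measurements of the model.

```python
SAMPLE_RATE_HZ = 48_000  # output sample rate stated in the description


def audio_duration_seconds(num_samples: int,
                           sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Duration in seconds of a mono clip containing num_samples samples."""
    return num_samples / sample_rate_hz


def real_time_factor(synthesis_seconds: float,
                     num_samples: int,
                     sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Synthesis time divided by audio duration.

    A result below 1.0 means the audio was generated faster than real time.
    """
    duration = audio_duration_seconds(num_samples, sample_rate_hz)
    if duration <= 0:
        raise ValueError("num_samples must be positive")
    return synthesis_seconds / duration


# Hypothetical numbers for illustration only (not lux-tts benchmarks):
# 5 seconds of 48 kHz audio is 240,000 samples.
example_samples = 5 * SAMPLE_RATE_HZ
# If that clip took 1.0 s of wall-clock time to synthesize, RTF = 1.0 / 5.0.
example_rtf = real_time_factor(1.0, example_samples)
```

In practice, `synthesis_seconds` would come from timing the actual synthesis call (for example with `time.perf_counter()`), and `num_samples` from the length of the returned waveform.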