docs/source/en/tasks/text-to-speech.md
43 additions & 8 deletions
@@ -22,16 +22,14 @@ Text-to-speech (TTS) is the task of creating natural-sounding speech from text,
 languages and for multiple speakers. Several text-to-speech models are currently available in 🤗 Transformers, such as [Dia](../model_doc/dia), [CSM](../model_doc/csm),
 [Bark](../model_doc/bark), [MMS](../model_doc/mms), [VITS](../model_doc/vits) and [SpeechT5](../model_doc/speecht5).

-You can easily generate audio using the `"text-to-audio"` pipeline (or its alias - `"text-to-speech"`). Some models, like Dia,
-can also be conditioned to generate non-verbal communications such as laughing, sighing and crying, or even add music.
-Here's an example of how you would use the `"text-to-speech"` pipeline with Dia:
+You can easily generate audio using the `"text-to-audio"` pipeline (or its alias - `"text-to-speech"`).
+Here's an example of how you would use the `"text-to-speech"` pipeline with [CSM](https://huggingface.co/sesame/csm-1b):
+... {"role": "0", "content": [{"type": "text", "text": "How much money can you spend?"}]},
+... ]
+>>> output = pipe(conversation)
+```
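The conversation passed to the pipeline above is a list of turns, each carrying a speaker `role` and a list of typed `content` parts. A minimal sketch of building that structure, assuming only the shape visible in the example; `make_turn` is a hypothetical helper for illustration and not part of Transformers:

```python
def make_turn(speaker_id, text):
    """Build one turn in the role/content shape used in the example above.

    Hypothetical helper for illustration; not a Transformers API.
    """
    return {"role": str(speaker_id), "content": [{"type": "text", "text": text}]}


# A conversation is a plain list of such turns.
conversation = [
    make_turn(0, "How much money can you spend?"),
]
```

The resulting list has the same form as the `conversation` passed to `pipe(conversation)` in the example.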
+
+Some models, like [Dia](https://huggingface.co/nari-labs/Dia-1.6B-0626), can also be conditioned to generate non-verbal communications such as laughing, sighing and crying, or even add music. Below is such an example:
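A minimal sketch of such a prompt, assuming Dia's convention of `[S1]`/`[S2]` speaker tags with non-verbal cues written inline in parentheses, such as `(laughs)`. `build_dia_script` and `synthesize` are hypothetical helpers introduced here for illustration; the model is only downloaded if `synthesize` is actually called:

```python
def build_dia_script(turns):
    """Join (speaker, text) pairs into a single Dia-style script string.

    Assumes [S1]/[S2] speaker tags; non-verbal cues such as "(laughs)"
    are written inline in the text. Hypothetical helper.
    """
    return " ".join(f"[S{speaker}] {text}" for speaker, text in turns)


def synthesize(script, model="nari-labs/Dia-1.6B-0626"):
    """Run the text-to-speech pipeline on the script (downloads the model)."""
    # Imported lazily so building the script does not require transformers.
    from transformers import pipeline

    pipe = pipeline("text-to-speech", model=model)
    return pipe(script)


script = build_dia_script([
    (1, "(clears throat) Hello! How are you?"),
    (2, "I'm good, thanks! (laughs) How about you?"),
])
```

Calling `synthesize(script)` would then run the pipeline on the tagged text; which cues are honored depends on the model.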