Description
Given a small set of high-quality, multi-turn conversations (e.g., 100 customer support chats), generate 1,000 new, realistic, and high-quality conversations.
This is critical for training chatbots and customer service models, which are starved for high-quality, diverse training data.
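One possible shape for the generation pipeline is few-shot prompting: sample a handful of the seed conversations as in-context examples and ask the model for one new conversation at a time, parsing each generation as JSON. The sketch below assumes this approach; `call_llm` is a hypothetical placeholder (stubbed here so the code runs end to end), not a real API, and should be swapped for an actual provider client.

```python
import json
import random

def call_llm(prompt: str) -> str:
    # Placeholder for a real LLM call; stubbed with a canned response
    # so this sketch is runnable without any provider SDK.
    return json.dumps([
        {"role": "user", "content": "My order hasn't arrived yet."},
        {"role": "agent", "content": "I'm sorry to hear that. Could you share your order number?"},
    ])

def build_prompt(seed_chats, n_examples=3):
    """Few-shot prompt: sample seed conversations as in-context examples."""
    examples = random.sample(seed_chats, min(n_examples, len(seed_chats)))
    shots = "\n\n".join(json.dumps(c) for c in examples)
    return (
        "Here are example customer support conversations:\n\n"
        f"{shots}\n\n"
        "Write one NEW conversation in the same JSON format, on a different topic."
    )

def generate_conversations(seed_chats, target=1000):
    """Generate `target` conversations, discarding malformed generations."""
    out = []
    while len(out) < target:
        raw = call_llm(build_prompt(seed_chats))
        try:
            chat = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip outputs that are not valid JSON
        out.append(chat)
    return out

seed = [[{"role": "user", "content": "I was double charged."},
         {"role": "agent", "content": "Let me look into that refund for you."}]]
chats = generate_conversations(seed, target=5)
print(len(chats))  # → 5
```

Sampling a fresh subset of seed examples per call is one cheap lever against mode collapse, since each prompt steers the model toward a different region of the seed data.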
Constraints
- Maintaining Coherence: each generated conversation must be logical from turn to turn. The "agent" and "user" must respond to what the other just said. The LLM can't just generate two good "user" lines and two good "agent" lines; the turns must flow.
- Semantic Diversity (Avoiding "Mode Collapse"): the LLM will tend to find a few "safe" or common conversational paths and generate 1,000 minor variations of them. The challenge is to generate truly different conversations, covering a wide range of topics, user intents, and emotional tones (e.g., angry users, confused users, happy users).
- Adherence to Persona and Goals: the generated "agent" must consistently follow its script, persona, or rules. The "user" must have a clear and consistent intent from the beginning of the chat to the end.
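The constraints above can at least be spot-checked programmatically. Below is a minimal sketch (plain Python, no external libraries, all names illustrative) of two crude validators: a turn-alternation check as a coherence proxy, and a token-overlap (Jaccard) near-duplicate detector as a mode-collapse proxy. Real filtering would likely use embeddings and an LLM judge instead, but these show the shape of the checks.

```python
def alternates_correctly(chat):
    """Coherence sanity check: roles must strictly alternate, starting with the user."""
    roles = [turn["role"] for turn in chat]
    if not roles or roles[0] != "user":
        return False
    return all(a != b for a, b in zip(roles, roles[1:]))

def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two texts (1.0 = identical vocabulary)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

def flag_near_duplicates(chats, threshold=0.8):
    """Crude mode-collapse detector: flag pairs of generated conversations
    whose flattened-text token overlap exceeds the threshold."""
    texts = [" ".join(turn["content"] for turn in c) for c in chats]
    flagged = []
    for i in range(len(texts)):
        for j in range(i + 1, len(texts)):
            if jaccard(texts[i], texts[j]) >= threshold:
                flagged.append((i, j))
    return flagged

# Two identical conversations get flagged as a near-duplicate pair.
same = [{"role": "user", "content": "refund please"}]
print(flag_near_duplicates([same, same]))  # → [(0, 1)]
```

Persona adherence is harder to check lexically; a common approach is to have a second LLM grade each generated conversation against the agent's rules, which is omitted here.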
Before starting work on this issue, suggest a solution and wait for approval.