From 20f405042c7a16a7d993e32a67ed4511ec488e07 Mon Sep 17 00:00:00 2001
From: mvansegbroeck <67658125+mvansegbroeck@users.noreply.github.com>
Date: Wed, 12 Jun 2024 13:38:55 +0200
Subject: [PATCH] Update README.md with Gretel's Synthetic Multilingual Prompts Dataset

Inspired by the valuable work in this repository and dataset, Gretel has
created a synthetic multilingual dataset of prompts. This dataset, released
under the permissive Apache 2.0 license, includes 1,250 synthetic LLM prompts
generated using Gretel Navigator, available in seven different languages. It
is designed to be used with LLMs to generate diverse and multilingual
responses based on the provided prompts.

We believe this contribution will further enhance and support the efforts
made here by providing additional resources for the community to explore,
utilize, and build upon.
---
 README.md | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/README.md b/README.md
index 3a0463d..746e9f1 100644
--- a/README.md
+++ b/README.md
@@ -96,6 +96,17 @@ The _unofficial_ ChatGPT desktop application provides a convenient way to access
 
 # Prompts
 
+## Synthetic Multilingual LLM Prompts
+Contributed by: [Gretel](https://gretel.ai)
+Reference: [https://huggingface.co/datasets/gretelai/synthetic_multilingual_llm_prompts](https://huggingface.co/datasets/gretelai/synthetic_multilingual_llm_prompts)
+
+- The "Synthetic Multilingual LLM Prompts" dataset features 1,250 synthetic LLM prompts generated using [Gretel Navigator](https://gretel.ai/navigator), available in seven different languages.
+- To ensure accuracy, diversity, and maintain translation quality and consistency, we employed Gretel Navigator both as a generation tool and in an LLM-as-a-judge approach. More info can be found in the [README](https://huggingface.co/datasets/gretelai/synthetic_multilingual_llm_prompts).
+- This dataset is designed to be used with LLMs to generate diverse and multilingual responses based on the provided prompts.
+- The dataset includes prompts in English, Dutch, French, Spanish, German, Brazilian Portuguese, and Simplified Chinese. Detailed evaluations of translation quality are available for each language.
+- This dataset is released under the Apache 2.0 license, making it open for public use with proper attribution. We invite the community to explore, utilize, and contribute to this dataset to enhance the versatility and richness of LLM interactions.
+- Disclaimer: The translations and overall quality of this dataset are generated synthetically and have not been perfected by human review. As a result, inaccuracies may be present.
+
 ## ChatGPT SEO prompts
 Contributed by: [StoryChief AI](https://www.storychief.io/ai-power-mode)
 Reference: [https://storychief.io/blog/chatgpt-prompts-seo](https://storychief.io/blog/chatgpt-prompts-seo)