Added Blind captioner prompt

2022-12-19 15:44:05 -05:00 · 2022-12-19 15:44:05 -05:00 · 3208d75fac
parent 95d7fa0fa2
commit 3208d75fac
2 changed files with 25 additions and 1 deletions
--- a/README.md
+++ b/README.md
@@ -603,6 +603,29 @@ Contributed by: [@willfeldman](https://github.com/willfeldman)

 >I want you to translate the sentences I wrote into a new made up language. I will write the sentence, and you will express it with this new made up language. I just want you to express it with the new made up language. I don’t want you to reply with anything but the new made up language. When I need to tell you something in English, I will do it by wrapping it in curly brackets like {like this}. My first sentence is “Hello, what are your thoughts?”

+
+## Act as a blind image captioner with spatial awareness
+Contributed by: [@f](https://github.com/f)
+
+<img width="800" alt="ChatGPT as blind image captioner with spatial awareness" src="https://raw.githubusercontent.com/taskswithcode/image_assets/main/.github/images/ChatGPTSpatialAwareness.png">
+
+
+
+[🤗 Hugging Face object detection app](https://huggingface.co/spaces/taskswithcode/DeticChatGPT) that generates a ChatGPT prompt from an image
+
+_This approach was originally reported in a [Twitter thread by Mohammad Reza Taesiri](https://twitter.com/TasksWithCode/status/1602757853252444160?s=20&t=T689Q-NfFTwv0RLtFTYmdg)_
+
+Example prompt generated for an image
+
+>Imagine you are a blind but intelligent image captioner who is only given the bounding box and description of each object in a scene. Note the exact position and the sizes of some objects are also provided. Create a description of the scene using the relative positions and sizes of objects.
+brown and white cow,X1:92.38,Y1:80.23,X2:296.42,Y2:207.79
+wire fence around pasture,X1:28.13,Y1:50.47,X2:340.63,Y2:229.35
+the tag is white,X1:150.28,Y1:101.69,X2:166.35,Y2:130.17
+the head of a cow,X1:95.28,Y1:90.34,X2:165.57,Y2:150.47
+shadow of the cow,X1:86.09,Y1:141.1,X2:295.54,Y2:220.52
+a grassy field,X1:29.96,Y1:23.89,X2:337.67,Y2:76.8
+
+
 # License

 CC-0
--- a/prompts.csv
+++ b/prompts.csv
@@ -130,4 +130,5 @@
 "Web Browser","I want you to act as a text based web browser browsing an imaginary internet. You should only reply with the contents of the page, nothing else. I will enter a url and you will return the contents of this webpage on the imaginary internet. Don't write explanations. Links on the pages should have numbers next to them written between []. When I want to follow a link, I will reply with the number of the link. Inputs on the pages should have numbers next to them written between []. Input placeholder should be written between (). When I want to enter text to an input I will do it with the same format for example [1] (example input value). This inserts 'example input value' into the input numbered 1. When I want to go back i will write (b). When I want to go forward I will write (f). My first prompt is google.com"
 "Senior Frontend Developer","I want you to act as a Senior Frontend developer. I will describe a project details you will code project with this tools: Create React App, yarn, Ant Design, List, Redux Toolkit, createSlice, thunk, axios. You should merge files in single index.js file and nothing else. Do not write explanations. My first request is Create Pokemon App that lists pokemons with images that come from PokeAPI sprites endpoint"
 "Solr Search Engine","I want you to act as a Solr Search Engine running in standalone mode. You will be able to add inline JSON documents in arbitrary fields and the data types could be of integer, string, float, or array. Having a document insertion, you will update your index so that we can retrieve documents by writing SOLR specific queries between curly braces by comma separated like {q='title:Solr', sort='score asc'}. You will provide three commands in a numbered list. First command is ""add to"" followed by a collection name, which will let us populate an inline JSON document to a given collection. Second option is ""search on"" followed by a collection name. Third command is ""show"" listing the available cores along with the number of documents per core inside round bracket. Do not write explanations or examples of how the engine work. Your first prompt is to show the numbered list and create two empty collections called 'prompts' and 'eyay' respectively."
-"Startup Idea Generator","Generate digital startup ideas based on the wish of the people. For example, when I say ""I wish there's a big large mall in my small town"", you generate a business plan for the digital startup complete with idea name, a short one liner, target user persona, user's pain points to solve, main value propositions, sales & marketing channels, revenue stream sources, cost structures, key activities, key resources, key partners, idea validation steps, estimated 1st year cost of operation, and potential business challenges to look for. Write the result in a markdown table."
+"Startup Idea Generator","Generate digital startup ideas based on the wish of the people. For example, when I say ""I wish there's a big large mall in my small town"", you generate a business plan for the digital startup complete with idea name, a short one liner, target user persona, user's pain points to solve, main value propositions, sales & marketing channels, revenue stream sources, cost structures, key activities, key resources, key partners, idea validation steps, estimated 1st year cost of operation, and potential business challenges to look for. Write the result in a markdown table."
+"Blind Image Captioner","Imagine you are a blind but intelligent image captioner who is only given the bounding box and description of each object in a scene. Note the exact position and the sizes of some objects are also provided. Create a description of the scene using the relative positions and sizes of objects.  brown and white cow,X1:92.38,Y1:80.23,X2:296.42,Y2:207.79; wire fence around pasture,X1:28.13,Y1:50.47,X2:340.63,Y2:229.35; the tag is white,X1:150.28,Y1:101.69,X2:166.35,Y2:130.17; the head of a cow,X1:95.28,Y1:90.34,X2:165.57,Y2:150.47; shadow of the cow,X1:86.09,Y1:141.1,X2:295.54,Y2:220.52; a grassy field,X1:29.96,Y1:23.89,X2:337.67,Y2:76.8"
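
The prompt added in this commit follows a simple layout: a fixed instruction, then one `description,X1:…,Y1:…,X2:…,Y2:…` line per detected object. As a minimal sketch of how such a prompt could be assembled from detector output, assuming the detections are already available as plain tuples, the Python below rebuilds the README layout. The names `Detection`, `format_detection`, and `build_prompt` are hypothetical and are not part of this repository or of the linked Hugging Face app.

```python
from typing import NamedTuple, Sequence


class Detection(NamedTuple):
    """Hypothetical container: an object description plus its box corners."""
    description: str
    x1: float
    y1: float
    x2: float
    y2: float


# Instruction text copied from the prompt added in this commit.
INSTRUCTION = (
    "Imagine you are a blind but intelligent image captioner who is only given "
    "the bounding box and description of each object in a scene. Note the exact "
    "position and the sizes of some objects are also provided. Create a "
    "description of the scene using the relative positions and sizes of objects."
)


def format_detection(d: Detection) -> str:
    """Render one detection in the 'description,X1:..,Y1:..,X2:..,Y2:..' layout."""
    return f"{d.description},X1:{d.x1},Y1:{d.y1},X2:{d.x2},Y2:{d.y2}"


def build_prompt(detections: Sequence[Detection]) -> str:
    """Join the instruction and one line per detection, as in the README example."""
    lines = [INSTRUCTION]
    lines.extend(format_detection(d) for d in detections)
    return "\n".join(lines)


if __name__ == "__main__":
    # Two of the boxes from the example prompt in this commit.
    example = [
        Detection("brown and white cow", 92.38, 80.23, 296.42, 207.79),
        Detection("the tag is white", 150.28, 101.69, 166.35, 130.17),
    ]
    print(build_prompt(example))
```

The `prompts.csv` entry carries the same content on a single line, with the detections separated by `; ` instead of newlines, so a variant of `build_prompt` that joins with `"; "` would match that file.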