Image-Related Use Cases

The image-related use cases generate text-based AI content for images and include prompts that generate output for:

  • Full text descriptions

  • Text extracted from the image

  • Alt-text that describes the image

The AI can then analyze that output to generate text for:

  • Determining the type of product in the image

  • Determining the usage of the product in the image

  • Generating SEO keywords

An individual image can be associated with multiple prompts, and each prompt can be used for any number of different images. Prompts are executed by triggering a relevant image, which can be done in a variety of ways, such as from a step within a workflow, during an import, or from an event processor. Additionally, multiple images can be triggered at once if they are included in a collection when initiating a bulk update. Run a bulk update only once the prompts have been finalized via prompt engineering (see below).
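
The many-to-many relationship between images and prompts, and the effect of triggering several images at once, can be pictured with a small sketch. This is illustrative Python only, not the product's API: the names `links`, `link`, and `trigger`, and all IDs, are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical in-memory model: image ID -> IDs of the prompts linked to it.
links: Dict[str, Set[str]] = {}


def link(image_id: str, prompt_id: str) -> None:
    """Associate a prompt with an image; both can be reused freely."""
    links.setdefault(image_id, set()).add(prompt_id)


def trigger(image_ids: List[str]) -> List[Tuple[str, str]]:
    """Return one (image, prompt) execution per linked pair.

    Passing several IDs stands in for triggering a whole collection
    in a bulk update.
    """
    return [
        (image_id, prompt_id)
        for image_id in image_ids
        for prompt_id in sorted(links.get(image_id, set()))
    ]


link("img-1", "full-description")
link("img-1", "alt-text")
link("img-2", "alt-text")
runs = trigger(["img-1", "img-2"])
```

Here the same prompt is linked to two images, and one image carries two prompts, so triggering both images yields three executions.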

Once the image is triggered and an output is generated by the AI, it is saved in the dedicated prompt output attribute on the relevant image. Next, the image is pushed into the Generative AI Review workflow to review the prompt output. The user can customize which image workflow or workflow step the image is pushed to when the output is generated.

Image-related prompts

There are four different use case categories, each with their own prompt(s).

For generating content using internal image binary to text, three different prompts are available:

  • Prompt - IBBASET64TT - Full Description generates text that describes the image based on a visual interpretation of the internal image.

  • Prompt - IBBASET64TT - Extract Texts extracts all text within the image based on a visual interpretation of the internal image.

  • Prompt - IBBASET64TT - Alt-Text generates alt-text for the image based on a visual interpretation of the internal image.

For generating content using external image binary to text, three different prompts are available:

  • Prompt - IBURLTT - Full Description generates text that describes the image based on a visual interpretation of the external image.

  • Prompt - IBURLTT - Extract Texts extracts all text within the image based on a visual interpretation of the external image.

  • Prompt - IBURLTT - Alt-Text generates alt-text for the image based on a visual interpretation of the external image.

To analyze the content and generate a text-to-text AI output, three different prompts are available:

  • Prompt - IMETATT - Product Type determines the type of product in the image based on the image data output.

  • Prompt - IMETATT - Product Usage determines the usage of the product in the image based on the image data output.

  • Prompt - IMETATT - SEO Keywords generates SEO keywords for the image based on the image data output and the data of the products referencing the image.

These prompts can be executed in sequence with those of the previous two categories, so that the output of one prompt is used as the input of the next.
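
Sequential execution, where an image-to-text output feeds a text-to-text prompt, can be sketched as follows. This is a minimal illustration, assuming each prompt is a function from one text value to another; `run_in_sequence`, `describe`, and `product_type` are hypothetical stand-ins, not the solution's business rules or a real AI call.

```python
from typing import Callable, Dict, List, Tuple

# (input key, output key, stand-in for the AI call). All names are hypothetical.
Step = Tuple[str, str, Callable[[str], str]]


def run_in_sequence(image_ref: str, steps: List[Step]) -> Dict[str, str]:
    """Run the steps in order; each step may read any earlier output."""
    values: Dict[str, str] = {"image": image_ref}
    for input_key, output_key, model in steps:
        values[output_key] = model(values[input_key])
    return values


# Stand-in functions; a real prompt would call the AI service instead.
def describe(image_ref: str) -> str:
    return "description of " + image_ref


def product_type(description: str) -> str:
    return "product type from [" + description + "]"


result = run_in_sequence(
    "img-1",
    [
        ("image", "full_description", describe),
        ("full_description", "product_type", product_type),
    ],
)
```

The second step never sees the image itself; it works only on the text the first step produced, which is why the order of the steps matters.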

Prompt engineering

A prompt is defined by its input, such as relevant attributes, instructions, and the products that reference the prompt. Prompt input may include:

  • Parameters

  • Sort order for sequential prompt execution

  • Image metadata attributes, focus attributes, and product attributes

  • Persona, target audience, and message tone

  • Mandatory output attributes and maximum number of characters
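
As a rough picture of how the inputs listed above fit together, the sketch below groups them into one record. The `PromptInput` class, its field names, and the two helper functions are assumptions made for illustration; they do not reflect how the solution stores prompts.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PromptInput:
    """Hypothetical container for the inputs that define one prompt."""
    prompt_id: str
    instructions: str
    sort_order: int = 0            # position in sequential execution
    persona: str = ""
    target_audience: str = ""
    message_tone: str = ""
    input_attributes: List[str] = field(default_factory=list)
    mandatory_output_attributes: List[str] = field(default_factory=list)
    max_characters: int = 0        # 0 means "no limit" in this sketch


def execution_order(prompts: List[PromptInput]) -> List[str]:
    """Return prompt IDs in ascending sort order."""
    return [p.prompt_id for p in sorted(prompts, key=lambda p: p.sort_order)]


def within_limit(prompt: PromptInput, output: str) -> bool:
    """Check a generated output against the prompt's character limit."""
    return prompt.max_characters == 0 or len(output) <= prompt.max_characters


seo = PromptInput("seo-keywords", "List SEO keywords.", sort_order=2)
alt = PromptInput("alt-text", "Write alt-text.", sort_order=1, max_characters=20)
order = execution_order([seo, alt])
```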

The most straightforward way to create a new prompt is to copy an existing one, duplicate output and target attributes for it, add them to the relevant List of Values (LOV), and then edit the new prompt.

The image input and output attributes used in each prompt get their selectable values from an LOV. The easiest method for creating a new input / output attribute is to duplicate an existing one and add the ID for the new image input / output attribute and the name of the LOV.
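
The copy-then-edit approach described above can be pictured with the following sketch. It assumes, purely for illustration, that a prompt is a plain dictionary and that an LOV is a list of attribute IDs; `copy_prompt` and every ID shown are hypothetical.

```python
import copy
from typing import Dict, List

# Hypothetical LOV holding the selectable output attribute IDs.
output_attribute_lov: List[str] = ["AltTextOutput"]

# Hypothetical existing prompt used as the template.
existing_prompt: Dict[str, str] = {
    "id": "AltText",
    "instructions": "Write alt-text for the image.",
    "output_attribute": "AltTextOutput",
}


def copy_prompt(source: Dict[str, str], new_id: str,
                new_output_attribute: str, lov: List[str]) -> Dict[str, str]:
    """Copy a prompt, point it at a new output attribute,
    and register that attribute in the LOV so it becomes selectable."""
    new_prompt = copy.deepcopy(source)
    new_prompt["id"] = new_id
    new_prompt["output_attribute"] = new_output_attribute
    if new_output_attribute not in lov:
        lov.append(new_output_attribute)
    return new_prompt


short_alt = copy_prompt(existing_prompt, "AltTextShort",
                        "AltTextShortOutput", output_attribute_lov)
# Edit the copy only after it has its own ID and attributes.
short_alt["instructions"] = "Write alt-text of at most 80 characters."
```

Because the copy is independent, editing its instructions leaves the original prompt untouched.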

Automating the execution of prompts

By default, a prompt is executed to generate an output either by manually selecting the Generate AI Output button or by running a relevant bulk update. In either case, a business rule is executed and an event is generated in the event processor. To automate prompt execution, reuse the code of the business rule that generates events for the event processor in an import, an existing workflow, or another existing process or business rule.
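
The idea of reusing one event-generating routine from several entry points can be sketched as below. This is a generic Python illustration, not the solution's business rule code: the queue, `generate_output_event`, `on_import`, `on_workflow_step`, and `process_events` are all hypothetical names.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

# Hypothetical stand-in for the event processor's queue.
event_queue: Deque[Dict[str, str]] = deque()


def generate_output_event(image_id: str, source: str) -> None:
    """Shared routine: queue one 'generate AI output' event for an image."""
    event_queue.append({"image": image_id, "source": source})


def on_import(image_ids: List[str]) -> None:
    """Entry point 1: called for images created or updated by an import."""
    for image_id in image_ids:
        generate_output_event(image_id, "import")


def on_workflow_step(image_id: str) -> None:
    """Entry point 2: called when an image reaches a workflow step."""
    generate_output_event(image_id, "workflow")


def process_events(handler: Callable[[Dict[str, str]], None]) -> int:
    """Drain the queue, passing each event to the handler; return the count."""
    handled = 0
    while event_queue:
        handler(event_queue.popleft())
        handled += 1
    return handled


on_import(["img-1", "img-2"])
on_workflow_step("img-3")
seen: List[str] = []
count = process_events(lambda event: seen.append(event["image"]))
```

Keeping the event creation in one routine means every entry point produces the same kind of event, so the processing side does not need to know how the image was triggered.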

Embed into existing workflow

By default, after the AI output is generated, the solution initiates images into a workflow called GenAI Review Workflow – Image Texts. If desired, this can be changed to another existing workflow or workflow state. To do so, edit the business rule GenAI - Text for Images – Generate Output and change the variable “aiReviewWorkflow” to the ID of the desired workflow. To initiate the image into a specific state of the workflow, change the code under “Initiate in Ai Review or Error Workflow”.
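
The kind of decision that such a code section makes can be pictured as follows. This sketch is not the business rule itself: only the variable name `aiReviewWorkflow` comes from the text above (written here as `ai_review_workflow`), while the workflow IDs, state IDs, and the rule that an empty output goes to an error workflow are assumptions.

```python
from typing import Optional, Tuple

# Counterpart of the "aiReviewWorkflow" variable; the ID value is hypothetical.
ai_review_workflow = "MyImageReviewWorkflow"
ai_review_state = "Review"          # hypothetical state ID

# Assumed error target, inferred from the section name only.
error_workflow = "MyImageErrorWorkflow"
error_state = "Error"


def workflow_target(output: Optional[str]) -> Tuple[str, str]:
    """Pick the (workflow ID, state ID) an image is initiated into.

    Assumption: a missing or empty output counts as an error.
    """
    if output:
        return (ai_review_workflow, ai_review_state)
    return (error_workflow, error_state)


review_target = workflow_target("A red kettle on a wooden table.")
error_target = workflow_target(None)
```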

Automatically copy output to target attribute

By default, the business rule responsible for copying the generated output to the target attribute on the image or product is GenAI – Text for Images – Copy To Target. If desired, the output can be copied to a different target attribute by enhancing the business rule with, for example, a lookup table that matches output and target attributes.
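
A lookup table of this kind can be as simple as a mapping from output attribute ID to target attribute ID. The sketch below is a generic Python illustration under that assumption; `OUTPUT_TO_TARGET`, `copy_to_target`, and all attribute IDs are hypothetical and do not come from the business rule.

```python
from typing import Dict

# Hypothetical lookup table: output attribute ID -> target attribute ID.
OUTPUT_TO_TARGET: Dict[str, str] = {
    "AltTextOutput": "AltText",
    "SeoKeywordsOutput": "SeoKeywords",
}


def copy_to_target(attributes: Dict[str, str],
                   lookup: Dict[str, str] = OUTPUT_TO_TARGET) -> Dict[str, str]:
    """Return a copy of the attribute values with each non-empty output
    copied to the target attribute named in the lookup table."""
    updated = dict(attributes)
    for output_attribute, target_attribute in lookup.items():
        value = attributes.get(output_attribute)
        if value:  # skip outputs that are missing or empty
            updated[target_attribute] = value
    return updated


image_values = {"AltTextOutput": "A red kettle", "SeoKeywordsOutput": ""}
copied = copy_to_target(image_values)
```

Adding a new output and target pair then only requires a new entry in the table, not a change to the copy logic.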