Automating Image Metadata in Sitecore Content Hub with GPT-4 Vision

Sergey Yatsenko
Sitecore Technology MVP & Sr. Director

Managing metadata by hand for large volumes of digital assets in Sitecore Content Hub is time-consuming. Content authors have to write a title and a description for each image individually, which is... such a slow way to do it in the age of AI :).

In this post I describe an approach I used to automate image metadata generation in Sitecore Content Hub with OpenAI's GPT-4 Vision model. The model analyzes each image and produces a descriptive title and a detailed description, which are then written back to the asset, so authors no longer have to fill in these fields one image at a time.


Solution: AI-Powered Metadata Generation

The solution uses a client application that:

  1. Iterates through assets in Content Hub
  2. Retrieves the original rendition of each image
  3. Uses OpenAI's vision model for image analysis
  4. Updates the metadata in Content Hub


Process Flow

[Figure: AI-powered content tagging process flow]


Implementation

The implementation uses the Sitecore Content Hub JavaScript SDK for Content Hub API interactions and OpenAI's GPT-4 Vision model for image analysis.

const { ContentHubClient } = require("@sitecore/content-hub-one-sdk");
const { Configuration, OpenAIApi } = require("openai");

// Initialize Content Hub client with your credentials
const contentHubClient = new ContentHubClient({
  clientId: "YOUR_CLIENT_ID",
  clientSecret: "YOUR_CLIENT_SECRET",
  authEndpoint: "YOUR_AUTH_ENDPOINT",
  apiEndpoint: "YOUR_API_ENDPOINT",
});

// Initialize OpenAI client with your API key
const configuration = new Configuration({
  apiKey: process.env.OPENAI_API_KEY,
});
const openai = new OpenAIApi(configuration);

// Function to fetch assets from Content Hub
async function getAssets() {
  try {
    const assets = await contentHubClient.assets.getAll({
      limit: 100, // Fetches 100 assets at a time, adjust as needed
      fields: ["id", "name", "renditions"], // Specify fields to retrieve
    });
    return assets;
  } catch (error) {
    console.error("Error fetching assets:", error);
    throw error;
  }
}

// Function to get the original rendition URL of an asset
async function getOriginalRendition(assetId) {
  try {
    const renditions = await contentHubClient.assets.getRenditions(assetId);
    const originalRendition = renditions.find((r) => r.name === "original");
    return originalRendition ? originalRendition.downloadUrl : null;
  } catch (error) {
    console.error(`Error getting original rendition for asset ${assetId}:`, error);
    return null;
  }
}

// Function to update asset metadata in Content Hub
async function updateAssetMetadata(assetId, title, description) {
  try {
    await contentHubClient.assets.update(assetId, {
      properties: {
        title: { values: [{ value: title }] },
        description: { values: [{ value: description }] },
      },
    });
    console.log(`Updated metadata for asset ${assetId}`);
  } catch (error) {
    console.error(`Error updating metadata for asset ${assetId}:`, error);
    throw error;
  }
}

// Main function to process assets with AI
async function processAssetsWithAI() {
  const assets = await getAssets();

  for (const asset of assets) {
    const originalRenditionUrl = await getOriginalRendition(asset.id);
    if (!originalRenditionUrl) {
      console.log(`No original rendition found for asset ${asset.id}`);
      continue;
    }

    try {
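      // Use OpenAI's GPT-4 Vision to analyze the image.
      // Vision requests go through the chat completions endpoint
      // (createChatCompletion in the v3 "openai" package imported above).
      const response = await openai.createChatCompletion({
        model: "gpt-4-vision-preview",
        max_tokens: 300, // Set explicitly so longer descriptions are not cut short
        messages: [
          {
            role: "user",
            content: [
              { type: "text", text: "Describe this image in detail." },
              // The documented shape of image_url is an object with a url property
              { type: "image_url", image_url: { url: originalRenditionUrl } },
            ],
          },
        ],
      });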

      // Extract description and generate title from AI response
      const description = response.data.choices[0].message.content;
      const title = description.split(".")[0]; // Use first sentence as title

      // Update asset metadata with AI-generated title and description
      await updateAssetMetadata(asset.id, title, description);
    } catch (error) {
      // Use asset.id here: assetId is not defined in this scope
      console.error(`Error processing asset ${asset.id}:`, error);
    }
  }
}

// Execute the main function and handle any errors
processAssetsWithAI().catch(console.error);
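
Taking the first sentence with split(".") works for simple descriptions, but it breaks on text such as "a 3.5 inch screen" and can produce very long titles. Below is a minimal sketch of a more forgiving helper. The name deriveTitle, the 80-character limit, and the "Untitled image" fallback are my own assumptions for illustration, not part of the Content Hub or OpenAI SDKs.

```javascript
// Hypothetical helper (not part of any SDK): derive a short, single-line
// title from the model's description.
function deriveTitle(description, maxLength = 80) {
  // Collapse whitespace so multi-line responses yield a single-line title
  const text = String(description || "").replace(/\s+/g, " ").trim();
  if (!text) return "Untitled image"; // assumed fallback value

  // First sentence = text up to the first ., ! or ? that is followed by
  // whitespace or the end of the text (so "3.5" does not end the sentence)
  const match = text.match(/^(.+?)[.!?](?:\s|$)/);
  let title = match ? match[1] : text;

  // Truncate on a word boundary when the sentence is too long
  if (title.length > maxLength) {
    const cut = title.slice(0, maxLength);
    const lastSpace = cut.lastIndexOf(" ");
    title = (lastSpace > 0 ? cut.slice(0, lastSpace) : cut) + "...";
  }
  return title;
}
```

In processAssetsWithAI, the line that computes title could then call deriveTitle(description) instead of splitting on the first period.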


Conclusion

This solution demonstrates how to leverage AI capabilities to automate image metadata generation in Sitecore Content Hub, significantly enhancing the efficiency of asset management. By integrating OpenAI's GPT-4 Vision model, we've created a powerful tool that can automatically generate descriptive titles and detailed descriptions for images.

The approach outlined here can be further extended to enrich asset metadata in various ways:

  1. Tagging: The AI model could be prompted to generate relevant tags based on the image content, making asset discovery easier.
  2. Taxonomies: By analyzing the image and its generated description, the system could categorize assets into predefined taxonomies, improving content organization.
  3. Sentiment Analysis: AI could be used to determine the mood or emotion conveyed by the image, adding another layer of searchability.
  4. Object Detection: Specific objects or elements within the image could be identified and added as metadata.
  5. Color Analysis: The dominant colors in the image could be extracted and added as metadata, useful for brand consistency and design purposes.
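
One way to approach these extensions is to ask the model for structured output and validate it before writing anything back to Content Hub. The sketch below is only an illustration of that idea: the category list, the JSON shape, and the function names buildEnrichmentPrompt and parseEnrichment are all assumptions of mine, and a real taxonomy would come from your own Content Hub schema.

```javascript
// Hypothetical category list; in practice this would mirror a taxonomy
// defined in your Content Hub schema.
const ALLOWED_CATEGORIES = ["people", "product", "landscape", "architecture", "food"];

// Build a prompt that asks the model for machine-readable output
function buildEnrichmentPrompt(categories = ALLOWED_CATEGORIES) {
  return [
    "Analyze this image and reply with a single JSON object only, shaped like:",
    '{"tags": ["..."], "category": "...", "mood": "...", "objects": ["..."], "dominantColors": ["..."]}',
    `Choose "category" from this list: ${categories.join(", ")}.`,
  ].join("\n");
}

// Normalize a list of strings: trim, lowercase, drop empties and duplicates
function cleanList(value) {
  if (!Array.isArray(value)) return [];
  const cleaned = value
    .filter((item) => typeof item === "string")
    .map((item) => item.trim().toLowerCase())
    .filter(Boolean);
  return [...new Set(cleaned)];
}

// Parse the model's reply defensively; returns null when no JSON object is found
function parseEnrichment(rawContent, categories = ALLOWED_CATEGORIES) {
  const text = String(rawContent ?? "");
  // The model may add prose around the JSON, so extract the outermost braces
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  if (start === -1 || end <= start) return null;

  let parsed;
  try {
    parsed = JSON.parse(text.slice(start, end + 1));
  } catch {
    return null;
  }

  const category =
    typeof parsed.category === "string" ? parsed.category.trim().toLowerCase() : null;
  return {
    tags: cleanList(parsed.tags),
    // Only accept categories that exist in the predefined taxonomy
    category: categories.includes(category) ? category : null,
    mood: typeof parsed.mood === "string" ? parsed.mood.trim() : null,
    objects: cleanList(parsed.objects),
    dominantColors: cleanList(parsed.dominantColors),
  };
}
```

The text returned by buildEnrichmentPrompt() would replace "Describe this image in detail." in the request, and parseEnrichment would be applied to the message content of the response. Rejecting categories outside the predefined list keeps the model from inventing taxonomy values.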

These extensions to the metadata generation process are explored in more detail in another blog post, which delves into advanced AI-driven metadata enrichment techniques for digital asset management.

By implementing such AI-powered solutions, organizations can dramatically improve their asset management workflows, ensuring that their digital assets are well-described, easily searchable, and primed for optimal use across various channels and campaigns.


Results and Considerations

Key points to consider when implementing this solution:

  1. API costs: Monitor OpenAI API usage for cost management
  2. Rate limiting: Account for rate limits on both Content Hub and OpenAI APIs
  3. Error handling: Implement robust error handling and retries
  4. Batch processing: Consider batch processing for large asset libraries
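
The rate limiting, retry, and batching points can be sketched with a few small helpers. This is a rough illustration under stated assumptions rather than a production implementation: withRetry, chunk, and processInBatches are hypothetical names, the delays and batch size are arbitrary defaults, and I assume the HTTP status is exposed as error.status or error.response.status (errors without a status are treated as transient network failures).

```javascript
// Small delay helper used for backoff and for pausing between batches
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry an async operation with exponential backoff.
// Assumption: HTTP errors carry the status on error.status or error.response.status.
async function withRetry(operation, { retries = 3, baseDelayMs = 500 } = {}) {
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await operation();
    } catch (error) {
      const status = error?.status ?? error?.response?.status;
      // Retry rate-limit (429) and server (5xx) errors, plus errors with no status
      const retryable = status === undefined || status === 429 || status >= 500;
      if (!retryable || attempt >= retries) throw error;
      await sleep(baseDelayMs * 2 ** attempt); // delay doubles on each attempt
    }
  }
}

// Split an array into batches of at most `size` items
function chunk(items, size) {
  if (size < 1) throw new RangeError("size must be at least 1");
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Run `worker` over all items, one batch at a time, pausing between batches.
// Promise.allSettled keeps one failed asset from aborting the rest of the batch.
async function processInBatches(items, worker, { batchSize = 5, pauseMs = 1000 } = {}) {
  const batches = chunk(items, batchSize);
  const results = [];
  for (let i = 0; i < batches.length; i += 1) {
    const settled = await Promise.allSettled(batches[i].map((item) => worker(item)));
    results.push(...settled);
    if (i < batches.length - 1) await sleep(pauseMs);
  }
  return results;
}
```

For example, the OpenAI request could be wrapped as withRetry(() => openai.createChatCompletion(request)), and the asset loop could become processInBatches(assets, processOneAsset), where processOneAsset is a hypothetical function holding the per-asset logic. Counting the settled results also gives a rough number of API calls to check against cost.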

