Thursday 15 February 2024

Generate images using Azure OpenAI DALL·E 3 in SPFx

Dall E 3 is the latest AI image generation model coming out of OpenAI. It is leaps and bounds ahead of the previous model Dall E 2. Having explored both, the image quality as well as the adherence to text prompts is much better for Dall E 3. It is now available as a preview in Azure OpenAI Service as well.

Given all this, it is safe to say if you are working on the Microsoft stack and want to generate images with AI, using the Azure OpenAI Dall E 3 model would be the recommended option.

In this post, let's explore the image generation API for Dall E 3 and also how to use it from a SharePoint Framework (SPFx) solution. The full code of the solution is available on GitHub:

First, let's build the web api which will wrap the Azure OpenAI API to create images. This will be a simple ASP.NET Core Web API which will accept a text prompt and return the generated image to the client.

To run this code, we will need the following NuGet package:

Now for calling the API, we will use a standard React based SPFx webpart. The webpart will use Fluent UI controls to grab the text prompt from user and send it to our API.

Hope this helps!

Thursday 25 January 2024

Get structured JSON data back from GPT-4-Turbo

With the latest gpt-4-turbo model out recently, there is one very helpful feature which came with it: The JSON mode option. Using JSON mode, we are able to predictably get responses back from OpenAI in structured JSON format. 

This can help immensely when building APIs using Large Language Models (LLMs). Even though the model can be instructed to return JSON in it's system prompt, previously, there was no guarantee that the model would return valid JSON. With the JSON mode option now, we can specify the required format and the model will return data according to it. 

To know more about JSON mode, have a look at the official OpenAI docs:

Now let's look at some code to see how this works in action:

I am using the Azure OpenAI service to host the gpt-4-turbo model and I am also using the v1.0.0.-beta.12 version of the Azure OpenAI .NET SDK found on NuGet here:

What is happening in the code is that in the system message, we are instructing the LLM that analyse the text provided by the user and then extract the cities mentioned in this text and return them in the specified JSON format. 

Also important is line 22 where we explicitly specify to use the response format as JSON.   

Next, we provide the actually text to parse in the user message. 

Once we get the data back in expected JSON schema, we are able to convert it to objects which can be used in code.

And as expected we get the following output:

Hope this helps!