Dynamic Language Translation with Chatbots and Azure Cognitive Services

Tags: Azure, Bots

Chatbots are a hot topic these days. The available toolsets have enabled many businesses to integrate chatbot technology into their enterprise systems, saving time and money. One challenge development teams face when building a globally deployed chatbot is language translation. Because a chat conversation is dynamic, conventional app localization is difficult at best.

What a global chatbot needs is a way to detect the language of the user’s incoming message, translate it to the language your bot understands, and then translate the bot’s response back to the user’s original language. All of this needs to happen dynamically, on the fly. The Translator Text API in Microsoft Cognitive Services provides just the tools to accomplish this.

In this tutorial, we’ll be using the Microsoft Bot Builder framework v3, the Bot Framework Emulator, and a Translator Text API resource in Azure. We’ll assemble these resources into a functioning echo bot that translates the text you enter. I invite you to clone a copy of the completed sample solution from GitHub to use as a reference.

The Bot Application

We’ll start with the Visual Studio bot application template, which can be downloaded here. Create a new project from that template, give it a name, and press ‘OK’. This creates a basic echo bot that echoes back any text typed in. When you’re done with this tutorial, the updated chatbot application will detect the language of the incoming text, translate it to English so the bot can process it, and finally respond in the user’s language.

Azure Cognitive Service

To access Azure Cognitive Services, you’ll need an active Azure subscription. You can get a trial Azure account here. Cognitive Services encompass a wide range of features including speech, vision, and language processing. You can read more about these features here.

To perform language detection and translation, we need to use the Translator Text API. It can perform automatic language detection, translation, transliteration, and bilingual dictionary lookups. It supports over 60 languages, and Microsoft continues to add more. A full list of supported languages can be found here. To create a translator resource, log in to the Azure portal and click the “+ Create a Resource” button in the top-left corner. Search for ‘Translate Text’, click the ‘Translator Text’ resource type, and then click ‘Create’.

In the Translator Text Create blade, give your service a name and select the Azure subscription to associate with the resource. Select F0 for the pricing tier; this is the free tier and allows up to 2 million characters to be translated per month. Create a new resource group or select an existing one. For a new resource group, pick ‘East US 2’ as the location. Finally, click the ‘Create’ button and Azure will create the new API resource.

Once Azure finishes creating the Translator Text resource, go to the resource overview and click on the ‘Keys’ tab. Here you’ll find the keys you need to access the API from your code. Copy either ‘Key 1’ or ‘Key 2’ and store it somewhere. We’ll use it later in the tutorial.

The Translator Text API

For this tutorial, we’ll be using three methods of the Translator Text API, version 3.0. Version 2.0 is still available, but it’s scheduled to be discontinued in 2019.

  • Languages - Supported Languages

    This API method returns the languages currently supported by the other Translator Text API methods. We’ll use it to look up the names and native names of the languages being used.

  • Detect - Language Detection

    This API method takes your input text, or an array of input texts, and detects the language of each. If a text element could be in more than one language, a score between zero and one is returned for each possible alternative, reflecting the confidence in that result. In this tutorial, we’ll detect the incoming language and store it in the conversation state. We’ll use that information later when building our bot output.

  • Translate - Language Translation

    This API method takes your input text, or an array of input texts, and translates it to one or more languages. If no source language is provided, the API will attempt to detect the language on its own. The API can also translate well-formed HTML and will correctly ignore tags and other non-text elements. This method is the backbone of our solution and will be used to translate text back and forth between English and the input language.

These functions are all that we’ll need to detect and translate between multiple languages. When writing a chatbot that could potentially be used anywhere around the world, they’ll give your app the support it needs to communicate with any user in their native language.

First, Some Setup

The first thing we want to do is add some app settings to web.config: one for the Translator Text API subscription key and one for the API endpoint. Note that, at the time of writing, the endpoint shown in the Azure portal is for version 2.0 of the API. This tutorial uses version 3.0, so the endpoint you want is https://api.cognitive.microsofttranslator.com. Go ahead and add the following entries to the appSettings section of web.config. This is the only web.config change we need to make for the tutorial.

<add key="trns:APIKey" value="REPLACE_WITH_YOUR_SUBSCRIPTION_KEY" />
<add key="trns:APIEndpoint" value="https://api.cognitive.microsofttranslator.com" />

The Translator Code

I added a service class to call the Text Translator API, named “LanguageUtilities”. I also created an interface from the class and will be using dependency injection to inject it into the other parts of the bot that need to use it. For this demo, those will be MessageController.cs, RootDialog.cs and a piece I’ll talk about later. There is also a read-only property for the default language. This will be “en” for English for this demo.

public interface ILanguageUtilities
{
	string DefaultLanguage { get; }
	Task<T> SupportedLanguagesAsync<T>();
	Task<T> DetectInputLanguageAsync<T>(string inputText);
	Task<T> TranslateTextAsync<T>(string inputText, string outputLanguage);
}

Let’s take a closer look at LanguageUtilities.cs, the implementation of ILanguageUtilities. You’ll notice that there is a single private method, ExecuteAPI. All of the methods in the Translator Text API work in a similar manner, so I created ExecuteAPI to act as the single point that all calls go through. This method uses generics to define the return type: each API call returns JSON, and ExecuteAPI deserializes it into the generic type. Other than the use of generics, this method is a straightforward HTTP GET or POST request.

private async Task<T> ExecuteAPI<T>(string apiPath, string bodyText)
{
	// Calls that send text (Detect, Translate) are POSTs with a JSON array body;
	// calls without text (Languages) are GETs with no body.
	string requestBody = String.Empty;
	if (!String.IsNullOrEmpty(bodyText))
	{
		System.Object[] body = new System.Object[] { new { Text = bodyText } };
		requestBody = JsonConvert.SerializeObject(body);
	}

	string apiKey = ConfigurationManager.AppSettings["trns:APIKey"];
	string url = ConfigurationManager.AppSettings["trns:APIEndpoint"];
	var uri = new Uri($"{url}/{apiPath}");

	using (var client = new HttpClient())
	using (var request = new HttpRequestMessage())
	{
		request.Method = !String.IsNullOrEmpty(requestBody) ? HttpMethod.Post : HttpMethod.Get;
		request.RequestUri = uri;
		request.Content = !String.IsNullOrEmpty(requestBody) ? new StringContent(requestBody, Encoding.UTF8, "application/json") : null;
		request.Headers.Add("Ocp-Apim-Subscription-Key", apiKey);

		var response = await client.SendAsync(request);
		// Fail fast on HTTP errors (for example, a bad key) instead of trying to
		// deserialize an error payload into T.
		response.EnsureSuccessStatusCode();
		var responseBody = await response.Content.ReadAsStringAsync();

		// Deserialize the JSON response into the type the caller asked for.
		return JsonConvert.DeserializeObject<T>(responseBody);
	}
}
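ExecuteAPI joins the endpoint setting and the API path with a single slash, so a trailing slash on the value in web.config would produce a double slash in the request URI. Here is a minimal sketch of a more forgiving join. TranslatorEndpoint and Combine are hypothetical names of mine, not part of the sample solution.

```csharp
using System;

// Hypothetical helper, not part of the sample solution.
public static class TranslatorEndpoint
{
    // Joins the configured endpoint and a relative API path with exactly one
    // slash between them, whether or not either side already has one.
    public static Uri Combine(string endpoint, string apiPath)
    {
        if (string.IsNullOrWhiteSpace(endpoint))
        {
            throw new ArgumentException("The endpoint setting is missing.", nameof(endpoint));
        }

        var path = (apiPath ?? string.Empty).TrimStart('/');
        return new Uri(endpoint.TrimEnd('/') + "/" + path);
    }
}
```

With this in place, ExecuteAPI could build its URI with TranslatorEndpoint.Combine(url, apiPath), and a missing endpoint setting would fail with a clear message.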

The public methods, which implement the interface, each correspond to a specific API call. They only supply the API path to ExecuteAPI and pass along the generic return type. I generated C# classes to reflect the return values of the different API calls. When a method is called, the expected return type is supplied as the type argument, which lets one method return different types. The website json to csharp is a great way to save some time if you want to quickly convert a JSON object to C# classes. The Translator Text API methods all share one common query string parameter: ‘api-version’, which must be set to ‘3.0’.

public async Task<T> SupportedLanguagesAsync<T>()
{
	var path = $"languages?api-version=3.0&scope=translation";
	return await ExecuteAPI<T>(path, String.Empty);
}

SupportedLanguagesAsync() is an HTTP GET request and has one additional parameter, ‘scope’. This defines the group or groups of languages to return. We are only working with the translation function in this tutorial, so it is set to ‘scope=translation’. The other options are ‘transliteration’ and ‘dictionary’. If you leave the scope parameter out, all three groups are returned.
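If you ever need more than one group, my understanding of the API is that the scope value takes a comma-separated list. A small sketch of a path builder follows. LanguagesPath and Build are hypothetical names of mine, and the set of valid scopes is limited to the three named above.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the sample solution.
public static class LanguagesPath
{
    // The three scope values named in this article.
    private static readonly string[] KnownScopes = { "translation", "transliteration", "dictionary" };

    // Builds the relative path for the Languages call. With no scopes the
    // parameter is omitted, which asks the service for every group.
    public static string Build(params string[] scopes)
    {
        var path = "languages?api-version=3.0";
        if (scopes == null || scopes.Length == 0)
        {
            return path;
        }

        var unknown = scopes.FirstOrDefault(s => !KnownScopes.Contains(s));
        if (unknown != null)
        {
            throw new ArgumentException($"Unknown scope '{unknown}'.", nameof(scopes));
        }

        // Assumption: multiple scopes are sent as one comma-separated value.
        return path + "&scope=" + string.Join(",", scopes);
    }
}
```

Calling LanguagesPath.Build("translation") gives the same path that SupportedLanguagesAsync() uses.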

public async Task<T> DetectInputLanguageAsync<T>(string inputText)
{
	var path = $"detect?api-version=3.0";
	return await ExecuteAPI<T>(path, inputText);
}

DetectInputLanguageAsync() is an HTTP POST request and has no additional parameters. However, since it’s a POST, it does need body content. The body is a JSON array, and each element in the array is a JSON object with a string property named ‘Text’. The value of that property is the text analyzed by language detection. The JSON body should look something like this: [{"Text": "Lorem ipsum dolor sit amet, consectetur adipiscing elit."}]. There are some limits on how big the request can be. Please refer to the online API documentation for the specifics.
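ExecuteAPI only ever sends a single element, but the same body shape works for several strings at once. Here is a rough sketch of building that array with Json.NET. DetectBody is a hypothetical name of mine, and nothing here checks the request size limits mentioned above.

```csharp
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json;

// Hypothetical helper, not part of the sample solution.
public static class DetectBody
{
    // Wraps each input string in an object with a single "Text" property
    // and serializes the whole set as one JSON array.
    public static string Build(IEnumerable<string> texts)
    {
        var elements = texts.Select(t => new { Text = t }).ToArray();
        return JsonConvert.SerializeObject(elements);
    }
}
```

For the inputs "Hello" and "Bonjour", this produces [{"Text":"Hello"},{"Text":"Bonjour"}].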

public async Task<T> TranslateTextAsync<T>(string inputText, string outputLanguage)
{
	var path = $"translate?api-version=3.0&to={outputLanguage}&includeSentenceLength=true";
	return await ExecuteAPI<T>(path, inputText);
}

TranslateTextAsync() is an HTTP POST request. It has several additional query parameters, but only one of them is required. The ‘to’ parameter defines the language to translate the text to, and it can be repeated to translate to more than one language. You don’t need to supply the language you’re translating from; the translate method will try to detect it for you. If you choose to supply it, the parameter name is ‘from’. If you need to send a block or page of HTML to the API, you don’t want the entire contents translated, because that would include the HTML tags and the results would be unusable. To tell the API that you’re sending HTML, use the optional parameter ‘textType’ with a value of ‘html’. The other possible value is ‘plain’, which is the default if you leave this parameter out. There are several other optional parameters; again, please refer to the API documentation to learn about those.

This is also a POST request, so a body must be included. The body for this request is similar to the Detect API method and is a JSON array. Each array element is a JSON object with a string property named ‘Text’, which represents the string to translate.
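To make the query parameters concrete, here is a sketch of a path builder that covers ‘to’, ‘from’, and ‘textType’. TranslatePath and Build are hypothetical names of mine, and the sample solution’s TranslateTextAsync() only ever sets a single ‘to’ value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the sample solution.
public static class TranslatePath
{
    // Builds the relative path for the Translate call.
    //   toLanguages  - one or more target language codes (required)
    //   fromLanguage - optional source language; omit to let the API detect it
    //   isHtml       - true adds textType=html; false leaves the default (plain)
    public static string Build(IEnumerable<string> toLanguages, string fromLanguage = null, bool isHtml = false)
    {
        var targets = (toLanguages ?? Enumerable.Empty<string>()).ToList();
        if (targets.Count == 0)
        {
            throw new ArgumentException("At least one target language is required.", nameof(toLanguages));
        }

        var parts = new List<string> { "api-version=3.0" };
        // The 'to' parameter is repeated once per target language.
        parts.AddRange(targets.Select(t => "to=" + Uri.EscapeDataString(t)));
        if (!string.IsNullOrEmpty(fromLanguage))
        {
            parts.Add("from=" + Uri.EscapeDataString(fromLanguage));
        }
        if (isHtml)
        {
            parts.Add("textType=html");
        }

        return "translate?" + string.Join("&", parts);
    }
}
```

For example, TranslatePath.Build(new[] { "de", "it" }) yields translate?api-version=3.0&to=de&to=it.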

The Bot Code

Now that we have built a library to access the Translator Text API, we can begin looking at how to tie it in with the bot application. One of the key concepts of Microsoft’s bot framework is that a bot application starts as nothing more than a Web API controller with a single Post() method. All messages coming into the bot go through this method first, which gives us the perfect place to capture, detect, and translate the incoming text.

In the API controller’s Post() method, the code takes the activity text and calls DetectInputLanguageAsync() to detect the language of the incoming text.

var msgLanguage = await _languageUtilities.DetectInputLanguageAsync<List<AltLanguageDetectResult>>(activity.Text);
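The snippets that follow use an outputLanguage variable holding the detected ISO code. As a rough sketch of how that code could be pulled out of the detection results, the helper below picks the highest-scoring entry and falls back to the default language. The member names on AltLanguageDetectResult are my assumption, mirroring the ‘language’ and ‘score’ fields of the Detect response, and DetectedLanguage.Pick is a hypothetical name.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shape: lowercase members mirror the JSON field names in the
// Detect response. The class in the sample solution may differ.
public class AltLanguageDetectResult
{
    public string language { get; set; }
    public double score { get; set; }
}

// Hypothetical helper, not part of the sample solution.
public static class DetectedLanguage
{
    // Returns the language code with the highest confidence score, or
    // defaultLanguage when nothing usable came back.
    public static string Pick(IList<AltLanguageDetectResult> results, string defaultLanguage)
    {
        var best = results?
            .Where(r => r != null && !string.IsNullOrEmpty(r.language))
            .OrderByDescending(r => r.score)
            .FirstOrDefault();

        return best?.language ?? defaultLanguage;
    }
}
```

In the controller this could be used as var outputLanguage = DetectedLanguage.Pick(msgLanguage, _languageUtilities.DefaultLanguage);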

Now that we know the incoming language, we need to store it in the conversation state, so it will persist from call to call. We’ll do this using a BotDataStore. For this tutorial, an in-memory store is used but you can use any data store.

var userData = await _botDataStore.LoadAsync(key, BotStoreType.BotPrivateConversationData, CancellationToken.None);

var storedLangugageCode = userData.GetProperty<string>("ISOLanguageCode");
storedLangugageCode = storedLangugageCode ?? _languageUtilities.DefaultLanguage;

if (!storedLangugageCode.Equals(outputLanguage))
{
    userData.SetProperty("ISOLanguageCode", outputLanguage);
    await _botDataStore.SaveAsync(key, BotStoreType.BotPrivateConversationData, userData, CancellationToken.None);
    await _botDataStore.FlushAsync(key, CancellationToken.None);
}

Finally, we translate the incoming text to the bot’s default language, which is English in this tutorial:

if (!msgLanguage.Equals(_languageUtilities.DefaultLanguage))
    translatedObj = await _languageUtilities.TranslateTextAsync<List<AltLanguageTranslateResult>>(activity.Text,
        _languageUtilities.DefaultLanguage);

Now, replace the original incoming text with the translated text.

activity.Text = translatedObj[0].translations[0].text;
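Indexing straight into translatedObj[0].translations[0] assumes the call succeeded and returned at least one translation. Here is a minimal sketch of a more defensive read that falls back to the original text. The class members are my assumption, mirroring the ‘translations’, ‘text’, and ‘to’ fields of the Translate response, and AltTranslation and TranslationText are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shapes: lowercase members mirror the JSON field names in the
// Translate response. The classes in the sample solution may differ.
public class AltTranslation
{
    public string text { get; set; }
    public string to { get; set; }
}

public class AltLanguageTranslateResult
{
    public List<AltTranslation> translations { get; set; }
}

// Hypothetical helper, not part of the sample solution.
public static class TranslationText
{
    // Returns the first translated string, or originalText when the result
    // is null, empty, or has no translation text.
    public static string FirstOrOriginal(IList<AltLanguageTranslateResult> results, string originalText)
    {
        var text = results?.FirstOrDefault()?.translations?.FirstOrDefault()?.text;
        return string.IsNullOrEmpty(text) ? originalText : text;
    }
}
```

The assignment then becomes activity.Text = TranslationText.FirstOrOriginal(translatedObj, activity.Text); which leaves the user’s text untouched if no translation is available.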

You can see that this code is straightforward, but there is a lot happening. For a single user input, the language of the text is detected, that language is stored, and the text is translated to the bot’s default language.

Now, how do you handle the bot response? You could translate all bot responses just prior to calling context.PostAsync() but then you would have to find all those places in your code and remember to add it anytime you added new responses. Just like the incoming message from the user, we need to find a single place to handle all messages outgoing from the bot.

Unfortunately, unlike the API controller’s Post() method, there is nothing in the project solution that provides a single, central place for us to add translation logic. However, there is a way to do it, and that way is IMessageActivityMapper. To fully understand what this interface does requires an understanding of the dialog internals and how messages are processed from user to bot to user. That’s a bit out of scope for this article. The short explanation is that when you send a message, it gets processed by multiple implementations of the IBotToUser interface. One of those implementations is the MapToChannelData_BotToUser class. This class lets you supply an implementation of IMessageActivityMapper, which it calls before any message is sent to the user. This gives you the opportunity to do any additional work, such as translating your bot’s response. This is just what we need.

Look in the Utilities folder again and open ‘TranslatorMessageActivityMapper’. The first thing to notice is that we are again using dependency injection to inject the language utility and bot data store instances, but the important thing to look at is the implementation of the Map() method.

public IMessageActivity Map(IMessageActivity message)
{
	Task<string> translation = Task<String>.Factory.StartNew(() =>
	{
		// The store key was created from a user-to-bot activity, so for this
		// bot-to-user message we rebuild it with the user and bot IDs swapped.
		var key = Address.FromActivity(message);
		var userKey = new Address(key.UserId, key.ChannelId, key.BotId, key.ConversationId, key.ServiceUrl);
		var userData = _botDataStore.LoadAsync(userKey, BotStoreType.BotPrivateConversationData, CancellationToken.None).Result;

		var storedLangugageCode = userData.GetProperty<string>("ISOLanguageCode");
		storedLangugageCode = storedLangugageCode ?? _languageUtilities.DefaultLanguage;

		var translatedText = _languageUtilities.TranslateTextAsync<List<AltLanguageTranslateResult>>(message.Text,storedLangugageCode).Result;
		return translatedText[0].translations[0].text;
	});
	message.Text = translation.Result;

	return message;
}

The Map() method takes the outgoing bot-to-user message text and translates it to the user language we stored earlier. Note that when we create the storage key, we must rearrange the values. This is because we are now working with a bot-to-user activity, whereas in the API controller we were dealing with a user-to-bot activity, so we need to manipulate the Address values to get the right bot data storage key. Once we have the language to translate to, we take the original message.Text, translate it, and then update message.Text with the translated text. There it is. We are now translating all outgoing messages in one central location.

The last thing to look at is RootDialog.cs, located in the Dialogs folder. The MessageReceivedAsync() method does a little more processing of the incoming user text. When you run the bot application in Visual Studio, set a breakpoint on this method, then send something in a language other than English. When the code stops on your breakpoint, look at the value of activity.Text. You’ll see that your input has already been translated to English. Remember, this happens in the API controller’s Post() method. The code then requests the list of supported languages, so we can map the ISO language codes to language names. Finally, we construct a reply, in English, which is then sent by the bot framework via the context.PostAsync() method. Because we have an implementation of IMessageActivityMapper waiting, this text gets translated to the user’s language before being sent back to the chat client.

private async Task MessageReceivedAsync(IDialogContext context, IAwaitable<object> result)
{
	var activity = await result as Activity;
	context.PrivateConversationData.TryGetValue<string>("ISOLanguageCode", out string lang);

	var languages = await _languageUtilities.SupportedLanguagesAsync<JObject>();
	var englishText = $"You sent '{activity.Text}', which was originally written in {languages["translation"][lang]["name"].ToString()}";

	await context.PostAsync(englishText);           
	context.Wait(MessageReceivedAsync);
}
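The chained indexers in MessageReceivedAsync() assume the stored code is present in the Languages response. As a rough sketch, a null-safe lookup that also exposes the ‘nativeName’ field could look like the following. LanguageNames and Lookup are hypothetical names of mine, and the JSON shape is my reading of the Languages response for the translation scope.

```csharp
using Newtonsoft.Json.Linq;

// Hypothetical helper, not part of the sample solution.
public static class LanguageNames
{
    // Looks up a field ("name" or "nativeName") for an ISO language code in
    // the Languages response. Falls back to the code itself when the
    // language or the field is missing.
    public static string Lookup(JObject languages, string isoCode, string field = "name")
    {
        if (languages == null || string.IsNullOrEmpty(isoCode))
        {
            return isoCode;
        }

        var value = languages["translation"]?[isoCode]?[field];
        return value?.ToString() ?? isoCode;
    }
}
```

In the dialog, LanguageNames.Lookup(languages, lang) could stand in for languages["translation"][lang]["name"], so an unknown or missing code shows up in the reply instead of throwing.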

Summary

… and we’re done. This tutorial is a small taste of what the Translator Text API can do for your bot application. There are many other APIs available in the Azure Cognitive Services suite, and I encourage you to learn about them. The full project is available on GitHub. Feel free to grab a copy and play around with it. Don’t forget that you need an Azure account to use Cognitive Services. I hope this tutorial has helped you get started with these services and given you some new ideas and inspiration.

About The Author

App Dev Principal Consultant

Edward is a principal application development consultant in Cardinal's Atlanta office. He specializes in cloud development and application modernization but also works in open source and DevOps.