Intelligent applications are all the rage, and I for one was really surprised to see how easily and quickly some basic recognition can be developed using the Windows Runtime. It took me just a few minutes to get a webcam to recognize whether one or more faces are in frame in a Universal Windows App (UWA).

In the following XAML we have a simple Button and a CaptureElement. The CaptureElement is the means by which we stream from a webcam onto a page or form of the UWA. We are then going to use the FaceDetector class (from the Windows.Media.FaceAnalysis namespace) to check how many faces are in the image. So here goes with the XAML:

<CaptureElement x:Name="CameraCaptureElement" HorizontalAlignment="Left" Height="600" 
                Margin="0,0,-240,0" VerticalAlignment="Top" Width="600" />
<Button x:Name="button" Content="Button" HorizontalAlignment="Left" 
        Margin="145,605,0,0" VerticalAlignment="Top" Click="button_Click" />

In the UWA form I set the OnLoad method to complete the following steps:

  • Find all the video capture devices associated with the Windows device.
  • Select the front-facing video device.
  • Use Windows.Media.Capture.MediaCapture to initialize the CaptureElement.

// Find all the video capture devices, and select the one that is front facing
var videoDevices = await DeviceInformation.FindAllAsync(DeviceClass.VideoCapture);
var frontCamera = videoDevices.FirstOrDefault(item => item.EnclosureLocation != null 
                        && item.EnclosureLocation.Panel == Windows.Devices.Enumeration.Panel.Front);

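// External webcams may report no enclosure location, so fall back to the first device found
var camera = frontCamera ?? videoDevices.FirstOrDefault();
if (camera == null) return; // No camera attached, nothing to preview

// Initialize the selected camera
MediaCapture mediaCaptureMgr = new MediaCapture();
await mediaCaptureMgr.InitializeAsync(new MediaCaptureInitializationSettings { VideoDeviceId = camera.Id });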

// Assign the camera to the CaptureElement on the Form
CameraCaptureElement.Source = mediaCaptureMgr;
await mediaCaptureMgr.StartPreviewAsync();
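
One thing the steps above gloss over is that initialization can fail: the app needs the webcam capability declared in Package.appxmanifest, and the user can still deny access. Below is a minimal sketch of a more defensive initialization, assuming C# 6 or later. The names CameraSetup and TryInitializeAsync are my own placeholders and not part of the original code; the MediaCapture members used are from the Windows.Media.Capture namespace.

```csharp
using System;
using System.Threading.Tasks;
using Windows.Media.Capture;

// Hypothetical helper class, not part of the original post.
public static class CameraSetup
{
    // Returns an initialized MediaCapture, or null if access to the camera was denied.
    public static async Task<MediaCapture> TryInitializeAsync(string videoDeviceId)
    {
        var mediaCapture = new MediaCapture();
        try
        {
            await mediaCapture.InitializeAsync(new MediaCaptureInitializationSettings
            {
                VideoDeviceId = videoDeviceId,
                // Request video only, so the microphone is not needed.
                StreamingCaptureMode = StreamingCaptureMode.Video
            });
            return mediaCapture;
        }
        catch (UnauthorizedAccessException)
        {
            // Thrown when the capability is missing or the user denied camera access.
            mediaCapture.Dispose();
            return null;
        }
    }
}
```

In the load handler you would pass in the Id of the selected device and only assign the result to CameraCaptureElement.Source when it is not null.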

Then we can use a simple OnClick event to capture an image from the CaptureElement. Note the steps:

  • Use the MediaCapture class to capture a photo from the webcam into a stream.
  • Convert the stream to a SoftwareBitmap and ensure that the SoftwareBitmap is in a pixel format the detector supports.
  • Use the FaceDetector class to do its thing…

// Grab the image from the CaptureElement into a stream
InMemoryRandomAccessStream stream = new InMemoryRandomAccessStream();
MediaCapture mediaCaptureMgr = (MediaCapture)CameraCaptureElement.Source;
await mediaCaptureMgr.CapturePhotoToStreamAsync(ImageEncodingProperties.CreateJpeg(), stream);

// Get the SoftwareBitmap from the stream and convert it to a supported format
BitmapDecoder decoder = await BitmapDecoder.CreateAsync(stream);
SoftwareBitmap softwareBitmap = await decoder.GetSoftwareBitmapAsync(BitmapPixelFormat.Bgra8, BitmapAlphaMode.Straight);
IReadOnlyList<BitmapPixelFormat> supportedBitmapPixelFormats = FaceDetector.GetSupportedBitmapPixelFormats();
SoftwareBitmap convertedBitmap = SoftwareBitmap.Convert(softwareBitmap, supportedBitmapPixelFormats.First());

// Detect the number of faces
FaceDetector faceDetect = await FaceDetector.CreateAsync();

IList<DetectedFace> faces = await faceDetect.DetectFacesAsync(convertedBitmap);
await new MessageDialog(string.Format("{0} faces detected.",faces.Count)).ShowAsync(); 
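
The detector returns more than a count: each DetectedFace exposes a FaceBox with the position and size of the face in pixels, relative to the bitmap that was passed in. Here is a rough sketch of how the result could be summarized. FaceReport and Describe are hypothetical names I made up for illustration; only DetectedFace.FaceBox and BitmapBounds come from WinRT.

```csharp
using System.Collections.Generic;
using System.Text;
using Windows.Graphics.Imaging;
using Windows.Media.FaceAnalysis;

// Hypothetical helper class, not part of the original post.
public static class FaceReport
{
    // Builds a short text summary: a count line, then one line per face bounding box.
    public static string Describe(IList<DetectedFace> faces)
    {
        var summary = new StringBuilder();
        summary.AppendLine(faces.Count == 1
            ? "1 face detected."
            : string.Format("{0} faces detected.", faces.Count));

        for (int i = 0; i < faces.Count; i++)
        {
            // FaceBox is a BitmapBounds: X, Y, Width and Height in pixels.
            BitmapBounds box = faces[i].FaceBox;
            summary.AppendLine(string.Format("Face {0}: x={1}, y={2}, {3}x{4} px",
                i + 1, box.X, box.Y, box.Width, box.Height));
        }
        return summary.ToString();
    }
}
```

Passing FaceReport.Describe(faces) to the MessageDialog instead of the plain count would show where each face was found.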

The native Face Analysis in WinRT is rather elementary (no expression analysis), but there are a couple of other interesting (albeit limited) namespaces in the same vein. There is speech synthesis, which converts text strings to an audio stream (text-to-speech), and speech recognition for command and control within Windows Runtime apps. Next I want to take a look at Microsoft Cognitive Services (cloud).


