Interest in HoloLens, the mixed reality device from Microsoft, and in digital reality in general, is growing rapidly. A large part of that interest comes from developers wanting to know how to build software for the device. And guess what: it isn't that difficult at all. With some basic C# knowledge and a free copy of Unity 3D you can get started in very little time. The first step is to get comfortable with the five main pillars of mixed reality development. First of all, there are the three key input forms: gaze, gestures and voice.
Gaze works much in the same way as a mouse cursor. With HoloLens, your head movement moves the device's cursor, or your gaze. When you gaze at objects, you can interact with them and have them react to your gaze. It is up to the developer to manage the interaction with the gaze input and decide when an action is appropriate. The framework provides the tools to handle the interaction, but the actual decision of when to enable it is up to you and your design.
A basic code block to handle gaze input could look like this:
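A minimal sketch, following the gaze cursor pattern from Microsoft's Holograms 101 tutorial (the class name WorldCursor is illustrative; the script is assumed to be attached to a cursor object in the scene):

```csharp
using UnityEngine;

public class WorldCursor : MonoBehaviour
{
    private MeshRenderer meshRenderer;

    void Start()
    {
        // Grab the mesh renderer on the cursor object so we can show/hide it
        meshRenderer = gameObject.GetComponentInChildren<MeshRenderer>();
    }

    void Update()
    {
        // Raycast into the world from the user's head position,
        // along the direction they are gazing
        var headPosition = Camera.main.transform.position;
        var gazeDirection = Camera.main.transform.forward;

        RaycastHit hitInfo;
        if (Physics.Raycast(headPosition, gazeDirection, out hitInfo))
        {
            // We are looking at something: show the cursor
            // and snap it onto the surface we hit
            meshRenderer.enabled = true;
            transform.position = hitInfo.point;
            transform.rotation = Quaternion.FromToRotation(Vector3.up, hitInfo.normal);
        }
        else
        {
            // Nothing in the gaze direction: hide the cursor
            meshRenderer.enabled = false;
        }
    }
}
```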
In this case the raycast is what the gaze uses to detect whether any object lies in the gaze direction. If the raycast hits something (we are looking at an object), the cursor is shown on that object. If we aren't looking at anything (the raycast returns false), the cursor is hidden.
Gaze can also be detected, so you can have objects that react to being looked at. This can be really useful if there are hundreds or even thousands of objects in the mixed reality experience, in order to highlight which one the cursor is pointing, or gazing, at.

Gestures
The second form of input is what most people associate with HoloLens, namely gestures. If you have seen any HoloLens demo, you will have seen the "tap" gesture that is prevalent throughout the experience. This gesture forms the base for all other built-in gestures, except for one (the "bloom"). As a developer you can handle all of these gestures easily, as they are exposed by the framework. The gestures are tap, hold, manipulate and navigate. As the HoloLens detects a gesture, it raises events for you to handle. An example in code could look like this:
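A minimal sketch in the style of Microsoft's Holograms 101 tutorial (the class name GazeGestureManager and the "OnSelect" message name are illustrative; older Unity versions use the UnityEngine.VR.WSA.Input namespace instead):

```csharp
using UnityEngine;
using UnityEngine.XR.WSA.Input; // UnityEngine.VR.WSA.Input in older Unity versions

public class GazeGestureManager : MonoBehaviour
{
    // The hologram the user is currently gazing at, if any
    public GameObject FocusedObject { get; private set; }

    private GestureRecognizer recognizer;

    void Start()
    {
        // Subscribe to the tap gesture and forward it as a message
        // to the focused object and its children
        recognizer = new GestureRecognizer();
        recognizer.TappedEvent += (source, tapCount, headRay) =>
        {
            if (FocusedObject != null)
            {
                FocusedObject.BroadcastMessage("OnSelect",
                    SendMessageOptions.DontRequireReceiver);
            }
        };
        recognizer.StartCapturingGestures();
    }

    void Update()
    {
        // Every frame, raycast along the gaze to determine
        // whether the user is looking at something tappable
        var headPosition = Camera.main.transform.position;
        var gazeDirection = Camera.main.transform.forward;

        RaycastHit hitInfo;
        FocusedObject = Physics.Raycast(headPosition, gazeDirection, out hitInfo)
            ? hitInfo.collider.gameObject
            : null;
    }
}
```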
In this example we create a GestureRecognizer object, which handles any gestures you want to subscribe to. In this case, the TappedEvent is handled by sending a message to any child objects in the Unity hierarchy. These objects can then respond to the message as appropriate and react to the gesture in real time. In the above code snippet, we check every frame in the Update() function whether the user is gazing at anything, which they can then tap on.
The last of the three input methods for HoloLens is voice. Using voice commands is very similar to handling gestures, in a code sense. The HoloLens is extremely good at recognizing voice and words, and this makes the implementation very simple.
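A minimal sketch, following the pattern from Microsoft's speech tutorials (the class name SpeechManager and the "Reset world" command are illustrative):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using UnityEngine;
using UnityEngine.Windows.Speech;

public class SpeechManager : MonoBehaviour
{
    private KeywordRecognizer keywordRecognizer;
    private readonly Dictionary<string, Action> keywords = new Dictionary<string, Action>();

    void Start()
    {
        // Each keyword is registered together with the action to invoke.
        // "Reset world" is an illustrative voice command.
        keywords.Add("Reset world", () =>
        {
            // Notify this object and its children that a reset was requested
            BroadcastMessage("OnReset", SendMessageOptions.DontRequireReceiver);
        });

        // Start listening for the registered keywords
        keywordRecognizer = new KeywordRecognizer(keywords.Keys.ToArray());
        keywordRecognizer.OnPhraseRecognized += KeywordRecognizer_OnPhraseRecognized;
        keywordRecognizer.Start();
    }

    private void KeywordRecognizer_OnPhraseRecognized(PhraseRecognizedEventArgs args)
    {
        // Look up the recognized phrase and invoke its action
        Action action;
        if (keywords.TryGetValue(args.text, out action))
        {
            action.Invoke();
        }
    }
}
```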
You use a KeywordRecognizer, which has a very similar structure to the GestureRecognizer. You add a list of keywords the recognizer will listen for, and the KeywordRecognizer_OnPhraseRecognized event handler is triggered whenever a keyword is recognized. In this case each keyword is registered together with a function, which is then invoked when the keyword is heard.
The easy part of implementing voice commands is the code, the hard part is the design of the commands. Creating effective voice commands starts long before the coding, when you decide what the commands are, how they relate to each other and how you educate your users on what they are. In particular, pay attention to these points:
- What actions can be taken through speech?
- Is speech input a good option for completing a task?
- How does a user know when speech input is available?
- Is the app always listening?
- What phrases initiate an action or behavior?
- What is the interaction dialog between app and user?
- Is network connectivity required?
If you give all of these points proper thought, your voice commands are much more likely to be effective and useful.
We have gone through the gaze-gesture-voice paradigm of input for the HoloLens development experience, and although audio isn't strictly speaking an input method, it plays a critical part in a successful mixed reality experience. HoloLens uses head-related transfer functions (HRTFs) to mimic a binaural audio source. This means audio can be spatialized (and in most cases should be) so that it appears as realistic as possible. Digital 3D assets then have natural audio, giving the whole experience a much more immersive and genuine feel.
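A minimal sketch of spatialized collision audio (the class name CollisionSounds and the clip fields are illustrative; the object is assumed to have a Collider and Rigidbody, with a spatializer plugin such as the Microsoft HRTF spatializer enabled in the Unity audio settings):

```csharp
using UnityEngine;

public class CollisionSounds : MonoBehaviour
{
    // Illustrative audio clips, assigned in the Unity Inspector
    public AudioClip impactClip;
    public AudioClip rollingClip;

    private AudioSource audioSource;
    private bool rolling;

    void Start()
    {
        // Configure the audio source for spatial sound
        audioSource = gameObject.AddComponent<AudioSource>();
        audioSource.spatialize = true;    // route through the HRTF spatializer plugin
        audioSource.spatialBlend = 1.0f;  // fully 3D, positioned at the hologram
        audioSource.playOnAwake = false;
    }

    void OnCollisionEnter(Collision collision)
    {
        // Play a one-shot impact sound at the object's position
        audioSource.PlayOneShot(impactClip);
    }

    void OnCollisionStay(Collision collision)
    {
        // While in contact, loop a rolling/sliding sound
        if (!rolling)
        {
            rolling = true;
            audioSource.clip = rollingClip;
            audioSource.loop = true;
            audioSource.Play();
        }
    }

    void OnCollisionExit(Collision collision)
    {
        // Contact ended: stop the looping sound
        rolling = false;
        audioSource.loop = false;
        audioSource.Stop();
    }
}
```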
You register and import the audio clips in Unity, and then you can use the code above with relative ease. You set your audio clip properties, and then handle the system events for collisions: OnCollisionEnter, OnCollisionStay and OnCollisionExit. It is that simple.
Just like with voice commands, the easy part is the code, the hard part is the audio design. There are four parts of audio design to get right to achieve great spatial audio for a HoloLens app:
- Grounding: Just like real objects, you want to be able to hear holograms even when you can't see them, and you want to be able to locate them anywhere around you.
- User Attention: When you want to direct your user's gaze to a particular place, rather than using an arrow to point them visually, placing a sound in that location is a very natural and fast way to guide them.
- Immersion: When objects move or collide, we usually hear those interactions between materials. Spatialized sounds make up the "feel" of a place beyond what we can see.
- Interaction Design: When we press a button in the real world, the sound we hear comes from that button. By spatializing interaction sounds, we again provide a more natural and realistic user experience.
The HoloLens uses four environmental cameras on the front of the device to map the physical surroundings and build up a 3D model of the real world. As you use the device, it continually creates a spatial mapping model of the space you are in and updates any existing mapping. The really cool thing is that the Windows Device Portal, which comes with the developer tools, allows you to see this 3D spatial mapping in real time, and it works for both physical HoloLens devices and the emulator.
Make sure your device is on the same physical network as your computer, then access the Device Portal by entering your device's IP address in any browser. In the Device Portal, go to the 3D View menu option on the left, which brings up an empty plane. Press the Update button and the current state of the 3D mapping model is drawn on the plane as seen below.
Spatial mapping is what makes or breaks the HoloLens and the mixed reality experience you are creating.
Although this is a very short intro to HoloLens development, it gives you an idea of just how approachable the development paradigm is. If you are already a C# developer, you will only have to get familiar with Unity 3D, which, I must admit, can be daunting. However, the reward is instant, and the best bit is that you don't even need a physical HoloLens device to get started. Included in the free tooling is an emulator that is as good as it gets. It gives a great quick impression of what your experience will be like, before you deploy to the real world and real devices.
All code samples are from official Microsoft examples and can be found on their HoloLens Developer Portal.