Categories
ios ios-keyboard-extension

How does Google’s custom iOS keyboard, Gboard, programmatically dismiss the frontmost app?

Google’s custom iOS keyboard, Gboard, has an interesting feature that can’t be accomplished using public APIs in the iOS SDK (as of iOS 10). I’d like to know exactly how Google accomplishes the task of programmatically popping back one app in the App Switching stack in Gboard.

Custom iOS keyboards have two major components: the container app and the keyboard app extension. The keyboard app extension runs in a separate OS process that is started up whenever a user is in any app on their phone that requires text input.

These are the approximate steps that can be followed, using Gboard, to see the effect of programmatically returning to a previous app:

  1. A user starts the Apple Messages app on their iPhone and taps a text field to begin entering text.
  2. The Gboard keyboard extension is launched and the user sees the Gboard custom keyboard (while they are still in the Apple Messages app).
  3. The user taps the microphone key inside the Gboard keyboard extension to do voice-to-text input.
  4. Gboard uses a custom URL scheme to launch the Gboard container app. The Gboard keyboard and Apple Messages app are pushed down one layer in the App stack and the Gboard container app is now the frontmost app in the App stack. The Gboard container app uses the microphone to listen to the user’s speech and translates it into text, which it places onto the screen.
  5. The user taps the “Done” button when they are satisfied with the text input they see on the screen.
  6. This is where the magic happens… as the text input screen is dismissed, the Gboard container app is also dismissed automatically. The Gboard container app goes away and is replaced by the Apple Messages app (sometimes the Gboard keyboard extension process is still alive, sometimes it is relaunched, and sometimes it needs to be relaunched manually by tapping inside a text field). How does Google accomplish this?
  7. Finally, the user sees the text that was just translated inserted automatically inside the text input field. Presumably Google accomplishes this by sharing data between the Gboard container app and the keyboard extension.
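
To make step 7 concrete, here is a minimal sketch of one way the hand-off between the container app and the keyboard extension could work, assuming both targets belong to the same App Group and share a `UserDefaults` suite. I don’t know that Gboard does it this way. The `DictationHandoff` type, the `pendingDictation` key, and the group identifier are all hypothetical names made up for illustration.

```swift
import Foundation

/// Hypothetical helper (not Gboard's actual code): passes dictated text
/// from the container app to the keyboard extension through a shared
/// UserDefaults suite backed by an App Group.
struct DictationHandoff {
    /// Hypothetical key under which the pending text is stored.
    static let key = "pendingDictation"

    let defaults: UserDefaults

    /// `suiteName` is assumed to be an App Group identifier that both the
    /// container app and the extension list in their entitlements.
    init?(suiteName: String) {
        guard let defaults = UserDefaults(suiteName: suiteName) else { return nil }
        self.defaults = defaults
    }

    /// Called by the container app when the user taps "Done".
    func store(_ text: String) {
        defaults.set(text, forKey: DictationHandoff.key)
    }

    /// Called by the keyboard extension when it becomes active again.
    /// Returns the pending text once, then clears it so that it is not
    /// inserted a second time.
    func takePendingText() -> String? {
        guard let text = defaults.string(forKey: DictationHandoff.key) else { return nil }
        defaults.removeObject(forKey: DictationHandoff.key)
        return text
    }
}
```

In the extension, the returned string could then be passed to `textDocumentProxy.insertText(_:)` on the `UIInputViewController`, for example from `viewWillAppear(_:)`. This only covers the data sharing; it does nothing to explain how the container app gets dismissed in step 6.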

I would assume that Google is using private APIs by exploring the status bar’s view hierarchy using Objective-C runtime introspection and somehow synthesizing tap events or calling an exposed target/action. I’ve explored this a little and have been able to find interesting UIView subclasses inside the status bar, like UIStatusBarBreadcrumbItemView, which contains an array of UISystemNavigationActions. I’m continuing to explore these classes in the hope that I can find some way of replicating the user interaction.
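
For reference, the kind of exploration I mean looks roughly like the sketch below. It uses only public UIKit calls to walk a view tree and report runtime class names, so it says nothing about how to obtain the status bar view in the first place or how to trigger the breadcrumb action; those are the parts that would need private API. The function names `dumpHierarchy` and `firstSubview` are my own.

```swift
import UIKit

/// Returns one line per view in the tree rooted at `view`,
/// indented by depth and labelled with the runtime class name.
func dumpHierarchy(of view: UIView, depth: Int = 0) -> [String] {
    let indent = String(repeating: "  ", count: depth)
    let line = indent + NSStringFromClass(type(of: view))
    return [line] + view.subviews.flatMap { dumpHierarchy(of: $0, depth: depth + 1) }
}

/// Depth-first search for the first view whose runtime class name
/// matches `className` exactly (e.g. "UIStatusBarBreadcrumbItemView").
func firstSubview(of root: UIView, className: String) -> UIView? {
    if NSStringFromClass(type(of: root)) == className {
        return root
    }
    for subview in root.subviews {
        if let match = firstSubview(of: subview, className: className) {
            return match
        }
    }
    return nil
}
```

Given some root view, `dumpHierarchy(of:)` can be printed line by line to spot interesting subclasses, and `firstSubview(of:className:)` can then pull out a specific one for further inspection.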

I understand that using private APIs is a good way to get your app submission rejected from the App Store; this isn’t a concern that I’d like to be addressed in the answer. I’m looking primarily for specific answers about exactly how Google accomplishes the task of programmatically popping back one app in the App Switching stack in Gboard.