Developing iOS tweaks: a case study in fetching the system audio buffer from sandboxed processes


Repositories

The projects I'm discussing are available here: audio visualizer and TCP/IP server for audio data.

Introduction: the issue

I'm a relatively new iOS tweak developer. I forked the excellent Mitsuha tweak (an audio visualizer for Music and Spotify) to update it for iOS 11. "Mitsuha for Music" previously used an extremely unstable method of getting audio data to visualize: assigning a new AudioMix to the AVPlayerItem. This was problematic - when the user skipped tracks, the audio player left Mitsuha trying to visualize data from an object that no longer existed (sometimes Music replaced the AudioMix with a new instance, leaving us with dangling pointers), which resulted in a segmentation fault. The Spotify version was not plagued by such issues.

Looking for a solution

This left me searching for a more reliable method. The code in the Spotify version was part of the answer: instead of creating our own AudioMix and MTAudioProcessingTap (and so on), it relied on a clean hook of the AudioUnitRender function. This wasn't possible from within the Music app, though.

I stumbled upon an old call-recording code snippet on StackOverflow - this solved another part of the riddle. I swiftly put together a tweak that hooked into mediaserverd; instead of hooking AudioUnitProcess (which I tried in my first commit for this experiment), I hooked the AudioUnitRender function (seen here). This gave me audio data that isn't influenced by the system media volume setting (at least on most devices; it's definitely the case on my iPhone 7+).
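For the curious, here's a minimal sketch of what such a hook can look like, assuming CydiaSubstrate's MSHookFunction. The buffer handling is illustrative only - process_samples() is a hypothetical sink, not the actual code from the commit:

    #include <AudioToolbox/AudioToolbox.h>
    #include <substrate.h> // CydiaSubstrate, provides MSHookFunction

    // Pointer to the original AudioUnitRender, filled in by MSHookFunction.
    static OSStatus (*orig_AudioUnitRender)(AudioUnit inUnit,
                                            AudioUnitRenderActionFlags *ioActionFlags,
                                            const AudioTimeStamp *inTimeStamp,
                                            UInt32 inOutputBusNumber,
                                            UInt32 inNumberFrames,
                                            AudioBufferList *ioData);

    static OSStatus hooked_AudioUnitRender(AudioUnit inUnit,
                                           AudioUnitRenderActionFlags *ioActionFlags,
                                           const AudioTimeStamp *inTimeStamp,
                                           UInt32 inOutputBusNumber,
                                           UInt32 inNumberFrames,
                                           AudioBufferList *ioData) {
        // Let the system render the audio first...
        OSStatus status = orig_AudioUnitRender(inUnit, ioActionFlags, inTimeStamp,
                                               inOutputBusNumber, inNumberFrames, ioData);
        // ...then peek at the rendered samples.
        if (status == noErr && ioData != NULL) {
            for (UInt32 i = 0; i < ioData->mNumberBuffers; i++) {
                AudioBuffer buf = ioData->mBuffers[i];
                // process_samples(buf.mData, buf.mDataByteSize); // hypothetical sink
            }
        }
        return status;
    }

    // Constructor runs when the dylib is injected into mediaserverd.
    __attribute__((constructor)) static void init(void) {
        MSHookFunction((void *)AudioUnitRender,
                       (void *)hooked_AudioUnitRender,
                       (void **)&orig_AudioUnitRender);
    }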

Getting access to the system-wide audio buffer from a sandboxed process

The last thing I had to achieve was a way to transmit the audio buffer data from mediaserverd to a sandboxed process (the Music app!). I tried everything described on the IPC page of the exceptionally informative iphonedevwiki, and the options that added the least overhead were POSIX (Unix domain) sockets and TCP/IP. POSIX sockets weren't usable on iOS 11 with sandboxed processes as clients, so I chose the latter. The initial implementation can be seen in the same commit I mentioned earlier.
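Connecting from the sandboxed side is just ordinary BSD socket code over loopback. A rough sketch (the port number is made up for illustration; the real tweak picks its own):

    #include <arpa/inet.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    // Illustrative port. Loopback TCP works from the sandbox,
    // while connect()ing to an AF_UNIX path does not.
    #define AUDIO_PORT 44333

    static int connect_to_audio_server(void) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0) return -1;

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(AUDIO_PORT);
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); // 127.0.0.1

        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            close(fd);
            return -1;
        }
        return fd;
    }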

Fixing stability issues that became noticeable while implementing a new feature

Implementing a stable TCP/IP server was not easy. Despite the many Objective-C libraries available for the purpose, I decided to do it the good old-fashioned C way to minimize overhead - keep in mind that we're hooking into a critical system process.
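The skeleton of such a server is plain BSD sockets. A simplified sketch (single listener, loopback only, same illustrative port as before - the real server also has to juggle clients without ever blocking the audio thread):

    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    #define AUDIO_PORT 44333 // same illustrative port as above

    static int start_audio_server(void) {
        int srv = socket(AF_INET, SOCK_STREAM, 0);
        if (srv < 0) return -1;

        // Allow quick restarts without "address already in use" errors.
        int yes = 1;
        setsockopt(srv, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes));

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(AUDIO_PORT);
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); // loopback only

        if (bind(srv, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
            listen(srv, 4) < 0) {
            close(srv);
            return -1;
        }
        return srv; // accept() on this fd from a dedicated thread
    }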

One issue I hit at the beginning was that when I manually killed mediaserverd, the client would crash. This wasn't a problem in the Music app - mediaserverd usually restarted itself automatically, and restarting the app wasn't a big deal for the end user. At that point, though, I was working on a visualizer for the lock screen, which meant hooking into SpringBoard (the iOS UI). When SpringBoard crashes it unloads all apps from memory - a huge potential for data loss.
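The general shape of the fix on the client side looks something like this (a sketch, not the exact checks from the repo): suppress SIGPIPE on the socket and treat a dead connection as a reason to reconnect rather than a reason to crash:

    #include <sys/socket.h>
    #include <unistd.h>

    // Darwin-specific: suppress SIGPIPE on this socket so a write to a
    // dead server can't kill the host process (SpringBoard!).
    static void make_socket_safe(int fd) {
        int on = 1;
        setsockopt(fd, SOL_SOCKET, SO_NOSIGPIPE, &on, sizeof(on));
    }

    static ssize_t read_frame(int fd, void *buf, size_t len) {
        ssize_t n = recv(fd, buf, len, 0);
        if (n <= 0) {
            // 0 = orderly shutdown, <0 = error; either way, drop the
            // connection gracefully instead of crashing.
            close(fd);
            return -1;
        }
        return n;
    }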

Another issue with SpringBoard crashing was that when it came back up (all daemons restart themselves automatically), it spammed mediaserverd with connection requests, which brought it down - and consequently SpringBoard crashed again because its client sockets were unexpectedly closed.
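A capped backoff on reconnection is one way to avoid that stampede; something along these lines (again a sketch, reusing the hypothetical connect_to_audio_server() from above):

    #include <time.h>

    static void sleep_ms(long ms) {
        struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
        nanosleep(&ts, NULL);
    }

    // Capped exponential backoff: a freshly respawned SpringBoard
    // won't hammer mediaserverd with back-to-back connection attempts.
    static int reconnect_with_backoff(void) {
        long delay_ms = 100; // start small
        for (int attempt = 0; attempt < 10; attempt++) {
            int fd = connect_to_audio_server();
            if (fd >= 0) return fd;
            sleep_ms(delay_ms);
            if (delay_ms < 5000) delay_ms *= 2; // cap around 5 s
        }
        return -1; // give up; retry later instead of looping forever
    }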

I added lots of checks (here and here) to combat these issues. This might have increased the overhead, but so far no users have reported random crashes or freezes. I also haven't pushed any updates to AudioSnapshotServer (the MSHook tweak seen in the commits I linked - it was split into its own repo while I was preparing a stable release of MitsuhaXI 0.4.0) since 1.0.0. I believe it's stable now.

GitHub account

@Ominousness
