Edit: Someone from Unity Tech. found this post and is working on a fix. So that's good 😀
This post could also be titled "Why I Absolutely Hate Closed Source Game Engines (And I'm Totally Not Talking About One In Particular)".
This weekend, a couple of friends and I participated in the Game Maker's Toolkit 2018 Game Jam*. As part of it, I dusted off an input system I wrote for a local-multiplayer Unity game a few years ago. Since it uses XInput for Xbox controllers, I thought - "Hey, I have an Xbox One set up for devmode and I read somewhere that UWP (Universal Windows Platform) games now have unlimited resources... Wouldn't it be neat to port our game jam game to the One?" Yeah, that would be neat, and being able to eventually tack "Shipped Xbox One Indie Title" onto my resume would be pretty cool. Unfortunately, it wasn't quite that simple. So I'm writing a blog post about it, mostly because I want to rant about the last 18 or so hours of my life. Bleh.
* We missed the deadline by ~5 seconds due to an issue in our itch.io page. Turns out you need to actually publish a game if you want to submit it to a game jam. Bummer, because our game was pretty neat 🙁
Given that XInput is tied to Xbox controllers, I assumed that it was still supported on Xbox One. I had never done any development for UWP, but to my optimistic mind, that was a pretty safe assumption. So the first thing I did was simply change my build settings to target UWP (which, to my surprise, now used IL2CPP by default - this will be important later).
After a lengthy build process in Unity and an equally lengthy build in Visual Studio, I ran it on my desktop and, lo and behold, it... crashed... It looked like it was crashing in XInputDotNet's P/Invoke code, so I tried replacing the version in our Unity project with the latest release and... it all worked!
"Since that was so simple, it shouldn't take much more work to run it on Xbox," I naively thought. So I configured Visual Studio for remote deployment (to any Visual Studio devs out there, maybe I'm in the minority, but this process is ridiculously unintuitive to me) and... We have a loading screen... and a main menu! With... no input... Ugh...
After a bit of searching around, I found that Microsoft has more or less abandoned XInput in favor of the Windows.Gaming.Input namespace. Just to verify, I made a new Unity project, wrote a script to move a sprite around with the new API, deployed it to the Xbox, and it worked.
It is a pretty well engineered API and, from what I can tell, should support gamepads other than just Xbox. Unfortunately, it isn't supported by Unity standalone builds, so we can't just use it everywhere. Drat.
I started combing through the internals of my several-years-old input system and realized that it was a horrible mess. It would have taken at least as much time to work the new API into the old system as it would to just rewrite it, so I opted for the latter and implemented code to conditionally use Windows.Gaming.Input rather than XInput if UNITY_WINRT was defined.
At some point during all of this, I switched to the .NET scripting backend so that I could iterate on the hardware a bit faster as .NET builds are considerably faster than IL2CPP builds. This will be very important later.
I deployed to the One, and after a bit of fiddling... It worked! It was horribly slow (compiled in Debug mode), but it worked!
And then my controller died, and the game crashed.
Huh... That shouldn't happen... Maybe it was a one time thing...? Time to reproduce it, and...
Nope, it crashes every time under those exact conditions. But only if the controller is connected when the game launches. If I disconnect the controller, start the game, and then connect/disconnect it, everything works fine.
Well this is going to be fun...
After fiddling with it for a while, I went back to that minimal project I mentioned earlier and tried it with UWP on my desktop. The same behavior occurred there, so at least it wasn't an Xbox-specific bug. That sped things up quite a bit, as I could build locally much faster than building and deploying to the Xbox.
As I mentioned before, I was using the .NET backend for UWP. Because of this, errors in native code seemed to be confusing the hell out of the debugger. I found a combination of steps with which I could actually get the erroneous native callstack (which also required hunting down the Unity player symbols - why are these not loaded by default?), and it seemed to be breaking when attempting to access address 0x10.
Maybe that would be helpful to someone out there, but to me, that specific address just seems... Weird... It would be one thing if it was trying to dereference 0x0, but consistently 0x10?
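For what it's worth, one common explanation for a fault at a small non-zero address like 0x10 is a null pointer that gets used to read a member: the address touched is null plus that member's offset. I can't confirm that this is what Unity is doing. The sketch below only shows the arithmetic, and `GamepadSlot` with its fields is a made-up layout, not anything from UnityPlayer.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout, purely for illustration. This is NOT Unity's real type.
struct GamepadSlot {
    std::uint64_t id;       // offset 0x00
    std::uint64_t flags;    // offset 0x08
    std::uint64_t reading;  // offset 0x10
};

// Address the CPU would touch when reading `reading` through a pointer
// whose numeric value is `base`. With base == 0 (a null pointer), the
// access lands on the member's offset rather than on 0x0 itself.
constexpr std::uintptr_t faulting_address(std::uintptr_t base) {
    return base + offsetof(GamepadSlot, reading);
}

static_assert(faulting_address(0) == 0x10,
              "a null GamepadSlot* would fault at 0x10 when reading `reading`");
```

If something like this is going on, a consistent 0x10 would mean "the same field of the same missing object, every time", which at least fits how reproducible the crash was.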
At some point, I realized that if I never called any code from Windows.Gaming.Input, the crash didn't occur. At first I thought it might be a race condition involving querying the gamepad API, so I tried only calling GetCurrentReading() once at the beginning of the application's lifetime, but that unfortunately didn't change anything. It was still crashing when attempting to access 0x10.
After taking a food break, I decided to try moving back to my main project. I was getting a really weird compilation error with the .NET backend, so I decided to try switching back to IL2CPP. After another lengthy build process, I built the game in debug mode and the same error occurred. The major difference, though, was that since I was debugging IL2CPP (native code), the debugger hit a proper breakpoint rather than the buggy mess that was .NET debugging. Finally... progress!
It appeared that an assertion was failing in
dynamic_array<GameControllers::Gamepads::XboxGamepad, 0>::erase(GameControllers::Gamepads::XboxGamepad*, GameControllers::Gamepads::XboxGamepad*), from UnityPlayer. This looked to me like an internal Unity call, but for all I knew, it could have been internal to the Windows.Gaming.Input library. The assertion that actually failed was 'input_end <= end()', which looks to me like a failed bounds check.
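To make that guess a bit more concrete, here is a rough sketch of what a range erase guarded by that kind of check could look like. This is not Unity's code: `TinyArray` and `valid_range` are names I made up, and I'm using indices instead of pointers to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for an engine container. This is NOT Unity's
// dynamic_array; it only illustrates the shape of the bounds check.
template <typename T>
class TinyArray {
public:
    explicit TinyArray(std::vector<T> items) : items_(std::move(items)) {}

    std::size_t size() const { return items_.size(); }

    // The precondition a check like 'input_end <= end()' would guard:
    // the range is ordered and does not run past the stored elements.
    bool valid_range(std::size_t first, std::size_t last) const {
        return first <= last && last <= items_.size();
    }

    // Removes the elements in [first, last). In a debug build the assert
    // fires if the caller asks for more elements than the array holds.
    void erase(std::size_t first, std::size_t last) {
        assert(valid_range(first, last) && "erase range must end at or before size()");
        items_.erase(items_.begin() + static_cast<std::ptrdiff_t>(first),
                     items_.begin() + static_cast<std::ptrdiff_t>(last));
    }

private:
    std::vector<T> items_;
};
```

With an array holding a single gamepad, `erase(0, 1)` is fine, but a caller working from a stale count and asking for `erase(0, 2)` would trip the assertion, which is roughly the failure I think I was looking at.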
My best guess is that the Windows.Gaming.Input library will keep a gamepad around for a set number of polls after a controller has disconnected so that services can safely handle disconnects, but the fact that I'm calling the code as well as Unity confuses it, leading to a failing bounds check as Unity tries to read past the number of allocated controllers. This wouldn't explain why controllers plugged in after the game starts are bug-free, unless Unity allocates space differently for them. However, since Unity is closed source, this is all guesswork. I really have no idea, and short of going through the disassembled code or getting a job at Unity, there's not much way for me to know.
"Oh well," I thought, "I'll submit a bug report and hope they get around to it. Let's at least see how the game performs in Release mode."
So I built for release mode and, of course, tried to replicate the bug. And I couldn't. It was gone.
The Case For Open Source Technology
My guess is that the bug disappeared in Release mode because the build skips the bounds-checking assertion. Which, of course, means that the bug is still there, but it's being silently ignored. Assuming my assessment of the symptoms is accurate, this shouldn't end up being a problem in practice, since it only ever occurred for the first gamepad, but the whole thing still feels very bad.
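Here is a small sketch of what I mean by "silently ignored". It assumes the engine's assertion behaves like a standard `assert()`, which compiles to nothing in a release build. I don't know that for sure, and `try_erase`, `Outcome`, and the `ChecksEnabled` switch are all made up for the illustration.

```cpp
#include <cstddef>

// What happens to an erase request for the index range [first, last).
enum class Outcome { Ok, CaughtByCheck, SilentlyAccepted };

// ChecksEnabled plays the role of the build configuration:
// true is like a Debug build, false is like Release with checks stripped.
template <bool ChecksEnabled>
constexpr Outcome try_erase(std::size_t size, std::size_t first, std::size_t last) {
    const bool in_bounds = first <= last && last <= size;
    if (in_bounds) {
        return Outcome::Ok;
    }
    // Same bad request either way. Only the debug build notices it.
    return ChecksEnabled ? Outcome::CaughtByCheck : Outcome::SilentlyAccepted;
}

// One gamepad stored, but the caller asks to erase two.
static_assert(try_erase<true>(1, 0, 2) == Outcome::CaughtByCheck,
              "debug build stops on the bad range");
static_assert(try_erase<false>(1, 0, 2) == Outcome::SilentlyAccepted,
              "release build lets the same bad range through");
```

The out-of-range request is identical in both builds. The only thing that changes is whether anything is around to complain about it.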
No matter what, all of this highlights a problem I have with using Unity (or closed source libraries) in general - I shouldn't have to blindly guess what the problem is and hope someone else fixes it.
Unity is a pretty decent game engine. I have some gripes with it, but overall, it's a pleasant enough experience if you do relatively simple things the way they want. However, as soon as you start doing things that are slightly more advanced or otherwise do not fit in their framework, you start finding bugs and corner cases where everything starts to break down. If the engine weren't such a black box, a lot of this bugginess might honestly be perfectly fine. I'd be very happy to drop down and try to fix the engine code if necessary, then submit a pull request. They would have happier customers and a more stable product. Everybody wins.
Instead, I'm forced to fiddle around with snippets of barely readable, machine generated C++ code and guess what, in functions I cannot access, is causing a potentially production-stopping bug, only to have the bug disappear because the error-checking mechanisms are disabled.
What?!?!?! How can this be acceptable for a professional product being used to ship thousands of games?
Something like this seems like it should be a fairly trivial bug in Unity's gamepad logic or, at the very least, it seems like something that could be easily hacked around to prevent crashing on a pretty important platform.
But we can't do either, and are forced to hope that Unity Technologies fixes the bug eventually. However, it doesn't take long searching through their bug tracker to realize that they don't exactly have the best track record with maintaining existing features.
Whatever. The project is working thanks to bugs killing other bugs, so that's good enough for now.