OSX image release post mortem

ios
osx
xcode
xcodebuild

#1

On Thursday, November 3, CircleCI released an updated version of our OS X image. This update introduced regressions that caused build failures for some of our users. These regressions were not caught by our internal verification process for the image. This post mortem explains what happened and why, so we can improve the process and ensure this won't happen again.

##1. Ruby Version Manager
After in-depth research, and in response to high customer demand, we chose to ship a combination of chruby and ruby-install as part of the new image moving forward. Adding these gives users more control over the Ruby environment in the containers, which affects certain developer tools and the Xcode toolchain.

###What Went Wrong
chruby is incredibly good at getting out of the way and forces as little opinion onto the developer as possible. During installation, chruby asks the user to add the following line to either their ~/.bashrc or ~/.profile:

source /usr/local/share/chruby/chruby.sh

This is chruby itself, which is responsible for switching environment variable settings and updating the $PATH correctly. But in order to automatically switch to a certain Ruby version, we would also have had to add this second script:

source /usr/local/share/chruby/auto.sh

This is an optional script that detects .ruby-version files when moving into a directory. It’s necessary for our setup because every custom command that is added to a circle.yml configuration file is run in its own shell, since a new SSH connection is established for each of them. The following snippet:

dependencies:
  pre:
    - chruby 2.3.1
    - ruby -v

looks like it should do the correct thing. However, it only switches the Ruby version for the first executed command. When ruby -v is executed in a new shell (because of a new SSH connection), it reverts to the system Ruby installation, v2.0.0.
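
The effect can be reproduced locally. The following is a minimal sketch that uses a separate `bash -c` process for each step as a stand-in for the per-command SSH sessions; `DEMO_RUBY` is a hypothetical variable used only for illustration, not something chruby sets.

```shell
#!/usr/bin/env bash
# Each `bash -c` below starts a fresh shell. This stands in for the new
# SSH connection each circle.yml command gets (an assumption of this
# sketch, not the actual CircleCI runner).
unset DEMO_RUBY

# Step 1: the change is visible inside the shell that made it.
step_one=$(bash -c 'export DEMO_RUBY=2.3.1; echo "$DEMO_RUBY"')

# Step 2: a new shell starts without the change from step 1.
step_two=$(bash -c 'echo "${DEMO_RUBY:-system default}"')

echo "step 1 sees: $step_one"
echo "step 2 sees: $step_two"
```

Anything that should survive across commands therefore has to live in a file that every new shell reads, such as ~/.bashrc or a .ruby-version file.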

###Mitigation
To prevent such behavior in the future, we will update all of our documentation that described the old behavior and advise customers to either check in a .ruby-version file in their repo or add the following step to their circle.yml:

dependencies:
  pre:
    - echo "2.3.1" > ~/.ruby-version

This works because source /usr/local/share/chruby/auto.sh is now part of our ~/.bashrc.
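
To illustrate why a ~/.ruby-version file is picked up by builds that run somewhere below the home directory, here is a simplified sketch of the kind of upward directory search that auto.sh performs. It is not chruby's actual code, and `find_ruby_version` is a hypothetical name chosen for this example.

```shell
#!/usr/bin/env bash
# Simplified illustration of an upward .ruby-version lookup.
# NOT chruby's implementation; find_ruby_version is a hypothetical helper.
find_ruby_version() {
  local dir="$1" version
  while [ -n "$dir" ]; do
    if [ -f "$dir/.ruby-version" ]; then
      # Read the first line; tolerate a missing trailing newline.
      read -r version < "$dir/.ruby-version" || true
      echo "$version"
      return 0
    fi
    # Strip the last path component to move to the parent directory.
    # The loop ends once the path is empty, so "/" itself is not checked.
    dir="${dir%/*}"
  done
  return 1
}

# Usage: look up the version that applies to the current directory.
find_ruby_version "$PWD" || echo "no .ruby-version found"
```

A file checked into the repository sits closer to the build directory than ~/.ruby-version, so in a search like this it would be found first.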

##2. Missing Gem For Xcode Toolchain In Ruby Versions 2.1.9, 2.2.5 & 2.3.1
There have been reports of builds failing with errors like: “A build only device can’t be used for archiving.” Customers noticed that switching to the system Ruby installation fixes these issues. Xcode, when installed on a new machine, installs a Ruby gem called CFPropertyList. This gem is needed for certain tasks within the Xcode toolchain. Without it, the build errors out.

###What Went Wrong
The gem was not added during the installation phase of Ruby versions 2.1.9, 2.2.5 and 2.3.1.

###Mitigation
We installed CFPropertyList for all installed Ruby versions on the system and added it to the list of gems that we ship out of the box in our internal documentation.
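
As a rough sketch of what "for all installed Ruby versions" can look like in practice, the loop below runs each Ruby's own `gem` executable. It assumes the rubies live under ~/.rubies, which is where ruby-install places them for non-root users; `RUBIES_DIR` and `install_gem_everywhere` are hypothetical names, and this is not our actual provisioning script.

```shell
#!/usr/bin/env bash
# Sketch only: RUBIES_DIR and install_gem_everywhere are assumptions made
# for illustration. ruby-install uses ~/.rubies for non-root users and
# /opt/rubies when run as root.
RUBIES_DIR="${RUBIES_DIR:-$HOME/.rubies}"

install_gem_everywhere() {
  local gem_name="$1" ruby_root
  for ruby_root in "$RUBIES_DIR"/*/; do
    # Skip entries without a gem executable (also covers an empty directory).
    [ -x "${ruby_root}bin/gem" ] || continue
    echo "Installing $gem_name for ${ruby_root}"
    # Use this Ruby's own gem command so the gem lands in its gem path.
    "${ruby_root}bin/gem" install "$gem_name"
  done
}

# Usage:
# install_gem_everywhere CFPropertyList
```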

##3. Multiple iOS Simulators
Xcode internally (within its own toolchain) treats iOS simulators and their SDKs in a special way. Every downloaded Xcode version has a single SDK version for every platform embedded within its app bundle, which is NOT shared across Xcode installs.

These SDKs can be found at:

iOS
/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator.sdk
tvOS
/Applications/Xcode.app/Contents/Developer/Platforms/AppleTVSimulator.platform/Developer/SDKs/AppleTVSimulator.sdk
macOS
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk
watchOS
/Applications/Xcode.app/Contents/Developer/Platforms/WatchSimulator.platform/Developer/SDKs/WatchSimulator.sdk
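
To see which SDKs each installed Xcode carries in its own bundle, something like the following sketch can help. `list_bundled_sdks` is a hypothetical helper, and the assumption that side-by-side installs are named `/Applications/Xcode*.app` may not match every machine.

```shell
#!/usr/bin/env bash
# list_bundled_sdks is a hypothetical helper. The Xcode*.app naming of
# side-by-side installs is an assumption; adjust the glob as needed.
list_bundled_sdks() {
  local apps_dir="${1:-/Applications}" xcode sdk
  for xcode in "$apps_dir"/Xcode*.app; do
    [ -d "$xcode" ] || continue
    echo "$xcode"
    for sdk in "$xcode"/Contents/Developer/Platforms/*/Developer/SDKs/*.sdk; do
      # -e is true for real directories and for symlinks that resolve.
      [ -e "$sdk" ] || continue
      # Print only the SDK directory name, indented under its Xcode.
      echo "  ${sdk##*/}"
    done
  done
}

list_bundled_sdks
```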

This leaves us with the problem that older SDKs and simulators aren’t picked up across Xcode installations, even though we ship 6 versions in our current OS X image. In previous image iterations, we copied the SDKs into ~/simulator-sdks and then symlinked them back into the directories they were taken from. This generally worked, but it had edge cases, and occasionally not all installed SDKs showed up for all installed Xcode versions.

###What Went Wrong
To make those show up when querying all simulators with xcrun simctl list, we moved the simulator runtime profiles as well. These are located, for example, in:

/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/Library/CoreSimulator/Profiles/Runtimes/iOS\ 10.0.simruntime

We tried to make all simulators and SDKs available to all Xcode versions installed (currently 6) in our OS X image by copying these runtime profiles into `~/simulator-runtimes` and then symlinking them back to where they were taken from. However, this led to a spike in failed builds because Xcode expects only one runtime profile in its own app bundle. After we placed many of them there, multiple simulators showed up for a single given combination of simulator type and OS.
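
One way to spot this condition is to look for device names that appear more than once under the same runtime in the output of `xcrun simctl list devices`. The sketch below assumes that output consists of `-- <runtime> --` section headers followed by indented `<name> (<UDID>) (<state>)` lines; `find_duplicate_simulators` is a hypothetical helper, not part of Xcode.

```shell
#!/usr/bin/env bash
# find_duplicate_simulators is a hypothetical helper. It reads device
# listing text on stdin and prints "<runtime>: <name>" once for every
# device name that occurs more than once within the same runtime section.
find_duplicate_simulators() {
  awk '
    # Section header, e.g. "-- iOS 10.0 --".
    /^-- .* --$/ {
      runtime = $0
      gsub(/^-- | --$/, "", runtime)
      next
    }
    # Indented device line inside a section.
    /^ +[^ ]/ && runtime != "" {
      name = $0
      sub(/^ +/, "", name)
      # Drop the trailing "(UDID) (state)" part, keeping the device name.
      sub(/ \([0-9A-Fa-f-]+\) \(.*$/, "", name)
      if (++seen[runtime "|" name] == 2) print runtime ": " name
    }
  '
}

# Usage (on a machine with Xcode installed):
# xcrun simctl list devices | find_duplicate_simulators
```

An empty result means every simulator type appears at most once per OS version.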


###Mitigation
Going forward, we will add new SDKs only through Xcode's settings panel. While this substantially increases the disk size of our OS X image, it will provide platform stability.


>###As an example
A user has Xcode 8.0 and Xcode 8.1 installed on their system. Both ship with iOS SDKs and simulators out of the box, so the user can start developing once the downloads finish. Xcode 8.0 ships with the iOS 10.0 SDK, simulators and toolchain, and Xcode 8.1 ships with the iOS 10.1 SDK, simulators and toolchain.
While the user is hard at work building functionality, Xcode 8.2 beta 1 is released. After downloading the beta and beginning to develop with it, the user still needs to maintain backwards compatibility for customers. To do that, the user needs to install the older SDKs, because they do not show up when selecting simulators for the iOS version they'd like to test.
It seems reasonable to assume that having Xcode 8.0 and Xcode 8.1 installed means all the SDKs should be available. But the SDKs still need to be downloaded, because they live only inside the app bundles and nowhere outside of them. If the user wants them available for all Xcode installs, even future ones, they'll have to re-download them and give up space on their hard drive.
Xcode 8.0 can download the iOS 8.1 through 9.3 SDKs but can't download the iOS 10.0 SDK itself, since that ships with it. For that, they'll need Xcode 8.1. And for those who want to make symlinks between the global install and the app bundles in order to save space, that won't work. The global installs are compiled into a binary format and saved in


> `/Library/Developer/CoreSimulator/Profiles/Runtimes `

> when installed through Xcode's settings panel.

#2

Clear, detailed writeup - thanks for dealing with this nightmarish stew of changes.

One question about this line:

While this substantially increases the disk size of our OSX image, it will provide platform stability.

There’s no downside for CircleCI customers for this, correct?

Sounds like re-imaging a Mac might take a little longer in between builds during a clean-up phase, perhaps meaning you’ll need more build machines to support everyone, but it shouldn’t slow down startup of a machine for when a build starts, or provide any other negative side effects for us users, correct?


#3

Hey Karl,

No, this has absolutely zero impact on you. All our containers are restored to how they were before a build ran and then spun up. Once that is done, they’re placed in a queue waiting for a build to be assigned to them. That is why, on our platform, the second a build is triggered we are already connected to a build container and start setting it up for you. There is no waiting for a container to boot and get into a “ready” state. That way you won't have to wait for the container; the container is waiting for you.


#4