Swift Array Copying Ambiguity

Update, July 9: Apple has made big changes to Swift’s array implementation in the latest Xcode beta – happily, the descriptions below no longer apply.

In my last post I discussed Swift’s peculiar treatment of immutability in arrays. Below I’ll look at another unusual aspect of arrays – the ambiguous results you get when copying.

When a copy is not a copy

Unlike all other struct-based types in Swift, arrays don’t actually perform a copy when you assign one to another. Instead, the new array shares the values of the initial array, even after the elements of one or the other are changed. Here, the copiedNames array isn’t actually copied, but still references all the values of the original names array:

var names = ["Stuffy", "Chilly", "Hallie"]
let copiedNames = names

names[0] = "Lambie"                // changes to the first array
println(copiedNames)               // show up in the "copy"
> [Lambie, Chilly, Hallie]

copiedNames[2] = "Fabulous Fabio"  // and changes to the copy
println(names)                     // show up in the original
> [Lambie, Chilly, Fabulous Fabio]

Why is this happening? According to The Swift Programming Language,

Swift only performs an actual copy [of an array] behind the scenes when it is absolutely necessary to do so. Swift manages all value copying to ensure optimal performance, and you should not avoid assignment to try to preempt this optimization.

“Absolutely necessary”

If an assignment like let copiedNames = names doesn’t create a copy initially, when does a copy get made? Swift determines that a copy is necessary whenever an operation might modify the length of either array. If elements are added to or removed from either array, the arrays are unlinked:

names += "Sir Kirby"              // adding an item
println(copiedNames)              // separates the arrays
> [Lambie, Chilly, Fabulous Fabio]

How do you know?

Here’s a riddle – take a look at the following code, and tell me what gets printed at the end. The swizzle() function takes an array and moves the first item to the end.

var cities = ["Minneapolis", "Atlanta", "Seattle", "Chicago"]
let originalCities = cities

cities = swizzle(cities)           // move the first item to the end
println(originalCities[0])
> ???

The answer is: It depends!

Depending on how swizzle() is implemented, the shared values may or may not be copied. This version of the function breaks the linkage, because it uses .append() and .removeAtIndex(), which modify the array’s length:

func swizzle(var arr: String[]) -> String[] {
    arr.append(arr.removeAtIndex(0))
    return arr
}

while this version keeps the sharing intact, since it only moves the values around:

func swizzle(arr: String[]) -> String[] {
    let first = arr[0]
    for i in 0..(arr.count - 1) {
        arr[i] = arr[i + 1]
    }
    arr[arr.count - 1] = first

    return arr
}

That’s bananas!

Look again: we’re calling a function with cities as an argument, and depending on its implementation we may or may not also be changing the originalCities array. What?
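
For contrast, here is a minimal sketch of the same riddle in current Swift, where arrays have full value semantics (the change the update at the top refers to). The names swizzleByAppending and swizzleByShifting are mine, the signatures use today's syntax rather than the beta's, and both versions assume a non-empty array. The point is only that the answer no longer depends on how the function is implemented:

```swift
// Length-changing version: removes the first element and appends it.
// Assumes arr is non-empty.
func swizzleByAppending(_ arr: [String]) -> [String] {
    var result = arr                  // parameters are constants, so work on a local copy
    let first = result.removeFirst()
    result.append(first)
    return result
}

// Shifting version: moves every element down one slot without changing the length.
// Assumes arr is non-empty.
func swizzleByShifting(_ arr: [String]) -> [String] {
    var result = arr
    let first = result[0]
    for i in 0..<(result.count - 1) {
        result[i] = result[i + 1]
    }
    result[result.count - 1] = first
    return result
}

var cities = ["Minneapolis", "Atlanta", "Seattle", "Chicago"]
let originalCities = cities

cities = swizzleByShifting(cities)
print(originalCities[0])              // Minneapolis

cities = swizzleByAppending(cities)
print(originalCities[0])              // Minneapolis
```

Either way, originalCities is untouched, and only the variable that was reassigned sees the rotation.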

Keep ’em separated

If you need to test whether or not the arrays you’re working with are still sharing values, Swift provides the identity operator ===.

var numbers = [4, 8, 15, 16, 23, 42]
let copiedNumbers = numbers

println(numbers === copiedNumbers)
> true

numbers += 99                       // increase the size of original
println(numbers === copiedNumbers)
> false

If you need to manually separate your shared arrays, you can use the .unshare() method on an existing array, or copy your array with the source array’s .copy() method. Note that you can’t unshare a “constant” array, so plan ahead.

var numbers = [4, 8, 15, 16, 23, 42]
let copiedNumbers = numbers

println(numbers === copiedNumbers)
> true

numbers.unshare()                   // unshare the original variable
println(numbers === copiedNumbers)
> false

let anotherCopy = numbers.copy()
println(numbers === anotherCopy)
> false

I’m tempted to always copy() arrays, just so I have a reliably autonomous array. This ambiguity feels like a pretty big potential pitfall in what is otherwise a fairly safe language.
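
As a point of comparison, this is roughly how the first example from this post behaves in current Swift, where plain assignment already gives you an independent array. It is a sketch in today's syntax, not the beta's: copiedNames is declared with var here because a let array can no longer be modified at all.

```swift
var names = ["Stuffy", "Chilly", "Hallie"]
var copiedNames = names               // var, so the copy itself can be changed

names[0] = "Lambie"                   // changes to the first array
print(copiedNames)                    // do not show up in the copy
// ["Stuffy", "Chilly", "Hallie"]

copiedNames[2] = "Fabulous Fabio"     // and changes to the copy
print(names)                          // do not show up in the original
// ["Lambie", "Chilly", "Hallie"]
```

No copy() or unshare() call is involved; the two arrays are separate values from the moment of assignment.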