Update, July 9: Apple has made big changes to Swift’s array implementation in the latest Xcode beta – happily, the descriptions below no longer apply.

In my last post I discussed Swift’s peculiar treatment of immutability in arrays. Below I’ll look at another unusual aspect of arrays – the ambiguous results you get when copying.

When a copy is not a copy

Unlike every other struct-based type in Swift, arrays don’t actually perform a copy when you assign one to another. Instead, the new array shares the values of the original, even after elements of either one are changed. Here, the copiedNames array isn’t actually copied, but still references all the values of the original names array:

  • var names = ["Stuffy", "Chilly", "Hallie"]
  • let copiedNames = names
  •  
  • names[0] = "Lambie"                // changes to the first array
  • println(copiedNames)               // show up in the "copy"
  • [Lambie, Chilly, Hallie]
  •  
  • copiedNames[2] = "Fabulous Fabio"  // and changes to the copy
  • println(names)                     // show up in the original
  • [Lambie, Chilly, Fabulous Fabio]

Why is this happening? According to The Swift Programming Language,

Swift only performs an actual copy [of an array] behind the scenes when it is absolutely necessary to do so. Swift manages all value copying to ensure optimal performance, and you should not avoid assignment to try to preempt this optimization.

“Absolutely necessary”

If an assignment like let copiedNames = names doesn’t create a copy initially, when does a copy get made? Swift determines that a copy is necessary whenever an operation might modify the length of either array. If elements are added to or removed from either array, the two are unlinked:

  • names += "Sir Kirby"              // adding an item
  • println(copiedNames)              // separates the arrays
  • [Lambie, Chilly, Fabulous Fabio]

How do you know?

Here’s a riddle – take a look at the following code, and tell me what gets printed at the end. The swizzle() function takes an array and moves the first item to the end.

  • var cities = ["Minneapolis", "Atlanta", "Seattle", "Chicago"]
  • let originalCities = cities
  •  
  • cities = swizzle(cities)           // move the first item to the end
  • println(originalCities[0])
  • ???

The answer is: It depends!

Depending on how swizzle() is implemented, the shared values may or may not be copied. This version of the function breaks the linkage, because it uses .append() and .removeAtIndex(), which modify the array’s length:

  • func swizzle(var arr: String[]) -> String[] {
  •     arr.append(arr.removeAtIndex(0))
  •     return arr
  • }

while this version keeps the sharing intact, since it only moves the values around:

  • func swizzle(arr: String[]) -> String[] {
  •     let first = arr[0]
  •     for i in 0..(arr.count - 1) {
  •         arr[i] = arr[i + 1]
  •     }
  •     arr[arr.count - 1] = first
  •  
  •     return arr
  • }

That’s bananas!

Look again: we’re calling a function with cities as a parameter, and depending on how that function is implemented, we may or may not also be changing the originalCities array. What?
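For comparison, here is a minimal sketch of the same riddle written for a current Swift toolchain, which postdates the array changes mentioned in the update at the top of this post. The names swizzleByResizing and swizzleInPlace are hypothetical labels for the two implementations above, and the syntax ([String], print, 0..<n) is today's rather than the beta's. Assuming present-day value semantics for arrays, both versions should leave originalCities untouched:

```swift
// Sketch for a current Swift toolchain, not the beta described in this post.
// swizzleByResizing and swizzleInPlace are hypothetical names for the two versions above.

func swizzleByResizing(_ input: [String]) -> [String] {
    var arr = input                      // parameters are constants, so work on a local copy
    guard !arr.isEmpty else { return arr }
    let first = arr.removeFirst()        // shrinks the array by one
    arr.append(first)                    // grows it back to the original length
    return arr
}

func swizzleInPlace(_ input: [String]) -> [String] {
    var arr = input
    guard let first = arr.first else { return arr }  // nothing to rotate in an empty array
    for i in 0..<(arr.count - 1) {
        arr[i] = arr[i + 1]              // shift each element one slot toward the front
    }
    arr[arr.count - 1] = first
    return arr
}

let originalCities = ["Minneapolis", "Atlanta", "Seattle", "Chicago"]
var cities = originalCities

cities = swizzleByResizing(cities)
print(originalCities[0])                 // Minneapolis

cities = swizzleInPlace(originalCities)
print(originalCities[0])                 // Minneapolis
```

Neither implementation can reach back into the caller's array here, so the answer to the riddle no longer depends on how the function is written.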

Keep ’em separated

If you need to test whether or not the arrays you’re working with are still sharing values, Swift provides the identity operator ===.

  • var numbers = [4, 8, 15, 16, 23, 42]
  • let copiedNumbers = numbers
  •  
  • println(numbers === copiedNumbers)
  • true
  •  
  • numbers += 99                       // increase the size of original
  • println(numbers === copiedNumbers)
  • false

If you need to manually separate your shared arrays, you can use the .unshare() method on an existing array, or copy your array with the source array’s .copy() method. Note that you can’t unshare a “constant” array, so plan ahead.

  • var numbers = [4, 8, 15, 16, 23, 42]
  • let copiedNumbers = numbers
  •  
  • println(numbers === copiedNumbers)
  • true
  •  
  • numbers.unshare()                   // unshare the original variable
  • println(numbers === copiedNumbers)
  • false
  •  
  • let anotherCopy = numbers.copy()
  • println(numbers === anotherCopy)
  • false

I’m tempted to always copy() arrays, just so I have a reliably autonomous array. This ambiguity feels like a pretty big potential pitfall in what is otherwise a fairly safe language.
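As the update at the top notes, this behavior changed in a later beta. The sketch below assumes a current Swift toolchain and today's syntax (print rather than println), and shows that a plain assignment there behaves as an independent value for both same-length and length-changing edits, with no explicit copy step:

```swift
// Sketch for a current Swift toolchain, not the beta described in this post.
var numbers = [4, 8, 15, 16, 23, 42]
var copiedNumbers = numbers       // behaves as an independent value

numbers[0] = 99                   // same-length change to the original
copiedNumbers.append(108)         // length change to the copy

print(numbers)                    // [99, 8, 15, 16, 23, 42]
print(copiedNumbers)              // [4, 8, 15, 16, 23, 42, 108]
```

Each array only reflects its own edits, which is the reliably autonomous behavior the copy() habit was meant to guarantee.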