August 26, 2014

Swift's Hidden Array.each Method

If you came to iOS development from the realms of Ruby or jQuery-inflected Javascript, you were probably happy to see enumerator methods like map, filter, and reduce show up in Swift. But hang on – what about this one?

arr = [1, 2, 3, 4, 5]
arr.each { |a| print a -= 10, " " }
> > -9 -8 -7 -6 -5

Ruby describes the each method this way:

In case of Array’s each, all elements in the Array instance are yielded to the supplied block in sequence.

Hmmm. So it calls a block (closure) with each element of the array in sequence. Doesn’t that sounds an awful lot like… map?

let arr = [1, 2, 3, 4, 5]
arr.map { a in print("\(a - 10) ") }
> > -9 -8 -7 -6 -5

Cool. When called this way, map infers that the return type of the closure is Void, builds an array of Void from the elements in the array, and then throws it out since it isn’t stored in another variable. The compiler might even be savvy enough to know that it doesn’t need space for the resulting array and skip that step.

How is that any better than a for...in loop? Well, closures don’t need to be defined inline – they can be stored in a variable or passed as a parameter to a function. For example, if you need to reset different groups of text fields in your view controller at different times, you could do that by writing a closure to update a text field and then passing that to map for the group you need to reset:

// define closure once
let resetTextField = {
        (textField: UITextField) -> () in
        textField.textColor = UIColor.lightGrayColor()
        textField.text = "tap to edit"
    }

// use it in somewhere else on an Array of UITextFields
textFields.map(resetTextField)

by Nate Cook swift arrays Comments


August 25, 2014

Set Type Follow-up

After hearing from a few folks on Twitter, and thinking through the implementation a little more, I have a couple of changes to the Set type from my last post. I think each of these changes brings the Set type a little closer to how it might have been implemented as a first-party datatype.

A better Generator

In my last post, I added SequenceType conformance by returning the internal Dictionary’s keys.generator(), like this:

extension Set : SequenceType {
    typealias Generator = MapSequenceGenerator<DictionaryGenerator<T, Bool>, T>
    func generate() -> Generator {
        return contents.keys.generate()
    }
}

It works fine, but get a load of that return type! MapSequenceGenerator<DictionaryGenerator<T, Bool>, T> is pretty ugly, and exposes way too much of the internal workings of the Set type for my comfort. What if I want to change the underlying data structure? (Ahem.)

Instead, I can use the handy GeneratorOf type to create a generator that is custom-tailored to the Set’s subtype and masks all the internal gobbledegook. The initializer for GeneratorOf takes a closure that acts as the next() method for the generator. Much cleaner:

extension Set : SequenceType {
    typealias Generator = GeneratorOf<T>
    public func generate() -> Generator {
        var generator = contents.keys.generate()
        return Swift.GeneratorOf {
            return generator.next()
        }
    }
}

Bool vs. Void?

I had a couple requests about why I used Bool instead of Void in the Set type’s internal Dictionary. It’s a good question – just the presence of the key is enough to establish an element’s existence in the set, so why use space for even a single-bit data type when Void should take up no space at all? What if the internal storage is inefficient, and the Bool value is taking up even more than a single bit?

I decided to run a quick test to see if I could figure out how much space I could save converting to a Void, and it turns out: zero. No matter how large my Dictionary got, the [Key: Bool] and [Key: Void] versions of the process used the same amount of memory.1 My hunch is that a Void type is represented internally by a single zero bit, so it’s no savings over a Bool using that same bit for true/false.

Interestingly enough, the Void testing brought up another challenge as well, since the conversion broke the the Set equality test. To check for equality between two sets, I’m deferring to the == operator on the underlying contents Dictionary, but for that to work both the key and value types of the Dictionary need to be Equatable. The Void type isn’t equatable (it isn’t anything, really), so that comparison stopped working.

In the end, I’m going to stick with the boolean true as the sentinel value – it feels better (semantically) to have the internal Dictionary actually representing the Set in that way.

Better indexing for a better collection

After taking a second look, I realized my ExtensibleCollectionType implementation was pretty far off the mark. Dictionary instances use the DictionaryIndex type as their index, which is nearly opaque (it seems to hold some sort of enum). Since the ExtensibleCollectionType protocol requires that you implement a subscript for your index type, this allows for both “indexed” subscripting and keyed subscripting.

var dict = ["a": "Apple",                   // set up the Dictionary
            "b": "Banana",
            "c": "Cherry"]                  
let bananaIndex = dict.indexForKey("b")!    // retrieve an index
println(dict[bananaIndex])                  // subscript via DictionaryIndex<String, String>
> > (b, Banana)
println(dict["c"])                          // subscript via String key
> > Optional("Cherry")

My original Set used a simple Int as an index, which, in addition to not really making sense, prevented me from having an element-based subscript.

let set: Set = ["Apple", "Banana", "Cherry"]  // set up the Set
let start: Int = set.startIndex
println(set[start])
> > Cherry

println(set["Apple"])
> error: type 'Index' does not conform to protocol 'StringLiteralConvertible'

A better solution is to create a new SetIndex type that embeds the DictionaryIndex we saw in action up above. The SetIndex type can simply pass the index between the caller and the internal Dictionary, once again obfuscating the working details. To be used as an index, SetIndex has to at least implement the ForwardIndexType protocol, but we’ll go all the way and implement BidirectionalIndexType. Note that the initializer is private – this is typical for an index in Swift. You can retrieve one from the instance you want to traverse, but you can’t simply create an index directly.

struct SetIndex<T: Hashable> : BidirectionalIndexType {
    private var index: DictionaryIndex<T, Bool>
    private init(_ dictionaryIndex: DictionaryIndex<T, Bool>) {
        self.index = dictionaryIndex
    }
    func predecessor() -> SetIndex<T> {
        return SetIndex(self.index.predecessor())
    }
    func successor() -> SetIndex<T> {
        return SetIndex(self.index.successor())
    }
}

func ==<T: Hashable>(lhs: SetIndex<T>, rhs: SetIndex<T>) -> Bool {
    return lhs.index == rhs.index
}

Next we need to update our ExtensibleCollectionType extension to use the new SetIndex type instead of Int as the index. I’ve added two additional methods – indexForElement() returns a SetIndex for the requested element, or nil if the element isn’t in the set, and subscripting with set elements now acts as a shortcut for the contains() method, returning true if the element is in the set and false otherwise.

extension Set : ExtensibleCollectionType {
    typealias Index = SetIndex<Element>
    
    var startIndex: Index { return Index(contents.startIndex) }
    var endIndex: Index { return Index(contents.endIndex) }
    
    func indexForElement(element: Element) -> Index? {
        if let index = contents.indexforKey(element) {
            return Index(index)
        }
        return nil
    }
    
    subscript(i: Index) -> Element {
        return contents.keys[i.index]
    }

    subscript(element: Element) -> Bool {
        return contains(element)
    }

    // some things that didn't change...
}

Back to our indexing code up above, we can now use either a SetIndex or the Set’s subtype for subscripting:

let set: Set = ["Apple", "Banana", "Cherry"]  // set up the Set
let start = set.startIndex
println(set[start])
> > Cherry

println((set["Apple"], set["Pear"]))
> > (true, false)

To see all the changes from this post, you can view the diff on GitHub. I’ve moved this into a new project on GitHub, SwiftSets, to make room for some more data types in the future. Please take a look!

  1. We don’t know how a Dictionary’s storage is handled internally, but with a million Int: Bool pairs in the Dictionary the swift process was consuming about 43.5MB. If Dictionary is implemented as some kind of binary tree, each node would use five 8-byte values (for the key itself, the hashed key, and pointers to the parent and two children) plus a single bit for the Bool. Multiplying everything together gives a total of roughly 38.3MB – could be?2

  2. Nope! David Smith weighed in on twitter and it turns out my poorly informed back-of-the-envelope calculations are not entirely reliable.

by Nate Cook swift sets Comments


August 20, 2014

Creating a Set Type in Swift

Over on NSHipster, in the midst of a great article about Swift literal convertibles, Mattt Thompson dropped this aside:

A real, standard library-calibre implementation of Set would involve a lot more Swift-isms, like generators, sequences, and all manner of miscellaneous protocols. It’s enough to write an entirely separate article about.

Sounds like a good exercise!

A set is a data structure, like an array, that holds values in no particular order and without repetition. You could use a set to represent the vowels in the alphabet, or the lowercase letters of the alphabet itself.

In this post I’ll walk through building a Set type on par with the built-in Array and Dictionary types, taking advantage of Swift’s features with “all manner of miscellaneous protocols.” Let’s get started!

Building our Set

We’ll create our Set type with a Swift-native Dictionary as its data store. To get rolling, let’s make a new struct1 that we can add to, remove from, and check for the existence of members, along with a few other useful properties.

struct Set<T: Hashable> {
    typealias Element = T
    private var contents: [Element: Bool]
    
    init() {
        self.contents = [Element: Bool]()
    }

    /// The number of elements in the Set.
    var count: Int { return contents.count }
    
    /// Returns `true` if the Set is empty.
    var isEmpty: Bool { return contents.isEmpty }
    
    /// The elements of the Set as an array.
    var elements: [Element] { return Array(self.contents.keys) }

    /// Returns `true` if the Set contains `element`.
    func contains(element: Element) -> Bool {
        return contents[element] ?? false
    }
        
    /// Add `newElements` to the Set.
    mutating func add(newElements: Element...) {
        newElements.map { self.contents[$0] = true }
    }
    
    /// Remove `element` from the Set.
    mutating func remove(element: Element) -> Element? {
        return contents.removeValueForKey(element) != nil ? element : nil
    }
}

And now we can create our set of five vowels:

var vowelSet: Set<String> = Set()               // empty set
vowelSet.add("a", "e", "i", "o", "u", "y")      // a, e, i, o, u, y
vowelSet.remove("y")                            // eh, leave out y
println(vowelSet.contains("e"))
> > true

// if we add existing members, the count stays the same
println(vowelSet.count)
> > 5
vowelSet.add("a", "e")
println(vowelSet.count)
> > 5

Sequencing

It would be nice to be able to loop over all the values in our Set, the way you can loop over an Array with a for...in loop. Swift allows any type to opt in to this kind of looping by conforming to the SequenceType protocol. In this case, it’s a simple addition of just a single method, since iterating over our Set is just iterating over the keys of the internal Dictionary.

extension Set : SequenceType {
    typealias Generator = MapSequenceGenerator<DictionaryGenerator<T, Bool>, T>
    func generate() -> Generator {
        return contents.keys.generate()
    }
}

Great! Now we can loop over the contents of the Set to print all the values. (Remember what I said about sets not having any particular order?)

for vowel in vowelSet {
    println(vowel)
}
> > a
> > o
> > e
> > u
> > i

Easier initialization

Wouldn’t it be great if we could create and initialize our set in one step, instead of having to create the set, then add elements? With a little generics magic, we can create another initializer that will create a Set instance from a sequence of any kind. This initializer accepts anything that conforms to the SequenceType protocol.

extension Set {
    init<S: SequenceType where S.Generator.Element == Element>(_ sequence: S) {
        self.contents = [Element: Bool]()
        Swift.map(sequence) { self.contents[$0] = true }
    }
}

With this extension, we can now create new Set instances from Arrays, Ranges, and other sequences:

let setFromArray = Set([8, 6, 7, 5, 3, 0, 9])
let setFromRange = Set(1...10)
let setFromString = Set("abcdefg")

It would also be nice to create our Set through simple assignment, a task we can accomplish by implementing the initializer required for the ArrayLiteralConvertible protocol. With our new sequence-based initializer, this is a cinch.

extension Set : ArrayLiteralConvertible {
    init(arrayLiteral: Element...) {
        self.init(arrayLiteral)
    }
}

What’s great about both of these is that Swift’s type inference even figures out the subtype of our Set, so we don’t need to be as specific when declaring it. The first two lines of our example above, creating the vowelSet and adding the vowels to it, can be condensed to this one line:

var vowelSet: Set = ["a", "e", "i", "o", "u", "y"]

Going native

When we implemented the SequenceType protocol, we not only got the ability to loop through our set’s elements, we also got access to some very useful global functions. map(), filter(), and reduce() are all defined for sequences, so if we want to add up all the numbers in our set, we can do something like this:

let numberSet = Set(1...10)
let sum = reduce(numberSet, 0, { return $0 + $1 })
println(sum)
> > sum = 55

The only thing is, the global versions of map() and filter() return Arrays, not new Set instances. Let’s add these three methods to our Set type so they can type-specific and convenient.

extension Set {
    func filter(includeElement: (T) -> Bool) -> Set<T> {
        return Set(self.contents.keys.filter(includeElement))
    }

    func map<U>(transform: (T) -> U) -> Set<U> {
        return Set<U>(self.contents.keys.map(transform))
    }

    func reduce<U>(var initial: U, combine: (U, T) -> U) -> U {
        return Swift.reduce(self, initial, combine)
    }
}

Now, just like Swift’s built-in Arrays, our Set type can be sliced and diced using map() and filter().

let numberSet = Set(1...20)
let evenNumberSet = numberSet.filter { $0 % 2 == 0 }
let stringNumberSet = evenNumberSet.map { "#\($0)" }
println(stringNumberSet.contains("#4"))
> > true

Set algebra

We’re finally ready to implement some set-specific logic – that is, tests to see if one set is a subset or a superset of another and set equality, and mutating functions that will handle union, intersection, and subtraction.

extension Set {
	/// Returns `true` if the Set has the exact same members as `set`.
    func isEqualToSet(set: Set<T>) -> Bool {
        return self.contents == set.contents
    }
    
	/// Returns `true` if the Set shares any members with `set`.
    func intersectsWithSet(set: Set<T>) -> Bool {
        for elem in self {
            if set.contains(elem) {
                return true
            }
        }
        return false
    }

	/// Returns `true` if all members of the Set are part of `set`.
    func isSubsetOfSet(set: Set<T>) -> Bool {
        for elem in self {
            if !set.contains(elem) {
                return false
            }
        }
        return true
    }

	/// Returns `true` if all members of `set` are part of the Set.
    func isSupersetOfSet(set: Set<T>) -> Bool {
        return set.isSubsetOfSet(self)
    }

	/// Modifies the Set to add all members of `set`.
    mutating func unionSet(set: Set<T>) {
        for elem in set {
            self.add(elem)
        }
    }

	/// Modifies the Set to remove any members also in `set`.
    mutating func subtractSet(set: Set<T>) {
        for elem in set {
            self.remove(elem)
        }
    }
    
    /// Modifies the Set to include only members that are also in `set`.
    mutating func intersectSet(set: Set<T>) {
        self = self.filter { set.contains($0) }
    }
    
    /// Returns a new Set that contains all the elements of both this set and the set passed in.
    func setByUnionWithSet(var set: Set<T>) -> Set<T> {
        set.extend(self)
        return set
    }

    /// Returns a new Set that contains only the elements in both this set and the set passed in.
    func setByIntersectionWithSet(var set: Set<T>) -> Set<T> {
        set.intersectSet(self)
        return set
    }

	/// Returns a new Set that contains only the elements in this set *not* also in the set passed in.
    func setBySubtractingSet(set: Set<T>) -> Set<T> {
        var newSet = self
        newSet.subtractSet(set)
        return newSet
    }
}

Let’s put them in action:

let vowelSet = Set("aeiou")
let alphabetSet = Set("abcdefghijklmnopqrstuvwxyz")

println(alphabetSet.isSupersetOfSet(vowelSet))
> > true
println(vowelSet.isSubsetOfSet(alphabetSet))
> > true
println(vowelSet.isEqualToSet(alphabetSet))
> > false

var someLetterSet = Set("qwerasdfzxcv")
someLetterSet.intersectSet(vowelSet)
println(someLetterSet.elements)
> > [a, e]

Nearly operational

I’m not going to go all Ruby on you and have firstSet << secondSet take on some esoteric meaning, but we do want the basic addition and equality operators to work as expected. Using the built-in Array type as an example, we can implement the following operators for our Set type.

extension Set : Equatable { }

func +=<T>(inout lhs: Set<T>, rhs: T) {
    lhs.add(rhs)
}

func +=<T>(inout lhs: Set<T>, rhs: Set<T>) {
    lhs.unionSet(rhs)
}

func +<T>(lhs: Set<T>, rhs: Set<T>) -> Set<T> {
    return lhs.setByAddingSet(rhs)
}

func ==<T>(lhs: Set<T>, rhs: Set<T>) -> Bool {
    return lhs.isEqualToSet(rhs)
}

Taking these operators out for a spin:

var moreVowelsSet = vowelSet
println(moreVowelsSet == vowelSet)
> > true
moreVowelsSet += "y"               // hey, y is *sometimes* a vowel
moreVowelsSet += Set("åáâäàéêèëíîïìøóôöòúûüù")
println(moreVowelsSet == vowelSet)
> > false

Collecting ourselves

In order to be good Swift citizens (and get the benefit of some useful features like lazy collections), we should also implement the CollectionType and ExtensibleCollectionType protocols.

extension Set : ExtensibleCollectionType {
    typealias Index = Int
    var startIndex: Int { return 0 }
    var endIndex: Int { return self.count }

    subscript(i: Int) -> Element {
        return Array(self.contents.keys)[i]
    }

    mutating func reserveCapacity(n: Int) {
        // can't really do anything with this
    }
    mutating func append(newElement: Element) {
        self.add(newElement)
    }
    mutating func extend<S : SequenceType where S.Generator.Element == Element>(seq: S) {
        Swift.map(seq) { self.contents[$0] = true }
    }
}

For that extra work we get the benefit of a few more global functions, like countElements() and join().

println(countElements(vowelSet))
> > 5
println(join(", ", vowelSet.map { "\($0)" }))
> > i, o, e, u, a

Final touches

Last but not least, it would be nice to have some way of inspecting the contents of our Set instances. Time to implement the Printable and DebugPrintable protocols:

extension Set : Printable, DebugPrintable {
    var description: String {
        return "Set(\(self.elements))"
    }

    var debugDescription: String {
        return description
    }
}

Unfortunately, as of beta 6 these methods don’t seem to be used for user-defined types in either the Playground or during command-line execution, but you can still call them manually:

println(vowelSet.description)
> > Set([i, o, e, u, a])

Wrap up

There you have it, a fully functional Swift-native Set type. Take a look at the full source code in this gist (along with tests and performance timing) or let me know if I’ve missed anything over on Twitter. Thanks for reading!

Update: I’ve written a follow-up post with a few changes to the implementation, and posted a revised version of the Set type as a project called SwiftSets on GitHub.

  1. Why a struct and not a class? The Swift Blog had a post on the differences between value and reference types last week, and I’m following their lead with the Array and Dictionary types. Having Set be a value type gives guaranteed immutability and thread safety without any extra work.

by Nate Cook swift sets generics Comments


July 11, 2014

C-Style "typedef enum" in Swift

Two years ago, Apple introduced two macros to standardize the way C enum types are declared: NS_ENUM and NS_OPTIONS. Swift imports enumerations declared using these macros automatically as native language structures that can be used in obvious and convenient ways. In retrospect, having enumerations defined with their types in a consistent way paved their inclusion in the coming type-safe Swift.

// UITableViewCellStyle defined with NS_ENUM
let cellStyle = UITableViewCellStyle.Subtitle

// UIViewAutoresizing defined with NS_OPTIONS
let resizing: UIViewAutoresizing = .FlexibleHeight | .FlexibleWidth
if resizing & .FlexibleHeight {
    // do something
}

All of Apple’s APIs are defined with the appropriate macro (and therefore get imported into Swift nicely), but what if you’re using a third-party library that is a little older, or didn’t jump on the macro train?

Swift imports these older, C-style typedef enum declarations as well, you just need to use a little extra care when working with them. A C enum can be used for either an enumeration or an options bitmask, but Swift imports them all the same way. Let’s look at what’s happening behind the scenes, and then how to use each in turn.

Bridged enums

Since Swift doesn’t know whether an enum defined in C is meant to be an enumeration or a bitmask, it brings them all in using the same format. Any enum defined in C will become a struct in Swift with a single value property, while the individual elements are declared as global constants of that type. The value property will be of type UInt32 unless the type was specifically declared otherwise in C.

That is to say, this enum defined in an Objective-C header:

typedef enum {
    MesozoicPeriodTriassic,
    MesozoicPeriodJurassic,
    MesozoicPeriodCretaceous,
} MesozoicPeriod;

gets translated to this in Swift:

struct MesozoicPeriod {
    var value: UInt32
    init(_ val: UInt32) { value = val }
}
let MesozoicPeriodTriassic = MesozoicPeriod(0)
let MesozoicPeriodJurassic = MesozoicPeriod(1)
let MesozoicPeriodCretaceous = MesozoicPeriod(2)

Enumerations

Enumerations are used to hold a single choice from a set – the style of a table view cell or the method of an HTTP request. How do we use imported C enumerations in Swift?

var period = MesozoicPeriodJurassic     // assign like normal...
if period == MesozoicPeriodJurassic {   // ...but this doesn't work
> Error: 'MesozoicPeriod' is not convertible to '_ArrayCastKind'

The problem is that the equality operator (==) isn’t defined for these imported enumerations. We need to compare the value properties, whether in an if or switch statement:

var period = MesozoicPeriodJurassic
if period.value == MesozoicPeriodJurassic.value {
    println("Jurrasic!")                // Jurrasic!
}

Keep in mind that since this isn’t a true Swift enumeration we miss out on some compiler safety guards, like requiring switch statements to exhaust all possibilities, or keeping values within the defined bounds (you can easily set period to MesozoicPeriod(42) if you’d like).

Options Bitmasks

Bitmasks, on the other hand, use bit arithmetic to hold multiple values simultaneously. I have a generator for making your own Swift-native bitmasks, but if you’re using a bitmask defined in a C-based library, you’re stuck with the imported version.

Let’s look at how to use this bitmask of attributes that a people-eater might have:

typedef enum {
    OneEyed     = 1 << 0,
    OneHorned   = 1 << 1,
    Flying      = 1 << 2,
    Purple      = 1 << 3
} PeopleEaterAttributes;

The big stumbling block here is that none of the bitwise operators (&, |, ~, etc.) are defined for this imported bitmask’s type. So again we need to use the value property, not only for comparison but also for operations on the bitmask.

Initialize your bitmask either using one of the global constants or as a new struct with zero value, and then modify the value property from there.

var purplePeopleEater = PeopleEaterAttributes(0)
purplePeopleEater.value = Purple.value | OneEyed.value | OneHorned.value | Flying.value

Then, when testing your bitmask, you need to use an explicit comparison to zero. The result of a bitwise operation is another number of the same type, which Swift won’t convert to a Boolean for you.

// this only works for NS_OPTIONS-declared bitmasks:
if purplePeopleEater & Flying {
> Error: 'PeopleEaterAttributes' is not convertible to 'Bool'

// using .value is the right idea, but still doesn't work:
if purplePeopleEater.value & Flying.value {
> Error: 'UInt32' is not convertible to 'Bool'

// this is the way for C-style "typedef enum" bitmasks
if 0 != (purplePeopleEater.value & Flying.value) {
    println("Flying purple people eater")
}

You can read more on bitmasks in this great piece from Big Nerd Ranch, and about the NS_ENUM and NS_OPTIONS macros at NSHipster.

by Nate Cook swift enum Comments


July 9, 2014

Fully Value-Typed Arrays in Swift

The latest beta of Xcode 6 brings a much-needed revision to Swift’s arrays. Instead of the sometimes-value, sometimes-by-reference behavior, where “immutable” arrays could be modified and “copied” arrays maintained links to the original, arrays now act much more as one would expect. Constant arrays are truly constant and immutable, and copied arrays can’t have values changed with modifications to the source (or vice versa).

let names = ["Thomas", "Percy", "Rosie", "Daisy"]   // actually immutable!
> [Thomas, Percy, Rosie, Daisy] 
names[0] = "Edward"                                 // error

by Nate Cook swift arrays