Over on NSHipster, in the midst of a great article about Swift literal convertibles, Mattt Thompson dropped this aside:

A real, standard library-calibre implementation of Set would involve a lot more Swift-isms, like generators, sequences, and all manner of miscellaneous protocols. It’s enough to write an entirely separate article about.

Sounds like a good exercise!

A set is a data structure, like an array, that holds values in no particular order and without repetition. You could use a set to represent the vowels in the alphabet, or the lowercase letters of the alphabet itself.

In this post I’ll walk through building a Set type on par with the built-in Array and Dictionary types, taking advantage of Swift’s features with “all manner of miscellaneous protocols.” Let’s get started!

Building our Set

We’ll create our Set type with a Swift-native Dictionary as its data store. To get rolling, let’s make a new struct1 that we can add to, remove from, and check for the existence of members, along with a few other useful properties.

struct Set<T: Hashable> {
    typealias Element = T
    private var contents: [Element: Bool]
    
    init() {
        self.contents = [Element: Bool]()
    }

    /// The number of elements in the Set.
    var count: Int { return contents.count }
    
    /// Returns `true` if the Set is empty.
    var isEmpty: Bool { return contents.isEmpty }
    
    /// The elements of the Set as an array.
    var elements: [Element] { return Array(self.contents.keys) }

    /// Returns `true` if the Set contains `element`.
    func contains(element: Element) -> Bool {
        return contents[element] ?? false
    }
        
    /// Add `newElements` to the Set.
    mutating func add(newElements: Element...) {
        newElements.map { self.contents[$0] = true }
    }
    
    /// Remove `element` from the Set.
    mutating func remove(element: Element) -> Element? {
        return contents.removeValueForKey(element) != nil ? element : nil
    }
}

And now we can create our set of five vowels:

var vowelSet: Set<String> = Set()               // empty set
vowelSet.add("a", "e", "i", "o", "u", "y")      // a, e, i, o, u, y
vowelSet.remove("y")                            // eh, leave out y
println(vowelSet.contains("e"))
> > true

// if we add existing members, the count stays the same
println(vowelSet.count)
> > 5
vowelSet.add("a", "e")
println(vowelSet.count)
> > 5

Sequencing

It would be nice to be able to loop over all the values in our Set, the way you can loop over an Array with a for...in loop. Swift allows any type to opt in to this kind of looping by conforming to the SequenceType protocol. In this case, it’s a simple addition of just a single method, since iterating over our Set is just iterating over the keys of the internal Dictionary.

extension Set : SequenceType {
    typealias Generator = MapSequenceGenerator<DictionaryGenerator<T, Bool>, T>
    func generate() -> Generator {
        return contents.keys.generate()
    }
}

Great! Now we can loop over the contents of the Set to print all the values. (Remember what I said about sets not having any particular order?)

for vowel in vowelSet {
    println(vowel)
}
> > a
> > o
> > e
> > u
> > i

Easier initialization

Wouldn’t it be great if we could create and initialize our set in one step, instead of having to create the set, then add elements? With a little generics magic, we can create another initializer that will create a Set instance from a sequence of any kind. This initializer accepts anything that conforms to the SequenceType protocol.

extension Set {
    init<S: SequenceType where S.Generator.Element == Element>(_ sequence: S) {
        self.contents = [Element: Bool]()
        Swift.map(sequence) { self.contents[$0] = true }
    }
}

With this extension, we can now create new Set instances from Arrays, Ranges, and other sequences:

let setFromArray = Set([8, 6, 7, 5, 3, 0, 9])
let setFromRange = Set(1...10)
let setFromString = Set("abcdefg")

It would also be nice to create our Set through simple assignment, a task we can accomplish by implementing the initializer required for the ArrayLiteralConvertible protocol. With our new sequence-based initializer, this is a cinch.

extension Set : ArrayLiteralConvertible {
    init(arrayLiteral: Element...) {
        self.init(arrayLiteral)
    }
}

What’s great about both of these is that Swift’s type inference even figures out the subtype of our Set, so we don’t need to be as specific when declaring it. The first two lines of our example above, creating the vowelSet and adding the vowels to it, can be condensed to this one line:

var vowelSet: Set = ["a", "e", "i", "o", "u", "y"]

Going native

When we implemented the SequenceType protocol, we not only got the ability to loop through our set’s elements, we also got access to some very useful global functions. map(), filter(), and reduce() are all defined for sequences, so if we want to add up all the numbers in our set, we can do something like this:

let numberSet = Set(1...10)
let sum = reduce(numberSet, 0, { return $0 + $1 })
println(sum)
> > sum = 55

The only thing is, the global versions of map() and filter() return Arrays, not new Set instances. Let’s add these three methods to our Set type so they can type-specific and convenient.

extension Set {
    func filter(includeElement: (T) -> Bool) -> Set<T> {
        return Set(self.contents.keys.filter(includeElement))
    }

    func map<U>(transform: (T) -> U) -> Set<U> {
        return Set<U>(self.contents.keys.map(transform))
    }

    func reduce<U>(var initial: U, combine: (U, T) -> U) -> U {
        return Swift.reduce(self, initial, combine)
    }
}

Now, just like Swift’s built-in Arrays, our Set type can be sliced and diced using map() and filter().

let numberSet = Set(1...20)
let evenNumberSet = numberSet.filter { $0 % 2 == 0 }
let stringNumberSet = evenNumberSet.map { "#\($0)" }
println(stringNumberSet.contains("#4"))
> > true

Set algebra

We’re finally ready to implement some set-specific logic – that is, tests to see if one set is a subset or a superset of another and set equality, and mutating functions that will handle union, intersection, and subtraction.

extension Set {
	/// Returns `true` if the Set has the exact same members as `set`.
    func isEqualToSet(set: Set<T>) -> Bool {
        return self.contents == set.contents
    }
    
	/// Returns `true` if the Set shares any members with `set`.
    func intersectsWithSet(set: Set<T>) -> Bool {
        for elem in self {
            if set.contains(elem) {
                return true
            }
        }
        return false
    }

	/// Returns `true` if all members of the Set are part of `set`.
    func isSubsetOfSet(set: Set<T>) -> Bool {
        for elem in self {
            if !set.contains(elem) {
                return false
            }
        }
        return true
    }

	/// Returns `true` if all members of `set` are part of the Set.
    func isSupersetOfSet(set: Set<T>) -> Bool {
        return set.isSubsetOfSet(self)
    }

	/// Modifies the Set to add all members of `set`.
    mutating func unionSet(set: Set<T>) {
        for elem in set {
            self.add(elem)
        }
    }

	/// Modifies the Set to remove any members also in `set`.
    mutating func subtractSet(set: Set<T>) {
        for elem in set {
            self.remove(elem)
        }
    }
    
    /// Modifies the Set to include only members that are also in `set`.
    mutating func intersectSet(set: Set<T>) {
        self = self.filter { set.contains($0) }
    }
    
    /// Returns a new Set that contains all the elements of both this set and the set passed in.
    func setByUnionWithSet(var set: Set<T>) -> Set<T> {
        set.extend(self)
        return set
    }

    /// Returns a new Set that contains only the elements in both this set and the set passed in.
    func setByIntersectionWithSet(var set: Set<T>) -> Set<T> {
        set.intersectSet(self)
        return set
    }

	/// Returns a new Set that contains only the elements in this set *not* also in the set passed in.
    func setBySubtractingSet(set: Set<T>) -> Set<T> {
        var newSet = self
        newSet.subtractSet(set)
        return newSet
    }
}

Let’s put them in action:

let vowelSet = Set("aeiou")
let alphabetSet = Set("abcdefghijklmnopqrstuvwxyz")

println(alphabetSet.isSupersetOfSet(vowelSet))
> > true
println(vowelSet.isSubsetOfSet(alphabetSet))
> > true
println(vowelSet.isEqualToSet(alphabetSet))
> > false

var someLetterSet = Set("qwerasdfzxcv")
someLetterSet.intersectSet(vowelSet)
println(someLetterSet.elements)
> > [a, e]

Nearly operational

I’m not going to go all Ruby on you and have firstSet << secondSet take on some esoteric meaning, but we do want the basic addition and equality operators to work as expected. Using the built-in Array type as an example, we can implement the following operators for our Set type.

extension Set : Equatable { }

func +=<T>(inout lhs: Set<T>, rhs: T) {
    lhs.add(rhs)
}

func +=<T>(inout lhs: Set<T>, rhs: Set<T>) {
    lhs.unionSet(rhs)
}

func +<T>(lhs: Set<T>, rhs: Set<T>) -> Set<T> {
    return lhs.setByAddingSet(rhs)
}

func ==<T>(lhs: Set<T>, rhs: Set<T>) -> Bool {
    return lhs.isEqualToSet(rhs)
}

Taking these operators out for a spin:

var moreVowelsSet = vowelSet
println(moreVowelsSet == vowelSet)
> > true
moreVowelsSet += "y"               // hey, y is *sometimes* a vowel
moreVowelsSet += Set("åáâäàéêèëíîïìøóôöòúûüù")
println(moreVowelsSet == vowelSet)
> > false

Collecting ourselves

In order to be good Swift citizens (and get the benefit of some useful features like lazy collections), we should also implement the CollectionType and ExtensibleCollectionType protocols.

extension Set : ExtensibleCollectionType {
    typealias Index = Int
    var startIndex: Int { return 0 }
    var endIndex: Int { return self.count }

    subscript(i: Int) -> Element {
        return Array(self.contents.keys)[i]
    }

    mutating func reserveCapacity(n: Int) {
        // can't really do anything with this
    }
    mutating func append(newElement: Element) {
        self.add(newElement)
    }
    mutating func extend<S : SequenceType where S.Generator.Element == Element>(seq: S) {
        Swift.map(seq) { self.contents[$0] = true }
    }
}

For that extra work we get the benefit of a few more global functions, like countElements() and join().

println(countElements(vowelSet))
> > 5
println(join(", ", vowelSet.map { "\($0)" }))
> > i, o, e, u, a

Final touches

Last but not least, it would be nice to have some way of inspecting the contents of our Set instances. Time to implement the Printable and DebugPrintable protocols:

extension Set : Printable, DebugPrintable {
    var description: String {
        return "Set(\(self.elements))"
    }

    var debugDescription: String {
        return description
    }
}

Unfortunately, as of beta 6 these methods don’t seem to be used for user-defined types in either the Playground or during command-line execution, but you can still call them manually:

println(vowelSet.description)
> > Set([i, o, e, u, a])

Wrap up

There you have it, a fully functional Swift-native Set type. Take a look at the full source code in this gist (along with tests and performance timing) or let me know if I’ve missed anything over on Twitter. Thanks for reading!

Update: I’ve written a follow-up post with a few changes to the implementation, and posted a revised version of the Set type as a project called SwiftSets on GitHub.

  1. Why a struct and not a class? The Swift Blog had a post on the differences between value and reference types last week, and I’m following their lead with the Array and Dictionary types. Having Set be a value type gives guaranteed immutability and thread safety without any extra work.