
2015-02-19

Swift Gotcha: String.Index and the time of creation

String.Index can cause some nasty surprises, especially at runtime, just after you shipped your application...

OK, that did not happen to me. But it is still frustrating when unit tests start failing with "EXC_BAD_..." even though it seems you did everything right.

Maybe I missed it in the documentation, but String.Index is a special kind of type. It is defined as a struct, and that should have given me a hint. At first I thought that a String.Index stays associated with the string it is derived from, but that was too optimistic.

The valid range of a String.Index variable is fixed at the moment the variable is assigned.

I.e. when a String.Index variable is assigned and the string it is derived from has 10 characters, then the "endIndex" associated with the variable is fixed at that value. Even when the string is appended to later, the variable cannot be incremented beyond this endIndex.

We can test this with the following:


As you can see, even though the original string now has a higher endIndex, the associated variable 'ind' cannot be advanced beyond the original endIndex.

To me this is a major PITA: it means that after every update to a string, all associated index variables must be updated as well.

One particular instance where this is irksome is inserting a string into a string. Swift defines the insert function, but that only inserts one character. When it is necessary to insert a string, the usual solution looks like this:

var str = "1234567890"

let s = "abcd"

var index = str.startIndex.successor().successor()

for c in s {
    str.insert(c, atIndex: index)
    index = index.successor()
}

This works, but only so long as the insertion index does not grow beyond the original str.endIndex. As soon as that happens the following occurs:


The only solution I see at this time is the following:

extension String {
    
    func fixIndex(index: String.Index) -> String.Index {
        let kludge = distance(startIndex, index)
        return advance(startIndex, kludge)
    }
    
}

var str = "1234567890"

let s = "abcd"

var index = str.endIndex.predecessor().predecessor()

for c in s {
    str.insert(c, atIndex: index)
    index = str.fixIndex(index)
    index = index.successor()
}

This works, but it's quite ugly imo.
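
For readers on a later Swift release, here is a minimal sketch of the same insertion. It assumes Swift 5 syntax, which differs from the Swift 1.x code above. There the standard library offers insert(contentsOf:at:), which inserts a whole string in one call, so no index has to survive a mutation. When a position must be remembered across mutations, one option is to keep an Int offset and rebuild the index on demand; the helper name index(atOffset:) is made up for this example.

```swift
extension String {
    // Hypothetical helper: rebuild an index from a character offset,
    // so that no String.Index has to be kept across mutations.
    func index(atOffset offset: Int) -> String.Index {
        return index(startIndex, offsetBy: offset)
    }
}

var str = "1234567890"
let s = "abcd"

// Insert the whole string two characters before the end, in one call.
str.insert(contentsOf: s, at: str.index(str.endIndex, offsetBy: -2))
print(str) // 12345678abcd90

// Character-by-character variant that tracks an Int offset instead of an index.
var str2 = "1234567890"
var offset = str2.count - 2
for c in s {
    str2.insert(c, at: str2.index(atOffset: offset))
    offset += 1
}
print(str2) // 12345678abcd90
```

Rebuilding the index from an offset walks the string each time, so this trades some speed for never holding a stale index.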

Happy coding...

Did this help?, then please help out a small independent.
If you decide that you want to make a small donation, you can do so by clicking this
link: a cup of coffee ($2) or use the popup on the right hand side for different amounts.
Payments will be processed by PayPal, receiver will be sales at balancingrock dot nl
Bitcoins will be gladly accepted at: 1GacSREBxPy1yskLMc9de2nofNv2SNdwqH

We don't get the world we wish for... we get the world we pay for.

2015-02-16

Array subscript with multiple parameters

I needed to build an array of double values that ran as a cache next to a string. That is, the string would hold, say, 10 characters, then the array would hold 10 doubles, one for each character.

Like this:

let str = "1234567890"
let arr = Array<Double>(count: countElements(str), repeatedValue: 0.0)

When I need to look up a value in the array, I have a String.Index and need to convert it to an Int every time, like this:

for var i = str.startIndex; i < str.endIndex; i = i.successor() {
    println(arr[distance(str.startIndex, i)])
}

This works, but it looks horrible. Every time I need to access the array, I need to call the distance function.

It would be nice if I could use the String.Index as the index in the array also, like this:

extension Array {
    
    subscript (index: String.Index) -> T {
        set {
            self[distance("".startIndex, index)] = newValue
        }
        get {
            return self[distance("".startIndex, index)]
        }
    }
}

But a String.Index always stays associated with the string that it is derived from. You can check this with the following code:

let str1 = "1234567890"
let str2 = "12"
var j = str2.startIndex  // 'j' is derived from the short string
var c = str1[j]
j = j.successor()
c = str1[j]
j = j.successor()
c = str1[j]

The above will "EXC_BAD_INSTRUCTION" on us when 'j' is incremented beyond the endIndex of its associated string.

Thus the first implementation of the String.Index accessor above will not work. The first argument of the 'distance' call is derived from the empty string, so we get an EXC_BAD_INSTRUCTION as soon as we try to increment beyond its startIndex.

To my surprise however, it is possible to use multiple parameters in a subscript accessor, like this:

extension Array {
    
    subscript (string: String, index: String.Index) -> T {
        set {
            self[distance(string.startIndex, index)] = newValue
        }
        get {
            return self[distance(string.startIndex, index)]
        }
    }
}


let str = "1234567890"
let arr = Array<Double>(count: countElements(str), repeatedValue: 0.0)

for var i = str.startIndex; i < str.endIndex; i = i.successor() {
    println(arr[str, i])
}

After I tried this, I looked it up, and it is indeed documented that multiple parameters are possible in a subscript accessor. It is also written in the documentation that Swift will not copy a string unless it has to, so there should be no penalty associated with using the string itself in the subscript accessor instead of the slightly uglier looking String.startIndex.
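
As a rough sketch of how the two-parameter subscript could look in a later Swift release (assuming Swift 5, where the element type is called Element and distance is a method on the string), the names text and cache are made up for this example:

```swift
extension Array {
    // Two-parameter subscript: the string supplies the base for the offset.
    subscript(string: String, index: String.Index) -> Element {
        get {
            return self[string.distance(from: string.startIndex, to: index)]
        }
        set {
            self[string.distance(from: string.startIndex, to: index)] = newValue
        }
    }
}

let text = "1234567890"
var cache = [Double](repeating: 0.0, count: text.count)

// Store each character's offset as a stand-in for a real cached value.
for i in text.indices {
    cache[text, i] = Double(text.distance(from: text.startIndex, to: i))
}
print(cache[text, text.index(after: text.startIndex)]) // 1.0
```

The array has to be declared with var here because the loop writes through the subscript setter.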

It would be nice if it were possible to use only a String.Index when indexing an element. Unfortunately it is not possible to subclass Array, as it is defined as a struct. So if we want to use a single-parameter index, we need to define a wrapper and pass through the relevant calls. Like this:

class StringIndexArray<T> {
    
    var array: Array<T> = []
    var startIndex: String.Index
    
    init(forString: String) {
        self.startIndex = forString.startIndex
    }
    
    subscript (index: String.Index) -> T {
        set {
            array[distance(startIndex, index)] = newValue
        }
        get {
            return array[distance(startIndex, index)]
        }
    }
    
    // pass through of relevant calls

    func append(value: T) { array.append(value) }

}

Update 2015.02.19: The above implementation of StringIndexArray will fail if the string it is created from is updated later. See my next blog about "That pesky String.Index, part 2"

Update 2015.05.07: When the presented Array extension is used, be aware that the String.endIndex may "count" too many characters when "Combining" characters (unicode \u + CC + xx) are present in the base string.

Happy coding...


2015-02-11

Swift extensions: Array function removeObject

I have several instances in my Objective-C code that use the "removeObject" operation on an NSMutableArray.
In Swift, the Array type only offers us removeAtIndex, removeAll, removeLast and removeRange. No removeObject anymore.

But Swift does allow type extensions, thus we can implement it ourselves. My first attempt was this:

extension Array {
    
    mutating func removeObject(object: T) -> T? {
        if count > 0 {
            for index in startIndex ..< endIndex {
                if self[index] === object { return self.removeAtIndex(index) }
            }
        }
        return nil
    }

}

But that does not work; it generates an error: Type 'T' does not conform to protocol 'AnyObject'

Hmm. So that is the reason why 'removeObject' no longer exists. The generic item used in the array does not implement AnyObject. As you may recall 'AnyObject' is only available for class types, whereas 'Any' can be used for any type at all.

Since 'Array' can be instantiated with 'Any' type, the '===' operator is not available. This operator compares only references, and 'Any' type objects are not guaranteed to use references.

So if we want to implement a 'removeObject' then we must introduce an additional constraint on the type to be used. Like this:

extension Array {
    
    mutating func removeObject<U: AnyObject>(object: U) -> T? {
        if count > 0 {
            for index in startIndex ..< endIndex {
                if (self[index] as! U) === object { return self.removeAtIndex(index) }
            }
        }
        return nil
    }
}

Note that the compiler does not tie 'U' to the array's element type 'T'. The forced cast 'as! U' is therefore only safe as long as the function is called with an object of the type the array was instantiated for; with any other type the cast fails at runtime.
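
In later Swift releases the constraint can be placed on the extension itself, which removes the cast altogether. The following is a minimal sketch assuming Swift 5 (where 'T' is called Element and removeAtIndex became remove(at:)); the Token class is made up for the example.

```swift
extension Array where Element: AnyObject {
    // Removes the first element that is the same instance as 'object'.
    @discardableResult
    mutating func removeObject(_ object: Element) -> Element? {
        guard let i = firstIndex(where: { $0 === object }) else { return nil }
        return remove(at: i)
    }
}

// Hypothetical class, only here to have something to store.
final class Token {
    let name: String
    init(_ name: String) { self.name = name }
}

let first = Token("a")
let second = Token("a")
var tokens = [first, second]

// Identity, not equality: only the 'second' instance is removed.
let removed = tokens.removeObject(second)
```

Because the extension only exists for arrays of class instances, calling removeObject on an array of value types is rejected at compile time instead of failing at runtime.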

Happy coding...


2015-02-09

JSON + Swift = SwifterJSON

I just released a beta of my SwifterJSON project on github.

SwifterJSON is a single class JSON framework written (of course) in Swift. There are existing frameworks, and Apple themselves have a JSON parser in Cocoa. Still, I wanted to use JSON slightly differently and thought it would be a suitable vehicle to start programming in Swift.

SwifterJSON is slightly different in the sense that it creates a JSON hierarchy as a top level object that can be written to or read from. The top level object can be created from scratch, or it can be built from an existing string (or file).

Access to the hierarchy is intended to use the subscript model.

A minimum implementation would be like this (after adding the SwifterJSON file to the project):

        let top = SwifterJSON.createJSONHierarchy()
        top["books"][0]["title"].stringValue = "THHGTTG"

        let myJsonString = top.description

The myJsonString will then contain: "{"books":[{"title":"THHGTTG"}]}"

To parse the above JSON string the following code can be used:

        let (topOrNil, errorOrNil) = SwifterJSON.createJsonHierarchyFromString(myJsonString)
        if let top = topOrNil {
            if let title = top["books"][0]["title"].stringValue {
                println("The title of the first book is: " + title)
            } else {
                println("The title of the first book in myJsonString was not found")
            }
        } else {
            println(errorOrNil!)
        }

When reading a value from a JSON hierarchy it is important to always check for nil. Remember that a JSON string is often an input from the external world, and we need to make sure that our software gracefully handles errors that we receive from the outside world.
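
The same check-for-nil discipline applies with any parser. The sketch below does not use SwifterJSON; it assumes Swift 5 and Foundation's JSONSerialization (the Cocoa parser mentioned earlier), and the function name firstBookTitle is made up for the example. Every step that can fail returns nil instead of crashing.

```swift
import Foundation

// Hypothetical helper: returns the title of the first book, or nil when
// the input is not valid JSON or does not have the expected shape.
func firstBookTitle(in json: String) -> String? {
    guard let data = json.data(using: .utf8),
          let object = try? JSONSerialization.jsonObject(with: data),
          let top = object as? [String: Any],
          let books = top["books"] as? [Any],
          let first = books.first as? [String: Any],
          let title = first["title"] as? String
    else { return nil }
    return title
}

if let title = firstBookTitle(in: "{\"books\":[{\"title\":\"THHGTTG\"}]}") {
    print("The title of the first book is: " + title)
} else {
    print("The title of the first book was not found")
}
```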

If you are curious to see how SwifterJSON would work in your projects, please go and take a look at it on github.

Oh, and tell me what you think of it....

Happy coding...


2015-02-08

Lazy and cache

Lazy properties are perfect if one needs to calculate a certain property only once. But it would be nice to have control over the lazy property in order to recalculate its value on demand.

I am probably not the only one whose first thought was to do the following:

class ClassWithLazyProperty {
    // Stored properties with observers; a computed getter returning
    // self.a would call itself forever.
    var a: Int = 0 { didSet { distance = nil } }
    var b: Int = 0 { didSet { distance = nil } }
    lazy var distance: Int? = {
        return self.b - self.a
    }()

}

While this compiles nicely, it does not work. After the first update to either 'a' or 'b' the distance will be 'nil' and the closure that calculates its value will never be called again.
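
The run-once behaviour is easy to check in isolation. Here is a small sketch, assuming Swift 5; the class LazyDemo and its evaluations counter are made up for the example.

```swift
final class LazyDemo {
    var evaluations = 0

    // The closure runs on the first read only.
    lazy var value: Int? = { () -> Int? in
        self.evaluations += 1
        return 42
    }()
}

let demo = LazyDemo()
let firstRead = demo.value    // triggers the closure
demo.value = nil              // a plain assignment, not a "reset"
let secondRead = demo.value   // the closure is not called again
print(demo.evaluations)       // 1
```

Assigning nil simply stores nil as the property's value; nothing marks the property as "not yet computed" again.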

So I needed something else. This was my second approach:

class ClassWithLazyProperty {
    // Any change to 'a' or 'b' invalidates the cached value.
    var a: Int = 0 { didSet { p_distance = nil } }
    var b: Int = 0 { didSet { p_distance = nil } }
    var distance: Int {
        if p_distance == nil { p_distance = self.b - self.a }
        return p_distance!
    }
    private var p_distance: Int?
}

This actually works, but the code is horrible, and I have to do it again and again for each property that needs this kind of access.

Then I realised that what I needed was a cache. So here is my plea to Apple: Please introduce an attribute "cached" that would make the code in the first example work.

Absent the 'cached' attribute, we can still roll our own. Using generics and closures this actually looks pretty easy. My first approach was this:

class Cached<T> {
    
    private var cached: T?
    private var function: () -> T
    
    init(function: () -> T ) {
        self.function = function
    }
        
    func get() -> T {
        if cached == nil {
            cached = function()
        }
        return cached!
    }
    
    func reset() {
        cached = nil
    }
}

And when we need a cached variable:

class ClassWithCachedProperty {
    var a: Int { didSet { distance.reset() } }
    var b: Int { didSet { distance.reset() } }
    var distance = Cached<Int>(function: { [unowned self] in return self.b - self.a } )
    init() {
        a = 10
        b = 100
    }
    func doSomething() {
        println(distance.get())
    }
}

Unfortunately that does not compile. The compiler complains that self is used before it is available. The solution would be to move the initialisation of the Cached variable into the init function. But then the compiler complains that 'distance' is used before it is initialised.

Fortunately the second error message can be easily avoided:

class Cached<T> {
    
    private var cached: T?
    private var function: () -> T
    
    init(function: () -> T ) {
        self.function = function
    }
    
    func get() -> T {
        if cached == nil {
            cached = function()
        }
        return cached!
    }
    
    func reset() {
        cached = nil
    }
}

class ClassWithCachedProperty {
    var a: Int { didSet { distance.reset() } }
    var b: Int { didSet { distance.reset() } }
    var distance: Cached<Int>!
    init() {
        a = 10
        b = 100
        distance = Cached(function: { [unowned self] in return self.b - self.a } )
    }
    func doSomething() {
        println(distance.get())
    }
}

When we forget to put the proper initialisation in 'init' the thing will bomb on us during runtime. Not a perfect solution, but that possible bomb is easy to find during unit testing, so that is acceptable to me.
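
One possible variation, sketched here under the assumption of Swift 5, is to declare the cache itself as a lazy property. A lazy initialiser may refer to self, so neither the implicitly unwrapped optional nor the assignment in 'init' is needed. The class name Span is made up for the example, and this is not necessarily the refinement referred to at the end of this post.

```swift
final class Cached<T> {
    private var value: T?
    private let compute: () -> T

    init(_ compute: @escaping () -> T) {
        self.compute = compute
    }

    // Returns the cached value, computing it first when necessary.
    func get() -> T {
        if let value = value { return value }
        let fresh = compute()
        value = fresh
        return fresh
    }

    func reset() {
        value = nil
    }
}

// Hypothetical owner of a cached property.
final class Span {
    var a: Int { didSet { distance.reset() } }
    var b: Int { didSet { distance.reset() } }

    // Created on first access, when 'self' is fully initialised.
    lazy var distance: Cached<Int> = Cached<Int>({ [unowned self] in self.b - self.a })

    init(a: Int, b: Int) {
        self.a = a
        self.b = b
    }
}

let span = Span(a: 10, b: 100)
print(span.distance.get()) // 90
span.a = 50
print(span.distance.get()) // 50
```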

Happy coding

Edit: I have refined this approach in this post


2015-02-05

That pesky String.Index

I don't know about you, but I keep getting "interesting" behaviour when using String.Index. All in all, iterating over strings using the String.Index has gotten a whole lot more difficult in Swift as opposed to Obj-C.

Don't get me wrong, I actually like the String.Index: it forces me to really think about what I want and to make sure that the Data Model and GUI code are as separated as they should be.

Still... :-)

Take this example: I have to iterate over a string, incrementally forming ever larger substrings, starting from the one-character substring up to the complete length of the original string.

I.e., given the string "Swift" the first substring would be "S", the second "Sw" and so on up to the last one, which would be "Swift".

The first thought is:

let str = "Swift"

for i in str.startIndex.successor() ... str.endIndex {
    let substr = str.substringToIndex(i)
    println(substr)
}

But that won't work. You will get the error "cannot increment str.endIndex" at runtime, even though str.substringToIndex(str.endIndex) itself works fine.

I tried rewriting this in several ways, but it seems that no loop construct can properly handle this case. You will need to roll your own:

let str = "Swift"

var i = str.startIndex.successor()
while true {
    
    let substr = str.substringToIndex(i)
    
    println(substr)
    
    if i < str.endIndex { i = i.successor() } else { break }
}

Oh well...
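
For comparison, here is a minimal sketch of the same task in a later Swift release (assuming Swift 5). There a string exposes its valid indices as a collection, and a one-sided range such as ...i includes the character at i, so the loop never has to step onto endIndex. The name prefixes is made up for the example.

```swift
let word = "Swift"
var prefixes: [String] = []

// 'indices' only contains valid character positions, never endIndex.
for i in word.indices {
    prefixes.append(String(word[...i]))
}

print(prefixes) // ["S", "Sw", "Swi", "Swif", "Swift"]
```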

Happy coding...


2015-02-03

Parameter passing, part 2 - Immutable value ... only has mutating members named ...

Actually, I wanted to post something different about parameter passing, but since the following just happened to me, I thought it would make a nice example of how value and reference passing can sometimes lead to very interesting results.

Such is the case for the "for in .. {}" loop.

The "for in .. {}" loop uses a generator function to retrieve the values from a source. That source can be an array of basic types, or of more complex types such as structs and classes.

Example:

struct InsertionPoint {
    var index: Int
    var offset: CGFloat
    mutating func adjust(#indexBy: Int, offsetBy: CGFloat) {
        self.index += indexBy
        self.offset += offsetBy
    }
}

var insertionPoints: Array<InsertionPoint> =
    [InsertionPoint(index: 0, offset: 0.0),
     InsertionPoint(index: 10, offset: 120.0)]

let width: CGFloat = 15.0
let chars = 1

for ip in insertionPoints {
    ip.adjust(indexBy: chars, offsetBy: width)
}

I expected that the insertion points in the array would thus be 'adjusted' by the given values.

But no, what I actually got is a compilation error:

ip.adjust(indexBy: chars, offsetBy: width)
         (!)Immutable value of type 'InsertionPoint' only has mutating members named 'adjust' 

At first glance I found this surprising. Why should this compilation error occur?

Then I remembered that a struct is passed by value, and the generator function that retrieves the values from the insertionPoints array does exactly that: it gives us value copies from the array. Trying to update these immutable copies simply does not work. Even if it did work, it would be useless (in the above case), since the original values would remain unaffected.

The solution is to work directly with the array itself:

for i in 0 ..< insertionPoints.count {
    insertionPoints[i].adjust(indexBy: chars, offsetBy: width)
}

An array lookup gives us direct access to the value within.
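
The difference between the copy and the array element can be made visible with a short sketch. It assumes Swift 5 syntax and uses Double instead of CGFloat to stay free of imports; the names points, untouched and adjusted are made up for the example.

```swift
struct InsertionPoint {
    var index: Int
    var offset: Double
    mutating func adjust(indexBy: Int, offsetBy: Double) {
        index += indexBy
        offset += offsetBy
    }
}

var points = [InsertionPoint(index: 0, offset: 0.0),
              InsertionPoint(index: 10, offset: 120.0)]

// 'for var' makes the loop variable mutable, but it is still a copy.
for var ip in points {
    ip.adjust(indexBy: 1, offsetBy: 15.0)
}
let untouched = points.map { $0.index }   // still [0, 10]

// Going through the subscript mutates the element inside the array.
for i in points.indices {
    points[i].adjust(indexBy: 1, offsetBy: 15.0)
}
let adjusted = points.map { $0.index }    // now [1, 11]
```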

Another solution would be to change the struct to a class. Class instances are passed by reference, so the loop variable refers to the same object that is stored in the array and the error simply disappears:

class InsertionPoint {
    var index: Int
    var offset: CGFloat
    init(index: Int, offset: CGFloat) {
        self.index = index
        self.offset = offset
    }
    func adjust(#indexBy: Int, offsetBy: CGFloat) {
        self.index += indexBy
        self.offset += offsetBy
    }
}

var insertionPoints: Array<InsertionPoint> =
    [InsertionPoint(index: 0, offset: 0.0),
     InsertionPoint(index: 10, offset: 120.0)]

let width: CGFloat = 15.0
let chars = 1

for ip in insertionPoints {
    ip.adjust(indexBy: chars, offsetBy: width)
}

Personally I prefer the latter. The code looks cleaner and more readable to me.

Happy coding...


2015-02-02

Parameter passing

This article will take a look at parameter passing in Swift.

Function parameters are usually stored in a special place in memory, called "the stack". Before the function is called the stack is set up and after the function completes the contents of the stack that was used for the function is discarded.
The compiler can use the stack to pass parameters in two different ways: it either copies the value of a parameter onto the stack, or it copies the address of a parameter onto the stack.
When the value is copied ("by value"), the code inside the function has no way to determine the original location of the variable, and hence it is impossible to change this original variable.
When the location of the original variable is copied ("by reference") the code inside the function has full access to the original variable and any changes made will be visible outside the function.

Some types are always passed by value, others are always passed by reference.

The basic types (Int, Double etc) are always passed by value unless the developer takes extra effort to ensure that they are passed by reference.

Class types are always passed by reference. In order to ensure that a function does not change the original object, the developer must create a copy of the original object before calling the function.

Time for some code:

func addFive(value: Int) -> Int {
    return value + 5
}

var a = 5
var b = addFive(a)

println(a)

// prints "5"

This example shows passing by value: the original variable "a" is not changed by the code inside the function.

Also note that the following is not possible:

func addFive(value: Int) -> Int {
    value = value + 5  // <<< compiler error
    return value + 5
}

Even though the value is copied into local memory, the compiler protects against changes to the variable. This makes it clear that the original value is protected against changes. In effect the above is equivalent to:

func addFive(let value: Int) -> Int {
    value = value + 5  // <<< compiler error
    return value + 5
}

Here the let keyword makes it explicit that the parameter "value" is not to be messed with.

This also gives us a clue: in order to be able to use "value" as a real variable, we can use the var keyword to make it so:

func addFive(var value: Int) -> Int {
    value = value + 5
    return value + 5
}

var a = 5
var b = addFive(a)
println(a)

// prints "5"

But the original variable is still unaffected by the addition of var. Do note that the value of 'b' is now 15 instead of 10 as in the first code snippet.
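
A note for readers on a later Swift release: 'var' parameters were removed from the language in Swift 3. The usual replacement, sketched here in Swift 5 syntax, is to shadow the parameter with a local variable; the names x and y are made up for the example.

```swift
func addFive(_ value: Int) -> Int {
    var value = value   // local mutable copy that shadows the parameter
    value = value + 5
    return value + 5
}

var x = 5
let y = addFive(x)
print(x) // 5
print(y) // 15
```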

If we want to change the original value of 'a', we have to make this explicit through the use of the inout keyword. Adding the inout modifier turns a call by value into a call by reference:

func addFive(inout value: Int) -> Int {
    value = value + 5
    return value + 5
}

var a = 5
var b = addFive(&a)

println(a)

// prints "10"

This time the value of 'a' is changed. But note that the call to the function has also changed: the "&" must be placed in front of the variable. This means that we tell the compiler not to copy the value of 'a', but to use a reference to 'a'. (Called a pointer in C and Objective-C).

Since the code inside the function can now change the original variable, the original variable must be defined as a var, not as a let:

func addFive(inout value: Int) -> Int {
    value = value + 5
    return value + 5
}

let a = 5
var b = addFive(&a)  // <<< compiler error
println(a)

It does not matter whether the variable 'a' is ever changed: as soon as a variable is passed by reference, it must be defined as a var. Especially in calls to lower-level APIs this can cause some confusion, since it can be necessary to pass immutable data around using pointers. Even though the data itself may be immutable, it must still be held in a mutable variable; otherwise we cannot use it in pointer parameters.
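
To round this off, here is a short sketch of both reference cases in Swift 5 syntax, where inout is written after the colon rather than before the parameter name. The Counter class and the bump function are made up for the example.

```swift
// Value type passed by reference through inout.
func addFive(_ value: inout Int) -> Int {
    value = value + 5
    return value + 5
}

// Class instances are always passed by reference.
final class Counter {
    var count = 0
}

func bump(_ counter: Counter) {
    counter.count += 1   // changes the caller's object
}

var a = 5
let b = addFive(&a)
print(a) // 10
print(b) // 15

let counter = Counter()  // 'let' is fine: the reference itself never changes
bump(counter)
print(counter.count) // 1
```

The class case shows why 'let' does not protect the contents of an object: only the reference is constant, not the instance it points to.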

Happy Coding...
