James's Twisted Ape

Thoughts, code and anything else.

The Best Code Isn’t Always the Most Readable

I often see it expressed that the best solution to a problem is the one that is most readable to other people. I do not agree with this: it is often enough to know what the code does, not how it does it. For example, a good function to count the number of set bits in an integer is:

#include <stdint.h>

int countSetBits (uint64_t a) {
  /*
    Counts the number of set bits.
    Uses Hamming weight: https://en.wikipedia.org/wiki/Hamming_weight.
  */
  a = a - ((a >> 1) & 0x5555555555555555);
  a = (a & 0x3333333333333333) + ((a >> 2) & 0x3333333333333333);
  a = (a + (a >> 4)) & 0x0f0f0f0f0f0f0f0f;
  return (a * 0x0101010101010101) >> 56;
}

This code isn’t immediately understandable, but that doesn’t matter: it has a good name and a comment directing people to where they can find out how it works (if they’re interested).

Code isn’t intrinsically better because everyone can read it and immediately understand it. Almost all of the code that we run is entirely inside the black box of a method/function. It’s enough that somebody else has understood it and checked that it works. The above code can easily be checked against the trivial algorithm for all values.

Whilst there isn’t anything wrong with writing complex code (code that can’t be understood immediately, without additional reading/thought), there’s also no reason to go out of our way looking for the fastest way. In this case it would be incredibly easy to write a loop to count the number of set bits. If you don’t already know about the above algorithm then you may as well get on with your job and write the simple algorithm. If profiling your code at the end shows that this section needs to go faster, look up a faster way to write it.
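
The simple algorithm is just a loop (a minimal sketch; this is also the trivial version you could check the faster routine against):

#include <stdint.h>

int countSetBitsSimple (uint64_t a) {
  int count = 0;
  /* Add the lowest bit to the count, then shift it away,
     until no set bits remain. */
  while (a) {
    count += a & 1;
    a >>= 1;
  }
  return count;
}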

Comparing Floating Point Numbers

You often see the advice that you shouldn’t compare floating point numbers with the == operator. However, this advice is rarely well explained, and the suggested alternative to == often isn’t any better (and can be much worse).

The reason that people often say that you shouldn’t use the == operator is that floating point numbers don’t exactly represent the value that you set them to.

Looking at a decimal example, representing $\frac{1}{3}$ as a decimal floating point number gives $0.333333$. The floating point standard doesn’t have a notion of recurring numbers so the last place is rounded to a set precision. There may be an instance in our code where we want to find out if three times our number is equal to $1$, i.e. we may have code that does this:

decFloat a = 1.0 / 3;
...
if (3 * a == 1.0) {
  ...
} else {
  ...
}

Naïvely we may expect 3 * a to be 1.0; however, $0.333333 \cdot 3$ is $0.999999$, which is not equal to $1$, so the code will actually run into the else statement. In the case of binary floating point this example will actually work (i.e. (1.0f / 3) * 3 equals exactly 1.0). This is because multiplying by three takes our value above the next power of two, giving us more binary places of precision than a float can represent. This extra precision is used in the multiplication to round the answer up to 1.0. This isn’t the case for all examples of this type; for instance, in binary (1.0f / 83) * 83 does not equal exactly 1.0, as the rounding ends up going in the other direction.

A solution to our problem is to choose a small value (often called epsilon) and say that if the difference between our values is less than epsilon then the test is passed. It’s very often the case that the results of floating point calculations are very close to the results that we expect (in fact right next to them on the floating point number line).

Choosing the Wrong Epsilon

As we only want the test to pass if we’re really close, choosing the smallest possible difference as our epsilon might lead to code that looks like this:

DON’T DO THIS IT’S WRONG!!

#include <float.h>
#include <math.h>
#include <stdbool.h>

/* This is a broken floating point compare function, don't use it!! */
bool compareNearlyEqual (float a, float b) {
  return fabs (a - b) < FLT_EPSILON;
}

This will work great for values between $0.5$ and $2.0$; however, it is broken for all numbers greater than $2$. For instance, this will work (the values are around $1$):

float a = 1.0 / 83;
...
if (compareNearlyEqual (1, 83 * a)) {
  ...
} else {
  ...
}

However it won’t work for an example like this:

float a = 1.0 + 1.0 / 83;
...
if (compareNearlyEqual (84.0, 83 * a)) {
  ...
} else {
  ...
}

It fails to recognize 83 * a as close enough to 84.0. In this case there is no representable number between 83 * a and 84.0, i.e. 83 * a is the very next number that floating point can represent, so we’d hope that our test would say that they are close enough.

The problem here is that each time you go past a power of $2$, the gap between each number that can be represented doubles. To see this, imagine a binary floating point representation with a precision of 6 places. The number one has the representation $1$, and the next representable value above one is $1.00001$, so the gap between them is $0.00001$. Doing the same with two and its next value gives $10$ and $10.0001$: a gap of $0.0001$, which is twice as big as the gap between $1$ and its nearest neighbour (which has four zeros before the $1$, not three).
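
You can see these gaps directly with nextafterf (a small sketch; nextafterf returns the next representable float in the given direction):

#include <math.h>
#include <stdio.h>

int main (void) {
  /* The gap to the next representable float doubles past each
     power of two. */
  printf ("gap at 1.0:  %g\n", nextafterf (1.0f, 2.0f) - 1.0f);
  printf ("gap at 2.0:  %g\n", nextafterf (2.0f, 4.0f) - 2.0f);
  printf ("gap at 84.0: %g\n", nextafterf (84.0f, 100.0f) - 84.0f);
  return 0;
}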

In the coded example we’ve chosen FLT_EPSILON as our epsilon; this number is the gap between $1$ and the next representable value. If the floating point numbers we are trying to compare are larger than $2$, the gap between any float and the next nearest float is larger than FLT_EPSILON. This means that the numbers we are comparing are either identical (so we may as well have used ==) or the gap between them is larger than FLT_EPSILON, so the test will fail (just as it would have done if we’d used ==). Here’s a piece of code demonstrating that this compare method is broken.
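
Something along these lines (a minimal sketch reusing the broken function above) shows the failure:

#include <float.h>
#include <math.h>
#include <stdbool.h>
#include <stdio.h>

/* The broken compare from above. */
bool compareNearlyEqual (float a, float b) {
  return fabs (a - b) < FLT_EPSILON;
}

int main (void) {
  float a = 1.0f + 1.0f / 83;
  /* 83 * a is the float right next to 84.0f, but the gap between
     adjacent floats near 84 is far larger than FLT_EPSILON, so
     both lines print 0. */
  printf ("compareNearlyEqual: %d\n", compareNearlyEqual (84.0f, 83 * a));
  printf ("==:                 %d\n", 84.0f == 83 * a);
  return 0;
}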

If your number is greater than 2 the above comparison function is identical to ==

So what Epsilon Should you Choose?

Due to floating point inaccuracy one of the most common problems is that you end up calculating a value that is right next to the value that you wanted. This means that the epsilon you want to choose is the gap between the larger float and its next nearest value. A value very close to this can be found easily by multiplying the larger of the two values by FLT_EPSILON. This leads to a floating point compare function like this:

#include <float.h>
#include <math.h>
#include <stdbool.h>

bool compareNearlyEqual (float a, float b) {
  float epsilon;
  /* May as well do the easy check first. */
  if (a == b)
    return true;

  /* Scale the epsilon by the larger magnitude; taking fabsf also
     keeps epsilon positive when comparing negative values. */
  if (fabsf (a) > fabsf (b)) {
    epsilon = fabsf (a) * FLT_EPSILON;
  } else {
    epsilon = fabsf (b) * FLT_EPSILON;
  }

  return fabsf (a - b) < epsilon;
}

Here’s an example of this compare function working where the original compare function failed.
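
A minimal check, reusing the compareNearlyEqual just defined:

#include <stdio.h>

int main (void) {
  float a = 1.0f + 1.0f / 83;
  /* The fixed-epsilon compare returned false here; the relative
     compare recognizes 83 * a and 84.0 as adjacent floats. */
  printf ("%d\n", compareNearlyEqual (84.0f, 83 * a)); /* prints 1 */
  return 0;
}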

Why is the incorrect compare so prevalent?

The answer to this is that == works a lot better than most people think, so it usually doesn’t matter much that their compare function is almost always doing exactly the same thing as ==. For instance, due to rounding, the first time that (1.0f / n) * n doesn’t equal exactly $1$ is when n is $41$. There is a large and commonly used subset of floating point numbers whose representation is exact. Any number of the form $\frac{a}{2^n}$ (note that $n$ can be zero) will have an exact representation (provided $a$ is representable within the precision of the standard being used). With 32 bit floats this includes all integers up to $2^{24}$. If your floating point numbers are exactly representable, you can add, subtract and multiply them by other exact floating point numbers and the answer will remain exact (again, provided the answer doesn’t exceed your precision). It is also very common to be able to divide and have the result remain exact. For this important class of numbers you could use == without causing yourself any problems.
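
You can check the claim about $n = 41$ with a quick loop (a small sketch; it searches for the first n where the product isn’t exact):

#include <stdio.h>

int main (void) {
  for (int n = 1; n <= 100; ++n) {
    float a = 1.0f / n;
    if (a * n != 1.0f) {
      /* The text above says this first happens at n = 41. */
      printf ("first inexact result at n = %d\n", n);
      break;
    }
  }
  return 0;
}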

Basically, == works so often that, even though the wrong compare routine gives exactly the same result as == for the values likely to be used, most of the time you’ll get exactly the result you expected, so you won’t notice.

What can still go wrong?

Floating point numbers can still end up farther apart than the epsilon that you have chosen. The easiest way to demonstrate this is with subtraction. For instance, let’s work with 6 places of precision and compare $10 \frac{1}{3} - 10$ with $\frac{1}{3}$. Working through: $10.3333 - 10 = 0.333300$, and our compare function gives $0.333333 - 0.333300 = 0.000033$, which is much bigger than the epsilon we’d calculate for these values ($0.000001$). Really we need to use the epsilon that we would get for comparing the number $10.3333$: when we do the subtraction we cut off the first few digits, and we can’t just invent new digits to go at the end, so those digits are assumed to be $0$. If possible it would be better to write this compare the other way round, i.e. compare $10 \frac{1}{3}$ with $\frac{1}{3} + 10$.

You may decide that you want to increase the size of your epsilon so that it catches more cases. To do this it’s a good idea to multiply the epsilon in the function by a factor (this could either be a constant or a value you pass in, so you can change your epsilon for different situations).

“No number is close to 0” is in general true. The size of a value depends on your perspective of it: if you’re dealing with numbers around a million, 1 looks really close to 0; if on the other hand you’re dealing with numbers close to one millionth, 1 looks like a really enormous number. (Note too that with the relative compare above, comparing against exactly 0 yields an epsilon of 0, so only an exact 0 would pass.) This means that if you want to compare numbers to 0, and you think the floats you’re comparing may not be exact, you need to choose a value which is your smallest meaningful non-zero value, i.e. in this case you need a static epsilon. The epsilon for your code may be much smaller than FLT_EPSILON or it could be much larger, depending on the size of the values that you are normally dealing with. As the floating point standard can’t represent all numbers, it does actually have a value that is next to zero, but this would normally be a very poor choice of epsilon for anything.

There are other times when doing a comparison that you may want to choose a static epsilon, particularly if you want a certain behaviour whenever you’re around a particular number. This isn’t really the same situation as trying to determine whether you have floating point inaccuracies.

Really Horrible Effects of Using a Constant Epsilon

It’s bad enough that using FLT_EPSILON as a constant epsilon reduces the compare function to == for values greater than 2 (I explored this first because programmers are normally dealing with numbers of this sort of size). It’s even worse if your code has to deal with small numbers. The smaller your numbers get, the larger FLT_EPSILON is compared to them. This means that if you’re doing calculations with numbers around $10^{-8}$ and smaller, and you use a static epsilon of FLT_EPSILON, all of your numbers will compare equal to each other, so any comparison you do will be entirely pointless (it will always be true).

Conclusion

If you’re trying to solve floating point inaccuracy problems using an epsilon, the epsilon has to be relative to the size of the numbers that you are comparing. If it isn’t, then there is a good chance that your comparison function is pointless for most floating point numbers and dangerous to use with others.

There can be good reasons to use static epsilons, but you do need to think carefully about what the epsilon is going to be. The size and range of the floating point numbers you are using should have a strong bearing on the epsilon you choose. It’s incredibly rare that FLT_EPSILON is a good choice of static floating point epsilon. It may be OK if you’re only comparing numbers between $0.5$ and $2.0$, or as a threshold for closeness to zero.

Recursion

I really like recursion and I find that it is an option we often fail to consider when writing code. There have been times when I have thought up needlessly complicated solutions to a problem because, in general, I don’t look for recursive solutions early. I was reminded about this recently when looking at a StackOverflow question on calculating $\log\left(\frac{1 + z}{1 - z}\right)$ using continued fractions, implementing code to calculate it using the following formula:

$$ \log\left(\frac{1 + z}{1 - z}\right) = \frac{2 \cdot z}{1 - \frac{z^2}{3 - \frac{4 \cdot z^2}{5 - \frac{9 \cdot z^2}{7 - \dots}}}} $$

where $n$ counts the levels of division: level $n$ has the form $(2 \cdot n - 1) - \frac{n^2 \cdot z^2}{\dots}$, so $n = 1$ gives the line under the first division: $1 - \frac{z^2}{\dots}$.

In this example you can reasonably easily work out how to solve the problem without recursion by noticing that you calculate the last term that you want first; however, this requires you to think backwards. It also results in code that no longer looks like the algebraic expression, meaning you should comment the code well to explain what you’ve done. The result of avoiding recursion is that you’ve had to do more thinking, the code no longer looks like the calculation, and the next person who looks at your code will find it harder to work out what you’ve done.

Solving this problem recursively looks like this:

/* `i` is the iteration of the recursion and `n` is
   just for testing when we should end. `zz` is z^2. */
double theRestOfTheContinuedFraction (double zz, int i, int n) {
  if (!n)
    return 1;

  return 2 * i - 1 - i * i * zz / theRestOfTheContinuedFraction (zz, i + 1, n - 1);
}

double contFracLog (double z, int n) {
  return 2 * z / theRestOfTheContinuedFraction (z * z, 1, n);
}

This looks a lot like the original equation and is calculated in the same order as you read the equation.
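
As a quick sanity check, $z = 0.5$ gives $\log\left(\frac{1.5}{0.5}\right) = \log 3 \approx 1.0986123$ (a minimal sketch, assuming the two functions above are in scope):

#include <stdio.h>

int main (void) {
  printf ("%.7f\n", contFracLog (0.5, 20));  /* expect ~1.0986123 */
  return 0;
}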

Working Through a recursive problem

On another occasion on StackOverflow there was a question (probably either a homework or interview question) asking for the fastest way to print all of the combinations of 4 numbers between 1 and 9 that add to give 20. The answers to this question were fairly practical, such as checking every combination of numbers, keeping those which added to 20, and scrapping the rest. In a comment, I suggested that you could use a recursive solution to iterate over only the values that actually add to give 20. Here’s how you go about thinking of such a solution:

The first thing to do is to choose a starting value; since we’re likely going to have to iterate through all starting values, either 1 or 9 would be a good choice. Let’s start with 1. We remove 1 from 20 and get a new but very similar problem, i.e. we then need to find all of the combinations of three values between 1 and 9 that add together to give 19. Being able to reduce the problem to an almost identical but slightly simpler problem is a sure sign that recursion is a good approach. So, if we construct a function that allows us to calculate all the combinations of n values that add to x, then it can do all of the work for both of these cases (in the first case above n is 4 and x is 20; in the second case n is 3 and x is 19). As a result, we’d expect our function definition to look like this:

/* x is the value that we need to sum to, and n is the number
   of values (between 1 and 9) that we can use to add to get x */
void findAllCombinationsOfNValuesWhichAddToX (int n, int x, char *foundVals);

We need the foundVals variable to keep track of all the numbers that we’ve already decided we’re going to use, i.e. before calling the next findAllCombinationsOfNValuesWhichAddToX we need to have added a number to the foundVals string. foundVals will eventually contain the four numbers, separated by a comma and a space, in a C string, so when we start foundVals needs 11 chars of space: for each of the first three numbers 3 chars are required (the number, a comma, and a space), then we need the final number and the string terminator.

This is most of the work done (which is one of the lovely things about recursion). In its simplest form, the solution works like this: we iterate over all of the numbers that we can subtract from x (i.e. 1 - 9); on each iteration we write down the value (put it in the string), then we find all the combinations of n - 1 numbers that add to give x - iterationValue. We stop and print our result when n gets to 1, provided our final value is between 1 and 9.

#include <stdio.h>
#include <string.h>

/* x is the value that we need to sum to, and n is the number
   of values (between 1 and 9) that we can use to add to get x */
void findAllCombinationsOfNValuesWhichAddToX (int n, int x, char *foundVals) {
  if (n > 1) {
    /* Let's iterate over all of the numbers that we could subtract from x,
       subtract these values from x, and then find all combinations of
       numbers that can sum to give the new value (keeping track of
       numbers that we've used in foundVals). */
    for (int i = 1; i <= 9; ++i) {
      // First append the value that we've chosen to the string.
      int origStrLen = strlen (foundVals);
      char appendStr[4];
      sprintf (appendStr, "%d, ", i);
      strcat (foundVals, appendStr);
      /* Now work out the rest of the values that we need to add to get to the
         remaining number. */
      findAllCombinationsOfNValuesWhichAddToX (n - 1, x - i, foundVals);

      /* Remove the value that we chose this time round ready to
         append a new value next time round. */
      foundVals[origStrLen] = '\0';
    }
    return;
  }

  if (x > 9 || x < 1)
    return;

  // We've found a group of numbers that adds to the target.
  printf ("%s%d\n", foundVals, x);
}

Now we have something that works (well, kind of). The current solution has two major problems. The first is that we’re still iterating over all the possible combinations (a problem we hoped recursion would help us with), and the second is that we get duplicate values (e.g. we find 5, 5, 4, 6 and 5, 5, 6, 4, which aren’t unique solutions).

To solve these issues we need to be more intelligent about which values we iterate over. So in the for loop let’s remove the hard coded 1 and 9 and replace them with variables minValue and maxValue.

Looking at minValue first, there is a limit on how small it can be. Imagine that we had two values that need to add to 16. We know that the largest value we can use is 9, so the smallest value we can use is 16 - 9; if you choose any value smaller than this you aren’t able to make 16. Another example would be 3 values that add to 20. The maximum that each of the next two values can be is 9, so the smallest viable value we can choose is 20 - 2 * 9 = 2. In general the smallest value that we can choose is x - (n - 1) * 9. If this value is less than 1 then we can go back to the minimum value of 1. This gives us code like this:

int minValue = x - (n - 1) * 9;
if (minValue < 1)
  minValue = 1;

for (int i = minValue; i <= 9; ++i) {
...

Now we can use the same reasoning to get a maximum value. If we need two values that sum to 8 then the largest value that we should choose is 8 - minValue. Any value larger than this and we will get a total greater than 8. Similarly, if we have n values that add to x then the maxValue will be x - (n - 1) * minValue. This gives code that looks like this:

  int maxValue = x - (n - 1) * minValue;
  if (maxValue > 9)
    maxValue = 9;

Putting this together gives:

/* x is the value that we need to sum to, and n is the number
   of values (between 1 and 9) that we can use to add to get x */
void findAllCombinationsOfNValuesWhichAddToX (int n, int x, char *foundVals) {
  if (n > 1) {
    /* Let's iterate over all of the numbers that we could subtract from x,
       subtract these values from x, and then find all combinations of
       numbers that can sum to give the new value (keeping track of
       numbers that we've used in foundVals). */

    int minValue = x - (n - 1) * 9;
    if (minValue < 1)
      minValue = 1;

    int maxValue = x - (n - 1) * minValue;
    if (maxValue > 9)
      maxValue = 9;

    for (int i = minValue; i <= maxValue; ++i) {
      // First append the value that we've chosen to the string.
      int origStrLen = strlen (foundVals);
      char appendStr[4];
      sprintf (appendStr, "%d, ", i);
      strcat (foundVals, appendStr);
      /* Now work out the rest of the values that we need to add to get to the
         remaining number. */
      findAllCombinationsOfNValuesWhichAddToX (n - 1, x - i, foundVals);

      /* Remove the value that we chose this time round ready to
         append a new value next time round. */
      foundVals[origStrLen] = '\0';
    }
    return;
  }

  if (x > 9 || x < 1) {
    printf ("There's something wrong with the code, we should never get here\n");
    exit (0);
  }

  // We've found a group of numbers that adds to the target.
  printf ("%s%d\n", foundVals, x);
}

Note that we can now use our original check for whether we should print the values to make sure that our new method is working (i.e. that we aren’t iterating over values that don’t add to x). We should never get to the end of the recursion without a set of numbers that adds to x.

So with the recursion we’re now only looping over values that will genuinely add to x; all other values are ignored. The only remaining issue is duplicates. This issue is resolved by noticing that once we’ve found all of the combinations that contain a 1, we never need to include 1 again. What this means in practice is that after we’ve iterated over a value, we want to tell the following calls to the recursive function not to use this value again. As our iteration values are increasing, we can do this by passing a minimum value into the recursion instead of assuming that the minimum value is 1. This leads to the final code looking like this:

/* x is the value that we need to sum to, and n is the number
   of values (between minValue and 9) that we can use to add to get x */
void findAllCombinationsOfNValuesWhichAddToX (int n, int x, char *foundVals,
                                              int minValue) {
  if (n > 1) {
    /* Let's iterate over all of the numbers that we could subtract from x,
       subtract these values from x, and then find all combinations of
       numbers that can sum to give the new value (keeping track of
       numbers that we've used in foundVals). */
    int newMin = x - (n - 1) * 9;
    if (newMin < minValue)
      newMin = minValue;

    int maxValue = x - (n - 1) * minValue;
    if (maxValue > 9)
      maxValue = 9;

    for (int i = newMin; i <= maxValue; ++i) {
      // First append the value that we've chosen to the string.
      int origStrLen = strlen (foundVals);
      char appendStr[4];
      sprintf (appendStr, "%d, ", i);
      strcat (foundVals, appendStr);
      /* Now work out the rest of the values that we need to add to get to the
         remaining number. */
      findAllCombinationsOfNValuesWhichAddToX (n - 1, x - i, foundVals, i);

      /* Remove the value that we chose this time round ready to
         append a new value next time. */
      foundVals[origStrLen] = '\0';
    }
    return;
  }

  if (minValue > x) {
    /* This deals with a single edge case where choosing the maxValue in
       the iteration can leave you with x as maxValue - 1, which would
       print a combination that has already been found (e.g. two numbers
       that add to give 17: first we get 8, 9, then on the next iteration
       we'd choose 9 first, leaving x as 8, and end up here). */
    return;
  }

  if (x > 9 || x < 1) {
    printf ("There's something wrong with the code, we should never get here\n");
    exit (0);
  }

  // We've found a group of numbers that adds to the target.
  printf ("%s%d\n", foundVals, x);
}
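
A possible driver for this (a small sketch; the function prints its results directly, so we just need the 11 char buffer described earlier and the initial call):

int main (void) {
  /* Three "N, " groups (9 chars) + the final digit + the terminator. */
  char foundVals[11] = "";
  findAllCombinationsOfNValuesWhichAddToX (4, 20, foundVals, 1);
  return 0;
}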

Once you’re accustomed to considering a recursive solution, many problems become simpler, shorter, and easier to understand.

C Pointers

I’ve written this post because I see a lot of confusion on the internet about C pointers, and it’s almost always a misunderstanding of the basic syntax. So here’s a post explaining the syntax.

For the whole of this post we’re going to ignore the case where * means multiply (when it is placed between two number types).

Random Access Memory (RAM)

To understand what’s going on with pointers you first need to know how memory works on your computer. If you know this already, skip on to the next section.

Your computer has some very fast memory used for running programs; this is called RAM (Random Access Memory). This memory is addressed by giving each byte a number. The computer can then ask the RAM chip for the data at that address (number) and it’ll get the data. This is why 32 bit systems can only support 4 GB of RAM: a 32 bit system has an address size 32 bits long, i.e. it only has $2^{32}$ numbers with which to address the RAM ($2^{32}$ bytes = 4 gigabytes).

Addresses

When we use the word address we mean the location (byte number) in RAM of the start of the data we are interested in.

When you create a variable in your program, that variable is put into the RAM for the lifetime of the current scope (within the {...} in which the variable is declared). This means that your variable has an address; you can get the address using the & symbol. The following code prints the address of variable a:

int a;
printf ("address of a = %#llx \n", &a);
printf ("address of a = %p \n", &a);

%#llx prints up to a 64 bit number in hexadecimal with 0x preceding it. The top print will give you a type warning, because the compiler knows that &a has the type of an address at which there is an integer, not type long long int (which is the type we have asked it to print); but as an address is just a number, it’ll work without any problems. %p is used to print addresses. If you run this code you will find that both lines print exactly the same thing, which is the location/address in RAM of a.

Pointers

The first thing to do is to introduce the jargon:

  • A pointer: A variable that contains the address of some area of your RAM.
  • Pointed at: The data at the address stored in a pointer

i.e. in “the thing pointed at by a”, a is a pointer, and we want to talk about the data at the address stored in a.

All you need to know to understand pointers is that * means “the thing pointed at by”. Let’s look at some examples of this:

int a;

This means “a is a variable of type int”.

int *a;

This means “the thing pointed to by a is of type int”.

int *a;
int b = 10;
a = &b;
int c = *a;
  • The thing pointed at by a is an int.
  • b is an int and equals 10.
  • a equals the address of b.
  • c is an int and equals the thing pointed at by a (the int at the address stored in a).

At the end of this c will equal 10.

int *a;
int b = 10;
a = &b;
*a = 15;

This is nearly the same as above except for the last line. The last line says:

The thing pointed at by a equals 15. a points at the memory of b, so if we printed the value of b at the end of this we would find that it had changed from 10 to 15.

As we can see from the examples, every time we see * we can read it as “the thing pointed at by”.

You can create and set a pointer in a single line like this:

int a = 10;
int *b = &a;

The second line of this code reads “the thing pointed at by b is an int, and b equals the address of a”. Note that in this case the * only applies to the declaration of b and not to the setting of b, i.e. this code is equivalent to:

int a = 10;
int *b;
b = &a;

When using * like we have above (both in declarations and when placed in front of a variable) it works on the variable that comes directly after it. So in this case:

int *a;

* changes a; it does not change the type. This means that if you write something like this:

int *a, b;

The code above can be read as “b, and the thing pointed at by a, are ints”. Note that it would seem more natural to write the statement above with a first, as this is how we declared it; unfortunately that statement is ambiguous in English, and would need to be read like this: “(the thing pointed at by a) and b are ints”, taking note of the brackets.

To get a and b to be pointers you need to write:

int *a, *b;

To put this in technical terms, * is a unary prefix operator. An operator is a symbol that represents a specific action, unary means that it acts on a single element, and prefix means that it comes directly before the element.

You will see people write code like this: int* a. This makes it look like * is changing the type, i.e. changing the type from int to a pointer to an int. This isn’t how C works, and the compiler will still read the code as int *a. Writing int* a is as weird as writing bool a = true; bool b =! a;. If you read that spacing literally, you’d expect bool c =! a && b to mean bool c = !(a && b), when it actually means bool c = (!a) && b.

What’s next

There is loads more to explore about pointers, such as arrays and pointer arithmetic, or types/typecasting (especially interesting in an Object Oriented situation such as Objective C or GTK).

Have fun.

Blocks and Memory Management

What happens when you create a block?

Normally when you write a block in Objective C, the compiler actually creates an Objective C object; however, the object is likely different to any other Objective C object that you will have seen. One of two types of block will be created:

  1. If the block doesn’t use any variables from outside its scope, a global block will be created. In concept this works exactly the same as a nameless function: it’s an area of code that you can call into, it is always there, and it never gets destroyed. If you were to save the address of this block when creating it, you could call into this code at any time during the lifetime of your program. As you don’t reference any variables from outside the block, there are no issues with memory management. These blocks are not interesting from a memory management point of view, so will be mostly ignored in the rest of this blog post.

  2. If the block references any variables that are declared outside the block then the block will be created as a stack object. This isn’t something that exists in the rest of the Objective C world. The rest of the time when you create an object, memory is put on the heap, not the stack (other languages do support stack objects as a normal concept).

The block being a stack object has some nice advantages. The first is that it’s always really fast to create the object: the program’s stack is calculated in advance, so there is no need to do any work to find contiguous memory to allocate for the object; it’s already there. The second (and more important) is that all of the object’s memory drops off the stack once you leave the current stack frame, i.e. when you return from your function or method. This means that in a manual reference counting (MRC) environment you can create the block object without having to keep a reference to it in order to manually free its memory, i.e. you can do this:

NSArray *array = [NSArray arrayWithObject:@"testing"];

[array enumerateObjectsUsingBlock:^(id obj, NSUInteger idx, BOOL *stop) {
  NSLog (@"%@", array);
}];

We don’t need to worry about the object that this has secretly created because all of that object’s memory will be cleared once we leave the function.

This is very different to other objects on the heap. In an MRC environment the following will cause you to leak memory:

NSArray *array = [NSArray arrayWithObject:[[NSView alloc] init]];

We have created an NSView but have no way to release it again. To solve this we would need to do the following:

NSView *view = [[NSView alloc] init];
NSArray *array = [NSArray arrayWithObject:view];
[view release];

So by making blocks secretly stack objects, they can work nicely with manual reference counting with a nice syntax where we appear to be just adding a block of code as an argument.

Another important thing to note is that because blocks are secretly stack objects, they can’t take ownership of (i.e. increment the retain count of) any variables used inside the block that were declared outside it (at least not without the magic of ARC). The block objects aren’t deallocated; their memory is simply dropped when the method returns, so under MRC there is nowhere for the block to remove its ownership of the objects (which would normally be done in dealloc). This means that code like this:

Block Ownership
NSArray *array = [[NSArray alloc] initWithObjects:@"test", nil];
void (^block)(void) = ^{
  NSLog(@"%@", array);
};

[array release];
block();

may cause a crash as the array object has been deallocated. (If you’re very unlucky the array pointer will point at something that responds to the description method and you won’t get the crash.)

As the block is on the stack, keeping a reference to the block that outlasts the scope of the function/method won’t work:

Break Stack Block
@implementation JTAObject
{
  void (^keepBlock)(void);
}

- (id)init
{
  if ((self = [super init])) {
    keepBlock = ^{
      // We need to reference `self` so we don't make a global block by mistake.
      NSLog(@"self = %@", self);
    };
  }

  [self performSelector:@selector(breakingBlockCode) withObject:nil afterDelay:3];

  return self;
}

- (void)breakingBlockCode
{
  keepBlock();
}

@end

The above code causes a crash (without the reference to self within the block, a global block is created, so everything would work fine).

If you want to keep a block around beyond your method you need to call the copy method on the block. This method copies the block from the stack to the heap. It also retains all of the variables that the block references, which means the block now owns the variables referenced inside it. So if we change the Block Ownership code example from a couple of listings above, we won’t crash:

block ownership with copy
NSArray *array = [[NSArray alloc] initWithObjects:@"test", nil];
void (^block)(void) = ^{
  NSLog(@"%@", array);
};

void (^blockCpy)(void) = [block copy];

[array release];
blockCpy();

[blockCpy release];

There are a couple of things to note here. We need to keep a reference to our copied block; this block is now under the normal reference counting rules and we need to release it when appropriate. The variables declared outside the block that are used within the block have also had their retain counts incremented, so when we call blockCpy after releasing array we don’t crash.

What changes in ARC?

In ARC the memory management works differently. Suddenly with ARC there is the magical ability to deallocate the instance variables of the block when the block drops out of scope. This means that the block can have strong instance variables (i.e. it can increment the retainCount of its instance variables) without the risk of leaking memory.

When you create a block in ARC the block’s instance variables (the variables that were declared outside the block, and used inside the block) are declared with the same ARC Ownership Qualifiers as they were declared with outside of the block. This means that if you use a variable that is declared as __strong inside a block, the block will keep a __strong copy of the variable (i.e. the variable’s retainCount will be incremented). If however the variable’s ownership qualifier is __weak or __unsafe_unretained the variable’s retainCount won’t be incremented by the block (you can’t capture an __autoreleasing variable in a block). Here is code similar to the Block Ownership code from earlier, and it won’t crash:

Block Ownership ARC
NSArray *array = [[NSArray alloc] initWithObjects:@"test", nil];
void (^block)(void) = ^{
  NSLog(@"%@", array);
};

array = nil;
block();

The block has kept a reference to array because array is declared as __strong.

Let’s instead try passing in a __weak reference:

NSArray *array = [[NSArray alloc] initWithObjects:@"test", nil];
__weak NSArray *wkArray = array;
void (^block)(void) = ^{
  NSLog(@"%@", wkArray);
};

array = nil;
block();

This code will print (null) to the console. This is because when the object referenced by a __weak variable is deallocated the __weak variables associated with the object are set to nil. In this case when we do array = nil, wkArray is set to nil.

If instead we gave wkArray an ownership qualifier of __unsafe_unretained we’d likely get a crash (we’d definitely get a crash if Zombies are enabled). This is because wkArray now points to the location of an object that has been deallocated, and __unsafe_unretained doesn’t have the magic ability of __weak to make the variable nil when it is deallocated.

In some instances you may think that your code will always hold another __strong reference to the object used in a block in a non-__strong way (i.e. you know another __strong reference to the object exists when you call the block), and that you’d rather have a crash if there is no reference to the object. In this case I’d recommend using __weak and asserting that the variable exists within the block, as using __unsafe_unretained doesn’t guarantee you will get a crash when you try to reference the variable.

This makes the ARC situation with blocks very different to the MRC situation. When you use a __weak reference within a block it doesn’t matter if the block is on the stack or the heap, the block won’t take ownership of the object. Conversely if you use a __strong reference regardless of whether the block is on the heap or the stack the block will take ownership of the object.

The ability to use __weak references within a block allows you to avoid retain cycles. The following code will never cause a retain cycle:

@implementation JTAObject
{
  void (^block)(void);
}

- (id)init
{
  if ((self = [super init])) {
    __weak JTAObject *weakSelf = self;
    block = ^{
      NSLog (@"self = %@", weakSelf);
    };
  }

  return self;
}

@end

If the block kept a __strong reference to self here, we would have a retain cycle.

However, because the block doesn’t keep a strong reference to our JTAObject, if something else has a reference to the block it’s possible for the block to be called after self has been deallocated. If this is a possibility in your code, due to the magic of weak references you can check whether weakSelf exists and exit the block early if it doesn’t.

The above applies regardless of whether the block is on the stack or the heap. The memory management is done entirely based on the ownership qualifiers of the objects referenced within the block.

It doesn’t really matter any longer whether a block is on the stack or the heap as ARC will clear things up anyway. It is however interesting to know what types of blocks we are creating:

  1. If you create a block and no variables are referenced within the block, then a global block is created and all reference counting (including ARC) is ignored. The following options assume that you reference a variable from outside the block.

  2. If you create a block and don’t keep a reference to the block (i.e. you pass it into a method immediately) then a stack block will be created.

  3. If you create a block and keep a __strong reference to the block (which is the default), the block will be immediately copied to the heap so that ARC can deallocate the block sensibly when it goes out of scope.

  4. If you create a block and keep an __unsafe_unretained reference then the block will be created on the stack like it is in MRC.

  5. If you try to create a block and only keep a __weak reference to it, the block will be moved to the heap and immediately deallocated, probably causing a crash. So this is a very bad idea.

If you’re interested, you can test the type of block you have by printing the block’s class in the debugger (i.e. po [myBlock class]). Global blocks have the class __NSGlobalBlock__, blocks on the heap have the class __NSMallocBlock__, and blocks on the stack have the class __NSStackBlock__.

Great South Run

On the 26th of October, my wife and I are going to be running the Great South Run. We’re running in support of Tourettes Action. We’ve chosen this charity because our oldest child has Tourette’s, along with several related issues. These issues add considerable extra difficulties for both him and us. However, the support of friends made through this charity has helped immensely.

Your support for this charity would be greatly appreciated. If you feel inclined you can give using my wife’s Just Giving page.

Xcode 6 Simulator Folders

In the latest version of Xcode, finding your files on an iOS simulator has become a lot harder. The file structure has changed, presumably to help with sharing data between applications. As a result I have written a short bash script to help you find your application’s folders on the iOS simulator. You can download it here.

I’ve added this program to my PATH so I can run it from anywhere in Terminal.

If you run it with no commands it will print some help, and a list of simulator folders found on your machine:

Jamess-MacBook-Air:~ james$ simulatorAppFolders
simulatorAppFolders [-?sap]
-? show help, and prints simulators currently installed
-s <device name> prints the folder of this simulator without -i will choose the first matching folder
-i <iOS version> finds the folder matching the iOS version as well as the simulator name
-p Used with -s to print the installed apps on a simulator
-a <application name> Used with -s to print the application and data folder of the application

simulator names:
iOS-8-0 iPad 2
iOS-8-0 iPhone 6
iOS-8-0 iPhone 6 Plus
iOS-7-1 iPad 2
iOS-7-1 iPhone 5s
iOS-8-0 iPad Air
iOS-8-0 iPad Retina
iOS-7-1 iPhone 4s
iOS-8-0 iPhone 5s
iOS-8-0 iPhone 4s
iOS-7-1 iPad Air
iOS-7-1 iPhone 5
iOS-8-0 Resizable iPhone
iOS-7-1 iPad Retina
iOS-8-0 Resizable iPad
iOS-8-0 iPhone 5

With the -s command you can give it a device name and it will print you the folder for the given device (don’t forget to quote the device name):

Jamess-MacBook-Air:~ james$ simulatorAppFolders -s "iPad Air"
/Users/james/Library/Developer/CoreSimulator/Devices/6A9DEE93-FAEF-4BF4-9989-BC14CD38AEA0

If there are multiple devices with this name (say one with iOS 7 and one with iOS 8) then you can add the -i command to specify the iOS version:

Jamess-MacBook-Air:_posts james$ simulatorAppFolders -s "iPad Air" -i iOS-8-0
/Users/james/Library/Developer/CoreSimulator/Devices/6A9DEE93-FAEF-4BF4-9989-BC14CD38AEA0

If you add the -p command it will print the applications that you currently have installed:

Jamess-MacBook-Air:_posts james$ simulatorAppFolders -s "iPad Air" -i iOS-8-0 -p
PagingScrollView
CollectionViewSource

Finally, if you replace the -p command with the -a command, the script will print the folders associated with your app:

Jamess-MacBook-Air:_posts james$ simulatorAppFolders -s "iPad Air" -i iOS-8-0 -a PagingScrollView
App folder = /Users/james/Library/Developer/CoreSimulator/Devices/6A9DEE93-FAEF-4BF4-9989-BC14CD38AEA0/data/Containers/Bundle/Application/8F64CD7F-A227-43D6-98AC-41D3919827CB
dataFolder = /Users/james/Library/Developer/CoreSimulator/Devices/6A9DEE93-FAEF-4BF4-9989-BC14CD38AEA0/data/Containers/Data/Application/96C40A21-5A5C-4399-839C-B155B8A72932

Update

Thanks to Derek Clarkson at odecee, who pointed out that it’s also useful to be able to search on the device type as well as the device name, and amended the script to be able to do this using a -d option. If you re-download the script you can do this.

I’ve also written another script which makes a folder in your home area called iOS_SIMULATORS (if you already have a folder with this name it will be deleted). In this folder there is a directory structure where you can easily find your applications (they’re sensibly named!).

You can use a -f option on this script and it will collapse/remove useless folders, so running it on the same system as the original command gives a much flatter result.

You can download this script here.

Update

There is a git repo with both of the scripts here.

Horizontal Paging Scroll View With Varying Width in Storyboards

If you are using autolayout in your projects, scroll views can initially appear to cause an issue. The trick to solving this is that if the size of the scroll view is tightly constrained, and the size of the contents of the scroll view is tightly constrained, then autolayout will do the rest of the work for you. Historically it has been quite simple to make a horizontally paging scroll view, because the width of the screen didn’t change, so it was unlikely that the width of your scroll view needed to change. This meant you could hard code the width of the scroll view content to be a multiple of the width of the scroll view. This is now changing, so we need to be able to make these scroll views for varying width screens.

To lay out the contents of a scroll view in iOS you need to (at least temporarily) make more room in your storyboard so you can fit the contents of the scroll view on screen. To do this select the view controller that contains the scroll view, select the ruler symbol, and change the ‘Simulated Size’ from ‘Fixed’ to ‘Freeform’. Then adjust the width so you have enough room to add all of the contents of your scroll view.

Next add your scroll view, and add the constraints to bind it to the correct position on the screen (this will normally be bound tightly to the sides of the screen; NOTE: you should turn off the ‘Constrain to margins’ option (formerly ‘Prefer margin relative’) if you are going to bind to the edge).

If you have a lot of constraints it can be useful to rename them, as this makes it considerably easier to adjust the constraints later.

We’re now going to add a view to constrain the size of the contents of your scroll view. Select a UIView and add it to the inside of the scroll view so it entirely fills the scroll view. I’d recommend renaming the view so that its purpose is clear; I’ve renamed mine ‘ContentView’. Renaming the views means that the constraints will have much more informative auto-generated names, and reduces the amount of constraint renaming that you need to do. Add constraints to bind your content view to the edges of your scroll view.

Now we’re going to make our content view have the correct width to make our scroll view scrollable. To do this we need to add width and height constraints, relative to the width and height of the scroll view. To achieve this we start by adding constraints to make the content equal to the size of the scroll view: right click on the content view and drag onto the scroll view, then select the ‘Equal Widths’ option. Then do the same for heights.

Instead of having the content equal in width to the scroll view we need it to be a multiple of the scroll view’s width, so we edit the equal width constraint that we created. Select the equal width constraint, then select the ruler icon in the right hand panel. In the top section make sure that the ‘First Item’ is ‘ContentView.Width’ and the ‘Second Item’ is ‘Superview.Width’. If the items are in the other order, click the drop down menu on either item and select the ‘Reverse First and Second Item’ option. Change the multiplier so that its value is the number of pages that you want in your scroll view. The storyboard will now warn you that the width of your content view doesn’t match your constraints; you don’t need to worry about this at the moment.

Next you can add views to be the backgrounds for your pages. To do this add a UIView for each page to the inside of the content view, each with the width you want your scroll view pages to appear in the storyboard.

Now we want to add the constraints to the pages. In Xcode 5 you can make the page views an equal width to the scroll view. This doesn’t seem to work in the latest beta of Xcode 6, so instead you need to make the page views a proportional width of the content view. To do this make the page view the same width as the ContentView using the ‘Equal Widths’ option as we did above. Then select the constraint and make sure that ContentView.Width is the top item. Once this is the case you can again set the multiplier to the number of pages that you want to make. This constrains the page to the correct width.

Now you will need to add constraints for your pages so that the pages are in the correct places in the view.

After this step you can layout your pages however you desire.

There will still be some warnings on your view controller. This is because the content view and the scroll view are currently the same size, while the constraint says that the content view should be a multiple of the size of the scroll view. To resolve this turn the view controller back to a fixed size. After this, look through any remaining warnings; it should simply be a matter of updating the frames of your views so that they match the constraints that you have added.

Here is a sample project with a resizable horizontal paging scroll view.

New Adventures

I recently handed in my notice at my current place of work and I don’t plan on immediately looking for a full time role anywhere else!

The plan is that my wife will do a PGCE (teacher training) in secondary mathematics at the local university, and I will stay at home, look after the kids, and support her. I am both excited and terrified by this prospect. I’m very much looking forward to having more time to spend with the family and on my own work. But after six years in a 9-5 job writing code in C and Objective C, this feels like a massive change.

I’m not planning on giving up the programming work. Hopefully I will have more time to spend writing code and learning things that I’m interested in. This may well mean that I’m able to post on this blog more often.

Whilst I’m not going to be actively looking for work during this time, if there are any freelance (or even permanent) remote jobs I may well look into them if I am interested.

I’m hoping the new situation will leave me feeling more in control of my own life and am looking forward to having more time to spend on the things I love.

Google Polyline Encoding

The Google Polyline Encoding is used by Google to encode latitudes and longitudes so that later a line can be drawn on a map by decoding the values. You can see the algorithm for encoding a series of latitudes and longitudes to a polyline here.

I have written a C implementation to encode and decode Google Polylines, along with a test iPad app here.

What is this algorithm doing, and why?

The aim of the algorithm is to dramatically decrease the amount of space required to represent the coordinates of the line (especially when encoded as a string). This is done by noticing that the gap between successive points is small, so if you encode these differences you will need far fewer than 32 bits to encode each change in value.

Firstly, we need to work out the value that we are going to encode (we will do the latitude; encoding the longitude is identical in process). We take the current latitude, multiply it by 1e5, and round the value to the nearest integer. Integers have a simpler representation than floats, making them easier to manipulate.

From this value we subtract the previous integer latitude (which we calculated in the same way). This small difference between the latitudes is the value that we encode (if there is no previous latitude we choose it as 0).
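
In code, this step might look something like the following (a small sketch; valueToEncode, latitude and previousLat are names made up for illustration):

#include <math.h>
#include <stdint.h>

/* previousLat is the integer value calculated for the previous
   point (0 when there is no previous point). */
int32_t valueToEncode (double latitude, int32_t previousLat)
{
  int32_t current = (int32_t) lround (latitude * 1e5);
  return current - previousLat;  /* the small difference we encode */
}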

We now likely have a small value to encode, so we could do with a way of encoding the value that doesn’t require a minimum of 32 bits. We’ll go through the algorithm with the positive value 35, and the negative value -35.

The first problem is negative values. Computers generally encode integers using two’s complement arithmetic; if the value is negative then the top bit is set, and worse, small negative values have all of the high bits set. As the aim is to use as few bits as possible, we need to encode negative values differently. The binary representations of our values (showing the low 16 of the 32 bits) are:

35 = 00000000 00100011
-35 = 11111111 11011101

So instead of using two’s complement, the Google algorithm uses a sign bit (this means that a specific bit is used to determine the sign, e.g. if the sign bit is 1 the number is negative, if it’s 0 the number is positive; this bit is ignored by the rest of the number). The fourth and fifth steps of the algorithm change the encoding away from two’s complement if the value is negative.

Step four of the algorithm shifts the binary to the left, which makes room for the sign bit as the right most bit. We don’t lose the information from the bit shifted off the left of the number, because this information is saved in the new sign bit on the right.

Step five of the algorithm inverts the number if the original was negative. This inversion flips the value of the right most bit to a 1, indicating that we have a negative value. The inversion also flips all the bits on the left hand side, so small negative values get a lot shorter. After these two steps, 35 becomes 70 (1000110) and -35 becomes 69 (1000101).

Now we need to encode these binary values using as few base64 values as possible. To do this we need a way of knowing when we have come to the end of our number (otherwise when decoding we won’t know where one number ends and the next begins). There are two ways to do this:

You can start your number with a length field (a fixed number of bits at the beginning which tells you how many base64 values make up the number). To guarantee that we can encode the whole number we need up to five base64 values, which means that a length field would have to be three bits long.

The other option is to set a continuation bit; if this bit is set, the base64 value we are reading is not the last value in the number.

The continuation bit is a more efficient encoding if we use fewer than three base64 values (as the length field takes three bits), and less efficient if we use more than three. As we are hoping our change in values is going to be small (i.e. at most three base64 values), a continuation bit should be more efficient, so it is used.

Now we will split the value up to encode it in base64. 64 is $2^6$, so each base64 value holds six bits. We need to keep a bit for the continuation bit, so we split the number up into five bit sections:

70 = 00010 00110
69 = 00010 00101

Now the algorithm reverses the order of the chunks. The reason for this is that it is easier in code to read off the 5 bit chunks from the small end, and write the base64 values from the large end. Our values now look like this:

70 → 00110 00010
69 → 00101 00010

Next, the algorithm adds the continuation bit at the left hand side of every chunk that isn’t the last chunk of the number. To do this it bitwise ORs all but the last chunk with 0x20 (100000). This leaves our chunks like this:

70 → 100110 000010
69 → 100101 000010

Finally, we add 63 to each chunk, so that each chunk is a sensible ASCII value. Each chunk is stored in its own byte, which can immediately be displayed as an ASCII string. In our case, after adding 63 we have 101, 65 for 35 and 100, 65 for -35, i.e. 35 encodes to “eA” and -35 encodes to “dA”.

Here is the code to encode the integer difference between the current location and the previous location (after each have been multiplied by 1e5 and rounded):

polyline encoder
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

void encodedIntValue (int32_t val, char *result, unsigned *length)
{
  bool isNeg = val < 0;
  /* Shift the value left by 1 to make room for the sign bit on the right
     hand side. */
  val <<= 1;

  if (isNeg) {
    /* As the value is stored as a two's complement value, small negative
       values have a lot of bits set, so binary-not the value. This will also
       flip the value of the sign bit so when we come to decode the value we
       will know that it is negative. */
    val = ~val;
  }

  unsigned count = 0;

  do {
    /* Get the smallest 5 bits from our value and add them to the characters. */
    result[count] = val & 0x1f;

    /* We've saved the last 5 bits so we can remove them from the value. We
       shift the value by 5 meaning that the next 5 bits that we need to save
       will be at the end of the value. */
    val >>= 5;

    if (val) {
      /* We have more bits to encode, so we need to set the continuation bit. */
      result[count] |= 0x20;
    }

    result[count] += 63;

    ++count;
  } while (val);

  if (!length)
    printf ("required value `length` not set in `encodedIntValue`");
  else
    *length = count;
}
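
As a quick check of the walkthrough above (a small sketch; if the arithmetic is right, 35 and -35 should come out as eA and dA):

#include <stdio.h>

int main (void) {
  char buf[8];
  unsigned len;

  encodedIntValue (35, buf, &len);
  printf ("%.*s\n", (int) len, buf);   /* eA */

  encodedIntValue (-35, buf, &len);
  printf ("%.*s\n", (int) len, buf);   /* dA */

  return 0;
}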

I have a C library for encoding and decoding Google polylines together with an iPad test app here.

When running the test app in the simulator, make sure you go to the Debug menu and turn location services on (this turns itself off on rebuilds).

When you stop recording locations the polyline is printed to the console; if you copy it to here you can see where the simulator has taken you.