Archives ::

Sat, 07 May 2011

A quick update, now that I have a slightly more awesome website updater system (which could break at any moment!) - I wrote a little Lego challenge chooser thingy in JavaScript. Just set the difficulty you want, the themes and constraints you want, and what brick sets you can support with your collection, and then click "Choose your game" and the random chooser will choose your game for you. Simple!

Sun, 19 Dec 2010

I'll just leave this code here - my solution to a problem posed earlier on today.

#include <stdio.h>

/*
 * Calculate the square root of a number with the following restrictions:
 * - only use integers
 * - no division, just multiplication and addition
 * - no function calls, just do it all in main
 *   (printf is only there to show the results)
 */

#define TEST_RANGE_HIGH 1000

int main(int argc, char *argv[])
{
  int n;
  int sqrtOfNumber;
  int offset;
  int previousOffset;

  for (n = 1; n < TEST_RANGE_HIGH; n++)
  {
    sqrtOfNumber = 1;
    /* Keep stepping up until sqrtOfNumber^2 <= n < (sqrtOfNumber+1)^2 */
    while (1)
    {
      if (  (sqrtOfNumber * sqrtOfNumber <= n)
         && ((sqrtOfNumber+1) * (sqrtOfNumber+1) > n))
      {
        break;
      }

      /* Double the step until it overshoots, then take the last safe step */
      offset = 1;
      previousOffset = 1;
      while ((sqrtOfNumber + offset) * (sqrtOfNumber + offset) <= n)
      {
        previousOffset = offset;
        offset *= 2;
      }

      sqrtOfNumber += previousOffset;
    }

    printf("%d -> %d\n", n, sqrtOfNumber);
  }

  return 0;
}

I think this is a reasonable compromise between simple and efficient.
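
To check the doubling search against a few known values, here's a minimal sketch that lifts the same loop into a helper. The function name isqrt_doubling is my own invention, and wrapping the code in a function deliberately breaks the "no function calls" rule, so treat it as a test harness rather than a solution to the original problem. It assumes n is small enough that the squares don't overflow an int, which is true for the test range above.

```c
#include <assert.h>

/* Integer square root by repeated doubling search (hypothetical helper).
 * Uses only integer multiplication and addition, as in the post. */
int isqrt_doubling(int n)
{
  int root = 1;

  assert(n >= 1);  /* the original loop starts at n = 1 */

  /* Invariant: root * root <= n */
  while ((root + 1) * (root + 1) <= n)
  {
    int offset = 1;
    int previousOffset = 1;

    /* Double the step until it would overshoot, remembering the last safe step */
    while ((root + offset) * (root + offset) <= n)
    {
      previousOffset = offset;
      offset *= 2;
    }

    root += previousOffset;
  }

  return root;
}
```

Each pass of the outer loop restarts the step at 1, so the search closes in on the answer in a handful of doubling runs rather than counting up one at a time.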

Wed, 10 Nov 2010

I've been messing about with emacspeak on an old laptop, originally as part of a plot to do without screens for a week, but now just to be kind of geeky. Here's a breakdown of what I ended up doing:

  1. Remove the graphical login
    The OS installed is Ubuntu Desktop 10.04. To disable the graphical login, I simply renamed /etc/init/gdm.conf to /etc/init/gdm.disabled.
    sudo mv /etc/init/gdm.conf /etc/init/gdm.disabled
    Then I proceeded with a fairly standard install of emacspeak, which needed a little bit of low-level tweaking (that I've now tragically forgotten about) to make it work.
  2. Remap the CapsLk key to be a Ctrl key
    On this system, the magical incantations appear to be in /etc/default/console-setup. I changed the settings here to include:
    file /etc/default/console-setup:
    and then redid the setup with:
    sudo dpkg-reconfigure -phigh console-setup
  3. Auto-start emacspeak on boot-up
    This one was a little tricky, but after some reading about autologins on the console, I came up with a solution with minimal impact. Firstly, I wrote a little shim to patch into getty:
    New file emacspeakzoe.c:
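    #include <unistd.h>
    int main() {
      /* execlp takes argv[0] first, then the arguments; "zoe" is the user to become */
      execlp("su", "su", "zoe", "-c", "'emacspeak'", (char *)NULL);
      return 1; /* only reached if the exec fails */
    }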
    Compiled up with:
    cc -o emacspeakzoe emacspeakzoe.c
    and finally installed with:
    sudo cp emacspeakzoe /usr/local/sbin
    Then I modified upstart's /etc/init/tty1.conf to read:
    file /etc/init/tty1.conf:
    exec /sbin/getty -n -l /usr/local/sbin/emacspeakzoe 38400 tty1
    Took the machine down, brought it up again, and up it pops.

Now, all I need to do is make the time to really get stuck into using emacspeak for real...

Tue, 08 Jun 2010

My good friend Haegin posted an article about finding free space at the commandline. I was asked if it contained anything too zsh-specific; my response was of course to rewrite the whole thing. After a couple of iterations, here's what we ended up with:

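2>/dev/null df | awk '
# Rule bodies are a reconstruction (an assumption); the originals did not survive.
BEGIN { total=0; join=0; }
NF < 3 { join=1 }                        # long device name wrapped onto its own line
!join && NF >= 3 { total += $4 }         # normal line: Available is field 4
join && NF >= 3 { total += $3; join=0 }  # wrapped line: fields shift left by one
END {
  print int(total / 1024 / 1024) " MB"
}'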

The main changes are to accommodate the output format from df better and to move most of the processing into awk. This should make the whole thing a little more portable to other shells in the sh family.

Wed, 28 Apr 2010

We deal with two-dimensional coordinate systems and vectors all the time when we're working with computer systems. Documents are two-dimensional, using a physical representation of the page. The screen is two-dimensional, and we regular mouse users move the mouse in two dimensions. By and large, when we look at the horizontal movement, we count from the left across to the right - this is how the ruler on your favourite word-processor or graphics system is typically set up. It kind of makes sense given the prevalence of Western influence in mathematics and engineering; our writing runs left-to-right and so we write the numbers on our rulers this way. It's pretty difficult to escape from the left-to-right paradigm, even if you're left-handed. When it comes to the vertical, however, the rules are less fixed.

Sometimes the vertical axis is numbered from bottom to top; positive values above the origin, negative values below. I don't really know where this convention comes from, but it's easy to understand. When I want to say how far away something is from me, chances are I'll say that it's a certain distance in front of me. If I denote that on a piece of paper and then hold it up, I find that the further away something is, the higher up I place it on the paper.

Things that measure up the page:

  • Maps
  • x-y plots
  • Plans and elevations
  • Chessboards (from white side)

Sometimes, however, we measure from top to bottom. Imagine the process of writing on a piece of paper. You're probably imagining something that starts from the edge furthest from you and comes towards you. Why do we go this way? One reason is that if you start close and move away, there's more chance of smudging what you've already done as you reach over it to write the next line. (It's the same as the process of filling up the auditorium for a lecture, a play or a film. You ought to start in the least accessible places and work outwards, rather than perching on the end of the row and forcing everyone to go past you to get to their seats.) If you start counting lines down the page, why not measure distances that way as well?

Things that measure down the page:

  • Framebuffers
  • Text lines
  • CRT beams (most of the time)
  • Chessboards (from black side)

Different computer systems follow different conventions. For laying out graphics on a page, you often find the origin in the lower left and the Y-axis increasing upwards. For writing text, the Y-axis is more naturally oriented increasing downwards. I've just checked in my current install of Office 2007; Word measures distance down the page, and Visio measures it upwards.

Is this ever a problem? Well, it depends on whether you're likely to be referring to coordinate positions a lot. If you're writing code that manipulates graphics, you'll need to know which way the vertical axis is measured. Have you memorised which way up the vertical axis is in LaTeX picture environments? Matlab plots? HTML5 canvases? ImageMagick -draw commands?

The take-home message here is never to assume that you know what the vertical axis direction is. Always check - it may even be configurable, in which case you may have made an incorrect assumption about it. The message extends out to angles (which axis is zero? Is positive clockwise or anticlockwise? Are you measuring in degrees or radians?) and into three (and more) dimensions. It's best to make sure, and perhaps think now and again on where these discrepancies come from.
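
To make the "always check" advice concrete, here's a small sketch of the conversions you end up writing once you know which conventions are in play. The names flip_y, PI_VALUE and clockwise_degrees_to_anticlockwise_radians are made up for illustration, and the sketch assumes rows are counted from 0 in both conventions.

```c
/* Hypothetical constant; standard C does not define pi */
#define PI_VALUE 3.14159265358979323846

/* Convert a row number between "up the page" and "down the page"
 * conventions for a grid of `height` rows, both counted from 0.
 * The same formula works in either direction. */
int flip_y(int y, int height)
{
  return height - 1 - y;
}

/* Convert an angle given in degrees, positive clockwise, into
 * radians, positive anticlockwise, keeping the same zero axis. */
double clockwise_degrees_to_anticlockwise_radians(double degrees)
{
  return -degrees * PI_VALUE / 180.0;
}
```

Applying flip_y twice gets you back where you started, which makes a handy sanity check when you're not sure whether a conversion has already happened.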

Thu, 01 Apr 2010

Now and again I get asked a question by a student or an aspiring programmer. It's of the form, "I've been programming in <comfy language> and now I want to learn some C. What should I concentrate on?" So, here are some areas I think you need to look at when moving from a higher-level language to C, some of which will only be appropriate when coming from certain languages:

  • Pointers - some higher-level languages manage pointer dereferencing themselves. Java, for example, is syntactically rather quiet about pointers. In C, you have to declare things as pointers, remember that they're pointers and dereference them when you want access to the things they point to. Oh, and if you dereference a pointer that is accidentally not pointing at what you expected, then if you're lucky you'll get a segmentation violation, and if you're unlucky you'll get some weird unpredictable behaviour in your code. If you're not used to this, I recommend finding some specific exercises to improve your pointer-handling.
  • Memory management - a typical high-level language will provide a "new" operator, allocate some appropriate space in memory, and once the memory isn't being used it will eventually reclaim it with a garbage-collector. In C, none of this is the case. To allocate memory, you call the malloc library function. To signal that the memory is available for further allocation, you use the free library function. If you malloc memory and there's no way for that memory to become free, it will just stay allocated throughout the lifetime of the program - the so-called memory leak. There are tools that help, of course - valgrind tracks memory allocations and there are garbage collectors that you can add to your programs - but the fundamental issue is that any time you allocate memory, you need to assign responsibility for freeing it up again.
  • Inheritance/variant records - Most modern programming languages provide at least something like variant records, in which different record formats are combined into a single type. In more object-oriented languages, a class hierarchy can be used to a similar effect, but with variant behaviour thrown in as well. C has neither of these things. Its union facility allows a type to contain different record formats, but it is up to the programmer to keep track of which member's format is being used at any given moment. Inherited behaviour is possible with function pointers, often wrapped up in a set of macros.
  • Interfaces - Interfaces in high-level languages typically identify a set of method signatures that a class must implement. This mechanism allows the specification of an interface without also specifying an implementation, useful in a variety of inheritance idioms. C has no specific construct that replicates this, having no built-in mechanism for mapping a name to a different behaviour depending on the type of an object.
  • Arrays - Arrays are conceptually a fixed-length list of items of a certain type. Arrays are typically safe and easy to use in high-level languages, and there are often libraries of specialised containers for more advanced purposes. In C, arrays are given a fixed size, but there are no checks made on accesses to the array. Furthermore, there is a kind of interchangeability of pointers and arrays that is idiomatic in C but lends itself to abuse. The problems can be surmounted with macros, defensive programming or external checking (e.g. with valgrind).
  • Iteration - Many high-level languages have powerful iterators that step over a data structure to perform some action element by element. For arrays in C, you have two choices for your loop (apart from the style choice of choosing while or for): you can choose to advance a pointer across the array, or you can choose to advance an index over the array.
  • Type-safety - In many languages, there is a level of type-safety. This ensures that an identifier in a program is guaranteed to relate to a meaningful value of a given type. C is not strong on applying domain-level meanings to data; it operates very much in terms of numbers and pointers. It keeps no run-time type information and allows arbitrary type conversions, particularly from pointers to untyped pointers (void *) and back again.
  • Strings - In C, a string is just a pointer to a piece of memory that holds characters. The string begins at the memory location pointed to and continues through memory, ending when a zero byte is reached. This has some particular consequences - firstly, comparing two pointers to characters only shows whether they point to the same location, not whether they point to strings that contain the same characters. Secondly, any operation on a string needs to preserve the end-of-string zero marker to ensure that the result can be interpreted as a string. Finally, it is not possible for a string to contain a zero (NUL) byte. Some of these problems are mitigated through the use of standard library routines; it is also possible to create a structure that stores a string length alongside the string contents. Newcomers to C may also find it difficult to think about characters and strings as entirely different types.
  • Standard library - Modern programming languages come with rich and varied libraries to handle containers, databases, GUIs and a whole lot more. In contrast, the C standard library is pretty thin, and is focused on low-level resources like memory, files, strings and so on. However, to compensate for this slight omission, pretty much any utility library is available for use in a C program.
  • Exceptions - C does not have the structured exception mechanism of other languages. Instead, errors are sometimes signalled by a non-zero return code from a function, and sometimes through a mysterious globally-accessible "errno" variable. C programmers often forget to check for exceptional conditions, which leads to crash-prone code. Functions in C can only return a single value as the function result, which further encourages the omission or sidelining of error signalling.
  • Input/output - The C input/output system is heavily constrained. Most of the library routines work on fixed-length buffers. In other languages, there is a uniform model of streams or readers and writers. For C, there is one system for dealing with files, another for dealing with sockets, and then a whole series of ioctl calls for managing low-level details. There is still little built-in support for Unicode characters.
  • Initialisation - In most useful languages, data is initialised to some kind of empty representation before it can be used. In C, anything that has not been initialised will contain whatever leftover data happens to be in the corresponding memory location from some previous activity. Failing to initialise data in a C program is a common source of problems.
  • Polymorphism - The concept of polymorphism appears in many languages. This refers to different behaviours that are called depending on the type of any arguments, direct object or return value specified. Polymorphism is key to object-oriented programs. In C, there is no polymorphism - a particular name for a function represents a particular address of a piece of code to execute. Some forms of polymorphism can be achieved in C through the use of union types.
  • Lambdas - in some languages, it is possible to create an expression that denotes a behaviour as written inside that expression. In C, this kind of structure does not really exist. However, it is possible to write the behaviour out as a function and then create a pointer that points to that behaviour, and pass that pointer to another function to be called.
  • Packaging - C provides two levels of scope - file scope and block scope. The file scope runs from the point of declaration to the end of the file, and is used for "global" data. The block scope runs from the point of declaration to the end of the enclosing block. At any point in the program, a given name refers to a particular entity declared with that name, starting at the block in which the reference occurs and moving out to enclosing scopes until a scope is found in which the name is present. Since function definitions do not nest within other function definitions, there is no possibility of defining closures as found in languages like Scheme.
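
As a taste of the variant record, polymorphism and function pointer points above, here's a minimal sketch of a tagged union with a function pointer standing in for a method. All of the names (shape, shape_kind, make_circle, make_rect, shape_area) are invented for illustration; this is one common way to arrange things, not the only one.

```c
/* Tag that records which union member is currently valid */
enum shape_kind { SHAPE_CIRCLE, SHAPE_RECT };

struct shape
{
  enum shape_kind kind;  /* the programmer must keep this in step with the union */
  union
  {
    struct { double radius; } circle;
    struct { double width, height; } rect;
  } u;
  double (*area)(const struct shape *);  /* behaviour chosen per variant */
};

static double circle_area(const struct shape *s)
{
  return 3.14159265358979323846 * s->u.circle.radius * s->u.circle.radius;
}

static double rect_area(const struct shape *s)
{
  return s->u.rect.width * s->u.rect.height;
}

/* "Constructors" set the tag, the data and the behaviour together */
struct shape make_circle(double radius)
{
  struct shape s;
  s.kind = SHAPE_CIRCLE;
  s.u.circle.radius = radius;
  s.area = circle_area;
  return s;
}

struct shape make_rect(double width, double height)
{
  struct shape s;
  s.kind = SHAPE_RECT;
  s.u.rect.width = width;
  s.u.rect.height = height;
  s.area = rect_area;
  return s;
}

/* Dispatch through the function pointer; the caller never switches on kind */
double shape_area(const struct shape *s)
{
  return s->area(s);
}
```

Nothing stops you from setting kind to SHAPE_RECT and then reading u.circle, which is exactly the bookkeeping burden described in the variant records bullet.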

Maybe in some future posts I'll try to pull together some specific discussion of each of these topics in more depth.
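
In the meantime, here's a quick sketch covering two of the points from the list: the two styles of array iteration, and the difference between comparing string pointers and comparing string contents. The helper names are made up for illustration; strcmp is the standard library routine from string.h.

```c
#include <stddef.h>
#include <string.h>

/* Count characters by advancing an index until the terminating zero byte */
size_t length_by_index(const char *s)
{
  size_t i = 0;
  while (s[i] != '\0')
  {
    i++;
  }
  return i;
}

/* The same count, advancing a pointer instead of an index */
size_t length_by_pointer(const char *s)
{
  const char *p = s;
  while (*p != '\0')
  {
    p++;
  }
  return (size_t)(p - s);
}

/* Nonzero if both strings hold the same characters */
int same_contents(const char *a, const char *b)
{
  return strcmp(a, b) == 0;
}

/* Nonzero only if both pointers refer to the same memory location */
int same_pointer(const char *a, const char *b)
{
  return a == b;
}
```

Two separate arrays that both hold "hello" pass same_contents but fail same_pointer, which is the trap waiting for anyone who reaches for == out of habit.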

Thu, 21 Jan 2010

I maintain a collection of extensions to deco that allow a disk image to be unpacked as though it were an archive. Version 0.3 of the collection is now out, with contributions from Dirk Jadgmann for iso and Commodore 64 disk images.

Mon, 14 Dec 2009

Following on from the essays (now slightly updated), there's also a projects suggestion page. Go nuts!

Wed, 09 Dec 2009

I have prepared a list of essay topics that I would personally find interesting. If you have comments, find it useful, or (gasp) want me to read an essay on such topics, please do get in touch.

Sun, 11 Oct 2009

So I installed Ubuntu 9.04 on the old laptop, nice and fresh. One thing that I noticed - and this is probably an X thing - is that installing a font file in ~/.fonts makes it available immediately, no need to log out or reboot or any faffing. Now, if only the applications could watch the list of available fonts for changes and update accordingly, that would be even more spiffy.