Sun, 19 Dec 2010

I'll just leave this code here - my solution to a problem posed earlier on today.

#include <stdio.h>

/*
 * Calculate the square root of a number with the following restrictions:
 * - only use integers
 * - no division, just multiplication and addition
 * - no function calls, just do it all in main
 */

#define TEST_RANGE_HIGH 1000

int main(int argc, char *argv[])
{
  int n;
  int sqrtOfNumber;
  int offset;
  int previousOffset;

  for (n = 1; n < TEST_RANGE_HIGH; n++)
  {
    sqrtOfNumber = 1;
    while (1)
    {
      /* Done once sqrtOfNumber^2 <= n < (sqrtOfNumber+1)^2 */
      if (  (sqrtOfNumber * sqrtOfNumber <= n)
         && ((sqrtOfNumber+1) * (sqrtOfNumber+1) > n)
         )
      {
        break;
      }
      /* Double the offset until it overshoots, then keep the last good one */
      offset = 1;
      previousOffset = 1;
      while ((sqrtOfNumber + offset) * (sqrtOfNumber + offset) <= n)
      {
        previousOffset = offset;
        offset *= 2;
      }

      sqrtOfNumber += previousOffset;
    }

    printf("%d -> %d\n", n, sqrtOfNumber);
  }

  return 0;
}


I think this is a reasonable compromise between simple and efficient.
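
For anyone wanting to check the results, here is a minimal sketch of the same doubling-offset search wrapped up as a function. That breaks the "no function calls" rule, so treat it purely as a test harness; the name isqrt_doubling is made up for illustration, and it assumes n is small enough that the squares don't overflow an int (comfortably true for the range tested above).

```c
#include <assert.h>

/* Hypothetical helper name; the search is the same doubling-offset idea
 * as in the program above, using only multiplication and addition. */
int isqrt_doubling(int n)
{
  int root = 1;
  int offset;
  int previousOffset;

  assert(n >= 1);  /* the search starts from 1, so n must be at least 1 */

  /* Keep going until root*root <= n < (root+1)*(root+1). */
  while ((root + 1) * (root + 1) <= n)
  {
    /* Double the step for as long as it still lands at or below the root. */
    offset = 1;
    previousOffset = 1;
    while ((root + offset) * (root + offset) <= n)
    {
      previousOffset = offset;
      offset *= 2;
    }
    root += previousOffset;
  }
  return root;
}
```

The property worth asserting for every n is that the returned root r satisfies r*r <= n and (r+1)*(r+1) > n.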

Wed, 10 Nov 2010

I've been messing about with emacspeak on an old laptop, originally as part of a plot to do without screens for a week, but now just to be kind of geeky. Here's a breakdown of what I ended up doing:

  1. Remove the graphical login
    The OS installed is Ubuntu Desktop 10.04. To disable the graphical login, I simply renamed /etc/init/gdm.conf to /etc/init/gdm.disabled.
    sudo mv /etc/init/gdm.conf /etc/init/gdm.disabled
    Then I proceeded with a fairly standard install of emacspeak, which needed a little bit of low-level tweaking (that I've now tragically forgotten about) to make it work.
  2. Remap the CapsLk key to be a Ctrl key
    On this system, the magical incantations appear to be in /etc/default/console-setup. I changed the settings here to include:
    file /etc/default/console-setup:
    and then redid the setup with:
    sudo dpkg-reconfigure -phigh console-setup
  3. Auto-start emacspeak on boot-up
    This one was a little tricky, but after some reading about autologins on the console, I came up with a solution with minimal impact. Firstly, I wrote a little shim to patch into getty:
    New file emacspeakzoe.c:
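    #include <unistd.h>
    int main() {
      /* The first argument after the file name is argv[0]; the user comes next */
      execlp("su", "su", "zoe", "-c", "emacspeak", (char *)NULL);
      return 1; /* only reached if execlp fails */
    }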
    Compiled up with:
    cc -o emacspeakzoe emacspeakzoe.c
    and finally installed with:
    sudo cp emacspeakzoe /usr/local/sbin
    Then I modified upstart's /etc/init/tty1.conf to read:
    file /etc/init/tty1.conf:
    exec /sbin/getty -n -l /usr/local/sbin/emacspeakzoe 38400 tty1
    Took the machine down, brought it up again, and up it pops.

Now, all I need to do is make the time to really get stuck into using emacspeak for real...

Tue, 08 Jun 2010

My good friend Haegin posted an article about finding free space at the commandline. I was asked if it contained anything too zsh-specific; my response was of course to rewrite the whole thing. After a couple of iterations, here's what we ended up with:

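2>/dev/null df | awk '
BEGIN { total=0; join=0; }
# A long device name pushes the figures onto the following line
NF < 3 { join=1; next }
# Normal line: available space is field 4 (assumed to be in 1K blocks)
!join && NF >= 3 { total += $4 * 1024 }
# Continuation line: the figures are shifted one field to the left
join && NF >= 3 { total += $3 * 1024; join=0 }
END {
  print int(total / 1024 / 1024) " MB"
}'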

The main changes are to accommodate the output format from df better and to move most of the processing into awk. This should make the whole thing a little more portable to other shells in the sh family.

Wed, 28 Apr 2010

We deal with two-dimensional coordinate systems and vectors all the time when we're working with computer systems. Documents are two-dimensional, using a physical representation of the page. The screen is two-dimensional, and us regular mouse-users move the mouse in two dimensions. By and large, when we look at the horizontal movement, we count from the left across to the right - this is how the ruler on your favourite word-processor or graphics system is typically set up. It kind of makes sense given the prevalence of Western influence in mathematics and engineering; our writing is written left-to-right and so we write the numbers on our rulers this way. It's pretty difficult to escape from the left-to-right paradigm, even if you're left-handed. When it comes to the vertical, however, the rules are less fixed.

Sometimes the vertical axis is numbered from bottom to top; positive values above the origin, negative values below. I don't really know where this convention comes from, but it's easy to understand. When I want to say how far away something is from me, chances are I'll say that it's a certain distance in front of me. If I denote that on a piece of paper and then hold it up, I find that the further away something is, the higher up I place it on the paper.

Things that measure up the page:

  • Maps
  • x-y plots
  • Plans and elevations
  • Chessboards (from white side)

Sometimes, however, we measure from top to bottom. Imagine the process of writing on a piece of paper. You're probably imagining something that starts from the edge furthest from you and comes towards you. Why do we go this way? One reason is that if you start close and move away, there's more chance of smudging what you've already done as you reach over it to write the next line. (It's the same as the process of filling up the auditorium for a lecture, a play or a film. You ought to start in the least accessible places and work outwards, rather than perching on the end of the row and forcing everyone to go past you to get to their seats.) If you start counting lines down the page, why not measure distances that way as well?

Things that measure down the page:

  • Framebuffers
  • Text lines
  • CRT beams (most of the time)
  • Chessboards (from black side)

Different computer systems follow different conventions. For laying out graphics on a page, you often find the origin in the lower left and the Y-axis increasing upwards. For writing text, the Y-axis is more naturally oriented increasing downwards. I've just checked in my current install of Office 2007; Word measures distance down the page, and Visio measures it upwards.

Is this ever a problem? Well, it depends if you're likely to be referring to coordinate positions a lot. If you're writing code that manipulates graphics, you'll need to know which way the vertical axis is measured. Have you memorised which way up the vertical axis is in LaTeX picture environments? Matlab plots? HTML 5 canvases? ImageMagick -draw commands?

The take-home message here is never to assume that you know what the vertical axis direction is. Always check - it may even be configurable, in which case you may have made an incorrect assumption about it. The message extends out to angles (which axis is zero? Is positive clockwise or anticlockwise? Are you measuring in degrees or radians?) and into three (and more) dimensions. It's best to make sure, and perhaps think now and again on where these discrepancies come from.
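
As a rough illustration of how small the fix is once you know which convention you're facing, here is a minimal sketch for integer pixel rows. The name flip_y is made up, and it assumes rows are numbered 0 to height-1; for continuous coordinates the equivalent would be height - y.

```c
#include <assert.h>

/* Hypothetical helper: convert a row index between a top-down convention
 * (row 0 at the top, as in a framebuffer) and a bottom-up one (row 0 at
 * the bottom, as in an x-y plot). The same call converts either way. */
int flip_y(int y, int height)
{
  assert(height > 0);
  assert(y >= 0 && y < height);
  return height - 1 - y;
}
```

Applying it twice gets you back where you started, which makes for a handy sanity check when you're not sure whether a coordinate has already been converted.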

Thu, 01 Apr 2010

Now and again I get asked a question by a student or an aspiring programmer. It's of the form, "I've been programming in <comfy language> and now I want to learn some C. What should I concentrate on?" So, here are some areas I think you need to look at when moving from a higher-level language to C, some of which will only be appropriate when coming from certain languages:

  • Pointers - some higher-level languages manage pointer dereferencing themselves. Java, for example, is syntactically rather quiet about pointers. In C, you have to declare things as pointers, remember that they're pointers and dereference them when you want access to the things they point to. Oh, and if you dereference a pointer that is accidentally not pointing at what you expected, then if you're lucky you'll get a segmentation violation, and if you're unlucky you'll get some weird unpredictable behaviour in your code. If you're not used to this, I recommend finding some specific exercises to improve your pointer-handling.
  • Memory management - a typical high-level language will provide a "new" operator, allocate some appropriate space in memory, and once the memory isn't being used it will eventually reclaim it with a garbage-collector. In C, none of this is the case. To allocate memory, you call the malloc library function. To signal that the memory is available for further allocation, you use the free library function. If you malloc memory and there's no way for that memory to become free, it will just stay allocated throughout the lifetime of the program - the so-called memory leak. There are tools that help, of course - valgrind tracks memory allocations and there are garbage collectors that you can add to your programs - but the fundamental issue is that any time you allocate memory, you need to assign responsibility for freeing it up again.
  • Inheritance/variant records - Most modern programming languages provide at least something like variant records, in which different record formats are combined into a single type. In more object-oriented languages, a class hierarchy can be used to a similar effect, but with variant behaviour thrown in as well. C has neither of these things. Its union facility allows a type to contain different record formats, but it is up to the programmer to keep track of which member's format is being used at any given moment. Inherited behaviour is possible with function pointers, often wrapped up in a set of macros.
  • Interfaces - Interfaces in high-level languages typically identify a set of method signatures that a class must implement. This mechanism allows the specification of an interface without also specifying an implementation, useful in a variety of inheritance idioms. C has no specific construct that replicates this, having no built-in mechanism for mapping a name to a different behaviour depending on the type of an object.
  • Arrays - Arrays are conceptually a fixed-length list of items of a certain type. Arrays are typically safe and easy to use in high-level languages, and there are often libraries of specialised containers for more advanced purposes. In C, arrays are given a fixed size, but there are no checks made on accesses to the array. Furthermore, there is a kind of interchangeability of pointers and arrays that is idiomatic in C but lends itself to abuse. The problems can be surmounted with macros, defensive programming or external checking (e.g. with valgrind).
  • Iteration - Many high-level languages have powerful iterators that step over a data structure to perform some action element by element. For arrays in C, you have two choices for your loop (apart from the style choice of choosing while or for): you can choose to advance a pointer across the array, or you can choose to advance an index over the array.
  • Type-safety - In many languages, there is a level of type-safety. This ensures that an identifier in a program is guaranteed to relate to a meaningful value of a given type. C is not strong on applying domain-level meanings to data; it operates very much in terms of numbers and pointers. It keeps no run-time type information and allows arbitrary type conversions, particularly from pointers to untyped pointers (void *) and back again.
  • Strings - In C, a string is just a pointer to a piece of memory, which contains a character. The string begins at the memory location pointed to, and increments through the memory space, ending when a zero byte is reached. This has some particular consequences - firstly, comparison of pointers to characters will just show whether they point to the same string, not necessarily whether they point to strings that contain the same characters. Secondly, any operation on a string needs to preserve the end-of-string zero marker to ensure that the result can be interpreted as a string. Finally, it is not possible for a string to contain a zero (NUL) byte. Some of these problems are mitigated through the use of standard library routines; it is also possible to create a structure that stores a string length alongside the string contents. Newcomers to C may also find it difficult to think about characters and strings as entirely different types.
  • Standard library - Modern programming languages come with rich and varied libraries to handle containers, databases, GUIs and a whole lot more. In contrast, the C standard library is pretty thin, and is focused on low-level resources like memory, files, strings and so on. However, to compensate for this slight omission, pretty much any utility library is available for use in a C program.
  • Exceptions - C does not have the structured exception mechanism of other languages. Instead, errors are sometimes signalled by a non-zero return code from a function, and sometimes through a mysterious globally-accessible "errno" variable. C programmers often forget to check for exceptional conditions, which leads to crash-prone code. Functions in C can only return a single value as the function result, which further encourages the omission or sidelining of error signalling.
  • Input/output - The C input/output system is heavily constrained. Most of the library routines work on fixed-length buffers. In other languages, there is a uniform model of streams or readers and writers. For C, there is one system for dealing with files, a system for dealing with sockets, and then a whole series of ioctl calls for managing low-level details. There is still little built-in support for Unicode characters.
  • Initialisation - In most useful languages, data is initialised to some kind of empty representation before it can be used. In C, local variables and freshly malloc'd memory that have not been initialised will contain whatever leftover data happens to be in the corresponding memory location from some previous activity (only static and global data starts out zeroed). Failing to initialise data in a C program is a common source of problems.
  • Polymorphism - The concept of polymorphism appears in many languages. This refers to different behaviours that are called depending on the type of any arguments, direct object or return value specified. Polymorphism is key to object-oriented programs. In C, there is no polymorphism - a particular name for a function represents a particular address of a piece of code to execute. Some forms of polymorphism can be achieved in C through the use of union types.
  • Lambdas - in some languages, it is possible to create an expression that denotes a behaviour as written inside that expression. In C, this kind of structure does not really exist. However, it is possible to write the behaviour out as a function and then create a pointer that points to that behaviour, and pass that pointer to another function to be called.
  • Packaging - C provides two levels of scope - file scope and block scope. The file scope runs from the point of declaration to the end of the file, and is used for "global" data. The block scope runs from the point of declaration to the end of the enclosing block. At any point in the program, a given name refers to a particular entity declared with that name, starting at the block in which the reference occurs and moving out to enclosing scopes until a scope is found in which the name is present. Since function definitions do not nest within other function definitions, there is no possibility of defining closures as found in languages like Scheme.
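
A minimal sketch of the memory-management and string points from the list, assuming nothing beyond the standard library. The helper names copy_string and same_text are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a freshly malloc'd copy of s, or NULL on
 * failure. The caller takes on the responsibility of calling free(). */
char *copy_string(const char *s)
{
  size_t length;
  char *copy;

  assert(s != NULL);
  length = strlen(s);           /* characters up to, not including, the zero byte */
  copy = malloc(length + 1);    /* +1 leaves room for the end-of-string zero */
  if (copy == NULL)
  {
    return NULL;                /* no exceptions: failure is reported in the result */
  }
  memcpy(copy, s, length + 1);  /* copies the terminating zero as well */
  return copy;
}

/* Returns 1 if both strings hold the same characters, 0 otherwise.
 * Comparing a == b would only say whether they are the same memory. */
int same_text(const char *a, const char *b)
{
  return strcmp(a, b) == 0;
}
```

A copy made this way compares unequal to the original as a pointer but equal as text, and it leaks unless somebody eventually passes it to free.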

Maybe in some future posts I'll try to pull together some specific discussion of each of these topics in more depth.
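
In the meantime, here is a rough sketch of the variant-record, polymorphism and lambda points: a union with a tag to record which member is in use, and a function pointer standing in for a behaviour passed as an argument. All of the names (shape, shape_area, total_measure and so on) are made up for illustration, and this is one common way of doing it rather than the only one.

```c
#include <assert.h>

enum shape_kind { SHAPE_SQUARE, SHAPE_RECTANGLE };

struct shape
{
  enum shape_kind kind;  /* the tag: records which union member is in use */
  union
  {
    struct { int side; } square;
    struct { int width; int height; } rectangle;
  } as;
};

/* The programmer, not the compiler, keeps the tag and the union in step. */
int shape_area(const struct shape *s)
{
  switch (s->kind)
  {
    case SHAPE_SQUARE:
      return s->as.square.side * s->as.square.side;
    case SHAPE_RECTANGLE:
      return s->as.rectangle.width * s->as.rectangle.height;
  }
  assert(0 && "unknown shape kind");
  return 0;
}

/* A function pointer type standing in for a lambda or a method. */
typedef int (*measure_fn)(const struct shape *);

/* Applies whichever behaviour it is handed to each shape in turn. */
int total_measure(const struct shape *shapes, int count, measure_fn measure)
{
  int total = 0;
  int i;

  for (i = 0; i < count; i++)
  {
    total += measure(&shapes[i]);
  }
  return total;
}
```

Calling total_measure(shapes, count, shape_area) sums the areas; handing it a different function with the same signature changes the behaviour without touching the loop.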

Fri, 19 Mar 2010

Well, I finally managed to achieve another of the Day Zero goals. I have been up and about by 06:30 5 days in a row, and it's been a really productive week! Not a lot to show for it, as it's mainly been the moving into a different house, but still. I may try to make 06:30 the standard time to be up at. Maybe.

Sun, 14 Feb 2010

I noticed, in recent ponderings, that there are a few shows where I've seen all of the first season, and then none of the rest. Here's the lowdown on what, and perhaps why.

Dawson's Creek
I was watching this with housemates at the time. It was pretty slow, and full of characters that I didn't really care for. I related a little to Joey - poorer family, harder times, travelling long distances to have a social life - but even so, the characters seemed like TV drama idiots with little depth.
Survivors (modern)
The thing with British dramas is that we tend to go for short runs full of plot. This can be a blessing, especially if it turns out to be dire (Bonekickers, I hear, falls neatly into this category). When I first heard about Survivors, I thought a great many things, such as "please be better than Last Train" and "this has potential". What I thought after I'd seen the first season is "these characters could use more character and less useless" and "this season could have done with more plot". I noticed recently that a second season is airing, about halfway through at the moment. I feel no urge to catch up. I am reliably informed that the original series is more compelling, so I may have to go and seek this out instead.
Roswell High
A great premise here, the potential to really examine the intricacies of supposed family bonds, overcoming differences in relationships, and the effect of weird, otherworldliness on the mundane. What actually happened was that the aliens' powers hardly ever mattered, the on-again off-again relationships were tiresome and the mysterious destiny of the characters was revealed, as usual, at a snail's pace. I caught a couple of the later episodes where the mysterious destiny was revealed, and I'm not sorry that I failed to wade through the rest of it to get there.
Lost
OK, confession, I didn't really watch the whole of the first season. I missed a couple of bits out. As far as I can tell, though, I didn't miss anything important. This is probably because, in Lost, very little actually happens, and there's too little consistent view of the characters for me to find anyone to really relate to, understand or sympathise with. I suspect this is one of those shows that's better to talk about than to actually watch.
Earth: Final Conflict
This one had so much potential, and of all of these series, is the one where I'm still a little intrigued to see beyond the first season. There was ample commentary on the impact of the Companions' arrival on society, on political wranglings and secrets, and on the effect of knowing just enough to be dangerous. While there's no one character I related to here, I felt that I understood them all (one of the many things that made Babylon 5 great) and that none of them were being excessively stupid to support a lazy plot. Probably it was just scheduling that made it difficult to see the later material.
Heroes
Oh boy, here's a big can of worms. The gradual reveal strikes again. There was an obvious shadowy organisation, which was pretty comical as such organisations go, and a hint of another shadowy organisation that its mysterious leader was working against. There was a time-travel plot and a precognition plot, both of which seemed to be deliberately spread out so that it's difficult to see if there's any inconsistency. The characters were pretty much universally idiots, too. When one gets awesome powers, one needs to practise with them. The only character who I felt a connection with, who truly seemed to be at the mercy of her "powers", was Niki - and even then, she did some stupid things regarding finding out about her past. Even the motif of the eclipse juxtaposing with the emergence of accelerated mutations felt weak. At least in the X-Men films they tried to include wide-scale social commentary from the start, exploring a little of the public reaction. Maybe there was just too much going on in Heroes for any one redeeming part of the plot to shine through enough to intrigue me.
The West Wing
So, on to another one that I might catch up with yet. See, I really enjoyed Sports Night, another Aaron Sorkin show with a similar structure. The same pace of dialogue showed up in West Wing, which was pretty cool. I ended up borrowing a box-set to catch up with the whole season. What tipped it for me, though, is that once I'd got to the end and reached the cliffhanger... I realised I didn't really care who did it or what the consequence was. I'd been watching more for the writing style than the plot or the characters. Admittedly, they did cover some interesting issues in interesting ways, and that's pretty powerful stuff, and maybe that's what will eventually draw me back into it.

Coming soon, later or perhaps never: shows I watched all the way through despite myself, shows that could be redone far better, shows where I've only managed to catch a handful of episodes and perhaps some shows that just need that extra season to round them out.

Thu, 21 Jan 2010

I maintain a collection of extensions to deco that allow a disk image to be unpacked as though it were an archive. Version 0.3 of the collection is now out, with contributions from Dirk Jadgmann for iso and Commodore 64 disk images.