A Short Introduction to Computers, Programming and Abstraction

2014.02.17 15:16:43

A while ago I was asked to write an introduction to programming, preferably about Java. However, writing yet another beginner's tutorial seems a bit boring and unchallenging to me, so I set out to do something different. Instead of a tutorial, I will try to formulate a few core concepts and ideas about computing that should apply everywhere.

This is a bit of a difficult endeavour, and since I am far from knowing every language or even how the machines themselves work in great detail, it might very well be that I make a few mistakes or present a few ideas that are not as general as I would like them to be. I apologize in advance for this, but would love it if someone with a greater understanding of the area could correct me.

At the very core, the tasks a computer fulfills are very simple in nature. A series of bits is transformed and moved according to a set of instructions in a processing unit. A good way to understand this is through a Turing machine: a device that runs along an infinitely long tape and has the ability to read from and write to this tape. By defining certain patterns that the machine can recognize, we can make it execute instructions and modify the tape in the way we want. By then writing down further instructions (and data, though the two are essentially the same) onto the tape, we can make it execute calculations, or in other words, transformations of data.
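To make this a bit more tangible, here is a minimal sketch of such a machine in Python. It is a simplification and not a faithful model: the instructions live in a separate `rules` table instead of on the tape itself, and the names `run_turing_machine`, `rules`, `start`, `carry` and `halt` are all my own inventions for the example. The rules shown add one to a binary number.

```python
from collections import defaultdict


def run_turing_machine(rules, tape, state="start", blank="_", max_steps=10_000):
    """Run a single-tape machine until it reaches the 'halt' state.

    rules maps (state, symbol) -> (symbol_to_write, move, next_state),
    where move is -1 (left), +1 (right) or 0 (stay in place).
    """
    # The tape is "infinite": any cell we have not touched reads as blank.
    cells = defaultdict(lambda: blank, enumerate(tape))
    head = 0
    steps = 0
    while state != "halt":
        if steps >= max_steps:
            raise RuntimeError("step limit reached without halting")
        write, move, state = rules[(state, cells[head])]
        cells[head] = write
        head += move
        steps += 1
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells[i] for i in range(lo, hi + 1)).strip(blank)


# Hypothetical rule table: walk to the right end, then carry leftwards.
increment = {
    ("start", "0"): ("0", +1, "start"),
    ("start", "1"): ("1", +1, "start"),
    ("start", "_"): ("_", -1, "carry"),
    ("carry", "1"): ("0", -1, "carry"),
    ("carry", "0"): ("1", 0, "halt"),
    ("carry", "_"): ("1", 0, "halt"),
}

print(run_turing_machine(increment, "1011"))  # prints 1100
```

Everything the machine "knows" is in the rule table; the tape only ever holds symbols, which is why data and instructions can be treated alike.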

This very general and primitive model allows for an immense number of possibilities. However, writing down every instruction by hand quickly becomes very repetitive. When a repetition or a certain pattern in the instructions occurs, the need for abstraction arises. The most common form of this abstraction is a block. We designate a certain area of instructions as a block and instead of repeating the block, we simply repeat an instruction to execute the block.

Another common repetition lies in the data we use for the instructions. In order to abstract this away, we introduce variables: a form of instruction that points to a certain region on the tape that contains the data we want to operate on. At this point at the latest, we have to notice that we need some way to segment and control the tape. We have to store a form of index that tells us how the tape is split and which regions do what. Especially when it comes to writing, we also have to remember either to extend a “writing region” if the value becomes too large or to move the value to a place that has enough tape available. Otherwise we would end up writing into other regions of the tape and thus destroying existing data (a buffer overflow, which on real systems may surface as a segfault). Now of course we may want to design a system that doesn't care and puts the burden onto the programmer to check that no more is written than fits, but it remains a problem nevertheless.
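As a rough illustration of such an index, the sketch below splits a fixed-size tape into named regions and refuses writes that would not fit. The class `Tape` and its methods are hypothetical names I made up for this example; real memory management is considerably more involved.

```python
class Tape:
    """A fixed-size tape split into named regions by an index."""

    def __init__(self, size):
        self.cells = [0] * size
        self.index = {}      # variable name -> (start, length)
        self.next_free = 0   # first cell not yet assigned to a region

    def allocate(self, name, length):
        """Reserve `length` cells for the variable `name`."""
        if self.next_free + length > len(self.cells):
            raise MemoryError("tape is full")
        self.index[name] = (self.next_free, length)
        self.next_free += length

    def write(self, name, values):
        start, length = self.index[name]
        if len(values) > length:
            # Without this check the write would spill into the next region.
            raise OverflowError(f"{len(values)} values do not fit into '{name}'")
        self.cells[start:start + len(values)] = values

    def read(self, name):
        start, length = self.index[name]
        return self.cells[start:start + length]


tape = Tape(8)
tape.allocate("a", 3)
tape.allocate("b", 2)
tape.write("a", [1, 2, 3])
tape.write("b", [9, 9])
print(tape.read("a"))  # prints [1, 2, 3]
```

Dropping the length check in `write` gives the "system that doesn't care": a too-long write to `a` would then silently overwrite `b`.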

As a side effect of blocks and variables, we have introduced a necessary instruction to our machine: jumping. This tells it to jump to a certain position on the tape and continue execution from there. Using this jump mechanism allows us to introduce the loop abstraction. This works very simply by jumping to the beginning of a block again and again, either indefinitely or until a certain instruction diverts execution away from the jump.
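A loop built from nothing but jumps might look like the following sketch. The instruction names (`set`, `add`, `dec`, `jump`, `jump_if_zero`) and the `run` function are assumptions of mine for the sake of the example, not the instruction set of any real machine. The numbers in the comments are the positions the jumps refer to.

```python
def run(program, memory):
    """Execute a list of instruction tuples and return the final memory."""
    pc = 0  # program counter: position of the next instruction
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "set":              # set name value
            memory[args[0]] = args[1]
        elif op == "add":            # add target source: target += source
            memory[args[0]] += memory[args[1]]
        elif op == "dec":            # dec name: name -= 1
            memory[args[0]] -= 1
        elif op == "jump":           # jump position
            pc = args[0]
        elif op == "jump_if_zero":   # jump_if_zero name position
            if memory[args[0]] == 0:
                pc = args[1]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


# Sum 5 + 4 + 3 + 2 + 1 with a loop made of two jumps.
program = [
    ("set", "total", 0),       # 0
    ("set", "n", 5),           # 1
    ("jump_if_zero", "n", 6),  # 2: leave the loop once n reaches zero
    ("add", "total", "n"),     # 3: the block that gets repeated
    ("dec", "n"),              # 4
    ("jump", 2),               # 5: back to the beginning of the loop
]

print(run(program, {}))  # prints {'total': 15, 'n': 0}
```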

Having blocks, loops and variables is already pretty good. But we need to go further. We can extend the block abstraction by allowing unbound (free; not pointing to any position) variables within the block. These variables will only be bound once the block is executed. With this, we have created functions. Often, this is where the language abstraction stops. The language offers us these constructs to eliminate repetition within our instructions. Advanced languages will also do a lot of background work to optimize the instructions further, protect the user from errors and faults within the program and try to detect mistakes ahead of time.
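One way to picture the binding step is the sketch below, where a "function" is just a pair of parameter names and a body expression with free variables. The representation as nested tuples and the names `evaluate` and `call` are assumptions I chose for illustration.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(expr, env):
    """Evaluate a nested-tuple expression using the bindings in env."""
    if isinstance(expr, str):            # a variable: look up its binding
        return env[expr]
    if isinstance(expr, (int, float)):   # a literal number
        return expr
    op, left, right = expr               # an operation on two sub-expressions
    return OPS[op](evaluate(left, env), evaluate(right, env))


def call(function, *args):
    """Bind the free variables of a block, then execute it."""
    params, body = function
    if len(args) != len(params):
        raise TypeError(f"expected {len(params)} arguments, got {len(args)}")
    env = dict(zip(params, args))  # the bindings exist only for this execution
    return evaluate(body, env)


# The block x * x + y with the free variables x and y.
square_plus = (("x", "y"), ("+", ("*", "x", "x"), "y"))

print(call(square_plus, 3, 4))  # prints 13
```

Before `call` runs, `x` and `y` point nowhere; the same block can be executed again with entirely different bindings.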

Do note that by introducing a function, we can introduce a new form of instruction to our device. At this very general level there is no need to differentiate between “processor instructions”, “language instructions” and instructions made with our language. They all can theoretically look syntactically identical in their use.

But we needn't stop with abstraction at this point. There is still one kind of pattern that remains to be abstracted: the syntax, if you will. This is a pattern in the structure of our instructions, in the way they are organized, without necessarily concentrating on what they individually are. An example of this form of abstraction is the loop from before. Creating a block and jumping back to its beginning according to a predicate is a certain structure that can be abstracted into a loop construct. This kind of abstraction necessitates a uniform syntax: there is no real distinction between “core instructions” and “user instructions”, no distinction between language syntax and the program itself. This lack of distinction makes it possible to abstract syntax safely and reliably, something that cannot be done in most languages.
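Python is not one of the languages that offer this, but the idea can still be sketched by treating a program as plain data. In the example below, a made-up `while` form is rewritten into labels and jumps before anything is executed. The instruction names and the function `expand_program` are hypothetical and exist only for this illustration.

```python
import itertools


def expand_program(forms):
    """Rewrite ("while", test, *body) forms into labels and jumps."""
    counter = itertools.count()  # fresh label numbers for every program

    def expand(form):
        if form[0] != "while":
            return [form]  # an ordinary instruction: keep it as it is
        _, test, *body = form
        n = next(counter)
        start, end = f"start{n}", f"end{n}"
        # The body may itself contain loops, so expand it recursively.
        inner = [instr for sub in body for instr in expand(sub)]
        return [
            ("label", start),
            ("jump_if_zero", test, end),
            *inner,
            ("jump", start),
            ("label", end),
        ]

    return [instr for form in forms for instr in expand(form)]


program = [
    ("set", "total", 0),
    ("set", "n", 3),
    ("while", "n", ("add", "total", "n"), ("dec", "n")),
]

for instruction in expand_program(program):
    print(instruction)
```

Because the `while` form has the same shape as every other instruction, the expander can handle it without any special parsing, which is the point of a uniform syntax.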

Everything up to this point has been an abstraction on the language design level. You can use these constructs if they are offered by the language. Any language merely offers a special type of data, one that can be read and compiled or interpreted into another form of instructions. The way this transformation works decides how efficient the language is in terms of both computation and writing, and it also decides which tasks it is usable for in the first place. This reinforces the fact that there is no real distinction between data and instructions. This distinction is only made in order to think properly about the programs we write, since their purpose is to transform a certain kind of input into a certain kind of output. Whether this happens to be instructions or not is irrelevant.

Now that we have covered language abstractions, let's take a look at abstractions within a program. These abstractions deal with the way the data is treated. Functions should be designed to take a limited set of inputs and perform a clearly defined modification on the input, resulting in a limited set of outputs. This idea of keeping functions small and fixed on a single purpose is a good one, as it means they remain easy to understand and use. The fewer connections a function has to its environment, the easier it becomes to comprehend and debug, as the user only has to keep a limited set of information in their head. The benefit of having many small functions that each perform an easily understandable action propagates upwards: other functions that build on them are in turn easier to understand as well.
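As a small example of what I mean, each function below does exactly one thing and depends only on its arguments, and the last one merely combines them. The task and all function names are invented for this sketch.

```python
def parse_numbers(text):
    """Turn a comma-separated string into a list of floats."""
    return [float(part) for part in text.split(",") if part.strip()]


def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not values:
        raise ValueError("cannot take the mean of an empty list")
    return sum(values) / len(values)


def format_result(value):
    """Format a number with two decimal places."""
    return f"{value:.2f}"


def mean_report(text):
    """Built purely out of the three small functions above."""
    return format_result(mean(parse_numbers(text)))


print(mean_report("1, 2, 3, 6"))  # prints 3.00
```

Each piece can be read, tested and replaced on its own, and `mean_report` is easy to follow precisely because the pieces are.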

The real issue, however, lies in the composition of the functions, the way everything is orchestrated into different layers of abstraction. Pulling this off well is an extremely difficult task and it is all too easy to forget about corner cases or rely on methods that lead to unexpected side effects. Being able to use the tools your language provides to their full extent to create code that is understandable, reusable and efficient takes a lot of experience, knowledge and dedication.

What I can say with certainty is that so far every language I've learned has taught me a lot over the course of using it, and I'm certain that no matter how long you work with one, there will be new and useful things to discover when switching to a different one. This is mostly due to different languages offering differing views and environments for your problems, which leads to new insights on how to solve them, but also due to the fact that most languages are designed with a specific purpose or methodology in mind, which means that the people using them will bring a certain mindset. Accumulating the solutions these environments offer broadens your understanding of techniques and where they are applicable.

I don't think I can offer much more advice on so broad an area, as everything else would delve into advice that is language specific. Hopefully this generalized explanation was still useful, or at the very least interesting. If I should explain a certain part in more detail or if something else intrigues you, do let me know and I will see whether I can write about that as well.

Written by shinmera