What is source code, anyway?

I’m preparing a talk on Open Source for a group of project managers, and my contact mentioned that being clear about what source code actually is would help.

So, here we go. I’m sharing this below for the benefit of mankind, or at least of my two readers.

Apart from that, I’m going to tell them how the ASF works, point them to a paper by the European Working Group on Free Software (mainly about licences, but there are lots of good things in there), and talk about the lightweight tools that we use to reach our goals.

Here’s the blurb about source code – WDYT?

What is “source code”?

Wikipedia says:

Source code (commonly just source or code) is any series of statements written in some human-readable computer
programming language.

and

A computer program’s source code is the collection of files that can be converted from human-readable form to
an equivalent computer-executable form. The source code is either converted into executable by a compiler for a
particular computer architecture, or executed from the human readable form with the aid of an interpreter.

An example in the C language

Here’s the source code of a very simple program, written in a programming language called “C”:

#include <stdio.h>

int main(void)
{
    /* Print the greeting followed by a newline. */
    printf("Hello, World!\n");
    return 0;
}

When compiled and executed, this program displays Hello, World! on screen.

Real programs are much more complex, of course, often consisting of thousands of such source code files,
totaling tens of thousands or even millions of lines of source code.

The compilation process creates a machine-readable binary version of the program, optimized for
quick execution on a given computer system.

Before Open Source, this binary version was the only one people
would get when they bought software: a static, frozen version of the software.

Here’s a binary listing of the compiled version of the above program (on Mac OS X, using the od -h command):

0000000     feed    face    0000    0012    0000    0000    0000    0002
0000020     0000    000a    0000    0500    0000    0085    0000    0001
0000040     0000    0038    5f5f    5041    4745    5a45    524f    0000
0000060     0000    0000    0000    0000    0000    1000    0000    0000
... about 400 more lines like this ...
0026360     6e64    5f64    796c    645f    7265    6d6f    7665    5f69
0026400     6d61    6765    5f68    6f6f    6b00    0000
0026414

(Note that “feed face” doesn’t mean “give food to your head”; these are hexadecimal values ;-)

As you can imagine, modifying the program from the source code (for example to change the message) is relatively easy,
but making the same change on the binary, compiled version of the program, although
theoretically possible, is much harder, and in most practical cases impossible.

Update: thanks to Ezra for pointing out my mistake. The binary listing was a dump of the source code instead of the compiled version; it is corrected now. Also, the three comments seem to indicate that you readers are indeed more than two. I stand corrected on both points.

5 Responses to What is source code, anyway?

  1. Matthias says:

    Hey, you have twice as many readers as I have!

    As far as source code goes, I think your explanation is pretty clear, short and to the point. Best not mention Assembler, though :-).

  2. Nice! We recently hired a PR girl and I had to explain these exact same basics to her. I’ll point her to this post :-)

  3. Ezra says:

    It looks as though you’ve listed the source code in hex, not the object code. Was that intentional?

  4. Glen Mazza says:

    Good luck with your presentation, Bertrand. (But unfortunately I have to warn you that your “feed face” joke is not very funny… ;)

    Glen

  5. Stephan Schmidt says:

    Source code is nothing but a very detailed, formalized specification.
