The CompileMe page.

The purpose of this page is to help you understand the basics of configuring, compiling and installing a generic *nix project on our new Linux box.

*BSD has a special mechanism that is going to be explained in the BsdMe page.

For a better understanding of this page, you should first read:
SpareMe
DistroMe
LicenceMe
TipMe
SourceMe

If impatient, go directly to the Overview paragraph in this page.

Even though portability is a common concern in most *nix OSes and projects, that doesn't mean that this task is an easy one.

Paragraphs in this page are:

Difficulties
Usual steps
Overview
In depth view of the compilation procedure
Making a Slackware package from a project source





Difficulties

1) Processor families have very different instruction sets.

2) Endianness is a serious issue, because it determines how data is stored in memory.

Intel-compatible processors (including AMD and Cyrix) are Little Endian, while processors such as SPARC or the Motorola 68k family are Big Endian.

Endianness describes the order in which the bytes of a multi-byte element, such as a word (max register width data), are stored.
The byte order applies to any memory storage, from hardware registers to RAM.

Big Endian: the most significant byte (the one holding the higher powers of 2) is stored at the lowest memory address.
Little Endian: the least significant byte (the one holding the lower powers of 2) is stored at the lowest memory address.
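
The byte order of the machine you are sitting at can be checked from the shell. This is only a small sketch, assuming the standard printf, od and tr utilities are available; the variable names are made up for the example. It writes two known bytes and asks od to read them back as a single 16-bit number in the machine's native order.

```shell
# Write the two bytes 0x01 0x02 and let od read them back as one
# 16-bit number in the host's native byte order.
word=$(printf '\001\002' | od -An -tx2 | tr -d ' \n')

case "$word" in
    0201) byte_order="little endian" ;;  # first byte in memory is the low-order byte
    0102) byte_order="big endian" ;;     # first byte in memory is the high-order byte
    *)    byte_order="unknown" ;;
esac
echo "This machine is $byte_order"
```

On an Intel-compatible box the first branch should be the one that matches.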

3) Word width is also a concern when defining constants and variables.

Another difficulty is the number of bits in a word (max register width data).
This varies a lot between 16-bit, 32-bit and 64-bit processors.
Combine this with Big/Little Endianness and you can imagine the overall difficulty.
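
A quick way to see the word width your own system uses, as a sketch that assumes the getconf and uname utilities are installed (they are on most GNU/Linux systems). LONG_BIT is the number of bits in a C "long", which on Linux normally follows the word width of the platform.

```shell
# getconf reports configuration values of the system's C library;
# LONG_BIT is the number of bits in a C "long".
long_bits=$(getconf LONG_BIT)
machine=$(uname -m)
echo "$machine: a C long is $long_bits bits wide"
```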

4) A hardware platform is not only the processor, but also controllers, memory, bus, peripherals etc.

5) Different Operating Systems have more or less different Interrupts.

Q: What's an interrupt?
A: Don't confuse this with the interrupt request. An interrupt, in this sense, is a BIOS or OS service call that instructs the processor to leave whatever it is currently doing, perform a specific task and then return to the previous job.

   An interrupt request (IRQ) is a numbered alert to the processor that a serious event has happened on a peripheral (for example, the sound card has finished with its audio data and requests more), thus requiring immediate attention.
   IRQs are one of the constant pains when adding a new device to the computer. There are simply never enough! Now, with APIC (Advanced Programmable Interrupt Controller) things tend to be easier.

Q: Now it seems that portability is impossible.

A: It isn't. When a project is compiled from source on our system, certain steps are followed that deal with these differences.





Usual steps

The installation of a portable project from source code usually takes place in 3 steps:

a) Configuration.


The project's "configure" script always checks:

The processor family and/or specific model
Endianness and word width
The existence of the right compiler and/or compiler add-on
Whether prerequisite libraries exist or not

And optionally:

Special libraries (like the SDL Lib for X)
Modules (usually sound, USB etc.)
Daemons (like the Arts or Esound Sound servers)
Devel Packages, (usually source) to combine and compile, if needed

If all that is needed exists, the configure script does all that has to be done and exits successfully.

The configure script accepts parameters.

As the installation path is defined in this step, Always Be Aware Of It, especially if the project you want to use is on a computer where you don't have root permissions.

b) Compilation.

There is a "make" executable in the path that understands the instructions (the Makefile) written by the configuration step and does the right thing, without intervention from the user.

The process usually takes a while (have patience) and the output is very verbose.
When all is done correctly, make exits successfully.

It is advisable to perform the above two steps as a regular user, Not As root.

c) Installation

Here there are two options, and the choice between them has to be made back in the first step.

1) You can be root in the system you are working on:

Then, be root and issue the command to install (usually "make install")
The usual path to installation, especially if "configure" ran without parameters, is /usr/local.
So, the bin path is in /usr/local/bin, the lib path in the /usr/local/lib etc.

2) You can't be root in the system you are working on:

In this case you have to set the install dir to a location inside your home directory, e.g. ~/usr, in the configuration step: ./configure --prefix=$HOME/usr
($HOME is used instead of ~ because the shell may not expand a tilde that appears after the = sign.)

If this is done correctly and this dir exists, you simply issue the command "make install" as a regular user.
Careful though, ~/usr/bin, ~/usr/lib etc. might have to be created beforehand.
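
The preparation for a per-user installation could look like the sketch below. The PREFIX variable is just a name chosen for this example, and adding the bin directory to PATH is an extra convenience step, not something the project itself requires.

```shell
PREFIX="${PREFIX:-$HOME/usr}"       # per-user install location, as in the text

# Pre-create the directories that "make install" may expect to exist.
mkdir -p "$PREFIX/bin" "$PREFIX/lib"

# Let the shell find the programs installed there (this session only).
case ":$PATH:" in
    *":$PREFIX/bin:"*) ;;                        # already in PATH
    *) PATH="$PREFIX/bin:$PATH"; export PATH ;;
esac

echo "configure with: ./configure --prefix=$PREFIX"
```

To keep the PATH change for future logins, the same lines would have to go into your shell's startup file.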



Overview

In a GNU/Linux Distro, the tools are almost always GNU.
I assume that you have uncompressed the project into a dir you have write permission for.
"Op." steps are Optional, depending on you or on the instructions of the project.
All the usual steps are:
cd <project>
Go to the project source top directory.
less README ; less INSTALL
Read the README and INSTALL text files, if they exist
./configure --help | less
Op. See all the configuring parameters and the defaults.
./configure
Issue the configuration command
make depend
Op. depending on the Project documentation
make
The compile process
make check
Op. depending on the Project documentation
su -c 'make install'
Single command to install as root
make clean
Op. cleanup the binaries created in the project dir
su -c 'make uninstall'
Op. uninstall as root
make distclean
Op. clean up files from the configuration step
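
The two mandatory user-level steps from the list above can be wrapped in a small shell function. This is only a sketch: build_project is a hypothetical name, and it assumes the project ships a "configure" script in its top directory. The "&&" makes sure that make runs only if configure exited successfully.

```shell
# build_project DIR: configure and compile the project in DIR as the
# current (regular) user. The function name is made up for this example.
# The body runs in a subshell, so the caller's directory is not changed.
build_project() (
    cd "$1" || exit 1
    if [ ! -x ./configure ]; then
        echo "no configure script in $1, read README/INSTALL instead" >&2
        exit 1
    fi
    # Run make only if configure exited successfully.
    ./configure && make
)
```

After a successful build_project <project>, the install step is still done separately, from inside the project dir, with su -c 'make install'.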

This is an example of my projects dir structure:
/home/shared/setup/<project role>/<project name>
with the home/shared dir created Not As root.

Again, to be very boring and frustrating:

Be root Only to install to or uninstall from the system!




In depth view of the compilation procedure

Ok. We know how the general procedure to install from source goes, but how about a more computer oriented description?


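The procedure we call "compilation" is broken into several steps that require special tools. They are driven by gcc (the GNU Compiler Collection, originally the GNU C Compiler), with the assembler and linker supplied by GNU binutils; both packages are available in practically every GNU/Linux distribution.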

The first step is the source file, the last step is machine code that a processor or virtual machine understands.


1) Preprocessor

The preprocessor is the first tool that handles our source code. Most common tasks are:

a) Remove all comments. Comments are useful in source, but useless in the binaries.
b) Prepare the declarations (constants etc.) inside the source code, for example by expanding macros.
c) Act on the lines that start with "#" (the preprocessor directives).

One example of "#" starting line is the #include statement.

The include files are simply external source files that either:
* contain source code that can easily be kept outside the main source file
* declare the functions provided by the libraries present in the system.

Every library or set of libraries must have its include file, which sits between our source code and the actual library and declares the library functions that our code may call.
So, think of an include file as the "way to use the libraries".

These "include" files may be absent from the system only if the user plans never to compile.
Include files usually are inside: /usr/include /usr/local/include and so on.


2) Compiler

The compiler now takes the source prepared by the preprocessor and produces the appropriate assembly code for the specified platform, e.g. Linux for PowerPC.

Assembly is a completely kernel-cpu oriented language as close to the "bare metal" as a symbolic language can be.

Gcc is a retargetable compiler, meaning that it can work for different cpu architectures and it's not limited to the platform it currently works on. This increases portability a lot, but somewhat compromises the quality that a cpu-specialized compiler would produce.

GCC is not limited to C. It includes frontends for: Ada, C, C++, Objective-C, Fortran and Java. The frontend issue is already complicated by itself and exceeds the purpose of this paragraph.

This assembly code can be generic, oriented to the processor family (i386), or optimized for the specific type of processor we use, e.g. AthlonXP, or alternatively it can use an intermediate instruction set (486 or PentiumMMX).


3) Assembler

Now it's the assembler's turn to take the assembly file and produce an object file from it.
But what is an object file?
The object file is a machine code binary file oriented to a specific kernel-processor, but still not yet an executable or library.

This intermediate step is useful for the overall compilation method.
Modularity helps programming.
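
The three stages described so far can be run one at a time to see what each one produces. The sketch below assumes gcc is installed (it does nothing if it is not); hello.c is a throwaway file created in a scratch directory just for this example. The options -E, -S and -c tell gcc to stop after preprocessing, after compiling to assembly, and after assembling, respectively.

```shell
CC="${CC:-gcc}"                     # assumption: gcc is installed
workdir=$(mktemp -d) && cd "$workdir" || exit 1

cat > hello.c <<'EOF'
#include <stdio.h>
#define GREETING "hello"   /* a constant for the preprocessor to expand */

int main(void)
{
    /* this comment will not survive preprocessing */
    puts(GREETING);
    return 0;
}
EOF

if command -v "$CC" >/dev/null 2>&1; then
    "$CC" -E hello.c -o hello.i     # 1) preprocessor: expanded source
    "$CC" -S hello.i -o hello.s     # 2) compiler: assembly code
    "$CC" -c hello.s -o hello.o     # 3) assembler: object file
    ls -l hello.i hello.s hello.o
else
    echo "$CC not found, nothing compiled" >&2
fi
```

Looking at hello.i and hello.s with less shows the expanded include file and the generated assembly; hello.o is already binary.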


4) Linker

Now it's the linker's turn to create the final executable or library. The method is as follows:

a) Multiple object files can be combined into a single executable or library.

Keeping the code in small pieces (objects) before making the whole definitely helps the overall procedure.
It's now the linker's turn to construct the whole.

b) The undefined symbols are linked to the relevant libraries they point to.

"Undefined Symbols" means that functions or variables that are mentioned in the object are not inside it, therefore being inside another object (think library).

The linker's job is to link these symbols to the actual code that they are referring to.
There are 2 types of linking:

Static linking: The referenced library gets embedded inside the executable. Thus, every executable has its own copy of the needed libraries inside it.

The advantages are independence and stability.
The disadvantages are a great loss of disk space and the need to relink and/or recompile the entire project when the referenced libraries get upgraded.

Dynamic linking: Instead of being embedded in the executable, the library gets loaded by the Operating System's loader when the executable starts (load time) or when it is needed (run time).

The advantages are great disk economy and instant upgrade of the overall system when the libraries get upgraded, because no relinking is needed.

The disadvantage is heavy dependency; if the library gets deleted/renamed/moved or updated to a version incompatible with the programs that need it, a malfunction is very likely.

In Unix compatible OSes, dynamic libraries carry the Shared Object extension ".so" and are most probably located inside /lib, /usr/lib and so on.

Issue: ls -lh /lib/*so for a little library browsing.
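
To see which shared libraries a dynamically linked program asks for, the ldd utility can be used where it is available. A small sketch, using the ls program as the example target; on a statically linked executable ldd simply reports that it is not a dynamic executable.

```shell
target=$(command -v ls)             # any installed executable will do

if command -v ldd >/dev/null 2>&1; then
    # List the shared objects the loader would bring in for this program.
    ldd "$target" || echo "$target does not seem to be dynamically linked"
else
    echo "ldd is not available on this system" >&2
fi
```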

c) The object's address space may be relocated inside a program.

When the compiler makes the objects, it doesn't know what they will be used for, therefore assumes a common base address, for example 0.

"Memory Address" is a virtual memory location used by software that will be translated by the hardware into a real (physical) memory address.

The linker, knowing which objects will be assembled into whole programs, relocates the object memory addresses.

The Memory Address issue is a lot more complicated of course, and it also includes the Virtual Memory (or Swap Space) mechanism that our modern OSes use in order to extend our computer's physical memory.




Making a Slackware package from a project source

There are cases where we want to compile a project once and use it many times:
1) in a network full of same distro boxes,
2) after reinstalling our favorite distro once more,
3) to send the project to a less experienced friend,
4) to save time especially when having to compile again and again.

This paragraph is just an introduction to the packaging method in Slackware and covers only its simplest use.

The makepkg script does all the work, constructing a directory tree relative to the directory it runs in, from there and down.

This means that if we are currently in /home/shared/setup/Pool and this location includes a "usr" directory, then the makepkg script would make a package containing /usr as the first subdir below the "root" of the filesystem...
... which means that we can easily reproduce the desired tree wherever we want.

So, /home/shared/setup/Pool/usr/local/bin can easily become /usr/local/bin in the package, as long as we run the makepkg script from
/home/shared/setup/Pool. Simple, no?

Reproducing the directory tree is easy if we configure the package to use /home/shared/setup/Pool/usr/local as the base installation location in the configure step:
./configure --prefix=/home/shared/setup/Pool/usr/local
Then make will compile the package and make install will place it accordingly.
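
The mapping between the pool directory and the paths inside the package can be tried out without makepkg. In the sketch below a scratch directory stands in for /home/shared/setup/Pool and "hello" is a made-up installed file; the listing shows each file the way it would appear relative to "/" once the package is installed.

```shell
pool=$(mktemp -d)                   # stands in for /home/shared/setup/Pool
mkdir -p "$pool/usr/local/bin"

# A hypothetical file, as "make install" would have placed it.
printf '#!/bin/sh\necho hello\n' > "$pool/usr/local/bin/hello"
chmod 755 "$pool/usr/local/bin/hello"

# Paths relative to the pool become paths relative to "/" in the package.
package_paths=$(cd "$pool" && find . -type f | sed 's|^\./|/|')
echo "$package_paths"
```

Running makepkg from inside the real pool directory packs exactly this relative tree.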

I suggest having a regular user own /home/shared/setup, so that root is not needed for the install step.

Now, we change directory: cd /home/shared/setup/Pool
and become root: su

and issue: makepkg -c y packagename.tgz

The script will make the package in no time, change the ownership of all the files to root and set the directory permissions to 755 (writable by root but not by anyone else), which is the most secure arrangement for a decent Unix compatible OS.

Afterwards, root must move the package.tgz to a suitable location containing custom packages and change the name of the package to a more appropriate name.
Then, still as root do: rm -rf /home/shared/setup/Pool to reinitialize the location for the next compilation and packaging job.

An overview of the packaging mechanism for Slackware is here.
A sample script that could do all the work can be found here.