Saturday, June 1, 2013

ARM Startup and Compilation and the Relevant Files on the STM32F4-Discovery


Author: Richard Ballard
Date Created: 30/05/13
Date Modified: 23/06/13
Version: 2.0

I've revised this post to make it easier to understand. The previous version was written during my university exam period so it was a tad rushed.

This tutorial is about understanding how an ARM microcontroller starts its code and what is needed for this to occur. I will very quickly go through the basic process and then get stuck into the key files that are needed. So as not to make this tutorial too generic, I'm using the STM32F4-Discovery Board as the target, so the file names and code will be geared towards this but should be easily translatable to other microcontrollers.

After I've discussed the startup process in basic form, I'll discuss the compilation process. This includes what the linker does, how the library is connected to our code and then what the makefile does. The makefile is equivalent to how IDEs compile everything automatically but is much more flexible.

After that, I'll discuss the startup file (startup_stm32f4xx.s), then move on to the linker file (stm32_flash.ld), the makefile and some basics of the peripherals library provided by ST. The peripherals library is a set of functions that provide an easy to use interface to manage the registers on the microcontroller. All ARM Cortex-M based microcontrollers should have something similar provided by the manufacturer. If one is not provided, it should be very easy to create a similar library. The files used in this document can all be downloaded as part of Jeremy Herbert's STM32 Template from https://github.com/jeremyherbert/stm32-templates/archive/master.zip. They can also be downloaded from ST (minus the makefile) by downloading the STM32F4-Discovery Firmware Applications Package. The code in Jeremy's templates is from the IO_toggle example program in the application package.


Part 1 – The Startup Process)
The startup process can be quite confusing for anyone new to ARM microcontrollers. I know I found this to be the case with all these weird files such as the “startup file” and the linker script. Hopefully here I can explain what is going on and enable the reader to understand why a file or process is needed.

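So, when the microcontroller is powered on, it starts executing code from the location of the reset vector within the exception vector table [1]. On the Cortex-A parts described in [1] this table is at either address 0 or address 0xFFFF0000. On a Cortex-M part such as the STM32F4 the table is read from address 0 at reset; its first word is the initial stack pointer and its second word is the address of the reset handler. The code reached through this vector then needs to perform some or all of the following operations (copied from the Cortex-A Series Programmer's Guide). It is in effect similar to the BIOS on a PC.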
    • in a multi-processor system, put non-primary processors to sleep
    • initialise exception vectors
    • initialise the memory system, including the MMU
    • initialise processor mode stacks and registers
    • initialise variables required by C
    • initialise any critical I/O devices
    • perform any necessary initialisation of NEON/VFP
    • enable interrupts
    • change processor mode and/or state
    • handle any set-up required for the Secure world
    • call the main() application
The code for this process is called the startup file and is usually written in assembly. We need this regardless of whether we are going to be running a 'bare metal' system or one with an operating system. In our case we only perform a few of these operations, mainly initialising the vectors, memory, stack and registers, and then calling the main() application. We also set up the system clocks, which for our template code is done in a separate C file. Once the startup process has finished, our program starts.

So to recap, we power on, set up our microcontroller and then we can start our user code or boot loader.

Part 2 – The Compilation Process)
Before we even get to running our program we obviously have to compile it, but where do we start? Well, in my case, I run a makefile. IDEs, on the other hand, by default have internal mechanisms that perform the same operations as a makefile. In the Eclipse IDE, for example, this can be turned off and set to use a makefile. The makefile is, in essence, a script that, when run, determines what has changed in our project directory and then tells the compiler what to compile. The files that need compiling are any updated or new files, as well as any that use sections of these updated or new files.

Once compiled, the objects are not ready to be run on the microcontroller though. The object files are linked with other object files and written out in the desired format, with options on where sections of code go in the output file and whether any sections need to be removed. This is what a linker script does: it tells the linker that, when creating the output file, certain sections from the compiled object files go in one place while other sections go in another. Examples of different sections are executable code, initialised data (such as variables with a starting value) and uninitialised data (such as variables without one). Libraries work in the same way as normal code in that they are compiled and then linked/copied into the output file.



Part 3 – The Startup File)
The startup file is a file containing assembly code that sets up the chip before starting our main C code. In basic terms, it is the reset handler: it sets up our microcontroller, including the interrupts and memory. Once set up, it starts the 'main' function of our C code.

Most of the information below is pulled from the GNU AS manual. So to explain the file, I'll go through the commands being used and explain what is happening in the file startup_stm32f4xx.s.

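First up is .syntax unified. This tells the assembler to expect the unified ARM/Thumb assembly syntax rather than the older 'divided' syntax, where ARM and Thumb instructions were written differently.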

.cpu cortex-m3 is the same as using -mcpu on the command line as is done in our makefile. We could change this to cortex-m4 for our compiler. Alternative values can be found in the GNU AS manual.

.fpu softvfp is the same as the -mfpu command line option for as. Alternative values can be found in the GNU AS manual.

.thumb sets the instruction set being used to Thumb. The Cortex-M chips don't support the ARM instruction set, only Thumb and Thumb-2 plus any extras such as the FPU instructions.

.global makes a symbol visible to the linker so that other object files can refer to it. This is similar to giving a variable or function global scope in C.

.word inserts a value of word length (32 bits) into the object file that is created after assembling. The ones near the top of the file hold the addresses of symbols defined in the linker script talked about below.

.section This is about where to store the content below in the assembled object file. More about sections below when we talk about the linker script.

.weak marks a symbol as weak. A weak definition is used only if no other (strong) definition of the same symbol exists; if one does, the linker uses that instead. This means we can set default behaviour for interrupt handlers, for example, and then later write our own in C. The handler in C will then replace the weak version supplied here when processed by the linker.
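The same idea can be tried out in C with GCC's weak attribute. This is only a small sketch: the counter and the function body are made up for illustration, and the real default handler in the startup file spins in an infinite loop rather than returning.

```c
/* Counts calls so the sketch has something observable (hypothetical). */
static volatile int default_calls = 0;

/*
 * Weak default definition. If any other object file in the link defines
 * a normal (strong) WWDG_IRQHandler, the linker picks that one and this
 * body is silently dropped.
 */
__attribute__((weak)) void WWDG_IRQHandler(void)
{
    default_calls++;
}
```

If nothing else in the project defines WWDG_IRQHandler, calling it runs the body above. Adding a plain void WWDG_IRQHandler(void) { ... } to another source file replaces it without any change to this file.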

.type sets the type for a symbol. In our file, we set symbols to types of function and object. According to the manual, the % prefix can just as well be a # or @, or the type can be given in "" or in upper case with an STT_ prefix. Examples are %function, #function, @function, "function" and STT_FUNCTION. Note that on ARM the @ character starts a comment, which is why the file uses %. The types we are able to use for ARM (i.e. not for GNU systems) are as follows (the descriptions are from the GNU AS manual):
      • function Mark the symbol as being a function name.
      • object Mark the symbol as being a data object.
      • common Mark the symbol as being a common data object.
      • tls_object Mark the symbol as being a thread-local data object.
      • notype Does not mark the symbol in any way. It is supported just for completeness.


NAME: sets a label for a section of code. Reset_Handler is an example of this. We can set the type of a label with the .type directive above. For us, the Reset_Handler then runs some other assembly code which I'm not going to cover but is all quite basic. It also runs a system setup/initialisation function (this function resides in system_stm32f4xx.c) for setting up the clocks. By default a symbol is of type 'notype'. This is not specifically stated in the manual but can be found by looking at include/elf/common.h in the binutils source.

.size sets the size of a symbol.

Default_Handler is another function that, if executed, goes into an infinite loop. This is used if an unexpected interrupt occurs, for example when we haven't overridden the weak default handler for that interrupt.

Lastly, we set another section containing all the interrupt vectors, which is of object type. We first define the words for them and then weakly alias each handler name to the default handler. This is done with the .thumb_set directive. This directive sets the symbol as an alias of the default handler and also marks it as a Thumb function entry point, i.e. it can be executed. The order of the entries is important, as the chip expects to find the address of each interrupt's handler at a fixed position in the table. This is the reason there are some words with a value of 0: they fill reserved positions. To override the default handler in C we can simply define something like void WWDG_IRQHandler(void), where the name matches the one in the table in this file.
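To make the point about ordering concrete, here is a rough host-side model of a vector table in C. Everything in it is hypothetical: the four-entry table, the _sim names and raise_irq() are invented for the sketch, and the real table is much longer and begins with the initial stack pointer and the reset handler.

```c
#include <stddef.h>

typedef void (*handler_t)(void);

static int default_hits = 0;
static int wwdg_hits = 0;

/* Stand-in for Default_Handler; the real one loops forever. */
static void Default_Handler_sim(void) { default_hits++; }

/* Stand-in for a handler that user code has overridden. */
static void WWDG_IRQHandler_sim(void) { wwdg_hits++; }

/* Position in the table, not the name, decides which interrupt is served. */
static const handler_t vector_table_sim[] = {
    WWDG_IRQHandler_sim,  /* slot 0: overridden by user code        */
    Default_Handler_sim,  /* slot 1: still the default handler      */
    NULL,                 /* slot 2: reserved, stored as the word 0 */
    Default_Handler_sim,  /* slot 3: still the default handler      */
};

/* Models the hardware fetching the handler address for a given slot. */
static int raise_irq(size_t slot)
{
    size_t count = sizeof vector_table_sim / sizeof vector_table_sim[0];
    if (slot >= count || vector_table_sim[slot] == NULL) {
        return -1;  /* nothing sensible to call */
    }
    vector_table_sim[slot]();
    return 0;
}
```

Swapping two entries in the array would make the wrong function run for an interrupt, which is why the reserved words have to stay in place as padding.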

Part 4 – The Linker Script)
The Linker script is a script that sets up how the compiled objects are copied into the output executable. It is mostly dependent on the microcontroller being used and the surrounding circuit so does not need to be changed per project. We can change details about memory in this file though so it may still be useful to change for individual projects. With the help of the GNU LD manual, I'll run through the supplied stm32_flash.ld file.

Firstly, ENTRY(Reset_Handler). This tells the linker which symbol is the entry point of the program, i.e. which routine is run first. Reset_Handler is from our startup_stm32f4xx.s.

Next we create some variables in our linker script for ease of use that, as the file says, allow us to tell the linker the size of the heap and stack (I'm not going to explain what a heap or stack are, as this is explained in detail many times on the net).

We then set the memory areas. Their main use is to tell the linker where to put sections of code ( >FLASH for example, at the end of an output section, places that section in flash). They also allow the linker to report an error when a region of memory overflows.

Now, we need to understand how a binary application is put together. We have a binary file that has different sections for different types of data. The main sections are executable code (.text), initialised data (.data) and uninitialised data (.bss). In this case we also have sections for the interrupt vectors (.isr_vector), the heap and stack (._user_heap_stack), and some ARM related sections. The interrupt vector table holds handler addresses rather than code, but it has to be placed before the general executable code, at the very start of flash, so we add it separately. The heap and stack section contains no data, as it is only used as a placeholder to check that enough memory is available.

So each output section (the .name bit) is put in order but the linker needs to know what to put in these sections. So under each output section is a line similar to *(.name). This is an input section description (also called an output section command, confusing I know). What this does is get the sections from the object files that the linker is processing and put them under the output section for the final output file/executable. So in the object files that the linker is dealing with, there might be .text, .rodata, or .text* where * is any suffix. These sections then all get bundled together inside a single .text section in the output. If you don't specify an output section explicitly, the linker will create a section in the output with the same name as the section of the input object. It will then use the attributes in the memory section to work out where this section will be stored. The * means the same in any location as it does in a unix shell, i.e. it matches any run of characters. So for input sections, if it is *(.text) then this will include all .text sections from all files. It can be done per object file by replacing the * with the object file name. Patterns are processed in order though, so the first match will be used.
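The shell-style matching can be tried out on a PC with the POSIX fnmatch() function. This is only an approximation of what the linker does, and section_matches() is a name made up for this sketch.

```c
#include <fnmatch.h>

/*
 * Returns 1 if section_name matches the shell-style pattern, else 0.
 * fnmatch() itself returns 0 on a match, so the result is inverted here.
 */
static int section_matches(const char *pattern, const char *section_name)
{
    return fnmatch(pattern, section_name, 0) == 0;
}
```

With this, the pattern ".text*" matches both ".text" and ".text.main" but not ".rodata", while a lone "*" matches any name at all, in the same way that the file part of *(.text) matches every object file.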

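One last thing before I get into some keywords in this file. For the .data section, you will see it has an AT (_sidata) parameter. The initial values of the initialised data have to be stored in flash so they survive a power cycle, but the variables themselves live in RAM. AT sets where the section is stored (its load address) separately from where it is used. The _sidata variable is set just above and is used so the startup code knows where in flash the initial values are stored, ready to be copied into RAM before main() runs.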
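What the startup code does with these addresses can be sketched in C as below. This is a host-side model with made-up function names, not the actual assembly from startup_stm32f4xx.s. On the chip the pointers would come from linker script symbols; the ones other than _sidata (typically _sdata, _edata, _sbss and _ebss in ST's scripts) are an assumption about the script.

```c
#include <stdint.h>

/*
 * Copy the initial values of .data from their load address (flash, the
 * role played by _sidata) to their run address in RAM, a word at a time.
 */
static void copy_data(const uint32_t *load_addr, uint32_t *start,
                      const uint32_t *end)
{
    while (start < end) {
        *start++ = *load_addr++;
    }
}

/* Clear .bss so that uninitialised C variables start out as zero. */
static void zero_bss(uint32_t *start, const uint32_t *end)
{
    while (start < end) {
        *start++ = 0;
    }
}
```

Both loops work in whole words, which is one reason the linker script keeps the start and end of these sections word aligned.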

So the main keywords in this script are ALIGN, PROVIDE, PROVIDE_HIDDEN, KEEP and /DISCARD/.

The ALIGN keyword is the most used of these and so I'm going to explain this in the most detail. ALIGN rounds an address up to the next multiple of the given number of bytes. ALIGN(4) on its own only calculates the location counter rounded up to the next 'word' boundary; it does not set it. To set it, we have to use . = ALIGN(4), with '.' being the location counter (the address the linker is currently placing output at) and a word being 4 bytes in size. We use word alignment because the processor reads and writes data a word (4 bytes) at a time and the startup code copies and clears sections a word at a time. If section boundaries weren't word aligned, we could end up touching neighbouring bytes or crashing with an alignment fault. In essence, we are padding our sections out so that each one starts and ends on a word boundary.
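The arithmetic behind ALIGN can be written as a one-line C function. This is just a sketch of the rounding, not the linker's own code; align_up() is a made-up name and the alignment is assumed to be a power of two.

```c
#include <stdint.h>

/*
 * Round addr up to the next multiple of alignment.
 * Assumes alignment is a power of two (4 for word alignment).
 */
static uint32_t align_up(uint32_t addr, uint32_t alignment)
{
    return (addr + alignment - 1u) & ~(alignment - 1u);
}
```

For example, an address of 0x20000001 rounds up to 0x20000004, while an address that is already aligned, such as 0x20000008, is left unchanged.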

PROVIDE means that the linker will provide a definition of this variable if one is not found in the object files. Normally, in the cases used for our script, there won't be definitions and so the linker will always provide them. These variable definitions in this script just provide locations/pointers to where individual sections start and stop. PROVIDE_HIDDEN is similar to PROVIDE but, for ELF type output files, the symbol is hidden and 'won't be exported'.

KEEP will make sure a section, or part thereof, will not be removed. This matters when the linker is told to garbage collect unused sections (--gc-sections): a section such as the vector table is never referenced by any code, so without KEEP it could be thrown away.

/DISCARD/ does the opposite of KEEP and removes any sections that correspond to its arguments. In our script, it removes information on the standard libraries if they were accidentally included in the linking process (such as when including a directory with these inside).

Part 5 – The MakeFile)
The makefile is a script that tells our system what needs to be compiled, in what order and how. It is run using the program make. This sample project uses implicit rules so we don't have to create a separate rule for each file and instead you can just list the files. Similarly to the startup file and linker above, the information below is mostly extracted from the respective manual, in this case GNU make.

So at the top of our top level makefile, we set a variable that contains the file names of all of our C code that needs compiling. Later down we convert this list to a list of object files which then is used for the implicit rules that tell make which files need compiling. PROJ_NAME is the name used for generating output files.

Next we set up the compilers and compiler flags. Nothing too special here, other than we tell the linker where our linker script is, to use little-endian and that our chip is cortex-m4 based. The command line flags override those set in the files themselves so the cpu and fpu set in the startup_stm32f4xx.s file are overridden with these values.

The vpath directive allows you to tell make where to look for certain types of files. For example, vpath %.c src says it can look in the directory src for files that end in .c.

Next we add some more compiler flags/arguments. These simply add search paths to find files, so we can say #include "stm32f4xx_conf.h" for example and it will find the one in the inc folder rather than saying it can't find it in the src folder or any of the system library locations.

The next line adds the startup file (startup_stm32f4xx.s), talked about above, to the build.

The following line then sets up the object files as talked about before, i.e. it sets up our list of object files for the implicit rules.

We then set up the make recipes. Of note, a phony target is one that does not correspond to a file of the same name. In this case, lib is phony as it calls another makefile inside the lib directory.

Part 6 – The Peripherals Library)
The Peripherals Library is provided by ST to meet the CMSIS standard. It is basically a set of functions that manipulate the registers to provide the desired operation without the need to manually edit the registers yourself. I recommend you read the related Reference Manual for the microcontroller you are using in conjunction with using the library so you understand the full picture of what is happening. For the STM32F4xxx series the Reference manual code is RM0090.

There are quite a few C files included as part of the application package and peripherals library. We only need to “include” some of these in our code. All of the peripherals library needs to be included in the project even if you use only a section of it, unless you want to edit the library to exclude unused files (this is however pointless as the linker will only copy what is needed).

I will look at the files stm32f4_discovery.h, stm32f4xx.h, stm32f4xx_conf.h, the stm32f4xx_it and system_stm32f4xx.c. These are all part of the application package/template but are not necessarily part of the peripherals library.

First up is stm32f4_discovery.h. This file is aimed mainly at the demonstration projects for the STM32F4-Discovery. It's basically just alias definitions, such as LED4_PIN for GPIO_Pin_12. The one useful thing it does is include the stm32f4xx.h file, which we'll have to do ourselves if we leave it out. It is not part of the peripherals library and is not needed outside demonstration projects.

The stm32f4xx.h file is the “CMSIS Cortex-M4 Device Peripheral Access Layer Header File. [It] contains all the peripheral register's definitions, bits definitions and memory mapping for STM32F4xx devices.” By writing to the registers defined in this file, we can control the processor without any abstraction libraries if we so desire. This can be very useful but, for most, it will be best to use the rest of the peripheral library functions as they are more portable and yet still relatively lightweight. This file is not part of the peripherals library itself BUT is required by it. Chips that don't have a peripherals library should still have a header similar to this.
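Direct register access looks roughly like the sketch below. The struct here is a cut-down, hypothetical stand-in so the code can run on a PC; the real header defines the full register layout and maps it onto the peripheral's fixed address. PIN_12, led_on() and led_toggle() are also made-up names.

```c
#include <stdint.h>

/* Hypothetical, trimmed register block; not the layout from stm32f4xx.h. */
typedef struct {
    volatile uint32_t MODER;  /* pin mode register    */
    volatile uint32_t ODR;    /* output data register */
} fake_gpio_t;

#define PIN_12 (1u << 12)     /* bit mask for pin 12 */

/* Drive pin 12 high by setting its bit in the output register. */
static void led_on(fake_gpio_t *port)
{
    port->ODR |= PIN_12;
}

/* Flip pin 12 by XORing its bit. */
static void led_toggle(fake_gpio_t *port)
{
    port->ODR ^= PIN_12;
}
```

On real hardware the pointer would refer to the peripheral's memory-mapped registers rather than a variable in RAM, and the volatile qualifier stops the compiler from optimising the accesses away.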

Next, we have stm32f4xx_conf.h. This is the file we have to include if we want to use any of the peripheral library functions. It in turn includes the header for each peripheral. We can include those headers directly instead if we don't want all of the library, such as when you want to create your own functions in the same namespace. Names reference back to those used in the reference manual. In this file, the high speed external oscillator frequency (HSE_VALUE) is also set to 8 MHz for the Discovery Board. If this is set wrong for a project, it can have major effects because timers and communications depend on it.
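To see why a wrong HSE_VALUE matters, here is a small sketch of the arithmetic the software does to work out the system clock from the PLL settings. The function name is made up, and the divider values mentioned afterwards (M = 8, N = 336, P = 2) are assumed typical Discovery settings rather than something taken from this post.

```c
#include <stdint.h>

/*
 * System clock the software believes it has when running from the PLL:
 * the input clock is divided by m, multiplied by n, then divided by p.
 */
static uint32_t pll_sysclk_hz(uint32_t hse_hz, uint32_t m, uint32_t n,
                              uint32_t p)
{
    uint64_t vco_hz = (uint64_t)hse_hz / m * n;
    return (uint32_t)(vco_hz / p);
}
```

With the board's 8 MHz crystal and those dividers this gives 168 MHz. If HSE_VALUE were left at, say, 25 MHz while the crystal is really 8 MHz, the same sum would come out at 525 MHz, so every baud rate and timer period derived from it would be off by the same factor.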

We next have the stm32f4xx_it.* files. These files are provided by ST so we can add interrupt handlers all into the same set of files. We don't need to use these though, as the handlers given in these files are identical to the default handler (an infinite loop) and we can set our own custom handlers anywhere in our source tree.

Finally we have system_stm32f4xx.c. The content of this is very important; without it your program won't start! The reason for this is that the file contains a function that initialises the clocks. This is run as part of the startup process before the main function is run. The other two functions in this file allow for updating the onboard clock. This file configures quite a lot and, because some of the configurations are project dependent, it is placed in src rather than as library code.

Part 7 – Miscellaneous)
Here are some extra tips for understanding the peripheral library. Firstly, if you run doxygen on the project directory, you will end up with a great documentation resource. This will have extracted all of the inline documentation of the peripherals library (and your user code if you documented it) and put it into an easily navigable set of files.

Another tip is that when looking through demonstration/example projects using an IDE, you can likely get an option that will find the declaration of the desired function, macro, struct etc. In Eclipse IDE, you can highlight one of these and press F3 or right click and select "Find Declaration". Eclipse will then take you to the definition (assuming it is within the project) whether it be in the current file or another file somewhere else in your project.

If you want to have the GNU AS and Linker manuals as PDFs rather than the HTML versions from their websites, you have to make them manually. To do this, you have to download binutils and then group together all the tex and texinfo files. You can then, assuming that TeXLive is installed, run texi2dvi --pdf as.texinfo or texi2dvi --pdf ld.texinfo. I have included links in the references below to the manuals on my Google Drive.

Conclusion)
Hopefully now, you understand slightly better how compilation and boot processes work and how the different pieces fit together. I hope sharing this helps other people on their way to learning how to program an ARM processor.

Changes:
23rd June 2013 – version 2.0:
• New introduction to change emphasis from just the files themselves to the process that involves these files
• Added information on how the microcontroller starts up
• Added information on how the code is compiled.
• Rephrased startup file section to change from a list style to a more structured style.
• Rephrased first paragraphs of most sections to reflect updates

References:
[1] Cortex-A Series – Programmer's Guide version 2 – http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0013c/index.html – Downloaded on 27th August 2011
Manuals
[2] – GNU Assembler manual – Built from the texinfo sources of binutils version 2.23.2
(https://docs.google.com/file/d/0B5LruPF-c2m9RTFVSWNndHRnaDg/edit?usp=sharing)
[3] – GNU Linker manual – Built from the texinfo sources of binutils version 2.23.2
(https://docs.google.com/file/d/0B5LruPF-c2m9ZlVvZmt2a1NZZXM/edit?usp=sharing)
[4] – GNU make manual - http://www.gnu.org/software/make/manual/