2nd ed., Feb 1, 2024, 1st ed. Jan 9th, 2010
A common misconception is that assembly language programming is a relic of the past. This is certainly not the case, and assembly language remains a core knowledge area for embedded systems development, digital design, and algorithm development in the 21st century.
A second misconception, especially amongst those who are only familiar with higher level languages (Python, Ruby, C#/.NET, Perl), is that assembly language is a defective programming language and therefore not worth the time to invest in.
But assembly language is more than ‘just another general purpose programming language’. It is actually the control signal specification for the microprocessor or microcontroller that will be running the instructions, and whose digital design must be reasonably well understood in order to get it to work successfully.
Higher level languages typically hide the underlying toolchains behind turnkey integrated development environments (IDEs). But the toolchains are valuable in their own right, comprising various software components (pre-processor, compiler, assembler, linker, loader) which take the high level code and transform it to executable machine code that can run on the target processor, optionally producing assembly code for inspection along the way. Familiarity with this toolchain can help evaluate how much overhead the high-level tools introduce on the code, which is an important part of understanding how much you’re trading off.
In this article, we’ll first take a look at the software toolchain involved in general terms, before turning to specific tools you can use on a modern Windows computer (through Windows 11) to target an x86 chip (no longer in your PC but in a DOS emulator). Similar skills and approaches carry over to the toolchain for the Atmel 328P and ATTiny 85 with a graphics application (TinyPhoto) on the ATTiny85 here.
A. Overview: High Level Orientation
Assembly language programming is about driving a complex microchip (logical computing device) whose design expects electronic signals to arrive in a prescribed fashion. Higher level languages (HLLs) are constructed to hide those details and provide a generic algorithmic description of what is desired in a machine-independent syntax. The compiler (a complex piece of translation software) parses the HLL source file and generates the required assembly language output.
Instruction Set of a Chip
Every general purpose computing chip (microcontroller or microprocessor) has an electronic control language with instructions encoded as numbers (operating codes, or op-codes for short). These numbers are the chip’s machine code, and assembly language is simply the assignment of convenient mnemonics to these machine instructions in a way that humans will find easy to learn, remember, and apply. Assembly language is a 1-to-1 translation interface with machine language instructions. Assembly language instructions are exactly those commands which the chip can perform, and the assembly language manual is therefore the specification of what the system can do, and how the programmer needs to instruct the system to get it done.
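To make the mnemonic-to-opcode idea concrete, here is a minimal sketch in C of a lookup table for a handful of single-byte x86 instructions. The function names assemble_one and disassemble_one are made up for this illustration, and the table covers only a few opcodes that take no operands; a real assembler also has to encode registers, immediates, and addressing modes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A few single-byte x86 instructions: one mnemonic, one opcode. */
struct op {
    const char *mnemonic;
    unsigned char opcode;
};

static const struct op OPS[] = {
    { "nop", 0x90 },  /* no operation */
    { "ret", 0xC3 },  /* near return */
    { "hlt", 0xF4 },  /* halt the processor */
    { "clc", 0xF8 },  /* clear carry flag */
    { "stc", 0xF9 },  /* set carry flag */
    { "cli", 0xFA },  /* clear interrupt flag */
    { "sti", 0xFB },  /* set interrupt flag */
    { "cld", 0xFC },  /* clear direction flag */
    { "std", 0xFD },  /* set direction flag */
};

#define OPS_COUNT (sizeof OPS / sizeof OPS[0])

/* "Assemble": mnemonic in, opcode out. Returns -1 if the mnemonic is unknown. */
int assemble_one(const char *mnemonic)
{
    assert(mnemonic != NULL);
    for (size_t i = 0; i < OPS_COUNT; i++) {
        if (strcmp(OPS[i].mnemonic, mnemonic) == 0) {
            return OPS[i].opcode;
        }
    }
    return -1;
}

/* "Disassemble": opcode in, mnemonic out. Returns NULL if not in the table. */
const char *disassemble_one(unsigned char opcode)
{
    for (size_t i = 0; i < OPS_COUNT; i++) {
        if (OPS[i].opcode == opcode) {
            return OPS[i].mnemonic;
        }
    }
    return NULL;
}
```

Calling assemble_one("ret") from your own main gives 0xC3, and disassemble_one(0x90) gives "nop". The same table serves both directions, which is the sense in which assembly is a 1-to-1 interface to machine code.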
Toolchain
Talking to a chip requires a bit of setup and some tools: some hardware, some software. Let’s take a look.
1. Target Platform
First choose your hardware (computing) target platform. Is it going to be a system-on-chip or a computer system consisting of a number of interacting chips? What will be the architecture of this chip? Will it be an Intel x86 microprocessor? A Motorola 68K microprocessor? An Intel 8051 microcontroller? A Microchip PIC microcontroller? An ARM microcontroller?
Once you have your processor, you have a processor organization and processor features which determine the programmer’s interface, i.e. the opcodes (instruction set). You have to get a copy. Where? Get the chip datasheet. And its User Guide, if there is one. And the chip’s Application notes. You have to understand at least a minimal subset of these in order to write your program and have it work as you expect.
If you’re programming for a bare chip, i.e. with no operating system wrapping your binary, then you are in charge of all timing, sequencing, etc. There is nothing negotiating between your instructions and the electronic signals they cause. You and your instructions are the only things (and all the things) driving the schedule. (Now, this actually makes things easier for deterministic, hard real-time software development.)
If you’re programming within an operating system environment (16-bit MS-DOS, 32-bit MS Windows, Linux, or a real-time operating system — RTOS), you’ll need to understand the functions that the operating system makes available as well as the calling convention required to use these functions. For 16-bit MS-DOS operating in real mode, functions aren’t too difficult and you have the full run of the computer and all of its peripherals and co-processors. For 32-bit assembly language programming on modern Windows in protected mode (from Windows 2000/XP onward), you will need to understand how to work with the extensive Win32 API. Unless you have specific requirements that dictate the use of assembly language (or unless education or masochism is the goal), you will typically be better served by programming in a language that is at least at the level of the operating system. In both the Win32 and the *NIX cases, that language is C.
2. The Host Platform
Now you have to choose a host, i.e. the platform on which you are going to compose the program. The easiest situation is if your host and target are the same. If they aren’t, then you’re going to need a cross-assembler. But you’re also going to need an emulator so that you can test your program on the host machine before cross-assembling it for testing on the target.
3. Assembly Language Dialects for the same chip
Depending on your choice of target, there could be different DIALECTS of assembly languages for the specific chip. For the x86 microprocessor, for example, there is the Intel syntax and the AT&T syntax. Though knowing one you could probably READ the other, they are different enough that you will not immediately be able to WRITE in a dialect that you are not practiced in. Elements of your toolchain will typically have a decided preference for one syntax, i.e. they will not typically be dialect indifferent. For example, the GNU toolchain is AT&T syntax oriented, while the NASM, MASM, etc. line of assemblers is Intel syntax oriented.
So, if you choose a Windows PC as your host and target (best for learning), you have to choose the dialect of your assembly language: AT&T / Motorola syntax or Intel x86 syntax.
You’ll find that the AT&T syntax is a more regularized dialect, but Intel x86 is often the choice for beginning since it appears less daunting at the outset (though its idiosyncrasies make it less appealing once you know what you’re doing).
4. Assemblers and Assembler-specific syntax and code organization
Now you need to pick an assembler. x86 assemblers will typically require that your source code be written in a particular syntax dialect (Intel or AT&T), and so, if you want to choose an assembler first, then you will need to use the dialect that it prefers.
In addition, every assembler will introduce coding conventions and definitions: pseudo-ops (db, equ, etc.), memory and pointer specifying conventions, as well as macros, pre-processor definitions, etc. The syntax for these language elements often varies from assembler to assembler. Is it mov ax, es:foo or mov ax, [es:foo]?
Finally, there may be some different (or limited) choices when it comes time to assemble, and you’ll need to know what you’re doing to get things to roll along without errors.
If you choose Intel x86 syntax, you have several choices of assemblers. I recommend NASM (Netwide Assembler).
If you choose AT&T syntax, you can use gas (GNU Assembler), which is part of the gcc toolchain.
NOTE: you can also get gas to assemble Intel syntax by placing the directive .intel_syntax noprefix in the source file, though the documentation for this is somewhat thin.
5. Automatic Syntax Converters
If you find yourself torn between wanting to code in one dialect and wanting to use a toolchain element that requires the other dialect, you do have a last-ditch option of using an automatic syntax converter that takes you back and forth between the two. For the x86, there is Intel2ATT and ATT2Intel.
Though the converter can take a large part of the pain away, not everything will always go smoothly (as with anything automatic), and so you’ll find that you end up having to know enough about both dialects anyway.
NOTE: gcc IS able to spit out Intel x86 syntax with the switch -masm=intel, but this is not compilable by NASM. ATT2Intel gets most of the details right so is the recommended tool to get from GCC generated assembly code in AT&T syntax to Intel assembly code that can be assembled with NASM.
6. Using a Stripped-down C Compiler as an Automatic Assembly Code Generator
One way to get reasonably legible assembly language code (minus the comments!) is to write the logic in C and then use a stripped down compiler to compile it to assembly. The key here is getting the compiler to generate the least amount of unnecessary code.
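For GCC, use gcc -S (compile to assembly and stop) with the following switches:
-fno-exceptions : don’t generate exception handling code
-s : strip out symbolic information (this takes effect at link time, so it matters for the executable rather than for the -S listing)
-O0 : don’t optimize
-Os : optimize for size (i.e. don’t use unnecessary assembler)
Note that -O0 and -Os are alternatives: when several -O switches are given, GCC uses the last one.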
Or use Fabrice Bellard’s Tiny C Compiler (tcc) and disassemble the executable (PEBrowse works well here).
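As a small test subject for this approach, here is a sketch of a C file that is short enough for its generated assembly to be read line by line. The file name sum.c and the function sum_to are made up for this illustration, and the commands in the comment use only the switches listed above.

```c
/* sum.c (hypothetical file name): a function whose generated assembly
 * is small enough to read in full.
 *
 * Unoptimized listing, written to sum.s:
 *     gcc -S -O0 -fno-exceptions sum.c
 * Size-optimized listing, for comparison:
 *     gcc -S -Os -fno-exceptions sum.c
 */

/* Sum the integers 1..n with a plain loop. */
int sum_to(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}
```

Comparing the two listings side by side is a quick way to see how much of the output is the loop itself and how much is bookkeeping added by the compiler.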
7. A Cornucopia of x86 Object Code Formats
Your chosen x86 assembler will take your source assembly code and assemble it into object code, and spit out an object file. But there are many choices here too: omf, win32, coff, a.out, elf, etc. The object file you are hoping to produce imposes its own syntax and organizational requirements which means that it affects HOW you write your assembly language program and arrange it in your source file. In other words, your assembly language source file cannot typically be run through the same pipe to assemble it into different object file formats.
Another key reason to care about object code formats is if you wish to take advantage of pre-assembled functionality from other libraries or from the operating system. If this functionality exists in the form of object code that you can link to, then your code will need to be assembled to a compatible object format in order for the linker to link all of the necessary pieces.
8. Linkers
To get a stand-alone executable that you can run on your target, you need a linker. There are many linkers, each having its own syntax and options. The linker will need to be able to accept the object file format(s) that you got from your assembler with the correct word length (8-, 16-, 32-, or 64- bit lengths).
9. Linking Libraries
The linker may need run-time or static libraries to resolve definitions. These will also need to be in a compatible format. You will need to know where they are, how to link them in (more syntax), and the right settings.
10. x86 Executable (Binary)
Finally, if you have done all this right, you will get an executable that will be able to run on an x86 platform, either in a COM (DOS) window or directly on the OS (Win32 or Linux). The executable, too, will have a format (e.g. PE-i386, elf, etc.), and this will determine which disassemblers and/or debuggers you are able to use with it.
11. Library (Static or DLL)
If you are creating a library, you have further considerations: static (.a or .lib) or dynamic (with its various flavors). But if dynamic, you also have to know whether the code that is produced is position independent, relocatable, or neither.
12. Debugging
And though you now have an executable, it doesn’t end here. Your program is likely not correct in all of its details (at least not the first several times through). Some mistakes will be obvious quickly. But for others you may find that you need a debugging method.
Debugging an assembly language program isn’t easy without good tools. And good tools add their own additional intricacies to what you have already had to master.
But now that you’ve seen the roadmap, stick around and play a while.
B. Assembly language programming on Windows
To help you get started with assembly language on your x86 box, this article describes a set of open source (free) toolchain elements that play well together on an x86 PC running Windows. The focus is on NASM (Netwide Assembler) and TCC (Tiny C Compiler), but we also cover VAL (linker), SST (scroll-screen tracer), ms-debug (original relic), and FreeBasic. Forth F-PC is covered here.
In the Windows case, it is essential to separate discussions of 16-bit assembly language programming using MS-DOS facilities from 32-bit assembly language programming using the Win32 API, since the two involve quite different issues.
A Sandbox for a Gentle Introduction
Before I list bare bones tools, let me mention a gentler option for playing with 32-bit Intel x86 assembly language without having to extend your assembly code with Windows or C utility functions. The gentler option is the open source FreeBasic. This has a clean inline assembly syntax that gives you clear access to everything that you could do in assembly, while at the same time providing you with the convenience that Basic offers, in its characteristically basic manner.
What does this mean? It means that you can draft up your assembly within the pleasant FreeBasic harness or sandbox. Then when you have everything debugged and working the way you want, you can simply copy and paste the inline assembly into an .asm file, assemble with NASM, link with gcc and off you go with tested code. Nice. This can take some of the pain away when first starting out.
Now, for the hardcore, peel back my fingernails, I wanna code in assembly and eat nails for breakfast using only the bare bones tools, please ignore these gentler paragraphs and carry on. Hooah.
A bare bones toolset
16-bit x86 assembly language programming using MS-DOS facilities
- ms-debug – interesting to get a feel for machine language (hex entry)
- SST – command line debugger for 16/32-bit object code
- NASM (16-bit assembler using Intel syntax, object code format switches: -f bin, -f bin with exebin macro, or -f obj)
- VAL (16-bit linker that takes NASM -f obj and produces omf exes)
32-bit x86 assembly language programming using win32 facilities
- PEBrowse – visual debugger for 32-bit object code
- NASMW (32-bit assembler using Intel syntax, object code format switch: -f win32)
- gcc (MinGW) – C compiler and 32-bit linker taking NASM -f win32 objects
- ld (MinGW) – 32-bit linker
- gas (MinGW) – 32-bit assembler using AT&T x86 syntax
- Microsoft link – 32-bit linker taking NASM -f win32 objects
- tcc – Tiny C Compiler, 32-bit linker taking NASM -f elf objects, 32-bit assembler using AT&T syntax
32-bit Linux assembly language programming
- flat file format NASM -f bin
- elf format NASM -f elf
Mixed language programming
- NASM & tcc for Assembly extended with C
- tcc for C with embedded Assembly (Intel syntax)
- NASM & gcc for Assembly extended with C
- gcc for C with embedded Assembly (AT&T syntax)
- vcc for C with embedded Assembly (Intel syntax)
Disassemblers & Debuggers
- ms-debug – good to start with, but only for training experience
- NASM Disassembler – good to get training experience with hex editor and layout of 16-bit DOS programs.
- PEBrowse – modern 32-bit visual debugger
nasm
This assembler assembles Intel syntax x86 assembly language into one of a variety of object file formats that can be linked into an executable. NASM assembles source code into 16-bit flat binary (.bin, .com), 16-bit OMF obj (for linking into an OMF executable for MS-DOS with the VAL linker), 32-bit ELF obj (for linking, using tcc, into a PE-i386 executable for modern Windows), and 32-bit win32-coff obj (for linking, using gcc, into a PE-i386 executable for modern Windows).
tcc
- this compiler compiles into straightforward assembly language with straightforward use of procedure calls. Small program (10% of mingw’s gcc)
- ALSO an x86 assembler (AT&T syntax, gas style) (with .s files)
- Can handle GNU inline assembly with asm keyword
mingw gcc
This compiler can be made to compile into straightforward assembly to compare with tcc. Use the switches: gcc -fno-exceptions -s -O0 -Os. The default compilation is the most general and so understandably compiles into convoluted assembly language with spaghetti-like flow through procedure calls.
The standardization, flexibility, and thorough documentation of gcc then makes it more desirable than tcc as a toolchain for bare essential programming and automatic code generation.
Both of these compilers use procedures from msvcrt.dll. These call other functions in msvcrt.dll, which use functions in kernel32.dll, which in turn use functions in ntdll.dll.
Note: it isn’t useful to count TOTAL instructions in a disassembled exe (including kernel32.dll and ntdll.dll). What matters is the cleanness of the application code’s assembly. The TOTAL instructions can be very large since the count includes OS required code and OS provided functionality.
linkers (tcc, gcc, val, link)
– the first two linkers allow the creation of 32-bit PE-i386 format executables for modern Windows. The VAL linker creates only 16-bit MS-OMF format executables for MS-DOS. The two modern linkers are even more useful because they allow the mixing of code: the extension of assembly with C functions and the embedding of assembly functions into C.
disassemblers and debuggers (ms-debug, SST, NASM, PEBrowse)
Preparing a 16-bit MS-DOS assembly language program
For 16-bit MS-DOS assembly language programs, you are free to take advantage of all DOS facilities. This is done by putting DOS function codes into the ah register, setting up the parameters that the function needs, and then activating the DOS interrupt 0x21.
1. Exploring 16-bit Machine Language
You can enter a machine language program using ms-debug one hex opcode at a time.
You will proceed using the e100 command (edit from offset 0100h) finishing each entry with a space. (Debug starts program execution at 0100h in whatever memory segment the program is assigned.)
You will have to have mapped out the memory and pointer referencing yourself by hand.
2. ms-debug Assembler for executing a flat binary (COM)
If you have a COM file containing x86 machine code, then you can load and run it directly within ms-debug.
You can explore both of these use cases using the MS-DEBUG Hands-On Exploration worksheets (PDF) from a computing weekend workshop with high school students held in the UK in 2018.
3. NASM Assembler for flat binary (COM)
If you are entering the program in an assembly language and if you will be assembling it with NASM to obtain a binary targeted at the debug platform, then you will need to add the ORG 100h directive at the start of the program so that NASM calculates every address relative to offset 0100h, which is where the program will be loaded. (Debug needs to start program execution at 0100h.) Write the value as 100h or 0x100; NASM reads a bare 0100 as decimal 100. Otherwise, the rest is just the actual assembly code. Very easy. No setup overhead except for the ORG 100h line at the start.
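To see what the 0100h origin does to addresses, here is a minimal sketch in C that hand-assembles a tiny COM image: it prints a string with DOS function 09h and exits with function 4Ch. The function name build_com and the constant COM_ORG are made up for this illustration, and the sketch assumes the message contains no '$' character, since function 09h uses '$' as its terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define COM_ORG 0x0100u  /* a COM image is loaded at offset 0100h */

/* Build a COM image in out[] that prints msg and exits.
 * Returns the image size in bytes, or 0 if out[] is too small. */
size_t build_com(const char *msg, uint8_t *out, size_t cap)
{
    const size_t code_len = 12;             /* bytes of code before the string */
    size_t msg_len = strlen(msg);
    size_t total = code_len + msg_len + 1;  /* +1 for the '$' terminator */
    if (total > cap) {
        return 0;
    }

    /* The effect of the origin: address = load offset + position in file. */
    uint16_t msg_addr = (uint16_t)(COM_ORG + code_len);

    size_t i = 0;
    out[i++] = 0xB4; out[i++] = 0x09;        /* mov ah, 09h   (print string) */
    out[i++] = 0xBA;                         /* mov dx, msg_addr */
    out[i++] = (uint8_t)(msg_addr & 0xFF);   /*   low byte first */
    out[i++] = (uint8_t)(msg_addr >> 8);     /*   then high byte */
    out[i++] = 0xCD; out[i++] = 0x21;        /* int 21h */
    out[i++] = 0xB8; out[i++] = 0x00;        /* mov ax, 4C00h (exit, code 0) */
    out[i++] = 0x4C;
    out[i++] = 0xCD; out[i++] = 0x21;        /* int 21h */
    assert(i == code_len);

    memcpy(out + i, msg, msg_len);           /* the string follows the code */
    out[i + msg_len] = '$';
    return total;
}
```

For the message "Hi" the image is 15 bytes long and the operand of mov dx is 010Ch: the string sits 12 bytes into the file, and 0100h is added because of where the image is loaded. Leaving the origin out would make that operand 000Ch, which points at the wrong place once the program is in memory.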
4. NASM Assembler for OMF EXE using exebin macro and some setup code
If you are entering the program in an assembly language and if you want an EXE for x86 and are assembling it with NASM including the exebin.mac macro, then you need to provide appropriate scaffolding to your assembly program for exebin to work. This simply means identifying sections (.text, .data), setting up the stack properly, and initializing the data segment register with the address held in the code segment register.
5. NASM Assembler for OBJ object code; VAL linker to get OMF exe
If you are entering the program in an assembly language and if you want an OBJ file (OMF format) for x86 that will then be able to be linked into an EXE, and if you are assembling using NASM and linking using VAL, then you need to provide the appropriate segmentation and associated program scaffolding to your assembly program. This simply means identifying segments (code, data, stack), marking the ..start place, initializing the ds, ss, and sp registers, reserving the stack space, and setting the stacktop: address. Note: the three segments can be in any order, not necessarily in code, data, stack order.
Preparing a 32-bit Windows console assembly language program
For 32-bit Windows console assembly language programs, you CANNOT use DOS facilities. You have to use the Windows OS functions instead, and these are the Win32 API functions. The Win32 API functions are C style functions, so the way you’ll use them will also be the way you’ll be able to extend assembly language with C functions.
In assembly language for 32-bit x86 running Windows, there is no DOS and you don’t have access to software interrupts. How to access peripherals?
KEY INSIGHT: With 32-bit Windows, the operating system is WINDOWS, and the I/O and other basic OS calls are no longer DOS interrupts. Instead, they are in kernel32.lib, etc. So in order to write assembly language programs and use a linker that creates PE-i386 32-bit executables, you CANNOT use DOS interrupts. That was for 16-bit x86 programming under DOS (or under the DOS emulator in Windows).
So, as with the 16-bit model, you need to go through the operating system, in this case Windows, and use the standard Windows functions.
Now, even though the Windows operating system is written using C, the Win32 API functions use the standard calling convention (stdcall), which takes its rule that the called function cleans the stack from the PASCAL calling convention. This means that, for any real world use of assembly language, you need to know and understand the fundamentals of standard calling for functions, that is, passing parameters on the stack and knowing the total number of bytes pushed, which the function itself removes from the stack when it returns.
Summary: With stdcall (Pascal-style cleanup by the called function) as opposed to _cdecl (the C convention), there is a fixed number of parameters, and the caller pushes them onto the stack in reverse order. The function being called knows exactly how many bytes to remove from the stack. (In NASM you could also use the higher level instructions ENTER and LEAVE.)
How does the function return values? Any return value is stored in EAX.
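The stack traffic of a stdcall call can be modelled in a few lines of C. What follows is a toy sketch, not real calling-convention code: the struct machine, its mem, esp and eax fields, and the functions push, sub_stdcall and call_sub are all made up for this illustration, and the return address that a real call instruction pushes is left out to keep the model small.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_SLOTS = 32 };  /* each slot stands for one 4-byte stack entry */

/* A toy machine: a stack that grows downward, plus the eax register. */
struct machine {
    uint32_t mem[STACK_SLOTS];
    uint32_t esp;  /* index of the current top-of-stack slot */
    uint32_t eax;  /* where a function leaves its return value */
};

void machine_init(struct machine *m)
{
    m->esp = STACK_SLOTS;  /* empty stack */
    m->eax = 0;
}

void push(struct machine *m, uint32_t value)
{
    assert(m->esp > 0);
    m->mem[--m->esp] = value;
}

/* Called function, stdcall style: reads its two parameters, leaves the
 * result in eax, and removes the parameters itself (the job of "ret 8"). */
void sub_stdcall(struct machine *m)
{
    uint32_t a = m->mem[m->esp];      /* first parameter, pushed last */
    uint32_t b = m->mem[m->esp + 1];  /* second parameter, pushed first */
    m->eax = a - b;
    m->esp += 2;                      /* 2 slots = 8 bytes cleaned up */
}

/* Caller: push parameters in reverse order, call, read eax.
 * There is no stack adjustment here, because the function did it. */
uint32_t call_sub(struct machine *m, uint32_t a, uint32_t b)
{
    push(m, b);
    push(m, a);
    sub_stdcall(m);
    return m->eax;
}
```

After machine_init, call_sub(&m, 10, 3) returns 7 and leaves esp exactly where it started, even though the caller never adjusted it. That is the whole contract: a fixed parameter count, cleanup inside the function, result in eax.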
Where do you find these standard Windows functions? They are in the include headers for both NASM and tcc, but that won’t help with anything but the names and the API interface. They are also in the Win32 Help file:
WIN32.HLP
and also in an even better application:
APILOAD.EXE
which tells you which library they belong to so you can link the library in.
6. NASM Assembler for WIN32 object code; Microsoft link linker to get EXE
Many of the functions are in vcc’s kernel32.lib. You will link that in statically on the linking line. The library to be linked can come before the object code, but it is better style to put it after. Static libraries for Link are .lib; dynamic libraries are .dll (Windows convention).
7. NASM Assembler for ELF object code; TCC for linking to get EXE
The library to be linked must come AFTER the object code. You cannot link in the vcc library: tcc reports invalid object code. Either use -llib, which tcc looks for in the directory given by -Ldir, or give the full path for the library to be linked in. Static libraries for TCC are .a; dynamic libraries are .so (Unix convention).
Mixing Languages by Extension: Assembly extended with C
So, now, after writing an assembly language program for win32 and having to use functions from win32, there arise a number of practical matters that point toward extending Assembly with C:
- how is win32 a more efficient library than the C library?
- for most people and the majority of basic functions, the C standard library is far more familiar than win32
- at the end of the day, using C instead of Windows at least gives you some greater generality and universality. Win32 will ONLY work on Windows operating systems
- C then becomes an interface for operating system functionality, abstracting away the differences between DOS, Windows and other operating systems. The machine specific compiler deals with the specific mappings
- you can save your assembly coding for the things that you really care to have coded in assembly. For mathematicians like Knuth, this might be a core algorithm, a tight loop, an optimized graphics routine, etc.
Progressing this way makes sense. After all, Dennis Ritchie, building on Ken Thompson’s B, created C as a sort of macro assembler, a way to automatically generate assembler without sacrificing any of the power and flexibility that assembler allows the programmer. That is also why, by the way, the C language is a fairly complex and sophisticated language to learn. Learning C is much easier after Assembly. Learning Assembly is easier after learning C. Whichever you learn first is going to be difficult.
Now, to use C functions, you have to use the C calling convention. In a nutshell, you push parameters onto the stack (in reverse order) and then YOU yourself clean up after yourself by following up the function call with a stack adjustment, typically add esp, byte 8, indicating the total number of bytes to adjust the stack pointer by. This number will be equivalent to the total size of all parameters pushed onto the stack.
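One reason the C convention leaves cleanup to the caller is that only the caller knows how many parameters it pushed, as with a variadic function. Here is a minimal sketch in C. The names sum_ints and cdecl_cleanup_bytes are made up for this illustration, and the byte count assumes 32-bit x86, where each int-sized parameter takes 4 bytes of stack.

```c
#include <stdarg.h>

/* A variadic C function: the first parameter says how many ints follow.
 * The function cannot know the total in advance, so it cannot clean the
 * stack itself; that is left to whoever made the call. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);
    }
    va_end(args);
    return total;
}

/* Bytes the caller removes after the call on 32-bit x86, i.e. the N in
 * "add esp, N", assuming every parameter is 4 bytes wide. */
unsigned cdecl_cleanup_bytes(unsigned param_count)
{
    return param_count * 4u;
}
```

A call such as sum_ints(3, 1, 2, 3) passes four parameters in all, so the assembly caller would follow it with a 16-byte adjustment. A call with two parameters gives the add esp, byte 8 mentioned above.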
Mixing Languages by Embedding: C with embedded Assembly
The MIXED LANGUAGE model of using another language and DROPPING INTO assembly from within that language (i.e. EMBEDDED assembly) is usually preferred over EXTENDING assembly with another language’s library calls or functions. Why? It is because of the sheer complexity of the assembly language toolchain and the comparative simplicity of staying within a higher level language’s toolchain and letting the compiler/interpreter handle the “absorption” of the embedded assembly.
You can use C and drop into Assembly.
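As a minimal sketch of dropping into assembly from C, the function below uses GNU-style extended inline assembly in AT&T syntax. The function name add_ints is made up for this illustration. The assembly path is only compiled when a GNU-compatible compiler targets x86; anywhere else the function falls back to plain C, so it builds either way.

```c
/* Add two ints. On x86 with a GNU-compatible compiler this is a single
 * add instruction written by hand; elsewhere it is ordinary C. */
int add_ints(int a, int b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* AT&T operand order: source first, destination second. */
    __asm__ ("addl %1, %0"
             : "+r" (a)   /* %0: read and written, held in a register */
             : "r" (b)    /* %1: input only, held in a register */
             : "cc");     /* the flags register is changed */
    return a;
#else
    return a + b;         /* portable fallback */
#endif
}
```

The compiler picks the registers and handles the function prologue and return, so the only assembly you write is the instruction you care about. That is the simplification the embedded model buys you.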
You can use Basic and drop into Assembly. FreeBasic makes this very easy.
“Register Preservation
When an ASM block is opened, the registers ebx, esi, and edi are pushed to the stack, when the block is closed, these registers are popped back from the stack. This is because these registers are required to be preserved by most or all OS’s using the x86 CPU. You can therefore use these registers without explicitly preserving them yourself.”
Assembly Language Forensics: Signatures and Disassembly
When you’re working in assembly and its intricate toolchain, it helps to be able to look at a random intermediate file spit out by some link along the toolchain and identify where it came from and where it can be taken by some other element of the toolchain. It is also invaluable to be able to set a disassembler on the file and reverse out (unassemble) the object code and see a human readable source file.
To this end, let’s go over the main file formats, object and executable, along with their signatures, headers, and disassembly methods.
Executable types
- PE-i386 – Portable Executable
- com – DOS command file (a flat binary with no header)
- ELF – Executable and Linkable Format
- win32coff – Windows 32 COFF (Common Object File Format)
Toolchain:
NASM has disassembler.
PEBrowse is much better.
NASM win32 object file – In Lister, starts with “L***”. No Netwide Assembler stamp in it.
Microsoft Link – In List, starts with “MZ”. No stamp in it. Takes NASM win32 object. Creates PE-i386 EXE.
NASM elf object file – In List, starts with “ELF”. Netwide Assembler stamp in it.
TCC linker – Takes NASM elf object. Creates PE-i386 EXE
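The signatures listed above can be checked mechanically. Here is a minimal sketch in C that looks at the first bytes of a file already read into memory. The enum file_kind and the function sniff are made up for this illustration. The sketch relies on three facts about the formats: an ELF file begins with the byte 7Fh followed by "ELF", a DOS or Windows EXE begins with "MZ" (a PE file additionally stores the offset of a "PE" header at position 3Ch), and an i386 COFF object begins with the machine type 014Ch stored low byte first, which is why it shows up as "L" in a viewer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum file_kind {
    KIND_UNKNOWN,
    KIND_MZ_EXE,     /* "MZ" header only, e.g. a 16-bit DOS EXE */
    KIND_PE_EXE,     /* "MZ" stub that points at a "PE\0\0" header */
    KIND_ELF,        /* 7Fh 'E' 'L' 'F' */
    KIND_COFF_I386   /* machine type 014H, bytes 4Ch 01h */
};

/* Classify a file from the first len bytes of its contents. */
enum file_kind sniff(const uint8_t *buf, size_t len)
{
    if (len >= 4 && memcmp(buf, "\x7f" "ELF", 4) == 0) {
        return KIND_ELF;
    }
    if (len >= 2 && buf[0] == 'M' && buf[1] == 'Z') {
        if (len >= 0x40) {
            /* Little-endian 32-bit offset of the PE header, stored at 3Ch. */
            uint32_t off = buf[0x3C]
                         | (uint32_t)buf[0x3D] << 8
                         | (uint32_t)buf[0x3E] << 16
                         | (uint32_t)buf[0x3F] << 24;
            if ((size_t)off <= len - 4 &&
                memcmp(buf + off, "PE\0\0", 4) == 0) {
                return KIND_PE_EXE;
            }
        }
        return KIND_MZ_EXE;
    }
    if (len >= 2 && buf[0] == 0x4C && buf[1] == 0x01) {
        return KIND_COFF_I386;  /* 4Ch is the letter 'L' */
    }
    return KIND_UNKNOWN;
}
```

To use it, read the start of the file into a buffer (a few hundred bytes is usually enough to reach the PE header) and pass the buffer and its length to sniff. Treat the two-byte COFF check as a hint rather than proof, since two bytes can match by accident.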
When looking at a bare gcc created EXE (i.e. with -fno-exceptions -s -O0 -Os), notice the following (by comparing the list output with the EXE in a hex editor):
– the .rdata section begins at 0x0C00
– the .text section begins at 0x0400