BINARY   CODED   DECIMALS
by Raymond Filiatreault
Copyright 2005
Latest revision April 2005

aam

The aam instruction is used to adjust the content of the AL and AH registers after the AL register has been used to perform the multiplication of two unpacked BCD bytes. The CPU uses the following simple logic:

   ah = al/10       (integer quotient of the original AL)
   al = al mod 10   (remainder of the original AL)

Although this instruction should be used immediately after the multiplication instruction, it can be used later as long as no intervening instruction has changed the AL register.

Example1:
   mov  al,7
   mov  ah,3
   mul  ah      ;al = 21 = 15h, ah = 0
   aam          ;al = 21mod10 = 1, ah = 21/10 = 2

In fact, this instruction can be used without any preceding multiplication. It will convert any value in AL according to the logic described above.

Example2:
   mov  al,73
   aam          ;al = 73mod10 = 3, ah = 73/10 = 7
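
As a minimal sketch of a practical use of this behavior, the following converts a binary value below 100 into two ASCII characters. The value 42 and the variable named buffer are assumptions made for illustration only; buffer is assumed to be declared in the .data section, for example as buffer db 3 dup(0).

```asm
   mov  al,42               ;any binary value from 0 to 99
   aam                      ;ah = 4, al = 2
   add  ax,3030h            ;ah = '4' = 34h, al = '2' = 32h
   xchg al,ah               ;al = '4', ah = '2'
   mov  word ptr buffer,ax  ;AL goes to the lower address: buffer = "42"
```

The exchange is needed because storing AX places AL at the lower address, and the tens digit must come first in a displayable string.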

The logic of the aam instruction is based on the assumption that it follows the multiplication of two unpacked BCD bytes. Since neither byte should exceed a value of 9, the multiplication result should never exceed 81 and the AH register will always be 0 after the multiplication. This may be the reason why the Intel manual describes the logic as acting on the AX register instead of acting only on the AL register as described here. This can be verified in a debugger with the next example, which is designed to show what can be expected when the instruction is not used as intended.

Example3:
   mov  al,209
   mov  ah,137
   mul  ah      ;ax = 28633 = 6FD9h (ah = 6Fh, al = D9h = 217)
   aam          ;al = 217mod10 = 7, ah = 217/10 = 21

Note: The ASCII value of numerical digits must first be converted to their binary value before a multiplication is performed. The content of the upper 4 bits of the ASCII values would affect the final result in AL.
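
The effect described in this note can be illustrated with the following sketch. The digit characters '7' and '3' are arbitrary choices, and masking with an AND instruction is only one possible way of removing the upper 4 bits.

```asm
; ASCII digits masked to their binary value before the multiplication
   mov  al,'7'       ;al = 37h
   mov  ah,'3'       ;ah = 33h
   and  ax,0F0Fh     ;al = 7, ah = 3
   mul  ah           ;ax = 21
   aam               ;ah = 2, al = 1

; Same digits without the masking
   mov  al,'7'       ;al = 37h = 55
   mov  ah,'3'       ;ah = 33h = 51
   mul  ah           ;ax = 55*51 = 2805 = 0AF5h (al = F5h = 245)
   aam               ;ah = 245/10 = 24, al = 245mod10 = 5
```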

Although the content of the AH register may be modified by the aaa and aas instructions, the resulting content of that register is rarely used when doing strictly additions or subtractions. However, with the aam instruction, the resulting content of the AH register is the actual carryover from one multiplication which must be added to the result of the next multiplication. And, following that addition, the aaa instruction is necessary; any correction of the digit in AL will then be reflected by the automatic increment of that overflow in the AH register.
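
The sequence described above can be followed on a single digit with the sketch below. The digit 7, the multiplier digit 8 and the previous carryover of 6 are arbitrary values chosen so that the aaa correction is triggered; the use of BL and DL is also only an assumption for this illustration.

```asm
   mov  dl,6         ;carryover from the previous digit
   mov  al,7         ;current digit of the number being multiplied
   mov  bl,8         ;multiplier digit
   mul  bl           ;ax = 56 = 0038h
   aam               ;ah = 5, al = 6
   add  al,dl        ;al = 12 = 0Ch, not a valid BCD digit
   aaa               ;al = 2, ah = 5+1 = 6
   mov  dl,ah        ;carryover for the next digit = 6
```

The digit to store is 2 and the carryover is 6, in agreement with 7*8+6 = 62. Since the largest possible value is 9*9+8 = 89, the carryover always remains a single digit.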

The next example will show how such multiplications can be coded. It's designed to compute the factorial of 30, i.e. 1*2*3*4...*30. The exact answer is:
265252859812191058636308480000000
or approximately 2.65*10^32. It could be expanded with some minor modifications to compute the factorial of any number, only limited by the available memory (and patience for very large numbers).

Because numbers will be increasing gradually, all BCD numbers are kept in reverse order. It makes for easier coding by simply appending new numbers at the end of the existing ones.
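
As a small illustration of this reverse order, the value 1234 could be declared as follows. The variable name number and its size are hypothetical and are not part of the code which follows.

```asm
.data
   number     db   4,3,2,1,4 dup(0)   ;the value 1234, least significant digit first

; After a multiplication by 9 (1234*9 = 11106), the buffer would hold
; 6,0,1,1,1,0,0,0 : the new 5th digit is simply appended at number+4.
```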


.data
   size1      dd   1           ;current number of digits in answer
   size2      dd   ?           ;current number of digits in multiplier
                               ;this variable is not used in this code
   counter    dd   1           ;initialize with a 1
   multiplier db   4 dup(0)
   answer     db   1,47 dup(0) ;initialize with a 1
   workbuf1   db   48 dup(0)   ;adequate for current code
   workbuf2   db   48 dup(0)   ;      idem

.code

start:

; Because the value of the counter will not exceed 99 (it stops at 30),
; the aam instruction can be used to obtain its two unpacked BCD digits.

   inc   counter
   mov   edi,offset multiplier
   mov   eax,counter
   aam
   mov   [edi],al             ;store 1st multiplier digit
   or    ah,ah
   jz    singledigit

; If the multiplier has only 1 digit, the multiplication result
; can overwrite the previous answer directly.
; Otherwise, the result of individual multiplications must be added before
; being transferred to the answer buffer.

   inc   edi
   mov   [edi],ah             ;store 2nd multiplier digit

   mov   ecx,size1
   mov   edi,offset workbuf1
   mov   esi,offset answer
   mov   ebx,offset multiplier
   mov   dl,0                 ;DL used to keep carryovers
@@:
   lodsb                      ;get next digit from answer
   mul   byte ptr[ebx]        ;multiply it by 1st digit of multiplier
   aam                        ;convert the result
   add   al,dl                ;add the previous carryover
   aaa                        ;convert the result of this addition
   mov   dl,ah                ;save the new carryover
   stosb                      ;store the digit
   dec   ecx
   jnz   @B                   ;continue until all digits have been processed
   mov   [edi],dl             ;store the last carryover

   inc   ebx                  ;point to the next multiplier digit
   mov   ecx,size1
   mov   edi,offset workbuf2+1 ;aligns the two buffers for later addition
   mov   esi,offset answer
   mov   dl,0                 ;DL used to keep carryovers
   inc   size1                ;the new answer will have at least 1 more digit
@@:
   lodsb                      ;get next digit from answer
   mul   byte ptr[ebx]        ;multiply it by 2nd digit of multiplier
   aam                        ;convert the result
   add   al,dl                ;add the previous carryover
   aaa                        ;convert the result of this addition
   mov   dl,ah                ;save the new carryover
   stosb                      ;store the digit
   dec   ecx
   jnz   @B                   ;continue until all digits have been processed
   or    dl,dl
   jz    @F
   mov   [edi],dl             ;store the last carryover
   inc   size1

@@:

; The two multiplication results must now be added and the answer
; overwritten with the sum. If there were more than 2 digits in the
; multiplier, the workbuf1 would be overwritten with the sum and
; the result of subsequent multiplications overwriting the workbuf2.
; Care should then be taken to rezero the required front digits
; of that workbuf2 buffer.

   mov   ebx,offset workbuf1
   mov   esi,offset workbuf2
   mov   edi,offset answer
   mov   ecx,size1
   clc
@@:
   lodsb
   adc   al,[ebx]
   aaa
   stosb
   inc   ebx
   dec   ecx
   jnz   @B
   jnc   nextone
   mov   byte ptr[edi],1
   inc   size1
   jmp   nextone

singledigit:
   mov   edi,offset answer
   mov   ebx,offset multiplier
   mov   ecx,size1
   mov   dl,0                 ;DL used to keep carryovers
@@:
   mov   al,[edi]             ;get next digit from answer
   mul   byte ptr[ebx]        ;multiply it by the multiplier digit
   aam                        ;convert the result
   add   al,dl                ;add the previous carryover
   aaa                        ;convert the result of this addition
   mov   dl,ah                ;save the new carryover
   stosb                      ;store the digit
   dec   ecx
   jnz   @B                   ;continue until all digits have been processed
   or    dl,dl
   jz    nextone
   mov   [edi],dl             ;store the last carryover
   inc   size1

nextone:
   cmp  counter, 30
   jb   start

; The digits in the answer are in reverse order and in their binary form.
; The following will prepare the data for display as a null-terminated string

   mov   esi,offset answer
   mov   edi,esi
   add   edi,size1
   mov   byte ptr[edi],0
   dec   edi
@@:
   mov   al,[esi]
   mov   ah,[edi]
   add   ax,3030h
   mov   [esi],ah
   mov   [edi],al
   inc   esi
   dec   edi
   cmp   esi,edi
   jbe   @B

; The answer is now ready for display


The aam instruction described above is a special case of a more general function available on the CPU. The machine code for aam is:

   D4 0A

The imm8 value following the first byte (D4) is used as the divisor for the content of the AL register. This means that the content of AL could be adjusted to two unpacked digits in any other numeric base.

However, assemblers such as MASM provide no mnemonic for bases other than 10. The instruction must then be hand coded. For example, if the unpacked digits were needed in base 12, hand coding such as follows would be required:

   db   0D4h,12
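
The following sketch shows the register content to be expected from such hand-coded forms. The input values 100 and 0A7h are arbitrary, and the base 16 case is an added illustration. Note that an imm8 value of 0 would cause a divide error exception.

```asm
   mov  al,100
   db   0D4h,12      ;base 12: ah = 100/12 = 8, al = 100mod12 = 4

   mov  al,0A7h
   db   0D4h,16      ;base 16: ah = 0Ah, al = 07h (splits AL into its two nibbles)
```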