by

Copyright 2005

Latest revision April 2005

**aad**

The aad instruction is used to adjust the content of the AX register before that register is used to divide a two-digit unpacked BCD number by another unpacked BCD digit. The CPU uses the following logic:

    al = ah*10 + al
    ah = 0

The original intent of this instruction was to allow division by a single unpacked BCD digit, i.e. a value of 9 or lower. Since the remainder of such a division is always lower than 9 and is left in the AH register after the AX register is divided by a byte, the largest value the aad instruction could store in the AL register before the next division is 8*10+9 (=89).
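To make the logic concrete, here is a minimal Python model of the adjustment. It is only a sketch of the arithmetic, not assembly and not a description of how the CPU is built. The function name `aad` and its tuple return value are choices made for this illustration, and it assumes AH and AL each hold one unpacked BCD digit.

```python
def aad(ah, al):
    """Model of aad: returns the new (AH, AL) pair."""
    al = (ah * 10 + al) & 0xFF   # AL is an 8-bit register
    return 0, al                 # AH is zeroed

# Largest case when dividing by a single BCD digit:
# remainder 8 in AH, next digit 9 in AL.
print(aad(8, 9))   # (0, 89)
```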

The following example divides only a short number, but the same loop could be applied to a number of any length.

    .data
    num      db  7,6,4,3,5,8    ;853467 in reverse order
    numsize  dd  6
    divisor  db  7
    answer   db  8 dup(0)       ;answer to be stored in string order for display

    .code
        mov   edi,offset answer
        mov   esi,offset num
        mov   ecx,numsize
        add   esi,ecx           ;points to byte following most significant digit
        mov   ah,0              ;initialize "remainder"
    @@:
        dec   esi
        mov   al,[esi]
        aad
        div   divisor           ;quotient -> AL, remainder -> AH
        add   al,30h            ;convert result to ASCII
        stosb                   ;store answer in string order
        dec   ecx
        jnz   @B
        mov   [edi],cl          ;terminate string with 0
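The loop above is ordinary long division carried out one digit at a time. The following Python sketch mirrors it so that the expected result can be checked by hand. `divide_bcd` is a hypothetical helper name, and the digit list is stored least significant digit first, as in `num`.

```python
def divide_bcd(num, divisor):
    """Divide unpacked BCD digits (least significant first) by a small divisor.

    Returns the quotient as an ASCII string (most significant digit first)
    and the final remainder, i.e. what is left in AH after the loop.
    """
    ah = 0                                # mov ah,0
    answer = []
    for digit in reversed(num):           # dec esi / mov al,[esi]
        al = (ah * 10 + digit) & 0xFF     # aad
        al, ah = divmod(al, divisor)      # div: quotient -> AL, remainder -> AH
        answer.append(chr(al + 0x30))     # add al,30h / stosb
    return "".join(answer), ah

print(divide_bcd([7, 6, 4, 3, 5, 8], 7))  # ('121923', 6)
```

The remainder 6 is not stored by the assembly example, but it is still in the AH register when the loop ends.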

As can be observed from the CPU logic, any value in AH higher than 25, when multiplied by 10, would exceed the 255 limit of the AL register. The sum would be correct as a 16-bit value, but it becomes erroneous once only its low byte is kept in AL and the AH register is zeroed. Even a value of 25 in the AH register combined with a value exceeding 5 in the AL register would give an erroneous result. However, this also means that integer values up to 25 inclusive can be used as the divisor (24 being the maximum remainder) while still using the aad instruction.
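These limits can be verified with a small Python model of the same logic, offered as an illustrative sketch (the helper name `aad` belongs to this example only). It compares the 8-bit result with the true value of AH*10 + AL.

```python
def aad(ah, al):
    """Model of aad: AL receives the low 8 bits of AH*10 + AL, AH becomes 0."""
    return 0, (ah * 10 + al) & 0xFF

for ah, al in [(24, 9), (25, 5), (25, 6), (26, 0)]:
    true_value = ah * 10 + al
    _, result = aad(ah, al)
    status = "ok" if result == true_value else "wrong"
    print(ah, al, true_value, result, status)
# 24 9 249 249 ok
# 25 5 255 255 ok
# 25 6 256 0 wrong
# 26 0 260 4 wrong
```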

The following example takes advantage of that flexibility to compute the value of the Napierian constant with up to 25 significant digits. That constant is defined as:

    e = sum(1/n!) = 1/0! + 1/1! + 1/2! + 1/3! + ...

By convention, 0! is equal to 1. The answer buffer is initialized with the sum of the first two terms of the series (1+1) and the decimal delimiter to save some coding. The txtbuf buffer holds the computed 1/n! and, for the same reason, is initialized to 1/1! along with the divisor. The result of computing the next 1/n! overwrites the previous 1/n! in txtbuf. Each computed 1/n! is then added to the answer buffer.

    .data
    answer   db  "2.",30 dup(0)
    txtbuf   db  1,31 dup(0)    ;initialize to 1/1!
    divisor  db  2              ;initialize to next n

    .code
    start:

    ; Divide the current 1/n! by the next n

        mov   edi,offset txtbuf
        mov   ecx,30
        xor   eax,eax           ;zeroes AH
    @@:
        mov   al,[edi]          ;get BCD digit
        aad                     ;convert digits in AH/AL to an integer in AX
        div   divisor           ;divide by a byte quotient->AL remainder->AH
        stosb                   ;store quotient and advance to next digit
        dec   ecx
        jnz   @B

    ; Add that result to the answer

        mov   edi,offset answer+30  ;point to least significant digit
        mov   esi,offset txtbuf+29  ; idem
        mov   ecx,29            ;for the 29 fractional digits
        clc
    @@:
        mov   al,[esi]
        adc   al,[edi]
        aaa
        mov   [edi],al
        dec   esi
        dec   edi
        dec   ecx
        jnz   @B
        inc   divisor
        cmp   divisor,25
        jbe   start             ;until 1/25! has been computed and added

    ; Convert the answer to ASCII characters

        mov   edi,offset answer+2
        mov   ecx,6
    @@:
        mov   eax,[edi]
        add   eax,30303030h
        stosd
        dec   ecx
        jnz   @B
        mov   [edi],cl

    ; Answer ready for display
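As a cross-check of the digits this program should produce, the same steps can be modelled in Python. This is a sketch under stated assumptions: `napier_digits` and its parameters are names invented for this illustration, the buffers are plain lists of unpacked digits, and the leading "2." is added at the end instead of being stored in the buffer.

```python
def napier_digits(last_divisor=25, frac_digits=29, shown=24):
    """Mirror of the assembly example using lists of unpacked BCD digits."""
    txtbuf = [1] + [0] * frac_digits    # 1/1! = 1.000...
    answer = [0] * frac_digits          # fractional digits; the "2." is implied

    for divisor in range(2, last_divisor + 1):
        # Divide the current 1/n! by the next n, most significant digit first.
        ah = 0
        for i in range(len(txtbuf)):
            al = (ah * 10 + txtbuf[i]) & 0xFF      # aad
            txtbuf[i], ah = divmod(al, divisor)    # div: quotient, remainder

        # Add the fractional digits of 1/n! to the answer, least significant first.
        carry = 0
        for i in range(frac_digits - 1, -1, -1):
            total = answer[i] + txtbuf[i + 1] + carry   # adc
            carry, answer[i] = divmod(total, 10)        # aaa

    return "2." + "".join(str(d) for d in answer[:shown])

print(napier_digits())   # 2.718281828459045235360287
```

The 29 working digits leave a few guard digits beyond the 24 fractional digits that are displayed, which absorbs the truncation error of the repeated divisions.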

When the divisor exceeds 25, the only solution is to simulate the action of the aad instruction but without zeroing any partial register. This would allow __integer__ divisors as large as can be held in 32 bits, i.e. 4,294,967,295. Some additional precautions must also be taken. The following code could be adapted.

    ; The divisor is a dword integer memory variable
    ; The dw10 is a dword memory variable set to 10
    ; "number" is the memory address of the buffer where the BCD number is
    ;    located
    ; "answer" is the memory address of the buffer where the BCD result must
    ;    be stored

        mov   esi,number+?      ;must point initially to most significant digit
        mov   edi,answer+?      ;must point to large enough buffer
        mov   ecx,number_size
        xor   edx,edx           ;initialize for remainder
    @@:
        mov   eax,edx
        mul   dw10
        movzx ebx,byte ptr[esi] ;get the current BCD digit
        add   eax,ebx
        adc   edx,0             ;for any overflow of the addition
        div   divisor           ;quotient -> EAX, remainder -> EDX
        mov   [edi],al          ;only one digit at a time gets computed
        inc/dec esi             ;depending on how BCDs are stored
        inc/dec edi             ; idem
        dec   ecx
        jnz   @B

With the mul and div instructions being quite slow, the above code may not be any faster than the subtraction method shown with the aas instruction, especially if the divisor is already in BCD format and would first have to be converted to a binary integer.
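A Python sketch of the 32-bit loop shows why it works for any divisor that fits in 32 bits: the remainder is always smaller than the divisor, so remainder*10 + digit is smaller than divisor*10, each quotient is a single digit, and the div instruction cannot overflow. The function name `divide_bcd32` and the most-significant-first digit list are assumptions of this sketch.

```python
def divide_bcd32(digits, divisor):
    """Divide unpacked BCD digits (most significant first) by a 32-bit integer."""
    if not 0 < divisor <= 0xFFFFFFFF:
        raise ValueError("divisor must be non-zero and fit in 32 bits")
    remainder = 0                               # xor edx,edx
    quotient = []
    for digit in digits:
        value = remainder * 10 + digit          # mul dw10 / add eax,ebx / adc edx,0
        q, remainder = divmod(value, divisor)   # div divisor
        assert q <= 9                           # only one digit at a time gets computed
        quotient.append(q)
    return quotient, remainder

digits = [int(c) for c in "12345678901234567890"]
q, r = divide_bcd32(digits, 4_000_000_000)
print(int("".join(map(str, q))), r)   # 3086419725 1234567890
```

The quotient list has as many digits as the dividend, so it starts with leading zeros, just as the assembly loop stores one quotient digit for every digit of the number.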

For divisors larger than 4,294,967,295, the subtraction method shown with the aas instruction would be necessary for the division of unpacked BCDs until 64-bit registers become available; the upper limit on the divisor would then become approximately 2×10^{19}.

The described aad instruction is a special case of the more generalized function available on the CPU. The machine code for aad is:

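    D5 0A

The second byte (0Ah) is the base, i.e. the value by which the content of the AH register is multiplied.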

However, no mnemonics are available for bases other than 10. The instruction would then need to be hand coded. For example, if the unpacked digits were in base 8, hand coding such as follows would be required:

    db 0D5h,8
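The generalized behaviour can be sketched in Python as follows. This models only the arithmetic, and `aad_base` is a name made up for this example. It assumes the second opcode byte acts as the multiplier in place of 10.

```python
def aad_base(ah, al, base):
    """Model of the opcode bytes D5 <base>: AL = (AH*base + AL) mod 256, AH = 0."""
    return 0, (ah * base + al) & 0xFF

print(aad_base(7, 7, 8))    # (0, 63): octal digits 7,7 -> 63
print(aad_base(8, 9, 10))   # (0, 89): same as the aad mnemonic
```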