Annotation of sys/arch/m68k/fpsp/slogn.sa, Revision 1.1.1.1

1.1       nbrk        1: *      $OpenBSD: slogn.sa,v 1.3 2003/11/07 10:36:10 miod Exp $
                      2: *      $NetBSD: slogn.sa,v 1.3 1994/10/26 07:49:54 cgd Exp $
                      3:
                      4: *      MOTOROLA MICROPROCESSOR & MEMORY TECHNOLOGY GROUP
                      5: *      M68000 Hi-Performance Microprocessor Division
                      6: *      M68040 Software Package
                      7: *
                      8: *      M68040 Software Package Copyright (c) 1993, 1994 Motorola Inc.
                      9: *      All rights reserved.
                     10: *
                     11: *      THE SOFTWARE is provided on an "AS IS" basis and without warranty.
                     12: *      To the maximum extent permitted by applicable law,
                     13: *      MOTOROLA DISCLAIMS ALL WARRANTIES WHETHER EXPRESS OR IMPLIED,
                     14: *      INCLUDING IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A
                     15: *      PARTICULAR PURPOSE and any warranty against infringement with
                     16: *      regard to the SOFTWARE (INCLUDING ANY MODIFIED VERSIONS THEREOF)
                     17: *      and any accompanying written materials.
                     18: *
                     19: *      To the maximum extent permitted by applicable law,
                     20: *      IN NO EVENT SHALL MOTOROLA BE LIABLE FOR ANY DAMAGES WHATSOEVER
                     21: *      (INCLUDING WITHOUT LIMITATION, DAMAGES FOR LOSS OF BUSINESS
                     22: *      PROFITS, BUSINESS INTERRUPTION, LOSS OF BUSINESS INFORMATION, OR
                     23: *      OTHER PECUNIARY LOSS) ARISING OF THE USE OR INABILITY TO USE THE
                     24: *      SOFTWARE.  Motorola assumes no responsibility for the maintenance
                     25: *      and support of the SOFTWARE.
                     26: *
                     27: *      You are hereby granted a copyright license to use, modify, and
                     28: *      distribute the SOFTWARE so long as this entire notice is retained
                     29: *      without alteration in any modified and/or redistributed versions,
                     30: *      and that such modified versions are clearly identified as such.
                     31: *      No licenses are granted by implication, estoppel or otherwise
                     32: *      under any patents or trademarks of Motorola, Inc.
                     33:
                     34: *
                     35: *      slogn.sa 3.1 12/10/90
                     36: *
                     37: *      slogn computes the natural logarithm of an
                     38: *      input value. slognd does the same except the input value is a
                     39: *      denormalized number. slognp1 computes log(1+X), and slognp1d
                     40: *      computes log(1+X) for denormalized X.
                     41: *
                     42: *      Input: Double-extended value in memory location pointed to by address
                     43: *              register a0.
                     44: *
                     45: *      Output: log(X) or log(1+X) returned in floating-point register Fp0.
                     46: *
                     47: *      Accuracy and Monotonicity: The returned result is within 2 ulps in
                     48: *              64 significant bits, i.e. within 0.5001 ulp to 53 bits if the
                     49: *              result is subsequently rounded to double precision. The
                     50: *              result is provably monotonic in double precision.
                     51: *
                     52: *      Speed: The program slogn takes approximately 190 cycles for input
                     53: *              argument X such that |X-1| >= 1/16, which is the usual
                     54: *              situation. For those arguments, slognp1 takes approximately
                     55: *               210 cycles. For the less common arguments, the program will
                     56: *               run no worse than 10% slower.
                     57: *
                     58: *      Algorithm:
                     59: *      LOGN:
                     60: *      Step 1. If |X-1| < 1/16, approximate log(X) by an odd polynomial in
                     61: *              u, where u = 2(X-1)/(X+1). Otherwise, move on to Step 2.
                     62: *
                     63: *      Step 2. X = 2**k * Y where 1 <= Y < 2. Define F to be the first seven
                     64: *              significant bits of Y plus 2**(-7), i.e. F = 1.xxxxxx1 in base
                     65: *              2 where the six "x" match those of Y. Note that |Y-F| <= 2**(-7).
                     66: *
                     67: *      Step 3. Define u = (Y-F)/F. Approximate log(1+u) by a polynomial in u,
                     68: *              log(1+u) = poly.
                     69: *
                     70: *      Step 4. Reconstruct log(X) = log( 2**k * Y ) = k*log(2) + log(F) + log(1+u)
                     71: *              by k*log(2) + (log(F) + poly). The values of log(F) are calculated
                     72: *              beforehand and stored in the program.
                     73: *
                     74: *      lognp1:
                     75: *      Step 1: If |X| < 1/16, approximate log(1+X) by an odd polynomial in
                     76: *              u where u = 2X/(2+X). Otherwise, move on to Step 2.
                     77: *
                     78: *      Step 2: Let 1+X = 2**k * Y, where 1 <= Y < 2. Define F as done in Step 2
                     79: *              of the algorithm for LOGN and compute log(1+X) as
                     80: *              k*log(2) + log(F) + poly where poly approximates log(1+u),
                     81: *              u = (Y-F)/F.
                     82: *
                     83: *      Implementation Notes:
                     84: *      Note 1. There are 64 different possible values for F, thus 64 log(F)'s
                     85: *              need to be tabulated. Moreover, the values of 1/F are also
                     86: *              tabulated so that the division in (Y-F)/F can be performed by a
                     87: *              multiplication.
                     88: *
                     89: *      Note 2. In Step 2 of lognp1, in order to preserve accuracy, the value
                     90: *              Y-F has to be calculated carefully when 1/2 <= X < 3/2.
                     91: *
                     92: *      Note 3. To fully exploit the pipeline, polynomials are usually separated
                     93: *              into two parts evaluated independently before being added up.
                     94: *
                     95:
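The LOGN steps described in the header comment can be sketched in Python as follows. This is only an illustrative model in double precision, not the routine itself. The name `logn_sketch` is hypothetical, `math.log(f)` stands in for the LOGTBL lookup, and plain Taylor coefficients stand in for the tuned LOGA1..LOGA6 and LOGB1..LOGB5 constants.

```python
import math

def logn_sketch(x):
    """Approximate log(x) for positive, finite x, following Steps 1-4 of LOGN."""
    if not (x > 0.0 and math.isfinite(x)):
        raise ValueError("this sketch only handles positive, finite inputs")

    # Step 1: near 1, use an odd polynomial in u = 2(x-1)/(x+1).
    if abs(x - 1.0) < 1.0 / 16.0:
        u = 2.0 * (x - 1.0) / (x + 1.0)
        v = u * u
        # Taylor coefficients of 2*atanh(u/2) stand in for LOGB1..LOGB5 (assumption).
        return u + u * v * (1/12 + v * (1/80 + v * (1/448 + v * (1/2304 + v / 11264))))

    # Step 2: x = 2**k * y with 1 <= y < 2; F = 1.xxxxxx1 in binary.
    m, e = math.frexp(x)          # x = m * 2**e with 0.5 <= m < 1
    y, k = 2.0 * m, e - 1
    f = math.floor(y * 64.0) / 64.0 + 1.0 / 128.0

    # Step 3: u = (y - F)/F, then a short polynomial for log(1+u).
    u = (y - f) / f
    poly = u + u * u * (-1/2 + u * (1/3 + u * (-1/4 + u * (1/5 + u * (-1/6 + u / 7)))))

    # Step 4: k*log(2) + (log(F) + poly); math.log(f) replaces the table lookup.
    return k * math.log(2.0) + (math.log(f) + poly)
```

Because |Y-F| <= 2**(-7), the value u stays below about 0.008 in magnitude, which is why a degree-7 polynomial is enough in Step 3.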
                     96: slogn  IDNT    2,1 Motorola 040 Floating Point Software Package
                     97:
                     98:        section 8
                     99:
                    100:        include fpsp.h
                    101:
                    102: BOUNDS1  DC.L $3FFEF07D,$3FFF8841
                    103: BOUNDS2  DC.L $3FFE8000,$3FFFC000
                    104:
                    105: LOGOF2 DC.L $3FFE0000,$B17217F7,$D1CF79AC,$00000000
                    106:
                    107: one    DC.L $3F800000
                    108: zero   DC.L $00000000
                    109: infty  DC.L $7F800000
                    110: negone DC.L $BF800000
                    111:
                    112: LOGA6  DC.L $3FC2499A,$B5E4040B
                    113: LOGA5  DC.L $BFC555B5,$848CB7DB
                    114:
                    115: LOGA4  DC.L $3FC99999,$987D8730
                    116: LOGA3  DC.L $BFCFFFFF,$FF6F7E97
                    117:
                    118: LOGA2  DC.L $3FD55555,$555555A4
                    119: LOGA1  DC.L $BFE00000,$00000008
                    120:
                    121: LOGB5  DC.L $3F175496,$ADD7DAD6
                    122: LOGB4  DC.L $3F3C71C2,$FE80C7E0
                    123:
                    124: LOGB3  DC.L $3F624924,$928BCCFF
                    125: LOGB2  DC.L $3F899999,$999995EC
                    126:
                    127: LOGB1  DC.L $3FB55555,$55555555
                    128: TWO    DC.L $40000000,$00000000
                    129:
                    130: LTHOLD DC.L $3f990000,$80000000,$00000000,$00000000
                    131:
                    132: LOGTBL:
                    133:        DC.L  $3FFE0000,$FE03F80F,$E03F80FE,$00000000
                    134:        DC.L  $3FF70000,$FF015358,$833C47E2,$00000000
                    135:        DC.L  $3FFE0000,$FA232CF2,$52138AC0,$00000000
                    136:        DC.L  $3FF90000,$BDC8D83E,$AD88D549,$00000000
                    137:        DC.L  $3FFE0000,$F6603D98,$0F6603DA,$00000000
                    138:        DC.L  $3FFA0000,$9CF43DCF,$F5EAFD48,$00000000
                    139:        DC.L  $3FFE0000,$F2B9D648,$0F2B9D65,$00000000
                    140:        DC.L  $3FFA0000,$DA16EB88,$CB8DF614,$00000000
                    141:        DC.L  $3FFE0000,$EF2EB71F,$C4345238,$00000000
                    142:        DC.L  $3FFB0000,$8B29B775,$1BD70743,$00000000
                    143:        DC.L  $3FFE0000,$EBBDB2A5,$C1619C8C,$00000000
                    144:        DC.L  $3FFB0000,$A8D839F8,$30C1FB49,$00000000
                    145:        DC.L  $3FFE0000,$E865AC7B,$7603A197,$00000000
                    146:        DC.L  $3FFB0000,$C61A2EB1,$8CD907AD,$00000000
                    147:        DC.L  $3FFE0000,$E525982A,$F70C880E,$00000000
                    148:        DC.L  $3FFB0000,$E2F2A47A,$DE3A18AF,$00000000
                    149:        DC.L  $3FFE0000,$E1FC780E,$1FC780E2,$00000000
                    150:        DC.L  $3FFB0000,$FF64898E,$DF55D551,$00000000
                    151:        DC.L  $3FFE0000,$DEE95C4C,$A037BA57,$00000000
                    152:        DC.L  $3FFC0000,$8DB956A9,$7B3D0148,$00000000
                    153:        DC.L  $3FFE0000,$DBEB61EE,$D19C5958,$00000000
                    154:        DC.L  $3FFC0000,$9B8FE100,$F47BA1DE,$00000000
                    155:        DC.L  $3FFE0000,$D901B203,$6406C80E,$00000000
                    156:        DC.L  $3FFC0000,$A9372F1D,$0DA1BD17,$00000000
                    157:        DC.L  $3FFE0000,$D62B80D6,$2B80D62C,$00000000
                    158:        DC.L  $3FFC0000,$B6B07F38,$CE90E46B,$00000000
                    159:        DC.L  $3FFE0000,$D3680D36,$80D3680D,$00000000
                    160:        DC.L  $3FFC0000,$C3FD0329,$06488481,$00000000
                    161:        DC.L  $3FFE0000,$D0B69FCB,$D2580D0B,$00000000
                    162:        DC.L  $3FFC0000,$D11DE0FF,$15AB18CA,$00000000
                    163:        DC.L  $3FFE0000,$CE168A77,$25080CE1,$00000000
                    164:        DC.L  $3FFC0000,$DE1433A1,$6C66B150,$00000000
                    165:        DC.L  $3FFE0000,$CB8727C0,$65C393E0,$00000000
                    166:        DC.L  $3FFC0000,$EAE10B5A,$7DDC8ADD,$00000000
                    167:        DC.L  $3FFE0000,$C907DA4E,$871146AD,$00000000
                    168:        DC.L  $3FFC0000,$F7856E5E,$E2C9B291,$00000000
                    169:        DC.L  $3FFE0000,$C6980C69,$80C6980C,$00000000
                    170:        DC.L  $3FFD0000,$82012CA5,$A68206D7,$00000000
                    171:        DC.L  $3FFE0000,$C4372F85,$5D824CA6,$00000000
                    172:        DC.L  $3FFD0000,$882C5FCD,$7256A8C5,$00000000
                    173:        DC.L  $3FFE0000,$C1E4BBD5,$95F6E947,$00000000
                    174:        DC.L  $3FFD0000,$8E44C60B,$4CCFD7DE,$00000000
                    175:        DC.L  $3FFE0000,$BFA02FE8,$0BFA02FF,$00000000
                    176:        DC.L  $3FFD0000,$944AD09E,$F4351AF6,$00000000
                    177:        DC.L  $3FFE0000,$BD691047,$07661AA3,$00000000
                    178:        DC.L  $3FFD0000,$9A3EECD4,$C3EAA6B2,$00000000
                    179:        DC.L  $3FFE0000,$BB3EE721,$A54D880C,$00000000
                    180:        DC.L  $3FFD0000,$A0218434,$353F1DE8,$00000000
                    181:        DC.L  $3FFE0000,$B92143FA,$36F5E02E,$00000000
                    182:        DC.L  $3FFD0000,$A5F2FCAB,$BBC506DA,$00000000
                    183:        DC.L  $3FFE0000,$B70FBB5A,$19BE3659,$00000000
                    184:        DC.L  $3FFD0000,$ABB3B8BA,$2AD362A5,$00000000
                    185:        DC.L  $3FFE0000,$B509E68A,$9B94821F,$00000000
                    186:        DC.L  $3FFD0000,$B1641795,$CE3CA97B,$00000000
                    187:        DC.L  $3FFE0000,$B30F6352,$8917C80B,$00000000
                    188:        DC.L  $3FFD0000,$B7047551,$5D0F1C61,$00000000
                    189:        DC.L  $3FFE0000,$B11FD3B8,$0B11FD3C,$00000000
                    190:        DC.L  $3FFD0000,$BC952AFE,$EA3D13E1,$00000000
                    191:        DC.L  $3FFE0000,$AF3ADDC6,$80AF3ADE,$00000000
                    192:        DC.L  $3FFD0000,$C2168ED0,$F458BA4A,$00000000
                    193:        DC.L  $3FFE0000,$AD602B58,$0AD602B6,$00000000
                    194:        DC.L  $3FFD0000,$C788F439,$B3163BF1,$00000000
                    195:        DC.L  $3FFE0000,$AB8F69E2,$8359CD11,$00000000
                    196:        DC.L  $3FFD0000,$CCECAC08,$BF04565D,$00000000
                    197:        DC.L  $3FFE0000,$A9C84A47,$A07F5638,$00000000
                    198:        DC.L  $3FFD0000,$D2420487,$2DD85160,$00000000
                    199:        DC.L  $3FFE0000,$A80A80A8,$0A80A80B,$00000000
                    200:        DC.L  $3FFD0000,$D7894992,$3BC3588A,$00000000
                    201:        DC.L  $3FFE0000,$A655C439,$2D7B73A8,$00000000
                    202:        DC.L  $3FFD0000,$DCC2C4B4,$9887DACC,$00000000
                    203:        DC.L  $3FFE0000,$A4A9CF1D,$96833751,$00000000
                    204:        DC.L  $3FFD0000,$E1EEBD3E,$6D6A6B9E,$00000000
                    205:        DC.L  $3FFE0000,$A3065E3F,$AE7CD0E0,$00000000
                    206:        DC.L  $3FFD0000,$E70D785C,$2F9F5BDC,$00000000
                    207:        DC.L  $3FFE0000,$A16B312E,$A8FC377D,$00000000
                    208:        DC.L  $3FFD0000,$EC1F392C,$5179F283,$00000000
                    209:        DC.L  $3FFE0000,$9FD809FD,$809FD80A,$00000000
                    210:        DC.L  $3FFD0000,$F12440D3,$E36130E6,$00000000
                    211:        DC.L  $3FFE0000,$9E4CAD23,$DD5F3A20,$00000000
                    212:        DC.L  $3FFD0000,$F61CCE92,$346600BB,$00000000
                    213:        DC.L  $3FFE0000,$9CC8E160,$C3FB19B9,$00000000
                    214:        DC.L  $3FFD0000,$FB091FD3,$8145630A,$00000000
                    215:        DC.L  $3FFE0000,$9B4C6F9E,$F03A3CAA,$00000000
                    216:        DC.L  $3FFD0000,$FFE97042,$BFA4C2AD,$00000000
                    217:        DC.L  $3FFE0000,$99D722DA,$BDE58F06,$00000000
                    218:        DC.L  $3FFE0000,$825EFCED,$49369330,$00000000
                    219:        DC.L  $3FFE0000,$9868C809,$868C8098,$00000000
                    220:        DC.L  $3FFE0000,$84C37A7A,$B9A905C9,$00000000
                    221:        DC.L  $3FFE0000,$97012E02,$5C04B809,$00000000
                    222:        DC.L  $3FFE0000,$87224C2E,$8E645FB7,$00000000
                    223:        DC.L  $3FFE0000,$95A02568,$095A0257,$00000000
                    224:        DC.L  $3FFE0000,$897B8CAC,$9F7DE298,$00000000
                    225:        DC.L  $3FFE0000,$94458094,$45809446,$00000000
                    226:        DC.L  $3FFE0000,$8BCF55DE,$C4CD05FE,$00000000
                    227:        DC.L  $3FFE0000,$92F11384,$0497889C,$00000000
                    228:        DC.L  $3FFE0000,$8E1DC0FB,$89E125E5,$00000000
                    229:        DC.L  $3FFE0000,$91A2B3C4,$D5E6F809,$00000000
                    230:        DC.L  $3FFE0000,$9066E68C,$955B6C9B,$00000000
                    231:        DC.L  $3FFE0000,$905A3863,$3E06C43B,$00000000
                    232:        DC.L  $3FFE0000,$92AADE74,$C7BE59E0,$00000000
                    233:        DC.L  $3FFE0000,$8F1779D9,$FDC3A219,$00000000
                    234:        DC.L  $3FFE0000,$94E9BFF6,$15845643,$00000000
                    235:        DC.L  $3FFE0000,$8DDA5202,$37694809,$00000000
                    236:        DC.L  $3FFE0000,$9723A1B7,$20134203,$00000000
                    237:        DC.L  $3FFE0000,$8CA29C04,$6514E023,$00000000
                    238:        DC.L  $3FFE0000,$995899C8,$90EB8990,$00000000
                    239:        DC.L  $3FFE0000,$8B70344A,$139BC75A,$00000000
                    240:        DC.L  $3FFE0000,$9B88BDAA,$3A3DAE2F,$00000000
                    241:        DC.L  $3FFE0000,$8A42F870,$5669DB46,$00000000
                    242:        DC.L  $3FFE0000,$9DB4224F,$FFE1157C,$00000000
                    243:        DC.L  $3FFE0000,$891AC73A,$E9819B50,$00000000
                    244:        DC.L  $3FFE0000,$9FDADC26,$8B7A12DA,$00000000
                    245:        DC.L  $3FFE0000,$87F78087,$F78087F8,$00000000
                    246:        DC.L  $3FFE0000,$A1FCFF17,$CE733BD4,$00000000
                    247:        DC.L  $3FFE0000,$86D90544,$7A34ACC6,$00000000
                    248:        DC.L  $3FFE0000,$A41A9E8F,$5446FB9F,$00000000
                    249:        DC.L  $3FFE0000,$85BF3761,$2CEE3C9B,$00000000
                    250:        DC.L  $3FFE0000,$A633CD7E,$6771CD8B,$00000000
                    251:        DC.L  $3FFE0000,$84A9F9C8,$084A9F9D,$00000000
                    252:        DC.L  $3FFE0000,$A8489E60,$0B435A5E,$00000000
                    253:        DC.L  $3FFE0000,$83993052,$3FBE3368,$00000000
                    254:        DC.L  $3FFE0000,$AA59233C,$CCA4BD49,$00000000
                    255:        DC.L  $3FFE0000,$828CBFBE,$B9A020A3,$00000000
                    256:        DC.L  $3FFE0000,$AC656DAE,$6BCC4985,$00000000
                    257:        DC.L  $3FFE0000,$81848DA8,$FAF0D277,$00000000
                    258:        DC.L  $3FFE0000,$AE6D8EE3,$60BB2468,$00000000
                    259:        DC.L  $3FFE0000,$80808080,$80808081,$00000000
                    260:        DC.L  $3FFE0000,$B07197A2,$3C46C654,$00000000
                    261:
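LOGTBL above alternates rows of 1/F and log(F), each stored as an extended-precision value padded to 16 bytes. The Python sketch below decodes the first pair and models how the six "x" bits of F select a row. The helper names `decode_extended` and `table_offset` are hypothetical, and the decoding assumes the usual m68k extended layout (sign and 15-bit exponent in the upper half of the first longword, 64-bit mantissa with an explicit integer bit, bias 16383).

```python
from fractions import Fraction
import math

def decode_extended(words):
    """Decode one table row: three longwords of m68k extended value plus a pad."""
    w0, hi, lo, _pad = words
    sign = -1 if w0 & 0x80000000 else 1
    exponent = (w0 >> 16) & 0x7FFF          # 15-bit biased exponent
    mantissa = (hi << 32) | lo              # 64 bits, explicit integer bit
    return sign * mantissa * Fraction(2) ** (exponent - 16383 - 63)

def table_offset(ffrac):
    """Byte offset of the 1/F row, mirroring ANDI.L #$7E000000 and the 20-bit shift."""
    return (ffrac & 0x7E000000) >> 20

# First two rows of LOGTBL, copied from the listing: 1/F and log(F) for F = 129/128.
INV_F0 = (0x3FFE0000, 0xFE03F80F, 0xE03F80FE, 0x00000000)
LOG_F0 = (0x3FF70000, 0xFF015358, 0x833C47E2, 0x00000000)

inv_f = float(decode_extended(INV_F0))
log_f = float(decode_extended(LOG_F0))
```

Each (1/F, log(F)) pair occupies 32 bytes, so the offset is 32 times the six-bit index, and log(F) sits 16 bytes after 1/F.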
                    262: ADJK   equ     L_SCR1
                    263:
                    264: X      equ     FP_SCR1
                    265: XDCARE equ     X+2
                    266: XFRAC  equ     X+4
                    267:
                    268: F      equ     FP_SCR2
                    269: FFRAC  equ     F+4
                    270:
                    271: KLOG2  equ     FP_SCR3
                    272:
                    273: SAVEU  equ     FP_SCR4
                    274:
                    275:        xref    t_frcinx
                    276:        xref    t_extdnrm
                    277:        xref    t_operr
                    278:        xref    t_dz
                    279:
                    280:        xdef    slognd
                    281: slognd:
                    282: *--ENTRY POINT FOR LOG(X) FOR DENORMALIZED INPUT
                    283:
                    284:        MOVE.L          #-100,ADJK(a6)  ...INPUT = 2^(ADJK) * FP0
                    285:
                    286: *----normalize the input value by left shifting k bits (k to be determined
                    287: *----below), adjusting exponent and storing -k to  ADJK
                    288: *----the value TWOTO100 is no longer needed.
                    289: *----Note that this code assumes the denormalized input is NON-ZERO.
                    290:
                    291:      MoveM.L   D2-D7,-(A7)             ...save some registers
                    292:      Clr.L     D3                      ...D3 is exponent of smallest norm. #
                    293:      Move.L    4(A0),D4
                    294:      Move.L    8(A0),D5                ...(D4,D5) is (Hi_X,Lo_X)
                    295:      Clr.L     D2                      ...D2 used for holding K
                    296:
                    297:      Tst.L     D4
                    298:      BNE.B     HiX_not0
                    299:
                    300: HiX_0:
                    301:      Move.L    D5,D4
                    302:      Clr.L     D5
                    303:      Move.L    #32,D2
                    304:      Clr.L     D6
                    305:      BFFFO      D4{0:32},D6
                    306:      LSL.L      D6,D4
                    307:      Add.L     D6,D2                   ...(D3,D4,D5) is normalized
                    308:
                    309:      Move.L    D3,X(a6)
                    310:      Move.L    D4,XFRAC(a6)
                    311:      Move.L    D5,XFRAC+4(a6)
                    312:      Neg.L     D2
                    313:      Move.L    D2,ADJK(a6)
                    314:      FMove.X   X(a6),FP0
                    315:      MoveM.L   (A7)+,D2-D7             ...restore registers
                    316:      LEA       X(a6),A0
                    317:      Bra.B     LOGBGN                  ...begin regular log(X)
                    318:
                    319:
                    320: HiX_not0:
                    321:      Clr.L     D6
                    322:      BFFFO     D4{0:32},D6             ...find first 1
                    323:      Move.L    D6,D2                   ...get k
                    324:      LSL.L     D6,D4
                    325:      Move.L    D5,D7                   ...a copy of D5
                    326:      LSL.L     D6,D5
                    327:      Neg.L     D6
                    328:      AddI.L    #32,D6
                    329:      LSR.L     D6,D7
                    330:      Or.L      D7,D4                   ...(D3,D4,D5) normalized
                    331:
                    332:      Move.L    D3,X(a6)
                    333:      Move.L    D4,XFRAC(a6)
                    334:      Move.L    D5,XFRAC+4(a6)
                    335:      Neg.L     D2
                    336:      Move.L    D2,ADJK(a6)
                    337:      FMove.X   X(a6),FP0
                    338:      MoveM.L   (A7)+,D2-D7             ...restore registers
                    339:      LEA       X(a6),A0
                    340:      Bra.B     LOGBGN                  ...begin regular log(X)
                    341:
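The normalization done by slognd can be modeled in a few lines of Python. This is a sketch only: the assembly handles the "high longword is zero" and "high longword is non-zero" cases in separate branches, while the hypothetical helper `normalize_denorm` below treats the mantissa as one 64-bit integer. It returns the shifted mantissa halves and the value that would be stored in ADJK.

```python
def normalize_denorm(hi, lo):
    """Left-justify a nonzero 64-bit mantissa; return (hi, lo, adjk) with adjk = -k."""
    mant = (hi << 32) | lo
    if mant == 0:
        raise ValueError("the routine assumes a non-zero denormalized input")
    k = 64 - mant.bit_length()              # leading zeros, what BFFFO locates
    mant = (mant << k) & 0xFFFFFFFFFFFFFFFF
    return mant >> 32, mant & 0xFFFFFFFF, -k
```

For example, a mantissa of (0, 1) needs a shift of 63 bits, so the adjustment is -63 and the later exponent computation adds it back to K.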
                    342:
                    343:        xdef    slogn
                    344: slogn:
                    345: *--ENTRY POINT FOR LOG(X) FOR X FINITE, NON-ZERO, NOT NAN'S
                    346:
                    347:        FMOVE.X         (A0),FP0        ...LOAD INPUT
                    348:        CLR.L           ADJK(a6)
                    349:
                    350: LOGBGN:
                    351: *--FPCR SAVED AND CLEARED, INPUT IS 2^(ADJK)*FP0, FP0 CONTAINS
                    352: *--A FINITE, NON-ZERO, NORMALIZED NUMBER.
                    353:
                    354:        move.l  (a0),d0
                    355:        move.w  4(a0),d0
                    356:
                    357:        move.l  (a0),X(a6)
                    358:        move.l  4(a0),X+4(a6)
                    359:        move.l  8(a0),X+8(a6)
                    360:
                    361:        TST.L   D0              ...CHECK IF X IS NEGATIVE
                    362:        BLT.W   LOGNEG          ...LOG OF NEGATIVE ARGUMENT IS INVALID
                    363:        CMP2.L  BOUNDS1,D0      ...X IS POSITIVE, CHECK IF X IS NEAR 1
                    364:        BCC.W   LOGNEAR1        ...BOUNDS IS ROUGHLY [15/16, 17/16]
                    365:
                    366: LOGMAIN:
                    367: *--THIS SHOULD BE THE USUAL CASE, X NOT VERY CLOSE TO 1
                    368:
                    369: *--X = 2^(K) * Y, 1 <= Y < 2. THUS, Y = 1.XXXXXXXX....XX IN BINARY.
                    370: *--WE DEFINE F = 1.XXXXXX1, I.E. FIRST 7 BITS OF Y AND ATTACH A 1.
                    371: *--THE IDEA IS THAT LOG(X) = K*LOG2 + LOG(Y)
                    372: *--                     = K*LOG2 + LOG(F) + LOG(1 + (Y-F)/F).
                    373: *--NOTE THAT U = (Y-F)/F IS VERY SMALL AND THUS APPROXIMATING
                    374: *--LOG(1+U) CAN BE VERY EFFICIENT.
                    375: *--ALSO NOTE THAT THE VALUE 1/F IS STORED IN A TABLE SO THAT NO
                    376: *--DIVISION IS NEEDED TO CALCULATE (Y-F)/F.
                    377:
                    378: *--GET K, Y, F, AND ADDRESS OF 1/F.
                    379:        ASR.L   #8,D0
                    380:        ASR.L   #8,D0           ...SHIFTED 16 BITS, BIASED EXPO. OF X
                    381:        SUBI.L  #$3FFF,D0       ...THIS IS K
                    382:        ADD.L   ADJK(a6),D0     ...ADJUST K, ORIGINAL INPUT MAY BE  DENORM.
                    383:        LEA     LOGTBL,A0       ...BASE ADDRESS OF 1/F AND LOG(F)
                    384:        FMOVE.L D0,FP1          ...CONVERT K TO FLOATING-POINT FORMAT
                    385:
                    386: *--WHILE THE CONVERSION IS GOING ON, WE GET F AND ADDRESS OF 1/F
                    387:        MOVE.L  #$3FFF0000,X(a6)        ...X IS NOW Y, I.E. 2^(-K)*X
                    388:        MOVE.L  XFRAC(a6),FFRAC(a6)
                    389:        ANDI.L  #$FE000000,FFRAC(a6) ...FIRST 7 BITS OF Y
                    390:        ORI.L   #$01000000,FFRAC(a6) ...GET F: ATTACH A 1 AT THE EIGHTH BIT
                    391:        MOVE.L  FFRAC(a6),D0    ...READY TO GET ADDRESS OF 1/F
                    392:        ANDI.L  #$7E000000,D0
                    393:        ASR.L   #8,D0
                    394:        ASR.L   #8,D0
                    395:        ASR.L   #4,D0           ...SHIFTED 20, D0 IS THE DISPLACEMENT
                    396:        ADDA.L  D0,A0           ...A0 IS THE ADDRESS FOR 1/F
                    397:
                    398:        FMOVE.X X(a6),FP0
                    399:        move.l  #$3fff0000,F(a6)
                    400:        clr.l   F+8(a6)
                    401:        FSUB.X  F(a6),FP0               ...Y-F
                    402:        FMOVEm.X FP2/fp3,-(sp)  ...SAVE FP2 WHILE FP0 IS NOT READY
                    403: *--SUMMARY: FP0 IS Y-F, A0 IS ADDRESS OF 1/F, FP1 IS K
                    404: *--REGISTERS SAVED: FPCR, FP1, FP2, FP3
                    405:
                    406: LP1CONT1:
                    407: *--A RE-ENTRY POINT FOR LOGNP1
                    408:        FMUL.X  (A0),FP0        ...FP0 IS U = (Y-F)/F
                    409:        FMUL.X  LOGOF2,FP1      ...GET K*LOG2 WHILE FP0 IS NOT READY
                    410:        FMOVE.X FP0,FP2
                    411:        FMUL.X  FP2,FP2         ...FP2 IS V=U*U
                    412:        FMOVE.X FP1,KLOG2(a6)   ...PUT K*LOG2 IN MEMORY, FREE FP1
                    413:
                    414: *--LOG(1+U) IS APPROXIMATED BY
                    415: *--U + V*(A1+U*(A2+U*(A3+U*(A4+U*(A5+U*A6))))) WHICH IS
                    416: *--[U + V*(A1+V*(A3+V*A5))]  +  [U*V*(A2+V*(A4+V*A6))]
                    417:
                    418:        FMOVE.X FP2,FP3
                    419:        FMOVE.X FP2,FP1
                    420:
                    421:        FMUL.D  LOGA6,FP1       ...V*A6
                    422:        FMUL.D  LOGA5,FP2       ...V*A5
                    423:
                    424:        FADD.D  LOGA4,FP1       ...A4+V*A6
                    425:        FADD.D  LOGA3,FP2       ...A3+V*A5
                    426:
                    427:        FMUL.X  FP3,FP1         ...V*(A4+V*A6)
                    428:        FMUL.X  FP3,FP2         ...V*(A3+V*A5)
                    429:
                    430:        FADD.D  LOGA2,FP1       ...A2+V*(A4+V*A6)
                    431:        FADD.D  LOGA1,FP2       ...A1+V*(A3+V*A5)
                    432:
                    433:        FMUL.X  FP3,FP1         ...V*(A2+V*(A4+V*A6))
                    434:        ADDA.L  #16,A0          ...ADDRESS OF LOG(F)
                    435:        FMUL.X  FP3,FP2         ...V*(A1+V*(A3+V*A5)), FP3 RELEASED
                    436:
                    437:        FMUL.X  FP0,FP1         ...U*V*(A2+V*(A4+V*A6))
                    438:        FADD.X  FP2,FP0         ...U+V*(A1+V*(A3+V*A5)), FP2 RELEASED
                    439:
                    440:        FADD.X  (A0),FP1        ...LOG(F)+U*V*(A2+V*(A4+V*A6))
                    441:        FMOVEm.X  (sp)+,FP2/fp3 ...RESTORE FP2
                    442:        FADD.X  FP1,FP0         ...FP0 IS LOG(F) + LOG(1+U)
                    443:
                    444:        fmove.l d1,fpcr
                    445:        FADD.X  KLOG2(a6),FP0   ...FINAL ADD
                    446:        bra     t_frcinx
                    447:
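The LOGMAIN comments state that the nested form of the polynomial equals the sum of two independently evaluated halves. The Python sketch below checks that identity with exact rational arithmetic. The function names are hypothetical, and the coefficient list uses the Taylor values -1/2, 1/3, ... as stand-ins for LOGA1..LOGA6, which is an assumption made only so the example is self-contained.

```python
from fractions import Fraction

# Stand-ins for A1..A6 (assumption): Taylor coefficients of log(1+u).
A = [Fraction(-1, 2), Fraction(1, 3), Fraction(-1, 4),
     Fraction(1, 5), Fraction(-1, 6), Fraction(1, 7)]

def nested_form(u, a):
    """U + V*(A1+U*(A2+U*(A3+U*(A4+U*(A5+U*A6))))) with V = U*U."""
    acc = a[5]
    for coeff in reversed(a[:5]):           # A5, A4, A3, A2, A1
        acc = coeff + u * acc
    return u + u * u * acc

def split_form(u, a):
    """[U + V*(A1+V*(A3+V*A5))] + [U*V*(A2+V*(A4+V*A6))] with V = U*U."""
    v = u * u
    first = u + v * (a[0] + v * (a[2] + v * a[4]))
    second = u * v * (a[1] + v * (a[3] + v * a[5]))
    return first + second
```

The two halves share no intermediate results, which is what lets the assembly interleave the FP1 and FP2 chains to keep the floating-point pipeline busy.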
                    448:
                    449: LOGNEAR1:
                    450: *--REGISTERS SAVED: FPCR, FP1. FP0 CONTAINS THE INPUT.
                    451:        FMOVE.X FP0,FP1
                    452:        FSUB.S  one,FP1         ...FP1 IS X-1
                    453:        FADD.S  one,FP0         ...FP0 IS X+1
                    454:        FADD.X  FP1,FP1         ...FP1 IS 2(X-1)
                    455: *--LOG(X) = LOG(1+U/2)-LOG(1-U/2) WHICH IS AN ODD POLYNOMIAL
                    456: *--IN U, U = 2(X-1)/(X+1) = FP1/FP0
                    457:
                    458: LP1CONT2:
                    459: *--THIS IS A RE-ENTRY POINT FOR LOGNP1
                    460:        FDIV.X  FP0,FP1         ...FP1 IS U
                    461:        FMOVEm.X FP2/fp3,-(sp)   ...SAVE FP2
                    462: *--REGISTERS SAVED ARE NOW FPCR,FP1,FP2,FP3
                    463: *--LET V=U*U, W=V*V, CALCULATE
                    464: *--U + U*V*(B1 + V*(B2 + V*(B3 + V*(B4 + V*B5)))) BY
                    465: *--U + U*V*(  [B1 + W*(B3 + W*B5)]  +  [V*(B2 + W*B4)]  )
                    466:        FMOVE.X FP1,FP0
                    467:        FMUL.X  FP0,FP0 ...FP0 IS V
                    468:        FMOVE.X FP1,SAVEU(a6) ...STORE U IN MEMORY, FREE FP1
                    469:        FMOVE.X FP0,FP1
                    470:        FMUL.X  FP1,FP1 ...FP1 IS W
                    471:
                    472:        FMOVE.D LOGB5,FP3
                    473:        FMOVE.D LOGB4,FP2
                    474:
                    475:        FMUL.X  FP1,FP3 ...W*B5
                    476:        FMUL.X  FP1,FP2 ...W*B4
                    477:
                    478:        FADD.D  LOGB3,FP3 ...B3+W*B5
                    479:        FADD.D  LOGB2,FP2 ...B2+W*B4
                    480:
                    481:        FMUL.X  FP3,FP1 ...W*(B3+W*B5), FP3 RELEASED
                    482:
                    483:        FMUL.X  FP0,FP2 ...V*(B2+W*B4)
                    484:
                    485:        FADD.D  LOGB1,FP1 ...B1+W*(B3+W*B5)
                    486:        FMUL.X  SAVEU(a6),FP0 ...FP0 IS U*V
                    487:
                    488:        FADD.X  FP2,FP1 ...B1+W*(B3+W*B5) + V*(B2+W*B4), FP2 RELEASED
                    489:        FMOVEm.X (sp)+,FP2/fp3 ...FP2 RESTORED
                    490:
                    491:        FMUL.X  FP1,FP0 ...U*V*( [B1+W*(B3+W*B5)] + [V*(B2+W*B4)] )
                    492:
                    493:        fmove.l d1,fpcr
                    494:        FADD.X  SAVEU(a6),FP0
                    495:        bra     t_frcinx
                    496:        rts
                    497:
                    498: LOGNEG:
                    499: *--REGISTERS SAVED FPCR. LOG(-VE) IS INVALID
                    500:        bra     t_operr
                    501:
                    502:        xdef    slognp1d
                    503: slognp1d:
                    504: *--ENTRY POINT FOR LOG(1+Z) FOR DENORMALIZED INPUT
                    505: * Simply return the denorm
                    506:
                    507:        bra     t_extdnrm
                    508:
                    509:        xdef    slognp1
                    510: slognp1:
                    511: *--ENTRY POINT FOR LOG(1+X) FOR X FINITE, NON-ZERO, NOT NAN'S
                    512:
                    513:        FMOVE.X (A0),FP0        ...LOAD INPUT
                    514:        fabs.x  fp0             ;test magnitude
                    515:        fcmp.x  LTHOLD,fp0      ;compare with min threshold
                    516:        fbgt.w  LP1REAL         ;if greater, continue
                    517:        fmove.l #0,fpsr         ;clr N flag from compare
                    518:        fmove.l d1,fpcr
                    519:        fmove.x (a0),fp0        ;return signed argument
                    520:        bra     t_frcinx
                    521:
                    522: LP1REAL:
                    523:        FMOVE.X (A0),FP0        ...LOAD INPUT
                    524:        CLR.L   ADJK(a6)
                    525:        FMOVE.X FP0,FP1 ...FP1 IS INPUT Z
                    526:        FADD.S  one,FP0 ...X := ROUND(1+Z)
                    527:        FMOVE.X FP0,X(a6)
                    528:        MOVE.W  XFRAC(a6),XDCARE(a6)
                    529:        MOVE.L  X(a6),D0
                    530:        TST.L   D0
                    531:        BLE.W   LP1NEG0 ...LOG OF ZERO OR -VE
                    532:        CMP2.L  BOUNDS2,D0
                    533:        BCS.W   LOGMAIN ...BOUNDS2 IS [1/2,3/2]
                    534: *--IF 1+Z > 3/2 OR 1+Z < 1/2, THEN X, WHICH IS 1+Z ROUNDED,
                    535: *--CONTAINS AT LEAST 63 BITS OF INFORMATION OF Z. IN THAT CASE,
                    536: *--SIMPLY INVOKE LOG(X) FOR LOG(1+Z).
                    537:
                    538: LP1NEAR1:
                    539: *--NEXT SEE IF EXP(-1/16) < X < EXP(1/16)
                    540:        CMP2.L  BOUNDS1,D0
                    541:        BCS.B   LP1CARE
                    542:
                    543: LP1ONE16:
                    544: *--EXP(-1/16) < X < EXP(1/16). LOG(1+Z) = LOG(1+U/2) - LOG(1-U/2)
                    545: *--WHERE U = 2Z/(2+Z) = 2Z/(1+X).
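                         *--SKETCH OF WHY THE IDENTITY HOLDS, WITH U = 2Z/(2+Z):
                         *--  1 + U/2 = (2+2Z)/(2+Z),    1 - U/2 = 2/(2+Z)
                         *--  (1+U/2)/(1-U/2) = (2+2Z)/2 = 1+Z
                         *--TAKING LOGS GIVES LOG(1+Z) = LOG(1+U/2) - LOG(1-U/2).
                         *--THE FORM 2Z/(1+X) USES X = ROUND(1+Z), SO 2+Z = 1+X IS EXACT
                         *--ONLY WHEN 1+Z IS REPRESENTABLE WITHOUT ROUNDING.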
                    546:        FADD.X  FP1,FP1 ...FP1 IS 2Z
                    547:        FADD.S  one,FP0 ...FP0 IS 1+X
                    548: *--U = FP1/FP0
                    549:        BRA.W   LP1CONT2
                    550:
                    551: LP1CARE:
                    552: *--HERE WE USE THE USUAL TABLE DRIVEN APPROACH. CARE HAS TO BE
                    553: *--TAKEN BECAUSE 1+Z CAN HAVE 67 BITS OF INFORMATION AND WE MUST
                    554: *--PRESERVE ALL THE INFORMATION. BECAUSE 1+Z IS IN [1/2,3/2],
                    555: *--THERE ARE ONLY TWO CASES.
                    556: *--CASE 1: 1+Z < 1, THEN K = -1 AND Y-F = (2-F) + 2Z
                    557: *--CASE 2: 1+Z > 1, THEN K = 0  AND Y-F = (1-F) + Z
                    558: *--ON RETURNING TO LP1CONT1, WE MUST HAVE K IN FP1, ADDRESS OF
                    559: *--(1/F) IN A0, Y-F IN FP0, AND FP2/FP3 SAVED.
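                         *--BOTH CASES FOLLOW FROM Y = 2^(-K)*(1+Z), ASSUMING Y AND K HAVE
                         *--THE SAME MEANING AS IN THE MAIN LOG PATH (1 <= Y < 2):
                         *--  K = -1:  Y = 2*(1+Z) = 2 + 2Z,  SO Y-F = (2-F) + 2Z
                         *--  K =  0:  Y = 1+Z,               SO Y-F = (1-F) + Z
                         *--EACH FORM ADDS Z (OR 2Z) ITSELF RATHER THAN THE ROUNDED X, SO
                         *--NO BITS OF Z ARE DISCARDED BEFORE THE SUBTRACTION OF F.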
                    560:
                    561:        MOVE.L  XFRAC(a6),FFRAC(a6)
                    562:        ANDI.L  #$FE000000,FFRAC(a6)
                    563:        ORI.L   #$01000000,FFRAC(a6)    ...F OBTAINED
                    564:        CMPI.L  #$3FFF8000,D0   ...SEE IF 1+Z > 1
                    565:        BGE.B   KISZERO
                    566:
                    567: KISNEG1:
                    568:        FMOVE.S TWO,FP0
                    569:        move.l  #$3fff0000,F(a6)
                    570:        clr.l   F+8(a6)
                    571:        FSUB.X  F(a6),FP0       ...2-F
                    572:        MOVE.L  FFRAC(a6),D0
                    573:        ANDI.L  #$7E000000,D0
                    574:        ASR.L   #8,D0
                    575:        ASR.L   #8,D0
                    576:        ASR.L   #4,D0           ...D0 CONTAINS DISPLACEMENT FOR 1/F
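                         *--ARITHMETIC OF THE DISPLACEMENT: THE MASK $7E000000 KEEPS BITS
                         *--30..25 OF FFRAC (A 6-BIT INDEX J, 0..63). SHIFTING RIGHT BY
                         *--8+8+4 = 20 MOVES THEM TO BITS 10..5, SO D0 = 32*J.
                         *--THIS SUGGESTS 32-BYTE ENTRIES IN LOGTBL; THE TABLE LAYOUT IS
                         *--OUTSIDE THIS SECTION, SO TREAT THAT AS AN ASSUMPTION.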
                    577:        FADD.X  FP1,FP1         ...GET 2Z
                    578:        FMOVEm.X FP2/fp3,-(sp)  ...SAVE FP2 AND FP3
                    579:        FADD.X  FP1,FP0         ...FP0 IS Y-F = (2-F)+2Z
                    580:        LEA     LOGTBL,A0       ...A0 IS ADDRESS OF 1/F
                    581:        ADDA.L  D0,A0
                    582:        FMOVE.S negone,FP1      ...FP1 IS K = -1
                    583:        BRA.W   LP1CONT1
                    584:
                    585: KISZERO:
                    586:        FMOVE.S one,FP0
                    587:        move.l  #$3fff0000,F(a6)
                    588:        clr.l   F+8(a6)
                    589:        FSUB.X  F(a6),FP0               ...1-F
                    590:        MOVE.L  FFRAC(a6),D0
                    591:        ANDI.L  #$7E000000,D0
                    592:        ASR.L   #8,D0
                    593:        ASR.L   #8,D0
                    594:        ASR.L   #4,D0
                    595:        FADD.X  FP1,FP0         ...FP0 IS Y-F
                    596:        FMOVEm.X FP2/fp3,-(sp)  ...FP2 AND FP3 SAVED
                    597:        LEA     LOGTBL,A0
                    598:        ADDA.L  D0,A0           ...A0 IS ADDRESS OF 1/F
                    599:        FMOVE.S zero,FP1        ...FP1 IS K = 0
                    600:        BRA.W   LP1CONT1
                    601:
                    602: LP1NEG0:
                    603: *--FPCR SAVED. D0 IS X IN COMPACT FORM.
                    604:        TST.L   D0
                    605:        BLT.B   LP1NEG
                    606: LP1ZERO:
                    607:        FMOVE.S negone,FP0
                    608:
                    609:        fmove.l d1,fpcr
                    610:        bra t_dz
                    611:
                    612: LP1NEG:
                    613:        FMOVE.S zero,FP0
                    614:
                    615:        fmove.l d1,fpcr
                    616:        bra     t_operr
                    617:
                    618:        end
