Annotation of sys/arch/m68k/fpsp/bindec.sa, Revision 1.1.1.1
1.1 nbrk 1: * $OpenBSD: bindec.sa,v 1.3 2001/09/20 17:02:30 mpech Exp $
2: * $NetBSD: bindec.sa,v 1.3 1994/10/26 07:48:51 cgd Exp $
3:
4: * MOTOROLA MICROPROCESSOR & MEMORY TECHNOLOGY GROUP
5: * M68000 Hi-Performance Microprocessor Division
6: * M68040 Software Package
7: *
8: * M68040 Software Package Copyright (c) 1993, 1994 Motorola Inc.
9: * All rights reserved.
10: *
11: * THE SOFTWARE is provided on an "AS IS" basis and without warranty.
12: * To the maximum extent permitted by applicable law,
13: * MOTOROLA DISCLAIMS ALL WARRANTIES WHETHER EXPRESS OR IMPLIED,
14: * INCLUDING IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A
15: * PARTICULAR PURPOSE and any warranty against infringement with
16: * regard to the SOFTWARE (INCLUDING ANY MODIFIED VERSIONS THEREOF)
17: * and any accompanying written materials.
18: *
19: * To the maximum extent permitted by applicable law,
20: * IN NO EVENT SHALL MOTOROLA BE LIABLE FOR ANY DAMAGES WHATSOEVER
21: * (INCLUDING WITHOUT LIMITATION, DAMAGES FOR LOSS OF BUSINESS
22: * PROFITS, BUSINESS INTERRUPTION, LOSS OF BUSINESS INFORMATION, OR
23: * OTHER PECUNIARY LOSS) ARISING OF THE USE OR INABILITY TO USE THE
24: * SOFTWARE. Motorola assumes no responsibility for the maintenance
25: * and support of the SOFTWARE.
26: *
27: * You are hereby granted a copyright license to use, modify, and
28: * distribute the SOFTWARE so long as this entire notice is retained
29: * without alteration in any modified and/or redistributed versions,
30: * and that such modified versions are clearly identified as such.
31: * No licenses are granted by implication, estoppel or otherwise
32: * under any patents or trademarks of Motorola, Inc.
33:
34: *
35: * bindec.sa 3.4 1/3/91
36: *
37: * bindec
38: *
39: * Description:
40: * Converts an input in extended precision format
41: * to bcd format.
42: *
43: * Input:
44: * a0 points to the input extended precision value
45: * value in memory; d0 contains the k-factor sign-extended
46: * to 32-bits. The input may be either normalized,
47: * unnormalized, or denormalized.
48: *
49: * Output: result in the FP_SCR1 space on the stack.
50: *
51: * Saves and Modifies: D2-D7,A2,FP2
52: *
53: * Algorithm:
54: *
55: * A1. Set RM and size ext; Set SIGMA = sign of input.
56: * The k-factor is saved for use in d7. Clear the
57: * BINDEC_FLG for separating normalized/denormalized
58: * input. If input is unnormalized or denormalized,
59: * normalize it.
60: *
61: * A2. Set X = abs(input).
62: *
63: * A3. Compute ILOG.
64: * ILOG is the log base 10 of the input value. It is
65: * approximated by adding e + 0.f when the original
66: * value is viewed as 2^^e * 1.f in extended precision.
67: * This value is stored in d6.
68: *
69: * A4. Clr INEX bit.
70: * The operation in A3 above may have set INEX2.
71: *
72: * A5. Set ICTR = 0;
73: * ICTR is a flag used in A13. It must be set before the
74: * loop entry A6.
75: *
76: * A6. Calculate LEN.
77: * LEN is the number of digits to be displayed. The
78: * k-factor can dictate either the total number of digits,
79: * if it is a positive number, or the number of digits
80: * after the decimal point which are to be included as
81: * significant. See the 68882 manual for examples.
82: * If LEN is computed to be greater than 17, set OPERR in
83: * USER_FPSR. LEN is stored in d4.
84: *
85: * A7. Calculate SCALE.
86: * SCALE is equal to 10^ISCALE, where ISCALE is the number
87: * of decimal places needed to insure LEN integer digits
88: * in the output before conversion to bcd. LAMBDA is the
89: * sign of ISCALE, used in A9. Fp1 contains
90: * 10^^(abs(ISCALE)) using a rounding mode which is a
91: * function of the original rounding mode and the signs
92: * of ISCALE and X. A table is given in the code.
93: *
94: * A8. Clr INEX; Force RZ.
95: * The operation in A3 above may have set INEX2.
96: * RZ mode is forced for the scaling operation to insure
97: * only one rounding error. The grs bits are collected in
98: * the INEX flag for use in A10.
99: *
100: * A9. Scale X -> Y.
101: * The mantissa is scaled to the desired number of
102: * significant digits. The excess digits are collected
103: * in INEX2.
104: *
105: * A10. Or in INEX.
106: * If INEX is set, round error occurred. This is
107: * compensated for by 'or-ing' in the INEX2 flag to
108: * the lsb of Y.
109: *
110: * A11. Restore original FPCR; set size ext.
111: * Perform FINT operation in the user's rounding mode.
112: * Keep the size to extended.
113: *
114: * A12. Calculate YINT = FINT(Y) according to user's rounding
115: * mode. The FPSP routine sintd0 is used. The output
116: * is in fp0.
117: *
118: * A13. Check for LEN digits.
119: * If the int operation results in more than LEN digits,
120: * or less than LEN -1 digits, adjust ILOG and repeat from
121: * A6. This test occurs only on the first pass. If the
122: * result is exactly 10^LEN, decrement ILOG and divide
123: * the mantissa by 10.
124: *
125: * A14. Convert the mantissa to bcd.
126: * The binstr routine is used to convert the LEN digit
127: * mantissa to bcd in memory. The input to binstr is
128: * to be a fraction; i.e. (mantissa)/10^LEN and adjusted
129: * such that the decimal point is to the left of bit 63.
130: * The bcd digits are stored in the correct position in
131: * the final string area in memory.
132: *
133: * A15. Convert the exponent to bcd.
134: * As in A14 above, the exp is converted to bcd and the
135: * digits are stored in the final string.
136: * Test the length of the final exponent string. If the
137: * length is 4, set operr.
138: *
139: * A16. Write sign bits to final string.
140: *
141: * Implementation Notes:
142: *
143: * The registers are used as follows:
144: *
145: * d0: scratch; LEN input to binstr
146: * d1: scratch
147: * d2: upper 32-bits of mantissa for binstr
148: * d3: scratch;lower 32-bits of mantissa for binstr
149: * d4: LEN
150: * d5: LAMBDA/ICTR
151: * d6: ILOG
152: * d7: k-factor
153: * a0: ptr for original operand/final result
154: * a1: scratch pointer
155: * a2: pointer to FP_X; abs(original value) in ext
156: * fp0: scratch
157: * fp1: scratch
158: * fp2: scratch
159: * F_SCR1:
160: * F_SCR2:
161: * L_SCR1:
162: * L_SCR2:
163: *
164:
165: BINDEC IDNT 2,1 Motorola 040 Floating Point Software Package
166:
167: include fpsp.h
168:
169: section 8
170:
171: * Constants in extended precision
172: LOG2 dc.l $3FFD0000,$9A209A84,$FBCFF798,$00000000
173: LOG2UP1 dc.l $3FFD0000,$9A209A84,$FBCFF799,$00000000
174:
175: * Constants in single precision
176: FONE dc.l $3F800000,$00000000,$00000000,$00000000
177: FTWO dc.l $40000000,$00000000,$00000000,$00000000
178: FTEN dc.l $41200000,$00000000,$00000000,$00000000
179: F4933 dc.l $459A2800,$00000000,$00000000,$00000000
180:
181: RBDTBL dc.b 0,0,0,0
182: dc.b 3,3,2,2
183: dc.b 3,2,2,3
184: dc.b 2,3,3,2
185:
186: xref binstr
187: xref sintdo
188: xref ptenrn,ptenrm,ptenrp
189:
190: xdef bindec
191: xdef sc_mul
192: bindec:
193: movem.l d2-d7/a2,-(a7)
194: fmovem.x fp0-fp2,-(a7)
195:
196: * A1. Set RM and size ext. Set SIGMA = sign input;
197: * The k-factor is saved for use in d7. Clear BINDEC_FLG for
198: * separating normalized/denormalized input. If the input
199: * is a denormalized number, set the BINDEC_FLG memory word
200: * to signal denorm. If the input is unnormalized, normalize
201: * the input and test for denormalized result.
202: *
203: fmove.l #rm_mode,FPCR ;set RM and ext
204: move.l (a0),L_SCR2(a6) ;save exponent for sign check
205: move.l d0,d7 ;move k-factor to d7
206: clr.b BINDEC_FLG(a6) ;clr norm/denorm flag
207: move.w STAG(a6),d0 ;get stag
208: andi.w #$e000,d0 ;isolate stag bits
209: beq A2_str ;if zero, input is norm
210: *
211: * Normalize the denorm
212: *
213: un_de_norm:
214: move.w (a0),d0
215: andi.w #$7fff,d0 ;strip sign of normalized exp
216: move.l 4(a0),d1
217: move.l 8(a0),d2
218: norm_loop:
219: sub.w #1,d0
220: add.l d2,d2
221: addx.l d1,d1
222: tst.l d1
223: bge.b norm_loop
224: *
225: * Test if the normalized input is denormalized
226: *
227: tst.w d0
228: bgt.b pos_exp ;if greater than zero, it is a norm
229: st BINDEC_FLG(a6) ;set flag for denorm
230: pos_exp:
231: andi.w #$7fff,d0 ;strip sign of normalized exp
232: move.w d0,(a0)
233: move.l d1,4(a0)
234: move.l d2,8(a0)
235:
236: * A2. Set X = abs(input).
237: *
238: A2_str:
239: move.l (a0),FP_SCR2(a6) ; move input to work space
240: move.l 4(a0),FP_SCR2+4(a6) ; move input to work space
241: move.l 8(a0),FP_SCR2+8(a6) ; move input to work space
242: andi.l #$7fffffff,FP_SCR2(a6) ;create abs(X)
243:
244: * A3. Compute ILOG.
245: * ILOG is the log base 10 of the input value. It is approx-
246: * imated by adding e + 0.f when the original value is viewed
247: * as 2^^e * 1.f in extended precision. This value is stored
248: * in d6.
249: *
250: * Register usage:
251: * Input/Output
252: * d0: k-factor/exponent
253: * d2: x/x
254: * d3: x/x
255: * d4: x/x
256: * d5: x/x
257: * d6: x/ILOG
258: * d7: k-factor/Unchanged
259: * a0: ptr for original operand/final result
260: * a1: x/x
261: * a2: x/x
262: * fp0: x/float(ILOG)
263: * fp1: x/x
264: * fp2: x/x
265: * F_SCR1:x/x
266: * F_SCR2:Abs(X)/Abs(X) with $3fff exponent
267: * L_SCR1:x/x
268: * L_SCR2:first word of X packed/Unchanged
269:
270: tst.b BINDEC_FLG(a6) ;check for denorm
271: beq.b A3_cont ;if clr, continue with norm
272: move.l #-4933,d6 ;force ILOG = -4933
273: bra.b A4_str
274: A3_cont:
275: move.w FP_SCR2(a6),d0 ;move exp to d0
276: move.w #$3fff,FP_SCR2(a6) ;replace exponent with 0x3fff
277: fmove.x FP_SCR2(a6),fp0 ;now fp0 has 1.f
278: sub.w #$3fff,d0 ;strip off bias
279: fadd.w d0,fp0 ;add in exp
280: fsub.s FONE,fp0 ;subtract off 1.0
281: fbge.w pos_res ;if pos, branch
282: fmul.x LOG2UP1,fp0 ;if neg, mul by LOG2UP1
283: fmove.l fp0,d6 ;put ILOG in d6 as a lword
284: bra.b A4_str ;go move out ILOG
285: pos_res:
286: fmul.x LOG2,fp0 ;if pos, mul by LOG2
287: fmove.l fp0,d6 ;put ILOG in d6 as a lword
288:
289:
290: * A4. Clr INEX bit.
291: * The operation in A3 above may have set INEX2.
292:
293: A4_str:
294: fmove.l #0,FPSR ;zero all of fpsr - nothing needed
295:
296:
297: * A5. Set ICTR = 0;
298: * ICTR is a flag used in A13. It must be set before the
299: * loop entry A6. The lower word of d5 is used for ICTR.
300:
301: clr.w d5 ;clear ICTR
302:
303:
304: * A6. Calculate LEN.
305: * LEN is the number of digits to be displayed. The k-factor
306: * can dictate either the total number of digits, if it is
307: * a positive number, or the number of digits after the
308: * original decimal point which are to be included as
309: * significant. See the 68882 manual for examples.
310: * If LEN is computed to be greater than 17, set OPERR in
311: * USER_FPSR. LEN is stored in d4.
312: *
313: * Register usage:
314: * Input/Output
315: * d0: exponent/Unchanged
316: * d2: x/x/scratch
317: * d3: x/x
318: * d4: exc picture/LEN
319: * d5: ICTR/Unchanged
320: * d6: ILOG/Unchanged
321: * d7: k-factor/Unchanged
322: * a0: ptr for original operand/final result
323: * a1: x/x
324: * a2: x/x
325: * fp0: float(ILOG)/Unchanged
326: * fp1: x/x
327: * fp2: x/x
328: * F_SCR1:x/x
329: * F_SCR2:Abs(X) with $3fff exponent/Unchanged
330: * L_SCR1:x/x
331: * L_SCR2:first word of X packed/Unchanged
332:
333: A6_str:
334: tst.l d7 ;branch on sign of k
335: ble.b k_neg ;if k <= 0, LEN = ILOG + 1 - k
336: move.l d7,d4 ;if k > 0, LEN = k
337: bra.b len_ck ;skip to LEN check
338: k_neg:
339: move.l d6,d4 ;first load ILOG to d4
340: sub.l d7,d4 ;subtract off k
341: addq.l #1,d4 ;add in the 1
342: len_ck:
343: tst.l d4 ;LEN check: branch on sign of LEN
344: ble.b LEN_ng ;if neg, set LEN = 1
345: cmp.l #17,d4 ;test if LEN > 17
346: ble.b A7_str ;if not, forget it
347: move.l #17,d4 ;set max LEN = 17
348: tst.l d7 ;if negative, never set OPERR
349: ble.b A7_str ;if positive, continue
350: or.l #opaop_mask,USER_FPSR(a6) ;set OPERR & AIOP in USER_FPSR
351: bra.b A7_str ;finished here
352: LEN_ng:
353: moveq.l #1,d4 ;min LEN is 1
354:
355:
356: * A7. Calculate SCALE.
357: * SCALE is equal to 10^ISCALE, where ISCALE is the number
358: * of decimal places needed to insure LEN integer digits
359: * in the output before conversion to bcd. LAMBDA is the sign
360: * of ISCALE, used in A9. Fp1 contains 10^^(abs(ISCALE)) using
361: * the rounding mode as given in the following table (see
362: * Coonen, p. 7.23 as ref.; however, the SCALE variable is
363: * of opposite sign in bindec.sa from Coonen).
364: *
365: * Initial USE
366: * FPCR[6:5] LAMBDA SIGN(X) FPCR[6:5]
367: * ----------------------------------------------
368: * RN 00 0 0 00/0 RN
369: * RN 00 0 1 00/0 RN
370: * RN 00 1 0 00/0 RN
371: * RN 00 1 1 00/0 RN
372: * RZ 01 0 0 11/3 RP
373: * RZ 01 0 1 11/3 RP
374: * RZ 01 1 0 10/2 RM
375: * RZ 01 1 1 10/2 RM
376: * RM 10 0 0 11/3 RP
377: * RM 10 0 1 10/2 RM
378: * RM 10 1 0 10/2 RM
379: * RM 10 1 1 11/3 RP
380: * RP 11 0 0 10/2 RM
381: * RP 11 0 1 11/3 RP
382: * RP 11 1 0 11/3 RP
383: * RP 11 1 1 10/2 RM
384: *
385: * Register usage:
386: * Input/Output
387: * d0: exponent/scratch - final is 0
388: * d2: x/0 or 24 for A9
389: * d3: x/scratch - offset ptr into PTENRM array
390: * d4: LEN/Unchanged
391: * d5: 0/ICTR:LAMBDA
392: * d6: ILOG/ILOG or k if ((k<=0)&(ILOG<k))
393: * d7: k-factor/Unchanged
394: * a0: ptr for original operand/final result
395: * a1: x/ptr to PTENRM array
396: * a2: x/x
397: * fp0: float(ILOG)/Unchanged
398: * fp1: x/10^ISCALE
399: * fp2: x/x
400: * F_SCR1:x/x
401: * F_SCR2:Abs(X) with $3fff exponent/Unchanged
402: * L_SCR1:x/x
403: * L_SCR2:first word of X packed/Unchanged
404:
405: A7_str:
406: tst.l d7 ;test sign of k
407: bgt.b k_pos ;if pos and > 0, skip this
408: cmp.l d6,d7 ;test k - ILOG
409: blt.b k_pos ;if ILOG >= k, skip this
410: move.l d7,d6 ;if ((k<0) & (ILOG < k)) ILOG = k
411: k_pos:
412: move.l d6,d0 ;calc ILOG + 1 - LEN in d0
413: addq.l #1,d0 ;add the 1
414: sub.l d4,d0 ;sub off LEN
415: swap d5 ;use upper word of d5 for LAMBDA
416: clr.w d5 ;set it zero initially
417: clr.w d2 ;set up d2 for very small case
418: tst.l d0 ;test sign of ISCALE
419: bge.b iscale ;if pos, skip next inst
420: addq.w #1,d5 ;if neg, set LAMBDA true
421: cmp.l #$ffffecd4,d0 ;test iscale <= -4908
422: bgt.b no_inf ;if false, skip rest
423: addi.l #24,d0 ;add in 24 to iscale
424: move.l #24,d2 ;put 24 in d2 for A9
425: no_inf:
426: neg.l d0 ;and take abs of ISCALE
427: iscale:
428: fmove.s FONE,fp1 ;init fp1 to 1
429: bfextu USER_FPCR(a6){26:2},d1 ;get initial rmode bits
430: add.w d1,d1 ;put them in bits 2:1
431: add.w d5,d1 ;add in LAMBDA
432: add.w d1,d1 ;put them in bits 3:1
433: tst.l L_SCR2(a6) ;test sign of original x
434: bge.b x_pos ;if pos, don't set bit 0
435: addq.l #1,d1 ;if neg, set bit 0
436: x_pos:
437: lea.l RBDTBL,a2 ;load rbdtbl base
438: move.b (a2,d1),d3 ;load d3 with new rmode
439: lsl.l #4,d3 ;put bits in proper position
440: fmove.l d3,fpcr ;load bits into fpu
441: lsr.l #4,d3 ;put bits in proper position
442: tst.b d3 ;decode new rmode for pten table
443: bne.b not_rn ;if zero, it is RN
444: lea.l PTENRN,a1 ;load a1 with RN table base
445: bra.b rmode ;exit decode
446: not_rn:
447: lsr.b #1,d3 ;get lsb in carry
448: bcc.b not_rp ;if carry clear, it is RM
449: lea.l PTENRP,a1 ;load a1 with RP table base
450: bra.b rmode ;exit decode
451: not_rp:
452: lea.l PTENRM,a1 ;load a1 with RM table base
453: rmode:
454: clr.l d3 ;clr table index
455: e_loop:
456: lsr.l #1,d0 ;shift next bit into carry
457: bcc.b e_next ;if zero, skip the mul
458: fmul.x (a1,d3),fp1 ;mul by 10**(d3_bit_no)
459: e_next:
460: add.l #12,d3 ;inc d3 to next pwrten table entry
461: tst.l d0 ;test if ISCALE is zero
462: bne.b e_loop ;if not, loop
463:
464:
465: * A8. Clr INEX; Force RZ.
466: * The operation in A3 above may have set INEX2.
467: * RZ mode is forced for the scaling operation to insure
468: * only one rounding error. The grs bits are collected in
469: * the INEX flag for use in A10.
470: *
471: * Register usage:
472: * Input/Output
473:
474: fmove.l #0,FPSR ;clr INEX
475: fmove.l #rz_mode,FPCR ;set RZ rounding mode
476:
477:
478: * A9. Scale X -> Y.
479: * The mantissa is scaled to the desired number of significant
480: * digits. The excess digits are collected in INEX2. If mul,
481: * Check d2 for excess 10 exponential value. If not zero,
482: * the iscale value would have caused the pwrten calculation
483: * to overflow. Only a negative iscale can cause this, so
484: * multiply by 10^(d2), which is now only allowed to be 24,
485: * with a multiply by 10^8 and 10^16, which is exact since
486: * 10^24 is exact. If the input was denormalized, we must
487: * create a busy stack frame with the mul command and the
488: * two operands, and allow the fpu to complete the multiply.
489: *
490: * Register usage:
491: * Input/Output
492: * d0: FPCR with RZ mode/Unchanged
493: * d2: 0 or 24/unchanged
494: * d3: x/x
495: * d4: LEN/Unchanged
496: * d5: ICTR:LAMBDA
497: * d6: ILOG/Unchanged
498: * d7: k-factor/Unchanged
499: * a0: ptr for original operand/final result
500: * a1: ptr to PTENRM array/Unchanged
501: * a2: x/x
502: * fp0: float(ILOG)/X adjusted for SCALE (Y)
503: * fp1: 10^ISCALE/Unchanged
504: * fp2: x/x
505: * F_SCR1:x/x
506: * F_SCR2:Abs(X) with $3fff exponent/Unchanged
507: * L_SCR1:x/x
508: * L_SCR2:first word of X packed/Unchanged
509:
510: A9_str:
511: fmove.x (a0),fp0 ;load X from memory
512: fabs.x fp0 ;use abs(X)
513: tst.w d5 ;LAMBDA is in lower word of d5
514: bne.b sc_mul ;if neg (LAMBDA = 1), scale by mul
515: fdiv.x fp1,fp0 ;calculate X / SCALE -> Y to fp0
516: bra.b A10_st ;branch to A10
517:
518: sc_mul:
519: tst.b BINDEC_FLG(a6) ;check for denorm
520: beq.b A9_norm ;if norm, continue with mul
521: fmovem.x fp1,-(a7) ;load ETEMP with 10^ISCALE
522: move.l 8(a0),-(a7) ;load FPTEMP with input arg
523: move.l 4(a0),-(a7)
524: move.l (a0),-(a7)
525: move.l #18,d3 ;load count for busy stack
526: A9_loop:
527: clr.l -(a7) ;clear lword on stack
528: dbf.w d3,A9_loop
529: move.b VER_TMP(a6),(a7) ;write current version number
530: move.b #BUSY_SIZE-4,1(a7) ;write current busy size
531: move.b #$10,$44(a7) ;set fcefpte[15] bit
532: move.w #$0023,$40(a7) ;load cmdreg1b with mul command
533: move.b #$fe,$8(a7) ;load all 1s to cu savepc
534: frestore (a7)+ ;restore frame to fpu for completion
535: fmul.x 36(a1),fp0 ;multiply fp0 by 10^8
536: fmul.x 48(a1),fp0 ;multiply fp0 by 10^16
537: bra.b A10_st
538: A9_norm:
539: tst.w d2 ;test for small exp case
540: beq.b A9_con ;if zero, continue as normal
541: fmul.x 36(a1),fp0 ;multiply fp0 by 10^8
542: fmul.x 48(a1),fp0 ;multiply fp0 by 10^16
543: A9_con:
544: fmul.x fp1,fp0 ;calculate X * SCALE -> Y to fp0
545:
546:
547: * A10. Or in INEX.
548: * If INEX is set, round error occurred. This is compensated
549: * for by 'or-ing' in the INEX2 flag to the lsb of Y.
550: *
551: * Register usage:
552: * Input/Output
553: * d0: FPCR with RZ mode/FPSR with INEX2 isolated
554: * d2: x/x
555: * d3: x/x
556: * d4: LEN/Unchanged
557: * d5: ICTR:LAMBDA
558: * d6: ILOG/Unchanged
559: * d7: k-factor/Unchanged
560: * a0: ptr for original operand/final result
561: * a1: ptr to PTENxx array/Unchanged
562: * a2: x/ptr to FP_SCR2(a6)
563: * fp0: Y/Y with lsb adjusted
564: * fp1: 10^ISCALE/Unchanged
565: * fp2: x/x
566:
567: A10_st:
568: fmove.l FPSR,d0 ;get FPSR
569: fmove.x fp0,FP_SCR2(a6) ;move Y to memory
570: lea.l FP_SCR2(a6),a2 ;load a2 with ptr to FP_SCR2
571: btst.l #9,d0 ;check if INEX2 set
572: beq.b A11_st ;if clear, skip rest
573: ori.l #1,8(a2) ;or in 1 to lsb of mantissa
574: fmove.x FP_SCR2(a6),fp0 ;write adjusted Y back to fpu
575:
576:
577: * A11. Restore original FPCR; set size ext.
578: * Perform FINT operation in the user's rounding mode. Keep
579: * the size to extended. The sintdo entry point in the sint
580: * routine expects the FPCR value to be in USER_FPCR for
581: * mode and precision. The original FPCR is saved in L_SCR1.
582:
583: A11_st:
584: move.l USER_FPCR(a6),L_SCR1(a6) ;save it for later
585: andi.l #$00000030,USER_FPCR(a6) ;set size to ext,
586: * ;block exceptions
587:
588:
589: * A12. Calculate YINT = FINT(Y) according to user's rounding mode.
590: * The FPSP routine sintd0 is used. The output is in fp0.
591: *
592: * Register usage:
593: * Input/Output
594: * d0: FPSR with AINEX cleared/FPCR with size set to ext
595: * d2: x/x/scratch
596: * d3: x/x
597: * d4: LEN/Unchanged
598: * d5: ICTR:LAMBDA/Unchanged
599: * d6: ILOG/Unchanged
600: * d7: k-factor/Unchanged
601: * a0: ptr for original operand/src ptr for sintdo
602: * a1: ptr to PTENxx array/Unchanged
603: * a2: ptr to FP_SCR2(a6)/Unchanged
604: * a6: temp pointer to FP_SCR2(a6) - orig value saved and restored
605: * fp0: Y/YINT
606: * fp1: 10^ISCALE/Unchanged
607: * fp2: x/x
608: * F_SCR1:x/x
609: * F_SCR2:Y adjusted for inex/Y with original exponent
610: * L_SCR1:x/original USER_FPCR
611: * L_SCR2:first word of X packed/Unchanged
612:
613: A12_st:
614: movem.l d0-d1/a0-a1,-(a7) ;save regs used by sintd0
615: move.l L_SCR1(a6),-(a7)
616: move.l L_SCR2(a6),-(a7)
617: lea.l FP_SCR2(a6),a0 ;a0 is ptr to F_SCR2(a6)
618: fmove.x fp0,(a0) ;move Y to memory at FP_SCR2(a6)
619: tst.l L_SCR2(a6) ;test sign of original operand
620: bge.b do_fint ;if pos, use Y
621: or.l #$80000000,(a0) ;if neg, use -Y
622: do_fint:
623: move.l USER_FPSR(a6),-(a7)
624: bsr sintdo ;sint routine returns int in fp0
625: move.b (a7),USER_FPSR(a6)
626: add.l #4,a7
627: move.l (a7)+,L_SCR2(a6)
628: move.l (a7)+,L_SCR1(a6)
629: movem.l (a7)+,d0-d1/a0-a1 ;restore regs used by sint
630: move.l L_SCR2(a6),FP_SCR2(a6) ;restore original exponent
631: move.l L_SCR1(a6),USER_FPCR(a6) ;restore user's FPCR
632:
633:
634: * A13. Check for LEN digits.
635: * If the int operation results in more than LEN digits,
636: * or less than LEN -1 digits, adjust ILOG and repeat from
637: * A6. This test occurs only on the first pass. If the
638: * result is exactly 10^LEN, decrement ILOG and divide
639: * the mantissa by 10. The calculation of 10^LEN cannot
640: * be inexact, since all powers of ten upto 10^27 are exact
641: * in extended precision, so the use of a previous power-of-ten
642: * table will introduce no error.
643: *
644: *
645: * Register usage:
646: * Input/Output
647: * d0: FPCR with size set to ext/scratch final = 0
648: * d2: x/x
649: * d3: x/scratch final = x
650: * d4: LEN/LEN adjusted
651: * d5: ICTR:LAMBDA/LAMBDA:ICTR
652: * d6: ILOG/ILOG adjusted
653: * d7: k-factor/Unchanged
654: * a0: pointer into memory for packed bcd string formation
655: * a1: ptr to PTENxx array/Unchanged
656: * a2: ptr to FP_SCR2(a6)/Unchanged
657: * fp0: int portion of Y/abs(YINT) adjusted
658: * fp1: 10^ISCALE/Unchanged
659: * fp2: x/10^LEN
660: * F_SCR1:x/x
661: * F_SCR2:Y with original exponent/Unchanged
662: * L_SCR1:original USER_FPCR/Unchanged
663: * L_SCR2:first word of X packed/Unchanged
664:
665: A13_st:
666: swap d5 ;put ICTR in lower word of d5
667: tst.w d5 ;check if ICTR = 0
668: bne not_zr ;if non-zero, go to second test
669: *
670: * Compute 10^(LEN-1)
671: *
672: fmove.s FONE,fp2 ;init fp2 to 1.0
673: move.l d4,d0 ;put LEN in d0
674: subq.l #1,d0 ;d0 = LEN -1
675: clr.l d3 ;clr table index
676: l_loop:
677: lsr.l #1,d0 ;shift next bit into carry
678: bcc.b l_next ;if zero, skip the mul
679: fmul.x (a1,d3),fp2 ;mul by 10**(d3_bit_no)
680: l_next:
681: add.l #12,d3 ;inc d3 to next pwrten table entry
682: tst.l d0 ;test if LEN is zero
683: bne.b l_loop ;if not, loop
684: *
685: * 10^LEN-1 is computed for this test and A14. If the input was
686: * denormalized, check only the case in which YINT > 10^LEN.
687: *
688: tst.b BINDEC_FLG(a6) ;check if input was norm
689: beq.b A13_con ;if norm, continue with checking
690: fabs.x fp0 ;take abs of YINT
691: bra test_2
692: *
693: * Compare abs(YINT) to 10^(LEN-1) and 10^LEN
694: *
695: A13_con:
696: fabs.x fp0 ;take abs of YINT
697: fcmp.x fp2,fp0 ;compare abs(YINT) with 10^(LEN-1)
698: fbge.w test_2 ;if greater, do next test
699: subq.l #1,d6 ;subtract 1 from ILOG
700: move.w #1,d5 ;set ICTR
701: fmove.l #rm_mode,FPCR ;set rmode to RM
702: fmul.s FTEN,fp2 ;compute 10^LEN
703: bra.w A6_str ;return to A6 and recompute YINT
704: test_2:
705: fmul.s FTEN,fp2 ;compute 10^LEN
706: fcmp.x fp2,fp0 ;compare abs(YINT) with 10^LEN
707: fblt.w A14_st ;if less, all is ok, go to A14
708: fbgt.w fix_ex ;if greater, fix and redo
709: fdiv.s FTEN,fp0 ;if equal, divide by 10
710: addq.l #1,d6 ; and inc ILOG
711: bra.b A14_st ; and continue elsewhere
712: fix_ex:
713: addq.l #1,d6 ;increment ILOG by 1
714: move.w #1,d5 ;set ICTR
715: fmove.l #rm_mode,FPCR ;set rmode to RM
716: bra.w A6_str ;return to A6 and recompute YINT
717: *
718: * Since ICTR <> 0, we have already been through one adjustment,
719: * and shouldn't have another; this is to check if abs(YINT) = 10^LEN
720: * 10^LEN is again computed using whatever table is in a1 since the
721: * value calculated cannot be inexact.
722: *
723: not_zr:
724: fmove.s FONE,fp2 ;init fp2 to 1.0
725: move.l d4,d0 ;put LEN in d0
726: clr.l d3 ;clr table index
727: z_loop:
728: lsr.l #1,d0 ;shift next bit into carry
729: bcc.b z_next ;if zero, skip the mul
730: fmul.x (a1,d3),fp2 ;mul by 10**(d3_bit_no)
731: z_next:
732: add.l #12,d3 ;inc d3 to next pwrten table entry
733: tst.l d0 ;test if LEN is zero
734: bne.b z_loop ;if not, loop
735: fabs.x fp0 ;get abs(YINT)
736: fcmp.x fp2,fp0 ;check if abs(YINT) = 10^LEN
737: fbne.w A14_st ;if not, skip this
738: fdiv.s FTEN,fp0 ;divide abs(YINT) by 10
739: addq.l #1,d6 ;and inc ILOG by 1
740: addq.l #1,d4 ; and inc LEN
741: fmul.s FTEN,fp2 ; if LEN++, the get 10^^LEN
742:
743:
744: * A14. Convert the mantissa to bcd.
745: * The binstr routine is used to convert the LEN digit
746: * mantissa to bcd in memory. The input to binstr is
747: * to be a fraction; i.e. (mantissa)/10^LEN and adjusted
748: * such that the decimal point is to the left of bit 63.
749: * The bcd digits are stored in the correct position in
750: * the final string area in memory.
751: *
752: *
753: * Register usage:
754: * Input/Output
755: * d0: x/LEN call to binstr - final is 0
756: * d1: x/0
757: * d2: x/ms 32-bits of mant of abs(YINT)
758: * d3: x/ls 32-bits of mant of abs(YINT)
759: * d4: LEN/Unchanged
760: * d5: ICTR:LAMBDA/LAMBDA:ICTR
761: * d6: ILOG
762: * d7: k-factor/Unchanged
763: * a0: pointer into memory for packed bcd string formation
764: * /ptr to first mantissa byte in result string
765: * a1: ptr to PTENxx array/Unchanged
766: * a2: ptr to FP_SCR2(a6)/Unchanged
767: * fp0: int portion of Y/abs(YINT) adjusted
768: * fp1: 10^ISCALE/Unchanged
769: * fp2: 10^LEN/Unchanged
770: * F_SCR1:x/Work area for final result
771: * F_SCR2:Y with original exponent/Unchanged
772: * L_SCR1:original USER_FPCR/Unchanged
773: * L_SCR2:first word of X packed/Unchanged
774:
775: A14_st:
776: fmove.l #rz_mode,FPCR ;force rz for conversion
777: fdiv.x fp2,fp0 ;divide abs(YINT) by 10^LEN
778: lea.l FP_SCR1(a6),a0
779: fmove.x fp0,(a0) ;move abs(YINT)/10^LEN to memory
780: move.l 4(a0),d2 ;move 2nd word of FP_RES to d2
781: move.l 8(a0),d3 ;move 3rd word of FP_RES to d3
782: clr.l 4(a0) ;zero word 2 of FP_RES
783: clr.l 8(a0) ;zero word 3 of FP_RES
784: move.l (a0),d0 ;move exponent to d0
785: swap d0 ;put exponent in lower word
786: beq.b no_sft ;if zero, don't shift
787: subi.l #$3ffd,d0 ;sub bias less 2 to make fract
788: tst.l d0 ;check if > 1
789: bgt.b no_sft ;if so, don't shift
790: neg.l d0 ;make exp positive
791: m_loop:
792: lsr.l #1,d2 ;shift d2:d3 right, add 0s
793: roxr.l #1,d3 ;the number of places
794: dbf.w d0,m_loop ;given in d0
795: no_sft:
796: tst.l d2 ;check for mantissa of zero
797: bne.b no_zr ;if not, go on
798: tst.l d3 ;continue zero check
799: beq.b zer_m ;if zero, go directly to binstr
800: no_zr:
801: clr.l d1 ;put zero in d1 for addx
802: addi.l #$00000080,d3 ;inc at bit 7
803: addx.l d1,d2 ;continue inc
804: andi.l #$ffffff80,d3 ;strip off lsb not used by 882
805: zer_m:
806: move.l d4,d0 ;put LEN in d0 for binstr call
807: addq.l #3,a0 ;a0 points to M16 byte in result
808: bsr binstr ;call binstr to convert mant
809:
810:
811: * A15. Convert the exponent to bcd.
812: * As in A14 above, the exp is converted to bcd and the
813: * digits are stored in the final string.
814: *
815: * Digits are stored in L_SCR1(a6) on return from BINDEC as:
816: *
817: * 32 16 15 0
818: * -----------------------------------------
819: * | 0 | e3 | e2 | e1 | e4 | X | X | X |
820: * -----------------------------------------
821: *
822: * And are moved into their proper places in FP_SCR1. If digit e4
823: * is non-zero, OPERR is signaled. In all cases, all 4 digits are
824: * written as specified in the 881/882 manual for packed decimal.
825: *
826: * Register usage:
827: * Input/Output
828: * d0: x/LEN call to binstr - final is 0
829: * d1: x/scratch (0);shift count for final exponent packing
830: * d2: x/ms 32-bits of exp fraction/scratch
831: * d3: x/ls 32-bits of exp fraction
832: * d4: LEN/Unchanged
833: * d5: ICTR:LAMBDA/LAMBDA:ICTR
834: * d6: ILOG
835: * d7: k-factor/Unchanged
836: * a0: ptr to result string/ptr to L_SCR1(a6)
837: * a1: ptr to PTENxx array/Unchanged
838: * a2: ptr to FP_SCR2(a6)/Unchanged
839: * fp0: abs(YINT) adjusted/float(ILOG)
840: * fp1: 10^ISCALE/Unchanged
841: * fp2: 10^LEN/Unchanged
842: * F_SCR1:Work area for final result/BCD result
843: * F_SCR2:Y with original exponent/ILOG/10^4
844: * L_SCR1:original USER_FPCR/Exponent digits on return from binstr
845: * L_SCR2:first word of X packed/Unchanged
846:
847: A15_st:
848: tst.b BINDEC_FLG(a6) ;check for denorm
849: beq.b not_denorm
850: ftst.x fp0 ;test for zero
851: fbeq.w den_zero ;if zero, use k-factor or 4933
852: fmove.l d6,fp0 ;float ILOG
853: fabs.x fp0 ;get abs of ILOG
854: bra.b convrt
855: den_zero:
856: tst.l d7 ;check sign of the k-factor
857: blt.b use_ilog ;if negative, use ILOG
858: fmove.s F4933,fp0 ;force exponent to 4933
859: bra.b convrt ;do it
860: use_ilog:
861: fmove.l d6,fp0 ;float ILOG
862: fabs.x fp0 ;get abs of ILOG
863: bra.b convrt
864: not_denorm:
865: ftst.x fp0 ;test for zero
866: fbne.w not_zero ;if zero, force exponent
867: fmove.s FONE,fp0 ;force exponent to 1
868: bra.b convrt ;do it
869: not_zero:
870: fmove.l d6,fp0 ;float ILOG
871: fabs.x fp0 ;get abs of ILOG
872: convrt:
873: fdiv.x 24(a1),fp0 ;compute ILOG/10^4
874: fmove.x fp0,FP_SCR2(a6) ;store fp0 in memory
875: move.l 4(a2),d2 ;move word 2 to d2
876: move.l 8(a2),d3 ;move word 3 to d3
877: move.w (a2),d0 ;move exp to d0
878: beq.b x_loop_fin ;if zero, skip the shift
879: subi.w #$3ffd,d0 ;subtract off bias
880: neg.w d0 ;make exp positive
881: x_loop:
882: lsr.l #1,d2 ;shift d2:d3 right
883: roxr.l #1,d3 ;the number of places
884: dbf.w d0,x_loop ;given in d0
885: x_loop_fin:
886: clr.l d1 ;put zero in d1 for addx
887: addi.l #$00000080,d3 ;inc at bit 6
888: addx.l d1,d2 ;continue inc
889: andi.l #$ffffff80,d3 ;strip off lsb not used by 882
890: move.l #4,d0 ;put 4 in d0 for binstr call
891: lea.l L_SCR1(a6),a0 ;a0 is ptr to L_SCR1 for exp digits
892: bsr binstr ;call binstr to convert exp
893: move.l L_SCR1(a6),d0 ;load L_SCR1 lword to d0
894: move.l #12,d1 ;use d1 for shift count
895: lsr.l d1,d0 ;shift d0 right by 12
896: bfins d0,FP_SCR1(a6){4:12} ;put e3:e2:e1 in FP_SCR1
897: lsr.l d1,d0 ;shift d0 right by 12
898: bfins d0,FP_SCR1(a6){16:4} ;put e4 in FP_SCR1
899: tst.b d0 ;check if e4 is zero
900: beq.b A16_st ;if zero, skip rest
901: or.l #opaop_mask,USER_FPSR(a6) ;set OPERR & AIOP in USER_FPSR
902:
903:
904: * A16. Write sign bits to final string.
905: * Sigma is bit 31 of initial value; RHO is bit 31 of d6 (ILOG).
906: *
907: * Register usage:
908: * Input/Output
909: * d0: x/scratch - final is x
910: * d2: x/x
911: * d3: x/x
912: * d4: LEN/Unchanged
913: * d5: ICTR:LAMBDA/LAMBDA:ICTR
914: * d6: ILOG/ILOG adjusted
915: * d7: k-factor/Unchanged
916: * a0: ptr to L_SCR1(a6)/Unchanged
917: * a1: ptr to PTENxx array/Unchanged
918: * a2: ptr to FP_SCR2(a6)/Unchanged
919: * fp0: float(ILOG)/Unchanged
920: * fp1: 10^ISCALE/Unchanged
921: * fp2: 10^LEN/Unchanged
922: * F_SCR1:BCD result with correct signs
923: * F_SCR2:ILOG/10^4
924: * L_SCR1:Exponent digits on return from binstr
925: * L_SCR2:first word of X packed/Unchanged
926:
927: A16_st:
928: clr.l d0 ;clr d0 for collection of signs
929: andi.b #$0f,FP_SCR1(a6) ;clear first nibble of FP_SCR1
930: tst.l L_SCR2(a6) ;check sign of original mantissa
931: bge.b mant_p ;if pos, don't set SM
932: moveq.l #2,d0 ;move 2 in to d0 for SM
933: mant_p:
934: tst.l d6 ;check sign of ILOG
935: bge.b wr_sgn ;if pos, don't set SE
936: addq.l #1,d0 ;set bit 0 in d0 for SE
937: wr_sgn:
938: bfins d0,FP_SCR1(a6){0:2} ;insert SM and SE into FP_SCR1
939:
940: * Clean up and restore all registers used.
941:
942: fmove.l #0,FPSR ;clear possible inex2/ainex bits
943: fmovem.x (a7)+,fp0-fp2
944: movem.l (a7)+,d2-d7/a2
945: rts
946:
947: end
CVSweb