Annotation of sys/arch/m68k/fpsp/srem_mod.sa, Revision 1.1.1.1 (1.1 nbrk)
*	$OpenBSD: srem_mod.sa,v 1.2 1996/05/29 21:05:41 niklas Exp $
*	$NetBSD: srem_mod.sa,v 1.3 1994/10/26 07:49:58 cgd Exp $

* MOTOROLA MICROPROCESSOR & MEMORY TECHNOLOGY GROUP
* M68000 Hi-Performance Microprocessor Division
* M68040 Software Package
*
* M68040 Software Package Copyright (c) 1993, 1994 Motorola Inc.
* All rights reserved.
*
* THE SOFTWARE is provided on an "AS IS" basis and without warranty.
* To the maximum extent permitted by applicable law,
* MOTOROLA DISCLAIMS ALL WARRANTIES WHETHER EXPRESS OR IMPLIED,
* INCLUDING IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A
* PARTICULAR PURPOSE and any warranty against infringement with
* regard to the SOFTWARE (INCLUDING ANY MODIFIED VERSIONS THEREOF)
* and any accompanying written materials.
*
* To the maximum extent permitted by applicable law,
* IN NO EVENT SHALL MOTOROLA BE LIABLE FOR ANY DAMAGES WHATSOEVER
* (INCLUDING WITHOUT LIMITATION, DAMAGES FOR LOSS OF BUSINESS
* PROFITS, BUSINESS INTERRUPTION, LOSS OF BUSINESS INFORMATION, OR
* OTHER PECUNIARY LOSS) ARISING OF THE USE OR INABILITY TO USE THE
* SOFTWARE. Motorola assumes no responsibility for the maintenance
* and support of the SOFTWARE.
*
* You are hereby granted a copyright license to use, modify, and
* distribute the SOFTWARE so long as this entire notice is retained
* without alteration in any modified and/or redistributed versions,
* and that such modified versions are clearly identified as such.
* No licenses are granted by implication, estoppel or otherwise
* under any patents or trademarks of Motorola, Inc.

*
*	srem_mod.sa 3.1 12/10/90
*
*	The entry point sMOD computes the floating point MOD of the
*	input values X and Y. The entry point sREM computes the floating
*	point (IEEE) REM of the input values X and Y.
*
*	INPUT
*	-----
*	Double-extended value Y is pointed to by the address in register
*	A0. Double-extended value X is located in -12(A0). The values
*	of X and Y are both nonzero and finite, although either or both
*	of them can be denormalized. The special cases of zeros, NaNs,
*	and infinities are handled elsewhere.
*
*	OUTPUT
*	------
*	FREM(X,Y) or FMOD(X,Y), depending on entry point.
*
*	ALGORITHM
*	---------
*
*	Step 1. Save and strip signs of X and Y: signX := sign(X),
*	        signY := sign(Y), X := |X|, Y := |Y|,
*	        signQ := signX EOR signY. Record whether MOD or REM
*	        is requested.
*
*	Step 2. Set L := expo(X)-expo(Y), k := 0, Q := 0.
*	        If (L < 0) then
*	           R := X, go to Step 4.
*	        else
*	           R := 2^(-L)X, j := L.
*	        endif
*
*	Step 3. Perform MOD(X,Y)
*	        3.1 If R = Y, go to Step 9.
*	        3.2 If R > Y, then { R := R - Y, Q := Q + 1}
*	        3.3 If j = 0, go to Step 4.
*	        3.4 k := k + 1, j := j - 1, Q := 2Q, R := 2R. Go to
*	            Step 3.1.
*
*	Step 4. At this point, R = X - QY = MOD(X,Y). Set
*	        Last_Subtract := false (used in Step 7 below). If
*	        MOD is requested, go to Step 6.
*
*	Step 5. R = MOD(X,Y), but REM(X,Y) is requested.
*	        5.1 If R < Y/2, then R = MOD(X,Y) = REM(X,Y). Go to
*	            Step 6.
*	        5.2 If R > Y/2, then { set Last_Subtract := true,
*	            Q := Q + 1, Y := signX*Y }. Go to Step 6.
*	        5.3 This is the tricky case of R = Y/2. If Q is odd,
*	            then { Q := Q + 1, signX := -signX }.
*
*	Step 6. R := signX*R.
*
*	Step 7. If Last_Subtract = true, R := R - Y.
*
*	Step 8. Return signQ, last 7 bits of Q, and R as required.
*
*	Step 9. At this point, R = 2^(-j)*X - Q Y = Y. Thus,
*	        X = 2^(j)*(Q+1)Y. Set Q := 2^(j)*(Q+1),
*	        R := 0. Return signQ, last 7 bits of Q, and R.
*

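The steps of the ALGORITHM comment can be sketched in a high-level language for experimentation. The following is a minimal Python sketch, not the FPSP implementation: it works on host doubles rather than double-extended operands, uses exact rational arithmetic in place of the (Carry,D1,D2) register triple, and takes exponents from math.frexp. The function name srem_mod and its return layout (result, sign of Q, low 7 bits of Q) are hypothetical choices made for illustration.

```python
from fractions import Fraction
import math

def srem_mod(x, y, want_rem):
    """Follow Steps 1-9 for finite, nonzero floats x and y.

    Returns (result, sign_q, q7): sign_q is +1 or -1, q7 the low 7 bits of Q.
    """
    # Step 1: save and strip signs.
    sign_x = -1 if x < 0 else 1
    sign_y = -1 if y < 0 else 1
    sign_q = sign_x * sign_y
    X, Y = abs(Fraction(x)), abs(Fraction(y))

    # Step 2: L := expo(X) - expo(Y), Q := 0.
    L = math.frexp(x)[1] - math.frexp(y)[1]
    Q = 0
    if L < 0:
        R = X                      # X < Y already, so X = MOD(X,Y)
    else:
        R, j = X / 2**L, L         # R := 2^(-L) X, so Y/2 < R < 2Y
        # Step 3: shift-and-subtract loop.
        while True:
            if R == Y:             # Step 9: exact multiple, remainder is 0
                Q = (Q + 1) << j
                return math.copysign(0.0, x), sign_q, Q & 0x7F
            if R > Y:
                R -= Y
                Q += 1
            if j == 0:
                break
            j -= 1
            Q *= 2
            R *= 2

    # Steps 4 and 5: R = MOD(X,Y); adjust when REM is requested.
    last_subtract = False
    if want_rem:
        if 2 * R > Y:
            last_subtract = True
            Q += 1
        elif 2 * R == Y and Q % 2 == 1:   # tie: round quotient to even
            Q += 1
            sign_x = -sign_x

    # Step 6: apply sign of X.
    R = sign_x * R
    # Step 7: the listing subtracts |Y| before applying sign(X);
    # subtracting sign_x*Y after Step 6 gives the same value.
    if last_subtract:
        R -= sign_x * Y
    return float(R), sign_q, Q & 0x7F
```

For example, srem_mod(5.0, 3.0, False) returns (2.0, 1, 1) and srem_mod(5.0, 3.0, True) returns (-1.0, 1, 2). Because the intermediate values are exact rationals, the sketch can be compared against math.fmod and math.remainder on the host.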
SREM_MOD	IDNT	2,1 Motorola 040 Floating Point Software Package

	section	8

	include	fpsp.h

Mod_Flag	equ	L_SCR3
SignY	equ	FP_SCR3+4
SignX	equ	FP_SCR3+8
SignQ	equ	FP_SCR3+12
Sc_Flag	equ	FP_SCR4

Y	equ	FP_SCR1
Y_Hi	equ	Y+4
Y_Lo	equ	Y+8

R	equ	FP_SCR2
R_Hi	equ	R+4
R_Lo	equ	R+8


Scale	DC.L	$00010000,$80000000,$00000000,$00000000

	xref	t_avoid_unsupp

	xdef	smod
smod:

	Clr.L	Mod_Flag(a6)
	BRA.B	Mod_Rem

	xdef	srem
srem:

	Move.L	#1,Mod_Flag(a6)

Mod_Rem:
*..Save sign of X and Y
	MoveM.L	D2-D7,-(A7)	...save data registers
	Move.W	(A0),D3
	Move.W	D3,SignY(a6)
	AndI.L	#$00007FFF,D3	...Y := |Y|

*
	Move.L	4(A0),D4
	Move.L	8(A0),D5	...(D3,D4,D5) is |Y|

	Tst.L	D3
	BNE.B	Y_Normal

	Move.L	#$00003FFE,D3	...$3FFD + 1
	Tst.L	D4
	BNE.B	HiY_not0

HiY_0:
	Move.L	D5,D4
	CLR.L	D5
	SubI.L	#32,D3
	CLR.L	D6
	BFFFO	D4{0:32},D6
	LSL.L	D6,D4
	Sub.L	D6,D3	...(D3,D4,D5) is normalized
*			...with bias $7FFD
	BRA.B	Chk_X

HiY_not0:
	CLR.L	D6
	BFFFO	D4{0:32},D6
	Sub.L	D6,D3
	LSL.L	D6,D4
	Move.L	D5,D7	...a copy of D5
	LSL.L	D6,D5
	Neg.L	D6
	AddI.L	#32,D6
	LSR.L	D6,D7
	Or.L	D7,D4	...(D3,D4,D5) normalized
*			...with bias $7FFD
	BRA.B	Chk_X

Y_Normal:
	AddI.L	#$00003FFE,D3	...(D3,D4,D5) normalized
*			...with bias $7FFD

Chk_X:
	Move.W	-12(A0),D0
	Move.W	D0,SignX(a6)
	Move.W	SignY(a6),D1
	EOr.L	D0,D1
	AndI.L	#$00008000,D1
	Move.W	D1,SignQ(a6)	...sign(Q) obtained
	AndI.L	#$00007FFF,D0
	Move.L	-8(A0),D1
	Move.L	-4(A0),D2	...(D0,D1,D2) is |X|
	Tst.L	D0
	BNE.B	X_Normal
	Move.L	#$00003FFE,D0
	Tst.L	D1
	BNE.B	HiX_not0

HiX_0:
	Move.L	D2,D1
	CLR.L	D2
	SubI.L	#32,D0
	CLR.L	D6
	BFFFO	D1{0:32},D6
	LSL.L	D6,D1
	Sub.L	D6,D0	...(D0,D1,D2) is normalized
*			...with bias $7FFD
	BRA.B	Init

HiX_not0:
	CLR.L	D6
	BFFFO	D1{0:32},D6
	Sub.L	D6,D0
	LSL.L	D6,D1
	Move.L	D2,D7	...a copy of D2
	LSL.L	D6,D2
	Neg.L	D6
	AddI.L	#32,D6
	LSR.L	D6,D7
	Or.L	D7,D1	...(D0,D1,D2) normalized
*			...with bias $7FFD
	BRA.B	Init

X_Normal:
	AddI.L	#$00003FFE,D0	...(D0,D1,D2) normalized
*			...with bias $7FFD

Init:
*
	Move.L	D3,L_SCR1(a6)	...save biased expo(Y)
	move.l	d0,L_SCR2(a6)	;save d0
	Sub.L	D3,D0	...L := expo(X)-expo(Y)
*	Move.L	D0,L	...D0 is j
	CLR.L	D6	...D6 := carry <- 0
	CLR.L	D3	...D3 is Q
	MoveA.L	#0,A1	...A1 is k; j+k=L, Q=0

*..(Carry,D1,D2) is R
	Tst.L	D0
	BGE.B	Mod_Loop

*..expo(X) < expo(Y). Thus X = mod(X,Y)
*
	move.l	L_SCR2(a6),d0	;restore d0
	BRA.W	Get_Mod

*..At this point R = 2^(-L)X; Q = 0; k = 0; and k+j = L


Mod_Loop:
	Tst.L	D6	...test carry bit
	BGT.B	R_GT_Y

*..At this point carry = 0, R = (D1,D2), Y = (D4,D5)
	Cmp.L	D4,D1	...compare hi(R) and hi(Y)
	BNE.B	R_NE_Y
	Cmp.L	D5,D2	...compare lo(R) and lo(Y)
	BNE.B	R_NE_Y

*..At this point, R = Y
	BRA.W	Rem_is_0

R_NE_Y:
*..use the borrow of the previous compare
	BCS.B	R_LT_Y	...borrow is set iff R < Y

R_GT_Y:
*..If Carry is set, then Y < (Carry,D1,D2) < 2Y. Otherwise, Carry = 0
*..and Y < (D1,D2) < 2Y. Either way, perform R - Y
	Sub.L	D5,D2	...lo(R) - lo(Y)
	SubX.L	D4,D1	...hi(R) - hi(Y)
	CLR.L	D6	...clear carry
	AddQ.L	#1,D3	...Q := Q + 1

R_LT_Y:
*..At this point, Carry=0, R < Y. R = 2^(k-L)X - QY; k+j = L; j >= 0.
	Tst.L	D0	...see if j = 0.
	BEQ.B	PostLoop

	Add.L	D3,D3	...Q := 2Q
	Add.L	D2,D2	...lo(R) = 2lo(R)
	AddX.L	D1,D1	...hi(R) = 2hi(R) + carry
	SCS	D6	...set Carry if 2(R) overflows
	AddQ.L	#1,A1	...k := k+1
	SubQ.L	#1,D0	...j := j - 1
*..At this point, R=(Carry,D1,D2) = 2^(k-L)X - QY, j+k=L, j >= 0, R < 2Y.

	BRA.B	Mod_Loop

PostLoop:
*..k = L, j = 0, Carry = 0, R = (D1,D2) = X - QY, R < Y.

*..normalize R.
	Move.L	L_SCR1(a6),D0	...new biased expo of R
	Tst.L	D1
	BNE.B	HiR_not0

HiR_0:
	Move.L	D2,D1
	CLR.L	D2
	SubI.L	#32,D0
	CLR.L	D6
	BFFFO	D1{0:32},D6
	LSL.L	D6,D1
	Sub.L	D6,D0	...(D0,D1,D2) is normalized
*			...with bias $7FFD
	BRA.B	Get_Mod

HiR_not0:
	CLR.L	D6
	BFFFO	D1{0:32},D6
	BMI.B	Get_Mod	...already normalized
	Sub.L	D6,D0
	LSL.L	D6,D1
	Move.L	D2,D7	...a copy of D2
	LSL.L	D6,D2
	Neg.L	D6
	AddI.L	#32,D6
	LSR.L	D6,D7
	Or.L	D7,D1	...(D0,D1,D2) normalized

*
Get_Mod:
	CmpI.L	#$000041FE,D0
	BGE.B	No_Scale
Do_Scale:
	Move.W	D0,R(a6)
	clr.w	R+2(a6)
	Move.L	D1,R_Hi(a6)
	Move.L	D2,R_Lo(a6)
	Move.L	L_SCR1(a6),D6
	Move.W	D6,Y(a6)
	clr.w	Y+2(a6)
	Move.L	D4,Y_Hi(a6)
	Move.L	D5,Y_Lo(a6)
	FMove.X	R(a6),fp0	...no exception
	Move.L	#1,Sc_Flag(a6)
	BRA.B	ModOrRem
No_Scale:
	Move.L	D1,R_Hi(a6)
	Move.L	D2,R_Lo(a6)
	SubI.L	#$3FFE,D0
	Move.W	D0,R(a6)
	clr.w	R+2(a6)
	Move.L	L_SCR1(a6),D6
	SubI.L	#$3FFE,D6
	Move.L	D6,L_SCR1(a6)
	FMove.X	R(a6),fp0
	Move.W	D6,Y(a6)
	Move.L	D4,Y_Hi(a6)
	Move.L	D5,Y_Lo(a6)
	Clr.L	Sc_Flag(a6)

*


ModOrRem:
	Move.L	Mod_Flag(a6),D6
	BEQ.B	Fix_Sign

	Move.L	L_SCR1(a6),D6	...new biased expo(Y)
	SubQ.L	#1,D6	...biased expo(Y/2)
	Cmp.L	D6,D0
	BLT.B	Fix_Sign
	BGT.B	Last_Sub

	Cmp.L	D4,D1
	BNE.B	Not_EQ
	Cmp.L	D5,D2
	BNE.B	Not_EQ
	BRA.W	Tie_Case

Not_EQ:
	BCS.B	Fix_Sign

Last_Sub:
*
	FSub.X	Y(a6),fp0	...no exceptions
	AddQ.L	#1,D3	...Q := Q + 1

*

Fix_Sign:
*..Get sign of X
	Move.W	SignX(a6),D6
	BGE.B	Get_Q
	FNeg.X	fp0

*..Get Q
*
Get_Q:
	clr.l	d6
	Move.W	SignQ(a6),D6	...D6 is sign(Q)
	Move.L	#8,D7
	LSR.L	D7,D6
	AndI.L	#$0000007F,D3	...7 bits of Q
	Or.L	D6,D3	...sign and bits of Q
	Swap	D3
	FMove.L	fpsr,D6
	AndI.L	#$FF00FFFF,D6
	Or.L	D3,D6
	FMove.L	D6,fpsr	...put Q in fpsr

*
Restore:
	MoveM.L	(A7)+,D2-D7
	FMove.L	USER_FPCR(a6),fpcr
	Move.L	Sc_Flag(a6),D0
	BEQ.B	Finish
	FMul.X	Scale(pc),fp0	...may cause underflow
	bra	t_avoid_unsupp	;check for denorm as a
*				;result of the scaling

Finish:
	fmove.x	fp0,fp0	;capture exceptions & round
	rts

Rem_is_0:
*..R = 2^(-j)X - Q Y = Y, thus R = 0 and quotient = 2^j (Q+1)
	AddQ.L	#1,D3
	CmpI.L	#8,D0	...D0 is j
	BGE.B	Q_Big

	LSL.L	D0,D3
	BRA.B	Set_R_0

Q_Big:
	CLR.L	D3

Set_R_0:
	FMove.S	#:00000000,fp0
	Clr.L	Sc_Flag(a6)
	BRA.W	Fix_Sign

Tie_Case:
*..Check parity of Q
	Move.L	D3,D6
	AndI.L	#$00000001,D6
	Tst.L	D6
	BEq.W	Fix_Sign	...Q is even

*..Q is odd, Q := Q + 1, signX := -signX
	AddQ.L	#1,D3
	Move.W	SignX(a6),D6
	EOrI.L	#$00008000,D6
	Move.W	D6,SignX(a6)
	BRA.W	Fix_Sign

	End
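Two bit-level idioms recur in the listing: left-justifying a 64-bit mantissa with BFFFO (the HiY_0/HiY_not0, HiX_0/HiX_not0 and HiR_0/HiR_not0 blocks), and packing sign(Q) with the low 7 bits of Q into the quotient byte of FPSR (Get_Q). The Python sketch below mirrors both on plain integers. The helper names normalize64 and pack_quotient are hypothetical, and the registers are modelled as unsigned 32-bit values; it is an illustration of the data flow, not a substitute for the assembly.

```python
MASK32 = 0xFFFFFFFF

def normalize64(expo, hi, lo):
    """Left-justify a nonzero mantissa (hi, lo), as done for (D0,D1,D2).

    Returns the adjusted (expo, hi, lo) with bit 31 of hi set.
    """
    if hi == 0:                       # Hi*_0 path: move lo up by 32 bits
        hi, lo = lo, 0
        expo -= 32
    shift = 32 - hi.bit_length()      # leading-zero count, as BFFFO reports
    expo -= shift
    # Shift the 64-bit pair left; the bits leaving lo enter hi.
    hi = ((hi << shift) | (lo >> (32 - shift))) & MASK32
    lo = (lo << shift) & MASK32
    return expo, hi, lo

def pack_quotient(fpsr, sign_q_word, q):
    """Merge sign(Q) and the low 7 bits of Q into FPSR bits 23..16.

    sign_q_word is $8000 or 0, the value the listing stores in SignQ.
    """
    byte = (q & 0x7F) | ((sign_q_word & 0xFFFF) >> 8)   # $8000 -> $80
    return (fpsr & 0xFF00FFFF) | (byte << 16)           # Swap D3, then Or
```

For example, normalize64(0x3FFE, 1, 0x80000000) yields (0x3FFE - 31, 0xC0000000, 0), and pack_quotient(0x12345678, 0x8000, 0x1FF) yields 0x12FF5678.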