Arithmetic¶
Addition¶
FixedPoint
addition using the +
or +=
operator is always
full-precision; that is, there is always bit growth.
Augend |
Addend |
Sum |
---|---|---|
\(UQm_1.n_1\)
(unsigned)
|
\(UQm_2.n_2\)
(unsigned)
|
\(UQx.y\), where
\(x = 1 + max\{m_1, m_2\}\)
\(y = max\{n_1, n_2\}\)
|
\(Qm_1.n_1\)
(signed)
|
\(Qm_2.n_2\)
(signed)
|
\(Qx.y\), where
\(x = 1 + max\{m_1, m_2\}\)
\(y = max\{n_1, n_2\}\)
|
\(Qm_1.n_1\)
(signed)
|
\(UQm_2.n_2\)
(unsigned)
|
|
\(UQm_1.n_1\)
(unsigned)
|
\(Qm_2.n_2\)
(signed)
|
As indicated in the table, any combination of signed and unsigned numbers can be added together. The sum is only unsigned when both augend and added are unsigned.
Overflow is not possible using the addition operators.
Unsigned¶
>>> x = FixedPoint(14, signed=0, m=8, n=4)
>>> y = FixedPoint(6, signed=0, m=3, n=5)
>>> z = x + y
>>> print(f' {x:q}\n+ {y:q}\n-------\n {z:q}')
UQ8.4
+ UQ3.5
-------
UQ9.5
Signed¶
>>> x = FixedPoint(-4, signed=1, m=4, n=4)
>>> y = FixedPoint(3, signed=1, m=3, n=5)
>>> z = x + y
>>> print(f' {x:q}\n+ {y:q}\n------\n {z:q}')
Q4.4
+ Q3.5
------
Q5.5
Mixed Signedness¶
>>> s = FixedPoint(-4.375, signed=1, m=4, n=4)
>>> u = FixedPoint(3 + 2**-5, signed=0, m=3, n=5)
>>> z = s + u
WARNING [SN1]: Non-matching rounding behaviors ['convergent', 'nearest'].
WARNING [SN1]: Using 'convergent'.
>>> print(f' {s:q}\n+ {u:q}\n-------\n {z:q}')
Q4.4
+ UQ3.5
-------
Q5.5
Additional Examples¶
This behavior guarantees that addition will never cause overflow. However, it does mean that an accumulator may grow larger than intended.
For example, summing 64 Q18.0 numbers will grow at maximum \(log_2(64) = 6\) bits, thus the accumulator should be sized to 24 integer bits.
from fixedpoint import FixedPoint
accum = FixedPoint(0, 1, 24, 0, overflow_alert='error', str_base=2)
max_neg = FixedPoint(-2**17, 1, 18, 0)
assert max_neg.clamped
for _ in range(64):
accum += max_neg
accum.clamp(24)
print(f"{int(accum)} in {accum:q} is\n0b{accum:_b}")
Summing the maximum negative Q18.0 number 64 times produces a Q24.0 that is
clamped to the maximum negative value. Note that accum.overflow_alert
was
set to 'error'
, thus we would have been informed had overflow occurred.
-8388608 in Q24.0 is
0b1000_0000_0000_0000_0000_0000
Subtraction¶
FixedPoint
subtraction using the -
or -=
operator is always
full-precision; that is, there is always bit growth.
Minuend |
Subtrahend |
Difference |
---|---|---|
\(UQm_1.n_1\)
(unsigned)
|
\(UQm_2.n_2\)
(unsigned)
|
\(UQx.y\), where
\(x = 1 + max\{m_1, m_2\}\)
\(y = max\{n_1, n_2\}\)
Overflow occurs if
subtrahend > minuend.
|
\(Qm_1.n_1\)
(signed)
|
\(Qm_2.n_2\)
(signed)
|
\(Qx.y\), where
\(x = 1 + max\{m_1, m_2\}\)
\(y = max\{n_1, n_2\}\)
|
\(Qm_1.n_1\)
(signed)
|
\(UQm_2.n_2\)
(unsigned)
|
\(Qx.y\), where
\(x = 2 + max\{m_1, m_2\}\)
\(y = max\{n_1, n_2\}\)
|
\(UQm_1.n_1\)
(unsigned)
|
\(Qm_2.n_2\)
(signed)
|
As indicated in the table, any combination of signed and unsigned numbers can be subtracted from each other. The difference is only unsigned when both augend and added are unsigned.
When signedness between minuend and subtrahend does not match, an extra integer bit is added to the unsigned term so it can be signed without overflowing.
Unsigned¶
>>> x = FixedPoint(14, signed=0, m=8, n=4)
>>> y = FixedPoint(6, signed=0, m=3, n=5)
>>> z = x - y
>>> print(f' {x:q}\n- {y:q}\n-------\n {z:q}')
UQ8.4
- UQ3.5
-------
UQ9.5
>>> float(z)
8.0
Overflow occurs when subtrahend > minuend.
>>> q_presub = y.qformat
>>> y.overflow_alert = 'warning'
>>> y -= x
WARNING [SN2]: Unsigned subtraction causes overflow.
WARNING [SN2]: Clamped to minimum.
>>> print(f' {q_presub}\n- {x:q}\n-------\n {y:q}')
UQ3.5
- UQ8.4
-------
UQ9.5
>>> float(y)
0.0
Signed¶
>>> x = FixedPoint(250 + 2**-6, signed=1)
>>> y = FixedPoint(-13 - 2**-8, signed=1)
>>> a = x - y
>>> print(f' {x:q}\n- {y:q}\n------\n {a:q}')
Q9.6
- Q5.8
------
Q10.8
>>> float(a)
263.01953125
>>> b = y - x
>>> print(f' {y:q}\n- {x:q}\n------\n {b:q}')
Q5.8
- Q9.6
------
Q10.8
>>> float(b)
-263.01953125
>>> a == -b
True
Overflow is not possible with signed subtraction.
Mixed Signedness¶
>>> s = FixedPoint(1, 1, 2)
>>> u = FixedPoint(1, 0, 2)
>>> x = u - s
WARNING [SN2]: Non-matching rounding behaviors ['convergent', 'nearest'].
WARNING [SN2]: Using 'convergent'.
>>> print(f' {u:q}\n- {s:q}\n------\n {x:q}')
UQ2.0
- Q2.0
------
Q4.0
>>> float(x)
0.0
Note that even though u and s can be represented without overflow in both
UQ2.0 and Q2.0 formats (their difference can too), 2 bits are still added
to the maximum integer bit width for the result. This makes for deterministic
bit growth. Use clamp()
or wrap()
to
revert back to the original Q format if needed.
>>> y = s - u
WARNING [SN1]: Non-matching rounding behaviors ['convergent', 'nearest'].
WARNING [SN1]: Using 'convergent'.
>>> print(f' {s:q}\n- {u:q}\n-------\n {y:q}')
Q2.0
- UQ2.0
-------
Q4.0
>>> float(y), clamp(y, s.m).qformat
(0.0, 'Q2.0')
Overflow is not possible with mixed signedness subtraction.
Multiplication¶
FixedPoint
multiplication using the *
or *=
operator is always
full-precision; that is, there is always bit growth.
Multiplicand |
Multiplier |
Product |
---|---|---|
\(UQm_1.n_1\)
(unsigned)
|
\(UQm_2.n_2\)
(unsigned)
|
\(UQx.y\), where
\(x = m_1 + m_2\)
\(y = n_1 + n_2\)
|
\(Qm_1.n_1\)
(signed)
|
\(Qm_2.n_2\)
(signed)
|
\(Qx.y\), where
\(x = m_1 + m_2\)
\(y = n_1 + n_2\)
|
\(Qm_1.n_1\)
(signed)
|
\(UQm_2.n_2\)
(unsigned)
|
|
\(UQm_1.n_1\)
(unsigned)
|
\(Qm_2.n_2\)
(signed)
|
Overflow is not possible using the multiplication operator.
Unsigned¶
>>> x = FixedPoint(10, n=2)
>>> y = FixedPoint(29, n=7)
>>> z = x * y
>>> print(f' {x:q}\n* {y:q}\n-------\n {z:q}')
UQ4.2
* UQ5.7
-------
UQ9.9
Signed¶
>>> x = FixedPoint(-4, signed=1, n=8)
>>> y = FixedPoint(2.5, signed=1)
>>> q = y.qformat
>>> y *= x
>>> print(f' {q}\n* {x:q}\n------\n {y:q}')
Q3.1
* Q3.8
------
Q6.9
>>> float(y)
-10.0
Mixed Signedness¶
>>> s = FixedPoint("0b1000", signed=1, m=3, n=1, rounding='nearest')
>>> u = FixedPoint("0b11", signed=0, m=2, n=0)
>>> z = u * s
>>> print(f"{u:.1f} * {s:.1f} = {z:.1f}")
3.0 * -4.0 = -12.0
>>> print(f' {u:q}\n* {s:q}\n------\n {z:q}')
UQ2.0
* Q3.1
------
Q5.1
Exponentiation¶
FixedPoint
exponentiation using the **
or **=
operator is always
full-precision; that is, there is always bit growth. Only positive integer
exponents are supported.
Base |
Exponent |
Result |
|
---|---|---|---|
\(UQm.n\)
(unsigned)
|
\(p \in \mathbb{Z}^+\)
(
int > 0) |
\(UQx.y\) |
where
\(x = p \times m\)
\(y = p \times n\)
|
\(Qm.n\)
(signed)
|
\(Qx.y\) |
>>> x = FixedPoint(1.5)
>>> y = FixedPoint(-1.5)
>>> x**y # not allowed
Traceback (most recent call last):
...
TypeError: Only positive integers are supported for exponentiation.
>>> x **= -2 # not allowed
Traceback (most recent call last):
...
TypeError: Only positive integers are supported for exponentiation.
>>> 2**x # not allowed
Traceback (most recent call last):
...
TypeError: unsupported operand type(s) for ** or pow(): 'int' and 'FixedPoint'
>>> a = x**4
>>> x.qformat, float(a), a.qformat
('UQ1.1', 5.0625, 'UQ4.4')
>>> b = y**3
>>> y.qformat, float(b), b.qformat
('Q2.1', -3.375, 'Q6.3')
Overflow is not possible using the power operators.
Negation & Absolute Value¶
Negation is achieved using the unary negation operator -
(see
FixedPoint.__neg__()
). Absolute value (see FixedPoint.__abs__()
)
is achieved using the negation operator on negative numbers, thus the same
behavior applies to unary negation and absolute value.
>>> x = FixedPoint(-4, m=10, overflow_alert='warning', str_base=2)
>>> float(y := -x)
4.0
>>> float(abs(x))
4.0
If the Q format can be maintained without overflow it will,
(as in the example above) otherwise an overflow alert is
issued, and the Q format of the result has an integer bit width one more than
the original FixedPoint
(as long as overflow_alert
is not 'error'
).
>>> y.qformat
'Q10.0'
>>> x.trim(ints=True) # remove unneeded leading bits
>>> x.qformat, float(x)
('Q3.0', -4.0)
>>> yy = -x
WARNING [SN1]: Negating 0b100 (Q3.0) causes overflow.
WARNING [SN1]: Adjusting Q format to Q4.0 to allow negation.
>>> x.qformat, y.qformat, yy.qformat, float(yy)
('Q3.0', 'Q10.0', 'Q4.0', 4.0)
>>> zz = abs(x)
WARNING [SN1]: Negating 0b100 (Q3.0) causes overflow.
WARNING [SN1]: Adjusting Q format to Q4.0 to allow negation.
>>> x.qformat, y.qformat, zz.qformat, float(zz)
('Q3.0', 'Q10.0', 'Q4.0', 4.0)
Unsigned numbers cannot be negated; this behavior is intended to minimize user error. Negating an unsigned number should be intentional. The preferred method is by use of the context manager:
>>> x = FixedPoint(3, signed=0)
>>> xx = abs(x)
>>> float(xx)
3.0
>>> -x
Traceback (most recent call last):
...
fixedpoint.FixedPointError: Unsigned numbers cannot be negated.
>>> with x(m=x.m + 1, signed=1): # Increase integer bit width for sign
... y = -x
>>> x.qformat, y.qformat, float(y)
('UQ2.0', 'Q3.0', -3.0)
Bitwise Operations¶
Bitwise operations do not cause overflow, nor do they modify the Q format.
Left Shift¶
Shifting bits left will cause MSbs to be lost. 0s are shifted into the LSb.
>>> x = FixedPoint('0b111000', 0, 3, 3, str_base=2)
>>> str(x << 2)
'100000'
To shift bits left and not lose bits, instead multiply the number by 2n, where n is the number of bits to shift.
>>> float(x) * 2**4
112.0
>>> y = x << 4
>>> float(y), y.qformat
(0.0, 'UQ3.3')
>>> z = x * 2**4
>>> float(z), z.qformat
(112.0, 'UQ8.3')
If the number of bits to shift is negative, a right shift is performed instead. For signed numbers, the value of the bits shifted in is the MSb. For unsigned numbers, 0s are shifted into the MSb.
>>> str(x << -2) # unsigned shift
'001110'
>>> with x(overflow_alert='ignore', overflow='wrap', signed=1): # signed shift
... str(x << -2)
'111110'
Right Shift¶
Shifting bits right will cause LSbs to be lost. 0s are shifted into the MSb for unsigned numbers. Sign bits are shifted into the MSb for signed numbers.
>>> notsigned = FixedPoint('0b111000', 0, 3, 3, str_base=2)
>>> signedneg = FixedPoint('0b111000', 1, 3, 3, str_base=2)
>>> signedpos = FixedPoint('0b011000', 1, 3, 3, str_base=2)
>>> print(f"{notsigned >> 2!s}\n{signedpos >> 2!s}\n{signedneg >> 2!s}")
001110
000110
111110
To shift bits left and not lose bits, instead multiply the number by 2-n, where n is the number of bits to shift.
If the number of bits to shift is negative, a left shift is performed instead. 0s are shifted into the LSb.
>>> print(f"{notsigned >> -2!s}\n{signedpos >> -2!s}\n{signedneg >> -2!s}")
100000
100000
100000
>>> x = FixedPoint(1, m=3)
>>> 2**-3 # Desired numerical value
0.125
>>> y = x >> 3
>>> float(y), y.qformat
(0.0, 'UQ3.0')
>>> z = x * 2**-3
>>> float(z), z.qformat
(0.125, 'UQ3.3')
AND, OR, XOR¶
The &
, &=
, |
, |=
, ^
, and ^=
operators perform bitwise
operations. A FixedPoint
is inter operable with an int
or
another FixedPoint
. In the latter case, the operand on the left will
be the Q format of the returned value.
>>> from operator import and_, or_, xor
>>> def operate(left, op, right):
... """Pretty display of using `op` with `left` and `right` operands"""
... r = {'&': and_, '|': or_, '^': xor}[op](left, right)
... l = max(len(left), len(right))
... return (f" {left:>{l}s} ({left:q})\n"
... f"{op} {right:>{l}s} ({right:q})\n"
... f"----------{'-' * l}\n"
... f" {r:>{l}s} ({r:q})")
>>> L = FixedPoint('0b100011', 0, 3, 3, str_base=2)
>>> R = FixedPoint(0b10, str_base=2)
>>> print(f" L & R\n{operate(L, '&', R)}")
L & R
100011 (UQ3.3)
& 10 (UQ2.0)
----------------
000010 (UQ3.3)
>>> print(f" R | L\n{operate(R, '|', L)}")
R | L
10 (UQ2.0)
| 100011 (UQ3.3)
----------------
11 (UQ2.0)
>>> print(f" L ^ R\n{operate(L, '^', R)}")
L ^ R
100011 (UQ3.3)
^ 10 (UQ2.0)
----------------
100001 (UQ3.3)
When using an int
as an operand, the operation is performed on the
FixedPoint.bits
attribute, and not the numerical value.
>>> x = FixedPoint('0b100011', 1, 3, 3, str_base=2)
>>> str(a := 7 & x)
'000011'
>>> float(a)
0.375
The order of the operands is irrelevant.
>>> str(b1 := x ^ 0b110000)
'010011'
>>> str(b2 := 0b110000 ^ x)
'010011'
>>> float(b1), float(b2)
(2.375, 2.375)
The integer is masked to the the number of bits in the FixedPoint
before
performing the operation.
>>> b1 |= 0b11111111111111111111101100 # (only the left len(b1) bits are used)
>>> str(b1), float(b1)
('111111', -0.125)
Inversion¶
Use the unary inversion operator ~
(see FixedPoint.__invert__()
) to
perform bitwise inversion.
>>> x = FixedPoint(0xAAAA)
>>> hex(~x)
'0x5555'
>>> ~x == (x ^ x.bitmask)
True