294 lines
12 KiB
ReStructuredText
294 lines
12 KiB
ReStructuredText
lib_xcore_math change log
|
|
=========================
|
|
|
|
2.4.0
|
|
-----
|
|
|
|
* CHANGED: Documentation updated
|
|
* CHANGED: Renamed examples to app_
|
|
* FIXED: Added missing xcore_math.h
|
|
* REMOVED: xmos_cmake_toolchain submodule
|
|
|
|
2.3.0
|
|
-----
|
|
|
|
* CHANGED: Examples and tests to build using XCommon CMake
|
|
|
|
2.2.0
|
|
-----
|
|
|
|
* ADDED: `filter_biquad_sat_s32()` - Apply a biquad filter to a 32-bit
|
|
signal with saturation.
|
|
* ADDED: XCommon CMake build system support.
|
|
* CHANGED: Removed deprecated `numpy` types in `xmath_script.py` (issue #139).
|
|
* CHANGED: Made FFT documentation more clear (issue #169).
|
|
* FIXED: Bug (issue #170) in `xs3_memcpy()`.
|
|
* FIXED: Saturation in `filter_fir_s32()` when `shift <= 0`.
|
|
|
|
2.1.3
|
|
-----
|
|
|
|
* Fixes bug (issue #147) in `s16_to_s32()`.
|
|
* Fixes bug (issue #146) in `bfp_s32_macc()` and `bfp_s32_nmacc()`.
|
|
* Fixes bug with the `vect_s32_prepare_api` not appearing in the
|
|
documentation.
|
|
* Fixes bug in `bfp_s32_mean()` and `bfp_s16_mean()` when hitting a corner
|
|
case scenario.
|
|
* Cleans up internal functions.
|
|
* Allows compiling and running demos and tests on Windows Native x86
|
|
platforms.
|
|
* Removes several warnings.
|
|
|
|
2.1.2
|
|
-----
|
|
|
|
* Optimisation fix (issue #128) for `filter_fir_s32()`
|
|
* Documentation improvements.
|
|
|
|
2.1.1
|
|
-----
|
|
|
|
* Fixes bug (issue #116) in `vect_packed_complex_s32_macc()`.
|
|
* Fixes bug (issue #119) in `filter_fir_s32()`.
|
|
* Adds `--scale` option to the filter conversion script
|
|
`gen_biquad_filter_s32.py`
|
|
* If internal filter states (outputs from internal biquad sections) grow too
|
|
large integer overflows may occur. Using the `--scale` option can help avoid
|
|
this by effectively applying a gain factor to all coefficients.
|
|
* Fixes bug (mentioned in issue #119) in the `gen_fir_filter_s32.py` and
|
|
`gen_fir_filter_s16.py` filter conversion scripts where in a certain corner
|
|
case filter coefficients can overflow.
|
|
|
|
2.1.0
|
|
-----
|
|
|
|
* Adds several new operations for IEEE float vectors.
|
|
* Corrects `module_build_info` for legacy build tools.
|
|
* Fixes potential issue with include paths.
|
|
|
|
2.0.2
|
|
-----
|
|
|
|
* Updated CMake configuration to support Darwin platform.
|
|
|
|
2.0.1
|
|
-----
|
|
|
|
* Bugfix: Fixed issue with including ``xmath/xmath.h`` from XC files.
|
|
* Doc Update: Corrected instructions for configuring CMake using XS3
|
|
toolchain.
|
|
|
|
|
|
Legacy release history
|
|
======================
|
|
|
|
2.0.0
|
|
-----
|
|
|
|
Major, backwards compatibility-breaking update to the library.
|
|
|
|
Background
|
|
**********
|
|
|
|
This update does not add new features (compared to 1.1.0) to the library. Instead, it is a
|
|
major refactoring of the library in a few different ways. The overarching purpose of this
|
|
refactoring is to generalize the library so that when any future xcore ISAs are released, support
|
|
for those architectures can be included within this library in a minimally disruptive way. The
|
|
xcore XS3 architecture currently remains the only xcore ISA officially supported by the library.
|
|
|
|
Major Changes
|
|
*************
|
|
|
|
As mentioned, the update from 1.1.0 to 2.0.0 is does not add, remove or functionally change
|
|
supported features. Rather, the changes are primarily related to organization and the conventions
|
|
used. However, these changes do break backwards compatibility; hence the major version increment.
|
|
|
|
The relevant major changes are:
|
|
|
|
* **Library name has changed from `lib_xs3_math` to `lib_xcore_math`**
|
|
|
|
* This is to reflect that this library is not intended to be deprecated when future xcore ISA
|
|
versions are released.
|
|
|
|
* **Library re-organized into multiple sub-APIs**
|
|
|
|
* The organization of the library into the 'high-level' (BFP) and 'low-level' (non-BFP) APIs has
|
|
grown cumbersome and no longer cleanly fits the content of the library. To this end, the
|
|
operations made available by this library have been re-organized into several sub-APIs. These
|
|
APIs are all versioned together and are in many cases interdependent. This new grouping is
|
|
primarily for conceptual simplicity.
|
|
|
|
* BFP API -- Mostly unchanged from the previous high-level API.
|
|
* Vector API -- Corresponds to most of the functionality in the previous low-level API, less the
|
|
operations that were moved to the new APIs.
|
|
* Filtering API -- Operations related to supported linear filters (FIR and Biquad).
|
|
* FFT API -- Collects the FFT-related functions (both low-level and high-level) from the
|
|
previous API and groups them together.
|
|
* DCT API -- Like the FFT API, collected DCT-related functions.
|
|
* Scalar API -- A small API for implementations of scalar operations, with particular emphasis
|
|
on the non-IEEE 754 floating-point scalars provided (but also includes some ``float``
|
|
operations).
|
|
|
|
* **Many functions renamed**
|
|
|
|
* Previous versions of this library used a function naming convention whereby the majority of
|
|
low-level functions (and some types) had names prefixed with ``xs3_``, even where those
|
|
functions were written in C (rather than XS3 assembly) and were not intimately related to the
|
|
specifics of the XS3 hardware (i.e. where the same C implementation could plausibly be used in
|
|
future ISA versions).
|
|
* To that end, most functions prefixed with ``xs3_`` have had that prefix removed. This way, when
|
|
a future ISA is released (and once support is added to this library), applications can be
|
|
retargeted at the newer architecture without the unnecessary effort of going through the code to
|
|
rename all the function calls.
|
|
|
|
* e.g. ``xs3_vect_s32_mul() --> vect_s32_mul()``
|
|
* Note that most BFP function names remain unchanged.
|
|
|
|
* The intention going forward is that the public API should avoid ISA version-specific naming when
|
|
the object being named is not conceptually specific to a particular ISA (except possibly where
|
|
optimizing different ISA versions necessitates mutually incompatible implementations).
|
|
|
|
|
|
1.1.0
|
|
-----
|
|
|
|
Major Changes
|
|
*************
|
|
|
|
* Support for channel-pair related types and operations has been dropped. These were considered to
|
|
be too narrowly focused on making use of a single optimization (stereo FFT).
|
|
|
|
* This is a backwards compatibility-breaking change, requiring a major version increment.
|
|
|
|
* Added various scalar arithmetic functions for `float_s32_t` type.
|
|
|
|
* Adds Discrete Cosine Transform API
|
|
|
|
* Adds various trig and exponential functions.
|
|
|
|
Bugfixes
|
|
********
|
|
|
|
* Fixed bug in `bfp_fft_inverse_stereo()` where length of output BFP vector was half of correct
|
|
length.
|
|
|
|
New Functions
|
|
*************
|
|
* BFP API
|
|
|
|
* FFT spectrum unpacking
|
|
|
|
* `bfp_fft_unpack_mono()` -- Used to expand the output spectrum from `bfp_fft_forward_mono()`
|
|
from `FFT_N/2` elements (with the Nyquist component packed into the DC component) to
|
|
`FFT_N/2 + 1` elements. This is useful as many complex operations behave undesirably on the
|
|
packed representation.
|
|
* `bfp_fft_pack_mono()` -- Opposite of `bfp_fft_unpack_mono()`. Used to repack the spectrum into
|
|
a form suitable for calling `bfp_fft_inverse_mono()`.
|
|
|
|
* Dynamic BFP vector allocation
|
|
|
|
* Functions for allocating and deallocating BFP vectors dynamically from the heap.
|
|
* `bfp_sXX_alloc()`, `bfp_complex_sXX_alloc()`
|
|
* `bfp_sXX_dealloc()`, `bfp_complex_sXX_dealloc()`
|
|
|
|
* Multiply-accumulate functions
|
|
|
|
* A handful of element-wise multiply-accumulate functions have been added for both 16-bit and
|
|
32-bit, and both real and complex vector types. e.g...
|
|
|
|
* `bfp_sXX_macc()` -- Element-wise multiply accumulate for real 16/32-bit vectors
|
|
* `bfp_sXX_nmacc()` -- Element-wise negated multiply accumulate (i.e. multiply-subtract) for
|
|
real vectors
|
|
* `bfp_complex_sXX_macc()` -- Element-wise multiply accumulate for complex vectors.
|
|
* `bfp_complex_sXX_conj_macc()` -- Element-wise conjugate multiply accumulate for complex
|
|
vectors.
|
|
* (and various others)
|
|
|
|
* `bfp_complex_sXX_conjugate()` -- Get the complex conjugate of a vector
|
|
* `bfp_complex_sXX_energy()` -- Compute the sum of a complex vector's elements' squared
|
|
magnitudes.
|
|
* `bfp_sXX_use_exponent()` / `bfp_complex_sXX_use_exponent()` -- Force BFP vector to encode
|
|
mantissas using specified exponent (i.e. convert to specified Q-format)
|
|
* `bfp_s32_convolve_valid()` / `bfp_complex_s32_convolve_same()` -- Filter a 32-bit signal using a
|
|
short convolution kernel. Both "valid" and "same" padding modes are supported.
|
|
* `xs3_vect_sXX_add_scalar()` / `xs3_vect_complex_sXX_add_scalar()` -- Functions to add scalar to
|
|
a vector (16/32-bit real/complex)
|
|
|
|
|
|
* Vector API
|
|
|
|
* Functions supporting mixed-depth operations
|
|
|
|
* `xs3_mat_mul_s8_x_s8_yield_s32()` -- Multiply-accumulate an 8-bit vector by an 8-bit matrix
|
|
into 32-bit accumulators.
|
|
* `xs3_mat_mul_s8_x_s16_yield_s32()` -- Multiply a 16-bit vector by an 8-bit matrix for a 32-bit
|
|
result.
|
|
* `xs3_vect_s8_is_negative()` -- Determine whether each element of an 8-bit vector is negative.
|
|
* `xs3_vect_s16_extract_high_byte()` -- Extract the most significant byte of each element of a
|
|
16-bit vector.
|
|
* `xs3_vect_s16_extract_low_byte()` -- Extract the least significant byte of each element of a
|
|
16-bit vector.
|
|
|
|
* Memory ops
|
|
|
|
* `xs3_vect_s32_zip()` -- Interleave elements from two `int32_t` vectors.
|
|
* `xs3_vect_s32_unzip()` -- De-interleave elements from a `int32_t` vector.
|
|
* `xs3_vect_s32_copy()` -- Copy an `int32_t` vector.
|
|
* `xs3_memcpy()` -- Quickly copy word-aligned vector to another word-aligned vector.
|
|
* Various low-level functions used in the implementation of the high-level multiply-accumulate
|
|
functions (e.g. `xs3_vect_s32_macc()`).
|
|
* `xs2_vect_s32_convolve_valid()` / `xs3_vect_complex_s32_convolve_same()` -- Filter a 32-bit
|
|
signal using a short convolution kernel. Both "valid" and "same" padding modes are supported.
|
|
* `xs3_vect_sXX_add_scalar()` / `xs3_vect_complex_sXX_add_scalar()` -- Add a scalar to a 16- or
|
|
32-bit real or complex vector.
|
|
|
|
* IEEE754 single-precision float vector functions
|
|
|
|
* `xs3_vect_f32_fft_forward()` / `xs3_vect_f32_fft_inverse()` -- Forward/Inverse FFT functions
|
|
for vectors of floats.
|
|
* `xs3_vect_f32_max_exponent()` -- Get maximum exponent from vector of floats.
|
|
* `xs3_vect_f32_to_s32()` / `xs3_vect_s32_to_f32()` -- Convert between float vector and BFP
|
|
vector.
|
|
* `xs3_vect_f32_dot()` -- Inner product between two float vectors.
|
|
|
|
* `xs3_vect_sXX_max_elementwise()` / `xs3_vect_sXX_min_elementwise()` -- Element-wise maximum and
|
|
minimum between two 16-/32-bit vectors.
|
|
|
|
* DCT API
|
|
|
|
* `dctXX_forward()` / `dctXX_inverse()` -- Forward (type-II) and inverse (type-III) `XX`-point DCT
|
|
implementations.
|
|
|
|
* Current sizes supported are `6`, `8`, `12`, `16`, `24`, `32`, `48` and `64`
|
|
|
|
* `dct8x8_forward()` / `dct8x8_inverse()` -- Fast 2D 8-by-8 forward and inverse DCTs.
|
|
|
|
|
|
Miscellaneous
|
|
*************
|
|
|
|
* Unit tests have been refactored to make use of Unity fixtures.
|
|
* Added example apps: `vect_demo`, `bfp_demo`, `fft_demo` and `filter_demo`
|
|
* Removed configuration support for `XS3_MATH_VECTOR_TAIL_SUPPORT`
|
|
* Added `QXX()` and `FXX()` macros (e.g. `Q24()`; taken from `lib_dsp`) for converting (constants)
|
|
between floating-point and fixed-point values.
|
|
* Added python scripts to generate code for filters
|
|
|
|
* `lib_xs3_math/script/gen_fir_filter_s16.py`
|
|
* `lib_xs3_math/script/gen_fir_filter_s32.py`
|
|
* `lib_xs3_math/script/gen_biquad_filter_s32.py`
|
|
|
|
* Changed low-level API so that each function `foo()` that has an associated 'prepare' function (to
|
|
calculate shifts or output exponents) can be prepared with `foo_prepare()`. This makes the
|
|
low-level API more consistent.
|
|
* Separated filtering-related unit tests into a separate unit test application.
|
|
* Various improvements to CMake project files.
|
|
|
|
* Includes automatic fetching of Unity repository during build
|
|
|
|
1.0.0
|
|
-----
|
|
|
|
* Initial version
|
|
|