3d_audio/lib_xcore_math/CHANGELOG.rst
Steven Dan d8b2974133 init
2025-12-11 09:43:42 +08:00


lib_xcore_math change log
=========================
2.4.0
-----
* CHANGED: Documentation updated
* CHANGED: Renamed examples to app_
* FIXED: Added missing xcore_math.h
* REMOVED: xmos_cmake_toolchain submodule
2.3.0
-----
* CHANGED: Examples and tests to build using XCommon CMake
2.2.0
-----
* ADDED: `filter_biquad_sat_s32()` - Apply a biquad filter to a 32-bit
signal with saturation.
* ADDED: XCommon CMake build system support.
* CHANGED: Removed deprecated `numpy` types in `xmath_script.py` (issue #139).
* CHANGED: Made FFT documentation clearer (issue #169).
* FIXED: Bug (issue #170) in `xs3_memcpy()`.
* FIXED: Saturation in `filter_fir_s32()` when `shift <= 0`.
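
The two saturation-related entries above refer to clamping a result that does not fit in 32 bits instead of letting it wrap around. A minimal sketch of that idea in plain C follows. `sat_s32()` is a hypothetical helper written for this illustration, not a library function, and the clamp bounds used here (the full `int32_t` range) are an assumption; the library's own bounds may differ.

```c
#include <stdint.h>

/* Hypothetical helper, not part of lib_xcore_math: clamp a 64-bit
 * intermediate value into the int32_t range rather than wrapping.
 * The bounds are an assumption made for this sketch. */
static int32_t sat_s32(int64_t x)
{
    if (x > INT32_MAX) return INT32_MAX;  /* positive overflow: clamp high */
    if (x < INT32_MIN) return INT32_MIN;  /* negative overflow: clamp low  */
    return (int32_t)x;                    /* in range: value is unchanged  */
}
```

With wrapping arithmetic a large positive filter output can flip sign, whereas a saturating result stays pinned at the nearest representable value.
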
2.1.3
-----
* Fixes bug (issue #147) in `s16_to_s32()`.
* Fixes bug (issue #146) in `bfp_s32_macc()` and `bfp_s32_nmacc()`.
* Fixes bug with the `vect_s32_prepare_api` not appearing in the
documentation.
* Fixes bug in `bfp_s32_mean()` and `bfp_s16_mean()` when hitting a corner
case scenario.
* Cleans up internal functions.
* Allows compiling and running demos and tests on Windows Native x86
platforms.
* Removes several warnings.
2.1.2
-----
* Optimisation fix (issue #128) for `filter_fir_s32()`
* Documentation improvements.
2.1.1
-----
* Fixes bug (issue #116) in `vect_packed_complex_s32_macc()`.
* Fixes bug (issue #119) in `filter_fir_s32()`.
* Adds `--scale` option to the filter conversion script
`gen_biquad_filter_s32.py`
* If internal filter states (outputs from internal biquad sections) grow too
large, integer overflows may occur. Using the `--scale` option can help avoid
this by effectively applying a gain factor to all coefficients.
* Fixes bug (mentioned in issue #119) in the `gen_fir_filter_s32.py` and
`gen_fir_filter_s16.py` filter conversion scripts where in a certain corner
case filter coefficients can overflow.
2.1.0
-----
* Adds several new operations for IEEE float vectors.
* Corrects `module_build_info` for legacy build tools.
* Fixes potential issue with include paths.
2.0.2
-----
* Updated CMake configuration to support Darwin platform.
2.0.1
-----
* Bugfix: Fixed issue with including ``xmath/xmath.h`` from XC files.
* Doc Update: Corrected instructions for configuring CMake using XS3
toolchain.
Legacy release history
======================
2.0.0
-----
Major, backwards compatibility-breaking update to the library.
Background
**********
This update does not add new features (compared to 1.1.0) to the library. Instead, it is a
major refactoring of the library in a few different ways. The overarching purpose of this
refactoring is to generalize the library so that when any future xcore ISAs are released, support
for those architectures can be included within this library in a minimally disruptive way. The
xcore XS3 architecture currently remains the only xcore ISA officially supported by the library.
Major Changes
*************
As mentioned, the update from 1.1.0 to 2.0.0 does not add, remove or functionally change
supported features. Rather, the changes are primarily related to organization and the conventions
used. However, these changes do break backwards compatibility; hence the major version increment.
The relevant major changes are:
* **Library name has changed from `lib_xs3_math` to `lib_xcore_math`**
* This is to reflect that this library is not intended to be deprecated when future xcore ISA
versions are released.
* **Library re-organized into multiple sub-APIs**
* The organization of the library into the 'high-level' (BFP) and 'low-level' (non-BFP) APIs has
grown cumbersome and no longer cleanly fits the content of the library. To this end, the
operations made available by this library have been re-organized into several sub-APIs. These
APIs are all versioned together and are in many cases interdependent. This new grouping is
primarily for conceptual simplicity.
* BFP API -- Mostly unchanged from the previous high-level API.
* Vector API -- Corresponds to most of the functionality in the previous low-level API, less the
operations that were moved to the new APIs.
* Filtering API -- Operations related to supported linear filters (FIR and Biquad).
* FFT API -- Collects the FFT-related functions (both low-level and high-level) from the
previous API and groups them together.
* DCT API -- Like the FFT API, collects the DCT-related functions.
* Scalar API -- A small API for implementations of scalar operations, with particular emphasis
on the non-IEEE 754 floating-point scalars provided (but also includes some ``float``
operations).
* **Many functions renamed**
* Previous versions of this library used a function naming convention whereby the majority of
low-level functions (and some types) had names prefixed with ``xs3_``, even where those
functions were written in C (rather than XS3 assembly) and were not intimately related to the
specifics of the XS3 hardware (i.e. where the same C implementation could plausibly be used in
future ISA versions).
* To address this, most functions prefixed with ``xs3_`` have had that prefix removed. This way, when
a future ISA is released (and once support is added to this library), applications can be
retargeted at the newer architecture without the unnecessary effort of going through the code to
rename all the function calls.
* e.g. ``xs3_vect_s32_mul() --> vect_s32_mul()``
* Note that most BFP function names remain unchanged.
* The intention going forward is that the public API should avoid ISA version-specific naming when
the object being named is not conceptually specific to a particular ISA (except possibly where
optimizing different ISA versions necessitates mutually incompatible implementations).
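
Because of the rename described above (e.g. ``xs3_vect_s32_mul()`` to ``vect_s32_mul()``), code written against 1.x needs its call sites updated. One possible way to stage that migration is a small application-side shim that maps old names onto new ones. This is sketched here as an assumption, not as something the library provides. ``vect_demo_sum()`` is a hypothetical stand-in so the sketch compiles on its own; it is not a library function and its signature does not reflect any real one.

```c
#include <stdint.h>

/* Hypothetical stand-in for a function that lost its xs3_ prefix.
 * In a real project the declaration would come from the library headers. */
static int32_t vect_demo_sum(const int32_t b[], unsigned length)
{
    int32_t total = 0;
    for (unsigned k = 0; k < length; k++) {
        total += b[k];
    }
    return total;
}

/* Hypothetical application-side compatibility shim: legacy call sites
 * that still use the old prefixed name resolve to the new name. */
#define xs3_vect_demo_sum vect_demo_sum
```

A shim like this is only a stop-gap. The aliases can be deleted once every call site uses the unprefixed names.
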
1.1.0
-----
Major Changes
*************
* Support for channel-pair related types and operations has been dropped. These were considered to
be too narrowly focused on making use of a single optimization (stereo FFT).
* This is a backwards compatibility-breaking change, requiring a major version increment.
* Added various scalar arithmetic functions for `float_s32_t` type.
* Adds Discrete Cosine Transform API
* Adds various trig and exponential functions.
Bugfixes
********
* Fixed bug in `bfp_fft_inverse_stereo()` where length of output BFP vector was half of correct
length.
New Functions
*************
* BFP API
* FFT spectrum unpacking
* `bfp_fft_unpack_mono()` -- Used to expand the output spectrum from `bfp_fft_forward_mono()`
from `FFT_N/2` elements (with the Nyquist component packed into the DC component) to
`FFT_N/2 + 1` elements. This is useful as many complex operations behave undesirably on the
packed representation.
* `bfp_fft_pack_mono()` -- Opposite of `bfp_fft_unpack_mono()`. Used to repack the spectrum into
a form suitable for calling `bfp_fft_inverse_mono()`.
* Dynamic BFP vector allocation
* Functions for allocating and deallocating BFP vectors dynamically from the heap.
* `bfp_sXX_alloc()`, `bfp_complex_sXX_alloc()`
* `bfp_sXX_dealloc()`, `bfp_complex_sXX_dealloc()`
* Multiply-accumulate functions
* A handful of element-wise multiply-accumulate functions have been added for both 16-bit and
32-bit, and both real and complex vector types, e.g.:
* `bfp_sXX_macc()` -- Element-wise multiply accumulate for real 16/32-bit vectors
* `bfp_sXX_nmacc()` -- Element-wise negated multiply accumulate (i.e. multiply-subtract) for
real vectors
* `bfp_complex_sXX_macc()` -- Element-wise multiply accumulate for complex vectors.
* `bfp_complex_sXX_conj_macc()` -- Element-wise conjugate multiply accumulate for complex
vectors.
* (and various others)
* `bfp_complex_sXX_conjugate()` -- Get the complex conjugate of a vector
* `bfp_complex_sXX_energy()` -- Compute the sum of a complex vector's elements' squared
magnitudes.
* `bfp_sXX_use_exponent()` / `bfp_complex_sXX_use_exponent()` -- Force BFP vector to encode
mantissas using specified exponent (i.e. convert to specified Q-format)
* `bfp_s32_convolve_valid()` / `bfp_s32_convolve_same()` -- Filter a 32-bit signal using a
short convolution kernel. Both "valid" and "same" padding modes are supported.
* `bfp_sXX_add_scalar()` / `bfp_complex_sXX_add_scalar()` -- Functions to add a scalar to
a vector (16/32-bit real/complex)
* Vector API
* Functions supporting mixed-depth operations
* `xs3_mat_mul_s8_x_s8_yield_s32()` -- Multiply-accumulate an 8-bit vector by an 8-bit matrix
into 32-bit accumulators.
* `xs3_mat_mul_s8_x_s16_yield_s32()` -- Multiply a 16-bit vector by an 8-bit matrix for a 32-bit
result.
* `xs3_vect_s8_is_negative()` -- Determine whether each element of an 8-bit vector is negative.
* `xs3_vect_s16_extract_high_byte()` -- Extract the most significant byte of each element of a
16-bit vector.
* `xs3_vect_s16_extract_low_byte()` -- Extract the least significant byte of each element of a
16-bit vector.
* Memory ops
* `xs3_vect_s32_zip()` -- Interleave elements from two `int32_t` vectors.
* `xs3_vect_s32_unzip()` -- De-interleave elements from an `int32_t` vector.
* `xs3_vect_s32_copy()` -- Copy an `int32_t` vector.
* `xs3_memcpy()` -- Quickly copy word-aligned vector to another word-aligned vector.
* Various low-level functions used in the implementation of the high-level multiply-accumulate
functions (e.g. `xs3_vect_s32_macc()`).
* `xs3_vect_s32_convolve_valid()` / `xs3_vect_s32_convolve_same()` -- Filter a 32-bit
signal using a short convolution kernel. Both "valid" and "same" padding modes are supported.
* `xs3_vect_sXX_add_scalar()` / `xs3_vect_complex_sXX_add_scalar()` -- Add a scalar to a 16- or
32-bit real or complex vector.
* IEEE754 single-precision float vector functions
* `xs3_vect_f32_fft_forward()` / `xs3_vect_f32_fft_inverse()` -- Forward/Inverse FFT functions
for vectors of floats.
* `xs3_vect_f32_max_exponent()` -- Get maximum exponent from vector of floats.
* `xs3_vect_f32_to_s32()` / `xs3_vect_s32_to_f32()` -- Convert between float vector and BFP
vector.
* `xs3_vect_f32_dot()` -- Inner product between two float vectors.
* `xs3_vect_sXX_max_elementwise()` / `xs3_vect_sXX_min_elementwise()` -- Element-wise maximum and
minimum between two 16-/32-bit vectors.
* DCT API
* `dctXX_forward()` / `dctXX_inverse()` -- Forward (type-II) and inverse (type-III) `XX`-point DCT
implementations.
* Current sizes supported are `6`, `8`, `12`, `16`, `24`, `32`, `48` and `64`
* `dct8x8_forward()` / `dct8x8_inverse()` -- Fast 2D 8-by-8 forward and inverse DCTs.
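
The packed spectrum layout mentioned for `bfp_fft_unpack_mono()` and `bfp_fft_pack_mono()` in the BFP API list above can be illustrated on plain arrays. The sketch assumes that "packed into the DC component" means the real part of the Nyquist bin is stored in the imaginary part of bin 0, which is possible because both bins are purely real for a real input signal. The type `cplx_s32` and the functions `spectrum_unpack()` and `spectrum_pack()` are hypothetical names for this illustration; the library functions operate on BFP vectors and have different signatures.

```c
#include <stdint.h>

/* Hypothetical complex element type used only in this sketch. */
typedef struct { int32_t re; int32_t im; } cplx_s32;

/* Expand a packed spectrum of n_fft/2 bins into n_fft/2 + 1 bins.
 * Assumption: packed[0].im holds the real part of the Nyquist bin.
 * `out` must have room for n_fft/2 + 1 elements. */
static void spectrum_unpack(cplx_s32 out[], const cplx_s32 packed[], unsigned n_fft)
{
    const unsigned half = n_fft / 2;
    for (unsigned k = 0; k < half; k++) {
        out[k] = packed[k];
    }
    out[half].re = packed[0].im;  /* Nyquist bin is purely real */
    out[half].im = 0;
    out[0].im = 0;                /* DC bin is purely real */
}

/* Inverse of spectrum_unpack(): fold the Nyquist bin back into bin 0. */
static void spectrum_pack(cplx_s32 packed[], const cplx_s32 in[], unsigned n_fft)
{
    const unsigned half = n_fft / 2;
    for (unsigned k = 0; k < half; k++) {
        packed[k] = in[k];
    }
    packed[0].im = in[half].re;
}
```

Element-wise complex operations such as multiplication treat bin 0 as an ordinary complex number, so applying them to the packed form would mix the DC and Nyquist values. That is the kind of undesirable behaviour the unpacked form avoids.
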
Miscellaneous
*************
* Unit tests have been refactored to make use of Unity fixtures.
* Added example apps: `vect_demo`, `bfp_demo`, `fft_demo` and `filter_demo`
* Removed configuration support for `XS3_MATH_VECTOR_TAIL_SUPPORT`
* Added `QXX()` and `FXX()` macros (e.g. `Q24()`; taken from `lib_dsp`) for converting (constants)
between floating-point and fixed-point values.
* Added python scripts to generate code for filters
* `lib_xs3_math/script/gen_fir_filter_s16.py`
* `lib_xs3_math/script/gen_fir_filter_s32.py`
* `lib_xs3_math/script/gen_biquad_filter_s32.py`
* Changed low-level API so that each function `foo()` that has an associated 'prepare' function (to
calculate shifts or output exponents) can be prepared with `foo_prepare()`. This makes the
low-level API more consistent.
* Separated filtering-related unit tests into a separate unit test application.
* Various improvements to CMake project files.
* Includes automatic fetching of Unity repository during build
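
The conversion performed by the `QXX()` and `FXX()` macros listed above can be sketched as follows. The `SKETCH_` macros are hypothetical and written only for this illustration; the library's own macros may differ in rounding behaviour and result type. In a Q24 format, a real value `x` is stored as the integer nearest to `x * 2^24`.

```c
#include <stdint.h>

/* Hypothetical macros for illustration only (not the library's QXX()/FXX()).
 * SKETCH_Q: real value -> fixed-point integer with `frac_bits` fractional
 *           bits, rounding half away from zero.
 * SKETCH_F: fixed-point integer -> double.
 * Note that `x` is evaluated more than once, so these suit constants only. */
#define SKETCH_Q(frac_bits, x) \
    ((int32_t)((x) * (double)(1LL << (frac_bits)) + ((x) >= 0 ? 0.5 : -0.5)))
#define SKETCH_F(frac_bits, q) \
    ((double)(q) / (double)(1LL << (frac_bits)))

#define SKETCH_Q24(x) SKETCH_Q(24, x)
#define SKETCH_F24(q) SKETCH_F(24, q)
```

For example, `SKETCH_Q24(1.0)` evaluates to `1 << 24`, and `SKETCH_F24(1 << 23)` evaluates to `0.5`.
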
1.0.0
-----
* Initial version