init

2025-12-11 09:43:42 +08:00
commit d8b2974133
1822 changed files with 280037 additions and 0 deletions
--- a/lib_xcore_math/CHANGELOG.rst
+++ b/lib_xcore_math/CHANGELOG.rst
@@ -0,0 +1,293 @@
+lib_xcore_math change log
+=========================
+
+2.4.0
+-----
+
+  * CHANGED: Documentation updated
+  * CHANGED: Renamed examples to app_
+  * FIXED:   Added missing xcore_math.h
+  * REMOVED: xmos_cmake_toolchain submodule
+
+2.3.0
+-----
+
+  * CHANGED: Examples and tests to build using XCommon CMake
+
+2.2.0
+-----
+
+  * ADDED:   `filter_biquad_sat_s32()` - Apply a biquad filter to a 32-bit
+    signal with saturation.
+  * ADDED:   XCommon CMake build system support.
+  * CHANGED: Removed deprecated `numpy` types in `xmath_script.py` (issue #139).
+  * CHANGED: Made FFT documentation clearer (issue #169).
+  * FIXED:   Bug (issue #170) in `xs3_memcpy()`.
+  * FIXED:   Saturation in `filter_fir_s32()` when `shift <= 0`.
+
+2.1.3
+-----
+
+  * Fixes bug (issue #147) in `s16_to_s32()`.
+  * Fixes bug (issue #146) in `bfp_s32_macc()` and `bfp_s32_nmacc()`.
+  * Fixes bug with the `vect_s32_prepare_api` not appearing in the
+    documentation.
+  * Fixes bug in `bfp_s32_mean()` and `bfp_s16_mean()` when hitting a corner
+    case scenario.
+  * Cleans up internal functions.
+  * Allows compiling and running demos and tests on Windows Native x86
+    platforms.
+  * Removes several warnings.
+
+2.1.2
+-----
+
+  * Optimisation fix (issue #128) for `filter_fir_s32()`
+  * Documentation improvements.
+
+2.1.1
+-----
+
+  * Fixes bug (issue #116) in `vect_packed_complex_s32_macc()`.
+  * Fixes bug (issue #119) in `filter_fir_s32()`.
+  * Adds `--scale` option to the filter conversion script
+    `gen_biquad_filter_s32.py`
+  * If internal filter states (outputs from internal biquad sections) grow too
+    large, integer overflows may occur. Using the `--scale` option can help avoid
+    this by effectively applying a gain factor to all coefficients.
+  * Fixes bug (mentioned in issue #119) in the `gen_fir_filter_s32.py` and
+    `gen_fir_filter_s16.py` filter conversion scripts where, in a certain corner
+    case, filter coefficients can overflow.
+
+2.1.0
+-----
+
+  * Adds several new operations for IEEE float vectors.
+  * Corrects `module_build_info` for legacy build tools.
+  * Fixes potential issue with include paths.
+
+2.0.2
+-----
+
+  * Updated CMake configuration to support Darwin platform.
+
+2.0.1
+-----
+
+  * Bugfix: Fixed issue with including ``xmath/xmath.h`` from XC files.
+  * Doc Update: Corrected instructions for configuring CMake using XS3
+    toolchain.
+
+
+Legacy release history
+======================
+
+2.0.0
+-----
+
+Major, backwards compatibility-breaking update to the library.
+
+Background
+**********
+
+This update does not add new features (compared to 1.1.0) to the library.  Instead, it is a
+major refactoring of the library in a few different ways.  The overarching purpose of this
+refactoring is to generalize the library so that when any future xcore ISAs are released, support
+for those architectures can be included within this library in a minimally disruptive way.  The
+xcore XS3 architecture currently remains the only xcore ISA officially supported by the library.
+
+Major Changes
+*************
+
+As mentioned, the update from 1.1.0 to 2.0.0 does not add, remove or functionally change
+supported features.  Rather, the changes are primarily related to organization and the conventions
+used.  However, these changes do break backwards compatibility; hence the major version increment.
+
+The relevant major changes are:
+
+* **Library name has changed from `lib_xs3_math` to `lib_xcore_math`**
+
+  * This is to reflect that this library is not intended to be deprecated when future xcore ISA
+    versions are released.
+
+* **Library re-organized into multiple sub-APIs**
+
+  * The organization of the library into the 'high-level' (BFP) and 'low-level' (non-BFP) APIs has
+    grown cumbersome and no longer cleanly fits the content of the library. To this end, the
+    operations made available by this library have been re-organized into several sub-APIs. These
+    APIs are all versioned together and are in many cases interdependent. This new grouping is
+    primarily for conceptual simplicity.
+
+    * BFP API -- Mostly unchanged from the previous high-level API.
+    * Vector API -- Corresponds to most of the functionality in the previous low-level API, less the
+      operations that were moved to the new APIs.
+    * Filtering API -- Operations related to supported linear filters (FIR and Biquad).
+    * FFT API -- Collects the FFT-related functions (both low-level and high-level) from the
+      previous API and groups them together.
+    * DCT API -- Like the FFT API, collects the DCT-related functions.
+    * Scalar API -- A small API for implementations of scalar operations, with particular emphasis
+      on the non-IEEE 754 floating-point scalars provided (but also includes some ``float``
+      operations).
+
+* **Many functions renamed**
+
+  * Previous versions of this library used a function naming convention whereby the majority of
+    low-level functions (and some types) had names prefixed with ``xs3_``, even where those
+    functions were written in C (rather than XS3 assembly) and were not intimately related to the
+    specifics of the XS3 hardware (i.e. where the same C implementation could plausibly be used in
+    future ISA versions).
+  * To that end, most functions prefixed with ``xs3_`` have had that prefix removed. This way, when
+    a future ISA is released (and once support is added to this library), applications can be
+    retargeted at the newer architecture without the unnecessary effort of going through the code to
+    rename all the function calls.
+
+    * e.g.  ``xs3_vect_s32_mul() --> vect_s32_mul()``
+    * Note that most BFP function names remain unchanged.
+
+  * The intention going forward is that the public API should avoid ISA version-specific naming when
+    the object being named is not conceptually specific to a particular ISA (except possibly where
+    optimizing different ISA versions necessitates mutually incompatible implementations).
+
+
+1.1.0
+-----
+
+Major Changes
+*************
+
+* Support for channel-pair related types and operations has been dropped. These were considered to
+  be too narrowly focused on making use of a single optimization (stereo FFT).
+
+  * This is a backwards compatibility-breaking change, requiring a major version increment.
+
+* Added various scalar arithmetic functions for `float_s32_t` type.
+
+* Added Discrete Cosine Transform API.
+
+* Added various trig and exponential functions.
+
+Bugfixes
+********
+
+* Fixed bug in `bfp_fft_inverse_stereo()` where length of output BFP vector was half of correct
+  length.
+
+New Functions
+*************
+* BFP API
+
+  * FFT spectrum unpacking
+
+    * `bfp_fft_unpack_mono()` -- Used to expand the output spectrum from `bfp_fft_forward_mono()`
+      from `FFT_N/2` elements (with the Nyquist component packed into the DC component) to
+      `FFT_N/2 + 1` elements. This is useful as many complex operations behave undesirably on the
+      packed representation.
+    * `bfp_fft_pack_mono()` -- Opposite of `bfp_fft_unpack_mono()`. Used to repack the spectrum into
+      a form suitable for calling `bfp_fft_inverse_mono()`.
+
+  * Dynamic BFP vector allocation
+
+    * Functions for allocating and deallocating BFP vectors dynamically from the heap.
+    * `bfp_sXX_alloc()`, `bfp_complex_sXX_alloc()`
+    * `bfp_sXX_dealloc()`, `bfp_complex_sXX_dealloc()`
+
+  * Multiply-accumulate functions
+
+    * A handful of element-wise multiply-accumulate functions have been added for both 16-bit and
+      32-bit, and both real and complex vector types, e.g.:
+
+    * `bfp_sXX_macc()` -- Element-wise multiply accumulate for real 16/32-bit vectors
+    * `bfp_sXX_nmacc()` -- Element-wise negated multiply accumulate (i.e. multiply-subtract) for
+      real vectors
+    * `bfp_complex_sXX_macc()` -- Element-wise multiply accumulate for complex vectors.
+    * `bfp_complex_sXX_conj_macc()` -- Element-wise conjugate multiply accumulate for complex
+      vectors.
+    * (and various others)
+
+  * `bfp_complex_sXX_conjugate()` -- Get the complex conjugate of a vector
+  * `bfp_complex_sXX_energy()` -- Compute the sum of a complex vector's elements' squared
+    magnitudes.
+  * `bfp_sXX_use_exponent()` / `bfp_complex_sXX_use_exponent()` -- Force BFP vector to encode
+    mantissas using specified exponent (i.e. convert to specified Q-format)
+  * `bfp_s32_convolve_valid()` / `bfp_complex_s32_convolve_same()` -- Filter a 32-bit signal using a
+    short convolution kernel. Both "valid" and "same" padding modes are supported.
+  * `xs3_vect_sXX_add_scalar()` / `xs3_vect_complex_sXX_add_scalar()` -- Functions to add scalar to
+    a vector (16/32-bit real/complex)
+
+
+* Vector API
+
+  * Functions supporting mixed-depth operations
+
+    * `xs3_mat_mul_s8_x_s8_yield_s32()` -- Multiply-accumulate an 8-bit vector by an 8-bit matrix
+      into 32-bit accumulators.
+    * `xs3_mat_mul_s8_x_s16_yield_s32()` -- Multiply a 16-bit vector by an 8-bit matrix for a 32-bit
+      result.
+    * `xs3_vect_s8_is_negative()` -- Determine whether each element of an 8-bit vector is negative.
+    * `xs3_vect_s16_extract_high_byte()` -- Extract the most significant byte of each element of a
+      16-bit vector.
+    * `xs3_vect_s16_extract_low_byte()` -- Extract the least significant byte of each element of a
+      16-bit vector.
+
+  * Memory ops
+
+    * `xs3_vect_s32_zip()` -- Interleave elements from two `int32_t` vectors.
+    * `xs3_vect_s32_unzip()` -- De-interleave elements from an `int32_t` vector.
+    * `xs3_vect_s32_copy()` -- Copy an `int32_t` vector.
+    * `xs3_memcpy()` -- Quickly copy word-aligned vector to another word-aligned vector.
+  * Various low-level functions used in the implementation of the high-level multiply-accumulate
+    functions (e.g. `xs3_vect_s32_macc()`).
+  * `xs3_vect_s32_convolve_valid()` / `xs3_vect_complex_s32_convolve_same()` -- Filter a 32-bit
+    signal using a short convolution kernel. Both "valid" and "same" padding modes are supported.
+  * `xs3_vect_sXX_add_scalar()` / `xs3_vect_complex_sXX_add_scalar()` -- Add a scalar to a 16- or
+    32-bit real or complex vector.
+
+  * IEEE754 single-precision float vector functions
+
+    * `xs3_vect_f32_fft_forward()` / `xs3_vect_f32_fft_inverse()` -- Forward/Inverse FFT functions
+      for vectors of floats.
+    * `xs3_vect_f32_max_exponent()` -- Get maximum exponent from vector of floats.
+    * `xs3_vect_f32_to_s32()` / `xs3_vect_s32_to_f32()` -- Convert between float vector and BFP
+      vector.
+    * `xs3_vect_f32_dot()` -- Inner product between two float vectors.
+
+  * `xs3_vect_sXX_max_elementwise()` / `xs3_vect_sXX_min_elementwise()` -- Element-wise maximum and
+    minimum between two 16-/32-bit vectors.
+
+* DCT API
+
+  * `dctXX_forward()` / `dctXX_inverse()` -- Forward (type-II) and inverse (type-III) `XX`-point DCT
+    implementations.
+
+    * Current sizes supported are `6`, `8`, `12`, `16`, `24`, `32`, `48` and `64`
+
+  * `dct8x8_forward()` / `dct8x8_inverse()` -- Fast 2D 8-by-8 forward and inverse DCTs.
+
+
+Miscellaneous
+*************
+
+* Unit tests have been refactored to make use of Unity fixtures.
+* Added example apps: `vect_demo`, `bfp_demo`, `fft_demo` and `filter_demo`
+* Removed configuration support for `XS3_MATH_VECTOR_TAIL_SUPPORT`
+* Added `QXX()` and `FXX()` macros (e.g. `Q24()`; taken from `lib_dsp`) for converting (constants)
+  between floating-point and fixed-point values.
+* Added python scripts to generate code for filters
+
+  * `lib_xs3_math/script/gen_fir_filter_s16.py`
+  * `lib_xs3_math/script/gen_fir_filter_s32.py`
+  * `lib_xs3_math/script/gen_biquad_filter_s32.py`
+
+* Changed low-level API so that each function `foo()` that has an associated 'prepare' function (to
+  calculate shifts or output exponents) can be prepared with `foo_prepare()`. This makes the
+  low-level API more consistent.
+* Separated filtering-related unit tests into a separate unit test application.
+* Various improvements to CMake project files.
+
+  * Includes automatic fetching of Unity repository during build
+
+1.0.0
+-----
+
+  * Initial version
+