This commit is contained in:
Steven Dan
2025-12-11 09:43:42 +08:00
commit d8b2974133
1822 changed files with 280037 additions and 0 deletions


@@ -0,0 +1,18 @@
####################################
lib_xcore_math: xcore optimised math
####################################
.. toctree::
:maxdepth: 1
:caption: Contents:
src/introduction
src/getting_started
src/bfp_background
src/reference/reference_index
src/examples
src/tests


@@ -0,0 +1,117 @@
.. _bfp_background:
*******************************
Block Floating-Point background
*******************************
Block Floating-Point vectors
============================
A standard (IEEE) floating-point object can exist either as a scalar, e.g.
.. code-block:: c
//Single IEEE floating-point variable
float foo;
or as a vector, e.g.
.. code-block:: c
//Array of IEEE floating-point variables
float foo[20];
Standard floating-point values carry both a mantissa :math:`m` and an exponent :math:`p`, such that
the logical value represented by such a variable is :math:`m\cdot2^p`. When you have a vector of
standard floating-point values, each element of the vector carries its own mantissa and its own
exponent: :math:`m[k]\cdot2^{p[k]}`.
.. image:: images/bfp_bg_fig1.png
By contrast, block floating-point objects have a vector of mantissas :math:`\bar{m}` which all share
the same exponent :math:`p`, such that the logical value of the element at index :math:`k` is
:math:`m[k]\cdot2^p`.
.. code-block:: c
struct {
// Array of mantissas
int32_t mant[20];
// Shared exponent
int32_t exp;
} bfp_vect;
.. image:: images/bfp_bg_fig2.png
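As a minimal sketch of the relationship :math:`m[k]\cdot2^p`, the following recovers the logical value of one element. The type ``demo_bfp_t`` and the function ``demo_bfp_value()`` are hypothetical names used only for illustration; they are not part of ``lib_xcore_math``.

```c
#include <math.h>
#include <stdint.h>

/* Illustrative stand-in for a BFP vector; not a lib_xcore_math type. */
typedef struct {
    int32_t mant[4];  /* Array of mantissas */
    int32_t exp;      /* Shared exponent */
} demo_bfp_t;

/* Logical value of element k: mant[k] * 2^exp. */
double demo_bfp_value(const demo_bfp_t *v, unsigned k)
{
    return ldexp((double)v->mant[k], (int)v->exp);
}
```

With mantissas ``{1, -3, 1024, 0}`` and a shared exponent of ``-2``, the first three elements evaluate to 0.25, -0.75 and 256.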
.. _headroom_intro:
Headroom
========
With a given exponent, :math:`p`, the largest value that can be represented by a 32-bit BFP vector
is given by a maximal mantissa (:math:`2^{31}-1`), for a logical value of
:math:`(2^{31}-1)\cdot2^p`. The smallest non-zero value that an element can represent is
:math:`1\cdot2^p`.
Because all elements must share a single exponent, in order to avoid overflow or saturation of the
largest magnitude values, the exponent of a BFP vector is constrained by the element with the
largest (logical) value. The drawback to this is that when the elements of a BFP vector represent a
large dynamic range -- that is, where the largest magnitude element is many, many times larger than
the smallest (non-zero) magnitude element -- the smaller magnitude elements effectively have fewer
bits of precision.
Consider a 2-element BFP vector intended to carry the values :math:`2^{20}` and :math:`255 \cdot
2^{-10}`. One way this vector can be represented is to use an exponent of :math:`0`.
.. code-block:: c
struct {
int32_t mant[2];
int32_t exp;
} vect = { { (1<<20), (0xFF >> 10) }, 0 };
.. image:: images/bfp_bg_fig3.png
In the diagram above, the fractional bits (shown in red text) are discarded, as the mantissa is only
32 bits. Then, with :math:`0` as the exponent, ``mant[1]`` underflows to :math:`0`. Meanwhile, the
11 most significant bits of ``mant[0]`` are all zeros.
The headroom of a signed integer is the number of *redundant* leading sign bits. Equivalently, it is
the number of bits that a mantissa can be left-shifted without losing any information. In the
diagram, the bits corresponding to headroom are shown in green text. Here ``mant[0]`` has 10 bits of
headroom and ``mant[1]`` has a full 32 bits of headroom. (``mant[0]`` does not have 11 bits of
headroom because in two's complement the MSb serves as a sign bit.) The headroom for a BFP vector is
the *minimum* of the headroom amongst its elements; in this case, 10 bits.
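The definition above can be sketched in portable C. This is an illustration only: ``demo_headroom_s32()`` and ``demo_vect_headroom_s32()`` are hypothetical helpers, not library functions, and treating a zero mantissa as having 32 bits of headroom is a choice made here to follow the text above.

```c
#include <stdint.h>

/* Number of redundant leading sign bits of a 32-bit mantissa. */
unsigned demo_headroom_s32(int32_t x)
{
    /* Following the text above, zero is treated as a full 32 bits. */
    if (x == 0) return 32;

    /* Map negative values onto non-negative ones with the same
       number of redundant sign bits. */
    uint32_t u = (uint32_t)(x < 0 ? ~x : x);

    /* Count zero bits below the sign bit, from bit 30 downwards. */
    unsigned hr = 0;
    for (int bit = 30; bit >= 0 && ((u >> bit) & 1u) == 0; bit--) hr++;
    return hr;
}

/* Headroom of a vector: the minimum headroom of its elements. */
unsigned demo_vect_headroom_s32(const int32_t mant[], unsigned length)
{
    unsigned hr = 32;
    for (unsigned k = 0; k < length; k++) {
        unsigned h = demo_headroom_s32(mant[k]);
        if (h < hr) hr = h;
    }
    return hr;
}
```

For the example vector ``{ (1<<20), 0 }`` this gives 10 for the first element and 10 for the vector as a whole.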
If we remove headroom from one mantissa of a BFP vector, all other mantissas must shift by the same
number of bits, and the vector's exponent must be adjusted accordingly. A left-shift of one bit
corresponds to reducing the exponent by 1, because a single bit left-shift corresponds to
multiplication by 2.
In this case, if we remove 10 bits of headroom and subtract 10 from the exponent we get the
following:
.. code-block:: c
struct {
int32_t mant[2];
int32_t exp;
} vect = { { (1<<30), (0xFF >> 0) }, -10 };
.. image:: images/bfp_bg_fig4.png
Now, no information is lost in either element. One of the main goals of BFP arithmetic is to keep
the headroom in BFP vectors to the minimum necessary (equivalently, keeping the exponent as small as
possible). That allows for maximum effective precision of the elements in the vector.
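The effect of the exponent choice on precision can be reproduced with a small sketch. ``demo_requantise()`` is a hypothetical helper (not a library function) that re-expresses a value :math:`m\cdot2^e` as a mantissa for a different exponent :math:`p`, discarding any bits below :math:`2^p`. It assumes the result fits in 32 bits and that the two exponents differ by less than 32.

```c
#include <stdint.h>

/* Re-express the value m * 2^e as a mantissa for exponent p.
   Bits of the value below 2^p are discarded.
   Assumes |e - p| < 32 and that the result fits in an int32_t. */
int32_t demo_requantise(int32_t m, int e, int p)
{
    int shift = e - p;
    if (shift >= 0) {
        /* Smaller target exponent: mantissa grows (left shift). */
        return (int32_t)((int64_t)m * ((int64_t)1 << shift));
    }
    /* Larger target exponent: low bits are lost. Right-shifting a
       negative value is implementation-defined in C; common compilers
       perform an arithmetic shift. */
    return m >> (-shift);
}
```

With an exponent of 0 the value :math:`255\cdot2^{-10}` becomes a mantissa of 0, while with an exponent of -10 it is stored exactly as 255 and :math:`2^{20}` becomes ``1<<30``.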
Note that the headroom of a vector also tells you something about the size of the largest magnitude
mantissa in the vector. That information (in conjunction with exponents) can be used to determine
the largest possible output of an operation without having to look at the mantissas.
For this reason, the BFP vectors in ``lib_xcore_math`` carry a field which tracks their current
headroom. The functions in the BFP API use this property to make determinations about how best to
preserve precision.
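As a rough illustration of how headroom and exponent bound the size of a result, consider the sketch below. The function names are hypothetical and the reasoning is a simplification, not a description of the library's internal logic: a 32-bit mantissa with ``hr`` bits of headroom satisfies :math:`|m| \le 2^{31-hr}`, so every element satisfies :math:`|m\cdot2^p| \le 2^{31-hr+p}`.

```c
/* Upper bound on log2 of the largest magnitude in a 32-bit BFP vector
   with exponent `exp` and headroom `hr`. */
int demo_max_log2_s32(int exp, unsigned hr)
{
    return 31 - (int)hr + exp;
}

/* Upper bound on log2 of the largest magnitude of an element-wise
   product of two 32-bit BFP vectors, found without reading mantissas. */
int demo_mul_max_log2_s32(int b_exp, unsigned b_hr, int c_exp, unsigned c_hr)
{
    return demo_max_log2_s32(b_exp, b_hr) + demo_max_log2_s32(c_exp, c_hr);
}
```

Both representations of the example vector (exponent 0 with 10 bits of headroom, and exponent -10 with no headroom) give the same bound of :math:`2^{21}`, which is consistent with its largest element :math:`2^{20}`.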


@@ -0,0 +1,105 @@
.. _examples:
********************
Example Applications
********************
Several example applications are offered to demonstrate use of the ``lib_xcore_math`` APIs through
simple code examples.
* ``app_bfp_demo`` - Demonstration of the block floating-point arithmetic API
* ``app_vect_demo`` - Demonstration of the low-level vectorized arithmetic API
* ``app_fft_demo`` - Demonstration of the Fast Fourier Transform API
* ``app_filter_demo`` - Demonstration of the filtering API
This section assumes you have downloaded and installed the `XMOS XTC tools <https://www.xmos.com/software-tools/>`_
(see `README` for required version).
Installation instructions can be found `here <https://xmos.com/xtc-install-guide>`_.
Particular attention should be paid to the section `Installation of required third-party tools
<https://www.xmos.com/documentation/XM-014363-PC-10/html/installation/install-configure/install-tools/install_prerequisites.html>`_.
The application examples use the `xcommon-cmake <https://www.xmos.com/file/xcommon-cmake-documentation/?version=latest>`_
build system as bundled with the XTC tools.
Building Examples
=================
To build the applications, from an XTC command prompt run the following commands in the
`lib_xcore_math/examples` directory::
cmake -B build -G "Unix Makefiles"
xmake -C build
Individual examples can be built using a command similar to the following::
xmake -C build EXAMPLE_NAME
where ``EXAMPLE_NAME`` is the example to build.
Running Examples
================
Once built, the example ``EXAMPLE_NAME`` can be run on the `XK-EVK-XU316` board using the following
command::
xrun --xscope examples/EXAMPLE_NAME/bin/EXAMPLE_NAME.xe
For instance, to run the ``app_bfp_demo`` example, use::
xrun --xscope examples/app_bfp_demo/bin/app_bfp_demo.xe
To run the example using the ``xcore`` simulator instead, use::
xsim examples/EXAMPLE_NAME/bin/EXAMPLE_NAME.xe
app_bfp_demo
=============
The purpose of this example application is to demonstrate how the arithmetic functions of
``lib_xcore_math``'s block floating-point API may be used.
In it, three 32-bit BFP vectors are allocated, initialized and filled with random data. Then several
BFP operations are applied using those vectors as inputs and/or outputs.
The example only demonstrates the real 32-bit arithmetic BFP functions (that is, functions with
names ``bfp_s32_*``). The real 16-bit (``bfp_s16_*``), complex 32-bit (``bfp_complex_s32_*``) and
complex 16-bit (``bfp_complex_s16_*``) functions all use similar naming conventions.
app_vect_demo
=============
The purpose of this example application is to demonstrate how the arithmetic functions of
``lib_xcore_math``'s lower-level vector API may be used.
In general, the low-level arithmetic API consists of the functions in this library whose names begin with
``vect_*``, such as :c:func:`vect_s32_mul()` for element-wise multiplication of 32-bit vectors, and
:c:func:`vect_complex_s16_scale()` for multiplying a complex 16-bit vector by a complex scalar.
We assume that where the low-level API is being used it is because some behavior other than the
default behavior of the high-level block floating-point API is required. Given that, rather than
showcasing the breadth of operations available, this example examines first how to achieve
comparable behavior to the BFP API, and then ways in which that behavior can be modified.
app_fft_demo
============
The purpose of this example application is to demonstrate how the FFT functions of
``lib_xcore_math``'s block floating-point API may be used.
In this example we demonstrate each of the offered forward and inverse FFTs of the BFP API.
app_filter_demo
===============
The purpose of this example application is to demonstrate how the functions of
``lib_xcore_math``'s filtering vector API may be used.
The filtering API currently supports three different filter types:
* 32-bit FIR Filter
* 16-bit FIR Filter
* 32-bit Biquad Filter
This example application presents simple demonstrations of how to use each of these filter types.


@@ -0,0 +1,227 @@
.. _getting_started:
***************
Getting Started
***************
Overview
========
``lib_xcore_math`` is a library containing efficient implementations of various mathematical
operations that may be required in an embedded application. In particular, this library is geared
towards operations which work on vectors or arrays of data, including vectorized arithmetic,
linear filtering, and fast Fourier transforms.
This library comprises several sub-APIs. Grouping of operations into sub-APIs is a matter of
conceptual convenience. In general, functions from a given API share a common prefix indicating
which API the function comes from, or the type of object on which it acts. Additionally, there is
some interdependence between these APIs.
These APIs are:
* :ref:`Block floating-point (BFP) API <bfp_api>` -- High-level API providing operations on BFP
vectors. See :ref:`bfp_background` for an introduction to block floating-point. These functions
manage the exponents and headroom of input and output BFP vectors to avoid overflow and underflow
conditions.
* :ref:`Vector/Array API <vect_api>` -- Lower-level API which is used heavily by the BFP API.
As such, the operations available in this API are similar to those in the BFP API, but the user
will have to manage exponents and headroom on their own. Many of these routines are implemented
directly in optimized assembly to use the hardware as efficiently as possible.
* :ref:`Scalar API <scalar_api>` -- Provides various operations on scalar objects. In particular,
these operations focus on simple arithmetic operations applied to non-IEEE 754 floating-point
objects, as well as optimized operations which are applied to IEEE 754 ``floats``.
* :ref:`Filtering API <filter_api>` -- Provides access to linear filtering operations, including
16- and 32-bit FIR filters and 32-bit biquad filters.
* :ref:`Fast Fourier Transform (FFT) API <fft_api>` -- Provides both low-level and block
floating-point FFT implementations. Optimized FFT implementations are provided for real signals,
pairs of real signals, and for complex signals.
* :ref:`Discrete Cosine Transform (DCT) API <dct_api>` -- Provides functions which implement the
`type-II <https://en.wikipedia.org/wiki/Discrete_cosine_transform#DCT-II>`_ ('forward') and
`type-III <https://en.wikipedia.org/wiki/Discrete_cosine_transform#DCT-III>`_ ('inverse') DCT for
a variety of block lengths. Also provides a fast 8x8 two dimensional forward and inverse DCT.
All APIs are accessed by including the single header file:
.. code-block:: c
#include "xcore_math.h"
Usage
=====
The following sections are intended to give the reader a general sense of how to use the API.
BFP API
-------
In the BFP API the BFP vectors are C structures such as ``bfp_s16_t``, ``bfp_s32_t``, or
``bfp_complex_s32_t``, backed by a memory buffer. These objects contain a pointer to the data
carrying the content (mantissas) of the vector, as well as information about the length, headroom
and exponent of the BFP vector.
Below is the definition of :c:struct:`bfp_s32_t` from ``xmath/types.h``.
.. code-block:: c
C_TYPE
typedef struct {
/** Pointer to the underlying element buffer.*/
int32_t* data;
/** Exponent associated with the vector. */
exponent_t exp;
/** Current headroom in the ``data[]`` */
headroom_t hr;
/** Current size of ``data[]``, expressed in elements */
unsigned length;
/** BFP vector flags. Users should not normally modify these manually. */
bfp_flags_e flags;
} bfp_s32_t;
The :ref:`32-bit BFP functions <bfp_s32>` take :c:struct:`bfp_s32_t` pointers as input and output
parameters.
Functions in the BFP API generally are prefixed with ``bfp_``. More specifically, functions where
the 'main' operands are 32-bit BFP vectors are prefixed with ``bfp_s32_``, whereas functions where
the 'main' operands are complex 16-bit BFP vectors are prefixed with ``bfp_complex_s16_``, and so
on for the other BFP vector types.
Initializing BFP Vectors
^^^^^^^^^^^^^^^^^^^^^^^^
Before calling these functions, the BFP vectors represented by the arguments must be initialized.
For :c:struct:`bfp_s32_t` this is accomplished with :c:func:`bfp_s32_init()`. Initialization
requires that a buffer of sufficient size be provided to store the mantissa vector, as well as an
initial exponent. If the first usage of a BFP vector is as an output, then the exponent will not
matter, but the object must still be initialized before use. Additionally, the headroom of the
vector may be computed upon initialization; otherwise it is set to ``0``.
Here is an example of a 32-bit BFP vector being initialized.
.. code-block:: c
#define LEN (20)
//The object representing the BFP vector
bfp_s32_t bfp_vect;
// buffer backing bfp_vect
int32_t data_buffer[LEN];
for(int i = 0; i < LEN; i++) data_buffer[i] = i;
// The initial exponent associated with bfp_vect
exponent_t initial_exponent = 0;
// If non-zero, `bfp_s32_init()` will compute headroom currently present in data_buffer.
// Otherwise, headroom is initialized to 0 (which is always safe but may not be optimal)
unsigned calculate_headroom = 1;
// Initialize the vector object
bfp_s32_init(&bfp_vect, data_buffer, initial_exponent, LEN, calculate_headroom);
// Go do stuff with bfp_vect
...
Once initialized, the exponent and mantissas of the vector can be accessed by ``bfp_vect.exp`` and
``bfp_vect.data[]`` respectively, with the logical (floating-point) value of element ``k`` being
given by :math:`\mathtt{bfp\_vect.data[k]}\cdot2^{\mathtt{bfp\_vect.exp}}`.
BFP Arithmetic Functions
^^^^^^^^^^^^^^^^^^^^^^^^
The following snippet shows a function ``foo()`` which takes 3 BFP vectors, ``a``, ``b`` and ``c``,
as arguments. It multiplies together ``a`` and ``b`` element-wise, and then subtracts ``c`` from the
product. In this example both operations are performed in-place on ``a``. (See
:c:func:`bfp_s32_mul()` and :c:func:`bfp_s32_sub()` for more information about those functions)
.. code-block:: c
void foo(bfp_s32_t* a, const bfp_s32_t* b, const bfp_s32_t* c)
{
// Multiply together a and b, updating a with the result.
bfp_s32_mul(a, a, b);
// Subtract c from the product, again updating a with the result.
bfp_s32_sub(a, a, c);
}
The caller of ``foo()`` can then access the results through ``a``. Note that the pointer ``a->data``
was not modified during this call.
Vector API
----------
The functions in the lower-level vector API are optimized for performance. They do very little to
protect the user from mangling their data by arithmetic saturation/overflows or underflows (although
they do provide the means to prevent this).
Functions in the vector API are generally prefixed with ``vect_``. For example, functions which
operate primarily on 16-bit vectors are prefixed with ``vect_s16_``.
Some functions are prefixed with ``chunk_`` instead of ``vect_``. A "chunk" is just a vector with a
fixed memory footprint (currently 32 bytes, or 8 32-bit elements) meant to match the width of the
architecture's vector registers.
As an example of a function from the vector API, see :c:func:`vect_s32_mul()` (from
``vect_s32.h``), which multiplies together two ``int32_t`` vectors element by element.
.. code-block:: c
C_API
headroom_t vect_s32_mul(
int32_t a[],
const int32_t b[],
const int32_t c[],
const unsigned length,
const right_shift_t b_shr,
const right_shift_t c_shr);
This function takes two ``int32_t`` arrays, ``b`` and ``c``, as inputs and one ``int32_t`` array,
``a``, as output (in the case of :c:func:`vect_s32_mul()`, it is safe to have ``a`` point to the
same buffer as ``b`` or ``c``, computing the result in-place). ``length`` indicates the number of
elements in each array. The final two parameters, ``b_shr`` and ``c_shr``, are the arithmetic
right-shifts applied to each element of ``b`` and ``c`` before they are multiplied together.
Why the right-shifts? In the case of 32-bit multiplication, the largest possible product is
:math:`2^{62}`, which will not fit in the 32-bit output vector. Applying positive arithmetic
right-shifts to the input vectors reduces the largest possible product. So, the shifts are there to
manage the headroom/size of the resulting product in order to maximize precision while avoiding
overflow or saturation.
Contrast this with :c:func:`vect_s16_mul()`:
.. code-block:: c
C_API
headroom_t vect_s16_mul(
int16_t a[],
const int16_t b[],
const int16_t c[],
const unsigned length,
const right_shift_t a_shr);
The parameters are similar here, but instead of ``b_shr`` and ``c_shr``, there's only an ``a_shr``.
Here, the arithmetic right-shift ``a_shr`` is applied to the *products* of ``b`` and ``c``.
The right-shift is also *unsigned* -- it can only be used to reduce the size of the
product.
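A plain C model of a single element of such a 16-bit multiply might look like the following. It is only a sketch of the behavior described above: ``demo_mul_s16()`` is a hypothetical name, the result is truncated rather than rounded (an assumption), and symmetric saturation to :math:`\pm(2^{15}-1)` is assumed. Consult the documentation of :c:func:`vect_s16_mul()` for the exact arithmetic.

```c
#include <stdint.h>

/* Clamp a 32-bit value to the symmetric 16-bit range. */
static int16_t demo_sat_s16(int32_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < -INT16_MAX) return -INT16_MAX;
    return (int16_t)x;
}

/* Model of one output element: the full 32-bit product is shifted
   right by a_shr bits (a_shr < 32) and then saturated to 16 bits. */
int16_t demo_mul_s16(int16_t b, int16_t c, unsigned a_shr)
{
    int32_t product = (int32_t)b * (int32_t)c;
    return demo_sat_s16(product >> a_shr);
}
```

For example, ``0x4000 * 0x4000`` with ``a_shr = 14`` yields ``0x4000``, whereas with ``a_shr = 0`` the product does not fit and saturates to ``0x7FFF``.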
Shifts like those in these two examples are very common in the vector API, as they are the main
mechanism for managing exponents and headroom. Whether the shifts are applied to inputs, outputs,
both, or only one input will depend on a number of factors. In the case of :c:func:`vect_s32_mul()`
they are applied to inputs because the XS3 VPU includes a compulsory (hardware) right-shift of 30
bits on all products of 32-bit numbers, and so often inputs may need to be *left*-shifted (negative
shift) in order to avoid underflows. In the case of :c:func:`vect_s16_mul()`, this is unnecessary
because no compulsory shift is included in 16-bit multiply-accumulates.
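The interaction between the input shifts and the compulsory 30-bit shift can be modeled in plain C. This is a sketch under stated assumptions, not the library implementation: the ``demo_`` names are hypothetical, and rounding to nearest and symmetric saturation of the result are assumed here.

```c
#include <stdint.h>

/* Clamp a 64-bit value to the symmetric 32-bit range. */
static int32_t demo_sat_s32(int64_t x)
{
    if (x > INT32_MAX) return INT32_MAX;
    if (x < -INT32_MAX) return -INT32_MAX;
    return (int32_t)x;
}

/* Signed arithmetic right-shift; a negative shr is a left-shift
   (with saturation). Assumes |shr| < 32. */
static int32_t demo_shr_s32(int32_t x, int shr)
{
    if (shr >= 0) return x >> shr;
    return demo_sat_s32((int64_t)x * ((int64_t)1 << -shr));
}

/* Model of one output element: shift the inputs, form the 64-bit
   product, then apply the fixed 30-bit right-shift. */
int32_t demo_mul_s32(int32_t b, int32_t c, int b_shr, int c_shr)
{
    int64_t product = (int64_t)demo_shr_s32(b, b_shr) * demo_shr_s32(c, c_shr);
    return demo_sat_s32((product + ((int64_t)1 << 29)) >> 30);
}
```

In this model, multiplying the small mantissas 3 and 5 with no input shift underflows to 0, whereas left-shifting each input by 15 bits (``b_shr = c_shr = -15``) gives 15.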
Both :c:func:`vect_s32_mul()` and :c:func:`vect_s16_mul()` return the headroom of the output
vector ``a``.
Functions in the vector API are in many cases closely tied to the instruction set architecture
for XS3. As such, if more efficient algorithms are found to perform an operation these low-level API
functions are more likely to change in future versions.


@@ -0,0 +1,52 @@
************
Introduction
************
``lib_xcore_math`` is a library of optimised math functions for taking advantage of the vector
processing unit (VPU) of the `XMOS` XS3 architecture (i.e. `xcore.ai`).
Included in the library are functions for block floating-point arithmetic, fast Fourier transforms,
linear algebra, discrete cosine transforms, linear filtering and more.
Repository structure
====================
* */lib_xcore_math/*
* *api/* - Headers containing the public API.
* *script/* - Scripts used for source generation.
* *src/* - Library source code.
* */doc/* - documentation source.
* */examples/* - Example applications.
* */tests/* - Unit test projects.
API structure
=============
This library is organised around several sub-APIs. These APIs collect the provided operations into
coherent groups based on the kind of operation or the types of object being acted upon.
The current APIs are:
* Block Floating-Point Vector API
* Vector/Array API
* Scalar API
* Linear Filtering API
* Fast Fourier Transform API
* Discrete Cosine Transform API
Using ``lib_xcore_math``
========================
``lib_xcore_math`` is intended to be used with `XCommon CMake <https://www.xmos.com/file/xcommon-cmake-documentation/?version=latest>`_,
the `XMOS` application build and dependency management system.
``lib_xcore_math`` can be compiled for both x86 platforms and XS3 based processors.
On x86 platforms you can develop DSP algorithms and test them for functional correctness;
this is an optional step before porting the library to an `xcore` device.
To use this module, include ``lib_xcore_math`` in the application's ``APP_DEPENDENT_MODULES`` list and
include the ``xcore_math.h`` header file.


@@ -0,0 +1,159 @@
Notes for lib_xcore_math {#notes}
========================
## &nbsp;
## Vector Alignment ## {#vector_alignment}
This library makes use of the XMOS architecture's vector processing unit (VPU). In the XS3 version
of the architecture, all loads and stores to and from the XS3 VPU have the requirement that
the loaded/stored addresses must be aligned to a 4-byte boundary (word-aligned).
In the current version of the API, this means that most API functions require
vectors (or the data backing a BFP vector) to begin at word-aligned addresses. Vectors are *not*
required, however, to have a size (in bytes) that is a multiple of 4.
Some functions also make use of instructions which require data to be 8-byte-aligned.
### Writing Alignment-safe Code ###
The alignment requirement is ultimately always on the data that backs a vector. This applies to all
but the scalar API. For the BFP API, this applies to the memory to which the `data` field (or the
`real` and `imag` fields in the case of `bfp_complex_s16_t`) points, specified when the BFP vector
is initialized. A similar constraint applies when initializing filters. For the other APIs, this
will apply to the pointers that get passed into the API functions.
Arrays of type `int32_t` and `complex_s32_t` will normally be guaranteed to be word-aligned by the
compiler. However, if the user manually specifies the beginning of an `int32_t` array, as in the
following..
\code{.c}
uint8_t byte_buffer[100];
int32_t* integer_array = (int32_t*) &byte_buffer[1];
\endcode
.. the vector may not be word-aligned. It is the responsibility of the user to ensure proper
alignment of data.
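One way to guard against misaligned buffers is to check the address before handing it to the API. The helper below is a hypothetical sketch (`demo_is_aligned()` is not a library function) that tests an address against a given alignment in bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns non-zero if p is aligned to `alignment` bytes
   (4 for word alignment, 8 for double-word alignment). */
int demo_is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}
```

A check such as `demo_is_aligned(integer_array, 4)` would fail for the `&byte_buffer[1]` example above whenever `byte_buffer` itself starts on a word boundary.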
For `int16_t` arrays, the compiler does not by default guarantee that the array starts on a
word-aligned address. To force word-alignment on arrays of this type, use
`__attribute__((aligned (4)))` in the variable definition, as in the following.
\code{.c}
int16_t __attribute__((aligned (4))) data[100];
\endcode
Occasionally, 8-byte (double word) alignment is required. In this case, neither `int32_t` nor
`int16_t` is necessarily guaranteed to align as required. Similar to the above, this can be hinted
to the compiler as in the following.
\code{.c}
int32_t __attribute__((aligned (8))) data[100];
\endcode
This library also provides the macros `WORD_ALIGNED` and `DWORD_ALIGNED` which force 4- and 8-byte
alignment respectively as above.
---------
## Symmetrically Saturating Arithmetic ## {#saturation}
With ordinary integer arithmetic the block floating-point logic chooses exponents and operand shifts
to prevent integer overflow with worst-case input values. However, the XS3 VPU uses symmetrically
saturating integer arithmetic.
In saturating arithmetic, partial results of the applied operation use a bit depth greater
than the output bit depth, and values that can't be properly expressed with the output bit depth are
set to the nearest expressible value.
For example, a saturating multiply written in ordinary C integer arithmetic may internally compute
the full 64-bit product of two 32-bit integers and then clamp it to the range `[INT32_MIN,
INT32_MAX]` before returning a 32-bit result.
Symmetrically saturating arithmetic also includes the property that the lower bound of the
expressible range is the negative of the upper bound of the expressible range.
One of the major troubles with non-saturating integer arithmetic is that in a two's complement
encoding, there exists a non-zero integer value @f$x@f$ (e.g. `INT16_MIN` in 16-bit two's complement
arithmetic) for which @f$-1 \cdot x = x@f$. Serious arithmetic errors can result when this case
is not accounted for.
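The difference can be seen in a short sketch comparing wrap-around negation with symmetrically saturating negation in 16 bits. The function names are hypothetical and the code is a plain C model, not the VPU's implementation.

```c
#include <stdint.h>

/* Clamp to the symmetric 16-bit range [-INT16_MAX, INT16_MAX]. */
int16_t demo_sat_sym_s16(int32_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < -INT16_MAX) return -INT16_MAX;
    return (int16_t)x;
}

/* Negation with two's complement wrap-around. The final conversion to
   int16_t is implementation-defined in C; common compilers wrap. */
int16_t demo_neg_wrap_s16(int16_t x)
{
    return (int16_t)(uint16_t)(0u - (uint16_t)x);
}

/* Negation with symmetric saturation. */
int16_t demo_neg_sat_s16(int16_t x)
{
    return demo_sat_sym_s16(-(int32_t)x);
}
```

With wrap-around, negating `INT16_MIN` returns `INT16_MIN` again; with symmetric saturation the result is `INT16_MAX`, and `INT16_MIN` itself is never produced as an output.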
One of the results of _symmetric_ saturation, on the other hand, is that there is a corner case
where (using the same exponent and shift logic as non-saturating arithmetic) saturation may occur
for a particular combination of input mantissas. The corner case is different for different
operations.
When the corner case occurs, the minimum (and largest magnitude) value of the resulting vector is 1
LSb greater than its ideal value (e.g. `-0x3FFF` instead of `-0x4000` for 16-bit arithmetic). The
error in this output element's mantissa is then 1 LSb, or @f$2^p@f$, where @f$p@f$ is the exponent
of the resulting BFP vector.
Of course, the very nature of BFP arithmetic routinely involves errors of this magnitude.
---------
## Spectrum Packing ## {#spectrum_packing}
In its general form, the @math{N}-point Discrete Fourier Transform is an operation applied to a
complex @math{N}-point signal @math{x[n]} to produce a complex spectrum @math{X[f]}. Any spectrum
@math{X[f]} which is the result of an @math{N}-point DFT has the property that @math{X[f+N] = X[f]}.
Thus, the complete representation of the @math{N}-point DFT of @math{x[n]} requires @math{N} complex
elements.
### Complex DFT and IDFT ###
In this library, when performing a complex DFT (e.g. using fft_bfp_forward_complex()), the spectral
representation that results is a straightforward mapping:
`X[f]` @math{\longleftarrow X[f]} for @math{0 \le f < N}
where `X` is an @math{N}-element array of `complex_s32_t`, where the real part of @math{X[f]} is in
`X[f].re` and the imaginary part in `X[f].im`.
Likewise, when performing an @math{N}-point complex inverse DFT, that is also the representation
that is expected.
### Real DFT and IDFT ###
Oftentimes we instead wish to compute the DFT of real signals. In addition to the periodicity
property (@math{X[f+N] = X[f]}), the DFT of a real signal also has a complex conjugate symmetry such
that @math{X[-f] = X^*[f]}, where @math{X^*[f]} is the complex conjugate of @math{X[f]}. This
symmetry makes it redundant (and thus undesirable) to store both elements of each symmetric pair. It
would allow us to get away with only explicitly storing @math{X[f]} for @math{0 \le f \le N/2} in
@math{(N/2)+1} complex elements.
Unfortunately, using such a representation has the undesirable property that the DFT of an
@math{N}-point real signal cannot be computed in-place, as the representation requires more memory
than we started with.
However, if we take the periodicity and complex conjugate symmetry properties together:
\f[
X[0] = X^*[0] \rightarrow Imag\{X[0]\} = 0 \\
X[-(N/2) + N] = X[N/2] \\
X[-N/2] = X^*[N/2] \rightarrow X[N/2] = X^*[N/2] \rightarrow Imag \{ X[N/2] \} = 0
\f]
Because both @math{X[0]} and @math{X[N/2]} are guaranteed to be real, we can recover the benefit of
in-place computation in our representation by packing the real part of @math{X[N/2]} into the
imaginary part of @math{X[0]}.
Therefore, the functions in this library that produce the spectra of real signals (such as
fft_bfp_forward_mono() and fft_bfp_forward_stereo()) will pack the spectra in a slightly less
straight-forward manner (as compared with the complex DFTs):
`X[f]` @math{\longleftarrow X[f]} for @math{1 \le f < N/2}
`X[0]` @math{\longleftarrow X[0] + j X[N/2]}
where `X` is an @math{N/2}-element array of `complex_s32_t`.
Likewise, this is the encoding expected when computing the @math{N}-point inverse DFT, such as by
fft_bfp_inverse_mono() or fft_bfp_inverse_stereo().
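To make the packing concrete, the sketch below expands a packed @math{N/2}-element spectrum into the @math{(N/2)+1} bins @math{X[0]} to @math{X[N/2]}. The type `demo_complex_s32_t` is a local stand-in using the same `re` and `im` field names as `complex_s32_t`, and `demo_unpack_real_spectrum()` is a hypothetical helper, not a library function.

```c
#include <stdint.h>

/* Local stand-in with the same field names as complex_s32_t. */
typedef struct {
    int32_t re;
    int32_t im;
} demo_complex_s32_t;

/* Expand a packed real-signal spectrum (n/2 elements) into n/2 + 1
   bins. `n` is the DFT length and is assumed to be even. */
void demo_unpack_real_spectrum(
    demo_complex_s32_t full[],
    const demo_complex_s32_t packed[],
    unsigned n)
{
    /* Bins 1 .. n/2 - 1 are stored directly. */
    for (unsigned f = 1; f < n / 2; f++) full[f] = packed[f];

    /* X[0] is real and lives in the real part of packed[0]. */
    full[0].re = packed[0].re;
    full[0].im = 0;

    /* X[n/2] is real and lives in the imaginary part of packed[0]. */
    full[n / 2].re = packed[0].im;
    full[n / 2].im = 0;
}
```

The output array needs room for one more element than the packed input, which is exactly the extra storage that the packed form avoids.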
@note When performing a stereo DFT or inverse DFT, so as to preserve the
in-place computation of the result, the spectra of the two signals will be encoded into adjacent
blocks of memory, with the second spectrum (i.e. associated with 'channel b') occupying the higher
memory address.


@@ -0,0 +1,6 @@
.. _bfp_complex_s16:
Complex 16-bit Block Floating-Point API
---------------------------------------
.. doxygengroup:: bfp_complex_s16_api


@@ -0,0 +1,6 @@
.. _bfp_complex_s32:
Complex 32-bit Block Floating-Point API
---------------------------------------
.. doxygengroup:: bfp_complex_s32_api


@@ -0,0 +1,12 @@
.. _bfp_api:
Block Floating-Point API
========================
.. toctree::
bfp_quickref
bfp_s16
bfp_s32
bfp_complex_s16
bfp_complex_s32


@@ -0,0 +1,111 @@
BFP API quick reference
-----------------------
The tables below list the functions of the block floating-point API. The "EW" column indicates
whether the operation acts element-wise.
The "Signature" column is a hint which quickly conveys the kinds of conceptual inputs to and
outputs from the operation. The signatures only convey how many (conceptual) inputs and outputs
there are, and their dimensionality.
The functions themselves will typically take more arguments than these signatures indicate. Check
the function's full documentation to get more detailed information.
The following symbols are used in the signatures:
.. table::
:widths: 40 60
:class: longtable
+--------------------------------------+---------------------------------------------+
| Symbol | Description |
+======================================+=============================================+
| :math:`\mathbb{S}` | A scalar input or output value. |
+--------------------------------------+---------------------------------------------+
| :math:`\mathbb{V}` | A vector-valued input or output. |
+--------------------------------------+---------------------------------------------+
| :math:`\mathbb{M}` | A matrix-valued input or output. |
+--------------------------------------+---------------------------------------------+
| :math:`\varnothing` | Placeholder indicating no input or output. |
+--------------------------------------+---------------------------------------------+
For example, the operation signature :math:`(\mathbb{V \times V \times S}) \to \mathbb{V}` indicates
the operation takes two vector inputs and a scalar input, and the output is a vector.
* `32-Bit BFP Ops <bfp32_api_>`_
* `16-Bit BFP Ops <bfp16_api_>`_
* `Complex 32-Bit BFP Ops <bfp32_complex_api_>`_
* `Complex 16-Bit BFP Ops <bfp16_complex_api_>`_
|newpage|
32-Bit BFP API quick reference
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. _bfp32_api:
|beginfullwidth|
.. csv-table:: 32-Bit BFP API - quick reference
:file: csv/32bit_bfp_quickref.csv
:widths: 42, 5, 20, 33
:header-rows: 1
:class: longtable
|endfullwidth|
|newpage|
16-Bit BFP API quick reference
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. _bfp16_api:
|beginfullwidth|
.. csv-table:: 16-Bit BFP API - quick reference
:file: csv/16bit_bfp_quickref.csv
:widths: 42, 5, 20, 33
:header-rows: 1
:class: longtable
|endfullwidth|
|newpage|
Complex 32-bit BFP API quick reference
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. _bfp32_complex_api:
|beginfullwidth|
.. csv-table:: Complex 32-Bit BFP API - quick reference
:file: csv/complex_32bit_bfp_quickref.csv
:widths: 42, 5, 20, 33
:header-rows: 1
:class: longtable
|endfullwidth|
|newpage|
Complex 16-bit BFP API quick reference
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. _bfp16_complex_api:
|beginfullwidth|
.. csv-table:: Complex 16-Bit BFP API - quick reference
:file: csv/complex_16bit_bfp_quickref.csv
:widths: 42, 5, 20, 33
:header-rows: 1
:class: longtable
|endfullwidth|
|newpage|


@@ -0,0 +1,6 @@
.. _bfp_s16:
16-bit Block Floating-Point API
-------------------------------
.. doxygengroup:: bfp_s16_api


@@ -0,0 +1,6 @@
.. _bfp_s32:
32-bit Block Floating-Point API
-------------------------------
.. doxygengroup:: bfp_s32_api

View File

@@ -0,0 +1,34 @@
Function,EW,Signature,Brief
:c:func:`bfp_s16_init()` , , ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", Initialize (static)
:c:func:`bfp_s16_alloc()` , , ":math:`\varnothing \to \mathbb{V}` ", Initialize (dynamic)
:c:func:`bfp_s16_dealloc()` , , ":math:`\mathbb{V} \to \mathbb{\varnothing}` ", Deinitialize
:c:func:`bfp_s16_set()` , x, ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", Set All Elements
:c:func:`bfp_s16_use_exponent()` , , ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", Force Exponent
:c:func:`bfp_s16_headroom()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", Get Headroom
:c:func:`bfp_s16_shl()` , x, ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", Shift Mantissas
:c:func:`bfp_s16_add()` , x, ":math:`(\mathbb{V \times V}) \to \mathbb{V}` ", Add Vector
:c:func:`bfp_s16_add_scalar()` , , ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", Add Scalar
:c:func:`bfp_s16_sub()` , x, ":math:`(\mathbb{V \times V}) \to \mathbb{V}` ", Subtract Vector
:c:func:`bfp_s16_mul()` , x, ":math:`(\mathbb{V \times V}) \to \mathbb{V}` ", Multiply Vector
:c:func:`bfp_s16_macc()` , x, ":math:`(\mathbb{V \times V \times V}) \to \mathbb{V}` ", Multiply-Accumulate
:c:func:`bfp_s16_nmacc()` , x, ":math:`(\mathbb{V \times V \times V}) \to \mathbb{V}` ", Negated Multiply-Accumulate
:c:func:`bfp_s16_scale()` , , ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", Multiply Scalar
:c:func:`bfp_s16_abs()` , x, ":math:`\mathbb{V} \to \mathbb{V}` ", Absolute Values
:c:func:`bfp_s16_sum()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", Sum Elements
:c:func:`bfp_s16_dot()` , , ":math:`(\mathbb{V \times V}) \to \mathbb{S}` ", Inner Product
:c:func:`bfp_s16_clip()` , x, ":math:`(\mathbb{V \times S \times S}) \to \mathbb{V}` ", Clip Bounds
:c:func:`bfp_s16_rect()` , x, ":math:`\mathbb{V} \to \mathbb{V}` ", Rectify Elements
:c:func:`bfp_s16_to_bfp_s32()` , x, ":math:`\mathbb{V} \to \mathbb{V}` ", Convert to 32-bit
:c:func:`bfp_s16_sqrt()` , x, ":math:`\mathbb{V} \to \mathbb{V}` ", Square Root
:c:func:`bfp_s16_inverse()` , x, ":math:`\mathbb{V} \to \mathbb{V}` ", Multiplicative Inverse
:c:func:`bfp_s16_abs_sum()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", Absolute Sum Elements
:c:func:`bfp_s16_mean()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", Vector Mean Value
:c:func:`bfp_s16_energy()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", Vector Energy
:c:func:`bfp_s16_rms()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", Vector RMS Value
:c:func:`bfp_s16_max()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", Vector Max Element
:c:func:`bfp_s16_min()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", Vector Min Element
:c:func:`bfp_s16_max_elementwise()` , x, ":math:`(\mathbb{V \times V}) \to \mathbb{V}` ", Elementwise Max
:c:func:`bfp_s16_min_elementwise()` , x, ":math:`(\mathbb{V \times V}) \to \mathbb{V}` ", Elementwise Min
:c:func:`bfp_s16_argmax()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", Max Element Index
:c:func:`bfp_s16_argmin()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", Min Element Index
:c:func:`bfp_s16_accumulate()` , x, ":math:`(\mathbb{V \times V}) \to \mathbb{V}` ", Elementwise Accumulate

View File

@@ -0,0 +1,35 @@
Function,EW,Signature,Brief
:c:func:`bfp_s32_init()` , , ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", "Initialize (static)"
:c:func:`bfp_s32_alloc()` , , ":math:`\varnothing \to \mathbb{V}` ", "Initialize (dynamic)"
:c:func:`bfp_s32_dealloc()` , , ":math:`\mathbb{V} \to \mathbb{\varnothing}` ", "Deinitialize"
:c:func:`bfp_s32_set()` , x, ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", "Set All Elements"
:c:func:`bfp_s32_use_exponent()` , , ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", "Force Exponent"
:c:func:`bfp_s32_headroom()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", "Get Headroom"
:c:func:`bfp_s32_shl()` , x, ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", "Shift Mantissas"
:c:func:`bfp_s32_add()` , x, ":math:`(\mathbb{V \times V}) \to \mathbb{V}` ", "Add Vector"
:c:func:`bfp_s32_add_scalar()` , , ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", "Add Scalar"
:c:func:`bfp_s32_sub()` , x, ":math:`(\mathbb{V \times V}) \to \mathbb{V}` ", "Subtract Vector"
:c:func:`bfp_s32_mul()` , x, ":math:`(\mathbb{V \times V}) \to \mathbb{V}` ", "Multiply Vector"
:c:func:`bfp_s32_macc()` , x, ":math:`(\mathbb{V \times V \times V}) \to \mathbb{V}` ", "Multiply-Accumulate"
:c:func:`bfp_s32_nmacc()` , x, ":math:`(\mathbb{V \times V \times V}) \to \mathbb{V}` ", "Negated Multiply-Accumulate"
:c:func:`bfp_s32_scale()` , , ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", "Multiply Scalar"
:c:func:`bfp_s32_abs()` , x, ":math:`\mathbb{V} \to \mathbb{V}` ", "Absolute Values"
:c:func:`bfp_s32_sum()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", "Sum Elements"
:c:func:`bfp_s32_dot()` , , ":math:`(\mathbb{V \times V}) \to \mathbb{S}` ", "Inner Product"
:c:func:`bfp_s32_clip()` , x, ":math:`(\mathbb{V \times S \times S}) \to \mathbb{V}` ", "Clip Bounds"
:c:func:`bfp_s32_rect()` , x, ":math:`\mathbb{V} \to \mathbb{V}` ", "Rectify Elements"
:c:func:`bfp_s32_to_bfp_s16()` , , ":math:`\mathbb{V} \to \mathbb{V}` ", "Convert to 16-bit"
:c:func:`bfp_s32_sqrt()` , x, ":math:`\mathbb{V} \to \mathbb{V}` ", "Square Root"
:c:func:`bfp_s32_inverse()` , x, ":math:`\mathbb{V} \to \mathbb{V}` ", "Multiplicative Inverse"
:c:func:`bfp_s32_abs_sum()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", "Absolute Sum Elements"
:c:func:`bfp_s32_mean()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", "Vector Mean Value"
:c:func:`bfp_s32_energy()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", "Vector Energy"
:c:func:`bfp_s32_rms()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", "Vector RMS Value"
:c:func:`bfp_s32_max()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", "Vector Max Element"
:c:func:`bfp_s32_min()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", "Vector Min Element"
:c:func:`bfp_s32_max_elementwise()` , x, ":math:`(\mathbb{V \times V}) \to \mathbb{V}` ", "Elementwise Max"
:c:func:`bfp_s32_min_elementwise()` , x, ":math:`(\mathbb{V \times V}) \to \mathbb{V}` ", "Elementwise Min"
:c:func:`bfp_s32_argmax()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", "Max Element Index"
:c:func:`bfp_s32_argmin()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", "Min Element Index"
:c:func:`bfp_s32_convolve_valid()` , , ":math:`(\mathbb{V \times V}) \to \mathbb{V}` ", "Convolve With Kernel (Valid mode)"
:c:func:`bfp_s32_convolve_same()` , , ":math:`(\mathbb{V \times V}) \to \mathbb{V}` ", "Convolve With Kernel (Same mode)"

View File

@@ -0,0 +1,26 @@
Function , EW , Signature , Brief
:c:func:`bfp_complex_s16_init()` , , ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", Initialize (static)
:c:func:`bfp_complex_s16_alloc()` , , ":math:`\varnothing \to \mathbb{V}` ", Initialize (dynamic)
:c:func:`bfp_complex_s16_dealloc()` , , ":math:`\mathbb{V} \to \mathbb{\varnothing}` ", Deinitialize
:c:func:`bfp_complex_s16_set()` , x , ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", Set All Elements
:c:func:`bfp_complex_s16_use_exponent()` , , ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", Force Exponent
:c:func:`bfp_complex_s16_headroom()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", Get Headroom
:c:func:`bfp_complex_s16_shl()` , x , ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", Shift Mantissas
:c:func:`bfp_complex_s16_real_mul()` , x , ":math:`(\mathbb{V \times V}) \to \mathbb{V}` ", Real Vector Multiply
:c:func:`bfp_complex_s16_mul()` , x , ":math:`(\mathbb{V \times V}) \to \mathbb{V}` ", Complex Vector Multiply
:c:func:`bfp_complex_s16_conj_mul()` , x , ":math:`(\mathbb{V \times V}) \to \mathbb{V}` ", Complex Vector Conjugate Multiply
:c:func:`bfp_complex_s16_macc()` , x , ":math:`(\mathbb{V \times V \times V}) \to \mathbb{V}` ", Complex Vector Multiply-Accumulate
:c:func:`bfp_complex_s16_nmacc()` , x , ":math:`(\mathbb{V \times V \times V}) \to \mathbb{V}` ", Complex Vector Negated Multiply-Accumulate
:c:func:`bfp_complex_s16_conj_macc()` , x , ":math:`(\mathbb{V \times V \times V}) \to \mathbb{V}` ", Complex Vector Conjugate Multiply-Accumulate
:c:func:`bfp_complex_s16_conj_nmacc()` , x , ":math:`(\mathbb{V \times V \times V}) \to \mathbb{V}` ", Complex Vector Negated Conjugate Multiply-Accumulate
:c:func:`bfp_complex_s16_real_scale()` , , ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", Real Scalar Multiply
:c:func:`bfp_complex_s16_scale()` , , ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", Complex Scalar Multiply
:c:func:`bfp_complex_s16_add()` , x , ":math:`(\mathbb{V \times V}) \to \mathbb{V}` ", Complex Vector Add
:c:func:`bfp_complex_s16_add_scalar()` , , ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", Complex Scalar Add
:c:func:`bfp_complex_s16_sub()` , x , ":math:`(\mathbb{V \times V}) \to \mathbb{V}` ", Complex Vector Subtract
:c:func:`bfp_complex_s16_to_bfp_complex_s32()` , x , ":math:`\mathbb{V} \to \mathbb{V}` ", Convert to 32-bit
:c:func:`bfp_complex_s16_squared_mag()` , x , ":math:`\mathbb{V} \to \mathbb{V}` ", Squared Magnitude
:c:func:`bfp_complex_s16_sum()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", Vector Sum
:c:func:`bfp_complex_s16_mag()` , x , ":math:`\mathbb{V} \to \mathbb{V}` ", Magnitude
:c:func:`bfp_complex_s16_conjugate()` , x , ":math:`\mathbb{V} \to \mathbb{V}` ", Complex Conjugate
:c:func:`bfp_complex_s16_energy()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", Vector Energy

View File

@@ -0,0 +1,29 @@
Function,EW,Signature,Brief
:c:func:`bfp_complex_s32_init()` , , ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", Initialize (static)
:c:func:`bfp_complex_s32_alloc()` , , ":math:`\varnothing \to \mathbb{V}` ", Initialize (dynamic)
:c:func:`bfp_complex_s32_dealloc()` , , ":math:`\mathbb{V} \to \mathbb{\varnothing}` ", Deinitialize
:c:func:`bfp_complex_s32_set()` , x, ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", Set All Elements
:c:func:`bfp_complex_s32_use_exponent()` , , ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", Force Exponent
:c:func:`bfp_complex_s32_headroom()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", Get Headroom
:c:func:`bfp_complex_s32_shl()` , x, ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", Shift Mantissas
:c:func:`bfp_complex_s32_real_mul()` , x, ":math:`(\mathbb{V \times V}) \to \mathbb{V}` ", Real Vector Multiply
:c:func:`bfp_complex_s32_mul()` , x, ":math:`(\mathbb{V \times V}) \to \mathbb{V}` ", Complex Vector Multiply
:c:func:`bfp_complex_s32_conj_mul()` , x, ":math:`(\mathbb{V \times V}) \to \mathbb{V}` ", Complex Vector Conjugate Multiply
:c:func:`bfp_complex_s32_macc()` , x, ":math:`(\mathbb{V \times V \times V}) \to \mathbb{V}` ", Complex Vector Multiply-Accumulate
:c:func:`bfp_complex_s32_nmacc()` , x, ":math:`(\mathbb{V \times V \times V}) \to \mathbb{V}` ", Complex Vector Negated Multiply-Accumulate
:c:func:`bfp_complex_s32_conj_macc()` , x, ":math:`(\mathbb{V \times V \times V}) \to \mathbb{V}` ", Complex Vector Conjugate Multiply-Accumulate
:c:func:`bfp_complex_s32_conj_nmacc()` , x, ":math:`(\mathbb{V \times V \times V}) \to \mathbb{V}` ", Complex Vector Negated Conjugate Multiply-Accumulate
:c:func:`bfp_complex_s32_real_scale()` , , ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", Real Scalar Multiply
:c:func:`bfp_complex_s32_scale()` , , ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", Complex Scalar Multiply
:c:func:`bfp_complex_s32_add()` , x, ":math:`(\mathbb{V \times V}) \to \mathbb{V}` ", Complex Vector Add
:c:func:`bfp_complex_s32_add_scalar()` , , ":math:`(\mathbb{V \times S}) \to \mathbb{V}` ", Complex Scalar Add
:c:func:`bfp_complex_s32_sub()` , x, ":math:`(\mathbb{V \times V}) \to \mathbb{V}` ", Complex Vector Subtract
:c:func:`bfp_complex_s32_to_bfp_complex_s16()` , x, ":math:`\mathbb{V} \to \mathbb{V}` ", Convert to 16-bit
:c:func:`bfp_complex_s32_squared_mag()` , x, ":math:`\mathbb{V} \to \mathbb{V}` ", Squared Magnitude
:c:func:`bfp_complex_s32_mag()` , x, ":math:`\mathbb{V} \to \mathbb{V}` ", Magnitude
:c:func:`bfp_complex_s32_sum()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", Vector Sum
:c:func:`bfp_complex_s32_conjugate()` , x, ":math:`\mathbb{V} \to \mathbb{V}` ", Complex Conjugate
:c:func:`bfp_complex_s32_energy()` , , ":math:`\mathbb{V} \to \mathbb{S}` ", Vector Energy
:c:func:`bfp_complex_s32_make()` , x, ":math:`(\mathbb{V \times V}) \to \mathbb{V}` ", Construct Complex From Real and Imaginary
:c:func:`bfp_complex_s32_real_part()` , x, ":math:`\mathbb{V} \to \mathbb{V}` ", Real Part
:c:func:`bfp_complex_s32_imag_part()` , x, ":math:`\mathbb{V} \to \mathbb{V}` ", Imaginary Part

View File

@@ -0,0 +1,9 @@
.. _compile_time_opts:
Library Configuration
=====================
Configuration Options
---------------------
.. doxygengroup:: config_options

View File

@@ -0,0 +1,18 @@
Prefix , Object Type , Notes
``s32`` , ``int32_t`` , "32-bit signed integer. May be a simple integer, a fixed-point value or the mantissa of a floating-point value."
``s16`` , ``int16_t`` , "16-bit signed integer. May be a simple integer, a fixed-point value or the mantissa of a floating-point value."
``s8`` , ``int8_t`` , "8-bit signed integer. May be a simple integer, a fixed-point value or the mantissa of a floating-point value."
``complex_s32`` , :c:type:`complex_s32_t` , "Signed complex integer with 32-bit real and 32-bit imaginary parts."
``complex_s16`` , :c:type:`complex_s16_t` , "Signed complex integer with 16-bit real and 16-bit imaginary parts."
``float_s64`` , :c:type:`float_s64_t` , "Non-standard floating-point scalar with exponent and 64-bit mantissa."
``float_s32`` , :c:type:`float_s32_t` , "Non-standard floating-point scalar with exponent and 32-bit mantissa."
``qXX`` , ``int32_t`` , "32-bit fixed-point value with ``XX`` fractional bits (i.e. exponent of ``-XX``)."
``f32`` , ``float`` , "Standard IEEE 754 single-precision ``float``."
``f64`` , ``double`` , "Standard IEEE 754 double-precision ``double``."
``float_complex_s64`` , :c:type:`float_complex_s64_t` , "Floating-point value with exponent and complex mantissa with 64-bit real and imaginary parts."
``float_complex_s32`` , :c:type:`float_complex_s32_t` , "Floating-point value with exponent and complex mantissa with 32-bit real and imaginary parts."
``float_complex_s16`` , :c:type:`float_complex_s16_t` , "Floating-point value with exponent and complex mantissa with 16-bit real and imaginary parts."
N/A , :c:type:`exponent_t` , "Represents an exponent :math:`p` as in :math:`2^p`. Unless otherwise specified, exponents are always assumed to have a base of :math:`2`."
N/A , :c:type:`headroom_t` , "The headroom of a scalar or vector. See :ref:`headroom_intro` for more information."
N/A , :c:type:`right_shift_t` , "Represents a rightward bit-shift of a certain number of bits. Care should be taken, as sometimes this is treated as unsigned."
N/A , :c:type:`left_shift_t` , "Represents a leftward bit-shift of a certain number of bits. Care should be taken, as sometimes this is treated as unsigned."

View File

@@ -0,0 +1,14 @@
Prefix,Object Type,Notes
``vect_s32`` , "``int32_t[]`` ", "Raw vector of signed 32-bit integers."
``vect_s16`` , "``int16_t[]`` ", "Raw vector of signed 16-bit integers."
``vect_s8`` , "``int8_t[]`` ", "Raw vector of signed 8-bit integers."
``vect_complex_s32`` , ":c:type:`complex_s32_t`\ ``[]`` ", "Raw vector of complex 32-bit integers."
``vect_complex_s16`` , "(``int16_t[]``, ``int16_t[]``) ", "Complex 16-bit vectors are usually represented as a pair of 16-bit vectors. This is an optimization due to the word-alignment requirement when loading data into the VPU's vector registers."
``chunk_s32`` , "``int32_t[8]`` ", "A 'chunk' is a fixed-size vector corresponding to the size of the VPU vector registers."
``vect_qXX`` , "``int32_t[]`` ", "When used in an API function name, the ``XX`` will be an actual number (e.g. :c:func:`vect_q30_exp_small()`) indicating the fixed-point interpretation used by that function."
``vect_f32`` , "``float[]`` ", "Raw vector of standard IEEE ``float``."
``vect_float_s32`` , ":c:type:`float_s32_t`\ ``[]`` ", "Vector of non-standard 32-bit floating-point scalars."
``bfp_s32`` , ":c:type:`bfp_s32_t` ", "Block floating-point vector containing 32-bit mantissas."
``bfp_s16`` , ":c:type:`bfp_s16_t` ", "Block floating-point vector containing 16-bit mantissas."
``bfp_complex_s32`` , ":c:type:`bfp_complex_s32_t` ", "Block floating-point vector containing complex 32-bit mantissas."
``bfp_complex_s16`` , ":c:type:`bfp_complex_s16_t` ", "Block floating-point vector containing complex 16-bit mantissas."

View File

@@ -0,0 +1,10 @@
Brief , Forward Function , Inverse Function
6-point DCT , :c:func:`dct6_forward()` , :c:func:`dct6_inverse()`
8-point DCT , :c:func:`dct8_forward()` , :c:func:`dct8_inverse()`
12-point DCT , :c:func:`dct12_forward()` , :c:func:`dct12_inverse()`
16-point DCT , :c:func:`dct16_forward()` , :c:func:`dct16_inverse()`
24-point DCT , :c:func:`dct24_forward()` , :c:func:`dct24_inverse()`
32-point DCT , :c:func:`dct32_forward()` , :c:func:`dct32_inverse()`
48-point DCT , :c:func:`dct48_forward()` , :c:func:`dct48_inverse()`
64-point DCT , :c:func:`dct64_forward()` , :c:func:`dct64_inverse()`
8-by-8 2-dimensional DCT , :c:func:`dct8x8_forward()` , :c:func:`dct8x8_inverse()`

View File

@@ -0,0 +1,22 @@
.. _dct_api:
Discrete Cosine Transform API
-----------------------------
DCT API quick reference
^^^^^^^^^^^^^^^^^^^^^^^
Note: The forward DCTs are Type-II. The inverse of the Type-II DCT is the Type-III DCT, so Type-II
and Type-III are supported here.
.. csv-table:: DCT functions - quick reference
:file: dct_functions.csv
:widths: 40, 30, 30
:header-rows: 1
:class: longtable
|newpage|
.. doxygengroup:: dct_api

View File

@@ -0,0 +1,8 @@
Brief , Forward Function , Inverse Function
BFP FFT on single real signal , :c:func:`bfp_fft_forward_mono()` , :c:func:`bfp_fft_inverse_mono()`
BFP FFT on single complex signal , :c:func:`bfp_fft_forward_complex()` , :c:func:`bfp_fft_inverse_complex()`
BFP FFT on pair of real signals , :c:func:`bfp_fft_forward_stereo()` , :c:func:`bfp_fft_inverse_stereo()`
BFP spectrum packing , :c:func:`bfp_fft_unpack_mono()` , :c:func:`bfp_fft_pack_mono()`
Low-level decimation-in-time FFT , :c:func:`fft_dit_forward()` , :c:func:`fft_dit_inverse()`
Low-level decimation-in-frequency FFT , :c:func:`fft_dif_forward()` , :c:func:`fft_dif_inverse()`
FFT on real signal of ``float`` , :c:func:`fft_f32_forward()` , :c:func:`fft_f32_inverse()`

View File

@@ -0,0 +1,22 @@
.. _fft_api:
Fast Fourier Transform API
--------------------------
FFT API quick reference
^^^^^^^^^^^^^^^^^^^^^^^
|beginfullwidth|
.. csv-table:: FFT functions - quick reference
:file: fft_functions.csv
:widths: 40, 30, 30
:header-rows: 1
:class: longtable
|endfullwidth|
|newpage|
.. doxygengroup:: fft_api

View File

@@ -0,0 +1,9 @@
Filter,Function,Brief
32-bit FIR , :c:func:`filter_fir_s32_init()` , Initialize filter
32-bit FIR , :c:func:`filter_fir_s32_add_sample()` , Add sample (without computing output)
32-bit FIR , :c:func:`filter_fir_s32()` , Process next sample
16-bit FIR , :c:func:`filter_fir_s16_init()` , Initialize filter
16-bit FIR , :c:func:`filter_fir_s16_add_sample()` , Add sample (without computing output)
16-bit FIR , :c:func:`filter_fir_s16()` , Process next sample
32-bit Biquad , :c:func:`filter_biquad_s32()` , Process next sample (single block)
32-bit Biquad , :c:func:`filter_biquads_s32()` , Process next sample (multi block)

View File

@@ -0,0 +1,18 @@
.. _filter_api:
Filtering API
-------------
|beginfullwidth|
.. csv-table:: Filtering API - quick reference
:file: filter_functions.csv
:widths: 15,40,45
:header-rows: 1
:class: longtable
|endfullwidth|
|newpage|
.. doxygengroup:: filter_api

View File

@@ -0,0 +1,261 @@
// Copyright 2021-2024 XMOS LIMITED.
// This Software is subject to the terms of the XMOS Public Licence: Version 1.
// This file exists as a compatibility work-around between vanilla Doxygen and
// sphinx+breathe+doxygen.
// If these notes are written in an .md file, sphinx can't interpret them. If they're
// written in an .rst file, Doxygen can't interpret them and can't build link references
// to them.
/**
* @page note_vector_alignment Note: Vector Alignment
*
*
* This library makes use of the XMOS architecture's vector processing unit (VPU). All loads and
* stores to and from the XS3 VPU have the requirement that the loaded/stored addresses must be
* aligned to a 4-byte boundary (word-aligned).
*
* In the current version of the API, this means that most API functions require vectors (or the
* data backing a BFP vector) to begin at word-aligned addresses. Vectors are *not* required,
* however, to have a size (in bytes) that is a multiple of 4.
*
* @par Writing Alignment-safe Code
* @parblock
*
* The alignment requirement is ultimately always on the data that backs a vector. For the
* low-level API, that is the pointers passed to the functions themselves. For the high-level
* API, that is the memory to which the `data` field (or the `real` and `imag` fields in the
* case of `bfp_complex_s16_t`) points, specified when the BFP vector is initialized.
*
* Arrays of type `int32_t` and `complex_s32_t` will normally be guaranteed to be word-aligned
* by the compiler. However, if the user manually specifies the beginning of an `int32_t` array,
* as in the following..
*
* @code{.c}
* uint8_t byte_buffer[100];
* int32_t* integer_array = (int32_t*) &byte_buffer[1];
* @endcode
*
* .. the vector may not be word-aligned. It is the responsibility of the user to ensure proper
* alignment of data.
*
* For `int16_t` arrays, the compiler does not by default guarantee that the array starts on a
* word-aligned address. To force word-alignment on arrays of this type, use
* `__attribute__((aligned (4)))` in the variable definition, as in the following.
*
* @code{.c}
* int16_t __attribute__((aligned (4))) data[100];
* @endcode
*
* Occasionally, 8-byte (double word) alignment is required. In this case, neither `int32_t` nor
* `int16_t` is necessarily guaranteed to align as required. Similar to the above, this can be
* hinted to the compiler as in the following.
*
* @code{.c}
* int32_t __attribute__((aligned (8))) data[100];
* @endcode
*
* This library also provides the macros `WORD_ALIGNED` and `DWORD_ALIGNED` which force 4- and
* 8-byte alignment respectively as above.
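*
* As a rough illustration of the rules above, the sketch below requests alignment with the
* GCC-style attribute and checks an address before it would be handed to a vector function. It
* assumes a GCC-compatible compiler; the helper `is_aligned()` and the buffer names are
* hypothetical and are not part of this library.
*
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: non-zero if `ptr` sits on a multiple of `bytes` (a power of two).
static int is_aligned(const void* ptr, size_t bytes)
{
  assert(bytes != 0 && (bytes & (bytes - 1)) == 0);
  return ((uintptr_t) ptr & (bytes - 1)) == 0;
}

// 16-bit data is not word-aligned by default, so request it explicitly.
static int16_t __attribute__((aligned (4))) samples_s16[100];

// 8-byte (double word) alignment for the occasional function that needs it.
static int32_t __attribute__((aligned (8))) samples_s32[100];

// A word-aligned byte buffer, so that an offset of 1 byte is certainly misaligned.
static uint8_t __attribute__((aligned (4))) byte_buffer[100];
```
*
* Under these assumptions `is_aligned(samples_s16, 4)` and `is_aligned(samples_s32, 8)` hold,
* whereas `is_aligned(&byte_buffer[1], 4)` does not, so that address is unsuitable as the start
* of a vector.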
*
* @endparblock
*/
/**
* @page note_symmetric_saturation Note: Symmetrically Saturating Arithmetic
*
* With ordinary integer arithmetic the block floating-point logic chooses exponents and operand
* shifts to prevent integer overflow with worst-case input values. However, the XS3 VPU uses
* symmetrically saturating integer arithmetic.
*
* Saturating arithmetic is that where partial results of the applied operation use a bit depth
* greater than the output bit depth, and values that can't be properly expressed with the output
* bit depth are set to the nearest expressible value.
*
* For example, in ordinary C integer arithmetic, a function which multiplies two 32-bit integers
* may internally compute the full 64-bit product and then clamp values to the range
* `[INT32_MIN, INT32_MAX]` before returning a 32-bit result.
*
* Symmetrically saturating arithmetic also includes the property that the lower bound of the
* expressible range is the negative of the upper bound of the expressible range.
*
* One of the major troubles with non-saturating integer arithmetic is that in a two's complement
* encoding, there exists a non-zero integer value @math{x} (e.g. `INT16_MIN` in 16-bit two's
* complement arithmetic) for which @math{-1 \cdot x = x}. Serious arithmetic errors can result
* when this case is not accounted for.
*
* One of the results of _symmetric_ saturation, on the other hand, is that there is a corner case
* where (using the same exponent and shift logic as non-saturating arithmetic) saturation may occur
* for a particular combination of input mantissas. The corner case is different for different
* operations.
*
* When the corner case occurs, the minimum (and largest magnitude) value of the resulting vector is
* 1 LSb greater than its ideal value (e.g. `-0x3FFF` instead of `-0x4000` for 16-bit arithmetic).
* The error in this output element's mantissa is then 1 LSb, or
* @math{2^p}, where @math{p} is the exponent of the resulting BFP vector.
*
* Of course, the very nature of BFP arithmetic routinely involves errors of this magnitude.
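*
* The idea can be modelled in plain C as follows. This is only a sketch of symmetric saturation
* at 16-bit depth, not a description of the VPU's actual instruction semantics; the function
* names are hypothetical and are not part of this library.
*
```c
#include <assert.h>
#include <stdint.h>

// Clamp a wide intermediate value to the symmetric 16-bit range [-INT16_MAX, INT16_MAX].
static int16_t sat_sym_s16(int32_t value)
{
  if (value > INT16_MAX) return INT16_MAX;
  if (value < -INT16_MAX) return (int16_t) -INT16_MAX;
  return (int16_t) value;
}

// Negation with symmetric saturation: the result always fits, even for INT16_MIN.
static int16_t neg_sym_s16(int16_t x)
{
  return sat_sym_s16(-(int32_t) x);
}

// Addition computed at 32-bit depth, then saturated symmetrically.
static int16_t add_sym_s16(int16_t x, int16_t y)
{
  return sat_sym_s16((int32_t) x + (int32_t) y);
}
```
*
* In this model `neg_sym_s16(INT16_MIN)` yields `INT16_MAX` rather than wrapping back to
* `INT16_MIN`, and a sum that would reach `-0x8000` is held at `-0x7FFF`, which is the 1 LSb
* difference described above.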
*/
/**
* @page note_spectrum_packing Note: Spectrum Packing
*
*
* In its general form, the @math{N}-point Discrete Fourier Transform is an operation applied
* to a complex @math{N}-point signal @math{x[n]} to produce a complex spectrum @math{X[f]}.
* Any spectrum @math{X[f]} which is the result of an @math{N}-point DFT has the property that
* @math{X[f+N] = X[f]}. Thus, the complete representation of the @math{N}-point DFT of
* @math{x[n]} requires @math{N} complex elements.
*
* @par Complex DFT and IDFT
* @parblock
*
* In this library, when performing a complex DFT (e.g. using bfp_fft_forward_complex()), the
* spectral representation that results is a straightforward mapping:
*
* `X[f]` @math{\longleftarrow X[f]} for @math{0 \le f < N}
*
* where `X` is an @math{N}-element array of `complex_s32_t`, where the real part of @math{X[f]}
* is in `X[f].re` and the imaginary part in `X[f].im`.
*
* Likewise, when performing an @math{N}-point complex inverse DFT, that is also the
* representation that is expected.
* @endparblock
*
* @par Real DFT and IDFT
* @parblock
*
* Oftentimes we instead wish to compute the DFT of real signals. In addition to the periodicity
* property (@math{X[f+N] = X[f]}), the DFT of a real signal also has a complex conjugate symmetry
* such that @math{X[-f] = X^*[f]}, where @math{X^*[f]} is the complex conjugate of @math{X[f]}.
* This symmetry makes it redundant (and thus undesirable) to store such symmetric pairs of elements.
* This would allow us to get away with only explicitly storing @math{X[f]} for
* @math{0 \le f \le N/2} in @math{(N/2)+1} complex elements.
*
* Unfortunately, using such a representation has the undesirable property that the DFT of an
* @math{N}-point real signal cannot be computed in-place, as the representation requires more
* memory than we started with.
* However, if we take the periodicity and complex conjugate symmetry properties together:
*
* @f[
* & X[0] = X^*[0] \rightarrow Imag\{X[0]\} = 0 \\
* & X[-(N/2) + N] = X[N/2] \\
* & X[-N/2] = X^*[N/2] \rightarrow X[N/2] = X^*[N/2] \rightarrow Imag \{ X[N/2] \} = 0
* @f]
*
* Because both @math{X[0]} and @math{X[N/2]} are guaranteed to be real, we can recover the benefit
* of in-place computation in our representation by packing the real part of @math{X[N/2]} into the
* imaginary part of @math{X[0]}.
*
* Therefore, the functions in this library that produce the spectra of real signals (such as
* bfp_fft_forward_mono() and bfp_fft_forward_stereo()) will pack the spectra in a slightly less
* straightforward manner (as compared with the complex DFTs):
*
* `X[f]` @math{\longleftarrow X[f]} for @math{1 \le f < N/2}
*
* `X[0]` @math{\longleftarrow X[0] + j X[N/2]}
*
* where `X` is an @math{N/2}-element array of `complex_s32_t`.
*
* Likewise, this is the encoding expected when computing the @math{N}-point inverse DFT, such as by
* bfp_fft_inverse_mono() or bfp_fft_inverse_stereo().
*
* @note When performing a stereo DFT or inverse DFT, so as to preserve the in-place computation
* of the result, the spectra of the two signals will be encoded into adjacent blocks of memory,
* with the second spectrum (i.e. associated with 'channel b') occupying the higher memory address.
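*
* The packing rule for a single real signal can be sketched in plain C as below. The quick
* reference table lists bfp_fft_pack_mono() and bfp_fft_unpack_mono() for this job; the sketch is
* not their implementation. The type `cplx32_t` and both function names are hypothetical
* stand-ins, and the element layout (`re`, `im`) is assumed to match `complex_s32_t`.
*
```c
#include <assert.h>
#include <stdint.h>

// Stand-in for a complex 32-bit element (real part in `re`, imaginary part in `im`).
typedef struct { int32_t re; int32_t im; } cplx32_t;

// Pack an unpacked spectrum of (n/2)+1 bins in place so that it fits in n/2 bins.
// Assumes spectrum[0].im and spectrum[n/2].im are zero, as they are for a real signal.
static void pack_real_spectrum(cplx32_t spectrum[], unsigned n)
{
  assert(n >= 2 && (n % 2) == 0);
  spectrum[0].im = spectrum[n / 2].re;
}

// Inverse of the above: restore X[N/2] as its own bin and clear the borrowed field.
// `spectrum` must have room for (n/2)+1 elements.
static void unpack_real_spectrum(cplx32_t spectrum[], unsigned n)
{
  assert(n >= 2 && (n % 2) == 0);
  spectrum[n / 2].re = spectrum[0].im;
  spectrum[n / 2].im = 0;
  spectrum[0].im = 0;
}
```
*
* Only bin 0 changes; bins @math{1 \le f < N/2} are left exactly where they were, which is what
* allows the transform to stay in place.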
*
* @endparblock
*/
/**
* @page fft_length_support Note: Library FFT Length Support
*
* When computing DFTs this library relies on one or both of a pair of look-up tables which contain
* portions of the Discrete Fourier Transform matrix. Longer FFT lengths require larger look-up
* tables. When building using CMake, the maximum FFT length can be specified as a CMake option,
* and these tables are auto-generated at build time.
*
* If not using CMake, you can manually generate these files using a Python script included with the
* library. The script is located at `lib_xcore_math/script/gen_fft_table.py`. If generated
* manually, you must add the generated .c file as a source, and the directory containing
* `xmath_fft_lut.h` must be added as an include directory when compiling the library's files.
*
* Note that the header file must be named `xmath_fft_lut.h` as it is included via
* `#include "xmath_fft_lut.h"`.
*
* By default the tables contain the coefficients necessary to perform forward or inverse DFTs of up
* to 1024 points. If larger DFTs are required, or if the maximum required DFT size is known to be
* less than 1024 points, the `MAX_FFT_LEN_LOG2` CMake option can be modified from its default value
* of `10`.
*
* The two look-up tables correspond to the decimation-in-time and decimation-in-frequency FFT
* algorithms, and the run-time symbols for the tables are `xmath_dit_fft_lut` and
* `xmath_dif_fft_lut` respectively. Each table contains @math{N-4} complex 32-bit values, where
* @math{N} is the maximum supported FFT length, for a size of @math{8\cdot (N-4)} bytes each.
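*
* To get a feel for the memory cost, the size formula can be evaluated directly. The sketch below
* assumes @math{N} denotes the maximum FFT length, i.e. @math{2} raised to the power
* `MAX_FFT_LEN_LOG2`; the function name is hypothetical and is not part of this library.
*
```c
#include <assert.h>
#include <stddef.h>

// Bytes occupied by one look-up table for a maximum FFT length of 2^max_fft_len_log2,
// using the N-4 element count and 8 bytes per complex 32-bit value quoted above.
static size_t fft_lut_bytes(unsigned max_fft_len_log2)
{
  assert(max_fft_len_log2 >= 2 && max_fft_len_log2 < 8 * sizeof(size_t) - 3);
  size_t n = (size_t) 1 << max_fft_len_log2;
  return 8 * (n - 4);
}
```
*
* For the default value of `10` this gives 8160 bytes per table, and for a maximum length of
* @math{2^{14}} it gives 131040 bytes per table.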
*
* To manually regenerate the tables for a maximum FFT length of @math{16384} (@math{=2^{14}}),
* supporting only the decimation-in-time algorithm, for example, use the following:
*
* @code{.c}
* python lib_xcore_math/script/gen_fft_table.py --dit --max_fft_log2 14
* @endcode
*
* Use the `--help` flag with `gen_fft_table.py` for a more detailed description of its syntax and
* parameters.
*/
/**
* @page filter_conversion Note: Digital Filter Conversion
*
* This library supports optimized implementations of 16- and 32-bit FIR filters, as well as
* cascaded 32-bit biquad filters. Each of these filter implementations requires that the
* filter coefficients be represented in a compatible form.
*
* To assist with that, several Python scripts are distributed with this library which can be
* used to convert existing floating-point filter coefficients into code which is easily
* callable from within an xCore application.
*
* Each script reads in floating-point filter coefficients from a file and computes a new
* representation for the filter with coefficients which attempt to maximize precision and are
* compatible with the `lib_xcore_math` filtering API.
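*
* This note does not spell out how the scripts choose their representation. Purely as an
* illustration of what maximizing precision can mean, the sketch below quantizes floating-point
* coefficients to 32-bit mantissas with one shared exponent. The function name and the choice of
* one bit of headroom are assumptions made for the example, not the scripts' actual method.
*
```c
#include <assert.h>
#include <stdint.h>

// Quantize `count` finite floating-point coefficients to 32-bit mantissas that share one
// exponent, so that coef[k] is approximately mant[k] * 2^(returned exponent).
// The exponent is chosen so the largest magnitude lands in [2^29, 2^30), leaving one bit
// of headroom. Returns 0 with all-zero mantissas if every coefficient is zero.
static int quantize_coefs_s32(int32_t mant[], const double coef[], unsigned count)
{
  assert(mant && coef);

  double peak = 0.0;
  for (unsigned k = 0; k < count; k++) {
    double mag = (coef[k] < 0.0) ? -coef[k] : coef[k];
    if (mag > peak) peak = mag;
  }

  int exponent = 0;
  double scale = 1.0;
  if (peak > 0.0) {
    // Scale up or down by powers of two until peak * scale is in [2^29, 2^30).
    while (peak * scale < 536870912.0)   { scale *= 2.0; exponent--; }
    while (peak * scale >= 1073741824.0) { scale *= 0.5; exponent++; }
  }

  for (unsigned k = 0; k < count; k++) {
    double scaled = coef[k] * scale;
    // Round half away from zero.
    mant[k] = (int32_t) (scaled + ((scaled < 0.0) ? -0.5 : 0.5));
  }
  return exponent;
}
```
*
* For the coefficients `{0.5, -0.25, 0.125}` this sketch returns an exponent of `-30` and the
* mantissas `{0x20000000, -0x10000000, 0x08000000}`.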
*
* Each script outputs two files which can be included in your own xCore application. The first
* output is a C source (`.c`) file containing the computed filter parameters and
* several function definitions for initializing and executing the generated filter. The second
* output is a C header (`.h`) file which can be `#include`d into your own application to
* give access to those functions.
*
* Additionally, each script also takes a user-provided filter name as an input parameter. The
* output files (as well as the function names within) include the filter name so that more than
* one filter can be generated and executed using this mechanism.
*
* As an example, take the following command to generate a 32-bit FIR filter:
*
* python lib_xcore_math/script/gen_fir_filter_s32.py MyFilter filter_coefs.txt
*
* This command creates a filter named "MyFilter", with coefficients taken from a file
* `filter_coefs.txt`. Two output files will be generated, `MyFilter.c` and `MyFilter.h`.
* Including `MyFilter.h` provides access to 3 functions, `MyFilter_init()`,
* `MyFilter_add_sample()`, and `MyFilter()`, which correspond to the library functions
* `filter_fir_s32_init()`, `filter_fir_s32_add_sample()` and `filter_fir_s32()`
* respectively.
*
* Use the `--help` flag with the scripts for more detailed descriptions of inputs and other
* options.
*
* | Filter Type | Script |
* | ----------- | ------ |
* | 32-bit FIR | `lib_xcore_math/script/gen_fir_filter_s32.py` |
* | 16-bit FIR | `lib_xcore_math/script/gen_fir_filter_s16.py` |
* | 32-bit Biquad | `lib_xcore_math/script/gen_biquad_filter_s32.py` |
*
*/

View File

@@ -0,0 +1,37 @@
.. _notes_page:
#############
Library Notes
#############
Note: Vector Alignment
======================
.. doxygenpage:: note_vector_alignment
Note: Symmetrically Saturating Arithmetic
=========================================
.. doxygenpage:: note_symmetric_saturation
Note: Spectrum Packing
======================
.. doxygenpage:: note_spectrum_packing
Note: Library FFT Length Support
================================
.. doxygenpage:: fft_length_support
Note: Digital Filter Conversion
================================
.. doxygenpage:: filter_conversion

View File

@@ -0,0 +1,7 @@
Q-format macros
---------------
.. doxygengroup:: qfmt_macros
:members:

View File

@@ -0,0 +1,27 @@
*************
API Reference
*************
.. toctree::
:maxdepth: 2
types
.. toctree::
:maxdepth: 1
bfp/bfp_index
dct/dct_index
fft/fft_index
filter/filter_index
scalar/scalar_index
vect/vect_index
q_format
utils
config_options
.. toctree::
:maxdepth: 2
notes

View File

@@ -0,0 +1,7 @@
Function, Brief
:c:func:`float_complex_s16_mul()` , ":math:`x \times y` "
:c:func:`float_complex_s16_add()` , ":math:`x + y` "
:c:func:`float_complex_s16_sub()` , ":math:`x - y` "
:c:func:`float_complex_s32_mul()` , ":math:`x \times y` "
:c:func:`float_complex_s32_add()` , ":math:`x + y` "
:c:func:`float_complex_s32_sub()` , ":math:`x - y` "

View File

@@ -0,0 +1,6 @@
Function, Brief
:c:func:`f32_sin()` , :math:`\sin(x)`
:c:func:`f32_cos()` , :math:`\cos(x)`
:c:func:`f32_log2()` , :math:`\log_2(x)`
:c:func:`f32_power_series()` , Evaluate Power Series
:c:func:`f32_normA()` , Normalized Form A

View File

@@ -0,0 +1,13 @@
Function,Input Depth, Fractional Bits, Brief
:c:func:`s16_inverse()` , 16 , 0 , ":math:`x^{-1}` "
:c:func:`s32_inverse()` , 32 , 0 , ":math:`x^{-1}` "
:c:func:`sbrad_sin()` , 32 , 31 , ":math:`\sin(x)` "
:c:func:`sbrad_tan()` , 32 , 31 , ":math:`\tan(x)` "
:c:func:`q24_sin()` , 32 , 24 , ":math:`\sin(x)` "
:c:func:`q24_cos()` , 32 , 24 , ":math:`\cos(x)` "
:c:func:`q24_tan()` , 32 , 24 , ":math:`\tan(x)` "
:c:func:`q30_exp_small()` , 32 , 30 , ":math:`\exp(x)` "
:c:func:`q24_logistic()` , 32 , 24 , ":math:`\frac{1}{1+e^{-x}}` "
:c:func:`q24_logistic_fast()` , 32 , 24 , ":math:`\frac{1}{1+e^{-x}}` "
:c:func:`q30_powers()` , 32 , 30 , ":math:`(1,x,x^2,x^3,\dots)` "
:c:func:`u32_ceil_log2()` , 32 , 0 , ":math:`\lceil\log_2(x)\rceil` "

View File

@@ -0,0 +1,15 @@
Function, Brief
:c:func:`float_s32_mul()` , ":math:`x \times y` "
:c:func:`float_s32_add()` , ":math:`x + y` "
:c:func:`float_s32_sub()` , ":math:`x - y` "
:c:func:`float_s32_div()` , ":math:`\frac{x}{y}` "
:c:func:`float_s32_abs()` , ":math:`\left|x\right|` "
:c:func:`float_s32_gt()` , ":math:`x > y` "
:c:func:`float_s32_gte()` , ":math:`x \ge y` "
:c:func:`float_s32_ema()` , ":math:`\alpha x + (1 - \alpha) y` "
:c:func:`float_s32_sqrt()` , ":math:`\sqrt{x}` "
:c:func:`float_s32_exp()` , ":math:`\exp(x)` "
:c:func:`s16_mul()` , ":math:`x \times y` "
:c:func:`s32_sqrt()` , ":math:`\sqrt{x}` "
:c:func:`s32_mul()` , ":math:`x \times y` "
:c:func:`s32_odd_powers()` , ":math:`x, x^3, x^5, x^7, \dots` "

View File

@@ -0,0 +1,15 @@
Function,Type In, Type Out
:c:func:`f32_unpack()` , "``float`` ", "``int32_t``, :c:type:`exponent_t`"
:c:func:`f32_unpack_s16()` , "``float`` ", "``int16_t``, :c:type:`exponent_t` "
:c:func:`f32_to_float_s32()` , "``float`` ", ":c:type:`float_s32_t` "
:c:func:`f64_to_float_s32()` , "``double`` ", ":c:type:`float_s32_t` "
:c:func:`float_s32_to_float_s64()` , ":c:type:`float_s32_t` ", ":c:type:`float_s64_t` "
:c:func:`float_s32_to_float()` , ":c:type:`float_s32_t` ", "``float`` "
:c:func:`float_s32_to_double()` , ":c:type:`float_s32_t` ", "``double`` "
:c:func:`s16_to_s32()` , "``int16_t``, :c:type:`exponent_t` ", "``int32_t``, :c:type:`exponent_t`"
:c:func:`s32_to_s16()` , "``int32_t``, :c:type:`exponent_t` ", "``int16_t``, :c:type:`exponent_t`"
:c:func:`s64_to_s32()` , "``int64_t``, :c:type:`exponent_t` ", "``int32_t``, :c:type:`exponent_t`"
:c:func:`s32_to_f32()` , "``int32_t``, :c:type:`exponent_t` ", "``float``"
:c:func:`radians_to_sbrads()` , ":c:type:`radian_q24_t` ", ":c:type:`sbrad_t`"
:c:func:`s32_to_chunk_s32()` , "``int32_t`` ", "``int32_t[8]``"
:c:func:`float_s64_to_float_s32()` , ":c:type:`float_s64_t` ", ":c:type:`float_s32_t`"

View File

@@ -0,0 +1,6 @@
Scalar IEEE 754 float API
-------------------------
.. doxygengroup:: scalar_f32_api
:members:

View File

@@ -0,0 +1,6 @@
16-bit complex scalar floating-point API
----------------------------------------
.. doxygengroup:: float_complex_s16_api
:members:

View File

@@ -0,0 +1,6 @@
32-bit complex scalar floating-point API
----------------------------------------
.. doxygengroup:: float_complex_s32_api
:members:

View File

@@ -0,0 +1,7 @@
32-bit scalar float API
-----------------------
.. doxygengroup:: float_s32_api
:members:

View File

@@ -0,0 +1,17 @@
.. _scalar_api:
Scalar API
----------
.. toctree::
:maxdepth: 1
scalar_quickref
scalar_s16
scalar_s32
scalar_f32
scalar_float_s32
scalar_float_complex_s16
scalar_float_complex_s32
scalar_misc

View File

@@ -0,0 +1,6 @@
Miscellaneous scalar API
------------------------
.. doxygengroup:: scalar_misc_api
:members:

View File

@@ -0,0 +1,70 @@
Scalar API quick reference
--------------------------
* `Scalar Type Conversion <scalar_type_conversion_>`_
* `Fixed-Point Scalar Ops <scalar_fixed_point_ops_>`_
* `IEEE 754 Float Scalar Ops <scalar_f32_ops_>`_
* `Non-standard Float Scalar Ops <scalar_float_ops_>`_
* `Non-standard Complex Float Scalar Ops <scalar_complex_float_ops_>`_
Scalar type conversion
^^^^^^^^^^^^^^^^^^^^^^
.. _scalar_type_conversion:
|beginfullwidth|
.. csv-table:: Scalar type conversion
:file: csv/scalar_type_conversion.csv
:widths: 40,30,30
:header-rows: 1
:class: longtable
|endfullwidth|
Fixed-point scalar ops
^^^^^^^^^^^^^^^^^^^^^^
.. _scalar_fixed_point_ops:
.. csv-table:: Fixed-point scalar ops
:file: csv/scalar_fixed_point_ops.csv
:widths: 35,15,15,35
:header-rows: 1
:class: longtable
IEEE 754 float ops
^^^^^^^^^^^^^^^^^^
.. _scalar_f32_ops:
.. csv-table:: IEEE 754 float ops
:file: csv/scalar_f32_ops.csv
:widths: 50,50
:header-rows: 1
:class: longtable
Non-standard scalar float ops
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. _scalar_float_ops:
.. csv-table:: Non-standard scalar float ops
:file: csv/scalar_float_ops.csv
:widths: 50,50
:header-rows: 1
:class: longtable
Non-standard complex scalar float ops
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. _scalar_complex_float_ops:
.. csv-table:: Non-standard complex scalar float ops
:file: csv/scalar_complex_float_ops.csv
:widths: 50,50
:header-rows: 1
:class: longtable

View File

@@ -0,0 +1,6 @@
16-bit scalar API
-----------------
.. doxygengroup:: scalar_s16_api
:members:

View File

@@ -0,0 +1,6 @@
32-bit scalar API
-----------------
.. doxygengroup:: scalar_s32_api
:members:

View File

@@ -0,0 +1,76 @@
XMath Types
===========
Each of the main operand types used in this library has a short-hand which is used as a prefix in
the naming of API operations. The following tables can be used for reference.
Common Vector Types
-------------------
The following table indicates the types and abbreviations associated with various common vector
types.
|beginfullwidth|
.. csv-table:: Common Vector Types
:file: csv/common_vector_types.csv
:widths: 25, 25, 50
:header-rows: 1
:class: longtable
|endfullwidth|
Common Scalar Types
-------------------
The following table indicates the types and abbreviations associated with various common scalar
types.
|beginfullwidth|
.. csv-table:: Common Scalar Types
:file: csv/common_scalar_types.csv
:widths: 25, 25, 50
:header-rows: 1
:class: longtable
|endfullwidth|
|newpage|
Block Floating-Point Types
--------------------------
.. doxygengroup:: type_bfp
:members:
Scalar Types (Integer)
----------------------
.. doxygengroup:: type_scalar_int
:members:
Scalar Types (Floating-Point)
-----------------------------
.. doxygengroup:: type_scalar_float
:members:
Scalar Types (Fixed-Point)
--------------------------
.. doxygengroup:: type_scalar_fixed
:members:
Misc Types
----------
.. doxygengroup:: type_misc
:members:

View File

@@ -0,0 +1,9 @@
Util functions and macros
-------------------------
.. doxygengroup:: util_macros
:members:

View File

@@ -0,0 +1,5 @@
32-Bit Vector Chunk (8-Element) API
===================================
.. doxygengroup:: chunk32_api

View File

@@ -0,0 +1,5 @@
Complex 16-bit vector API
-------------------------
.. doxygengroup:: vect_complex_s16_api

View File

@@ -0,0 +1,5 @@
16-Bit complex vector prepare functions
---------------------------------------
.. doxygengroup:: vect_complex_s16_prepare_api

View File

@@ -0,0 +1,5 @@
Complex 32-bit vector API
-------------------------
.. doxygengroup:: vect_complex_s32_api

View File

@@ -0,0 +1,5 @@
32-Bit complex vector prepare functions
---------------------------------------
.. doxygengroup:: vect_complex_s32_prepare_api

View File

@@ -0,0 +1,5 @@
32-bit IEEE 754 float API
-------------------------
.. doxygengroup:: vect_f32_api

View File

@@ -0,0 +1,26 @@
.. _vect_api:
Vector API
==========
.. toctree::
vect_quickref
.. toctree::
vect_s8
vect_s16
vect_s32
vect_f32
vect_complex_s16
vect_complex_s32
vect_mixed
.. toctree::
vect_s16_prepare
vect_s32_prepare
vect_complex_s16_prepare
vect_complex_s32_prepare
.. toctree::
chunk_s32

View File

@@ -0,0 +1,5 @@
Mixed-precision vector API
--------------------------
.. doxygengroup:: vect_mixed_api

View File

@@ -0,0 +1,570 @@
Vector API quick reference
--------------------------
The tables below list the functions of the vector API. The "EW" column indicates whether the
operation acts element-wise.
The "Signature" column is a hint that conveys how many conceptual inputs and outputs an operation
has, and the dimensionality of each. It is not the function's C prototype.
The functions themselves will typically take more arguments than these signatures indicate. For
example, most functions take vector lengths as input, and many take shift values which are used to
control growth of element bit-depth. Check the function's full documentation to get more detailed
information.
The following symbols are used in the signatures:
.. table::
:widths: 30 70
:class: longtable
+--------------------------------------+---------------------------------------------+
| Symbol | Description |
+======================================+=============================================+
| :math:`\mathbb{S}` | A scalar input or output value. |
+--------------------------------------+---------------------------------------------+
| :math:`\mathbb{V}` | A vector-valued input or output. |
+--------------------------------------+---------------------------------------------+
| :math:`\mathbb{M}` | A matrix-valued input or output. |
+--------------------------------------+---------------------------------------------+
| :math:`\varnothing` | Placeholder indicating no input or output. |
+--------------------------------------+---------------------------------------------+
For example, the operation signature :math:`(\mathbb{V \times V \times S}) \to \mathbb{V}` indicates
the operation takes two vector inputs and a scalar input, and the output is a vector.
* `32-Bit Vector Ops <vect32_api_>`_
* `16-Bit Vector Ops <vect16_api_>`_
* `8-Bit Vector Ops <vect8_api_>`_
* `Complex 32-Bit Vector Ops <vect32_complex_api_>`_
* `Complex 16-Bit Vector Ops <vect16_complex_api_>`_
* `Fixed-Point Vector Ops <vect_fixed_point_api_>`_
* `Floating-Point Vector Ops <vect_float_api_>`_
* `Other Vector Ops <vect_other_api_>`_
* `Vector Type Conversions <vect_conversion_api_>`_
.. _vect32_api:
.. table::
:widths: 50 10 35
:class: longtable
+--------------------------------------------------------------------------------------------------+
| **32-bit Vector Ops** |
+-------------------------------------------------+-----+------------------------------------------+
| Function | EW | Signature |
+=================================================+=====+==========================================+
| :c:func:`vect_s32_copy()` | | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_abs()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_abs_sum()` | | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_add()` | x | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_add_scalar()` | x | :math:`(\mathbb{V \times S})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_argmax()` | | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_argmin()` | | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_clip()` | x | :math:`(\mathbb{V \times S \times S})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_dot()` | | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_energy()` | | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_headroom()` | | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_inverse()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_max()` | | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_max_elementwise()` | x | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_min()` | | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_min_elementwise()` | x | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_mul()` | x | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_macc()` | x | :math:`(\mathbb{V \times V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_nmacc()` | x | :math:`(\mathbb{V \times V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_rect()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_scale()` | x | :math:`(\mathbb{V \times S})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_set()` | x | :math:`\mathbb{S}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_shl()` | x | :math:`(\mathbb{V \times S})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_shr()` | x | :math:`(\mathbb{V \times S})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_sqrt()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_sub()` | x | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_sum()` | | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_zip()` | | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_unzip()` | | :math:`\mathbb{V}` |
| | | :math:`\to (\mathbb{V \times V})` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_convolve_valid()` | | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_convolve_same()` | | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_log_base()` | x | :math:`(\mathbb{V \times S})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_log()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_log2()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s32_log10()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`chunk_s32_dot()` | | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`chunk_s32_log()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
.. _vect16_api:
.. table::
:widths: 50 10 35
:class: longtable
+--------------------------------------------------------------------------------------------------+
| **16-bit Vector Ops** |
+-------------------------------------------------+-----+------------------------------------------+
| Function | EW | Signature |
+=================================================+=====+==========================================+
| :c:func:`vect_s16_abs()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_abs_sum()` | | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_add()` | x | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_add_scalar()` | x | :math:`(\mathbb{V \times S})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_argmax()` | | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_argmin()` | | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_clip()` | x | :math:`(\mathbb{V \times S \times S})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_dot()` | | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_energy()` | | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_headroom()` | | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_inverse()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_max()` | | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_max_elementwise()` | x | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_min()` | | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_min_elementwise()` | x | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_mul()` | x | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_macc()` | x | :math:`(\mathbb{V \times V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_nmacc()` | x | :math:`(\mathbb{V \times V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_rect()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_scale()` | x | :math:`(\mathbb{V \times S})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_set()` | x | :math:`\mathbb{S}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_shl()` | x | :math:`(\mathbb{V \times S})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_shr()` | x | :math:`(\mathbb{V \times S})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_sqrt()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_sub()` | x | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_sum()` | | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_extract_high_byte()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_s16_extract_low_byte()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
.. _vect8_api:
.. table::
:widths: 40 10 20 30
:class: longtable
+---------------------------------------------------------------------------------------------------------------+
| **8-bit Vector Ops** |
+---------------------------------+-----+-----------------------------------------------+-----------------------+
| Function | EW | Signature | Brief |
+=================================+=====+===============================================+=======================+
| :c:func:`vect_s8_is_negative()` | x | :math:`\mathbb{V}` | Identify negative |
| | | :math:`\to \mathbb{V}` | elements |
+---------------------------------+-----+-----------------------------------------------+-----------------------+
.. _vect32_complex_api:
.. table::
:widths: 50 10 35
:class: longtable
+--------------------------------------------------------------------------------------------------+
| **32-bit Complex Vector Ops** |
+-------------------------------------------------+-----+------------------------------------------+
| Function | EW | Signature |
+=================================================+=====+==========================================+
| :c:func:`vect_complex_s32_add()` | x | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s32_add_scalar()` | x | :math:`(\mathbb{V \times S})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s32_conj_macc()` | x | :math:`(\mathbb{V \times V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s32_conj_mul()` | x | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s32_conj_nmacc()` | x | :math:`(\mathbb{V \times V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s32_conjugate()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s32_headroom()` | | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s32_macc()` | x | :math:`(\mathbb{V \times V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s32_mag()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s32_mul()` | x | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s32_nmacc()` | x | :math:`(\mathbb{V \times V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s32_real_mul()` | x | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s32_real_scale()` | x | :math:`(\mathbb{V \times S})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s32_scale()` | x | :math:`(\mathbb{V \times S})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s32_set()` | x | :math:`\mathbb{S}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s32_shl()` | x | :math:`(\mathbb{V \times S})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s32_shr()` | x | :math:`(\mathbb{V \times S})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s32_squared_mag()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s32_sub()` | x | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s32_sum()` | | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s32_tail_reverse()` | | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
.. _vect16_complex_api:
.. table::
:widths: 50 10 35
:class: longtable
+--------------------------------------------------------------------------------------------------+
| **16-bit Complex Vector Ops** |
+-------------------------------------------------+-----+------------------------------------------+
| Function | EW | Signature |
+=================================================+=====+==========================================+
| :c:func:`vect_complex_s16_add()` | x | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s16_add_scalar()` | x | :math:`(\mathbb{V \times S})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s16_conj_mul()` | x | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s16_conj_macc()` | x | :math:`(\mathbb{V \times V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s16_conj_nmacc()` | x | :math:`(\mathbb{V \times V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s16_headroom()` | | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s16_macc()` | x | :math:`(\mathbb{V \times V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s16_mag()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s16_mul()` | x | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s16_nmacc()` | x | :math:`(\mathbb{V \times V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s16_real_mul()` | x | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s16_real_scale()` | x | :math:`(\mathbb{V \times S})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s16_scale()` | x | :math:`(\mathbb{V \times S})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s16_set()` | x | :math:`\mathbb{S}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s16_shl()` | x | :math:`(\mathbb{V \times S})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s16_shr()` | x | :math:`(\mathbb{V \times S})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s16_squared_mag()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s16_sub()` | x | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_s16_sum()` | | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
.. _vect_fixed_point_api:
.. table::
:widths: 50 10 35
:class: longtable
+--------------------------------------------------------------------------------------------------+
| **Fixed-Point Vector Ops** |
+-------------------------------------------------+-----+------------------------------------------+
| Function | EW | Signature |
+=================================================+=====+==========================================+
| :c:func:`vect_q30_power_series()` | x | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_q30_exp_small()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`chunk_q30_power_series()` | x | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`chunk_q30_exp_small()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
.. _vect_float_api:
.. table::
:widths: 50 10 35
:class: longtable
+--------------------------------------------------------------------------------------------------+
| **Floating-Point Vector Ops** |
+-------------------------------------------------+-----+------------------------------------------+
| Function | EW | Signature |
+=================================================+=====+==========================================+
| :c:func:`vect_f32_max_exponent()` | | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_f32_dot()` | | :math:`(\mathbb{V \times V})` |
| | | :math:`\to \mathbb{S}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_f32_add()`                        | x   | :math:`(\mathbb{V \times V})`            |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_float_s32_log_base()` | x | :math:`(\mathbb{V \times S})` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_float_s32_log()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_float_s32_log2()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_float_s32_log10()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`chunk_float_s32_log()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_f32_add()`                | x   | :math:`(\mathbb{V \times V})`            |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_f32_mul()`                | x   | :math:`(\mathbb{V \times V})`            |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_f32_conj_mul()`           | x   | :math:`(\mathbb{V \times V})`            |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_f32_macc()`               | x   | :math:`(\mathbb{V \times V \times V})`   |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
| :c:func:`vect_complex_f32_conj_macc()`          | x   | :math:`(\mathbb{V \times V \times V})`   |
| | | :math:`\to \mathbb{V}` |
+-------------------------------------------------+-----+------------------------------------------+
.. _vect_other_api:
Note that several of the functions below take vectors of the :c:struct:`split_acc_s32_t` type. This
is a 32-bit vector type used for accumulating results of 8- or 16-bit operations in a manner
optimized for the XS3 VPU.
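
The idea of a split accumulator can be sketched in portable C. The layout below, with the upper and
lower 16 bits of each 32-bit accumulator held in separate arrays, is an assumption made for
illustration. The type and function names (``example_split_acc_t``, ``example_split()``,
``example_merge()``) and the chunk length of 16 are hypothetical; refer to the documentation of
:c:struct:`split_acc_s32_t` for the actual definition.

.. code-block:: c

  #include <stdint.h>

  #define EXAMPLE_CHUNK_LEN 16  // hypothetical number of accumulators per chunk

  // Hypothetical split layout: high and low halves stored in separate arrays.
  typedef struct {
      int16_t  hi[EXAMPLE_CHUNK_LEN];  // upper 16 bits (carries the sign)
      uint16_t lo[EXAMPLE_CHUNK_LEN];  // lower 16 bits
  } example_split_acc_t;

  // Split ordinary 32-bit values into the two-array form.
  // Assumes two's complement conversions, as on all mainstream targets.
  void example_split(example_split_acc_t *dst,
                     const int32_t src[EXAMPLE_CHUNK_LEN])
  {
      for (int k = 0; k < EXAMPLE_CHUNK_LEN; k++) {
          uint32_t bits = (uint32_t)src[k];
          dst->hi[k] = (int16_t)(uint16_t)(bits >> 16);
          dst->lo[k] = (uint16_t)(bits & 0xFFFFu);
      }
  }

  // Merge the two halves back into ordinary 32-bit values.
  void example_merge(int32_t dst[EXAMPLE_CHUNK_LEN],
                     const example_split_acc_t *src)
  {
      for (int k = 0; k < EXAMPLE_CHUNK_LEN; k++) {
          uint32_t bits = ((uint32_t)(uint16_t)src->hi[k] << 16) | src->lo[k];
          dst[k] = (int32_t)bits;
      }
  }

Splitting and then merging returns the original values, which is the property a pair of functions
such as :c:func:`vect_s32_split_accs()` and :c:func:`vect_s32_merge_accs()` would be expected to
provide.
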
.. table::
:widths: 50 10 35
:class: longtable
+--------------------------------------------------------------------------------+
| **Other Vector Ops** |
+----------------------------------------+---+-----------------------------------+
| Function |EW | Signature |
+========================================+===+===================================+
| :c:func:`vect_split_acc_s32_shr()` | x | :math:`(\mathbb{V \times S})` |
| | | :math:`\to \mathbb{V}` |
+----------------------------------------+---+-----------------------------------+
| :c:func:`vect_s32_merge_accs()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+----------------------------------------+---+-----------------------------------+
| :c:func:`vect_s32_split_accs()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+----------------------------------------+---+-----------------------------------+
| :c:func:`chunk_s16_accumulate()` | x | :math:`\mathbb{V}` |
| | | :math:`\to \mathbb{V}` |
+----------------------------------------+---+-----------------------------------+
| :c:func:`mat_mul_s8_x_s8_yield_s32()` | | :math:`(\mathbb{M \times V})` |
| | | :math:`\to \mathbb{V}` |
+----------------------------------------+---+-----------------------------------+
| :c:func:`mat_mul_s8_x_s16_yield_s32()` | | :math:`(\mathbb{M \times V})` |
| | | :math:`\to \mathbb{V}` |
+----------------------------------------+---+-----------------------------------+
.. _vect_conversion_api:
|beginfullwidth|
.. table::
:widths: 50 25 25
:class: longtable
+----------------------------------------------------------------------------------------------------------+
| **Vector Type Conversion Ops** |
+--------------------------------------------------+-------------------------------------------------------+
| Function | Array Element Type |
+--------------------------------------------------+---------------------------+---------------------------+
| | Input | Output |
+==================================================+===========================+===========================+
| :c:func:`vect_s16_to_vect_s32()` | ``int16_t`` | ``int32_t`` |
+--------------------------------------------------+---------------------------+---------------------------+
| :c:func:`vect_s32_to_vect_s16()` | ``int32_t`` | ``int16_t`` |
+--------------------------------------------------+---------------------------+---------------------------+
| :c:func:`vect_s32_to_vect_f32()` | ``int32_t`` | ``float`` |
+--------------------------------------------------+---------------------------+---------------------------+
| :c:func:`vect_f32_to_vect_s32()` | ``float`` | ``int32_t`` |
+--------------------------------------------------+---------------------------+---------------------------+
| :c:func:`vect_complex_s16_to_vect_complex_s32()` | :c:struct:`complex_s16_t` | :c:struct:`complex_s32_t` |
+--------------------------------------------------+---------------------------+---------------------------+
| :c:func:`vect_complex_s32_to_vect_complex_s16()` | :c:struct:`complex_s32_t` | :c:struct:`complex_s16_t` |
+--------------------------------------------------+---------------------------+---------------------------+
|endfullwidth|
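
A narrowing conversion such as ``int32_t`` to ``int16_t`` has to discard bits. One common approach
is sketched below, under the assumption of a caller-chosen right shift followed by saturation. The
function ``example_s32_to_s16()`` and its parameters are hypothetical; the behaviour of the library
functions listed above is described in their own documentation.

.. code-block:: c

  #include <stddef.h>
  #include <stdint.h>

  // Hypothetical narrowing conversion: shift right by 'shr' bits (0..31),
  // then saturate to the int16_t range. Not the library implementation.
  void example_s32_to_s16(int16_t out[], const int32_t in[], size_t length,
                          unsigned shr)
  {
      for (size_t k = 0; k < length; k++) {
          // An arithmetic (sign-preserving) shift is assumed for negative values.
          int32_t v = in[k] >> shr;
          if (v > INT16_MAX) v = INT16_MAX;
          if (v < INT16_MIN) v = INT16_MIN;
          out[k] = (int16_t)v;
      }
  }

Choosing the shift so that the largest input magnitude just fits in 16 bits keeps as much
precision as possible without relying on saturation.
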

View File

@@ -0,0 +1,5 @@
16-bit vector API
-----------------
.. doxygengroup:: vect_s16_api

View File

@@ -0,0 +1,5 @@
16-bit vector prepare functions
-------------------------------
.. doxygengroup:: vect_s16_prepare_api

View File

@@ -0,0 +1,5 @@
32-bit vector API
-----------------
.. doxygengroup:: vect_s32_api

View File

@@ -0,0 +1,5 @@
32-bit vector prepare functions
-------------------------------
.. doxygengroup:: vect_s32_prepare_api

View File

@@ -0,0 +1,5 @@
8-bit vector API
----------------
.. doxygengroup:: vect_s8_api

View File

@@ -0,0 +1,65 @@
**********
Unit tests
**********
This project uses `XCommon CMake` to build the unit tests in a similar fashion to the examples.
Unit tests target the `XK-EVK-XU316` board and x86 platforms.
All unit tests are located in the */tests/* directory:
* `/tests/ <https://github.com/xmos/lib_xcore_math/tree/develop/tests/>`_ - Unit test projects for ``lib_xcore_math``:
* `bfp_tests/ <https://github.com/xmos/lib_xcore_math/tree/develop/tests/bfp_tests/>`_ - BFP unit tests
* `dct_tests/ <https://github.com/xmos/lib_xcore_math/tree/develop/tests/dct_tests/>`_ - DCT unit tests
* `filter_tests/ <https://github.com/xmos/lib_xcore_math/tree/develop/tests/filter_tests/>`_ - Filtering unit tests
* `fft_tests/ <https://github.com/xmos/lib_xcore_math/tree/develop/tests/fft_tests/>`_ - FFT unit tests
* `scalar_tests/ <https://github.com/xmos/lib_xcore_math/tree/develop/tests/scalar_tests/>`_ - Scalar op unit tests
* `vect_tests/ <https://github.com/xmos/lib_xcore_math/tree/develop/tests/vect_tests/>`_ - Vector op unit tests
* `xs3_tests/ <https://github.com/xmos/lib_xcore_math/tree/develop/tests/xs3_tests/>`_ - XS3-specific unit tests
All unit tests and examples are built and executed in a similar manner. The following shows how to do this with
the BFP unit tests.
BFP unit tests
==============
This application runs unit tests for the various 16- and 32-bit BFP vectorized arithmetic functions.
This application is located at `/tests/bfp_tests/
<https://github.com/xmos/lib_xcore_math/tree/develop/tests/bfp_tests>`_.
To build the tests, run the following commands from an XTC command prompt in the
`lib_xcore_math/tests/bfp_tests` directory::
cmake -B build -G "Unix Makefiles"
xmake -C build
To execute the BFP unit tests on the `XK-EVK-XU316`, first ensure that the hardware is connected
and the drivers are properly installed, then run: ::
xrun --xscope bin/bfp_tests.xe
Or, to run the unit tests in the software simulator: ::
xsim bin/bfp_tests.xe
.. warning::
Running the unit tests in the simulator may be *very* slow.
To execute the BFP unit tests built for an x86 host platform, configure the build using the
``BUILD_NATIVE`` option: ::
cmake -B build_x86 -G "Unix Makefiles" -D BUILD_NATIVE=TRUE
xmake -C build_x86
On Linux and macOS, run the tests as follows: ::
bin/bfp_tests/bfp_tests -v
and on Windows: ::
bin\bfp_tests\bfp_tests.exe -v
where ``-v`` is an optional argument to increase verbosity.