29 Numerics library [numerics]

29.9 Basic linear algebra algorithms [linalg]

29.9.13 BLAS 1 algorithms [linalg.algs.blas1]

29.9.13.1 Complexity [linalg.algs.blas1.complexity]

Complexity: All algorithms in [linalg.algs.blas1] with mdspan parameters perform a count of mdspan array accesses and arithmetic operations that is linear in the maximum product of extents of any mdspan parameter.

29.9.13.2 Givens rotations [linalg.algs.blas1.givens]

29.9.13.2.1 Compute Givens rotation [linalg.algs.blas1.givens.lartg]

template<class Real>
  setup_givens_rotation_result<Real> setup_givens_rotation(Real a, Real b) noexcept;

template<class Real>
  setup_givens_rotation_result<complex<Real>>
    setup_givens_rotation(complex<Real> a, complex<Real> b) noexcept;

These functions compute the Givens plane rotation represented by the two values c and s such that the 2 x 2 system of equations

\left[ \begin{matrix} c & s \\ -\overline{s} & c \end{matrix} \right] \cdot \left[ \begin{matrix} a \\ b \end{matrix} \right] = \left[ \begin{matrix} r \\ 0 \end{matrix} \right]

holds, where c is always a real scalar, and

c^{2} + | s |^{2} = 1

That is, c and s represent a 2 x 2 matrix that, when multiplied on the right by the input vector whose components are a and b, produces a result vector whose first component r is the Euclidean norm of the input vector, and whose second component is zero.

[Note 1:

These functions correspond to the LAPACK function xLARTG[bib].

— end note]

Returns: c, s, r, where c and s form the Givens plane rotation corresponding to the input a and b, and r is the Euclidean norm of the two-component vector formed by a and b.
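
As a rough illustration of the real-valued case, the following sketch computes c, s, and r with std::hypot and does not use the library itself. The names givens_result and make_givens are hypothetical, and returning the identity rotation for a zero input vector is an assumption of this sketch, not something this subclause states.

```cpp
#include <cmath>

// Hypothetical stand-in for setup_givens_rotation_result<double>.
struct givens_result {
    double c;
    double s;
    double r;
};

// Computes c, s, r such that [c s; -s c] * [a; b] = [r; 0] and c*c + s*s == 1.
inline givens_result make_givens(double a, double b) {
    const double r = std::hypot(a, b);  // Euclidean norm of the vector (a, b)
    if (r == 0.0) {
        return {1.0, 0.0, 0.0};  // assumption: identity rotation for a zero input
    }
    return {a / r, b / r, r};
}
```

With a = 3 and b = 4, this sketch yields r = 5 and values of c and s close to 0.6 and 0.8.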

29.9.13.2.2 Apply a computed Givens rotation to vectors [linalg.algs.blas1.givens.rot]

template<inout-vector InOutVec1, inout-vector InOutVec2, class Real>
  void apply_givens_rotation(InOutVec1 x, InOutVec2 y, Real c, Real s);
template<class ExecutionPolicy, inout-vector InOutVec1, inout-vector InOutVec2, class Real>
  void apply_givens_rotation(ExecutionPolicy&& exec,
                             InOutVec1 x, InOutVec2 y, Real c, Real s);
template<inout-vector InOutVec1, inout-vector InOutVec2, class Real>
  void apply_givens_rotation(InOutVec1 x, InOutVec2 y, Real c, complex<Real> s);
template<class ExecutionPolicy, inout-vector InOutVec1, inout-vector InOutVec2, class Real>
  void apply_givens_rotation(ExecutionPolicy&& exec,
                             InOutVec1 x, InOutVec2 y, Real c, complex<Real> s);

[Note 1:

These functions correspond to the BLAS function xROT[bib].

— end note]

Mandates: compatible-static-extents<InOutVec1, InOutVec2>(0, 0) is true.

Preconditions: x.extent(0) equals y.extent(0).

Effects: Applies the plane rotation specified by c and s to the input vectors x and y, as if the rotation were a 2 x 2 matrix and the input vectors were successive rows of a matrix with two rows.
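
The Effects element can be pictured, for the real-valued overloads only, with the sketch below. It operates on std::vector rather than mdspan, the name rotate_rows is hypothetical, and the update formulas are the conventional xROT ones, assumed here to match the 2 x 2 matrix [c s; -s c] used in [linalg.algs.blas1.givens.lartg].

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: applies the real rotation [c s; -s c] to each pair (x[i], y[i]) in place.
inline void rotate_rows(std::vector<double>& x, std::vector<double>& y, double c, double s) {
    assert(x.size() == y.size());  // mirrors the precondition x.extent(0) == y.extent(0)
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double xi = x[i];
        const double yi = y[i];
        x[i] = c * xi + s * yi;  // first row of the rotated two-row matrix
        y[i] = c * yi - s * xi;  // second row of the rotated two-row matrix
    }
}
```

Applying c = 0.6 and s = 0.8 to the pair (3, 4) maps it to approximately (5, 0), which is the pair (r, 0) from the rotation that was set up for a = 3 and b = 4.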

29.9.13.3 Swap matrix or vector elements [linalg.algs.blas1.swap]

template<inout-object InOutObj1, inout-object InOutObj2>
  void swap_elements(InOutObj1 x, InOutObj2 y);
template<class ExecutionPolicy, inout-object InOutObj1, inout-object InOutObj2>
  void swap_elements(ExecutionPolicy&& exec, InOutObj1 x, InOutObj2 y);

[Note 1:

These functions correspond to the BLAS function xSWAP[bib].

— end note]

Constraints: x.rank() equals y.rank().

Mandates: For all r in the range [0, x.rank()), compatible-static-extents<InOutObj1, InOutObj2>(r, r) is true.

Preconditions: x.extents() equals y.extents().

Effects: Swaps all corresponding elements of x and y.

29.9.13.4 Multiply the elements of an object in place by a scalar [linalg.algs.blas1.scal]

template<class Scalar, inout-object InOutObj>
  void scale(Scalar alpha, InOutObj x);
template<class ExecutionPolicy, class Scalar, inout-object InOutObj>
  void scale(ExecutionPolicy&& exec, Scalar alpha, InOutObj x);

[Note 1:

These functions correspond to the BLAS function xSCAL[bib].

— end note]

Effects: Overwrites x with the result of computing the elementwise multiplication αx, where the scalar α is alpha.

29.9.13.5 Copy elements of one matrix or vector into another [linalg.algs.blas1.copy]

template<in-object InObj, out-object OutObj>
  void copy(InObj x, OutObj y);
template<class ExecutionPolicy, in-object InObj, out-object OutObj>
  void copy(ExecutionPolicy&& exec, InObj x, OutObj y);

[Note 1:

These functions correspond to the BLAS function xCOPY[bib].

— end note]

Constraints: x.rank() equals y.rank().

Mandates: For all r in the range [0, x.rank()), compatible-static-extents<InObj, OutObj>(r, r) is true.

Preconditions: x.extents() equals y.extents().

Effects: Assigns each element of x to the corresponding element of y.

29.9.13.6 Add vectors or matrices elementwise [linalg.algs.blas1.add]

template<in-object InObj1, in-object InObj2, out-object OutObj>
  void add(InObj1 x, InObj2 y, OutObj z);
template<class ExecutionPolicy, in-object InObj1, in-object InObj2, out-object OutObj>
  void add(ExecutionPolicy&& exec,
           InObj1 x, InObj2 y, OutObj z);

[Note 1:

These functions correspond to the BLAS function xAXPY[bib].

— end note]

Constraints: x.rank(), y.rank(), and z.rank() are all equal.

Mandates: possibly-addable<InObj1, InObj2, OutObj>() is true.

Preconditions: addable(x,y,z) is true.

Effects: Computes z = x + y.

Remarks: z may alias x or y.

29.9.13.7 Dot product of two vectors [linalg.algs.blas1.dot]

[Note 1:

The functions in this section correspond to the BLAS functions xDOT, xDOTU, and xDOTC[bib].

— end note]

The following elements apply to all functions in [linalg.algs.blas1.dot].

Mandates: compatible-static-extents<InVec1, InVec2>(0, 0) is true.

Preconditions: v1.extent(0) equals v2.extent(0).

template<in-vector InVec1, in-vector InVec2, class Scalar>
  Scalar dot(InVec1 v1, InVec2 v2, Scalar init);
template<class ExecutionPolicy, in-vector InVec1, in-vector InVec2, class Scalar>
  Scalar dot(ExecutionPolicy&& exec,
             InVec1 v1, InVec2 v2, Scalar init);

These functions compute a non-conjugated dot product with an explicitly specified result type.

Returns: Let N be v1.extent(0).

(6.1)
init if N is zero;
(6.2)
otherwise, GENERALIZED_SUM(plus<>(), init, v1[0]*v2[0], …, v1[N-1]*v2[N-1]).

Remarks: If InVec1::value_type, InVec2::value_type, and Scalar are all floating-point types or specializations of complex, and if Scalar has higher precision than InVec1::value_type or InVec2::value_type, then intermediate terms in the sum use Scalar's precision or greater.
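
To make the role of init and Scalar concrete, here is a minimal sketch over std::vector that accumulates in Scalar. The name dot_sketch is hypothetical, and the left-to-right summation order is only one of the orders that GENERALIZED_SUM permits.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: non-conjugated dot product accumulated in Scalar, starting from init.
template <class T, class Scalar>
Scalar dot_sketch(const std::vector<T>& v1, const std::vector<T>& v2, Scalar init) {
    assert(v1.size() == v2.size());  // mirrors the precondition v1.extent(0) == v2.extent(0)
    Scalar sum = init;               // returned unchanged when the vectors are empty
    for (std::size_t i = 0; i < v1.size(); ++i) {
        // Convert before multiplying so that products and partial sums use Scalar's precision.
        sum += static_cast<Scalar>(v1[i]) * static_cast<Scalar>(v2[i]);
    }
    return sum;
}
```

Calling dot_sketch on two std::vector<float> arguments with a double init keeps every intermediate term in double, which is the situation the Remarks element describes.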

template<in-vector InVec1, in-vector InVec2>
  auto dot(InVec1 v1, InVec2 v2);
template<class ExecutionPolicy, in-vector InVec1, in-vector InVec2>
  auto dot(ExecutionPolicy&& exec,
           InVec1 v1, InVec2 v2);

These functions compute a non-conjugated dot product with a default result type.

Effects: Let T be decltype(declval<typename InVec1::value_type>() * declval<typename InVec2::value_type>()).

Then,

(9.1)
the two-parameter overload is equivalent to: return dot(v1, v2, T{}); and
(9.2)
the three-parameter overload is equivalent to: return dot(std::forward<ExecutionPolicy>(exec), v1, v2, T{});

template<in-vector InVec1, in-vector InVec2, class Scalar>
  Scalar dotc(InVec1 v1, InVec2 v2, Scalar init);
template<class ExecutionPolicy, in-vector InVec1, in-vector InVec2, class Scalar>
  Scalar dotc(ExecutionPolicy&& exec,
              InVec1 v1, InVec2 v2, Scalar init);

These functions compute a conjugated dot product with an explicitly specified result type.

Effects:

(11.1)
The three-parameter overload is equivalent to: return dot(conjugated(v1), v2, init); and
(11.2)
the four-parameter overload is equivalent to: return dot(std::forward<ExecutionPolicy>(exec), conjugated(v1), v2, init);
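
The equivalence to dot(conjugated(v1), v2, init) means that the first argument is the one conjugated. The sketch below spells that out for std::complex<double> stored in std::vector; the names cplx and dotc_sketch are hypothetical.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Hypothetical helper: conjugated dot product, init + sum of conj(v1[i]) * v2[i].
inline cplx dotc_sketch(const std::vector<cplx>& v1, const std::vector<cplx>& v2, cplx init) {
    assert(v1.size() == v2.size());  // mirrors the precondition v1.extent(0) == v2.extent(0)
    cplx sum = init;
    for (std::size_t i = 0; i < v1.size(); ++i) {
        sum += std::conj(v1[i]) * v2[i];  // only the first argument is conjugated
    }
    return sum;
}
```

For v1 = {1 + 2i} and v2 = {3 + 4i} with a zero init, the conjugated product is 11 - 2i, whereas the non-conjugated product would be -5 + 10i.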

template<in-vector InVec1, in-vector InVec2>
  auto dotc(InVec1 v1, InVec2 v2);
template<class ExecutionPolicy, in-vector InVec1, in-vector InVec2>
  auto dotc(ExecutionPolicy&& exec,
            InVec1 v1, InVec2 v2);

These functions compute a conjugated dot product with a default result type.

Effects: Let T be decltype(conj-if-needed(declval<typename InVec1::value_type>()) * declval<typename InVec2::value_type>()).

Then,

(13.1)
the two-parameter overload is equivalent to: return dotc(v1, v2, T{}); and
(13.2)
the three-parameter overload is equivalent to: return dotc(std::forward<ExecutionPolicy>(exec), v1, v2, T{});

29.9.13.8 Scaled sum of squares of a vector's elements [linalg.algs.blas1.ssq]

template<in-vector InVec, class Scalar>
  sum_of_squares_result<Scalar> vector_sum_of_squares(InVec v, sum_of_squares_result<Scalar> init);
template<class ExecutionPolicy, in-vector InVec, class Scalar>
  sum_of_squares_result<Scalar> vector_sum_of_squares(ExecutionPolicy&& exec,
                                                      InVec v, sum_of_squares_result<Scalar> init);

[Note 1:

These functions correspond to the LAPACK function xLASSQ[bib].

— end note]

Mandates: decltype(abs-if-needed(declval<typename InVec::value_type>())) is convertible to Scalar.

Effects: Returns a value result such that

(3.1)
result.scaling_factor is the maximum of init.scaling_factor and abs-if-needed(v[i]) for all i in the domain of v; and
(3.2)
if s2init is init.scaling_factor * init.scaling_factor * init.scaled_sum_of_squares, then result.scaling_factor * result.scaling_factor * result.scaled_sum_of_squares equals the sum of s2init and the squares of abs-if-needed(v[i]) for all i in the domain of v.

Remarks: If InVec::value_type and Scalar are both floating-point types or specializations of complex, and if Scalar has higher precision than InVec::value_type, then intermediate terms in the sum use Scalar's precision or greater.
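
One way to satisfy both conditions of the Effects element is a two-pass computation: find the scaling factor first, then accumulate squares of scaled elements. The sketch below does this for real elements in a std::vector. The names ssq_result and sum_of_squares_sketch are hypothetical, and a non-negative init.scaling_factor is assumed.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical stand-in for sum_of_squares_result<double>.
struct ssq_result {
    double scaling_factor;
    double scaled_sum_of_squares;
};

// Two-pass sketch: pick the new scaling factor, then accumulate squares of scaled elements.
inline ssq_result sum_of_squares_sketch(const std::vector<double>& v, ssq_result init) {
    double scale = init.scaling_factor;  // assumption: init.scaling_factor >= 0
    for (double e : v) {
        scale = std::max(scale, std::fabs(e));
    }
    if (scale == 0.0) {
        // Every contribution is zero, so any scaled sum satisfies the required relation.
        return {0.0, init.scaled_sum_of_squares};
    }
    const double ratio = init.scaling_factor / scale;
    double ssq = init.scaled_sum_of_squares * ratio * ratio;  // rescale the initial contribution
    for (double e : v) {
        const double t = std::fabs(e) / scale;  // t <= 1, so t * t cannot overflow
        ssq += t * t;
    }
    return {scale, ssq};
}
```

For v = {3, 4} and init = {0, 1}, the sketch returns a scaling factor of 4 and a scaled sum of 1.5625, and 4 * 4 * 1.5625 = 25 is the sum of the squares of the elements.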

29.9.13.9 Euclidean norm of a vector [linalg.algs.blas1.nrm2]

template<in-vector InVec, class Scalar>
  Scalar vector_two_norm(InVec v, Scalar init);
template<class ExecutionPolicy, in-vector InVec, class Scalar>
  Scalar vector_two_norm(ExecutionPolicy&& exec, InVec v, Scalar init);

[Note 1:

These functions correspond to the BLAS function xNRM2[bib].

— end note]

Mandates: Let a be abs-if-needed(declval<typename InVec::value_type>()).

Then, decltype(init + a * a) is convertible to Scalar.

Returns: The square root of the sum of the square of init and the squares of the absolute values of the elements of v.

[Note 2:

For init equal to zero, this is the Euclidean norm (also called 2-norm) of the vector v.

— end note]

[Note 3:

An implementation of this function for floating-point types T can use the scaled_sum_of_squares result from vector_sum_of_squares(v, {.scaling_factor=1.0, .scaled_sum_of_squares=init}).

— end note]
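
The Returns element for the overloads taking init can be read as the direct formula sketched below for real elements. The name two_norm_sketch is hypothetical. This direct form can overflow when elements are very large, which is one reason an implementation might prefer a scaled accumulation instead.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper: sqrt(init^2 + sum of |v[i]|^2), computed without any scaling.
inline double two_norm_sketch(const std::vector<double>& v, double init) {
    double sum = init * init;
    for (double e : v) {
        sum += e * e;  // for real e, |e|^2 equals e * e
    }
    return std::sqrt(sum);
}
```

For v = {3, 4}, an init of 0 gives 5, and an init of 12 gives 13, since 144 + 9 + 16 = 169.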

template<in-vector InVec>
  auto vector_two_norm(InVec v);
template<class ExecutionPolicy, in-vector InVec>
  auto vector_two_norm(ExecutionPolicy&& exec, InVec v);

Effects: Let a be abs-if-needed(declval<typename InVec::value_type>()).

Let T be decltype(a * a).

Then,

(5.1)
the one-parameter overload is equivalent to: return vector_two_norm(v, T{}); and
(5.2)
the two-parameter overload is equivalent to: return vector_two_norm(std::forward<ExecutionPolicy>(exec), v, T{});

29.9.13.10 Sum of absolute values of vector elements [linalg.algs.blas1.asum]

template<in-vector InVec, class Scalar>
  Scalar vector_abs_sum(InVec v, Scalar init);
template<class ExecutionPolicy, in-vector InVec, class Scalar>
  Scalar vector_abs_sum(ExecutionPolicy&& exec, InVec v, Scalar init);

[Note 1:

These functions correspond to the BLAS functions SASUM, DASUM, SCASUM, and DZASUM[bib].

— end note]

Mandates: decltype(init + abs-if-needed(real-if-needed(declval<typename InVec::value_type>())) + abs-if-needed(imag-if-needed(declval<typename InVec::value_type>()))) is convertible to Scalar.

Returns: Let N be v.extent(0).

(3.1)
init if N is zero;
(3.2)
otherwise, if InVec::value_type is an arithmetic type, GENERALIZED_SUM(plus<>(), init, abs-if-needed(v[0]), …, abs-if-needed(v[N-1]));
(3.3)
otherwise, GENERALIZED_SUM(plus<>(), init, abs-if-needed(real-if-needed(v[0])) + abs-if-needed(imag-if-needed(v[0])), …, abs-if-needed(real-if-needed(v[N-1])) + abs-if-needed(imag-if-needed(v[N-1])))

Remarks: If InVec::value_type and Scalar are both floating-point types or specializations of complex, and if Scalar has higher precision than InVec::value_type, then intermediate terms in the sum use Scalar's precision or greater.
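
For complex elements, the Returns element sums the absolute values of the real and imaginary parts, not the complex modulus. A minimal sketch over std::vector, using the hypothetical name abs_sum_sketch, illustrates the difference.

```cpp
#include <cmath>
#include <complex>
#include <vector>

// Hypothetical helper: init + sum of |Re(v[i])| + |Im(v[i])| over a complex vector.
inline double abs_sum_sketch(const std::vector<std::complex<double>>& v, double init) {
    double sum = init;  // returned unchanged when v is empty
    for (const std::complex<double>& e : v) {
        sum += std::fabs(e.real()) + std::fabs(e.imag());  // not the modulus std::abs(e)
    }
    return sum;
}
```

For v = {3 - 4i, -1 + 2i} and a zero init, the result is 3 + 4 + 1 + 2 = 10, whereas summing moduli would give 5 plus the square root of 5.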

template<in-vector InVec>
  auto vector_abs_sum(InVec v);
template<class ExecutionPolicy, in-vector InVec>
  auto vector_abs_sum(ExecutionPolicy&& exec, InVec v);

Effects: Let T be typename InVec::value_type.

Then,

(5.1)
the one-parameter overload is equivalent to: return vector_abs_sum(v, T{}); and
(5.2)
the two-parameter overload is equivalent to: return vector_abs_sum(std::forward<ExecutionPolicy>(exec), v, T{});

29.9.13.11 Index of maximum absolute value of vector elements [linalg.algs.blas1.iamax]

template<in-vector InVec>
  typename InVec::size_type vector_idx_abs_max(InVec v);
template<class ExecutionPolicy, in-vector InVec>
  typename InVec::size_type vector_idx_abs_max(ExecutionPolicy&& exec, InVec v);

[Note 1:

These functions correspond to the BLAS function IxAMAX[bib].

— end note]

Let T be decltype(abs-if-needed(real-if-needed(declval<typename InVec::value_type>())) + abs-if-needed(imag-if-needed(declval<typename InVec::value_type>()))).

Mandates: declval<T>() < declval<T>() is a valid expression.

Returns:

(4.1)
numeric_limits<typename InVec::size_type>::max() if v has zero elements;
(4.2)
otherwise, the index of the first element of v having largest absolute value, if InVec::value_type is an arithmetic type;
(4.3)
otherwise, the index of the first element $v_{e}$ of v for which abs-if-needed(real-if-needed( $v_{e}$ )) + abs-if-needed(imag-if-needed( $v_{e}$ )) has the largest value.
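
The first two cases of the Returns element, the empty vector and arithmetic element types, can be sketched as follows. The name idx_abs_max_sketch is hypothetical, and std::size_t stands in for InVec::size_type.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: index of the first element with the largest absolute value.
inline std::size_t idx_abs_max_sketch(const std::vector<double>& v) {
    if (v.empty()) {
        return std::numeric_limits<std::size_t>::max();  // sentinel for zero elements
    }
    std::size_t best = 0;
    for (std::size_t i = 1; i < v.size(); ++i) {
        if (std::fabs(v[best]) < std::fabs(v[i])) {  // strict < keeps the first of equal maxima
            best = i;
        }
    }
    return best;
}
```

For v = {1, -7, 7, 3}, the elements at indices 1 and 2 tie in absolute value, and the sketch returns 1 because it is the first.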

29.9.13.12 Frobenius norm of a matrix [linalg.algs.blas1.matfrobnorm]

[Note 1:

These functions exist in the BLAS standard[bib] but are not part of the reference implementation.

— end note]

template<in-matrix InMat, class Scalar>
  Scalar matrix_frob_norm(InMat A, Scalar init);
template<class ExecutionPolicy, in-matrix InMat, class Scalar>
  Scalar matrix_frob_norm(ExecutionPolicy&& exec, InMat A, Scalar init);

Mandates: Let a be abs-if-needed(declval<typename InMat::value_type>()).

Then, decltype(init + a * a) is convertible to Scalar.

Returns: The square root of the sum of the square of init and the squares of the absolute values of the elements of A.

[Note 2:

For init equal to zero, this is the Frobenius norm of the matrix A.

— end note]

Remarks: If InMat::value_type and Scalar are both floating-point types or specializations of complex, and if Scalar has higher precision than InMat::value_type, then intermediate terms in the sum use Scalar's precision or greater.

template<in-matrix InMat>
  auto matrix_frob_norm(InMat A);
template<class ExecutionPolicy, in-matrix InMat>
  auto matrix_frob_norm(ExecutionPolicy&& exec, InMat A);

Effects: Let a be abs-if-needed(declval<typename InMat::value_type>()).

Let T be decltype(a * a).

Then,

(5.1)
the one-parameter overload is equivalent to: return matrix_frob_norm(A, T{}); and
(5.2)
the two-parameter overload is equivalent to: return matrix_frob_norm(std::forward<ExecutionPolicy>(exec), A, T{});

29.9.13.13 One norm of a matrix [linalg.algs.blas1.matonenorm]

[Note 1:

These functions exist in the BLAS standard[bib] but are not part of the reference implementation.

— end note]

template<in-matrix InMat, class Scalar>
  Scalar matrix_one_norm(InMat A, Scalar init);
template<class ExecutionPolicy, in-matrix InMat, class Scalar>
  Scalar matrix_one_norm(ExecutionPolicy&& exec, InMat A, Scalar init);

Mandates: decltype(abs-if-needed(declval<typename InMat::value_type>())) is convertible to Scalar.

Returns:

(3.1)
init if A.extent(1) is zero;
(3.2)
otherwise, the sum of init and the one norm of the matrix A.

[Note 2:

The one norm of the matrix A is the maximum over all columns of A, of the sum of the absolute values of the elements of the column.

— end note]
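
The definition in the note above can be sketched for a real matrix as a maximum over column sums. The sketch assumes row-major storage in a flat std::vector with explicit extents; the name one_norm_sketch and that storage choice are hypothetical and stand in for an mdspan.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: A is stored row-major in a flat vector with the given extents.
inline double one_norm_sketch(const std::vector<double>& A, std::size_t rows,
                              std::size_t cols, double init) {
    assert(A.size() == rows * cols);
    double norm = 0.0;
    for (std::size_t j = 0; j < cols; ++j) {
        double column_sum = 0.0;
        for (std::size_t i = 0; i < rows; ++i) {
            column_sum += std::fabs(A[i * cols + j]);
        }
        norm = std::max(norm, column_sum);  // maximum over all columns
    }
    return init + norm;  // with zero columns, norm stays 0 and init is returned
}
```

For the 2 x 2 matrix with rows (1, -2) and (3, 4), the column sums are 4 and 6, so the one norm is 6.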

template<in-matrix InMat>
  auto matrix_one_norm(InMat A);
template<class ExecutionPolicy, in-matrix InMat>
  auto matrix_one_norm(ExecutionPolicy&& exec, InMat A);

Effects: Let T be decltype(abs-if-needed(declval<typename InMat::value_type>())).

Then,

(5.1)
the one-parameter overload is equivalent to: return matrix_one_norm(A, T{}); and
(5.2)
the two-parameter overload is equivalent to: return matrix_one_norm(std::forward<ExecutionPolicy>(exec), A, T{});

29.9.13.14 Infinity norm of a matrix [linalg.algs.blas1.matinfnorm]

[Note 1:

These functions exist in the BLAS standard[bib] but are not part of the reference implementation.

— end note]

template<in-matrix InMat, class Scalar>
  Scalar matrix_inf_norm(InMat A, Scalar init);
template<class ExecutionPolicy, in-matrix InMat, class Scalar>
  Scalar matrix_inf_norm(ExecutionPolicy&& exec, InMat A, Scalar init);

Mandates: decltype(abs-if-needed(declval<typename InMat::value_type>())) is convertible to Scalar.

Returns:

(3.1)
init if A.extent(0) is zero;
(3.2)
otherwise, the sum of init and the infinity norm of the matrix A.

[Note 2:

The infinity norm of the matrix A is the maximum over all rows of A, of the sum of the absolute values of the elements of the row.

— end note]
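
The definition in the note above can be sketched for a real matrix as a maximum over row sums. As an illustration only, the sketch assumes row-major storage in a flat std::vector with explicit extents; the name inf_norm_sketch and that storage choice are hypothetical and stand in for an mdspan.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: A is stored row-major in a flat vector with the given extents.
inline double inf_norm_sketch(const std::vector<double>& A, std::size_t rows,
                              std::size_t cols, double init) {
    assert(A.size() == rows * cols);
    double norm = 0.0;
    for (std::size_t i = 0; i < rows; ++i) {
        double row_sum = 0.0;
        for (std::size_t j = 0; j < cols; ++j) {
            row_sum += std::fabs(A[i * cols + j]);
        }
        norm = std::max(norm, row_sum);  // maximum over all rows
    }
    return init + norm;  // with zero rows, norm stays 0 and init is returned
}
```

For the 2 x 2 matrix with rows (1, -2) and (3, 4), the row sums are 3 and 7, so the infinity norm is 7.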

template<in-matrix InMat>
  auto matrix_inf_norm(InMat A);
template<class ExecutionPolicy, in-matrix InMat>
  auto matrix_inf_norm(ExecutionPolicy&& exec, InMat A);

Effects: Let T be decltype(abs-if-needed(declval<typename InMat::value_type>())).

Then,

(5.1)
the one-parameter overload is equivalent to: return matrix_inf_norm(A, T{}); and
(5.2)
the two-parameter overload is equivalent to: return matrix_inf_norm(std::forward<ExecutionPolicy>(exec), A, T{});