[c++] What is the fastest way to transpose a matrix in C++?

Modern linear algebra libraries include optimized versions of the most common operations. Many of them include dynamic CPU dispatch, which chooses the best implementation for the hardware at program execution time (without compromising on portability).

This is commonly a better alternative to performing manual optimization of your functinos via vector extensions intrinsic functions. The latter will tie your implementation to a particular hardware vendor and model: if you decide to swap to a different vendor (e.g. Power, ARM) or to a newer vector extensions (e.g. AVX512), you will need to re-implement it again to get the most of them.

MKL transposition, for example, includes the BLAS extensions function imatcopy. You can find it in other implementations such as OpenBLAS as well:

#include <mkl.h>

void transpose( float* a, int n, int m ) {
    const char row_major = 'R';
    const char transpose = 'T';
    const float alpha = 1.0f;
    mkl_simatcopy (row_major, transpose, n, m, alpha, a, n, n);
}

For a C++ project, you can make use of the Armadillo C++:

#include <armadillo>

void transpose( arma::mat &matrix ) {
    arma::inplace_trans(matrix);
}

Examples related to c++

Method Call Chaining; returning a pointer vs a reference? How can I tell if an algorithm is efficient? Difference between opening a file in binary vs text How can compare-and-swap be used for a wait-free mutual exclusion for any shared data structure? Install Qt on Ubuntu #include errors detected in vscode Cannot open include file: 'stdio.h' - Visual Studio Community 2017 - C++ Error How to fix the error "Windows SDK version 8.1" was not found? Visual Studio 2017 errors on standard headers How do I check if a Key is pressed on C++

Examples related to algorithm

How can I tell if an algorithm is efficient? Find the smallest positive integer that does not occur in a given sequence Efficiently getting all divisors of a given number Peak signal detection in realtime timeseries data What is the optimal algorithm for the game 2048? How can I sort a std::map first by value, then by key? Finding square root without using sqrt function? Fastest way to flatten / un-flatten nested JSON objects Mergesort with Python Find common substring between two strings

Examples related to matrix

How to get element-wise matrix multiplication (Hadamard product) in numpy? How can I plot a confusion matrix? Error: stray '\240' in program What does the error "arguments imply differing number of rows: x, y" mean? How to input matrix (2D list) in Python? Difference between numpy.array shape (R, 1) and (R,) Counting the number of non-NaN elements in a numpy ndarray in Python Inverse of a matrix using numpy How to create an empty matrix in R? numpy matrix vector multiplication

Examples related to transpose

Converting rows into columns and columns into rows using R Postgres - Transpose Rows to Columns Transposing a 2D-array in JavaScript What is the fastest way to transpose a matrix in C++? Excel VBA - Range.Copy transpose paste Transpose list of lists Transposing a 1D NumPy array An efficient way to transpose a file in Bash Transpose/Unzip Function (inverse of zip)?