Matrix utilities

Helpers for working with (sparse) 2d matrices

Sparse matrix helpers

`get_dense_row`(matrix, row[, dtype])	Extract row from the sparse matrix
`sparse_matrix_to_dense`(sparse_matrix)	Convert sparse_matrix to a dense numpy array
`sparse_matrix_to_list`(sparse_matrix)	Convert sparse_matrix to a list of “sparse row vectors”.
`write_sparse_matrix`(target, a, compress, …)	Write a to the file target in matrix market format

Matrix operation helpers

`col_op`(m, op)	Apply op to each column in the matrix.
`col_sum`(m)	Calculate the sum of each column in the matrix.
`col_sum_mean`(m, return_var)	Calculate the mean of the sum of each column in the matrix.
`normalize_columns`(matrix)	Normalize the columns of the given (dense) matrix
`row_op`(m, op)	Apply op to each row in the matrix.
`row_sum`(m)	Calculate the sum of each row in the matrix.
`row_sum_mean`(m, var)	Calculate the mean of the sum of each row in the matrix.
`normalize_rows`(matrix)	Normalize the rows of the given (dense) matrix

Other helpers

`matrix_multiply`(m1, m2, m3)	Multiply the three matrices
`permute_matrix`(m, is_flat, shape)	Randomly permute the entries of the matrix.

Definitions

pyllars.matrix_utils.col_op(m, op): Apply op to each column in the matrix.

pyllars.matrix_utils.col_sum(m): Calculate the sum of each column in the matrix.

pyllars.matrix_utils.col_sum_mean(m: numpy.ndarray, return_var: bool = False) → float

Calculate the mean of the sum of each column in the matrix.

Optionally, the variance of the column sums can also be calculated.

Parameters:

m (numpy.ndarray) – The (2d) matrix
return_var (bool) – Whether to also calculate the variance of the column sums

Returns:

mean (float) – The mean of the column sums in the matrix
variance (float) – If return_var is True, then the variance of the column sums
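
The calculation described above can be sketched with plain NumPy. This is an illustration of the documented behavior rather than the pyllars implementation; the helper name `col_sum_mean_sketch` is hypothetical, and the use of the population variance (`ddof=0`) is an assumption.

```python
import numpy as np


def col_sum_mean_sketch(m, return_var=False):
    """Hypothetical stand-in for col_sum_mean, written with plain NumPy."""
    # Sum over the rows, giving one total per column.
    col_sums = np.sum(m, axis=0)
    mean = float(np.mean(col_sums))
    if return_var:
        # Assumption: population variance (ddof=0) of the column sums.
        return mean, float(np.var(col_sums))
    return mean


m = np.array([[1.0, 2.0],
              [3.0, 4.0]])

mean_only = col_sum_mean_sketch(m)
mean, variance = col_sum_mean_sketch(m, return_var=True)
```

The row-wise variant would follow the same pattern with `axis=1`.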

pyllars.matrix_utils.get_dense_row(matrix: scipy.sparse.base.spmatrix, row: int, dtype=<class 'float'>, max_length: Optional[int] = None) → numpy.ndarray

Extract row from the sparse matrix

Parameters:

matrix (scipy.sparse.spmatrix) – The sparse matrix
row (int) – The 0-based row index
dtype (type) – The base type of elements of matrix. This is used for the corner case where matrix is essentially a sparse column vector.
max_length (typing.Optional[int]) – The maximum number of columns to include in the returned row.

Returns:	row – The specified row (as a 1d numpy array)
Return type:	numpy.ndarray
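
A minimal sketch of extracting a dense row with SciPy alone is given here for orientation. The helper `get_dense_row_sketch` is hypothetical, it ignores the dtype corner case mentioned above, and treating max_length as a simple truncation of the row is an assumption.

```python
import numpy as np
import scipy.sparse


def get_dense_row_sketch(matrix, row, max_length=None):
    """Hypothetical stand-in for get_dense_row, using only SciPy and NumPy."""
    # CSR supports efficient row slicing; the slice is a (1, n_cols) sparse matrix.
    dense = matrix.tocsr()[row].toarray().ravel()
    if max_length is not None:
        # Assumption: max_length keeps only the first max_length columns.
        dense = dense[:max_length]
    return dense


sparse = scipy.sparse.csr_matrix(np.array([[0, 1, 0],
                                           [2, 0, 3]]))

full_row = get_dense_row_sketch(sparse, 1)
short_row = get_dense_row_sketch(sparse, 1, max_length=2)
```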

pyllars.matrix_utils.matrix_multiply(m1: numpy.ndarray, m2: numpy.ndarray, m3: numpy.ndarray) → numpy.ndarray

Multiply the three matrices

This function performs the multiplications in an order such that the size of the intermediate matrix created by the first matrix multiplication is as small as possible.

Parameters:	m1, m2, m3 (numpy.ndarray) – The (2d) matrices, multiplied in that order
Returns:	product_matrix – The product of the three input matrices
Return type:	numpy.ndarray
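
The ordering idea can be illustrated as follows. Matrix multiplication is associative, so (m1 m2) m3 and m1 (m2 m3) give the same product but create intermediates of different sizes. The helper `matrix_multiply_sketch` is hypothetical, and comparing intermediate sizes by element count is an assumption about how "as small as possible" is measured.

```python
import numpy as np


def matrix_multiply_sketch(m1, m2, m3):
    """Hypothetical stand-in for matrix_multiply, choosing the cheaper grouping."""
    # (m1 @ m2) has shape (m1 rows, m2 cols); (m2 @ m3) has shape (m2 rows, m3 cols).
    left_size = m1.shape[0] * m2.shape[1]
    right_size = m2.shape[0] * m3.shape[1]
    if left_size <= right_size:
        return (m1 @ m2) @ m3
    return m1 @ (m2 @ m3)


rng = np.random.default_rng(0)
m1 = rng.random((50, 2))
m2 = rng.random((2, 40))
m3 = rng.random((40, 3))

# Here (m1 @ m2) would hold 50 * 40 entries, while (m2 @ m3) holds only 2 * 3,
# so the sketch multiplies m2 and m3 first.
product = matrix_multiply_sketch(m1, m2, m3)
```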

pyllars.matrix_utils.normalize_columns(matrix: numpy.ndarray) → numpy.ndarray

Normalize the columns of the given (dense) matrix

Parameters:	matrix (numpy.ndarray) – The (2d) matrix
Returns:	normalized_matrix – The matrix normalized such that all column sums are 1
Return type:	numpy.ndarray

pyllars.matrix_utils.normalize_rows(matrix: numpy.ndarray) → numpy.ndarray

Normalize the rows of the given (dense) matrix

Parameters:	matrix (numpy.ndarray) – The (2d) matrix
Returns:	normalized_matrix – The matrix normalized such that all row sums are 1
Return type:	numpy.ndarray
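
Both normalizations amount to dividing by the relevant sums, which can be sketched with NumPy broadcasting. The helper names are hypothetical, and the sketch does not handle rows or columns that sum to zero; how pyllars treats that case is not stated here.

```python
import numpy as np


def normalize_columns_sketch(matrix):
    """Hypothetical stand-in: scale each column so that it sums to 1."""
    # keepdims=True gives shape (1, n_cols), which broadcasts across the rows.
    return matrix / matrix.sum(axis=0, keepdims=True)


def normalize_rows_sketch(matrix):
    """Hypothetical stand-in: scale each row so that it sums to 1."""
    # keepdims=True gives shape (n_rows, 1), which broadcasts across the columns.
    return matrix / matrix.sum(axis=1, keepdims=True)


matrix = np.array([[1.0, 3.0],
                   [3.0, 1.0]])

by_col = normalize_columns_sketch(matrix)
by_row = normalize_rows_sketch(matrix)
```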

pyllars.matrix_utils.permute_matrix(m: numpy.ndarray, is_flat: bool = False, shape: Optional[Tuple[int]] = None) → numpy.ndarray

Randomly permute the entries of the matrix. The matrix is first flattened.

For reproducibility, the random seed of numpy should be set before calling this function.

Parameters:

m (numpy.ndarray) – The matrix (tensor, etc.)
is_flat (bool) – Whether the matrix values have already been flattened. If they have been, then the desired shape must be passed.
shape (typing.Optional[typing.Tuple]) – The shape of the output matrix, if m is already flattened

Returns:	permuted_m – A copy of m with the values randomly permuted. It has the same shape as m, or the given shape if m was passed already flattened.
Return type:	numpy.ndarray
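
The flatten, shuffle, and reshape steps, along with the seeding advice above, can be sketched as follows. The helper `permute_matrix_sketch` is hypothetical, and the use of `numpy.random.permutation` with the legacy global seed is an assumption about the approach, not a statement of the pyllars implementation.

```python
import numpy as np


def permute_matrix_sketch(m, is_flat=False, shape=None):
    """Hypothetical stand-in for permute_matrix."""
    if is_flat:
        if shape is None:
            raise ValueError("shape must be given when m is already flat")
    else:
        shape = m.shape
    # np.random.permutation returns a shuffled copy, so m itself is not modified.
    permuted = np.random.permutation(np.ravel(m))
    return permuted.reshape(shape)


# Seed the global NumPy random state first so the permutation is reproducible.
np.random.seed(8675309)

m = np.arange(6).reshape(2, 3)
permuted_m = permute_matrix_sketch(m)

# The already-flattened case needs the target shape.
permuted_from_flat = permute_matrix_sketch(np.arange(6), is_flat=True, shape=(3, 2))
```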

pyllars.matrix_utils.row_op(m, op): Apply op to each row in the matrix.

pyllars.matrix_utils.row_sum(m): Calculate the sum of each row in the matrix.

pyllars.matrix_utils.row_sum_mean(m: numpy.ndarray, var: bool = False) → float

Calculate the mean of the sum of each row in the matrix.

Optionally, the variance of the row sums can also be calculated.

Parameters:

m (numpy.ndarray) – The (2d) matrix
var (bool) – Whether to also calculate the variance of the row sums

Returns:

mean (float) – The mean of the row sums in the matrix
variance (float) – If var is True, then the variance of the row sums

pyllars.matrix_utils.sparse_matrix_to_dense(sparse_matrix: scipy.sparse.base.spmatrix) → numpy.ndarray

Convert sparse_matrix to a dense numpy array

Parameters:	sparse_matrix (scipy.sparse.spmatrix) – The sparse scipy matrix
Returns:	dense_matrix – The dense (2d) numpy array
Return type:	numpy.ndarray

pyllars.matrix_utils.sparse_matrix_to_list(sparse_matrix: scipy.sparse.base.spmatrix) → List

Convert sparse_matrix to a list of “sparse row vectors”.

In this context, a “sparse row vector” is simply a sparse matrix with dimensionality (1, sparse_matrix.shape[1]).

Parameters:	sparse_matrix (scipy.sparse.spmatrix) – The sparse scipy matrix
Returns:	list_of_sparse_row_vectors – The list of sparse row vectors
Return type:	typing.List[scipy.sparse.spmatrix]
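
To make the notion of a "sparse row vector" concrete, the conversion can be sketched with SciPy row slicing. The helper `sparse_matrix_to_list_sketch` is hypothetical, and converting to CSR before slicing is an assumption made here for efficient row access.

```python
import numpy as np
import scipy.sparse


def sparse_matrix_to_list_sketch(sparse_matrix):
    """Hypothetical stand-in for sparse_matrix_to_list."""
    # Each csr[i] slice is itself a sparse matrix of shape (1, n_cols).
    csr = sparse_matrix.tocsr()
    return [csr[i] for i in range(csr.shape[0])]


sparse_matrix = scipy.sparse.coo_matrix(np.array([[0, 1, 0],
                                                  [2, 0, 3]]))

rows = sparse_matrix_to_list_sketch(sparse_matrix)
row_shapes = [r.shape for r in rows]
```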

pyllars.matrix_utils.write_sparse_matrix(target: str, a: scipy.sparse.base.spmatrix, compress: bool = True, **kwargs) → None

Write a to the file target in matrix market format

This function is a drop-in replacement for scipy.io.mmwrite. The only difference is that it gzip compresses the output by default. It does not alter the file extension, which should likely end in “mtx.gz” except in special circumstances.

If compress is True, then this function imports gzip.

Parameters:

target (str) – The complete path to the output file, including file extension
a (scipy.sparse.spmatrix) – The sparse matrix
compress (bool) – Whether to compress the output
**kwargs (<key>=<value> pairs) – These are passed through to `scipy.io.mmwrite()`. Please see the scipy documentation for more details.

Returns:	None. The matrix is written to disk as a side effect.
Return type:	None
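
The compress-by-default behavior can be approximated by handing `scipy.io.mmwrite` a gzip file object opened in binary mode. This is a sketch under that assumption, not the pyllars source; the helper `write_sparse_matrix_sketch` and the temporary file name are hypothetical.

```python
import gzip
import os
import tempfile

import numpy as np
import scipy.io
import scipy.sparse


def write_sparse_matrix_sketch(target, a, compress=True, **kwargs):
    """Hypothetical stand-in for write_sparse_matrix."""
    if compress:
        # mmwrite accepts an open binary file-like object as well as a path.
        with gzip.open(target, "wb") as target_gz:
            scipy.io.mmwrite(target_gz, a, **kwargs)
    else:
        scipy.io.mmwrite(target, a, **kwargs)


a = scipy.sparse.coo_matrix(np.array([[0, 1],
                                      [2, 0]]))

# The extension is chosen by the caller; the function does not alter it.
target = os.path.join(tempfile.mkdtemp(), "example.mtx.gz")
write_sparse_matrix_sketch(target, a)

# Decompress and read back the first line to confirm the Matrix Market banner.
with gzip.open(target, "rb") as handle:
    header = handle.readline()
```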