[python] Why does corrcoef return a matrix?

It seems strange to me that np.corrcoef returns a matrix.

 correlation1 = corrcoef(Strategy1Returns,Strategy2Returns)

[[ 1.         -0.99598935]
 [-0.99598935  1.        ]]

Does anyone know why this is the case and whether it is possible to return just one value in the classical sense?

This question is related to python math numpy

The answer is


Consider using matplotlib.cbook pieces

for example:

import matplotlib.cbook as cbook
segments = cbook.pieces(np.arange(20), 3)
for s in segments:
     print s

It allows you to compute correlation coefficients of >2 data sets, e.g.

>>> from numpy import *
>>> a = array([1,2,3,4,6,7,8,9])
>>> b = array([2,4,6,8,10,12,13,15])
>>> c = array([-1,-2,-2,-3,-4,-6,-7,-8])
>>> corrcoef([a,b,c])
array([[ 1.        ,  0.99535001, -0.9805214 ],
       [ 0.99535001,  1.        , -0.97172394],
       [-0.9805214 , -0.97172394,  1.        ]])

Here we can get the correlation coefficient of a,b (0.995), a,c (-0.981) and b,c (-0.972) at once. The two-data-set case is just a special case of N-data-set class. And probably it's better to keep the same return type. Since the "one value" can be obtained simply with

>>> corrcoef(a,b)[1,0]
0.99535001355530017

there's no big reason to create the special case.


The correlation matrix is the standard way to express correlations between an arbitrary finite number of variables. The correlation matrix of N data vectors is a symmetric N × N matrix with unity diagonal. Only in the case N = 2 does this matrix have one free parameter.


The function Correlate of numpy works with 2 1D arrays that you want to correlate and returns one correlation value.


corrcoef returns the normalised covariance matrix.

The covariance matrix is the matrix

Cov( X, X )    Cov( X, Y )

Cov( Y, X )    Cov( Y, Y )

Normalised, this will yield the matrix:

Corr( X, X )    Corr( X, Y )

Corr( Y, X )    Corr( Y, Y )

correlation1[0, 0 ] is the correlation between Strategy1Returns and itself, which must be 1. You just want correlation1[ 0, 1 ].


You can use the following function to return only the correlation coefficient:

def pearson_r(x, y):
"""Compute Pearson correlation coefficient between two arrays."""

   # Compute correlation matrix
   corr_mat = np.corrcoef(x, y)

   # Return entry [0,1]
   return corr_mat[0,1]

Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to math

How to do perspective fixing? How to pad a string with leading zeros in Python 3 How can I use "e" (Euler's number) and power operation in python 2.7 numpy max vs amax vs maximum Efficiently getting all divisors of a given number Using atan2 to find angle between two vectors How to calculate percentage when old value is ZERO Finding square root without using sqrt function? Exponentiation in Python - should I prefer ** operator instead of math.pow and math.sqrt? How do I get the total number of unique pairs of a set in the database?

Examples related to numpy

Unable to allocate array with shape and data type How to fix 'Object arrays cannot be loaded when allow_pickle=False' for imdb.load_data() function? Numpy, multiply array with scalar TypeError: only integer scalar arrays can be converted to a scalar index with 1D numpy indices array Could not install packages due to a "Environment error :[error 13]: permission denied : 'usr/local/bin/f2py'" Pytorch tensor to numpy array Numpy Resize/Rescale Image what does numpy ndarray shape do? How to round a numpy array? numpy array TypeError: only integer scalar arrays can be converted to a scalar index