Correlation heatmap

Question

I want to represent a correlation matrix using a heatmap. There is something called a correlogram in R, but I don't think there's such a thing in Python. How can I do this? The values go from -1 to 1, for example:

    [[ 1.          0.00279981  0.95173379  0.02486161 -0.00324926 -0.00432099]
     [ 0.00279981  1.          0.17728303  0.64425774  0.30735071  0.37379443]
     [ 0.95173379  0.17728303  1.          0.27072266  0.02549031  0.03324756]
     [ 0.02486161  0.64425774  0.27072266  1.          0.18336236  0.18913512]
     [-0.00324926  0.30735071  0.02549031  0.18336236  1.          0.77678274]
     [-0.00432099  0.37379443  0.03324756  0.18913512  0.77678274  1.        ]]

I was able to produce the following heatmap based on another question, but the problem is that my values get "cut" at 0. I would like a map that goes from blue (-1) to red (1), or something like that; here, values below 0 are not presented in an adequate way.

Here's the code for that:

    plt.imshow(correlation_matrix, cmap='hot', interpolation='nearest')

User · Answer

Use the 'jet' colormap for a transition between blue and red. Use pcolor() with the vmin and vmax parameters.

It is detailed in this answer: https://stackoverflow.com/a/3376734/21974
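As a rough illustration of this suggestion, here is a minimal sketch that applies pcolor() to the matrix from the question. The variable names and the output filename are illustrative only. Passing vmin=-1 and vmax=1 pins the color scale to the full correlation range, so negative values are mapped to their own colors instead of being clipped.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Correlation matrix taken from the question
corr = np.array([
    [ 1.0,         0.00279981, 0.95173379, 0.02486161, -0.00324926, -0.00432099],
    [ 0.00279981,  1.0,        0.17728303, 0.64425774,  0.30735071,  0.37379443],
    [ 0.95173379,  0.17728303, 1.0,        0.27072266,  0.02549031,  0.03324756],
    [ 0.02486161,  0.64425774, 0.27072266, 1.0,         0.18336236,  0.18913512],
    [-0.00324926,  0.30735071, 0.02549031, 0.18336236,  1.0,         0.77678274],
    [-0.00432099,  0.37379443, 0.03324756, 0.18913512,  0.77678274,  1.0],
])

fig, ax = plt.subplots()
# vmin/vmax fix the color scale to [-1, 1] regardless of the data's actual range
mesh = ax.pcolor(corr, cmap="jet", vmin=-1, vmax=1)
ax.invert_yaxis()          # pcolor draws row 0 at the bottom; flip so it is on top
fig.colorbar(mesh, ax=ax)  # legend for the color scale
fig.savefig("correlation_pcolor.png")  # illustrative filename
```

Any other matplotlib colormap name can be substituted for "jet"; diverging maps such as "coolwarm" are often easier to read for values centered on zero.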

User · Answer

The code below will produce this plot:

    import pandas as pd
    import seaborn as sns
    import matplotlib.pyplot as plt
    import numpy as np

    # A list with your data, slightly edited
    l = [1.0, 0.00279981, 0.95173379, 0.02486161, -0.00324926, -0.00432099,
         0.00279981, 1.0, 0.17728303, 0.64425774, 0.30735071, 0.37379443,
         0.95173379, 0.17728303, 1.0, 0.27072266, 0.02549031, 0.03324756,
         0.02486161, 0.64425774, 0.27072266, 1.0, 0.18336236, 0.18913512,
         -0.00324926, 0.30735071, 0.02549031, 0.18336236, 1.0, 0.77678274,
         -0.00432099, 0.37379443, 0.03324756, 0.18913512, 0.77678274, 1.00]

    # Split the list into rows of n values
    n = 6
    data = [l[i:i + n] for i in range(0, len(l), n)]

    # A dataframe
    df = pd.DataFrame(data)

    def CorrMtx(df, dropDuplicates=True):

        # Your dataset is already a correlation matrix.
        # If you have a dataset where you need to include the calculation
        # of a correlation matrix, just uncomment the line below:
        # df = df.corr()

        # Exclude duplicate correlations by masking upper right values
        if dropDuplicates:
            mask = np.zeros_like(df, dtype=bool)  # np.bool no longer exists in recent NumPy
            mask[np.triu_indices_from(mask)] = True

        # Set background color / chart style
        sns.set_style(style='white')

        # Set up matplotlib figure
        f, ax = plt.subplots(figsize=(11, 9))

        # Add diverging colormap (blue for low values, red for high values)
        cmap = sns.diverging_palette(250, 10, as_cmap=True)

        # Draw correlation plot with or without duplicates
        if dropDuplicates:
            sns.heatmap(df, mask=mask, cmap=cmap,
                        square=True,
                        linewidths=.5, cbar_kws={"shrink": .5}, ax=ax)
        else:
            sns.heatmap(df, cmap=cmap,
                        square=True,
                        linewidths=.5, cbar_kws={"shrink": .5}, ax=ax)

    CorrMtx(df, dropDuplicates=False)

I put this together after it was announced that the outstanding seaborn corrplot was to be deprecated. The snippet above produces a similar correlation plot based on seaborn heatmap. You can also specify the color range and select whether or not to drop duplicate correlations. Notice that I've used the same numbers as you, but that I've put them in a pandas dataframe.

Regarding the choice of colors, you can have a look at the documentation for sns.diverging_palette. You asked for blue, but that falls outside the part of the color scale covered by your sample data. For both observations of 0.95173379, try changing to -0.95173379 and you'll get this:
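The color range mentioned in this answer can be pinned with the vmin and vmax arguments of sns.heatmap. The following is a minimal sketch under a few assumptions: the built-in matplotlib colormap "coolwarm" stands in for the custom palette, the mask hides only the strictly upper triangle (so the diagonal stays visible, unlike the function above), and the filename is illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Correlation matrix taken from the question
corr = pd.DataFrame(np.array([
    [ 1.0,         0.00279981, 0.95173379, 0.02486161, -0.00324926, -0.00432099],
    [ 0.00279981,  1.0,        0.17728303, 0.64425774,  0.30735071,  0.37379443],
    [ 0.95173379,  0.17728303, 1.0,        0.27072266,  0.02549031,  0.03324756],
    [ 0.02486161,  0.64425774, 0.27072266, 1.0,         0.18336236,  0.18913512],
    [-0.00324926,  0.30735071, 0.02549031, 0.18336236,  1.0,         0.77678274],
    [-0.00432099,  0.37379443, 0.03324756, 0.18913512,  0.77678274,  1.0],
]))

# True marks cells to hide; k=1 leaves the diagonal unmasked
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)

fig, ax = plt.subplots(figsize=(8, 6))
sns.heatmap(
    corr,
    mask=mask,
    cmap="coolwarm",   # blue for negative, red for positive values
    vmin=-1, vmax=1,   # fix the scale to the full correlation range
    square=True,
    linewidths=0.5,
    cbar_kws={"shrink": 0.5},
    ax=ax,
)
fig.savefig("correlation_seaborn.png")  # illustrative filename
```

With the scale fixed to [-1, 1], the small negative values in the sample data land just on the blue side of the midpoint, and a value of 0 always maps to the neutral center color.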

User · Answer

If your data is in a Pandas DataFrame, you can use Seaborn's heatmap function to create your desired plot.

    import seaborn as sns

    Var_Corr = df.corr()
    # plot the heatmap and annotation on it
    sns.heatmap(Var_Corr, xticklabels=Var_Corr.columns, yticklabels=Var_Corr.columns, annot=True)

(Image: correlation plot)

From the question, it looks like the data is in a NumPy array. If that array has the name numpy_data, before you can use the step above, you would want to put it into a Pandas DataFrame using the following:

    import pandas as pd
    df = pd.DataFrame(numpy_data)
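One point worth spelling out: df.corr() expects raw observations (rows are samples, columns are variables). If the array already holds a correlation matrix, as in the question, it can be wrapped in a DataFrame and passed to the heatmap directly, without calling corr() again. The sketch below illustrates the raw-data case; numpy_data here is randomly generated stand-in data and the column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data: 100 observations of 3 variables
rng = np.random.default_rng(0)
numpy_data = rng.normal(size=(100, 3))

# Wrap the array in a DataFrame; the column names are made up for illustration
df = pd.DataFrame(numpy_data, columns=["a", "b", "c"])

# Pearson correlation between columns; this is the matrix to hand to sns.heatmap
var_corr = df.corr()

# Sanity check against NumPy's own estimate (rowvar=False: columns are variables)
same_as_numpy = np.allclose(var_corr.to_numpy(), np.corrcoef(numpy_data, rowvar=False))
print(var_corr.round(2))
print("matches np.corrcoef:", same_as_numpy)
```

Naming the columns when building the DataFrame also gives the heatmap readable tick labels.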

User · Answer

Another alternative is to use the heatmap function in seaborn to plot the correlation matrix. This example uses the Auto data set from the ISLR package in R (the same as in the example you showed).

    # Note: pandas.rpy was removed from later pandas releases,
    # so this import only works with an old pandas version.
    import pandas.rpy.common as com
    import seaborn as sns
    %matplotlib inline

    # load the R package ISLR
    infert = com.importr("ISLR")

    # load the Auto dataset
    auto_df = com.load_data('Auto')

    # calculate the correlation matrix
    corr = auto_df.corr()

    # plot the heatmap
    sns.heatmap(corr,
                xticklabels=corr.columns,
                yticklabels=corr.columns)

If you wanted to be even more fancy, you can use Pandas Style, for example:

    cmap = sns.diverging_palette(5, 250, as_cmap=True)

    def magnify():
        return [dict(selector="th",
                     props=[("font-size", "7pt")]),
                dict(selector="td",
                     props=[('padding', "0em 0em")]),
                dict(selector="th:hover",
                     props=[("font-size", "12pt")]),
                dict(selector="tr:hover td:hover",
                     props=[('max-width', '200px'),
                            ('font-size', '12pt')])]

    # .format(precision=2) replaces .set_precision(2), which was removed in pandas 2.0
    corr.style.background_gradient(cmap, axis=1)\
        .set_properties(**{'max-width': '80px', 'font-size': '10pt'})\
        .set_caption("Hover to magnify")\
        .format(precision=2)\
        .set_table_styles(magnify())
