Scikit-learn: How to obtain True Positive, True Negative, False Positive and False Negative

Question

My problem:

I have a dataset which is a large JSON file. I read it and store it in the trainList variable.

Next, I pre-process it in order to be able to work with it.

Once I have done that, I start the classification. I use the kfold cross validation method in order to obtain the mean accuracy and train a classifier. I make the predictions and obtain the accuracy and confusion matrix of that fold. After this, I would like to obtain the True Positive (TP), True Negative (TN), False Positive (FP) and False Negative (FN) values. I'll use these parameters to obtain the Sensitivity and Specificity.

Finally, I would use this to put in HTML in order to show a chart with the TPs of each label.

Code:

The variables I have for the moment:

    trainList  # It is a list with all the data of my dataset in JSON form
    labelList  # It is a list with all the labels of my data

Most part of the method:

    # I transform the data from JSON form to a numerical one
    X = vec.fit_transform(trainList)

    # I scale the matrix (don't know why but without it, it makes an error)
    X = preprocessing.scale(X.toarray())

    # I generate a KFold in order to make cross validation
    kf = KFold(len(X), n_folds=10, indices=True, shuffle=True, random_state=1)

    # I start the cross validation
    for train_indices, test_indices in kf:
        X_train = [X[ii] for ii in train_indices]
        X_test = [X[ii] for ii in test_indices]
        y_train = [listaLabels[ii] for ii in train_indices]
        y_test = [listaLabels[ii] for ii in test_indices]

        # I train the classifier
        trained = qda.fit(X_train, y_train)

        # I make the predictions
        predicted = qda.predict(X_test)

        # I obtain the accuracy of this fold
        ac = accuracy_score(predicted, y_test)

        # I obtain the confusion matrix
        cm = confusion_matrix(y_test, predicted)

        # I should calculate the TP, TN, FP and FN
        # I don't know how to continue

User · Answer

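    # False negatives
    # Assumes Variables_test, Banknote_test and banknote_test_pred already exist.
    import numpy as np
    import pandas as pd

    test = pd.merge(Variables_test, Banknote_test, left_index=True, right_index=True)
    Banknote_test_pred = pd.DataFrame(banknote_test_pred)
    Banknote_test_pred.rename(columns={0: 'Predicted'}, inplace=True)
    test = test.reset_index(drop=True).merge(
        Banknote_test_pred.reset_index(drop=True), left_index=True, right_index=True)
    # Flag rows that are actually "Genuine" but were predicted "Forged"
    test['FN'] = np.where((test['Banknote'] == "Genuine") & (test['Predicted'] == "Forged"), 1, 0)
    test[test.FN != 0]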

User · Answer

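    # False positive cases
    # Assumes X_train, y_train and y_train_pred already exist.
    import numpy as np
    import pandas as pd

    train = pd.merge(X_train, y_train, left_index=True, right_index=True)
    y_train_pred = pd.DataFrame(y_train_pred)
    y_train_pred.rename(columns={0: 'Predicted'}, inplace=True)
    train = train.reset_index(drop=True).merge(
        y_train_pred.reset_index(drop=True), left_index=True, right_index=True)
    # Flag rows that are actually "Forged" but were predicted "Genuine"
    train['FP'] = np.where((train['Banknote'] == "Forged") & (train['Predicted'] == "Genuine"), 1, 0)
    train[train.FP != 0]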

User · Answer

I have tried some of the answers and found them not working. This works for me:

    from sklearn.metrics import classification_report

    print(classification_report(y_test, predicted))

User · Answer

In scikit-learn version 0.22, you can do it like this:

    from sklearn.metrics import multilabel_confusion_matrix

    y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
    y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]

    mcm = multilabel_confusion_matrix(y_true, y_pred, labels=["ant", "bird", "cat"])

    tn = mcm[:, 0, 0]
    tp = mcm[:, 1, 1]
    fn = mcm[:, 1, 0]
    fp = mcm[:, 0, 1]

User · Answer

I think both of the answers are not fully correct. For example, suppose that we have the following arrays:

    y_actual = [1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0]
    y_predic = [1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

If we compute the FP, FN, TP and TN values manually, they should be as follows:

    FP: 3
    FN: 1
    TP: 3
    TN: 4

However, if we use the first answer, the results are given as follows:

    FP: 1
    FN: 3
    TP: 3
    TN: 4

They are not correct, because in the first answer a False Positive should be where the actual value is 0 but the predicted value is 1, not the opposite. The same applies to False Negative.

And if we use the second answer, the results are computed as follows:

    FP: 3
    FN: 1
    TP: 4
    TN: 3

The True Positive and True Negative numbers are not correct; they should be swapped.

Am I correct with my computations? Please let me know if I am missing something.
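One way to check the manual counts without relying on either of the disputed implementations is to tally the (actual, predicted) pairs directly. The following is a minimal sketch using only the standard library; it assumes 1 is the positive class, and the name `pairs` is purely illustrative.

```python
from collections import Counter

y_actual = [1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0]
y_predic = [1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

# Count how often each (actual, predicted) pair occurs.
pairs = Counter(zip(y_actual, y_predic))

TP = pairs[(1, 1)]  # actual 1, predicted 1
TN = pairs[(0, 0)]  # actual 0, predicted 0
FP = pairs[(0, 1)]  # actual 0, predicted 1
FN = pairs[(1, 0)]  # actual 1, predicted 0

print(TP, TN, FP, FN)  # 3 4 3 1
```

The tally agrees with the manual figures: FP 3, FN 1, TP 3, TN 4.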

User · Answer

If your classifier has multiple classes, you might want to use pandas-ml for that part. The Confusion Matrix of pandas-ml gives more detailed information, so check it out.

User · Answer

Here's a fix to invoketheshell's buggy code (which currently appears as the accepted answer):

    def performance_measure(y_actual, y_hat):
        TP = 0
        FP = 0
        TN = 0
        FN = 0

        for i in range(len(y_hat)):
            if y_actual[i] == y_hat[i] == 1:
                TP += 1
            if y_hat[i] == 1 and y_actual[i] == 0:
                FP += 1
            if y_hat[i] == y_actual[i] == 0:
                TN += 1
            if y_hat[i] == 0 and y_actual[i] == 1:
                FN += 1

        return (TP, FP, TN, FN)

User · Answer

The one-liner to get true positives etc. out of the confusion matrix is to ravel it:

    from sklearn.metrics import confusion_matrix

    y_true = [1, 1, 0, 0]
    y_pred = [1, 0, 1, 0]

    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    print(tn, fp, fn, tp)  # 1 1 1 1
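Since the question ultimately asks for sensitivity and specificity, the ravel approach can be extended by two divisions. This is only a sketch under the assumption that the labels are 0 (negative) and 1 (positive); passing `labels=[0, 1]` is an optional precaution added here, not part of the original answer.

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 0, 0]
y_pred = [1, 0, 1, 0]

# labels=[0, 1] fixes the row/column order and keeps the matrix 2x2
# even if one class happens to be absent from the data.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()

sensitivity = tp / (tp + fn)  # true positive rate
specificity = tn / (tn + fp)  # true negative rate

print(sensitivity, specificity)  # 0.5 0.5
```

If a class has no true samples at all, the corresponding denominator is zero, so real code would need to guard against that case.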

User · Answer

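Just in case someone is looking for the same in the MULTI-CLASS case, here is an example:

    def perf_measure(y_actual, y_pred):
        class_id = set(y_actual).union(set(y_pred))
        TP = []
        FP = []
        TN = []
        FN = []

        # Treat each class in turn as "positive" and all others as "negative".
        for index, _id in enumerate(class_id):
            TP.append(0)
            FP.append(0)
            TN.append(0)
            FN.append(0)
            for i in range(len(y_pred)):
                if y_actual[i] == y_pred[i] == _id:
                    TP[index] += 1
                if y_pred[i] == _id and y_actual[i] != _id:
                    FP[index] += 1
                # TN: neither the actual nor the predicted label is this class
                if y_actual[i] != _id and y_pred[i] != _id:
                    TN[index] += 1
                # FN: the sample belongs to this class but was predicted otherwise
                if y_actual[i] == _id and y_pred[i] != _id:
                    FN[index] += 1

        return class_id, TP, FP, TN, FN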

User · Answer

You can try sklearn.metrics.classification_report as below:

    import sklearn.metrics

    y_true = [1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

    print(sklearn.metrics.classification_report(y_true, y_pred))

Output:

                 precision    recall  f1-score   support

              0       0.80      0.57      0.67         7
              1       0.50      0.75      0.60         4

    avg / total       0.69      0.64      0.64        11

User · Answer

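You can obtain all of the parameters from the confusion matrix. The structure of the confusion matrix (which is a 2X2 matrix) is as follows, assuming the first index is related to the positive label and the rows are related to the true labels:

    TP | FN
    FP | TN

So:

    TP = cm[0][0]
    FN = cm[0][1]
    FP = cm[1][0]
    TN = cm[1][1]

Note that scikit-learn's confusion_matrix orders labels in sorted order by default, so with 0/1 labels the negative class comes first unless labels=[1, 0] is passed.

More details at https://en.wikipedia.org/wiki/Confusion_matrix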
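To make the "positive label first" layout concrete with scikit-learn, the label order can be requested explicitly. This is a small sketch, assuming 1 is the positive class; the arrays are the ones used elsewhere in this thread.

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

# Listing the positive label first puts TP in the top-left corner.
cm = confusion_matrix(y_true, y_pred, labels=[1, 0])

TP, FN = cm[0]  # row 0: samples whose true label is 1
FP, TN = cm[1]  # row 1: samples whose true label is 0

print(cm.tolist())  # [[3, 1], [3, 4]]
```

Without the `labels` argument the same data would produce the matrix with TN in the top-left corner instead.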

User · Answer

If you have two lists that have the predicted and actual values, as it appears you do, you can pass them to a function that will calculate TP, FP, TN, FN with something like this:

    def perf_measure(y_actual, y_hat):
        TP = 0
        FP = 0
        TN = 0
        FN = 0

        for i in range(len(y_hat)):
            if y_actual[i] == y_hat[i] == 1:
                TP += 1
            if y_hat[i] == 1 and y_actual[i] != y_hat[i]:
                FP += 1
            if y_actual[i] == y_hat[i] == 0:
                TN += 1
            if y_hat[i] == 0 and y_actual[i] != y_hat[i]:
                FN += 1

        return (TP, FP, TN, FN)

From here I think you will be able to calculate the rates of interest to you, and other performance measures like specificity and sensitivity.

User · Answer

According to the scikit-learn documentation:

http://scikit-learn.org/stable/modules/generated/sklearn.metrics.confusion_matrix.html#sklearn.metrics.confusion_matrix

By definition a confusion matrix C is such that C[i, j] is equal to the number of observations known to be in group i but predicted to be in group j.

Thus in binary classification, the count of true negatives is C[0, 0], false negatives is C[1, 0], true positives is C[1, 1] and false positives is C[0, 1].

    CM = confusion_matrix(y_true, y_pred)

    TN = CM[0][0]
    FN = CM[1][0]
    TP = CM[1][1]
    FP = CM[0][1]
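Applied to the cross-validation loop from the question, the C[i, j] layout means the four counts can be accumulated fold by fold. The sketch below is an illustration under several assumptions: it uses the current `sklearn.model_selection.KFold` API rather than the older call in the question, a synthetic dataset from `make_classification` stands in for the asker's data, and the names `totals`, `train_idx` and `test_idx` are made up for this example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import KFold

# Synthetic stand-in for the asker's data (an assumption for illustration).
X, y = make_classification(n_samples=200, n_features=5, n_informative=3,
                           n_redundant=0, random_state=1)

kf = KFold(n_splits=10, shuffle=True, random_state=1)
totals = np.zeros(4, dtype=np.int64)  # running TN, FP, FN, TP

for train_idx, test_idx in kf.split(X):
    qda = QuadraticDiscriminantAnalysis()
    qda.fit(X[train_idx], y[train_idx])
    predicted = qda.predict(X[test_idx])

    # C[i, j]: true class i, predicted class j, so ravel() yields TN, FP, FN, TP.
    cm = confusion_matrix(y[test_idx], predicted, labels=[0, 1])
    totals += cm.ravel()

tn, fp, fn, tp = totals
print("sensitivity:", tp / (tp + fn))
print("specificity:", tn / (tn + fp))
```

Summing the counts over folds and dividing once is one design choice; averaging per-fold rates is another and can give slightly different numbers.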

User · Answer

In the scikit-learn 'metrics' library there is a confusion_matrix method which gives you the desired output.

You can use any classifier that you want. Here I used KNeighbors as an example.

    from sklearn import metrics, neighbors

    clf = neighbors.KNeighborsClassifier()
    # (fit clf on your training data before predicting)

    X_test = ...
    y_test = ...

    expected = y_test
    predicted = clf.predict(X_test)

    conf_matrix = metrics.confusion_matrix(expected, predicted)

    print(conf_matrix)
    # [[1403   87]
    #  [  56 3159]]

The docs: http://scikit-learn.org/stable/modules/generated/sklearn.metrics.confusion_matrix.html#sklearn.metrics.confusion_matrix

User · Answer

I wrote a version that works using only numpy. I hope it helps you.

    import numpy as np

    def perf_metrics_2X2(yobs, yhat):
        """
        Returns the specificity, sensitivity, positive predictive value, and
        negative predictive value of a 2X2 table,

        where:
        0 = negative case
        1 = positive case

        Parameters
        ----------
        yobs : array of positive and negative ``observed`` cases
        yhat : array of positive and negative ``predicted`` cases

        Returns
        -------
        sensitivity  = TP / (TP+FN)
        specificity  = TN / (TN+FP)
        pos_pred_val = TP / (TP+FP)
        neg_pred_val = TN / (TN+FN)

        Author: Julio Cardenas-Rodriguez
        """
        yobs = np.asarray(yobs)
        yhat = np.asarray(yhat)

        # Boolean masks over the full arrays keep both operands the same length.
        TP = np.sum((yobs == 1) & (yhat == 1))
        TN = np.sum((yobs == 0) & (yhat == 0))
        FP = np.sum((yobs == 0) & (yhat == 1))
        FN = np.sum((yobs == 1) & (yhat == 0))

        sensitivity = TP / (TP + FN)
        specificity = TN / (TN + FP)
        pos_pred_val = TP / (TP + FP)
        neg_pred_val = TN / (TN + FN)

        return sensitivity, specificity, pos_pred_val, neg_pred_val

User · Answer

For the multi-class case, everything you need can be found from the confusion matrix. For example, if your confusion matrix looks like this:

[image: example confusion matrix]

Then what you're looking for, per class, can be found like this:

[image: per-class TP, FP, FN and TN regions of that matrix]

Using pandas/numpy, you can do this for all classes at once like so:

    # confusion_matrix is assumed to be a pandas DataFrame here (hence .values)
    FP = confusion_matrix.sum(axis=0) - np.diag(confusion_matrix)
    FN = confusion_matrix.sum(axis=1) - np.diag(confusion_matrix)
    TP = np.diag(confusion_matrix)
    TN = confusion_matrix.values.sum() - (FP + FN + TP)

    # Sensitivity, hit rate, recall, or true positive rate
    TPR = TP / (TP + FN)
    # Specificity or true negative rate
    TNR = TN / (TN + FP)
    # Precision or positive predictive value
    PPV = TP / (TP + FP)
    # Negative predictive value
    NPV = TN / (TN + FN)
    # Fall out or false positive rate
    FPR = FP / (FP + TN)
    # False negative rate
    FNR = FN / (TP + FN)
    # False discovery rate
    FDR = FP / (TP + FP)

    # Overall accuracy
    ACC = (TP + TN) / (TP + FP + FN + TN)
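As a worked instance of the per-class formulas, here is a minimal numpy-only sketch. The 3x3 matrix is an assumption chosen for illustration: it is the confusion matrix of the ant/bird/cat example from another answer in this thread, written as a plain array (so no `.values` is needed).

```python
import numpy as np

# Rows are true classes, columns are predicted classes (ant, bird, cat).
cm = np.array([[2, 0, 0],
               [0, 0, 1],
               [1, 0, 2]])

TP = np.diag(cm)
FP = cm.sum(axis=0) - TP        # column total minus the diagonal
FN = cm.sum(axis=1) - TP        # row total minus the diagonal
TN = cm.sum() - (FP + FN + TP)  # everything not involving the class

print(TP, FP, FN, TN)  # [2 0 2] [1 0 1] [0 1 1] [3 5 2]
```

In this example "bird" is never predicted, so TP + FP is zero for that class and its precision is undefined; rates computed from these arrays may need a guard against division by zero.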
