get dataframe row count based on conditions

Question

I want to get the count of dataframe rows based on a conditional selection. I tried the following code:

    print df[(df.IP == head.idxmax()) & (df.Method == 'HEAD') & (df.Referrer == '"-"')].count()

Output:

    IP          57
    Time        57
    Method      57
    Resource    57
    Status      57
    Bytes       57
    Referrer    57
    Agent       57
    dtype: int64

The output shows the count for each and every column in the dataframe. Instead, I need a single count of the rows where all of the above conditions are satisfied. How can I do this? If you need more explanation about my dataframe, please let me know.

User · Accepted Answer

You are asking for the rows where all the conditions are true, so the len of the filtered frame is the answer, unless I misunderstand what you are asking. (The session below assumes DataFrame was imported from pandas and randn from numpy.random.)

    In [17]: df = DataFrame(randn(20,4), columns=list('ABCD'))

    In [18]: df[(df['A']>0) & (df['B']>0) & (df['C']>0)]
    Out[18]:
               A         B         C         D
    12  0.491683  0.137766  0.859753 -1.041487
    13  0.376200  0.575667  1.534179  1.247358
    14  0.428739  1.539973  1.057848 -1.254489

    In [19]: df[(df['A']>0) & (df['B']>0) & (df['C']>0)].count()
    Out[19]:
    A    3
    B    3
    C    3
    D    3
    dtype: int64

    In [20]: len(df[(df['A']>0) & (df['B']>0) & (df['C']>0)])
    Out[20]: 3
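Applied to a frame shaped like the one in the question, the same idea might look like the sketch below. The column names come from the question, but the rows, the literal IP address, and the variable names are invented for illustration; the question's head.idxmax() is replaced by a fixed string because its definition is not shown.

```python
import pandas as pd

# Hypothetical access-log rows; the values are invented for illustration.
logs = pd.DataFrame({
    "IP": ["10.0.0.1", "10.0.0.1", "10.0.0.2", "10.0.0.1"],
    "Method": ["HEAD", "GET", "HEAD", "HEAD"],
    "Referrer": ['"-"', '"-"', '"-"', '"-"'],
})

# Combine the conditions with & and wrap each one in parentheses.
selected = logs[
    (logs["IP"] == "10.0.0.1")
    & (logs["Method"] == "HEAD")
    & (logs["Referrer"] == '"-"')
]

per_column = selected.count()  # Series: one non-null count per column
row_count = len(selected)      # single integer: number of matching rows
print(row_count)
```

Here count() still reports one value per column, while len gives the single number the question asks for.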

User · Answer

For increased performance, you should not index the dataframe with your predicate. You can just use the outcome of your predicate directly, as illustrated below:

    In [1]: import pandas as pd
            import numpy as np
            df = pd.DataFrame(np.random.randn(20,4), columns=list('ABCD'))

    In [2]: df.head()
    Out[2]:
              A         B         C         D
    0 -2.019868  1.227246 -0.489257  0.149053
    1  0.223285 -0.087784 -0.053048 -0.108584
    2 -0.140556 -0.299735 -1.765956  0.517803
    3 -0.589489  0.400487  0.107856  0.194890
    4  1.309088 -0.596996 -0.623519  0.020400

    In [3]: %time sum((df['A']>0) & (df['B']>0))
    CPU times: user 1.11 ms, sys: 53 µs, total: 1.16 ms
    Wall time: 1.12 ms
    Out[3]: 4

    In [4]: %time len(df[(df['A']>0) & (df['B']>0)])
    CPU times: user 1.38 ms, sys: 78 µs, total: 1.46 ms
    Wall time: 1.42 ms
    Out[4]: 4

Keep in mind that this technique only works for counting the number of rows that comply with your predicate.
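A minimal sketch of the same idea outside IPython, assuming a reasonably recent NumPy and pandas. It uses the Series method mask.sum() in place of the built-in sum; this is a swapped-in variant, not what the answer timed. The seed and the variable names are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

# Seeded random data so the example is repeatable; the seed is arbitrary.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((20, 4)), columns=list("ABCD"))

# The predicate alone is a boolean Series: True where a row matches.
mask = (df["A"] > 0) & (df["B"] > 0)

# True counts as 1, so summing the mask counts the matching rows
# without building the filtered dataframe.
count_via_mask = int(mask.sum())

# Indexing with the mask first gives the same number.
count_via_len = len(df[mask])

print(count_via_mask, count_via_len)
```

Because the mask holds only True and False values, it can tell you how many rows match but not which values they contain. If the matching rows are needed afterwards, indexing the dataframe is still required.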

User · Answer

In Pandas, I like to use the shape attribute to get the number of rows. df[df.A > 0].shape[0] gives the number of rows matching the condition A > 0, as desired.
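A small sketch of the shape approach, using a hand-written frame so the result is easy to check by eye. The data and variable names are made up for illustration.

```python
import pandas as pd

# Tiny hand-written frame; two of the four rows have A > 0.
df = pd.DataFrame({"A": [1.5, -0.2, 0.7, -3.0], "B": [0.1, 0.4, -0.9, 2.0]})

matching = df[df.A > 0]

# shape is a (rows, columns) tuple, so index 0 is the row count.
n_rows = matching.shape[0]
print(matching.shape, n_rows)
```

For a filtered dataframe, shape[0] and len return the same number, so the choice between them is mostly a matter of taste.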
