Let's start with this small df:
import pandas as pd
import time as tm
import numpy as np
n=10
a=np.arange(0,n**2)
df=pd.DataFrame(a.reshape(n,n))
We'll so have
df
Out[25]:
0 1 2 3 4 5 6 7 8 9
0 0 1 2 3 4 5 6 7 8 9
1 10 11 12 13 14 15 16 17 18 19
2 20 21 22 23 24 25 26 27 28 29
3 30 31 32 33 34 35 36 37 38 39
4 40 41 42 43 44 45 46 47 48 49
5 50 51 52 53 54 55 56 57 58 59
6 60 61 62 63 64 65 66 67 68 69
7 70 71 72 73 74 75 76 77 78 79
8 80 81 82 83 84 85 86 87 88 89
9 90 91 92 93 94 95 96 97 98 99
With this we have:
df.iloc[3,3]
Out[33]: 33
df.iat[3,3]
Out[34]: 33
df.iloc[:3,:3]
Out[35]:
0 1 2 3
0 0 1 2 3
1 10 11 12 13
2 20 21 22 23
3 30 31 32 33
df.iat[:3,:3]
Traceback (most recent call last):
... omissis ...
ValueError: At based indexing on an integer index can only have integer indexers
Thus we cannot use .iat for subset, where we must use .iloc only.
But let's try both to select from a larger df and let's check the speed ...
# -*- coding: utf-8 -*-
"""
Created on Wed Feb 7 09:58:39 2018
@author: Fabio Pomi
"""
import pandas as pd
import time as tm
import numpy as np
n=1000
a=np.arange(0,n**2)
df=pd.DataFrame(a.reshape(n,n))
t1=tm.time()
for j in df.index:
for i in df.columns:
a=df.iloc[j,i]
t2=tm.time()
for j in df.index:
for i in df.columns:
a=df.iat[j,i]
t3=tm.time()
loc=t2-t1
at=t3-t2
prc = loc/at *100
print('\nloc:%f at:%f prc:%f' %(loc,at,prc))
loc:10.485600 at:7.395423 prc:141.784987
So with .loc we can manage subsets and with .at only a single scalar, but .at is faster than .loc
:-)