Given a DataFrame:
np.random.seed(0)
df = pd.DataFrame(np.random.randn(3, 3), columns=list('ABC'), index=[1, 2, 3])
df
          A         B         C
1  1.764052  0.400157  0.978738
2  2.240893  1.867558 -0.977278
3  0.950088 -0.151357 -0.103219
What is the simplest way to add a new column containing a constant value, e.g. 0?
          A         B         C  new
1  1.764052  0.400157  0.978738    0
2  2.240893  1.867558 -0.977278    0
3  0.950088 -0.151357 -0.103219    0
This is my solution, but I don't know why it puts NaN into the 'new' column:
df['new'] = pd.Series([0 for x in range(len(df.index))])
          A         B         C  new
1  1.764052  0.400157  0.978738  0.0
2  2.240893  1.867558 -0.977278  0.0
3  0.950088 -0.151357 -0.103219  NaN
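The NaN most likely comes from index alignment. A pd.Series built from a plain list gets the default index 0, 1, 2, while df is indexed 1, 2, 3. Column assignment matches on index labels, so label 3 has no partner in the Series and is filled with NaN, which also turns the column into floats. A minimal sketch that reproduces this and shows two ways around it (the column names 'aligned' and 'scalar' are only illustrative):

```python
import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame(np.random.randn(3, 3), columns=list('ABC'), index=[1, 2, 3])

# The Series gets the default index 0, 1, 2 ...
s = pd.Series([0 for x in range(len(df.index))])

# ... but df is indexed 1, 2, 3. Assignment aligns on labels, so row 3
# finds no match and becomes NaN (which forces the column to float).
df['new'] = s

# Option 1: give the Series the same index as the DataFrame.
df['aligned'] = pd.Series(0, index=df.index)

# Option 2: skip the Series and assign the scalar directly.
df['scalar'] = 0

print(df)
```

Passing the raw list (without wrapping it in a Series) would also avoid alignment, since a list carries no index.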
Here is another one-liner, using a lambda (it creates a column with the constant value 10):
df['newCol'] = df.apply(lambda x: 10, axis=1)
Before:
df
          A         B         C
1  1.764052  0.400157  0.978738
2  2.240893  1.867558 -0.977278
3  0.950088 -0.151357 -0.103219
After:
df
          A         B         C  newCol
1  1.764052  0.400157  0.978738      10
2  2.240893  1.867558 -0.977278      10
3  0.950088 -0.151357 -0.103219      10
df['new'] = 0
For in-place modification, assign the value directly. pandas broadcasts the scalar to every row.
df = pd.DataFrame('x', index=range(4), columns=list('ABC'))
df
   A  B  C
0  x  x  x
1  x  x  x
2  x  x  x
3  x  x  x
df['new'] = 'y'
# Same as,
# df.loc[:, 'new'] = 'y'
df
   A  B  C new
0  x  x  x   y
1  x  x  x   y
2  x  x  x   y
3  x  x  x   y
If you want to add a column of empty lists, here is my advice: object columns are bad news in terms of performance, so rethink how your data is structured. If you must store a column of lists, make sure you do not copy the same reference multiple times.
# Wrong
df['new'] = [[]] * len(df)
# Right
df['new'] = [[] for _ in range(len(df))]
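To see why the first form is a problem, here is a small sketch (the column names 'shared' and 'separate' are made up for the demonstration). [[]] * n repeats a reference to one single list, so a change made through any row shows up in all of them:

```python
import pandas as pd

df = pd.DataFrame('x', index=range(4), columns=list('ABC'))

# [[]] * n repeats a reference to ONE list object n times.
df['shared'] = [[]] * len(df)
# The comprehension builds a new, independent list for every row.
df['separate'] = [[] for _ in range(len(df))]

# Append to the list stored in row 0 of each column.
df.at[0, 'shared'].append(1)
df.at[0, 'separate'].append(1)

print(df['shared'].tolist())    # every row changed: [[1], [1], [1], [1]]
print(df['separate'].tolist())  # only row 0 changed: [[1], [], [], []]
```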
df.assign(new=0)
If you need a copy instead, use DataFrame.assign:
df.assign(new='y')
   A  B  C new
0  x  x  x   y
1  x  x  x   y
2  x  x  x   y
3  x  x  x   y
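A quick way to confirm that assign leaves the original untouched is to compare the columns afterwards. This is a minimal sketch; the name out is only illustrative:

```python
import pandas as pd

df = pd.DataFrame('x', index=range(4), columns=list('ABC'))

# assign returns a new DataFrame that includes the extra column ...
out = df.assign(new='y')

# ... while the original keeps its three columns.
print(list(df.columns))   # ['A', 'B', 'C']
print(list(out.columns))  # ['A', 'B', 'C', 'new']
```

To keep the result, bind it to a name (or reassign it to df), since nothing is modified in place.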
And if you need to assign multiple such columns with the same value, it is as simple as:
c = ['new1', 'new2', ...]
df.assign(**dict.fromkeys(c, 'y'))
   A  B  C new1 new2
0  x  x  x    y    y
1  x  x  x    y    y
2  x  x  x    y    y
3  x  x  x    y    y
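Note that the ... in the list above stands for any further column names. Typed literally, it would be Python's Ellipsis object, which cannot be passed as a keyword argument. A minimal runnable sketch, assuming only the two names that appear in the output:

```python
import pandas as pd

df = pd.DataFrame('x', index=range(4), columns=list('ABC'))

# Column names are illustrative; list every name you need explicitly.
c = ['new1', 'new2']

# dict.fromkeys(c, 'y') builds {'new1': 'y', 'new2': 'y'}, which is then
# unpacked into keyword arguments: df.assign(new1='y', new2='y').
out = df.assign(**dict.fromkeys(c, 'y'))
print(out)
```

dict.fromkeys stores the same value object under every key, which is fine for immutable values such as strings or numbers.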
Finally, if you need to assign multiple columns with different values, you can use assign with a dictionary:
c = {'new1': 'w', 'new2': 'y', 'new3': 'z'}
df.assign(**c)
   A  B  C new1 new2 new3
0  x  x  x    w    y    z
1  x  x  x    w    y    z
2  x  x  x    w    y    z
3  x  x  x    w    y    z
With modern pandas you can just do:
df['new'] = 0
Source: Stackoverflow.com