[python] Add column with constant value to pandas dataframe

Given a DataFrame:

np.random.seed(0)
df = pd.DataFrame(np.random.randn(3, 3), columns=list('ABC'), index=[1, 2, 3])
df

          A         B         C
1  1.764052  0.400157  0.978738
2  2.240893  1.867558 -0.977278
3  0.950088 -0.151357 -0.103219

What is the simplest way to add a new column containing a constant value eg 0?

          A         B         C  new
1  1.764052  0.400157  0.978738    0
2  2.240893  1.867558 -0.977278    0
3  0.950088 -0.151357 -0.103219    0

This is my solution, but I don't know why this puts NaN into 'new' column?

df['new'] = pd.Series([0 for x in range(len(df.index))])

          A         B         C  new
1  1.764052  0.400157  0.978738  0.0
2  2.240893  1.867558 -0.977278  0.0
3  0.950088 -0.151357 -0.103219  NaN

This question is related to python pandas

The answer is


Here is another one liner using lambdas (create column with constant value = 10)

df['newCol'] = df.apply(lambda x: 10, axis=1)

before

df
    A           B           C
1   1.764052    0.400157    0.978738
2   2.240893    1.867558    -0.977278
3   0.950088    -0.151357   -0.103219

after

df
        A           B           C           newCol
    1   1.764052    0.400157    0.978738    10
    2   2.240893    1.867558    -0.977278   10
    3   0.950088    -0.151357   -0.103219   10

Super simple in-place assignment: df['new'] = 0

For in-place modification, perform direct assignment. This assignment is broadcasted by pandas for each row.

df = pd.DataFrame('x', index=range(4), columns=list('ABC'))
df

   A  B  C
0  x  x  x
1  x  x  x
2  x  x  x
3  x  x  x

df['new'] = 'y'
# Same as,
# df.loc[:, 'new'] = 'y'
df

   A  B  C new
0  x  x  x   y
1  x  x  x   y
2  x  x  x   y
3  x  x  x   y

Note for object columns

If you want to add an column of empty lists, here is my advice:

  • Consider not doing this. object columns are bad news in terms of performance. Rethink how your data is structured.
  • Consider storing your data in a sparse data structure. More information: sparse data structures
  • If you must store a column of lists, ensure not to copy the same reference multiple times.

    # Wrong
    df['new'] = [[]] * len(df)
    # Right
    df['new'] = [[] for _ in range(len(df))]
    

Generating a copy: df.assign(new=0)

If you need a copy instead, use DataFrame.assign:

df.assign(new='y')

   A  B  C new
0  x  x  x   y
1  x  x  x   y
2  x  x  x   y
3  x  x  x   y

And, if you need to assign multiple such columns with the same value, this is as simple as,

c = ['new1', 'new2', ...]
df.assign(**dict.fromkeys(c, 'y'))

   A  B  C new1 new2
0  x  x  x    y    y
1  x  x  x    y    y
2  x  x  x    y    y
3  x  x  x    y    y

Multiple column assignment

Finally, if you need to assign multiple columns with different values, you can use assign with a dictionary.

c = {'new1': 'w', 'new2': 'y', 'new3': 'z'}
df.assign(**c)

   A  B  C new1 new2 new3
0  x  x  x    w    y    z
1  x  x  x    w    y    z
2  x  x  x    w    y    z
3  x  x  x    w    y    z

With modern pandas you can just do:

df['new'] = 0