[python] Set value to an entire column of a pandas dataframe

I'm trying to set the entire column of a dataframe to a specific value.

In  [1]: df
Out [1]: 
     issueid   industry
0        001        xxx
1        002        xxx
2        003        xxx
3        004        xxx
4        005        xxx

From what I've seen, loc is the best practice when replacing values in a dataframe (or isn't it?):

In  [2]: df.loc[:,'industry'] = 'yyy'

However, I still received this much talked-about warning message:

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead

If I do

In  [3]: df['industry'] = 'yyy'

I got the same warning message.

Any ideas? Working with Python 3.5.2 and pandas 0.18.1.

This question is related to python pandas dataframe

The answer is


Seems to me that:

df1 = df[df['col1']==some_value] WILL NOT create a new DataFrame, basically, changes in df1 will be reflected in the parent df. This leads to the warning. Whereas, df1 = df[df['col1]]==some_value].copy() WILL create a new DataFrame, and changes in df1 will not be reflected in df. the copy() method is recommended if you don't want to make changes to your original df.


I had a similar issue before even with this approach df.loc[:,'industry'] = 'yyy', but once I refreshed the notebook, it ran well.

You may want to try refreshing the cells after you have df.loc[:,'industry'] = 'yyy'.


You can do :

df['industry'] = 'yyy'

df.loc[:,'industry'] = 'yyy'

This does the magic. You are to add '.loc' with ':' for all rows. Hope it helps


This provides you with the possibility of adding conditions on the rows and then change all the cells of a specific column corresponding to those rows:

df.loc[(df['issueid'] == '001'), 'industry'] = str('yyy')

Assuming your Data frame is like 'Data' you have to consider if your data is a string or an integer. Both are treated differently. So in this case you need be specific about that.

import pandas as pd

data = [('001','xxx'), ('002','xxx'), ('003','xxx'), ('004','xxx'), ('005','xxx')]

df = pd.DataFrame(data,columns=['issueid', 'industry'])

print("Old DataFrame")
print(df)

df.loc[:,'industry'] = str('yyy')

print("New DataFrame")
print(df)

Now if want to put numbers instead of letters you must create and array

list_of_ones = [1,1,1,1,1]
df.loc[:,'industry'] = list_of_ones
print(df)

Or if you are using Numpy

import numpy as np
n = len(df)
df.loc[:,'industry'] = np.ones(n)
print(df)

You can use the assign function:

df = df.assign(industry='yyy')

Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to pandas

xlrd.biffh.XLRDError: Excel xlsx file; not supported Pandas Merging 101 How to increase image size of pandas.DataFrame.plot in jupyter notebook? Trying to merge 2 dataframes but get ValueError Python Pandas User Warning: Sorting because non-concatenation axis is not aligned How to show all of columns name on pandas dataframe? Pandas/Python: Set value of one column based on value in another column Python Pandas - Find difference between two data frames Pandas get the most frequent values of a column Python convert object to float

Examples related to dataframe

Trying to merge 2 dataframes but get ValueError How to show all of columns name on pandas dataframe? Python Pandas - Find difference between two data frames Pandas get the most frequent values of a column Display all dataframe columns in a Jupyter Python Notebook How to convert column with string type to int form in pyspark data frame? Display/Print one column from a DataFrame of Series in Pandas Binning column with python pandas Selection with .loc in python Set value to an entire column of a pandas dataframe