[python] Extracting specific selected columns to new DataFrame as a copy

I have a pandas DataFrame with 4 columns and I want to create a new DataFrame that only has three of the columns. This question is similar to: Extracting specific columns from a data frame but for pandas not R. The following code does not work, raises an error, and is certainly not the pandasnic way to do it.

import pandas as pd
old = pd.DataFrame({'A' : [4,5], 'B' : [10,20], 'C' : [100,50], 'D' : [-30,-50]})
new = pd.DataFrame(zip(old.A, old.C, old.D)) # raises TypeError: data argument can't be an iterator 

What is the pandasnic way to do it?

This question is related to python pandas chained-assignment

The answer is


The easiest way is

new = old[['A','C','D']]

.


Generic functional form

def select_columns(data_frame, column_names):
    new_frame = data_frame.loc[:, column_names]
    return new_frame

Specific for your problem above

selected_columns = ['A', 'C', 'D']
new = select_columns(old, selected_columns)

columns by index:

# selected column index: 1, 6, 7
new = old.iloc[: , [1, 6, 7]].copy() 

As far as I can tell, you don't necessarily need to specify the axis when using the filter function.

new = old.filter(['A','B','D'])

returns the same dataframe as

new = old.filter(['A','B','D'], axis=1)

Another simpler way seems to be:

new = pd.DataFrame([old.A, old.B, old.C]).transpose()

where old.column_name will give you a series. Make a list of all the column-series you want to retain and pass it to the DataFrame constructor. We need to do a transpose to adjust the shape.

In [14]:pd.DataFrame([old.A, old.B, old.C]).transpose()
Out[14]: 
   A   B    C
0  4  10  100
1  5  20   50

If you want to have a new data frame then:

import pandas as pd
old = pd.DataFrame({'A' : [4,5], 'B' : [10,20], 'C' : [100,50], 'D' : [-30,-50]})
new=  old[['A', 'C', 'D']]

Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to pandas

xlrd.biffh.XLRDError: Excel xlsx file; not supported Pandas Merging 101 How to increase image size of pandas.DataFrame.plot in jupyter notebook? Trying to merge 2 dataframes but get ValueError Python Pandas User Warning: Sorting because non-concatenation axis is not aligned How to show all of columns name on pandas dataframe? Pandas/Python: Set value of one column based on value in another column Python Pandas - Find difference between two data frames Pandas get the most frequent values of a column Python convert object to float

Examples related to chained-assignment

Extracting specific selected columns to new DataFrame as a copy Python: Pandas Dataframe how to multiply entire column with a scalar why should I make a copy of a data frame in pandas How to deal with SettingWithCopyWarning in Pandas Adding new column to existing DataFrame in Python pandas