[python] Move column by name to front of table in pandas

Here is my df:

                             Net   Upper   Lower  Mid  Zsore
Answer option                                                
More than once a day          0%   0.22%  -0.12%   2    65 
Once a day                    0%   0.32%  -0.19%   3    45
Several times a week          2%   2.45%   1.10%   4    78
Once a week                   1%   1.63%  -0.40%   6    65

How can I move a column by name ("Mid") to the front of the table, index 0. This is what the result should look like:

                             Mid   Upper   Lower  Net  Zsore
Answer option                                                
More than once a day          2   0.22%  -0.12%   0%    65 
Once a day                    3   0.32%  -0.19%   0%    45
Several times a week          4   2.45%   1.10%   2%    78
Once a week                   6   1.63%  -0.40%   1%    65

My current code moves the column by index using df.columns.tolist() but I'd like to shift it by name.

This question is related to python pandas move dataframe shift

The answer is


Here is a very simple answer to this.

Don't forget the two (()) 'brackets' around columns names.Otherwise, it'll give you an error.


# here you can add below line and it should work 
df = df[list(('Mid','Upper', 'Lower', 'Net','Zsore'))]
df

                             Mid   Upper   Lower  Net  Zsore
Answer option                                                
More than once a day          2   0.22%  -0.12%   0%    65 
Once a day                    3   0.32%  -0.19%   0%    45
Several times a week          4   2.45%   1.10%   2%    78
Once a week                   6   1.63%  -0.40%   1%    65

To reorder the rows of a DataFrame just use a list as follows.

df = df[['Mid', 'Net', 'Upper', 'Lower', 'Zsore']]

This makes it very obvious what was done when reading the code later. Also use:

df.columns
Out[1]: Index(['Net', 'Upper', 'Lower', 'Mid', 'Zsore'], dtype='object')

Then cut and paste to reorder.


For a DataFrame with many columns, store the list of columns in a variable and pop the desired column to the front of the list. Here is an example:

cols = [str(col_name) for col_name in range(1001)]
data = np.random.rand(10,1001)
df = pd.DataFrame(data=data, columns=cols)

mv_col = cols.pop(cols.index('77'))
df = df[[mv_col] + cols]

Now df.columns has.

Index(['77', '0', '1', '2', '3', '4', '5', '6', '7', '8',
       ...
       '991', '992', '993', '994', '995', '996', '997', '998', '999', '1000'],
      dtype='object', length=1001)

I didn't like how I had to explicitly specify all the other column in the other solutions so this worked best for me. Though it might be slow for large dataframes...?

df = df.set_index('Mid').reset_index()


Here is a generic set of code that I frequently use to rearrange the position of columns. You may find it useful.

cols = df.columns.tolist()
n = int(cols.index('Mid'))
cols = [cols[n]] + cols[:n] + cols[n+1:]
df = df[cols]

You can use the df.reindex() function in pandas. df is

                      Net  Upper   Lower  Mid  Zsore
Answer option                                      
More than once a day  0%  0.22%  -0.12%    2     65
Once a day            0%  0.32%  -0.19%    3     45
Several times a week  2%  2.45%   1.10%    4     78
Once a week           1%  1.63%  -0.40%    6     65

define an list of column names

cols = df.columns.tolist()
cols
Out[13]: ['Net', 'Upper', 'Lower', 'Mid', 'Zsore']

move the column name to wherever you want

cols.insert(0, cols.pop(cols.index('Mid')))
cols
Out[16]: ['Mid', 'Net', 'Upper', 'Lower', 'Zsore']

then use df.reindex() function to reorder

df = df.reindex(columns= cols)

out put is: df

                      Mid  Upper   Lower Net  Zsore
Answer option                                      
More than once a day    2  0.22%  -0.12%  0%     65
Once a day              3  0.32%  -0.19%  0%     45
Several times a week    4  2.45%   1.10%  2%     78
Once a week             6  1.63%  -0.40%  1%     65

Maybe I'm missing something, but a lot of these answers seem overly complicated. You should be able to just set the columns within a single list:

Column to the front:

df = df[ ['Mid'] + [ col for col in df.columns if col != 'Mid' ] ]

Or if instead, you want to move it to the back:

df = df[ [ col for col in df.columns if col != 'Mid' ] + ['Mid'] ]

Or if you wanted to move more than one column:

cols_to_move = ['Mid', 'Zsore']
df           = df[ cols_to_move + [ col for col in df.columns if col not in cols_to_move ] ]

I prefer this solution:

col = df.pop("Mid")
df.insert(0, col.name, col)

It's simpler to read and faster than other suggested answers.

def move_column_inplace(df, col, pos):
    col = df.pop(col)
    df.insert(pos, col.name, col)

Performance assessment:

For this test, the currently last column is moved to the front in each repetition. In-place methods generally perform better. While citynorman's solution can be made in-place, Ed Chum's method based on .loc and sachinnm's method based on reindex cannot.

While other methods are generic, citynorman's solution is limited to pos=0. I didn't observe any performance difference between df.loc[cols] and df[cols], which is why I didn't include some other suggestions.

I tested with python 3.6.8 and pandas 0.24.2 on a MacBook Pro (Mid 2015).

import numpy as np
import pandas as pd

n_cols = 11
df = pd.DataFrame(np.random.randn(200000, n_cols),
                  columns=range(n_cols))

def move_column_inplace(df, col, pos):
    col = df.pop(col)
    df.insert(pos, col.name, col)

def move_to_front_normanius_inplace(df, col):
    move_column_inplace(df, col, 0)
    return df

def move_to_front_chum(df, col):
    cols = list(df)
    cols.insert(0, cols.pop(cols.index(col)))
    return df.loc[:, cols]

def move_to_front_chum_inplace(df, col):
    col = df[col]
    df.drop(col.name, axis=1, inplace=True)
    df.insert(0, col.name, col)
    return df

def move_to_front_elpastor(df, col):
    cols = [col] + [ c for c in df.columns if c!=col ]
    return df[cols] # or df.loc[cols]

def move_to_front_sachinmm(df, col):
    cols = df.columns.tolist()
    cols.insert(0, cols.pop(cols.index(col)))
    df = df.reindex(columns=cols, copy=False)
    return df

def move_to_front_citynorman_inplace(df, col):
    # This approach exploits that reset_index() moves the index
    # at the first position of the data frame.
    df.set_index(col, inplace=True)
    df.reset_index(inplace=True)
    return df

def test(method, df):
    col = np.random.randint(0, n_cols)
    method(df, col)

col = np.random.randint(0, n_cols)
ret_mine = move_to_front_normanius_inplace(df.copy(), col)
ret_chum1 = move_to_front_chum(df.copy(), col)
ret_chum2 = move_to_front_chum_inplace(df.copy(), col)
ret_elpas = move_to_front_elpastor(df.copy(), col)
ret_sach = move_to_front_sachinmm(df.copy(), col)
ret_city = move_to_front_citynorman_inplace(df.copy(), col)

# Assert equivalence of solutions.
assert(ret_mine.equals(ret_chum1))
assert(ret_mine.equals(ret_chum2))
assert(ret_mine.equals(ret_elpas))
assert(ret_mine.equals(ret_sach))
assert(ret_mine.equals(ret_city))

Results:

# For n_cols = 11:
%timeit test(move_to_front_normanius_inplace, df)
# 1.05 ms ± 42.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit test(move_to_front_citynorman_inplace, df)
# 1.68 ms ± 46.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit test(move_to_front_sachinmm, df)
# 3.24 ms ± 96.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum, df)
# 3.84 ms ± 114 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_elpastor, df)
# 3.85 ms ± 58.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum_inplace, df)
# 9.67 ms ± 101 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


# For n_cols = 31:
%timeit test(move_to_front_normanius_inplace, df)
# 1.26 ms ± 31.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_citynorman_inplace, df)
# 1.95 ms ± 260 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_sachinmm, df)
# 10.7 ms ± 348 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum, df)
# 11.5 ms ± 869 µs per loop (mean ± std. dev. of 7 runs, 100 loops each
%timeit test(move_to_front_elpastor, df)
# 11.4 ms ± 598 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum_inplace, df)
# 31.4 ms ± 1.89 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

The most simplist thing you can try is:

df=df[[ 'Mid',   'Upper',   'Lower', 'Net'  , 'Zsore']]

df.set_index('Mid').reset_index()

seems to be a pretty easy way about this.


Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to pandas

xlrd.biffh.XLRDError: Excel xlsx file; not supported Pandas Merging 101 How to increase image size of pandas.DataFrame.plot in jupyter notebook? Trying to merge 2 dataframes but get ValueError Python Pandas User Warning: Sorting because non-concatenation axis is not aligned How to show all of columns name on pandas dataframe? Pandas/Python: Set value of one column based on value in another column Python Pandas - Find difference between two data frames Pandas get the most frequent values of a column Python convert object to float

Examples related to move

Gradle - Move a folder from ABC to XYZ Moving all files from one directory to another using Python Move column by name to front of table in pandas PHP - Move a file into a different folder on the server How to move table from one tablespace to another in oracle 11g Batch file to move files to another directory Windows batch script to move files How to move files using FTP commands Moveable/draggable <div> How to move a marker in Google Maps API

Examples related to dataframe

Trying to merge 2 dataframes but get ValueError How to show all of columns name on pandas dataframe? Python Pandas - Find difference between two data frames Pandas get the most frequent values of a column Display all dataframe columns in a Jupyter Python Notebook How to convert column with string type to int form in pyspark data frame? Display/Print one column from a DataFrame of Series in Pandas Binning column with python pandas Selection with .loc in python Set value to an entire column of a pandas dataframe

Examples related to shift

Move column by name to front of table in pandas Java, Shifting Elements in an Array