One use-case is that you have named (some of) your columns with some prefix, and you want the columns sorted with those prefixes all together and in some particular order (not alphabetical).
For example, you might start all of your features with Ft_
, labels with Lbl_
, etc, and you want all unprefixed columns first, then all features, then the label. You can do this with the following function (I will note a possible efficiency problem using sum
to reduce lists, but this isn't an issue unless you have a LOT of columns, which I do not):
def sortedcols(df, groups = ['Ft_', 'Lbl_'] ):
return df[ sum([list(filter(re.compile(r).search, list(df.columns).copy())) for r in (lambda l: ['^(?!(%s))' % '|'.join(l)] + ['^%s' % i for i in l ] )(groups) ], []) ]