Problem
If I have a multi-level column index:
>>> cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
>>> pd.DataFrame([[1,2], [3,4]], columns=cols)
     a
   ---+--
    b | c
--+---+--
0 | 1 | 2
1 | 3 | 4
How can I drop the "a" level of that index, so that I end up with:
    b | c
--+---+--
0 | 1 | 2
1 | 3 | 4
Asked by David Wolever
Solution #1
You can use MultiIndex.droplevel:
>>> cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
>>> df = pd.DataFrame([[1,2], [3,4]], columns=cols)
>>> df
   a
   b  c
0  1  2
1  3  4
[2 rows x 2 columns]
>>> df.columns = df.columns.droplevel()
>>> df
   b  c
0  1  2
1  3  4
[2 rows x 2 columns]
Answered by DSM
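As a further illustration of Solution #1: droplevel() drops level 0 by default, but it also accepts a level name. The sketch below assumes the column levels have been given names; "top" and "bottom" are made-up names for this example and do not appear in the question.

```python
import pandas as pd

# Level names "top" and "bottom" are hypothetical, chosen for illustration.
cols = pd.MultiIndex.from_tuples(
    [("a", "b"), ("a", "c")], names=["top", "bottom"]
)
df = pd.DataFrame([[1, 2], [3, 4]], columns=cols)

# Drop the level by name; droplevel(0) would drop the same level by position.
df.columns = df.columns.droplevel("top")
print(df)
```

Dropping by name can be easier to read than a bare position when a frame has more than two column levels.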
Solution #2
Another way to drop the top level is to use a list comprehension:
df.columns = [col[1] for col in df.columns]
   b  c
0  1  2
1  3  4
This approach is also useful if you want to combine the names from both levels, as in the example below, where the bottom level contains two 'y' columns:
cols = pd.MultiIndex.from_tuples([("A", "x"), ("A", "y"), ("B", "y")])
df = pd.DataFrame([[1, 2, 8], [3, 4, 9]], columns=cols)
   A     B
   x  y  y
0  1  2  8
1  3  4  9
Dropping the top level here would leave two columns named 'y'. Joining the names from both levels in the list comprehension avoids that:
df.columns = ['_'.join(col) for col in df.columns]
   A_x  A_y  B_y
0    1    2    8
1    3    4    9
I ran into this problem after doing a groupby, and it took me a while to find the solution. I adapted that solution to this specific case.
Answered by Mint
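To make the groupby case mentioned in Solution #2 concrete, here is a minimal sketch. The frame, the column names "key" and "val", and the chosen aggregations are invented for illustration. It also wraps each label in str(), on the assumption that some labels may not be strings, since '_'.join() raises a TypeError for non-string items.

```python
import pandas as pd

# Hypothetical data; "key" and "val" are made-up column names.
df = pd.DataFrame({"key": ["p", "p", "q"], "val": [1, 2, 3]})

# Aggregating with a list of functions yields two-level columns:
# ("val", "min") and ("val", "max").
agg = df.groupby("key").agg({"val": ["min", "max"]})

# Flatten by joining both levels; map(str, ...) guards against
# non-string labels such as integers.
agg.columns = ["_".join(map(str, col)) for col in agg.columns]
print(agg)
```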
Solution #3
As of pandas 0.24.0, we can use DataFrame.droplevel():
cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
df = pd.DataFrame([[1,2], [3,4]], columns=cols)
df.droplevel(0, axis=1)
# b c
#0 1 2
#1 3 4
This is very useful if you want to keep a DataFrame method chain going.
Answered by jxc
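A short sketch of what such a method chain in Solution #3 might look like. The new column names "first", "second" and "total" are made up for this example. Because DataFrame.droplevel() returns a new frame rather than modifying df in place, it can sit in the middle of a chain.

```python
import pandas as pd

cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
df = pd.DataFrame([[1, 2], [3, 4]], columns=cols)

# "first", "second" and "total" are hypothetical names for illustration.
result = (
    df.droplevel(0, axis=1)                              # drop the "a" level
      .rename(columns={"b": "first", "c": "second"})     # rename flat columns
      .assign(total=lambda d: d["first"] + d["second"])  # add a derived column
)
print(result)
```

The original df keeps its two-level columns, since nothing in the chain assigns back to it.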
Solution #4
Another way to do this is to reassign df based on a cross section of df, using the .xs method.
>>> df
   a
   b  c
0  1  2
1  3  4
>>> df = df.xs('a', axis=1, drop_level=True)
# 'a' : key on which to get cross section
# axis=1 : get cross section of column
# drop_level=True : returns cross section without the multilevel index
>>> df
   b  c
0  1  2
1  3  4
Answered by spacetyper
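One point about Solution #4 worth illustrating: .xs selects only the columns under the given key, so when the top level has more than one label, the other columns are left out of the result. The sketch below reuses the A/B frame from Solution #2 to show this.

```python
import pandas as pd

cols = pd.MultiIndex.from_tuples([("A", "x"), ("A", "y"), ("B", "y")])
df = pd.DataFrame([[1, 2, 8], [3, 4, 9]], columns=cols)

# Only the columns under "A" are kept; the ("B", "y") column is not
# part of this cross section. drop_level=True is the default.
sub = df.xs("A", axis=1)
print(sub)
```

So .xs fits the question's frame, where every column sits under "a", but it is a selection rather than a general way to flatten the column index.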
Solution #5
It’s also possible to achieve this by renaming the columns:
df.columns = ['a', 'b']
This requires a manual step, but it could be an option, especially if you would eventually rename the columns anyway.
Answered by sedeh
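If the goal in Solution #5 is only to keep the existing bottom-level names rather than choose new ones, the manual list can be avoided. The sketch below is a different technique from the one in the answer: it reads the labels from the index with MultiIndex.get_level_values() instead of typing them out.

```python
import pandas as pd

cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
df = pd.DataFrame([[1, 2], [3, 4]], columns=cols)

# Take the labels of level 1 (the bottom level) as the new flat columns.
df.columns = df.columns.get_level_values(1)
print(df)
```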
Post is based on https://stackoverflow.com/questions/22233488/pandas-drop-a-level-from-a-multi-level-column-index