I have a dataframe that looks like the following
color x y
0 red 0 0
1 red 1 1
2 red 2 2
3 red 3 3
4 red 4 4
5 red 5 5
6 red 6 6
7 red 7 7
8 red 8 8
9 red 9 9
10 blue 0 0
11 blue 1 1
12 blue 2 4
13 blue 3 9
14 blue 4 16
15 blue 5 25
16 blue 6 36
17 blue 7 49
18 blue 8 64
19 blue 9 81
I ultimately want two lines, one blue, one red. The red line should essentially be y=x and the blue line should be y=x^2
When I do the following:
df.plot(x='x', y='y')
The output is this:
Is there a way to make pandas know that there are two sets? And group them accordingly. I'd like to be able to specify the column color
as the set differentiator
You could use groupby
to split the DataFrame into subgroups according to the color:
for key, grp in df.groupby(['color']):
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_table('data', sep='\s+')
fig, ax = plt.subplots()
for key, grp in df.groupby(['color']):
ax = grp.plot(ax=ax, kind='line', x='x', y='y', c=key, label=key)
plt.legend(loc='best')
plt.show()
yields
If you have seaborn
installed, an easier method that does not require you to perform pivot
:
import seaborn as sns
sns.lineplot(data=df, x='x', y='y', hue='color')
You can use this code to get your desire output
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'color': ['red','red','red','blue','blue','blue'], 'x': [0,1,2,3,4,5],'y': [0,1,2,9,16,25]})
print df
color x y
0 red 0 0
1 red 1 1
2 red 2 2
3 blue 3 9
4 blue 4 16
5 blue 5 25
To plot graph
a = df.iloc[[i for i in xrange(0,len(df)) if df['x'][i]==df['y'][i]]].plot(x='x',y='y',color = 'red')
df.iloc[[i for i in xrange(0,len(df)) if df['y'][i]== df['x'][i]**2]].plot(x='x',y='y',color = 'blue',ax=a)
plt.show()
Output
Source: Stackoverflow.com