Seaborn Showing Values Not Found In Pandas Columns
Solution 1:
I am not certain why this happens, but there is an easy way to make the legend show only the values you want, [3, 6, 8, 10].
import numpy as np
import pandas as pd
import seaborn as sns

# Create mock data
dp = pd.concat([pd.DataFrame(np.random.randint(1, 4, [100, 1])),
                pd.DataFrame(np.random.randint(1, 14, [100, 1])),
                pd.DataFrame([3.0]*20 + [6.0]*20 + [8.0]*20 + [10.0]*20 + [11.0]*20)], axis=1)
dp.columns = ["numyear", "numgrade", "numdept"]
# Keep only the departments of interest
dtest = pd.DataFrame(dp[dp['numdept'].isin([3, 6, 8, 10])]).dropna()
dtest.reset_index(drop=True, inplace=True)
# hue_order fixes both which values appear in the legend and their order
sns.boxplot(x="numyear", y="numgrade", hue="numdept", data=dtest,
            hue_order=[10, 3, 8, 6])
Here I have added hue_order, which specifies both the exact values I'd like to see and their order (I chose a non-numeric order to emphasise this). If you specified [1, 2, 3, 6, 8, 10] instead, those would be the legend entries.
Finally, you could generalise this nicely using the following,
# sorted() returns a new list; calling .sort() on the array sorts in place and returns None
sns.boxplot(x="numyear", y="numgrade", hue="numdept", data=dtest,
            hue_order=sorted(dtest.numdept.unique()), width=0.2)
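When building hue_order from the data, it matters how the unique values are sorted. The sketch below uses a small made-up Series (the values are assumptions for illustration, not the question's data) to contrast NumPy's in-place sort, which returns None, with the built-in sorted(), which returns a new list that can be passed as hue_order.

```python
import numpy as np
import pandas as pd

# Hypothetical column values, for illustration only
numdept = pd.Series([8.0, 3.0, 10.0, 6.0, 3.0, 8.0])

# ndarray.sort() sorts the array in place and returns None
values = np.array(numdept.unique())
in_place_result = values.sort()

# sorted() returns a new, sorted list of the unique values
hue_order = sorted(numdept.unique())

print(in_place_result)
print(hue_order)
```

Passing None as hue_order leaves seaborn to infer the levels itself, so a call that relies on the return value of .sort() does not sort or restrict the legend.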
Solution 2:
You are using a categorical variable. It appears the legend is based on the categories in the categorical variable, not the values that are actually present. A categorical variable may represent categories that don't actually occur in the data, and these categories are still shown in the legend.
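As suggested in the documentation, you can remove the empty categories with
dtest["numdept"] = dtest.numdept.cat.remove_unused_categories()
The method returns a new Series rather than modifying the column in place, so assign the result back.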
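As a minimal sketch of the categorical behaviour described here, the example below builds a small made-up DataFrame (the data and the declared category list are assumptions for illustration) whose categorical column declares more categories than actually occur, then drops the unused ones.

```python
import pandas as pd

# Hypothetical data: the column declares categories 1, 2 and 11,
# but none of those values actually occur in it
dtest = pd.DataFrame({
    "numdept": pd.Categorical([3, 6, 8, 10, 3, 6],
                              categories=[1, 2, 3, 6, 8, 10, 11]),
})

# Declared categories, including the ones with no rows
before = list(dtest["numdept"].cat.categories)

# remove_unused_categories() returns a new Series, so assign it back
dtest["numdept"] = dtest["numdept"].cat.remove_unused_categories()

# Only categories that are present in the data remain
after = list(dtest["numdept"].cat.categories)

print(before)
print(after)
```

After the assignment the column's categories match the values that are actually present, so a plot that draws its legend from the categories no longer has the empty ones to show.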