
Color Scale By Rows In Seaborn Heatmap

I would like to make a heatmap in Seaborn where the color is scaled by rows. That is, the highest value in a row gets the highest color on the legend and the lowest value in a row gets the lowest.

Solution 1:

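Using numpy.argsort you can find the order of the values in each row; applying it a second time turns that order into the rank of each value within its row (0 for the smallest). Using these ranks as the basis for the coloring gives a per-row mapping.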

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import FixedFormatter

np.random.seed(42)  # reproducible example data

data = np.random.randint(1,250, size=(10,6))
# argsort twice: rank of each value within its row (0 = smallest)
b = np.argsort(np.argsort(data, axis=1), axis=1)

# color by rank, not by raw value
im = plt.imshow(b, aspect="auto", cmap="coolwarm")
plt.colorbar(im, ticks=np.array([0.0, 0.5, 1.0])*b.max(), 
             format=FixedFormatter(["low", "middle", "high"]))

# annotate each cell with the original value
for i in range(data.shape[0]):
    for j in range(data.shape[1]):
        plt.text(j,i,data[i,j], ha="center", va="center")

plt.show()
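
To see what the double argsort does, here is a minimal sketch on a toy array (the array `toy` and its values are made up for illustration and are not part of the answer). The first argsort returns the column positions that would sort each row; the second converts those positions into per-row ranks, regardless of the magnitude of the values.

```python
import numpy as np

# Hypothetical toy data: two rows with very different magnitudes.
toy = np.array([[30, 10, 20],
                [500, 900, 700]])

# Column positions that would sort each row in ascending order.
order = np.argsort(toy, axis=1)
# Rank of each value within its own row (0 = smallest).
ranks = np.argsort(order, axis=1)

print(ranks)
# [[2 0 1]
#  [0 2 1]]
```

Both rows end up on the same 0 to 2 scale, which is why every row uses the full color range. Note that tied values still receive distinct ranks, so equal cells in a row would not share a color.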


Solution 2:

Using pandas to divide each row by its maximum, we get a coloring where the maximum is dark red and the other columns are colored according to their ratio to the maximum. So a column almost equal to the maximum will be a lighter red, a column with only about half of the maximum's sales will be close to white, and a column with almost no sales will be blue.

The colorbar indicates the percentage compared to the maximum for each row.

import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt
import random

sources = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
categories = [f'Cat {i}' for i in range(1, 5)]
data = [[s, c, random.randint(2, 50)] for s in sources for c in categories]
sales = pd.DataFrame(data, columns=['Sources', 'Category', 'Value'])

# create a dataframe grouped by Sources and Category
per_source_cat = sales.groupby(['Sources', 'Category']).agg({'Value': 'sum'})
# calculate the maximum for each source
max_per_source = sales.groupby(['Sources']).agg({'Value': 'max'})
# divide the sales of each source by the maximum for that source
per_source_cat = per_source_cat.div(max_per_source, level='Sources') * 100
# convert to a pivot table
per_source_cat = per_source_cat.pivot_table(index='Sources', columns='Category', values='Value')
# convert the sales to a compatible pivot table
sales = sales.pivot_table(index='Sources', columns='Category', values='Value')
sns.heatmap(per_source_cat, cmap='coolwarm', annot=sales, fmt='g', linewidths=1, linecolor='black', ).set_title('Sales')
plt.show()
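
If the data is already in wide form (one row per source, one column per category), the same scaling can be sketched without the groupby step. This is only an illustration under that assumption; the table `wide` and its values are hypothetical.

```python
import pandas as pd

# Hypothetical wide table: one row per source, one column per category.
wide = pd.DataFrame(
    {"Cat 1": [10, 40], "Cat 2": [20, 20], "Cat 3": [5, 10]},
    index=pd.Index(["A", "B"], name="Sources"),
)

# Divide every row by its own maximum; axis=0 aligns the row maxima on the index.
pct_of_max = wide.div(wide.max(axis=1), axis=0) * 100

print(pct_of_max)
```

Passing `pct_of_max` as the data and `wide` as `annot` to `sns.heatmap` should then color by percentage while showing the raw values. By default the color range is taken from the data, so 50% is only approximately white; setting `vmin=0, vmax=100` would pin the midpoint of the colormap to exactly 50%.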


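Alternatively, suppose you want to color the highest value red and the lowest blue, regardless of whether they are close together or not. Then subtracting the minimum and dividing by the difference between maximum and minimum can define the coloring. A row in which all values are equal causes a division by zero, which can be handled using fillna. The snippet below replaces the scaling step of the code above and is applied to the original long-format sales DataFrame, before it is pivoted.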

# create a dataframe grouped by Sources and Category
per_source_cat = sales.groupby(['Sources', 'Category']).agg({'Value': 'sum'})
# calculate the maximum and minimum for each source
max_per_source = sales.groupby(['Sources']).agg({'Value': 'max'})
min_per_source = sales.groupby(['Sources']).agg({'Value': 'min'})
# subtract the minimum and divide by the difference between maximum and minimum
per_source_cat = (per_source_cat - min_per_source) / (max_per_source - min_per_source) * 100
# in the case of all equal, there will be a division by 0, set every value to 100 %
per_source_cat = per_source_cat.fillna(100.0)


Now the colorbar indicates 100% for the highest, 0% for the lowest and the others colored in proportion.
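
The min-max scaling and the handling of an all-equal row can be checked on a small wide table. This is a rough sketch, not the answer's exact code: the helper name `minmax_rows` and the sample values are made up for illustration.

```python
import pandas as pd

def minmax_rows(wide):
    """Scale each row to 0..100 between its own minimum and maximum (hypothetical helper)."""
    row_min = wide.min(axis=1)
    row_range = wide.max(axis=1) - row_min
    # A constant row has row_range == 0, so 0 / 0 yields NaN for that row.
    scaled = wide.sub(row_min, axis=0).div(row_range, axis=0) * 100
    # Treat every cell of a constant row as 100 %, as in the answer above.
    return scaled.fillna(100.0)

# Hypothetical data: row "B" is constant and triggers the division by zero.
wide = pd.DataFrame(
    [[2, 4, 6], [5, 5, 5]],
    index=["A", "B"],
    columns=["Cat 1", "Cat 2", "Cat 3"],
)
scaled = minmax_rows(wide)

print(scaled)
```

One caveat of this approach: `fillna` also fills cells that were missing in the input, so data with genuine gaps would need those handled separately.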
