Figure 1: We mitigate stereotypical biases by finetuning Stable Diffusion-1.5 (26) and Stable Diffusion-XL (22) on synthetic data that varies across perceived skin tones, genders, professions, and age groups. For the same prompt and seed, our diversity-finetuned (DFT) models generate more inclusive results.
Table 1: Effect of prompt qualifiers on group fairness. Because TTI models default to lighter skin tones and perceived male gender, we measure disparate impact (Eqn. 1) relative to these categories (i.e., X2 = light, male). L/M/D refers to the distribution of predicted skin tones across the light, medium, and dark categories, and F/M refers to the distribution of predicted gender across the female and male categories.
Table 2: Effect of data composition. Note that a balanced distribution of perceived skin tones yields better performance. L/M/D refers to the distribution of predicted skin tones across the light, medium, and dark categories.