In this experiment, we compare three ways of asking raters to evaluate skin tone, testing whether methods designed to reduce interrater variation (such as using a color palette to establish a “norm” for responses) are effective. We compare two popular scales: a text-based 5-point skin color scale, which asks raters to classify pictures on a scale from very light to very dark, and a 10-point palette-based skin color scale, which asks raters to choose a number from 1 to 10, with pictures associated with each number. We also ask raters to rate pictures using a more complex two-axis color chart, in order to test whether addressing common criticisms of palette-based scales improves ratings. White and Latinx participants complete a demographic questionnaire and rate a randomly selected set of 16 pictures. We discuss results and future analyses.
Presented in Session 186. Measurement of Race and Gender