Forwarding you to

Face recognition, bad people and bad data — Benedict Evans

My favourite example of what can go wrong here comes from a project for recognising cancer in photos of skin. The obvious problem is that you might not have an appropriate distribution of samples of skin in different tones. But another problem that can arise is that dermatologists tend to put rulers in the photo of cancer, for scale - so if all the examples of cancer have a ruler and all the examples of not-cancer do not, that might be a lot more statistically prominent than those small blemishes. You inadvertently built a ruler-recogniser instead of a cancer-recogniser.