In 1994, Florida jewelry designer Diana Duyser discovered what she believed to be an image of the Virgin Mary on a grilled cheese sandwich, which she preserved and later auctioned off for $28,000. But how much do we understand about pareidolia, the phenomenon of seeing faces and patterns in objects that aren’t actually there?
A new study from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) investigated this phenomenon by introducing an extensive, human-labeled dataset of 5,000 pareidolic images, far exceeding previous collections. Using this dataset, the team uncovered surprising results about the differences between human and machine perception, and how the ability to see a face in a piece of toast may have saved the lives of our distant ancestors.
“Face pareidolia has long fascinated psychologists, but it has been little explored in the computer vision community,” says Mark Hamilton, a doctoral student in electrical engineering and computer science at MIT, CSAIL affiliate, and lead researcher on the study. “We wanted to create a resource that could help us understand how humans and AI systems process these illusory faces.”
So what did these fake faces reveal? For one thing, AI models don’t seem to recognize pareidolic faces the way we do. Surprisingly, the team found that it was only after training the algorithms to recognize animal faces that they became significantly better at detecting illusory faces. This unexpected connection hints at an evolutionary link between the ability to identify animal faces, which is essential for survival, and the tendency to see faces in inanimate objects. “A result like this suggests that pareidolia may arise not from human social behavior, but from something deeper, like quickly spotting a lurking tiger or noticing which way a deer is looking so our primordial ancestors could hunt it,” says Hamilton.
Another intriguing finding is what the researchers call the “Goldilocks zone of pareidolia,” a range of image complexity in which pareidolia is most likely to occur. “There is a certain range of visual complexity at which both humans and machines are most likely to recognize faces in non-face objects,” says William T. Freeman, professor of electrical engineering and computer science at MIT and principal investigator on the project. “If an image is too simple, there isn’t enough detail to form a face. If it’s too complex, it becomes visual noise.”
To test this, the team developed an equation that models how both humans and algorithms detect illusory faces. Analyzing the equation revealed a clear “pareidolic peak,” corresponding to images with just the right amount of complexity, where faces are most likely to be perceived. This predicted “Goldilocks zone” was then validated in tests with both real human subjects and AI face detection systems.
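The article does not reproduce the equation itself, so the following is only a toy sketch of the idea: a minimal Python model assuming that the probability of perceiving a face follows an inverted-U curve over image complexity (here a Gaussian in log-complexity). The peak location and width are hypothetical parameters, not values from the paper.

```python
import numpy as np

def pareidolia_probability(complexity, peak=1.0, width=0.5):
    """Toy inverted-U model: probability of seeing a face in a non-face
    image as a function of its visual complexity (arbitrary units).
    The curve peaks near `peak` and falls off for images that are too
    simple or too cluttered -- the "Goldilocks zone" described above.
    All parameter values are illustrative assumptions."""
    return np.exp(-((np.log(complexity) - np.log(peak)) ** 2) / (2 * width ** 2))

# Probability is highest near the assumed peak complexity and drops on either side.
for c in [0.1, 0.5, 1.0, 2.0, 10.0]:
    print(f"complexity={c:5.2f} -> p(face perceived)={pareidolia_probability(c):.2f}")
```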
The new dataset, “Faces in Things,” is far larger than those of previous studies, which typically used only 20 to 30 stimuli. This scale allowed the researchers to explore how state-of-the-art face detection algorithms behave after being fine-tuned on pareidolic faces, showing that these algorithms can not only be adapted to detect such faces, but can then act as a silicon stand-in for our own brains, letting the team ask and answer questions about the origins of pareidolic face detection that are impossible to ask of humans.
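The article does not specify which detector or training recipe the team used, so as a rough illustration only, here is a minimal sketch of fine-tuning an off-the-shelf detector (torchvision’s Faster R-CNN, chosen purely for familiarity) on pareidolic-face bounding boxes. The two-class setup (background vs. face) and the `pareidolia_loader` are assumptions.

```python
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Hypothetical setup: swap the detector's classification head for a
# two-class head (background, face) and fine-tune on pareidolic boxes.
model = fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=2)

optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)
model.train()
# for images, targets in pareidolia_loader:  # hypothetical DataLoader over "Faces in Things"
#     losses = model(images, targets)        # dict of detection losses
#     loss = sum(losses.values())
#     optimizer.zero_grad()
#     loss.backward()
#     optimizer.step()
```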
To build the dataset, the team selected approximately 20,000 candidate images from the LAION-5B dataset, which were then meticulously labeled and judged by human annotators. The process involved drawing bounding boxes around perceived faces and answering detailed questions about each one, including the perceived emotion, the apparent age, and whether the face was accidental or intentional. “Collecting and annotating thousands of images was a monumental task,” says Hamilton. “Much of the dataset exists thanks to my mom, a retired banker. She spent countless hours labeling images for our analysis.”
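For concreteness, here is one way such an annotation record might be represented in code. The field names and values are illustrative assumptions based on the attributes the article mentions, not the dataset’s actual schema.

```python
from dataclasses import dataclass

@dataclass
class PareidolicFaceAnnotation:
    """One labeled face, loosely modeled on the attributes described above.
    Field names and types are illustrative assumptions."""
    image_id: str        # source image (drawn from the LAION-5B candidate pool)
    bbox: tuple          # (x_min, y_min, x_max, y_max) in pixels
    perceived_emotion: str   # e.g., "happy", "surprised"
    perceived_age: str       # e.g., "child", "adult"
    accidental: bool         # True if the face is pareidolic rather than intentional

example = PareidolicFaceAnnotation(
    image_id="laion_000123",
    bbox=(40, 60, 180, 210),
    perceived_emotion="surprised",
    perceived_age="adult",
    accidental=True,
)
```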
The research also has potential applications in improving face detection systems by reducing false positives, which could have implications for fields such as self-driving cars, human-computer interaction, and robotics. The dataset and models could also help in areas like product design, where pareidolia can be understood and controlled to create better products. “Imagine being able to automatically adjust the design of a car or a children’s toy to make it look friendlier, or to ensure that a medical device doesn’t appear unintentionally threatening,” says Hamilton.
“It’s fascinating how humans instinctively interpret inanimate objects as having human-like characteristics. For example, when you look at an electrical socket, you might immediately imagine it singing, or even picture how it would ‘move its lips.’ Algorithms, however, don’t naturally recognize these cartoonish faces the way we do,” says Hamilton. “This raises interesting questions: What explains the difference between human perception and algorithmic interpretation? Is pareidolia beneficial or harmful? Why don’t algorithms experience the effect as we do? This classic psychological phenomenon in humans had not been thoroughly explored in algorithms, and these questions prompted our investigation.”
The researchers are already looking ahead as they prepare to share their dataset with the scientific community. Future work could include training vision-language models to understand and describe illusory faces, potentially leading to AI systems that engage with visual stimuli in a more human-like way.
“This is a really enjoyable paper! It’s fun to read and it makes you think. Hamilton et al. propose a tantalizing question: Why do we see faces in things?” says Pietro Perona, the Allen E. Puckett Professor of Electrical Engineering at Caltech, who was not involved in the work. “As they point out, learning from examples involving animal faces goes only halfway to explaining the phenomenon. Thinking about this question may teach us something important about how our visual system generalizes beyond the training it receives throughout life.”
Hamilton and Freeman’s co-authors include Simon Stent, a researcher at the Toyota Research Institute; Ruth Rosenholtz, senior research scientist in the MIT Department of Brain and Cognitive Sciences, NVIDIA research scientist, and former CSAIL member; and CSAIL affiliates Vasha DuTell, Anne Harrington MEng ’23, and research scientist Jennifer Corbett. Their work was supported in part by the National Science Foundation and the CSAIL MenTorEd Opportunities in Research (METEOR) fellowship, and was sponsored by the United States Air Force Research Laboratory and the United States Air Force Artificial Intelligence Accelerator. The MIT SuperCloud and the Lincoln Laboratory Supercomputing Center provided high-performance computing resources for the researchers’ results.
The research was presented this week at the European Conference on Computer Vision.