The artist René Magritte completed a painting of a pipe and coupled it with the words “Ceci n’est pas une pipe” (“This is not a pipe”). Magritte called the painting La trahison des images, “The Treachery of Images.”
Magritte’s assumption was almost diametrically opposed to the common-sense faith that pictures simply show us the world: images in and of themselves have, at best, a very unstable relationship to the things they seem to represent, one that can be sculpted by whoever has the power to say what a particular image means. For Magritte, the meaning of images is relational, open to contestation. At first blush, the painting might seem like a simple semiotic stunt, but the dynamic it underlines points to a much broader politics of representation and self-representation.
Reflect on the relationship between labels and images in a machine learning image classification dataset. Who has the power to label images, and how do those labels, and the machine learning models trained on them, impact society?
Labels and images, the inputs these algorithms train on, are both merely symbolic representations of objects, people, and abstract concepts in the real world. As Magritte’s The Treachery of Images reminds us, the image of a pipe does not fully stand in for an actual, tangible pipe; likewise, a training set built from simple linkages between an image and a text label should not be taken as representative of the things those pairs refer to in real life.
Currently, AI technology and its integration into most digitized aspects of our lives are in the hands of tech giants and government institutions. One of the most telling examples of how the text labels in these large training sets can impact society lies in the way they label people. ImageNet has 2,833 subcategories under the category “Person”, classifying people along a huge range of axes including race, nationality, profession, economic status, behavior, character, and even morality. Setting aside the privacy issue of tech giants like Google collecting the personal data of millions to feed their ever-growing models, the fact that someone could be categorized as a “snob”, “swinger”, “slav”, “tosser”, or “kleptomaniac” simply because of the way they appear in a photograph already poses a huge social problem. Models trained on these biased, unrepresentative image-label pairs are then deployed in facial-recognition security systems and social media content moderation in our everyday lives.
For this assignment, I want Teachable Machine to teach me about proper social etiquette: recognize my middle finger and violently attack my visual cortex for inappropriately displaying profanity in an academic setting, while rewarding me with cute emoticons for behaving normally.
In order for Teachable Machine to recognize my finger, I trained it with four classes of images: my neutral open-hand gesture, no hand present, no person present, and finally, my middle finger. Initially I had only two classes, the finger and no hand, but I realized the model was differentiating between the two based on the presence of skin color rather than the actual hand gesture, so I added another class for my “neutral hand”.
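As a rough illustration of how a sketch like this is typically wired together, here is a minimal p5.js + ml5.js version. It is a sketch under assumptions, not the code of the linked project: the model URL is a placeholder for a Teachable Machine export, and the class name “middle finger” and the reward/punishment visuals are guesses based on the description above.

```javascript
// Minimal sketch: assumes p5.js and ml5.js are loaded via script tags.
// MODEL_URL is a placeholder for a Teachable Machine image-model export.
const MODEL_URL =
  'https://teachablemachine.withgoogle.com/models/YOUR_MODEL_ID/model.json';

let classifier;
let video;
let label = '';

function preload() {
  // Load the image classifier trained in Teachable Machine.
  classifier = ml5.imageClassifier(MODEL_URL);
}

function setup() {
  createCanvas(640, 480);
  video = createCapture(VIDEO);
  video.hide();
  classifyVideo();
}

// Classify the current webcam frame, then loop via the callback.
function classifyVideo() {
  classifier.classify(video, gotResult);
}

function gotResult(error, results) {
  if (error) {
    console.error(error);
    return;
  }
  // Results are sorted by confidence; keep the top label.
  label = results[0].label;
  classifyVideo();
}

function draw() {
  // 'middle finger' is an assumed class name from the training step above.
  if (label === 'middle finger') {
    // "Punish" profanity with rapidly flashing colors
    // (hence the epilepsy warning below).
    background(random(255), random(255), random(255));
  } else {
    // Reward good behavior: show the video and a cute emoticon.
    background(230);
    image(video, 0, 0, width, height);
    textSize(48);
    text('(◕‿◕)', width / 2 - 60, 60);
  }
}
```

The key design choice is the classify-then-recurse loop: each result callback immediately requests the next classification, so the label stays current without blocking `draw()`, which keeps flashing at the full frame rate.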
EPILEPSY WARNING
https://editor.p5js.org/XXHYZ/sketches/rkZm0N0OS