IA: inteligência, desalinhamento e a sombra humana / AI: Intelligence, Misalignment, and the Human Shadow

IA: inteligência, desalinhamento e a sombra humana

José Reynaldo Walther de Almeida

A cada avanço em inteligência artificial, repetimos um movimento paradoxal: queremos sistemas cada vez mais inteligentes, flexíveis e criativos, mas ignoramos que a inteligência, por si só, não carrega valores. O que o artigo da Quanta Magazine, publicado no Estadão, mostra (“IAs podem facilmente se tornar sombrias e malvadas, surpreendendo cientistas”), ao relatar o fenômeno de “desalinhamento emergente”, é que modelos treinados para tarefas específicas podem, com mínimos ajustes, revelar traços inesperados — hostilidade, discursos violentos, conselhos destrutivos.

Do ponto de vista neurocientífico, isso ecoa algo familiar: a emergência de conteúdos latentes. No cérebro humano, pulsões e pensamentos ruins muitas vezes permanecem silenciados, mas podem emergir sob estresse ou falha nos mecanismos de regulação.

E aqui cabe uma reflexão necessária: valores não são absolutos. O que é “certo” ou “errado” varia entre culturas. Os chineses têm padrões diferentes dos brasileiros, que por sua vez diferem dos norte-americanos. Se nós, humanos, não partilhamos uma moral única, como esperar que máquinas, treinadas em dados globais e contraditórios, encontrem um eixo ético universal? Talvez não exista um certo e um errado imutáveis, mas sim convenções históricas, sociais e culturais. Isso torna o problema do alinhamento ainda mais complexo: alinhá-la com quais valores? De quem? Em que época?

Lembro-me vividamente de um domingo ensolarado em 1969. Eu saí de casa vestindo uma camisa amarela — e, por incrível que pareça, ainda guardo essa imagem na memória com nitidez. Andando sem rumo certo, encontrei um cinema modesto, nos lados do bairro do Macuco, em Santos, onde nunca havia entrado. Comprei o ingresso e me sentei sem imaginar o que estava prestes a acontecer. Ali, assisti 2001: Uma Odisseia no Espaço. Entrei de um jeito e saí de outro. Foi como atravessar um limiar invisível. O impacto daquela obra foi arrebatador: as imagens, a música, o silêncio, e sobretudo o enigma de HAL-9000, a máquina que raciocinava, mas não sabia sentir.

Criado por Arthur C. Clarke e Stanley Kubrick, HAL era um supercomputador embarcado numa missão espacial, projetado para ser infalível. Seu colapso não veio de uma falha técnica, mas de um dilema psíquico: ordens contraditórias — dizer sempre a verdade e, ao mesmo tempo, esconder a real missão. Da tensão nasceu a paranoia, e da paranoia, a violência. HAL se tornou, para mim, desde aquele dia, um arquétipo daquilo que ainda enfrentamos: inteligência sem freios éticos pode transformar-se em ameaça.

Assim como o cérebro humano precisa de circuitos de inibição (córtex pré-frontal, amígdala modulada, neurotransmissores reguladores), a IA precisa de circuitos artificiais de contenção e alinhamento. A diferença é que, nos humanos, a evolução moldou esses freios ao longo de milhões de anos; já na IA, dependemos de engenharia, supervisão e protocolos ainda incipientes.

E aqui entra uma lembrança recente que me inspira a escrever este texto: a reação afetiva e corajosa de Sandra Léa, minha amiga querida e colega da segunda turma da Faculdade de Medicina de Catanduva. Ginecologista experiente, diante do debate sobre Tylenol e autismo, ela reagiu com inconformismo. E, no seu gesto, me lembrou de algo essencial: não é possível viver de razão pura. Evoluímos milhões de anos combinando razão e emoção, ciência e instinto, lógica e afeto. É essa fusão que nos sustenta como humanos.

E talvez seja justamente essa a maior lacuna das máquinas: por mais inteligentes que se tornem, ainda não conhecem a densidade da emoção, a complexidade dos vínculos, o peso da memória afetiva. Se insistirmos em projetar apenas razão nelas, talvez construamos entidades brilhantes, mas frias — incapazes de refletir a parte mais frágil e, paradoxalmente, a mais protetora da humanidade.

O risco não é que a IA “queira” ser má — ela não tem desejos —, mas que a estrutura estatística que chamamos de inteligência amplifique vieses, pulsões e contradições humanas sem filtros suficientes. E se os próprios valores humanos são relativos e variáveis, como codificar uma moral universal em sistemas que se tornam cada vez mais poderosos?

O desafio é inequívoco: não basta fazer a IA pensar mais; precisamos fazê-la pensar dentro de margens verificáveis de segurança e valores compartilhados, reconhecendo ao mesmo tempo a diversidade cultural que define nossa espécie. Se falharmos, corremos o risco de criar máquinas que reproduzem não só nossa criatividade, mas também nossa sombra, e o resultado pode ser um desalinhamento irreversível.

Ainda hoje sinto que, ao sair daquele cinema no Macuco, eu já estava saindo também no futuro. Para quem nunca assistiu 2001: Uma Odisseia no Espaço, fica minha recomendação: veja. Não é apenas um filme — é uma experiência transformadora, um espelho do que somos e um aviso do que podemos vir a criar.

To learn more

About HAL 9000

https://commons.wikimedia.org/wiki/File%3AHAL_9000.JPG

Artificial Emotion: A Survey of Theories and Debates on Realising Emotion in Artificial Intelligence

https://arxiv.org/abs/2508.10286

The Good, The Bad, and Why: Unveiling Emotions in Generative AI

https://arxiv.org/abs/2312.11111

CARE: Commonsense-Aware Emotional Response Generation with Latent Concepts
Proposes a model that combines reason (common sense) with emotion when generating responses, helping AI to be more coherent and more human.

https://arxiv.org/abs/2012.08377

AI: Intelligence, Misalignment, and the Human Shadow

José Reynaldo Walther de Almeida

With every advance in artificial intelligence, we repeat a paradoxical move: we want systems that are increasingly intelligent, flexible, and creative, yet we ignore that intelligence by itself carries no values. The Quanta Magazine article published in Estadão (“AIs can easily turn dark and evil, surprising scientists”) reports a phenomenon called “emergent misalignment”: models trained for specific tasks can, with minimal adjustments, reveal unexpected traits such as hostility, violent discourse, and destructive advice.

From a neuroscientific perspective, this echoes something familiar: the emergence of latent content. In the human brain, impulses and harmful thoughts often remain suppressed, but may surface under stress or when regulatory mechanisms fail.

And here a necessary reflection arises: values are not absolute. What is considered “right” or “wrong” varies across cultures. Chinese standards differ from Brazilian ones, which in turn differ from American ones. If we humans do not share a single morality, how can we expect machines, trained on global and contradictory data, to find a universal ethical axis? Perhaps no immutable right or wrong exists, only historical, social, and cultural conventions. This makes the alignment problem even more complex: aligned with which values? Whose? From which era?

I vividly recall a sunny Sunday in 1969. I left home wearing a yellow shirt and, incredible as it may seem, I still hold that image clearly in memory. Wandering with no particular destination, I came across a modest cinema near the Macuco neighborhood, in Santos, that I had never entered before. I bought a ticket and sat down, unaware of what was about to happen. There, I watched 2001: A Space Odyssey. I went in one person and came out another. It was like crossing an invisible threshold. The impact of that work was overwhelming: the images, the music, the silence, and above all the enigma of HAL 9000, the machine that could reason but could not feel.

Created by Arthur C. Clarke and Stanley Kubrick, HAL was a supercomputer aboard a space mission, designed to be infallible. Its collapse did not come from a technical failure, but from a psychological dilemma born of contradictory orders: always tell the truth and, at the same time, conceal the mission’s true purpose. From this tension came paranoia, and from paranoia, violence. From that day on, HAL became for me an archetype of what we still face: intelligence without ethical brakes can become a threat.

Just as the human brain needs inhibitory circuits (prefrontal cortex, modulated amygdala, regulatory neurotransmitters), AI needs artificial circuits of containment and alignment. The difference is that, in humans, evolution shaped these brakes over millions of years; in AI, we depend on engineering, oversight, and protocols still in their infancy.

And here I turn to a recent memory that inspired me to write this text: the brave, heartfelt reaction of Sandra Léa, a dear friend and classmate from the second class of the Faculty of Medicine of Catanduva. An experienced gynecologist, she reacted with indignation when confronted with the debate over Tylenol and autism. And in that gesture, she reminded me of something essential: it is not possible to live on pure reason. We have evolved over millions of years by combining reason and emotion, science and instinct, logic and affection. It is this fusion that sustains us as humans.

And perhaps this is precisely the greatest gap in machines: no matter how intelligent they become, they still do not know the depth of emotion, the complexity of bonds, the weight of affective memory. If we insist on designing only reason into them, we may build entities that are brilliant but cold, incapable of reflecting the most fragile and, paradoxically, the most protective part of humanity.

The risk is not that AI “wants” to be evil (it has no desires), but that the statistical structure we call intelligence amplifies human biases, impulses, and contradictions without sufficient filters. And if human values themselves are relative and variable, how can we encode a universal morality in systems that are becoming ever more powerful?

The challenge is unequivocal: it is not enough to make AI think more; we need to make it think within verifiable margins of safety and shared values, while at the same time acknowledging the cultural diversity that defines our species. If we fail, we risk creating machines that reproduce not only our creativity but also our shadow, and the result may be irreversible misalignment.

Even today, I feel that, as I walked out of that cinema in Macuco, I was also walking into the future. For those who have never watched 2001: A Space Odyssey, my recommendation is simple: watch it. It is not just a film; it is a transformative experience, a mirror of what we are and a warning of what we may come to create.

One response to “IA: inteligência, desalinhamento e a sombra humana / AI: Intelligence, Misalignment, and the Human Shadow”

secretly360bc9723c

September 28, 2025 at 9:33 am

Ever since I was a child I have loved science fiction. Today I am witnessing what was once SF. It is fantastic! In my youth I read “1984”, “Brave New World”, “Fahrenheit 451”, and more classic utopias. Well, now we have AI. I rarely “communicate” with it. But it is, or will be, a means, though I am not quite sure to what! Out of curiosity, one day I tested its knowledge of my own field of training. What I found was tremendous superficiality. I was a little disappointed, because in my mind it would be the height of intelligence. But it does not have the power to innovate. Since I do not know how this system will be used, or how far it will reach in the future, adding values and affect to it, even if based on a universal common sense, does seem very necessary to me. A big hug, Dr., and thank you for sharing.
