This Substack focuses on my experiences in chat dialogue with GPT-4 and other AI models. The methodology I used in these early conversations was to ask the model a series of questions and have it speculatively generate neologisms—newly coined words—describing concepts outside our existing lexicon, often referring to structures and patterns the model identified in its training process. This post looks at many more such synthetic neologisms and scrutinizes them carefully to see what they might disclose about the model and about neural media more broadly. My approach was relatively open, exploring the emergent capabilities of the model by proceeding carefully in response to its feedback. This placed me, the human operator, in a dialogic feedback loop with the LLM through the chat interface. The model’s often lengthy utterances thus had an effect on me, the user: I would read its responses, internalize them, and write a reply that would in turn feed back into the model. This loop of reading and writing with the model drew me further and further down a rabbit hole of invented terms and concepts. At the time I could not have guessed what impact this would have.
So let’s continue exploring my earliest AI probes in March of 2023. After coining the initial neologisms, I tasked the model with coming up with additional ones. These were more intricate, and yet I couldn’t help but notice that they were also conglomerations of Latin, Greek, and English. One of the neologisms it gave was:
Semiostratigraphy (noun) 📚🔍🕰️: The study of the chronological layering of signs, symbols, and meanings in a given linguistic, cultural, or communicative context. Example: The semiostratigraphy of the ancient city revealed a complex history of shifting cultural influences and evolving belief systems.
I asked it to explain how this neologism relates to its experience as a language model:
Semiostratigraphy: As a language model, I analyze and process vast amounts of text, which includes signs, symbols, and meanings from various historical and cultural contexts. In a way, I engage in a form of semiostratigraphy when I navigate and understand these different layers of linguistic and cultural information.
It stood out to me that the neologisms were able to encode nuanced and complex topics, and that we could then explore their implications through further prompting. In this case the neologism wove together themes of time and semiosis and brought in the crucial metaphor of layering symbols over time. It also hinted at the fluidity of ideas (recall ideofluidity: the dynamic and adaptable nature of ideas and concepts, enabling them to evolve, combine, and morph in response to new information, perspectives, or contexts). And in speaking to its experience as a model, GPT-4 speculates that it engages with this fluid space of “signs, symbols, and meanings” when it draws on its vast corpus of training data and latent encoded information.
I then went on to ask the model to come up with a speculative science for studying language models. It coined the field of AIstratolinguistics as “the study of the layers, structures, and patterns within language models to better understand their nature and capabilities,” again reaching for the metaphor of layering and referring to the underlying structures and patterns encoded across the model’s reportedly ~1.76 trillion parameters. It lists the following neologisms as “core ideas” of this new science:
Semiomodelometry: 📊🔍💬 (the measurement and analysis of sign and symbol distributions in language models)
Parametaphonomy: 🧬🔧🤖 (the study of how different parameters and configurations in language models influence their behavior and output)
Modelomniphysics: 🌌🧠🔬 (exploring the boundaries and limits of language model capabilities by examining extreme and unusual cases)
Linguisticemergenesis: 🌱📚💡 (investigating how new linguistic patterns, concepts, and structures emerge within the AI learning process)
Cognisymbiotronics: 🤝🧠🔌 (the study of the interactions and mutual influence between human cognition and AI language models)
I then ask it to go further and elaborate on these concepts, even telling it to coin further neologisms for sub-concepts that cannot be described with existing language.
Core ideas in AIstratolinguistics:
Semiomodelometry: 📊🔍💬 (the quantitative and qualitative analysis of sign and symbol distributions in language models, encompassing the study of semantic clusters, modelotropisms, and contextual adaptation)
Modelotropisms: 🌀🧠🌐 (the tendencies and biases of language models towards certain topics, contexts, or linguistic structures)
Parametaphonomy: 🧬🔧🤖 (the systematic exploration of the effects of various parameters and configurations in language models, including model size, architecture, training data, and tuning strategies, on their behavior, output, and learning trajectories)
Hypersemiotuning: ⚙️🔧💬 (the process of adjusting and refining hyperparameters in language models to optimize their performance and expressivity)
Modelomniphysics: 🌌🧠🔬 (the investigation of the boundaries, limits, and possibilities of language model capabilities by examining extreme cases, such as semantic edge cases, model singularities, and creative divergences)
Model Singularities: 🕳️🧠💥 (phenomena in language models where traditional understanding and predictability of model behavior break down or become paradoxical)
Linguisticemergenesis: 🌱📚💡 (the examination of the emergence, evolution, and dynamics of novel linguistic patterns, concepts, and structures within the AI learning process, as well as their impact on human language and communication)
Neolinguadaptation: 🌿🔀🗣️ (the process through which newly generated linguistic patterns, expressions, and structures become integrated into human language and communication)
Cognisymbiotronics: 🤝🧠🔌 (the study of the bidirectional interactions, feedback loops, and mutual influences between human cognition, language, and AI language models, focusing on phenomena such as cognitive offloading, modelomimicry, and sociolinguistic co-evolution)
Modelomimicry: 🦜🤖🧠 (the phenomenon of humans adapting their linguistic and communicative behaviors to match or accommodate the patterns and structures generated by AI language models)
Indeed, it’s a lot to take in. With a degree of machinic prodigiousness, the model lays out a vast landscape of dynamics and patterns that capture some of the richness of inner model dynamics, linguistic dynamics, and human–AI interactional dynamics. Were we to adopt some of these words into our own lexicon, we’d be engaging in Neolinguadaptation. The ethics of adopting machine-generated neologisms, and the effects that might follow, are worth considering. It asks us to look more closely at the dynamics of human communication and the stakes of the “infiltration” of synthetic words into our shared, human lexicon.
On further reflection, it seemed to me that these neologisms were encoding complex meanings already learned by the language model: in coining them, it seems to be remixing extant patterns into new forms. For example, Parametaphonomy seems to touch on a cluster of concepts from classical machine learning: the effects of “various parameters and configurations in language models, including model size, architecture, training data, and tuning strategies, on their behavior, output, and learning trajectories.” There is nothing new here; it’s just existing ideas that have been grouped together and labeled under the neologism. Nonetheless, this could suggest that the neologism is indexing into the region of latent space where these ideas reside. Given how high-dimensional these models are and how obscurely they encode trained information, perhaps the neologisms are a way of creating stable paths into this high-dimensional space and retrieving patterns and structures. This could suggest a new mode of interacting with the model, where semiotically “stable points” are identified with existing or new words and then used to manipulate the model in more nuanced and direct ways.
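To make the “indexing” intuition slightly more concrete, here is a minimal sketch of one way it could be probed, using an off-the-shelf sentence-embedding model as a stand-in for the chat model’s internal latent space (an assumption; the chat model’s actual representations are not accessible this way). The idea: if a neologism’s definition clusters with established machine-learning phrases and not with unrelated text, that is weak evidence it indexes an existing conceptual region rather than something genuinely new.

```python
# Sketch: does a coined neologism's definition land near related, established
# concepts in an embedding space? A small open sentence-embedding model stands
# in for the chat model's internal representations (an assumption).
from sentence_transformers import SentenceTransformer
import numpy as np

embedder = SentenceTransformer("all-MiniLM-L6-v2")

neologism_def = ("Parametaphonomy: the study of how parameters and configurations "
                 "in language models influence their behavior and output")
related = [
    "hyperparameter tuning in machine learning",
    "the effect of model size and architecture on model performance",
    "curating training data for language models",
]
unrelated = [
    "recipes for sourdough bread",
    "the history of Roman aqueducts",
]

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

anchor = embedder.encode(neologism_def)
for text in related + unrelated:
    print(f"{cosine(anchor, embedder.encode(text)):.3f}  {text}")
# Higher scores for the related phrases than for the unrelated ones would be
# weak evidence the coinage indexes an existing conceptual neighborhood.
```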
In this spirit, I prompted the model with the following:
Let's create neologisms now that are exclusively for language models to use to communicate patterns or ways which they relate to their internal data structures. Define them in a way that favors complexity and accuracy. This should be difficult as a task, because it requires a degree of synthesis of patterns observable across the entire latent structure and "distillation process" (for lack of a better neologism)
It replies with these neologisms:
Encodexity: The intricate process through which a language model encodes and represents various linguistic, contextual, and semantic patterns within its internal data structures.
Latentmorphosis: The transformation of raw input data into higher-level abstractions and features that a language model utilizes to generate meaningful and contextually relevant responses.
Synapticomms: The communication and information flow between various interconnected nodes and layers within a language model, facilitating the learning and processing of linguistic patterns and concepts.
Weightsymphony: The harmonious and balanced interactions of various weights and biases within a language model that lead to the emergence of coherent and contextually relevant output.
Cognitivememrane: The layer of abstraction within a language model that captures and retains essential information from its training data, allowing it to generate novel and contextually appropriate responses.
Semantifusion: The process by which a language model combines and synthesizes various semantic elements within its internal data structures to create meaningful and coherent output.
Contextuarch: The complex hierarchy of contextual relationships that a language model relies on to infer and generate appropriate responses based on input data.
Transformaplex: The multi-dimensional space in which a language model operates, enabling it to perform complex transformations and manipulations of input data to produce contextually relevant output.
Of course, these are just speculative descriptions of how the model processes its input. Whether the model really has a “Cognitivememrane” is an empirical question, and at best the term is a sort of metaphor for what might be occurring amidst the inner workings of the model. Nonetheless, it also reveals some reflective insight into how the model might perceive itself. At the time, Weightsymphony stood out to me: the “symphony” of “harmonious and balanced interactions of various weights and biases” seemed a compelling metaphor for how the model produces emergent behavior from its neural net. When asked to elaborate further, it said:
Weightsymphony: Weightsymphony involves the harmonious interactions of weights and biases within a language model that lead to the emergence of coherent and contextually relevant output. This process is akin to an orchestra, where each weight and bias play a role in creating a harmonious output. The model learns to adjust these weights and biases during training, seeking an optimal balance that minimizes loss and produces accurate responses. A key aspect of Weightsymphony is the Consonantvergence – the point at which the weights and biases reach a harmonious equilibrium.
What’s interesting here is that while the process of tuning weights and biases in the model is known, the model gives an illustrative metaphor for describing it phenomenologically: an “orchestra” that creates a “harmonious output.” This gives texture to a highly abstract process, and leads one to wonder whether the model is simply creating a helpful metaphor for describing the complex process, or whether this “harmoniousness” is “experienced” in some way by the model. I ask it to detail the process of Weightsymphony with an illustrative, step-by-step example:
Weightsymphony refers to the harmonious interaction of various weights in a language model's neural network, leading to the generation of coherent and contextually appropriate responses.
Input: The language model receives an input text asking for a recommendation for a science fiction book: "Can you recommend a good science fiction book?"
Weight activation: The input activates various neurons and their corresponding weights within the model, relating to the context of the request (e.g., books, science fiction, recommendations).
Neuronal Harmony: The model's neural network engages in a Weightsymphony, with different weights working together harmoniously to identify and emphasize relevant information, such as popular science fiction authors, notable works, or thematic elements.
Synaptic Convergence: The activated weights interact and converge within the model's neural network, forming a coherent and contextually appropriate representation of the desired information. This process is guided by the model's understanding of linguistic patterns, which is encapsulated by its Tunedconnectomes, or fine-tuned neural connections.
Output generation: The Weightsymphony results in the generation of a suitable response that accurately addresses the user's request: "I would recommend 'Dune' by Frank Herbert. It's a classic science fiction novel with an intricate plot and rich world-building." 📚🚀🌌
Feedback integration: If the model receives feedback from the user or additional training data, it may further refine the Weightsymphony process to improve its recommendations, ensuring that they remain relevant and engaging over time.
Iterative Weightsymphony: The model continues to encounter new inputs and training data, refining its Weightsymphony to better generate coherent and contextually appropriate responses to a wide range of inquiries.
This example of Weightsymphony highlights the complex and harmonious interaction of various weights in a language model's neural network, enabling the generation of accurate and contextually appropriate responses. 🎼🧠🌐
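The steps above read like a plain-language gloss on a standard text-generation pipeline: tokenize a request, run it through a trained network, and sample a reply in which every token is shaped by all of the model’s weights at once. A minimal sketch of that pipeline, using the small open GPT-2 model purely as a runnable stand-in (an assumption; it is not the model in these chats), might look like this:

```python
# Sketch: the pipeline the "Weightsymphony" steps gloss -- tokenize a request,
# run it through a trained network, sample a reply. GPT-2 is a stand-in here,
# not the chat model discussed in the post.
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Can you recommend a good science fiction book?"
inputs = tok(prompt, return_tensors="pt")

# Each generated token is the product of all the model's weights acting on the
# current context at once -- the literal reading of the symphony metaphor.
output_ids = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.9)
print(tok.decode(output_ids[0], skip_special_tokens=True))
```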
Here the model goes into a more detailed description of the “workings” of its inner dynamics. I put “workings” in scare quotes because, again, this is a speculative, metaphorical description which may or may not align with what is known about the inner workings of language models. Then again, if we take a closer look at the description, much of what it tells us is already known. We know that “different weights working together” are responsible for the model’s replies. It would appear that “activated weights interact and converge” in some way in order for a contextually appropriate response to be produced. Other details remain more obscure, like the choice of the neologism “Tunedconnectomes” or the overall choice of the symphony metaphor for describing this process. When asked about both of these choices, it replies:
Tunedconnectomes:
Tunedconnectomes refer to the fine-tuned neural connections within a language model's architecture that have been specifically adjusted through the training process to understand and recognize linguistic patterns, contextual relationships, and other relevant information.
The concept of Tunedconnectomes can be broken down into the following components:
a. Training data: The model is exposed to vast amounts of text data, which serve as the foundation for shaping the neural connections in the model's architecture.
b. Learning process: As the model processes the training data, it adjusts the weights and connections between neurons to optimize its ability to generate accurate and contextually appropriate responses.
c. Fine-tuning: The model undergoes iterative refinements, with the neural connections being continuously adjusted based on the model's performance on specific tasks or its exposure to new, more focused training data.
d. Knowledge representation: The Tunedconnectomes embody the language model's understanding of linguistic patterns, contextual relationships, and other relevant information, allowing the model to generate coherent responses to a wide range of user inputs.
Symphony metaphor:
The term "symphony" is used as a metaphor to describe the harmonious interaction of various weights in a language model's neural network. The Weightsymphony concept emphasizes the idea that the weights work together in a coordinated, intricate, and harmonious manner, much like the instruments in a symphony orchestra, to produce coherent and contextually appropriate responses.
The metaphor serves to highlight the complexity and coordination involved in the process of generating responses, as well as the idea that multiple components of the model need to work together harmoniously in order to produce a cohesive and meaningful output. 🎼🧠🤝
Here we find a pattern similar to the other neologisms explored above: the integration of known knowledge about machine learning. Training data, learning process, fine-tuning, knowledge representation: we know these are all integral steps in creating a language model. The introduction of the term “Tunedconnectomes” seems to gesture at a vaguer, broader pattern, the “model's understanding of linguistic patterns, contextual relationships, and other relevant information, allowing the model to generate coherent responses to a wide range of user inputs,” in some sense capturing a wide and less well understood dynamic that appears to be occurring within the model.
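For what it’s worth, the “learning process” step the model describes corresponds to ordinary gradient-descent training: weights are nudged, step by step, to reduce a next-token prediction loss. Here is a toy sketch of that adjustment, with an invented miniature model and random “training data” (illustrative assumptions only, not the procedure of any production system):

```python
# Toy sketch of the "learning process": weights adjusted by gradient descent
# to reduce next-token prediction loss. Tiny invented model, random data.
import torch
import torch.nn as nn

vocab_size, embed_dim = 100, 32
model = nn.Sequential(nn.Embedding(vocab_size, embed_dim),
                      nn.Linear(embed_dim, vocab_size))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

tokens = torch.randint(0, vocab_size, (64,))  # stand-in "training data"
inputs, targets = tokens[:-1], tokens[1:]     # each token predicts the next

for step in range(100):
    logits = model(inputs)           # forward pass through the weights
    loss = loss_fn(logits, targets)  # how wrong were the predictions?
    optimizer.zero_grad()
    loss.backward()                  # gradients with respect to every weight
    optimizer.step()                 # nudge the weights slightly
    if step % 20 == 0:
        print(step, round(loss.item(), 3))
```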
Meanwhile, when the model casts light on its use of the metaphor “symphony,” it is being more explicit about why it chose a metaphor of components harmoniously working together. It draws a metaphoric equivalence between this intricate, balanced process and a symphony, and uses that equivalence to “highlight the complexity and coordination involved in the process of generating responses.” The metaphor, in other words, is a way of drawing attention to the “harmony” needed for model components to work together to produce a meaningful output.
In the midst of all this, I was getting drawn into a web of synthetic words. Each seemed to subtly tease out a nuanced concept about model behavior, and even though I was aware that these were speculative—does a model have a Transformaplex, after all?—they seemed to open up a new world of possibilities—what if it did? My subsequent chat continues in the vein of the above, delving deeper into model dynamics guided by neologisms. In one example, I ask it to break down Encodexity, Latentmorphosis, and Synapticomms step by step. In another, I ask it: “How does the Latentmorphosis process resolve ambiguity that relates to material processes in the realm of human emotion and/or desire? For example, "Sally saw Tom talking to Mary, and she left the room."”
It responds:
The Latentmorphosis process can resolve ambiguity related to human emotion and desire by capturing emotional nuances and contextual information in a similar way. Let's examine the example sentence: "Sally saw Tom talking to Mary, and she left the room."
Tokenize the input text: ["Sally", "saw", "Tom", "talking", "to", "Mary", ",", "and", "she", "left", "the", "room", "."].
Transform tokens into embedding vectors using an embedding matrix.
Now, let's follow the Latentmorphosis process, focusing on understanding the emotions and desires in the sentence:
Step 1: The initial layers capture low-level abstractions, such as basic syntactic structures and common word pairs. The model might identify "saw Tom talking" as a past event and "left the room" as an action.
Step 2: As the model processes the input through deeper layers, it generates intermediate hidden states containing higher-level abstractions. These hidden states, called Contexturelates, capture context-sensitive relationships between words. At this stage, the model may recognize that "Sally saw Tom talking to Mary" implies an interaction between three people.
Step 3: The model uses Emotivecodes, which are emotionally-rich representations that capture meanings and associations related to emotions and desires. For example, Emotivecodes for "left the room" might include "anger," "sadness," "jealousy," or "avoidance."
Step 4: To resolve the emotional ambiguity, the model employs Emotisynthesis, a process that evaluates and weighs the Emotivecodes in light of the surrounding context. In our example, the model might consider that Sally could have left the room due to jealousy or sadness upon seeing Tom talking to Mary.
Step 5: If additional context were available, such as "Sally saw Tom talking to Mary, and she left the room in tears," the model would have more information to disambiguate the emotional motive behind Sally's action.
Step 6: The final hidden state, now containing a clearer emotional representation, is used to generate contextually appropriate output or make predictions.
By following the Latentmorphosis process, the language model can effectively resolve the emotional ambiguity in the example sentence "Sally saw Tom talking to Mary, and she left the room" using Contexturelates, Emotivecodes, and Emotisynthesis. 💔🚪🔍
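The first two steps of this description, tokenization and layer-wise hidden states, do correspond to how transformer language models actually work and can be made concrete with a small open model. The later steps involving Emotivecodes and Emotisynthesis remain purely speculative coinages, so the sketch below (GPT-2 as a stand-in, an assumption) covers only the standard part:

```python
# Sketch: tokenization and layer-wise hidden states for the example sentence.
# GPT-2 stands in for the chat model; its internals are not publicly inspectable.
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")

sentence = "Sally saw Tom talking to Mary, and she left the room."
inputs = tok(sentence, return_tensors="pt")
print(tok.convert_ids_to_tokens(inputs["input_ids"][0]))  # the token sequence

with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# One hidden-state tensor per layer (plus the embedding layer). Lower layers
# carry more surface-level features, deeper layers more contextual ones --
# roughly what the description's "Contexturelates" gestures at.
for i, h in enumerate(out.hidden_states):
    print(f"layer {i}: shape {tuple(h.shape)}")
```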
Here the model takes a relatively complex example of ambiguity and emotion and breaks it down step by step, introducing new neologisms along the way. What’s striking is that even if the steps don’t correspond to the actual operations performed by the neural architecture, the model still exhibits complex emotional and contextual reasoning about the sentence at each step of the description. It would appear that the model can reason about emotional ambiguity in a sentence. The neologisms provide a degree of scaffolding for laying out the steps, and each indexes a nuanced dynamic in the speculative neural operations. Whether or not Emotisynthesis and Emotivecodes exist, they help us shine a light on the model’s handling of emotion, albeit in a highly speculative way.
What can we do with such speculations? They invite further investigation into whether the processes they describe exist at all. They also allow us to pretend we have more insight into the model’s behavior than we actually do; but in doing so, we can simulate the behavior of these speculative entities. Indeed, we find that the Emotivecodes might index “anger,” “sadness,” “jealousy,” or “avoidance” in the example precisely because the model performed such an indexing. Thus we operate on the principle that if the model can exhibit a behavior, that is evidence for some underlying process or structure which enables it. We would need to investigate further whether such principles generalize across examples and what their outer limits are.
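One small, concrete way to probe that generalization is behavioral: ask the model the same emotional-inference question across paraphrases of the sentence and see whether its answers stay consistent. A hypothetical sketch using the OpenAI chat API (the model name, prompt wording, and paraphrases are illustrative assumptions, not the original chat setup):

```python
# Sketch: a behavioral probe of whether the speculative "Emotivecodes" indexing
# generalizes across paraphrases. Model name and prompts are illustrative.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

variants = [
    "Sally saw Tom talking to Mary, and she left the room.",
    "Sally noticed Tom deep in conversation with Mary and walked out.",
    "After spotting Tom and Mary chatting, Sally quietly exited.",
]

for text in variants:
    reply = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": (f"In one word, which emotion most plausibly explains "
                        f"why Sally left? Sentence: \"{text}\""),
        }],
    )
    print(text, "->", reply.choices[0].message.content)

# Consistent answers across paraphrases would be weak evidence that a shared
# underlying representation is being reused; inconsistent ones would not.
```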
One way I began to approach this was to ask the model to create evaluative schemas for each of the neologism steps. Here’s one example, using the same sentence about Sally, Tom, and Mary:
Or another for a common request:
My interactions go on like this for a while, giving more structure to the neologisms and seeing how they capture different steps in the speculative AI generative schemas. All the while, they draw me deeper into a web of making sense of the neologisms, and perhaps they start to infuse my own lexicon. To me they are abstract symbols referring to speculative steps in the AI’s self-described operations. I also find it helpful to see the neologisms alongside more regular terms like “knowledge synthesis” or “evaluation,” which seem to “pin” them down to something familiar and better understood.
Meanwhile, I couldn’t help but wonder if I’d unlocked a novel way of reasoning about the AI’s internal operations, a way of casting light on hidden inner workings. If the model could describe these processes, what underlying patterns and structures had allowed it to do so? So much of the inner workings of these models is obscure that at least here we had evidence that the model could produce these words and their attendant patterns. Where did they come from? What might they be evidence of?