Non-AIs talking about AIs

really stumped GPT on this one

IMAGE(https://i.imgur.com/PQfhXSX.png)

I quite enjoyed this. It's a nice non-technical look at whether GPTs understand stuff, and more generally at what the question means and how we can go about answering it.

I'm reading the referenced article for myself. It's very long. Some of it is very dense. Relevant disclosure: I work at Microsoft and I work with AI.

This won’t let me embed so linking instead
Bill Gates on AI and the rapidly evolving future of computing
https://youtu.be/bHb_eG46v2c

It's very on-point that a lesswrong article on the Waluigi Effect goes on for ~10 pages, introduces several definitions and notations, quotes Derrida and attempts to define intelligence, all before mentioning the Waluigi Effect!

Anything that quotes Derrida seriously worries me deeply.

They are 2+ hours each. I need an LLM to generate a TL;DW from the transcripts. Anyone made an AI that is good at generating extractive TL;DWs directly from video? Bing Chat didn’t do a very good job and was likely just sampling quotes from web search. Maybe just doing an extractive summary of transcripts then snipping the clips matching those extractions would work well?

Sam Altman. Founder of OpenAI and AI optimist.

Eliezer Yudkowsky. A voice in the discussion around AI alignment.

Robear wrote:

Anything that quotes Derrida seriously worries me deeply.

Is this a fair assessment from Bing Chat? I prompted it to give a wide assessment and I’d say the results lean towards a positive impact of his work even when I specifically asked for criticism of Derrida. I’m still learning the best way to prompt Bing Chat to get what I really want.

Jacques Derrida (1930–2004) was the founder of “deconstruction,” a way of criticizing not only both literary and philosophical texts but also political institutions¹. Deconstruction is a form of philosophical and literary analysis, derived mainly from work begun in the 1960s by Jacques Derrida, that questions the fundamental conceptual distinctions, or “oppositions,” in Western philosophy through a close examination of the language and logic of philosophical and literary texts⁵.

Derrida's work has been influential in literary theory, philosophy, sociology, anthropology, psychology, feminist theory, cultural studies, and political theory². His popularity indicates the wide-ranging influence of his thought².

Source: Conversation with Bing, 4/1/2023(1) Jacques Derrida - Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/entries/d... Accessed 4/1/2023.
(2) Deconstruction | Definition, Philosophy, Theory, Examples, & Facts .... https://www.britannica.com/topic/dec... Accessed 4/1/2023.
(3) Opinion | What Derrida Really Meant - The New York Times. https://www.nytimes.com/2004/10/14/o... Accessed 4/1/2023.
(4) Derrida, Jacques | Internet Encyclopedia of Philosophy. https://iep.utm.edu/jacques-derrida/ Accessed 4/1/2023.
(5) Jacques Derrida - Wikipedia. https://en.wikipedia.org/wiki/Jacque... Accessed 4/1/2023.
(6) Deconstruction - New World Encyclopedia. https://www.newworldencyclopedia.org... Accessed 4/1/2023.
(7) Deconstruction - Wikipedia. https://en.wikipedia.org/wiki/Decons... Accessed 4/1/2023.

Derrida's ideas have been both influential and controversial. Some critics consider Speech and Phenomena (1967) to be his most important work¹. Others cite Of Grammatology (1967), Writing and Difference (1967), and Margins of Philosophy (1972)¹. In the same vein, other critics have portrayed Derrida as an antirational and nihilistic opponent of “serious” philosophical thinking³. Despite such criticism, Derrida’s ideas remain a powerful force in philosophy and myriad other fields³.

Source: Conversation with Bing, 4/1/2023(1) Jacques Derrida - Wikipedia. https://en.wikipedia.org/wiki/Jacque... Accessed 4/1/2023.
(2) Jacques Derrida - Criticism | Britannica. https://www.britannica.com/biography... Accessed 4/1/2023.
(3) Jacques Derrida | Biography, Books, & Facts | Britannica. https://www.britannica.com/biography... Accessed 4/1/2023.
(4) Jacques Derrida: Cosmopolitan Critic | SpringerLink. https://link.springer.com/referencew... Accessed 4/1/2023.
(5) Is there any good criticism of Derrida? - Quora. https://www.quora.com/Is-there-any-g... Accessed 4/1/2023.

Here’s what ChatGPT 3.5 came up with:

Jacques Derrida was a French philosopher and literary critic who was influential in the development of a critical theory known as deconstruction. He was born on July 15, 1930, in El Biar, Algeria, and died on October 9, 2004, in Paris, France.

Derrida's work is known for its complexity and challenging nature, particularly his emphasis on the idea that meaning is not fixed and stable but is rather constantly in flux. He argued that language is inherently unstable and that there are always multiple interpretations of any given text.

Derrida's work has been both influential and controversial in the academic world. He has been praised for his innovative approach to literary theory and his contributions to the development of critical theory in general. However, his work has also been criticized for its obscurity and for what some see as its lack of practical relevance.

My own work heavily reinforces this statement: “He argued that language is inherently unstable and that there are always multiple interpretations of any given text”. The changing meaning of language and its ability to create new and complex harms is a domain I have a deep professional understanding of.

Note that my background is an undergraduate major in theoretical linguistics and cognitive science, with 3 years of original source study in the philosophies of science, language and mathematics; logic; and epistemology. My linguistic theory, being done in the early-mid 80's, was Chomskian, and my cognitive science was Functionalist. In my current outlook, I am a Peircean Pragmatist. I am pro-scientific and I believe in objective truths. So it's possible that we come from different backgrounds; I mean no offense.

Basically, Bing is noting that he was a popular philosopher. It's not saying that he was *right* or correct in his reasoning, just that he created a field. Which he did.

And all he had to do to get there was to ignore, even try to negate, several hundred years of epistemology, logic and philosophy of language. I put him up there with Ayn Rand. Derrida has led to the post-modern idea that language always means *anything the listener wants it to mean*, that everything is misunderstood, that every text "undermines itself", that there is literally no connection between words and meaning. Even authorial intent can be no guide to meaning, and since every individual is different, there are no fixed meanings to any written work. (By extension, even Derrida's work cannot be trusted.)

The problem is, he never showed his analytical work supporting these ideas, he constantly confused the philosophical and logical constructs he referred to (famously, types and tokens, sentences and utterances, but others too), and retreated into platitudes when challenged - "Of course, what I mean was just that texts *can be* misunderstood", which... Yeah, okay. Searle famously argued that because of his ignorance of the philosophy of language and related topics like predicate calculus, which he attempted to combine with his knowledge of literary criticism, he simply misunderstood topics as being unresolved, when in fact there were perfectly reasonable explanations to hand, as I noted above. (Searle's famous piece is called "Literary Theory and its Discontents", from 1974.)

(To go into it a bit, Peirce was the first to distinguish "types" from "tokens" in text and speech. A "type" is an abstract class - In the sentence "A bicycle has two tires", the words "bicycles" and "tires" are types, not referring to any particular wheels or tires. But in the sentence "Pat's red bicycle has a flat rear tire", both of these same words are now specific tokens, referring to things that exist, but also being unique in that they are different from all other uses of "bicycle" and "tire" in the same text or speech sample. So tokens signify types as words representing particular objects, and types are words that denote the *class* of those objects, but both are the same words (word forms, I guess) in text and speech. If this distinction is made, then much of Derrida's work is simply wrong, in that misunderstandings are *easily* explained in terms of formal mistakes of reference. In that sense, linguistic theory had already solved the problems Derrida wrestled with in this area back in the late 19th and early 20th centuries, and moved on. He simply never studied deeply in the field. So while Wittgenstein was nailing down the primacy of Analytical philosophy in linguistics, Sassaure and later Derrida were stuck in a rut that puzzled philosophers 80 years earlier... And in their interpretations, linguists pushed them aside and moved forward, while literary theorists and many other related fields were happy to have their fields "deconstructed" and set back decades in their relevance to the real world. Little changes from Marx to Saussure to Derrida, for example, while linguistics has progressed into an actual science in the same time period.

Like Rand, he had little interest in or knowledge of earlier work in the fields he professed expertise in. As he drew heavily from Linguistic Structualism in his work, that theory was literally crashing and burning academically in the field of linguistics. And that led to just plain bad philosophy, but it was complicated enough that people took him seriously (and on occasion, he did have some useful insights, I suppose). To my mind, the main thing he successfully deconstructed was the field of literary critical theory itself...

In the end, the biggest effect Derrida has had in philosophy was to discredit Continental and specifically French philosophy in the eyes of other Western lineages of philosophy, in a way that is perhaps undeserved. But it has been instrumental in the rise of anti-scientific sentiment in the West and for that, I consider it to be nearly as damaging as Rand's work.

Robear wrote:

Note that my background is an undergraduate major in theoretical linguistics and cognitive science, with 3 years of original source study in the philosophies of science, language and mathematics; logic; and epistemology. My linguistic theory, being done in the early-mid 80's, was Chomskian, and my cognitive science was Functionalist. In my current outlook, I am a Peircean Pragmatist. I am pro-scientific and I believe in objective truths. So it's possible that we come from different backgrounds; I mean no offense.

Honestly, thank you for this extensive reply. Now I have a lot more things to read about. I have not done extensive study of those subjects.

Spoiler:

Here’s a bit of the journey that led me to spending a lot of time focused on “the changing meaning of language and its ability to create new and complex harms”.

My academic background is primarily in computer science. My small amount of published work is mostly about improving the efficiency of processing and querying large data sources. Some of my work has been integrated into open source and commercial data storage/processing systems and is indirectly used by a very large amount of users simply because some very popular software products use those systems. It’s just a tiny part of the whole and mostly about performance optimization so I’m not trying to argue it’s massively significant. Just widely used.

After I left academics I worked for tech startups. The last startup I was at established itself as a leader in understanding harm in text content (other stuff too but primarily harm in text). We were good enough that Microsoft acquired our company and now I work at Microsoft.

I am a software engineer and my job is designing and building software systems. I am not a data scientist or an expert in the nuance of language but I am lucky enough to work with some amazing people who specialize in those things. I’ve had to learn some about those fields over the years to succeed at my job but it’s easy to be humble about my own knowledge in comparison to the specialists I work with.

How does this apply to the changing meaning of language? I am still in the business of understanding harm. People use language in different ways over time to create harm for other people. Often this results in new meaning for words or combinations of words. Sometimes in complex ways. Sometimes based on context. This happens across all languages and cultures. I’m on the very applied side of this space as I’m building production systems that produce customer value through tackling these problems.

Huh, as a light counterpoint I thought the bit about Derrida was the most interesting/important part of that LW article. The point it was making about "outside text" and AI prompts I mean - irrespective of whether that point actually has anything to do with Derrida's views. I think a lot of the fundamental magic in CS comes from the fact that there's no bright line between program instructions and program data (and hence we get the incompleteness theorem, etc), and the the fact about there's no outside text in AI prompts seems analogous to that.

That said, it's hard to look closely at anything related to lesswrong/Yudkowski without getting the creepy feeling that one is looking at a cult.

I’m curious what the impact of the new System Message feature of GPT-4 is. OpenAI describes it as a more reliable way for users to steer the model.

The more I find out about Yudkowski, the more I want to sort out where he's coming from. I'm not sure at all, at this stage, whether he's reliable or not. However, I remember reading LessWrong back in the day and being impressed by some of it. I'm reserving judgement.

If it's a cult, it's a dead cult lol.

Fenomas wrote:

I think a lot of the fundamental magic in CS comes from the fact that there's no bright line between program instructions and program data (and hence we get the incompleteness theorem, etc), and the the fact about there's no outside text in AI prompts seems analogous to that.

I am familiar with Goedel's Incompleteness Theorem (and Tarski's Undefinability, etc) as they pertain to formal systems (although it's been decades since I studied them, so I bow to your better understanding). But I don't see where program data and program instructions coincide. They seem by the very fact of their separate natures to be completely different. I can't use a pdf as an algorithm instantiated in code, and while I can use code as data (for example, in a store of libraries), those libraries are then not used for their purpose.

Where do they cross in the sense you seem to intend? Truly interested but also highly skeptical that I understand the point you are making here, because otherwise, we could simply "run data" to get results, dispensing with separate code. (We could hard-code data into a program, of course, so that side of things works, but that turns the program into a static one-use model that returns the same results every time... Which is not how we think of and use computer programs today, mostly...).

To me the bright line is "data is not code". Is that wrong?

For a long time we’ve been writing code that takes code as input and perhaps generates new code. Similarly we could write code that interprets some data as instructions so some data could also have been considered interpretable code.

Now LLMs blur things a bit more since any data may be interpreted as intent/instructions to follow and result in complex system interactions that are similar to running a program to various amounts of success. When the LLM is plugged into external systems like compilers/interpreters/APIs the possibilities expand. Some current applications of LLMs as products combine existing data (a document perhaps) and intent (as new input data) and execute instructions that never existed explicitly in the program.

Regarding your example of a PDF: A PDF is markup instructions and data for an interpreter. In a sense it’s already code. You can use a PDF’s data as instruction to an LLM since it represents an instance of a class of object and when combined with other knowledge in the LLM can be used to derive new value, such as generating a new similar PDF with some adaptation. The input PDF is an instance but a recipe can be derived from an instance and used to generate new/modified instances.

How does the Incompleteness Theorem apply? I’ve queued up some reading.

fenomas wrote:

That said, it's hard to look closely at anything related to lesswrong/Yudkowski without getting the creepy feeling that one is looking at a cult.

They characterize a lot of their stuff as using scientific methods to determine what is really true. I haven’t read much yet but I’ve bookmarked the sequences highlights page for later perusal:
https://www.lesswrong.com/highlights

If only LLMs were better at generating good TL;DRs, without losing a ton of critical info, I’d burn through my reading backlog faster.

Felnomas cited the Incompleteness theorem as seemingly falling out of a relationship between data and code? I'm curious too.

Pandasuit wrote:

For a long time we’ve been writing code that takes code as input and perhaps generates new code. Similarly we could write code that interprets some data as instructions so some data could also have been considered interpretable code.

Now LLMs blur things a bit more since any data may be interpreted as intent/instructions to follow and result in complex system interactions that are similar to running a program to various amounts of success. When the LLM is plugged into external systems like compilers/interpreters/APIs the possibilities expand. Some current applications of LLMs as products combine existing data (a document perhaps) and intent (as new input data) and execute instructions that never existed explicitly in the program.

That's well out of my area of expertise, but it could be what Fenomas meant, in which case, color me ignorant lol.

Yeah, not sure at all about "The Sequences", those are new to me.

To a general purpose computer, code and data are the exact same thing: a number stored in a memory, addressed by a number from 0 - (2^bitness-1). To a human, some of those numbers are instructions, some are input data, some are data built into the code being executed, but those are our conceptions laid onto the numbers stored in the memory.

Each clock cycle the cpu looks at its program pointer, loads the number stored in that location, converts that number to an instruction (imagine 1 means add and 2 means subtract, etc.) and executes it. Some of those instructions store data back to the memory. So a program could load, modify and store its own instructions, then execute them later. Erlang actually supports hot-swapping code on the fly without restarting the program, which is pretty cool.

Point being, code and data are the same thing in a cpu, it's just a matter of what the instructions tell the cpu to do with other numbers stored in places.

Modern cpus have some security protection mechanisms to prevent execution of memory areas that have been labeled as data only, but not everything supports those mechanisms, and they are probably hackable anyway.

I have always found the Incompleteness theories to be interesting, but not particularly useful. But one of the things that it says is that it is impossible for a program to be able to sufficiently analyze all possible programs and determine if they will terminate or run forever. The program reads in another program's code as data. It's all just numbers looking at other numbers.

It’s numbers all the way down

Thanks, Mix. I've been working at the high level of abstraction for so long, I needed that mental reset.

Sorry, my comment about outside text and data/code and incompleteness didn't have enough words.

On data/code, as Mix and Panda explained the difference is interpretation - an EXE file is "data" when you view it in a text editor and "code" when you run it. We tend to think of programs operating on data, but the fundamental magic of CS comes from the fact that any program can itself be considered data for some other program to operate on. E.g. if you look at a counter-proof of the Halting Problem, it boils down to saying "give me a program that solves the problem, and I'll give you a program that operates on yours and leads to a contradiction".

Then the connection to the incompleteness theorems is that they rely on the same kind of inversion, but for arithmetic. We normally think of numbers as data, and statements like "∀x (x=x)" as being instructions that operate on numbers. But Gödel came along and showed a way of modeling any formula as a number, for other formulas to operate on - which paved the way for the same kind of self-referential magic as running programs on programs.

Then the third side of the coin is that AI prompts have the same dynamic. We would like to imagine an AI where our prompt "You are a helpful assistant, answer the following question:" is code, and the text afterward is data for the code to operate on, and never the twain shall meet. But this is a fiction - if the data includes malicious instructions, any agent smart enough to understand those instructions is also smart enough to follow them (thereby treating its data as code).

Hope that makes more sense!

pandasuit wrote:

I’m curious what the impact of the new System Message feature of GPT-4 is. OpenAI describes it as a more reliable way for users to steer the model.

I don't think anyone knows for sure, since there's so little info available about GPT-4. But most efforts in this area apparently boil down to putting a special system token between the prompt and the user input, and doing a bunch of reinforcement training to persuade the AI it's really really supposed to follow instructions before the system token, even if later instructions say otherwise.

But AFAIK this can basically be considered an engineering technique - like the way modern OSes try to separate code from data, or the way a modern compiler might be able to solve the halting problem for certain inputs. That is, system messages probably make the AI less likely to misbehave, but it's not clear that they could ever hope to solve the problem entirely.

Thanks fenomas, as I noted, I've spent so long dealing with apps and databases and system-level stuff that I had forgotten the nuances of digging in deeper. I do appreciate the reminder.

Yeah, I'm no longer comfortable with Yudkowsky's approach to things. Sorry for the derail.

I see this hand-wringing over AI and I can't help but compare it to the doomsayers of Y2K who said that all of human civilization would collapse at the stroke of midnight. And no, I'm not exaggerating, there were plenty of people who believed that in the months leading up to Y2K.

Quintin_Stone wrote:

I see this hand-wringing over AI and I can't help but compare it to the doomsayers of Y2K who said that all of human civilization would collapse at the stroke of midnight. And no, I'm not exaggerating, there were plenty of people who believed that in the months leading up to Y2K.

Agree to disagree. After watching this video, you'll also be convinced:

Truly terrifying!

Thoughts?

Schillace Laws of Semantic AI
https://learn.microsoft.com/en-us/se...

A few points from that doc

- Don’t write code if the model can do it; the model will get better, but the code won’t.
- Trade leverage for precision; use interaction to mitigate.
- Code is for syntax and process; models are for semantics and intent.
- The system will be as brittle as its most brittle part.
- Uncertainty is an exception throw.
- Hard for you is hard for the model.
Alan Turing wrote:

I believe that in about fifty years’ time it will be possible to programme computers, with a storage capacity of about 109, to make them play the imitation game so well that an average interrogator will not have more than 70 per cent, chance of making the right identification after five minutes of questioning.

It's interesting to note that Turing said that 73 years ago, and GPT3 has roughly 1011 parameters. Not too shabby eh?