I talk a lot in this publication about AI and how I believe this technology will shape our future (for the better).
Today[1] I was pondering the trolley problem, an ethical dilemma formulated in 1967 that has since ended more than one friendship.
There is a runaway trolley barreling down the railway tracks. Ahead, on the tracks, there are five people tied up and unable to move. The trolley is headed straight for them. You are standing some distance off in the train yard, next to a lever. If you pull this lever, the trolley will switch to a different set of tracks.
However, you notice that there is one person on the side track. You have two (and only two) options:
Do nothing, in which case the trolley will kill the five people on the main track.
Pull the lever, diverting the trolley onto the side track where it will kill one person.
Which is the more ethical option? Or, more simply: What is the right thing to do?
What I find amusing is that we are very close to settling the dilemma for good, especially once self-driving cars are on the streets at scale.
I am sure the folks at Waymo are well acquainted with the Three Laws of Robotics:
A robot may not injure a human being or, through inaction, allow a human being to come to harm.
A robot must obey the orders given to it by human beings, except where such orders would conflict with the First Law.
A robot must protect its existence as long as such protection does not conflict with the First or Second Law.
And I hope they also remember that Asimov himself had to add a “zeroth” law to make them work[2]:
A robot may not harm humanity, or, by inaction, allow humanity to come to harm.
I am curious about how a Waymo SUV will decide what’s good for humanity when choosing between braking, with a high likelihood of crashing into another car and killing all the passengers aboard, or swerving and killing the people on the sidewalk.
As of now, it will choose to brake and call home for human intervention, but that won’t work forever: you can’t just stop in the middle of a highway. When that day comes, I would like to know how they resolve the dilemma, so we can finally stop arguing.
The AI news here is that self-driving cars are super safe, and I cannot wait for a world with fewer cars and far fewer human drivers. Still, the bar a robot has to clear to be called “safe” will be higher than the one an 18-year-old clears to get a license.
I am not sure that, even now, I trust a human driver more than a machine.
Resources to get started
I get this question often enough that I would like to add a couple of pointers here for anybody who wants to understand the current AI technology a bit better but doesn’t want to spend 6 months studying:
Why those two? Because they go beyond the practicalities of teaching you how to subscribe to GPT-4 for $20/month and give you the bigger picture of what’s going on.
Meta tries to make cool-looking Google glasses
It’s one of the oldest tropes in sci-fi: the main character gets a device that records their life in real time; they use it innocently at first and enjoy the new technology very much, until something goes terribly wrong and the finale turns tragic.[3]
This hasn’t deterred Zuckerberg from launching some Ray-Bans that are ready to upload your life to Instagram.
I think it’s very likely that the project will crash and burn like Google Glass, but they tried very hard to fix one crucial aspect: you won’t be immediately recognized as an idiot when you enter a bar.
The new smart glasses look 90% the same as a standard pair of Ray-Bans. However, I can’t guarantee that the folks at the bar will be happy once you start streaming to Instagram or the NSA starts mining your feed.
So, why include these glasses in this post?
Meta plans to integrate an AI assistant into the glasses. Although I know that most people will use them for the most intrusive and annoying stunts, I can’t help but think that we could be very close to a life-saving assistant that can advise you on how to cook, fix a squeaky wheel, or even help you change a tire on your car in the middle of the night.
The technology exists; it’s just a matter of putting it all together.
Some pretty solid use cases are emerging
Recently I had the pleasure of participating as a “judge” at an event hosted by Instruqt, a startup located here in Amsterdam.
Their objective was to evaluate the feasibility of integrating some form of AI into their product and transitioning it to production.
For me, their experience was the conclusive evidence I needed. Amid the hype and constant change, a few clear paradigms are emerging that everyone should consider:
AI Code Assistants: This should be self-explanatory. If your company produces software and you are not using the help of AI (I strongly recommend Cursor), you're falling behind.[4]
RAG (Retrieval-Augmented Generation): a technique in which you feed external data to an LLM to improve its output. For instance, you could use your entire knowledge base to power a customer-service chatbot. This is an interesting intersection between traditional Information Retrieval and modern AI. If you're unfamiliar with it, there are plenty of tools available, such as LlamaIndex (see the sketch after this list).
Agent models: OpenAI has invested significant effort into developing its "instruct" models, which are designed to follow and generate instructions. GPT-4 excels at breaking problems down into manageable steps that can be communicated to humans and acted upon using external tools.
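To make the RAG item above a bit more concrete, here is a toy sketch of the idea; it is my own illustration (word counts instead of real embeddings), not the API of any particular tool. A real pipeline would use proper embeddings and a vector store, for example through LlamaIndex.

```python
# Toy RAG: find the snippet most relevant to the question and stuff it into the
# prompt before calling the model. Word-count cosine similarity stands in for
# real embeddings; the final LLM call is left out on purpose.
from collections import Counter
import math

knowledge_base = [
    "Refunds are processed within 14 days of receiving the returned item.",
    "Our glasses ship worldwide, except to Antarctica.",
    "Support is available Monday to Friday, 9:00-17:00 CET.",
]

def similarity(a: str, b: str) -> float:
    """Cosine similarity over word counts -- a crude stand-in for embeddings."""
    wa, wb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(wa[w] * wb[w] for w in wa)
    norm = math.sqrt(sum(v * v for v in wa.values())) * math.sqrt(sum(v * v for v in wb.values()))
    return dot / norm if norm else 0.0

question = "How long do refunds take?"
best_snippet = max(knowledge_base, key=lambda doc: similarity(question, doc))

prompt = (
    "Answer the customer using only the context below.\n"
    f"Context: {best_snippet}\n"
    f"Question: {question}"
)
print(prompt)  # this augmented prompt is what you would send to the LLM
```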
These paradigms are quite broad, but what excites me the most is the Agent Model. It has the potential to exponentially improve a model's output by acting as the 'brain' for a larger system equipped with various tools.
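And here is an equally toy sketch of what I mean by an agent loop; the tools and the canned "model" below are made up for illustration, while in reality the decision would come from a GPT-4 call.

```python
# Toy agent loop: the "model" decides which tool to call next, the result is
# appended to its scratchpad, and the loop repeats until it produces an answer.

def calculator(expression: str) -> str:
    return str(eval(expression))  # demo only; never eval untrusted input

def search(query: str) -> str:
    return "Amsterdam has roughly 900,000 inhabitants."  # stubbed web search

TOOLS = {"calculator": calculator, "search": search}

def llm(scratchpad: list[str]) -> dict:
    """Canned stand-in for a real GPT-4 call that would pick the next action."""
    script = [
        {"tool": "search", "input": "population of Amsterdam"},
        {"tool": "calculator", "input": "900_000 * 2"},
        {"answer": "Twice Amsterdam's population is about 1.8 million."},
    ]
    return script[len(scratchpad)]

scratchpad: list[str] = []
while True:
    action = llm(scratchpad)
    if "answer" in action:
        print(action["answer"])
        break
    result = TOOLS[action["tool"]](action["input"])
    scratchpad.append(f"{action['tool']}({action['input']}) -> {result}")
```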
Rapid prototyping and hyper self-service IT
As GPT-4 keeps getting better and shows strong abilities on coding and data-related tasks, a question arises: “Why do I have to copy-paste from the chat?”
If I can ask GPT how to change a setting on macOS, wouldn’t it be nice if it made the change for me automatically? Or say I am a Marketing Manager who wants to quickly prototype a landing page: I know GPT can write the code, but I don’t have enough knowledge to put it together. Wouldn’t it be nice if it simply executed the task?
Of course, ChatGPT can’t do that!
It’s a website; your laptop knows better than to allow a random website to change your display settings or access your files.
If only we could call GPT in a slightly different way and permit it to do the work…
This is exactly what Open Interpreter does. I have tried it myself, and it’s pretty astonishing.
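To give you a feel for the pattern, here is a simplified sketch of the general idea, not Open Interpreter's actual code: let the model propose a command, show it to you, and run it locally only after confirmation.

```python
# Simplified "let the model act on your machine" pattern: ask an LLM for a
# command, show it, and execute it only after the user confirms.
import subprocess

def propose_command(request: str) -> str:
    """Stand-in for the LLM; a real implementation would ask GPT-4 for a command."""
    return "defaults read -g AppleInterfaceStyle"  # example macOS query (dark mode?)

request = "Tell me whether my Mac is in dark mode"
command = propose_command(request)

print(f"The assistant wants to run: {command}")
if input("Proceed? [y/N] ").strip().lower() == "y":
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    print(result.stdout or result.stderr)
```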
I believe this will continue to improve, solve a bunch of problems for IT departments around the world, and allow business users to write working prototypes before anyone with any technical knowledge gets involved.
Hopefully, this will also assist numerous startups that lack a technical co-founder in getting off the ground.
Large Language fMRI
Anthropic is a spin-out of OpenAI; its founders left because they believed they could do a better job of saving humanity from AGI. That stance has earned them a lot of money from Google and Amazon, though I suspect the $4B they got from Amazon is mostly cloud credits to pay for network egress charges.
A few days ago Anthropic announced a breakthrough in the explainability of deep neural networks.
Let me try to break it down in layman’s terms, as I understand it:
One of the big challenges with DNNs, especially from a safety and regulatory perspective, is that we can’t really say “how” they work for specific inputs
This means that if we would like to stop our LLM from being racist, we can’t just point a finger and say “here, this is the racist layer”
Many people in the scientific community suspected the culprit was the phenomenon of “superposition”: like neurons in the human brain, neurons in a DNN fire for multiple, unrelated inputs. So the same neuron can fire both when a user submits a kitty cuddling with its owner and when they submit a racist meme.
Anthropic tried, and proved that it’s possible, to do for LLMs something similar to a functional brain MRI, the technique used to highlight which parts of the human brain activate when a person is exposed to different stimuli (a toy illustration follows below)
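To make that a bit more tangible, here is a toy illustration of the underlying idea, disentangling superposed features with dictionary learning; this is my own made-up example on synthetic data, not Anthropic's method or code.

```python
# Toy illustration: neuron activations are dense mixtures ("superposition"),
# but dictionary learning can recover sparse, more interpretable features.
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)

# Pretend there are 6 "true" concepts (kitties, memes, ...) mixed into 4 neurons.
true_features = rng.normal(size=(6, 4))

# Each sample activates only 2 concepts, yet every neuron lights up a little.
codes = np.zeros((200, 6))
for row in codes:
    row[rng.choice(6, size=2, replace=False)] = rng.random(2)
activations = codes @ true_features

# Learn an overcomplete, sparse dictionary from the activations alone.
learner = DictionaryLearning(
    n_components=6, transform_algorithm="lasso_lars", alpha=0.1, random_state=0
)
sparse_codes = learner.fit_transform(activations)

# Each sample should now be explained by only a handful of learned features.
print("average active features per sample:", (np.abs(sparse_codes) > 1e-6).sum(axis=1).mean())
```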
Anthropic’s result is potentially an important breakthrough: it might allow us to create safer AI and, crucially, improve the public perception of those “black boxes” called robots (and avoid the trolley problem, etc.).
V for Vendetta
Back in March, OpenAI released a demo of GPT-4 that included the ability to transform a scribbled drawing into a website coded entirely by AI.
Since then, the feature sort of vanished off the radar, until two weeks ago, when they announced the release of GPT-4V (vision): basically, the merging of all of OpenAI’s models (text, audio, image) into a single interface.
I gotta say, they're playing it a bit fast and loose here, touting a "new" feature when they're just making good on a promise from months back.
Regardless, the buzz about chatting with your phone in natural language is off the charts, but what's truly blowing minds is the sheer power of Dall-E 3. Feeling brave? Take a stroll over to Reddit and feast your eyes on the eldritch horrors folks are crafting with the free version on Bing.
Interestingly, Microsoft has published a technical paper on what GPT-4V can do, and it’s fascinating. GPT-4V can do A LOT; the report runs 166 pages, and one sentence that stuck with me was:
One observation about LLMs is that LLMs don’t want to succeed. Rather, they want to imitate training sets with a spectrum of performance qualities. If the user wants to succeed in a task given to the model, the user should explicitly ask for it, which has proven useful in improving the performance of LLMs
This is quite important to keep in mind when dealing with LLMs, visual or otherwise. This is a summary GPT-4 did, just in case.
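In practice, that just means asking for quality explicitly. A trivial, made-up example of the difference, with the prompts as plain strings and no API call:

```python
# Two versions of the same request: the second explicitly asks for top-quality
# work, which (per the report) nudges the model toward its best behaviour.
lazy_prompt = "Summarise this contract."

explicit_prompt = (
    "You are a meticulous legal analyst. Summarise this contract accurately "
    "and completely, and flag any clause you are unsure about."
)

messages = [
    {"role": "system", "content": explicit_prompt},
    {"role": "user", "content": "<contract text goes here>"},
]
```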
I wish I could give you details from direct experience, but I am still waiting for OpenAI to give me access to GPT-4V.
So if you know Sam and could put a word for me…
This was an experiment to try out something different and a bit more editorial, let me know in the comments if you liked it!
[1] By the time you read this, it will have been a week ago.
[2] I like to think that it was because someone pointed out to him that, in the trolley problem, a robot would be able to make a decision.
[3] If you still haven’t, read “The Truth of Fact, the Truth of Feeling” by Ted Chiang, in the short-story collection “Exhalation”.
[4] No, data privacy isn’t a real issue.