Ten years ago, on November 2nd, I landed in Amsterdam with my whole life packed into a couple of suitcases. An Atlantic storm had ravaged the Netherlands the day before, there were fallen trees all over the streets, and the wind was so strong that, during the landing, all the babies were crying (me included).
It was the beginning of my experience living abroad. I was going to stay for one year, maybe two, three if I felt very good about it.
Now I have a house in Amsterdam and a mortgage to repay over the next 30 years.
Nobody forced me to stay ten years; it just somehow happened. I was here and it was warm and cozy (gezellig) and, even when it wasn’t, moving somewhere else was so much effort.
This is how life works most of the time: many of our long-term decisions are made with the idea of “just trying something, I can always go back”.
It’s like Amazon’s free returns policy: they understand that keeping an item I don’t like is less effort than sending it back.
The story of one of my greatest achievements
Nine years ago, I was assigned to a new team at Booking.com whose mission was to figure out if this Machine Learning thing could be used to improve conversion (make money).
One key component of our strategy was to figure out what our users liked: which cities they were searching for, which hotels they clicked on, which ones they booked, etc.
This sounds pretty easy in theory: you own the website, so extracting this data should be straightforward. Except that, for a site with so many users, it’s a large amount of data and a lot of noise to cut through.
Our team was considered an “experiment” and we didn’t have enough resources to build anything sophisticated, but we got lucky: the need to run A/B tests all over the place meant that there was already a way to track that information and store it for later analysis. We could hack our way through by co-opting that system. It was not designed for it but, as long as we didn’t break anything, nobody would try to stop us.
This was not a complete solution. When we started trying to crunch that large amount of data, we discovered that it didn’t work as well as we expected. I broke the system so often that the newly formed “Big Data” team invited me to a team retrospective to understand why I was reporting a new bug every day.
After a few months the infrastructure got much better. It was still a Rube Goldberg machine but, in my experience, all companies build a business-critical Rube Goldberg machine given enough time.
And we made money with this machine, oh boy we did. All our models and predictions depended on the data generated by the machine, in particular on two tables that tracked what was happening in the most important sections of the website.
Of course, that data wasn’t perfect. We had bugs and holes in the data, and I spent a lot of time fixing things and turning knobs to make it better.
On the other hand, we were running machine learning models and didn’t need “perfect” data; we needed a representative sample.
We were not stupid: we knew that anything instrumental to the money-making engine was going to be bolted to the wall and labeled as a “magical data table that makes money for free”.
So, to avoid any confusion, we gave all the tables a clear, descriptive name: team_horrible_data_log [1].
And we wrote clearly that this data could not be trusted, with “there will be dragons” and all the long-term nuclear waste warning messages that you can imagine.
Eventually, I left the team, handed over this pile of snakes to someone else, and didn’t think too much about it aside from the occasional question I would get from someone who had the misfortune of touching my code.
Later I was told that there was a plan to decommission those tables in favor of something more robust and generic.
“Good job guys”, I foolishly thought.
Harvey Dent’s prophecy
Three years later, I became the manager of a larger department. There were a few data science and analytics teams in there, and I was very surprised to learn that they were very excited to meet me.
I was “the guy that created the tables” and, at that moment, the words of Harvey Dent echoed in my mind:
“You either die a hero or you live long enough to see yourself become the villain.”
I had lived long enough to attend a presentation that showed how my “absolutely temporary and access-restricted tables that were about to be decommissioned a couple of years ago” were not only in use but were the foundation of a myriad of business processes, some of them mission critical.
The arrows in the slide showed the various dependencies, like an infectious disease for which I was patient zero.
I was horrified to learn that now people were working full time on making sure that the Rube Goldberg machine kept working and, at the same time, working on decommissioning team_horrible_data_log.
I tried to explain that this was never supposed to happen: it was all temporary! It was a game we played when we were young and foolish!
It didn’t matter: team_horrible_data_log was now part of the foundations of the largest online travel agency (OTA) in the world.
I left Booking.com in 2021; the decommissioning work was still ongoing.
Epilogue
July 27, 2022
My phone chimes. It’s a WhatsApp message from a friend who still works at Booking.com:
“What can you tell me about team_horrible_data_log?”
[1] This is not the real name of the table(s). I want to stay away from lawsuits, so I will not share any name that might leak internal information.