Since the problem statement covers both new and existing users, note that the "cold start" problem applies to new users, for whom the platform has no interaction history yet. The solution for new users can differ significantly from the one for existing users. If the problem statement does not say which users are in scope, clarify this at the beginning of the interview.
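To make the new-versus-existing split concrete, here is a minimal Python sketch of one possible routing rule. It assumes a popularity fallback for users with little history, which is only one common option and not something the problem statement prescribes. The names `recommend`, `toy_ranker`, and `MIN_HISTORY`, as well as the threshold value, are hypothetical.

```python
from typing import Callable, List, Sequence

# Hypothetical threshold: users with fewer watched videos than this
# are treated as cold-start users.
MIN_HISTORY = 5


def recommend(
    watch_history: Sequence[str],
    popular_videos: Sequence[str],
    personalized_ranker: Callable[[Sequence[str]], List[str]],
    k: int = 3,
) -> List[str]:
    """Return up to k video ids, choosing the path by how much history exists."""
    if len(watch_history) < MIN_HISTORY:
        # Cold start: too little history to personalize, so fall back to
        # globally popular videos the user has not watched yet.
        seen = set(watch_history)
        return [v for v in popular_videos if v not in seen][:k]
    # Existing user: defer to a personalized model (stubbed out below).
    return personalized_ranker(watch_history)[:k]


def toy_ranker(history: Sequence[str]) -> List[str]:
    """Stand-in for a real model: pretend to find videos similar to recent watches."""
    return [f"similar_to_{v}" for v in reversed(history)]


popular = ["v_pop1", "v_pop2", "v_pop3", "v_pop4"]
new_user_recs = recommend([], popular, toy_ranker)
existing_user_recs = recommend(["a", "b", "c", "d", "e"], popular, toy_ranker)
print(new_user_recs)
print(existing_user_recs)
```

The point of the sketch is only that the two user groups take different code paths; in a real system each branch would be a full model or pipeline.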
How many active users are on the platform? How many videos will be available? Let's assume there are 1 million active users per day and 100 million videos. If each user visits the platform at least 5 times a day, that is at least 5 million recommendation requests per day (about 58 per second on average), so we need to design a solution that operates at scale.
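The scale assumptions above can be turned into a quick back-of-envelope estimate. The Python sketch below uses the 1 million daily active users and 5 visits per day from the text; the `peak_factor` multiplier and the function name `estimate_load` are illustrative assumptions, since the text says nothing about how traffic is distributed over the day.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate_load(daily_active_users: int,
                  visits_per_user_per_day: int,
                  peak_factor: float = 3.0) -> dict:
    """Rough request-volume estimate for the recommendation service.

    peak_factor is an assumed ratio of peak to average traffic.
    """
    requests_per_day = daily_active_users * visits_per_user_per_day
    avg_qps = requests_per_day / SECONDS_PER_DAY
    return {
        "requests_per_day": requests_per_day,
        "avg_qps": avg_qps,
        "peak_qps": avg_qps * peak_factor,
    }


load = estimate_load(daily_active_users=1_000_000, visits_per_user_per_day=5)
print(f"requests/day: {load['requests_per_day']:,}")
print(f"average QPS:  {load['avg_qps']:.1f}")
print(f"assumed peak: {load['peak_qps']:.1f}")
```

Each of these requests has to select from a catalog of 100 million videos, which is why the average request rate alone understates the difficulty of the problem.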
Will the system be deployed in an online (real-time inference) or offline setting? This is something you should clarify with the interviewer. In general, recommender systems are deployed in online settings. For this case, we will consider a real-time inference scenario.
Design a YouTube video recommendation system that serves millions of users and handles millions of videos. The system should cater to both existing and new users and be deployable in a real-time environment.
Before diving into the solution details, it's helpful to draw an end-to-end system diagram to clarify the full picture. The diagram below illustrates the high-level overview of the system. The life cycle of the complete system can be described as follows:
Once we have covered the first three steps, it's time to discuss the low-level design of the ML system. From the next section onward, we dive into the detailed solution.