Company Name : Amazon
Profile : ML Applied Scientist II, Posted On : 06-02-2024

Round I: ML Round (Hiring Manager)

  • Started with a deep dive into my Master's project (~30 minutes). The interviewer asked detailed questions and expected clear explanations.
  • He also described the team's ongoing projects and challenges. Listen carefully and ask questions where you can.
  • Discussion on:
    • Automatic program correction
    • RNN, LSTM, GRU, and the vanishing gradient problem
    • Sentiment classification
    • Logistic regression
  • Approximate k-NN Problem:
    • Scenario: Given billions of products, retrieve the top 1 million for a user query.
    • Solution discussed: LSH (Locality Sensitive Hashing)
    • Tip: Read about the PageRank algorithm if possible.
  • Scenario Problem:
    • Given a search query, user history, and current results on Amazon, how would you build a model to suggest filters for the left panel?
    • Discussion on dataset labeling and feature engineering.
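
For the approximate k-NN question above, one common LSH variant is random-hyperplane hashing for cosine similarity. The sketch below is only an illustration of that idea, not the solution discussed in the interview. The function names, the 8-bit signature, and the toy 2-D vectors are all assumptions made for the example.

```python
import math
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplanes (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(h_i * v_i for h_i, v_i in zip(h, vec)) >= 0 else 0
        for h in hyperplanes
    )


def build_index(items, hyperplanes):
    """Group item ids by signature so that similar vectors tend to share a bucket."""
    buckets = defaultdict(list)
    for item_id, vec in items.items():
        buckets[signature(vec, hyperplanes)].append(item_id)
    return buckets


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def query(q, items, buckets, hyperplanes, top_k):
    """Score only the candidates in the query's bucket instead of every item."""
    candidates = buckets.get(signature(q, hyperplanes), [])
    ranked = sorted(candidates, key=lambda i: cosine(q, items[i]), reverse=True)
    return ranked[:top_k]


# Toy data, assumed for illustration only.
items = {"a": [1.0, 0.0], "b": [-1.0, 0.0], "c": [0.9, 0.1]}
planes = make_hyperplanes(num_bits=8, dim=2, seed=1)
buckets = build_index(items, planes)
nearest = query([1.0, 0.0], items, buckets, planes, top_k=2)
```

In practice several independent hash tables are usually combined to trade recall against bucket size; this sketch uses a single table to keep the idea visible.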

Round II: Coding Round

  • Problem 1: Given a BST, generate all possible permutations of input sequences that would result in the same BST.
  • The interviewer focused on breaking down the problem and evaluating the approach.
  • Had to write complete code on paper, so be ready for that format.
  • Problem 2: Given two strings, find all merged strings such that the relative character order from the original strings is maintained.
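
The two problems above are closely related: the insertion orders that rebuild a BST are the root followed by every order-preserving interleaving of a left-subtree order and a right-subtree order. The sketch below shows one possible recursive approach, not necessarily the one expected in the interview. The Node class and function names are assumptions for the example.

```python
class Node:
    """Minimal BST node, assumed for this example."""

    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def weave(first, second):
    """All interleavings of two lists that keep each list's relative order."""
    if not first or not second:
        return [list(first) + list(second)]
    # Either the next element comes from `first` or from `second`.
    with_first = [[first[0]] + rest for rest in weave(first[1:], second)]
    with_second = [[second[0]] + rest for rest in weave(first, second[1:])]
    return with_first + with_second


def bst_sequences(root):
    """All insertion orders that produce the BST rooted at `root`."""
    if root is None:
        return [[]]
    result = []
    for left in bst_sequences(root.left):
        for right in bst_sequences(root.right):
            for woven in weave(left, right):
                # The root must be inserted before anything in its subtrees.
                result.append([root.val] + woven)
    return result


def merged_strings(a, b):
    """Problem 2: all merges of a and b preserving each string's character order."""
    return ["".join(chars) for chars in weave(list(a), list(b))]


tree = Node(2, Node(1), Node(3))
orders = bst_sequences(tree)
merges = merged_strings("ab", "c")
```

If the two strings share characters, merged_strings can return duplicates; wrapping the result in set() removes them if the interviewer wants distinct strings only.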

Round III: Director's Round (Rajeev Rastogi)

  • Briefly discussed project experience.
  • Explained and drew the Word2Vec neural network model and its workflow.
  • Clustering problem:
    • Given 1D points, cluster them into 2 groups in polynomial time.
    • Then extended to k-clusters using dynamic programming.
  • Bootstrap sampling: How many unique samples are expected?
  • Large-scale retrieval: From billions of samples, retrieve the top 100.
    • Solution: LSH and random projection methods.
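
For the 1D clustering question above, a dynamic-programming sketch is possible because, after sorting, an optimal cluster is a contiguous run of points. The post does not state the objective, so the example assumes the within-cluster sum of squared deviations (the k-means objective). The function name and the O(k n^2) formulation are likewise assumptions.

```python
def cluster_1d(points, k):
    """Split 1D points into k contiguous clusters minimizing within-cluster
    sum of squared deviations (assumed objective). Returns (cost, clusters)."""
    xs = sorted(points)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of points")

    # Prefix sums let us compute the cost of any run xs[i:j] in O(1).
    prefix = [0.0] * (n + 1)
    prefix_sq = [0.0] * (n + 1)
    for idx, x in enumerate(xs):
        prefix[idx + 1] = prefix[idx] + x
        prefix_sq[idx + 1] = prefix_sq[idx] + x * x

    def cost(i, j):
        total = prefix[j] - prefix[i]
        total_sq = prefix_sq[j] - prefix_sq[i]
        return total_sq - total * total / (j - i)

    inf = float("inf")
    # dp[c][j]: best cost of splitting the first j points into c clusters.
    dp = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    dp[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                candidate = dp[c - 1][i] + cost(i, j)
                if candidate < dp[c][j]:
                    dp[c][j] = candidate
                    cut[c][j] = i

    # Walk the stored cut points backwards to recover the clusters.
    clusters = []
    j = n
    for c in range(k, 0, -1):
        i = cut[c][j]
        clusters.append(xs[i:j])
        j = i
    clusters.reverse()
    return dp[k][n], clusters


total_cost, groups = cluster_1d([10, 1, 11, 2], k=2)
```

With k=2 this reduces to trying every split point of the sorted array, which already answers the polynomial-time part of the question.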

Round IV: ML Deep Dive

  • Topics covered:
    • Linear Regression (formulation, solution methods)
    • Logistic Regression (derivation, loss functions)
    • Taylor series for vector-valued functions
    • Jensen's Inequality
    • Neural Networks & CNNs
    • Properties of positive semi-definite (PSD) matrices

Company Name : Amazon
Profile : Machine Learning Applied Scientist II, Posted On : 19-06-2024

Phone Screening Round

These rounds are designed to evaluate both the breadth and depth of your understanding of Machine Learning fundamentals. Each interview typically begins with brief introductions from both the interviewer and the candidate. This is followed by a 20-25 minute discussion focusing on the projects listed in your resume. Expect questions related to each project's significance and its real-world applications.

You will also be asked about various machine learning paradigms you have worked with. It's best to mention only those models you are confident discussing in detail. In my case, I discussed Decision Trees, CNNs, LSTMs, VAEs, and GANs.

Here are some of the questions I was asked during the phone screening round:

  • How is a decision tree constructed? What is mutual information?
  • Why are VAEs preferred over Autoencoders? What is the loss function minimized?
  • What is batch normalization and how does it help?
  • Compare LSTMs and GRUs.
  • What does the "peephole" keyword in TensorFlow signify for LSTMs?
  • What are the various text representation techniques? How do Word2Vec, GloVe, and FastText differ?
  • What is the difference between Eigenvalue Decomposition (EVD) and Singular Value Decomposition (SVD)? Under what conditions do they behave similarly?
  • What is the difference between bagging and boosting?
  • Explain Gradient Boosted Decision Trees (GBDT).
  • Why is padding necessary in CNNs?
  • Describe the ResNet architecture.
  • What's the difference between Logistic Regression and Linear Regression?
  • What does maximizing the likelihood function in Logistic Regression imply? What happens when you introduce a prior on the weights? Why is this prior assumed?
  • When does the Maximum Likelihood Estimate (MLE) equal the Maximum A Posteriori (MAP) estimate?
  • Why is standardizing features important?
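
For the MLE/MAP questions above, the standard relationship can be written out as a short derivation. The notation (weights w, data pairs (x_i, y_i), prior variance sigma^2) is assumed here for illustration and was not given in the interview.

```latex
\hat{w}_{\mathrm{MLE}} = \arg\max_{w} \sum_{i=1}^{n} \log p(y_i \mid x_i, w)

\hat{w}_{\mathrm{MAP}} = \arg\max_{w} \left[ \sum_{i=1}^{n} \log p(y_i \mid x_i, w) + \log p(w) \right]

% Gaussian prior p(w) = \mathcal{N}(0, \sigma^2 I):
\log p(w) = -\frac{1}{2\sigma^2} \lVert w \rVert_2^2 + \text{const}
\quad\Longrightarrow\quad
\hat{w}_{\mathrm{MAP}} = \arg\max_{w} \left[ \sum_{i=1}^{n} \log p(y_i \mid x_i, w) - \frac{1}{2\sigma^2} \lVert w \rVert_2^2 \right]
```

So a zero-mean Gaussian prior on the weights corresponds to L2 regularization, and when the prior is flat (log p(w) is constant in w) the extra term drops out and the MAP estimate coincides with the MLE.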

Onsite Interview Rounds

Once you pass the initial phone interviews, the recruiter will schedule onsite interviews. These sessions delve deeper into your ML expertise while also assessing how well you can apply machine learning to scalable, real-world systems. A solid understanding of data structures and algorithms is also expected.

Round 2: Machine Learning Breadth and Depth
  • Compare modeling P(y|x) vs P(y,x). What are the pros and cons? Which algorithms typically use each approach and why?
  • Compare MLE and MAP in the context of Linear Regression. What assumptions are made about P(y|x) in Linear and Logistic Regression?
  • Explain Bayes' Rule and the concepts of posterior, likelihood, and prior.
  • What is marginalization and why is it necessary?
  • Why is computing the posterior in VAEs challenging? What methods exist to approximate it?
  • Explain batch, mini-batch, and stochastic gradient descent. When would you use each?
  • Why does stochastic gradient descent tend to oscillate near a minimum?
  • How can these optimization methods be enhanced? How does the momentum term help?
  • What is the Hessian? Can it improve training speed? What are its drawbacks?
  • What are adaptive learning rates? Describe any adaptive optimization methods you know.
  • Share an instance where you went out of your way to assist someone.
Round 3: Data Structures and Algorithms
  • Find all index pairs that sum up to a target value (for both sorted and unsorted arrays).
  • Find the longest repeated substring in a string (including overlapping).
    Example:
    Input: banana
    Output: ana
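
The two questions above can be sketched as follows. This is one possible set of approaches, not the expected answers: a hash map for the unsorted case, two pointers for the sorted case, and a simple sorted-suffix comparison for the longest repeated substring. All function names are assumptions, and the suffix approach is chosen for brevity rather than optimal complexity.

```python
def pair_indices_unsorted(nums, target):
    """All index pairs (i, j) with i < j and nums[i] + nums[j] == target."""
    seen = {}  # value -> indices where it has appeared so far
    pairs = []
    for j, value in enumerate(nums):
        for i in seen.get(target - value, []):
            pairs.append((i, j))
        seen.setdefault(value, []).append(j)
    return pairs


def pair_indices_sorted(nums, target):
    """Same result for an ascending array, using two pointers and O(1) extra space."""
    pairs = []
    lo, hi = 0, len(nums) - 1
    while lo < hi:
        total = nums[lo] + nums[hi]
        if total < target:
            lo += 1
        elif total > target:
            hi -= 1
        elif nums[lo] == nums[hi]:
            # Every pair inside the run lo..hi matches.
            pairs.extend(
                (i, j) for i in range(lo, hi + 1) for j in range(i + 1, hi + 1)
            )
            break
        else:
            # Pair up the run of equal values at each end.
            lo_end = lo
            while nums[lo_end + 1] == nums[lo]:
                lo_end += 1
            hi_start = hi
            while nums[hi_start - 1] == nums[hi]:
                hi_start -= 1
            pairs.extend(
                (i, j)
                for i in range(lo, lo_end + 1)
                for j in range(hi_start, hi + 1)
            )
            lo, hi = lo_end + 1, hi_start - 1
    return pairs


def longest_repeated_substring(s):
    """Longest substring occurring at least twice (overlaps allowed)."""
    suffixes = sorted(s[i:] for i in range(len(s)))
    best = ""
    # Any repeated substring is a common prefix of two adjacent sorted suffixes.
    for a, b in zip(suffixes, suffixes[1:]):
        k = 0
        while k < min(len(a), len(b)) and a[k] == b[k]:
            k += 1
        if k > len(best):
            best = a[:k]
    return best
```

For example, longest_repeated_substring("banana") returns "ana", matching the example in the question. A suffix array with an LCP array or a suffix automaton would scale better for long strings.
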
Round 1: Scalable ML and Applications, System Design
  • Given user interactions on Amazon (viewing, clicking, purchasing), how would you model the probability of each event?
  • How would you build the dataset? What features would you include for users and products?
  • Would you use three separate models or a joint model for view, click, and purchase predictions?
  • How would you handle training with millions of data points?
  • What are the different downsampling techniques?
  • What characteristics should a downsampled dataset maintain?
  • If selecting 10 out of 1000 points using a coin toss with p < 0.01, what is the resulting distribution, mean, and variance?
  • How does this sampling differ from random sampling?
  • What is reservoir sampling?
  • How would you approach generating a monthly cart for a user on Amazon?
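
For the reservoir sampling question above, a minimal sketch of the classic single-pass version (often called Algorithm R) is shown below. The function name and the optional rng parameter are assumptions for the example.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    rng = rng or random.Random()
    reservoir = []
    for index, item in enumerate(stream):
        if index < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item number index+1 replaces a slot with probability k / (index + 1).
            j = rng.randint(0, index)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


sample = reservoir_sample(range(1000), k=10, rng=random.Random(42))
```

This always returns exactly k items (or the whole stream if it is shorter). By contrast, selecting each of n points with an independent coin toss of probability p gives a sample whose size follows a Binomial(n, p) distribution, with mean np and variance np(1 - p).
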
Round 4: Behavioral Interview

This round evaluates your cultural fit within Amazon. Familiarity with Amazon's 16 Leadership Principles is expected, and your responses should reflect those values.

  • Describe a time when you exceeded your assigned responsibilities.
  • What do you believe are key short-term and long-term goals? Why?
  • Give an example of a long-term goal that required sacrificing a short-term goal.
  • What does your work mean to you?
  • Why did you choose to pursue a career in Machine Learning?

Final Tip: Walk into the interview with confidence and give it your best shot.

Company Name : Amazon
Profile : ML L6, Posted On : 22-02-2025

Note: Every round involved questions tied to Amazon Leadership Principles.

Round I: ML Depth
  • Discussion focused on deep technical understanding of ML concepts.
  • Topics included:
    • Autoencoders
    • Variational Autoencoders (VAE)
    • Bayesian Regression
    • Regularization (L1, L2)
    • BERT architecture and its inner workings
Round II: ML Design
  • Problem 1: Design a product classification model for Amazon using hierarchical classification.
  • Objective: Identify the product's category -> subcategory -> final label.
  • Problem 2: Given a model trained on animal sketches and evaluated on real animal images, and vice versa:
    • Explain the observed high bias/variance.
    • Discuss techniques to mitigate this domain shift (e.g., data augmentation, domain adaptation, transfer learning).
Round III: Coding
  • Implement the random.shuffle() function from scratch.
  • Problem: Valid Parentheses
    • Given a string of parentheses, determine if it is valid.
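
Both coding questions above have short, well-known solutions, sketched below. The shuffle uses the Fisher-Yates algorithm; the bracket check uses a stack. The post does not say which bracket types were allowed, so the example assumes (), [] and {} and treats any other character as invalid. Function names are assumptions.

```python
import random


def shuffle(items, rng=random):
    """In-place Fisher-Yates shuffle: every permutation is equally likely."""
    for i in range(len(items) - 1, 0, -1):
        j = rng.randint(0, i)  # inclusive, so an element may stay in place
        items[i], items[j] = items[j], items[i]


def is_valid(s):
    """True if every bracket is closed by the matching type in the right order."""
    closing_to_opening = {")": "(", "]": "[", "}": "{"}
    opening = set(closing_to_opening.values())
    stack = []
    for ch in s:
        if ch in opening:
            stack.append(ch)
        elif ch in closing_to_opening:
            if not stack or stack.pop() != closing_to_opening[ch]:
                return False
        else:
            return False  # assumption: only bracket characters are allowed
    return not stack


deck = list(range(10))
shuffle(deck, rng=random.Random(7))
```

A common bug worth mentioning in the interview is drawing j from the full range on every step, which makes some permutations more likely than others.
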
Round IV: ML Project + Breadth
  • ML Depth:
    • Project-specific discussion and deep dive
    • BERT architecture
    • How gradients behave through ReLU during backpropagation
    • CNN-related questions
  • ML Breadth: Quick-fire Q&A style round covering:
    • Word2Vec
    • CNNs
    • BERT (again - breadth and comparison)
Final Round: Bar Raiser
  • This round focused almost entirely on Amazon's Leadership Principles.
  • Nearly 10 situational and behavioral questions assessing principles like:
    • Customer Obsession
    • Ownership
    • Invent and Simplify
    • Learn and Be Curious
    • Are Right, A Lot
    • Bias for Action
