Sampling from a Data Stream


You are given a continuous stream of data where the total number of elements is unknown or too large to fit into memory. Your task is to design an efficient algorithm to randomly select \(k \) elements from the stream while ensuring that each element has an equal probability of being selected. This must be achieved in a single pass over the data (i.e., without storing the entire dataset).


Inputs
  • A stream of elements, arriving one by one.
  • An integer \( k\), representing the number of elements to be randomly sampled.
Output
  • A list of \( k \) elements randomly selected from the stream, where each element has an equal chance of being included in the final selection.

Code Output