Two tower candidate retriever II

Mar 23, 2023

Mixed negative sampling with TensorFlow Recommenders

4 Comments

Fan, thanks for sharing another great post. I'd really appreciate it if you could help confirm my understanding of the effective sampling distribution Q for an item: (frequency in training) × (B/(B+B')) + (uniform frequency in index) × (B/(B+B')). For items not in the training dataset, it becomes just (uniform frequency in index) × (B/(B+B')).


Fan

Nov 11

I don't fully understand your equation. For items not in the training dataset, why is the second term multiplied by (B/(B+B'))? Specifically, why is the numerator also B? B is the training dataset, right?


Zhong Zhang

Nov 11

Ah, I see. It's a bit confusing without the context. B is the size of each training batch and B' is the size of each index batch; neither is the size of the full dataset. The equation also has a minor issue, as you noticed.

The corrected one should be:

Q(t) = (frequency of t in the training dataset) × (B/(B+B')) + (frequency of t in index dataset) × (B'/(B+B'))
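As a rough illustration of the corrected formula, here is a minimal Python sketch. The function name `mixed_sampling_prob`, the toy item lists, and the batch sizes are all made up for this example, and it assumes the index dataset lists each item once so that its frequency there is uniform.

```python
from collections import Counter


def mixed_sampling_prob(item, train_items, index_items, batch_size, index_batch_size):
    """Estimate Q(item) as a batch-size-weighted mix of two frequencies.

    batch_size is B (training batch), index_batch_size is B' (index batch).
    """
    # Relative frequency of the item in each dataset (0.0 if absent).
    train_freq = Counter(train_items)[item] / len(train_items)
    index_freq = Counter(index_items)[item] / len(index_items)
    total = batch_size + index_batch_size
    return train_freq * batch_size / total + index_freq * index_batch_size / total


# Hypothetical toy data: "c" and "d" never appear in training.
train_items = ["a", "a", "a", "b"]
index_items = ["a", "b", "c", "d"]

q_values = {
    item: mixed_sampling_prob(item, train_items, index_items, 6, 2)
    for item in index_items
}
print(q_values)
```

With B = 6 and B' = 2, an item such as "d" that never appears in training gets only the index term, (1/4) × (2/8), and the Q values over all items sum to 1.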


Fan

Nov 13

Yes, I agree. For items not in the training dataset, the first term is zero, so Q(t) comes from the uniform index sampling term alone.


