Q 1. In word vector quantization, one problem is that the relative position of words with respect to the target word matters a lot. Which of the following would be a good weighting term to multiply with each word?
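As a hint, the idea can be illustrated with a short sketch. Assuming the weight is the inverse of a word's distance from the target word (a common choice, used for example in GloVe's co-occurrence counting), a context window could be weighted as below; the function name and toy sentence are purely illustrative:

```python
# Hypothetical sketch: weight each context word by 1/distance to the target,
# so that nearer words contribute more than distant ones (an assumption here,
# not necessarily the intended quiz answer).
def weighted_context(tokens, target_idx, window=4):
    """Return (word, weight) pairs where weight = 1 / distance to target."""
    pairs = []
    lo = max(0, target_idx - window)
    hi = min(len(tokens), target_idx + window + 1)
    for i in range(lo, hi):
        if i == target_idx:
            continue  # skip the target word itself
        distance = abs(i - target_idx)
        pairs.append((tokens[i], 1.0 / distance))  # closer => larger weight
    return pairs

tokens = "the quick brown fox jumps over the lazy dog".split()
print(weighted_context(tokens, target_idx=4, window=2))
# [('brown', 0.5), ('fox', 1.0), ('over', 1.0), ('the', 0.5)]
```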
Q 2. Which of the following formulas is used as the "forget gate" in an LSTM (Long Short-Term Memory) network?
Here x_t denotes the input to the LSTM at time t, and h_t denotes the computed hidden state of the LSTM at time t.
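For reference, the standard forget gate computes f_t = sigmoid(W_f · [h_{t-1}, x_t] + b_f). A minimal NumPy sketch of that computation follows; the dimensions, weight names, and random values are illustrative only:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forget_gate(x_t, h_prev, W_f, b_f):
    """f_t = sigmoid(W_f . [h_{t-1}, x_t] + b_f), the standard LSTM forget gate."""
    concat = np.concatenate([h_prev, x_t])  # [h_{t-1}, x_t]
    # Each output lies in (0, 1): how much of each cell-state entry to keep.
    return sigmoid(W_f @ concat + b_f)

# Toy sizes (assumed for the sketch): input size 3, hidden size 2.
rng = np.random.default_rng(0)
x_t = rng.standard_normal(3)
h_prev = rng.standard_normal(2)
W_f = rng.standard_normal((2, 2 + 3))  # maps [h_{t-1}, x_t] -> hidden size
b_f = np.zeros(2)
print(forget_gate(x_t, h_prev, W_f, b_f))
```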
Q 3. Which of the following are possible ambiguous interpretations of the given sentence?
"He never drinks coffee."
Note: There can be multiple correct answers to this question.