Neural Networks
Logistic Regression
Let be a vector representing an input instance, where denotes the 'th feature of the input and be its corresponding output label. Logistic regression uses the logistic function, aka. the sigmoid function, to estimate the probability that belongs to :
The weight vector assigns weights to each dimension of the input vector for the label such that a higher magnitude of weight indicates greater importance of the feature . Finally, represents the bias of the label within the training distribution.
Q7: What role does the sigmoid function play in the logistic regression model?
Consider a corpus consisting of two sentences:
D1: I love this movie
D2: I hate this movie
The input vectors and can be created for these two sentences using the bag-of-words model:
V = {0: "I", 1: "love", 2: "hate", 3: "this", 4: "movie"}
x1 = [1, 1, 0, 1, 1]
x2 = [1, 0, 1, 1, 1]
Let and be the output labels of and , representing postive and negative sentiments of the input sentences, respectively. Then, a weight vector can be trained using logistic regression:
w = [0.0, 1.5, -1.5, 0.0, 0.0]
b = 0
Since the terms "I", "this", and "movie" appear with equal frequency across both labels, their weights , , and are neutralized. On the other hand, the terms "love" and "hate" appear only with the positive and negative labels, respectively. Therefore, while the weight for "love" () contributes positively to the label , the weight for "hate" () has a negative impact on the label . Furthermore, as positive and negative sentiment labels are equally presented in this corpus, the bias is also set to 0.
Given the weight vector and the bias, we have