NN hidden layer to logistic regression model same as original NN?

In the ranking section of the feed-based-system chapter, under “Stacking Models and online learning”, the author suggests that a Neural Network can be used for this stacking technique by feeding its last hidden layer into a logistic regression model:

Similarly, for neural networks, rather than predicting the probability of events, you can just plug in the output of the last hidden layer as features into the logistic regression models.

Neural Networks are essentially stacked logistic regression models (or near-equivalents, depending on whether the activation function is sigmoid or something else like tanh or ReLU). Removing the last layer, which predicts a single value, and replacing it with a logistic regression layer that also predicts a single value seems to leave you with essentially the same model?
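To make the equivalence in the question concrete, here is a minimal NumPy sketch (all weights are hypothetical random stand-ins, not from the book): a network whose output layer is a sigmoid unit produces exactly the same prediction as a logistic regression whose input features are the last hidden layer's activations and whose weights are the final layer's weights.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Hypothetical 2-layer network: one hidden layer, then a sigmoid output.
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=3)   # input -> hidden
w2, b2 = rng.normal(size=3), rng.normal()              # hidden -> output

x = rng.normal(size=4)
h = np.tanh(x @ W1 + b1)          # last-hidden-layer activations

# Full network prediction...
p_nn = sigmoid(h @ w2 + b2)

# ...is identical to a logistic regression over features h with
# weights w2 and intercept b2.
p_lr = sigmoid(np.dot(w2, h) + b2)

assert np.isclose(p_nn, p_lr)
```

So with no further training, the swap changes nothing; the question is what online learning adds on top of this.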

Does this “online learning” concept mean that if the NN has L layers, you would train all L layers offline, then freeze the first L-1 layers and retrain only the last layer online?

If there were no online learning, then both options — adding a final layer in the NN or using a separate logistic regression model — would mean pretty much the same thing, and you are right that it wouldn't make much sense to do it that way.

But in this particular discussion, the final model being proposed is a logistic regression model with online learning: essentially a function f(raw features, tree features, NN features). In this online-learning setting, using the last hidden layer as features should definitely be advantageous, because it lets the online model learn updated weights for every last-hidden-layer feature, rather than a single weight for one feature (the NN's final output) if we were just using the NN model's output as a feature.
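A minimal sketch of that setup, under assumed details (random stand-in weights, a synthetic label, plain per-example SGD; the book does not prescribe these specifics): the hidden layers are frozen after offline training, their activations are concatenated with the raw features, and a logistic regression over the combined feature vector is updated one example at a time as new events stream in.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)

# Frozen, offline-trained hidden layer (weights are hypothetical stand-ins).
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=3)

def nn_features(x_raw):
    """Last-hidden-layer activations of the frozen network, reused as LR features."""
    return np.tanh(x_raw @ W1 + b1)

class OnlineLogReg:
    """Logistic regression updated one example at a time via SGD."""
    def __init__(self, dim, lr=0.1):
        self.w = np.zeros(dim)
        self.b = 0.0
        self.lr = lr

    def predict(self, feats):
        return sigmoid(feats @ self.w + self.b)

    def update(self, feats, y):
        grad = self.predict(feats) - y   # d(logloss)/d(logit)
        self.w -= self.lr * grad * feats
        self.b -= self.lr * grad

# The stacked model is f(raw features, NN features); tree features
# would be concatenated into the same vector in the same way.
model = OnlineLogReg(dim=4 + 3)
for _ in range(200):
    x_raw = rng.normal(size=4)
    y = float(x_raw[0] + x_raw[1] > 0)            # synthetic label
    feats = np.concatenate([x_raw, nn_features(x_raw)])
    model.update(feats, y)
```

Note that every component of `model.w` — including one weight per hidden unit — gets updated online, which is the advantage over feeding in only the NN's single output probability.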