Nina Zumel recently gave a very clear explanation of logistic regression (The Simpler Derivation of Logistic Regression). In particular, she called out the central role of log-odds ratios and demonstrated how the “deviance” (that mysterious quantity reported by fitting packages) is both a term in the “pseudo-R^2” (so it directly measures goodness of fit) and the quantity that is actually optimized during the fitting procedure. One great point of the writeup was how simple everything becomes once you start thinking in terms of derivatives: what makes the sigmoid special is not so much its functional form as its relation to its own derivative.
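To make the derivative remark concrete, here is a minimal sketch (not taken from either writeup) that numerically checks the standard identity s'(x) = s(x)(1 - s(x)) for the sigmoid s(x) = 1/(1 + exp(-x)). The function names, test points, and step size h are illustrative choices only.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function s(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative_numeric(x: float, h: float = 1e-6) -> float:
    """Central finite-difference estimate of s'(x)."""
    return (sigmoid(x + h) - sigmoid(x - h)) / (2.0 * h)


def sigmoid_derivative_identity(x: float) -> float:
    """s'(x) computed from the sigmoid's own value: s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


# Illustrative test points; any real numbers of moderate size would do.
for x in (-3.0, -0.5, 0.0, 1.0, 4.0):
    numeric = sigmoid_derivative_numeric(x)
    identity = sigmoid_derivative_identity(x)
    print(f"x={x:5.1f}  numeric={numeric:.8f}  identity={identity:.8f}")
```

The two columns should agree to within finite-difference error, which is why the derivative of the model can be written using only the model's own predictions.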
We adapt these presentation ideas to make explicit the well-known equivalence of logistic regression and maximum entropy models. In our new writeup, The equivalence of logistic regression and maximum entropy models (https://github.com/WinVector/Examples/blob/main/dfiles/LogisticRegressionMaxEnt.pdf), we move to multi-category modeling and demonstrate how one invents something as remarkable as logistic regression.
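One way to see the connection in the simplest two-category case is a small sketch, under assumptions of our own rather than the writeup's notation: setting the derivative of the log-likelihood to zero forces the fitted probabilities to reproduce the observed feature totals, and these "balance" equations have the same form as the constraints a maximum entropy model is built to satisfy. The toy data, the plain gradient-ascent fitter, and its learning rate and step count are all hypothetical choices for illustration.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical toy data: one numeric feature, binary outcome, not separable.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 1, 0, 1, 1]


def fit_logistic(xs, ys, learning_rate=0.05, n_steps=20000):
    """Fit intercept b0 and slope b1 by gradient ascent on the log-likelihood.

    The gradient is sum_i (y_i - p_i) for b0 and sum_i (y_i - p_i) * x_i
    for b1, where p_i = sigmoid(b0 + b1 * x_i).
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_steps):
        residuals = [y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)]
        b0 += learning_rate * sum(residuals)
        b1 += learning_rate * sum(r * x for r, x in zip(residuals, xs))
    return b0, b1


b0, b1 = fit_logistic(xs, ys)
ps = [sigmoid(b0 + b1 * x) for x in xs]

# At the optimum the gradient vanishes, so these pairs should match.
print("sum of predictions:", sum(ps), " sum of outcomes:", sum(ys))
print("sum of x * prediction:", sum(p * x for p, x in zip(ps, xs)),
      " sum of x * outcome:", sum(y * x for y, x in zip(ys, xs)))
```

A production fit would use a library optimizer rather than fixed-step gradient ascent; the loop is kept explicit here only so the gradient terms are visible.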
Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.