docs/equations/pymle-equations.tex (43 additions, 3 deletions)
@@ -2239,14 +2239,54 @@ \section{Summary}
%%%%%%%%%%%%%%%
% CHAPTER 13
%%%%%%%%%%%%%%%
\chapter{Parallelizing Neural Network Training with Theano}
\section{Building, compiling, and running expressions with Theano}
\subsection{What is Theano?}
\subsection{First steps with Theano}
\subsection{Configuring Theano}
\subsection{Working with array structures}
\subsection{Wrapping things up -- a linear regression example}
\section{Choosing activation functions for feedforward neural networks}
\subsection{Logistic function recap}
We recall from the section on logistic regression in \textit{Chapter 3, A Tour of Machine Learning Classifiers Using Scikit-learn} that we can use the logistic function to model the probability that sample $x$ belongs to the positive class (class 1) in a binary classification task:
\[
\phi_{logistic} (z) = \frac{1}{1 + e^{-z}}
\]
Here, the scalar variable $z$ is defined as the net input:

\[
z = w_0 x_0 + w_1 x_1 + \cdots + w_m x_m = \sum_{j=0}^{m} w_j x_j = \boldsymbol{w}^T \boldsymbol{x}
\]

Note that $w_0$ is the bias unit (y-axis intercept, $x_0 = 1$).
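As a quick illustration (the value here is chosen for simplicity rather than taken from the original text), a net input of $z = 0$ lies exactly on the decision boundary of the binary classifier:

\[
\phi_{logistic}(0) = \frac{1}{1 + e^{0}} = 0.5
\]

Positive net inputs therefore map to class-1 probabilities above 0.5, and negative net inputs to probabilities below it.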
\subsection{Estimating probabilities in multi-class classification via the softmax function}
The softmax function is a generalization of the logistic function that allows us to compute meaningful class probabilities in multi-class settings (multinomial logistic regression). In softmax, the probability that a particular sample with net input $z$ belongs to the $i$th class can be computed with a normalization term in the denominator that is the sum of all $M$ linear functions:

\[
P(y = i \mid z) = \phi_{softmax}(z) = \frac{e^{z_i}}{\sum_{m=1}^{M} e^{z_m}}
\]
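To make the normalization concrete, here is a small worked example (the net input values are illustrative, not from the original text). For $M = 3$ classes and $z = (2.0,\; 1.0,\; 0.1)$:

\[
\phi_{softmax}(z) = \frac{1}{e^{2.0} + e^{1.0} + e^{0.1}}
\begin{pmatrix} e^{2.0} \\ e^{1.0} \\ e^{0.1} \end{pmatrix}
\approx
\begin{pmatrix} 0.659 \\ 0.242 \\ 0.099 \end{pmatrix}
\]

As required for a probability distribution, the three class probabilities sum to one.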
\subsection{Broadening the output spectrum by using a hyperbolic tangent}
Another sigmoid function that is often used in the hidden layers of artificial neural networks is the \textit{hyperbolic tangent (tanh)}, which can be interpreted as a rescaled version of the logistic function.
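To spell out that rescaling (the identity below is a standard one, added here for completeness rather than taken from the diff):

\[
\phi_{tanh}(z) = 2 \cdot \phi_{logistic}(2z) - 1 = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}}
\]

Its output range is the open interval $(-1, 1)$, compared to $(0, 1)$ for the logistic function, which is what broadening the output spectrum refers to.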