
Commit ee91b42

committed
adding equations from chapter 13
1 parent 573b3bc commit ee91b42

2 files changed

Lines changed: 43 additions & 3 deletions

File tree

docs/equations/pymle-equations.pdf

4.83 KB
Binary file not shown.

docs/equations/pymle-equations.tex

Lines changed: 43 additions & 3 deletions
@@ -2239,14 +2239,54 @@ \section{Summary}
+%%%%%%%%%%%%%%%
+% CHAPTER 13
+%%%%%%%%%%%%%%%

-\newpage
+\chapter{Parallelizing Neural Network Training with Theano}
-... to be continued ...
+\section{Building, compiling, and running expressions with Theano}
+\subsection{What is Theano?}
+\subsection{First steps with Theano}
+\subsection{Configuring Theano}
+\subsection{Working with array structures}
+\subsection{Wrapping things up -- a linear regression example}
+\section{Choosing activation functions for feedforward neural networks}
+\subsection{Logistic function recap}

+We recall from the section on logistic regression in \textit{Chapter 3, A Tour of Machine Learning Classifiers Using Scikit-learn} that we can use the logistic function to model the probability that sample $x$ belongs to the positive class (class 1) in a binary classification task:
+\[
+\phi_{logistic} (z) = \frac{1}{1 + e^{-z}}
+\]
+Here, the scalar variable $z$ is defined as the net input:

-\newpage
+\[
+z = w_0 x_0 + \dots + w_m x_m = \sum_{j=0}^{m} x_j w_j = \mathbf{w}^T \mathbf{x}
+\]
+Note that $w_0$ is the bias unit (y-axis intercept, $x_0 = 1$).
+\subsection{Estimating probabilities in multi-class classification via the softmax function}
+
+The softmax function is a generalization of the logistic function that allows us to compute meaningful class probabilities in multi-class settings (multinomial logistic regression). In softmax, the probability that a particular sample with net input $z$ belongs to the $i$th class is computed with a normalization term in the denominator, namely the sum over all $M$ linear functions:
+
+\[
+P(y=i \mid z) = \phi_{softmax}(z) = \frac{e^{z_i}}{\sum_{m=1}^{M} e^{z_m}}
+\]
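The normalization is easy to sanity-check in code. A short sketch (an illustration, not from the commit; the net-input values are made up) showing that the softmax outputs sum to one:

import numpy as np

def softmax(z):
    # subtracting max(z) stabilizes the exponentials without
    # changing the normalized result
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([2.0, 1.0, 0.1])  # hypothetical net inputs for M = 3 classes
p = softmax(z)
print(p)        # ~[0.659 0.242 0.099]
print(p.sum())  # 1.0 -- the outputs form a proper probability distribution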
+\subsection{Broadening the output spectrum by using a hyperbolic tangent}
+
+Another sigmoid function that is often used in the hidden layers of artificial neural networks is the \textit{hyperbolic tangent (tanh)}, which can be interpreted as a rescaled version of the logistic function:
+
+\[
+\phi_{tanh} (z) = 2 \times \phi_{logistic} (2 \times z) - 1 = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}}
+\]
+
+\[
+\phi_{logistic}(z) = \frac{1}{1 + e^{-z}}
+\]
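The rescaling identity above can be verified numerically. A tiny sketch (illustrative, not in the .tex source) confirming $\phi_{tanh}(z) = 2 \times \phi_{logistic}(2 \times z) - 1$ against NumPy's built-in tanh:

import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-5.0, 5.0, 11)
print(np.allclose(2.0 * logistic(2.0 * z) - 1.0, np.tanh(z)))  # True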
+\section{Training neural networks efficiently using Keras}
+\section{Summary}

\end{document} % end main document

0 commit comments