Primer on TensorFlow and Keras: The Past (TF1), The Present (TF2)
Solution 1:
How does TF1/TF2 work? And their differences
TF1
TF1 follows an execution style known as define-then-run. This is opposed to define-by-run, which is, for example, how plain Python executes. But what does that mean? Define-then-run means that just because you called/defined something, it is not executed. You have to explicitly execute what you defined.
TF has this concept of a Graph. First you define all the computations you need (e.g. all the layer computations of a neural network, the loss computation and an optimizer that minimizes the loss - these are represented as ops or operations). After you define the computation/data-flow graph, you execute bits and pieces of it using a Session. Let's see a simple example in action.
import tensorflow as tf  # TF 1.x

# Graph generation
tf_a = tf.placeholder(dtype=tf.float32)
tf_b = tf.placeholder(dtype=tf.float32)
tf_c = tf.add(tf_a, tf.math.multiply(tf_b, 2.0))

# Execution
with tf.Session() as sess:
    c = sess.run(tf_c, feed_dict={tf_a: 5.0, tf_b: 2.0})
    print(c)
The computational graph (also known as a data flow graph) will look like the one below.
tf_a     tf_b    tf.constant(2.0)
  \         \       /
   \     tf.math.multiply
    \         /
      tf.add
        |
      tf_c
Analogy: Think about making a cake. You download the recipe from the internet. Then you start following the steps to actually make the cake. The recipe is the Graph, and the process of making the cake is what the Session does (i.e. the execution of the graph).
TF2
TF2 follows an immediate execution style, or define-by-run. You call/define something, and it is executed right away. Let's see an example.
import tensorflow as tf  # TF 2.x

a = tf.constant(5.0)
b = tf.constant(3.0)
c = a + (b * 2.0)
print(c.numpy())
Woah! It looks so clean compared to the TF1 example. Everything looks so Pythonic.
Analogy: Now think that you are at a hands-on cake workshop. You make the cake as the instructor explains, and the instructor tells you the result of each step immediately. So, unlike in the previous example, you don't have to wait until the cake is baked to see if you got it right (a reference to how hard it is to debug a TF1 graph). You get instant feedback on how you are doing.
Does that mean TF2 doesn't build a graph? Panic attack
Well, yes and no. There are two features in TF2 you should know about: eager execution and AutoGraph functions.
Tip: To be exact, TF1 also had eager execution (off by default); it could be enabled using tf.enable_eager_execution(). TF2 has eager execution on by default.
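For completeness, here is a minimal sketch of what that looks like in a TF1 script (assuming a TF 1.x installation; in TF2 the call is unnecessary):
import tensorflow as tf  # assumes TF 1.x

tf.enable_eager_execution()   # must be called right after the import, before any graph is built
print(tf.add(1.0, 2.0))       # executes immediately; no Session needed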
Eager execution
Eager execution can immediately execute Tensors and Operations. This is what you observed in the TF2 example. But the flip side is that it does not build a graph. So, for example, if you use only eager execution to implement and run a neural network, it will be very slow (as neural networks perform very repetitive tasks (forward computation - loss computation - backward pass) over and over again).
AutoGraph
This is where the AutoGraph feature comes to the rescue. AutoGraph is one of my favorite features in TF2. What it does is, if you are doing "TensorFlow" stuff in a function, it analyses the function and builds the graph for you (mind blown). So, for example, if you do the following, TensorFlow builds the graph.
@tf.function
def do_silly_computation(x, y):
    a = tf.constant(x)
    b = tf.constant(y)
    c = a + (b * 2.0)
    return c

print(do_silly_computation(5.0, 3.0).numpy())
So all you need to do is define a function which takes the necessary inputs and returns the correct output. Most importantly, add the @tf.function decorator, as that's the trigger for TensorFlow AutoGraph to analyse a given function.
Warning: AutoGraph is not a silver bullet and is not to be used naively. There are various limitations of AutoGraph too.
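To see why graphs matter for speed, here is a rough, hypothetical benchmark sketch (TF 2.x) that times the same repetitive computation eagerly and as a traced tf.function; the exact gap depends on your hardware and on how much Python overhead the function carries relative to the actual math:
import timeit
import tensorflow as tf  # TF 2.x

x = tf.random.normal([500, 500])

def matmul_chain(m):
    # a small, repetitive computation, loosely analogous to a forward pass
    for _ in range(20):
        m = tf.matmul(m, m) / 500.0
    return m

graph_matmul_chain = tf.function(matmul_chain)  # same logic, traced into a graph by AutoGraph
graph_matmul_chain(x)                           # first call pays the one-off tracing cost

print("eager:", timeit.timeit(lambda: matmul_chain(x), number=50))
print("graph:", timeit.timeit(lambda: graph_matmul_chain(x), number=50))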
Differences between TF1 and TF2
- TF1 requires a tf.Session() object to execute the graph; TF2 doesn't
- In TF1, unreferenced variables were not collected by the Python GC, but in TF2 they are
- TF1 does not promote code modularity, as you need the full graph defined before starting the computations. However, with AutoGraph functions, code modularity is encouraged
What are different datatypes in TF1 and TF2?
You've already seen a lot of the main data types. But you might have questions about what they do and how they behave. Well, this section is all about that.
TF1 Data types / Data structures
- tf.placeholder: This is how you provide inputs to the computational graph. As the name suggests, it does not have a value attached to it. Rather, you feed a value at runtime. tf_a and tf_b are examples of these. Think of it as an empty box; you fill it with water/sand/fluffy teddy bears depending on the need.
- tf.Variable: This is what you use to define the parameters of your neural network. Unlike placeholders, variables are initialized with some value, and their value can also change over time. This is what happens to the parameters of a neural network during backpropagation.
- tf.Operation: Operations are various transformations you can execute on Placeholders, Tensors and Variables. For example, tf.add() and tf.math.multiply() are operations. These operations return a Tensor (most of the time). If you want proof of an op that doesn't return a Tensor, check this out.
- tf.Tensor: This is similar to a variable in the sense that it has an initial value. However, once defined, its value cannot be changed (i.e. it is immutable). For example, tf_c in the previous example is a tf.Tensor.
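Here is a small, illustrative sketch (TF 1.x style; the names tf_x, tf_w and tf_y are made up for this example) that ties all four types together:
import tensorflow as tf  # TF 1.x

tf_x = tf.placeholder(dtype=tf.float32, shape=[None, 3])  # placeholder: filled at runtime
tf_w = tf.Variable(tf.ones(shape=[3, 1]))                 # variable: initialized, mutable parameter
tf_y = tf.matmul(tf_x, tf_w)                              # tf.matmul is an Operation; tf_y is a tf.Tensor

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())           # variables need explicit initialization in TF1
    print(sess.run(tf_y, feed_dict={tf_x: [[1.0, 2.0, 3.0]]}))  # [[6.]]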
TF2 Data types / Data structures
- tf.Variable
- tf.Tensor
- tf.Operation
In terms of behavior, not much has changed in the data types going from TF1 to TF2. The only main difference is that tf.placeholders are gone. You can also have a look at the full list of data types.
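And a minimal TF2 sketch of the same types in action (eager execution, no placeholders or Session):
import tensorflow as tf  # TF 2.x

v = tf.Variable(3.0)          # tf.Variable: mutable, e.g. a model parameter
t = v * 2.0                   # the multiply Operation runs eagerly and returns a tf.Tensor
v.assign_add(1.0)             # variables can be updated in place
print(t.numpy(), v.numpy())   # 6.0 4.0 -- t keeps the value computed before the update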
What is Keras and how does that fit in all these?
Keras used to be a separate library providing high-level implementations of components (e.g. layers and models) that are mainly used for deep learning models. But in later versions of TensorFlow, Keras got integrated into TensorFlow.
So as I explained, Keras hides a lot of the unnecessary intricacies you would have to deal with if you were to work with bare-bones TensorFlow. There are two main things Keras offers: Layer objects and Model objects for implementing NNs. Keras also has two very common model APIs that let you develop models: the Sequential API and the Functional API. Let's see how different Keras and TensorFlow are in a quick example. Let's build a simple CNN.
Tip: Keras lets you achieve what you can do with TF much more easily. Keras also provides capabilities that are not yet strong in TF (e.g. text processing capabilities).
height = 64
width = 64
n_channels = 3
n_outputs = 10
Keras (Sequential API) example
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(2,2),
                 activation='relu', input_shape=(height, width, n_channels)))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(filters=64, kernel_size=(2,2), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(n_outputs, activation='softmax'))
model.compile(loss='binary_crossentropy', optimizer='adam')
model.summary()
Pros
Straight-forward to implement simple models
Cons
Cannot be used to implement complex models (e.g. models with multiple inputs)
Keras (Functional API) example
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.models import Model

inp = Input(shape=(height, width, n_channels))
out = Conv2D(filters=32, kernel_size=(2,2), activation='relu')(inp)
out = MaxPooling2D(pool_size=(2,2))(out)
out = Conv2D(filters=64, kernel_size=(2,2), activation='relu')(out)
out = MaxPooling2D(pool_size=(2,2))(out)
out = Flatten()(out)
out = Dense(n_outputs, activation='softmax')(out)
model = Model(inputs=inp, outputs=out)
model.compile(loss='binary_crossentropy', optimizer='adam')
model.summary()
Pros
Can be used to implement complex models involving multiple inputs and outputs
Cons
Requires a very good understanding of the shapes of the inputs and outputs, and of what each layer expects as input
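To make the "multiple inputs" point concrete, here is a minimal, hypothetical two-input model sketched with the Functional API (the branch names and sizes are made up for illustration):
from tensorflow.keras.layers import Input, Dense, Concatenate
from tensorflow.keras.models import Model

img_feat = Input(shape=(128,))   # hypothetical features from an image branch
txt_feat = Input(shape=(32,))    # hypothetical features from a text branch
merged = Concatenate()([img_feat, txt_feat])
out = Dense(10, activation='softmax')(merged)

multi_input_model = Model(inputs=[img_feat, txt_feat], outputs=out)
multi_input_model.summary()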
TF1 example
# Input
tf_in = tf.placeholder(shape=[None, height, width, n_channels], dtype=tf.float32)

# 1st conv and max pool
conv1 = tf.Variable(tf.initializers.glorot_uniform()([2,2,3,32]))
tf_out = tf.nn.conv2d(tf_in, filters=conv1, strides=[1,1,1,1], padding='SAME')      # 64, 64
tf_out = tf.nn.max_pool2d(tf_out, ksize=[2,2], strides=[1,2,2,1], padding='SAME')   # 32, 32

# 2nd conv and max pool
conv2 = tf.Variable(tf.initializers.glorot_uniform()([2,2,32,64]))
tf_out = tf.nn.conv2d(tf_out, filters=conv2, strides=[1,1,1,1], padding='SAME')     # 32, 32
tf_out = tf.nn.max_pool2d(tf_out, ksize=[2,2], strides=[1,2,2,1], padding='SAME')   # 16, 16
tf_out = tf.reshape(tf_out, [-1, 16*16*64])

# Dense layer
dense = tf.Variable(tf.initializers.glorot_uniform()([16*16*64, n_outputs]))
tf_out = tf.matmul(tf_out, dense)
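Note that the code above only builds the graph. As a rough sketch, executing it would still require a Session, for example:
import numpy as np

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # initialize conv1, conv2 and dense
    out = sess.run(tf_out, feed_dict={tf_in: np.random.normal(size=(8, height, width, n_channels))})
    print(out.shape)  # (8, 10)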
Pros
Is very good for cutting edge research involving atypical operations (e.g. changing the sizes of layers dynamically)
Cons
Poor readability
Caveats and Gotchas
Here I will list a few things you have to watch out for when using TF (from my own experience).
TF1 - Forgetting to feed all the dependent placeholders to compute the result
tf_a = tf.placeholder(dtype=tf.float32)
tf_b = tf.placeholder(dtype=tf.float32)
tf_c = tf.add(tf_a, tf.math.multiply(tf_b, 2.0))
with tf.Session() as sess:
    c = sess.run(tf_c, feed_dict={tf_a: 5.0})
    print(c)
InvalidArgumentError: You must feed a value for placeholder tensor 'Placeholder_8' with dtype float [[node Placeholder_8 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]
The reason you get an error here is that you haven't fed a value to tf_b. So make sure you feed values to all the dependent placeholders to compute a result.
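For example, feeding values for both placeholders makes the same graph run without the error:
with tf.Session() as sess:
    c = sess.run(tf_c, feed_dict={tf_a: 5.0, tf_b: 2.0})
    print(c)  # 9.0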
TF1 - Be very very careful of data types
tf_a = tf.placeholder(dtype=tf.int32)
tf_b = tf.placeholder(dtype=tf.float32)
tf_c = tf.add(tf_a, tf.math.multiply(tf_b, 2.0))
with tf.Session() as sess:
    c = sess.run(tf_c, feed_dict={tf_a: 5, tf_b: 2.0})
    print(c)
TypeError: Input 'y' of 'Add' Op has type float32 that does not match type int32 of argument 'x'.
Can you spot the error? You have to match data types when passing them to operations. Otherwise, use the tf.cast() operation to cast your data to a compatible data type.
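As a sketch, one possible fix is to cast the int32 placeholder to float32 before the add:
tf_c = tf.add(tf.cast(tf_a, tf.float32), tf.math.multiply(tf_b, 2.0))
with tf.Session() as sess:
    c = sess.run(tf_c, feed_dict={tf_a: 5, tf_b: 2.0})
    print(c)  # 9.0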
Keras - Understand what input shape each layer expects
model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(2,2),
                 activation='relu', input_shape=(height, width)))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(filters=64, kernel_size=(2,2), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(n_outputs, activation='softmax'))
model.compile(loss='binary_crossentropy', optimizer='adam')
ValueError: Input 0 of layer conv2d_8 is incompatible with the layer: expected ndim=4, found ndim=3. Full shape received: [None, 64, 64]
Here, you have defined an input of shape [None, height, width] (once you add the batch dimension). But Conv2D expects a 4D input [None, height, width, n_channels]. Therefore you get the error above. Some commonly misunderstood/error-prone layers are:
- Conv2D layer - Expects a 4D input [None, height, width, n_channels]. To know about the convolution layer/operation in more detail, have a look at this answer
- LSTM layer - Expects a 3D input [None, timesteps, n_dim]
- ConvLSTM2D layer - Expects a 5D input [None, timesteps, height, width, n_channels]
- Concatenate layer - Except for the axis the data is concatenated on, all other dimensions need to be the same
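For the example above (reusing the definitions from earlier), the fix is simply to include the channel dimension in input_shape so that Conv2D sees a 4D batch:
model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(2,2),
                 activation='relu', input_shape=(height, width, n_channels)))  # -> [None, height, width, n_channels]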
Keras - Feeding in the wrong input/output shape during fit()
import numpy as np

height = 64
width = 64
n_channels = 3
n_outputs = 10

Xtrain = np.random.normal(size=(500, height, width, 1))
Ytrain = np.random.choice([0,1], size=(500, n_outputs))

# Build the model (e.g. the Sequential model from above)
# Fit network
model.fit(Xtrain, Ytrain, epochs=10, batch_size=32, verbose=0)
ValueError: Error when checking input: expected conv2d_9_input to have shape (64, 64, 3) but got array with shape (64, 64, 1)
You should know this one. We are feeding an input of shape [batch size, height, width, 1] when we should be feeding an input of shape [batch size, height, width, 3].
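As a sketch, the fix here is to make the training data match the shapes the model was built for:
Xtrain = np.random.normal(size=(500, height, width, n_channels))  # 3 channels, matching input_shape
Ytrain = np.random.choice([0,1], size=(500, n_outputs))
model.fit(Xtrain, Ytrain, epochs=10, batch_size=32, verbose=0)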
Performance differences between TF1 and TF2
This has already been discussed here, so I will not reiterate what's in there.
Things I wish I could have talked about but couldn't
I'm leaving this with some links to further reading.