Keras LSTM Layer for RNN – Struggling with Layer Inputs


I am taking the multi-layer perceptron function from Keras and trying to adapt it to other neural networks, such as an RNN with LSTM layers. The function comes from the Keras PPO example:

def mlp(x, sizes, activation=keras.activations.tanh, output_activation=None):
    # Build a feedforward neural network
    for size in sizes[:-1]:
        x = layers.Dense(units=size, activation=activation)(x)
    return layers.Dense(units=sizes[-1], activation=output_activation)(x)

which is then used to define the action logits and the actor and critic networks:

observation_input = keras.Input(shape=(observation_dimensions,), dtype="float32")

logits = mlp(observation_input, list(hidden_sizes) + [num_actions])

actor = keras.Model(inputs=observation_input, outputs=logits)

value = keras.ops.squeeze(mlp(observation_input, list(hidden_sizes) + [1]), axis=1)

critic = keras.Model(inputs=observation_input, outputs=value)
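For reference, my reading of the mlp function is that the loop unrolls into chained layer calls, something like the sketch below (placeholder values for observation_dimensions, hidden_sizes=(64, 64), and num_actions, since the example doesn't fix them):

```python
import keras
from keras import layers

observation_dimensions = 4  # placeholder, e.g. CartPole
num_actions = 2             # placeholder

inp = keras.Input(shape=(observation_dimensions,), dtype="float32")
# Each layer call returns a new tensor; x is simply rebound each iteration
h1 = layers.Dense(64, activation=keras.activations.tanh)(inp)
h2 = layers.Dense(64, activation=keras.activations.tanh)(h1)
logits = layers.Dense(num_actions)(h2)  # output layer, no activation
model = keras.Model(inputs=inp, outputs=logits)
```

Is that equivalence correct?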

My goal is to mimic that structure and adapt it to other neural networks, for example:

def rnn_lstm(x, sizes, activation=keras.activations.tanh, output_activation=None):
    # Build a recurrent neural network
    for size in sizes[:-1]:
        x = layers.LSTM(units=size, activation=activation)(x)
    # return layers.LSTM(units=sizes[-1], activation=output_activation)(x)
    return layers.Dense(units=sizes[-1], activation=output_activation)(x)

A side question first: could you clarify the use of the trailing (x)? Is it like a nested variable that appends a layer to the end? The way I read this function, it defines LSTM layers but ends with a Dense layer. Where this naïve adjustment fails is that I am struggling to make the input dimensions match. Specifically, the error is:

ValueError: Input 0 of layer "lstm_27" is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (None, 4)

I have read a few related StackOverflow answers about input shape. The difficulty is that I run into issues no matter how I change the observation_input variable. The Keras docs say the input tensor for an LSTM needs to be 3D, but what exactly is not 3D here?
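For what it's worth, I can get a single LSTM layer to accept the input if I declare it with an extra time axis, as in the sketch below (placeholder sizes; I am not sure this is the right adaptation for PPO):

```python
import keras
from keras import layers

observation_dimensions = 4  # placeholder, matches the (None, 4) in the error

# LSTM expects (batch, timesteps, features); adding a timesteps axis of 1
# turns the observation into a length-1 sequence of 4 features
seq_input = keras.Input(shape=(1, observation_dimensions), dtype="float32")
x = layers.LSTM(units=64, activation=keras.activations.tanh)(seq_input)
out = layers.Dense(units=2)(x)
model = keras.Model(inputs=seq_input, outputs=out)
```

But I don't understand whether a timesteps axis of 1 is meaningful here, or whether the environment observations themselves are supposed to supply the sequence dimension.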

Not to get distracted, but I do see that many people seem to prefer defining neural networks by calling Sequential and then add(); aside from this particular example, I have not seen the functional structure above anywhere else. Is there a reason for that?
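For comparison, the Sequential style I keep seeing would express the same feedforward network roughly like this (my own sketch with placeholder sizes, not taken from the PPO example):

```python
import keras
from keras import layers

model = keras.Sequential()
model.add(keras.Input(shape=(4,)))  # placeholder observation size
model.add(layers.Dense(64, activation="tanh"))
model.add(layers.Dense(64, activation="tanh"))
model.add(layers.Dense(2))  # placeholder number of actions
```

Is the functional style in the PPO example just a stylistic choice, or does it enable something Sequential cannot do?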