(Python) tf.keras.utils.audio_dataset_from_directory does not return correct labels

I am trying to study the code from [a TensorFlow tutorial on audio recognition](https://www.tensorflow.org/tutorials/audio/simple_audio), but it seems that the tf.keras.utils.audio_dataset_from_directory() function does not output labels that correspond to the correct data. Here is the relevant code from the tutorial:

First, set the seed and download the data.

import pathlib

import numpy as np
import tensorflow as tf

# Set the seed value for experiment reproducibility.
seed = 42
tf.random.set_seed(seed)
np.random.seed(seed)

DATASET_PATH = 'data/mini_speech_commands'

data_dir = pathlib.Path(DATASET_PATH)
if not data_dir.exists():
  tf.keras.utils.get_file(
      'mini_speech_commands.zip',
      origin="http://storage.googleapis.com/download.tensorflow.org/data/mini_speech_commands.zip",
      extract=True,
      cache_dir='.', cache_subdir='data')

Then split that data into train and validation data using tf.keras.utils.audio_dataset_from_directory(), and squeeze it.

train_ds, val_ds = tf.keras.utils.audio_dataset_from_directory(
    directory=data_dir,
    labels="inferred",
    batch_size=64,
    validation_split=0.2,
    seed=0,
    output_sequence_length=16000,
    subset='both')

def squeeze(audio, labels):
  # Drop the trailing channel axis: (batch, samples, 1) -> (batch, samples).
  audio = tf.squeeze(audio, axis=-1)
  return audio, labels

train_ds = train_ds.map(squeeze, tf.data.AUTOTUNE)
val_ds = val_ds.map(squeeze, tf.data.AUTOTUNE)
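
For reference when reading the printed labels: with labels="inferred", the Keras documentation says the integer labels follow the alphanumeric order of the class subdirectory names, unless class_names is passed. Below is a minimal sketch of that mapping using only the standard library. The helper name inferred_label_map and the four-word demo layout are made up for illustration and are not part of Keras; plain sorted() is assumed to match the ordering Keras uses for simple lowercase names like these.

```python
import pathlib
import tempfile


def inferred_label_map(directory):
    """Hypothetical helper: map each class subdirectory to an integer index.

    Subdirectory names are sorted, so the first name in sorted order gets 0.
    Regular files at the top level (e.g. a README) are ignored.
    """
    subdirs = sorted(
        p.name for p in pathlib.Path(directory).iterdir() if p.is_dir()
    )
    return {name: index for index, name in enumerate(subdirs)}


# Demo on a throwaway directory with a made-up subset of word folders.
with tempfile.TemporaryDirectory() as tmp:
    for word in ["yes", "down", "no", "go"]:
        (pathlib.Path(tmp) / word).mkdir()
    (pathlib.Path(tmp) / "README.md").write_text("not a class folder")
    demo_map = inferred_label_map(tmp)

print(demo_map)  # {'down': 0, 'go': 1, 'no': 2, 'yes': 3}
```

Calling inferred_label_map(data_dir) on the downloaded dataset should give a table for translating each integer in example_labels back to a word, which can then be compared with what is heard in the corresponding audio clip.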

However, when I iterate through train_ds with code like the following, it always prints the EXACT same list of labels, even though the example audio is different on every run. As a result, the labels do not match the audio.

for example_audio, example_labels in train_ds.take(1):
  print(example_audio)   # the example_audio tensor is different every time I run this block
  print(example_labels)  # example_labels is always the same (and incorrect)

I cannot turn shuffle off when using audio_dataset_from_directory(), because the training dataset would then not contain audio samples of all the possible words.

I have also tried something like train_ds = train_ds.shuffle(buffer_size=len(train_ds), seed=None, reshuffle_each_iteration=False) to no avail.

It would be incredibly helpful if anyone could figure out why the labels and audio do not match, or at least why only the audio is being shuffled and not the labels. Please tell me if I'm missing anything or didn't explain something well enough; I'm just getting started in this field of study. Thank you in advance.
