Issue type: Bug

Unexpected steps_per_epoch behavior in model.fit (tensorflow, CLOSED)

varshad18 commented on April 27, 2024

Unexpected steps_per_epoch behavior in model.fit

from tensorflow.

Comments (8)

sushreebarsa commented on April 27, 2024

@varshad18 Could you please double-check your calculation for steps_per_epoch? Please make sure it accounts for both the total number of samples in your dataset and the batch size.
To expedite troubleshooting, please provide a code snippet that reproduces the issue reported here. Thank you!

NBCBM commented on April 27, 2024

epochs = 300
BATCH_SIZE = 4
train_size = trainImageTotalData
valid_size = validationImageTotalData

# Calculate steps per epoch and validation steps
steps_per_epoch = train_size // BATCH_SIZE
validation_steps = valid_size // BATCH_SIZE

# Optionally, you can adjust steps_per_epoch and validation_steps based on whether your dataset is shuffled.
# If your dataset is shuffled during training, you might want to set steps_per_epoch to None
# and let it automatically determine the number of steps based on the dataset size and batch size.
hist = model.fit(
    x=[trainNumericData, trainImagesSBData, trainImagesCBData, trainImagesWBData, trainImagesHBData, trainImagesLLData, trainImagesLBData, trainImagesUpLeftABData, trainImagesUpRightABData, trainImagesALeftLData, trainImagesARightLData],
    y=trainAllRegressionData,
    epochs=epochs,
    validation_data=([validationNumericData, validationImagesSBData, validationImagesCBData, validationImagesWBData, validationImagesHBData, validationImagesLLData, validationImagesLBData, validationImagesUpLeftABData, validationImagesUpRightABData, validationImagesALeftLData, validationImagesARightLData], validationAllRegressionData),
    steps_per_epoch=steps_per_epoch,  # Set steps_per_epoch
    validation_steps=validation_steps,  # Set validation_steps
    callbacks=[mc, tensorboard_callback]).history

# Instead of manually setting steps_per_epoch and validation_steps, you can let it automatically determine based on the dataset size and batch size
hist = model.fit(
    x=[trainNumericData, trainImagesSBData, trainImagesCBData, trainImagesWBData, trainImagesHBData, trainImagesLLData, trainImagesLBData, trainImagesUpLeftABData, trainImagesUpRightABData, trainImagesALeftLData, trainImagesARightLData],
    y=trainAllRegressionData,
    epochs=epochs,
    validation_data=([validationNumericData, validationImagesSBData, validationImagesCBData, validationImagesWBData, validationImagesHBData, validationImagesLLData, validationImagesLBData, validationImagesUpLeftABData, validationImagesUpRightABData, validationImagesALeftLData, validationImagesARightLData], validationAllRegressionData),
    callbacks=[mc, tensorboard_callback]).history
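
A note on the second call above: neither call passes batch_size, and for in-memory array inputs Keras documents a default batch size of 32 when it is left unspecified. The sketch below is plain Python, not Keras code, and only illustrates how many steps would be inferred per epoch under that assumption. The helper name inferred_steps and the sample count of 714 (the training size reported later in this thread) are used purely for illustration.

```python
import math

# Documented Keras default when batch_size is not passed for array inputs.
KERAS_DEFAULT_BATCH_SIZE = 32


def inferred_steps(num_samples, batch_size=KERAS_DEFAULT_BATCH_SIZE):
    """Steps needed to see every sample once, including a final partial batch.

    Hypothetical helper for illustration; it is not part of the Keras API.
    """
    return math.ceil(num_samples / batch_size)


num_samples = 714  # illustrative value, taken from later in the thread

print(inferred_steps(num_samples))     # default batch size of 32
print(inferred_steps(num_samples, 4))  # batch size of 4, as in BATCH_SIZE
```

If the intent is to train with batches of 4, passing batch_size=BATCH_SIZE to model.fit is probably needed in both calls, since a steps_per_epoch computed for batches of 4 does not match batches of 32.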

sushreebarsa commented on April 27, 2024

@varshad18 Could you please share the complete code in a notebook or gist to replicate the issue reported here?
Thank you!

varshad18 commented on April 27, 2024

@sushreebarsa I double-checked my calculation for steps_per_epoch and tried using the following formula:

import math

BATCH_SIZE = 4
train_size = trainImageTotalData
valid_size = validationImageTotalData

print("train size " + str(train_size))
print("valid size " + str(valid_size))
print("batch size " + str(BATCH_SIZE))
steps_per_epoch = (train_size / BATCH_SIZE)
validation_steps = (valid_size / BATCH_SIZE)
print("steps_per_epoch ="+str(steps_per_epoch))
print("validation_steps ="+str(validation_steps))

steps_per_epoch = math.ceil(steps_per_epoch)
#steps_per_epoch=steps_per_epoch-1
validation_steps = math.ceil(validation_steps)
#validation_steps=validation_steps-1
print("steps_per_epoch ="+str(steps_per_epoch))
print("validation_steps ="+str(validation_steps))

train size 714
valid size 89
batch size 4
steps_per_epoch =178.5
validation_steps =22.25
steps_per_epoch =179
validation_steps =23
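
To make the arithmetic above concrete, here is a small plain-Python sketch (no TensorFlow involved) that lists the batch sizes produced by splitting the reported sample counts into batches of 4. The helper batch_sizes and its drop_remainder flag are hypothetical names introduced for illustration only.

```python
def batch_sizes(num_samples, batch_size, drop_remainder=False):
    """Return the size of each batch when num_samples is split sequentially.

    Hypothetical helper for illustration; not a Keras or TensorFlow function.
    """
    full_batches, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full_batches
    if remainder and not drop_remainder:
        sizes.append(remainder)  # final, smaller batch
    return sizes


train = batch_sizes(714, 4)
print(len(train), train[-1])       # 179 2  -> 178 full batches plus one batch of 2

train_dropped = batch_sizes(714, 4, drop_remainder=True)
print(len(train_dropped), sum(train_dropped))  # 178 712 -> two samples unused per epoch

valid = batch_sizes(89, 4)
print(len(valid), valid[-1])       # 23 1   -> 22 full batches plus one batch of 1
```

So math.ceil gives 179 and 23 steps with a short final batch, while floor division gives 178 and 22 steps and skips the leftover samples.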

This worked for me: training runs for all 179 steps with no errors. However, the most common approach seems to be excluding the last incomplete batch from each epoch. When I try to exclude it by using steps_per_epoch - 1, I get the following error:

KeyError: 'Failed to format this callback filepath: "/content/drive/MyDrive/FashionBody/Regression/TrainingRun/300Run2.0/checkpoint-{epoch:02d}-{val_loss:.2f}.tf". Reason: 'val_loss''

Is it okay to train 179 steps according to my train size? Or am I doing something wrong?
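
The error message above says the checkpoint filepath could not be formatted because of 'val_loss', which suggests that no val_loss entry was available in the epoch logs when the callback tried to build the file name. The thread does not confirm why validation results were missing in that run. The sketch below only reproduces the formatting failure with plain Python string formatting; the shortened template, the log dictionaries, and the helper format_checkpoint_path are illustrative assumptions, not the callback's actual code.

```python
# Shortened version of the filename pattern from the error message.
template = "checkpoint-{epoch:02d}-{val_loss:.2f}.tf"

# Illustrative epoch logs: one with validation metrics, one without.
logs_with_val = {"loss": 0.5, "val_loss": 0.4321}
logs_without_val = {"loss": 0.5}


def format_checkpoint_path(template, epoch, logs):
    """Fill the template from the epoch number and the logged metrics."""
    return template.format(epoch=epoch, **logs)


print(format_checkpoint_path(template, 1, logs_with_val))  # checkpoint-01-0.43.tf

try:
    format_checkpoint_path(template, 1, logs_without_val)
except KeyError as err:
    # str.format raises KeyError for a placeholder with no matching key.
    print("missing key:", err)  # missing key: 'val_loss'
```

A filepath that only uses {epoch} would not hit this failure, but it would also hide the underlying question of why validation did not produce val_loss for that epoch.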

sushreebarsa commented on April 27, 2024

@varshad18 Could you please confirm whether you are still using Keras 2? If so, please migrate to Keras 3 and follow this documentation here. Thank you!

github-actions commented on April 27, 2024

This issue is stale because it has been open for 7 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions commented on April 27, 2024

This issue was closed because it has been inactive for 7 days since being marked as stale. Please reopen if you'd like to work on this further.

google-ml-butler commented on April 27, 2024

Are you satisfied with the resolution of your issue?
Yes
No
