Comments (8)
@varshad18 Could you please double-check your calculation for steps_per_epoch? Kindly ensure it accounts for the total number of samples in your dataset and the batch size.
To expedite troubleshooting, please provide a code snippet that reproduces the issue reported here. Thank you!
from tensorflow.
epochs = 300
BATCH_SIZE = 4
train_size = trainImageTotalData
valid_size = validationImageTotalData

# Calculate steps per epoch and validation steps
steps_per_epoch = train_size // BATCH_SIZE
validation_steps = valid_size // BATCH_SIZE

# Optionally, you can adjust steps_per_epoch and validation_steps based on whether your dataset is shuffled.
# If your dataset is shuffled during training, you might want to set steps_per_epoch to None
# and let it automatically determine the number of steps based on the dataset size and batch size.
hist = model.fit(
    x=[trainNumericData, trainImagesSBData, trainImagesCBData, trainImagesWBData,
       trainImagesHBData, trainImagesLLData, trainImagesLBData, trainImagesUpLeftABData,
       trainImagesUpRightABData, trainImagesALeftLData, trainImagesARightLData],
    y=trainAllRegressionData,
    # Assumption: pass the batch size explicitly; for array inputs fit() otherwise defaults to 32
    batch_size=BATCH_SIZE,
    epochs=epochs,
    validation_data=(
        [validationNumericData, validationImagesSBData, validationImagesCBData, validationImagesWBData,
         validationImagesHBData, validationImagesLLData, validationImagesLBData, validationImagesUpLeftABData,
         validationImagesUpRightABData, validationImagesALeftLData, validationImagesARightLData],
        validationAllRegressionData),
    steps_per_epoch=steps_per_epoch,    # Set steps_per_epoch
    validation_steps=validation_steps,  # Set validation_steps
    callbacks=[mc, tensorboard_callback]).history

# Instead of manually setting steps_per_epoch and validation_steps, you can let fit()
# determine them automatically from the dataset size and batch size.
hist = model.fit(
    x=[trainNumericData, trainImagesSBData, trainImagesCBData, trainImagesWBData,
       trainImagesHBData, trainImagesLLData, trainImagesLBData, trainImagesUpLeftABData,
       trainImagesUpRightABData, trainImagesALeftLData, trainImagesARightLData],
    y=trainAllRegressionData,
    batch_size=BATCH_SIZE,  # Assumption: same explicit batch size as above
    epochs=epochs,
    validation_data=(
        [validationNumericData, validationImagesSBData, validationImagesCBData, validationImagesWBData,
         validationImagesHBData, validationImagesLLData, validationImagesLBData, validationImagesUpLeftABData,
         validationImagesUpRightABData, validationImagesALeftLData, validationImagesARightLData],
        validationAllRegressionData),
    callbacks=[mc, tensorboard_callback]).history
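As a rough illustration of the two ways to count steps discussed in this thread (floor division drops the incomplete final batch, rounding up keeps it), here is a minimal plain-Python sketch. The helper names `steps_for` and `last_batch_size` are hypothetical and not part of Keras; the sample counts 714 and 89 are the ones reported later in this thread.

```python
import math


def steps_for(num_samples: int, batch_size: int, drop_remainder: bool = False) -> int:
    """Number of batches needed to go through num_samples once."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if drop_remainder:
        # Floor division: the incomplete final batch is skipped.
        return num_samples // batch_size
    # Round up: the incomplete final batch counts as one more step.
    return math.ceil(num_samples / batch_size)


def last_batch_size(num_samples: int, batch_size: int) -> int:
    """Size of the final batch when the remainder is kept."""
    remainder = num_samples % batch_size
    return remainder or batch_size


train_size, valid_size, batch_size = 714, 89, 4

print(steps_for(train_size, batch_size))                       # 179
print(steps_for(train_size, batch_size, drop_remainder=True))  # 178
print(steps_for(valid_size, batch_size))                       # 23
print(steps_for(valid_size, batch_size, drop_remainder=True))  # 22
print(last_batch_size(train_size, batch_size))                 # 2
print(last_batch_size(valid_size, batch_size))                 # 1
```

With these numbers, keeping the remainder means the final training batch holds 2 samples and the final validation batch holds 1, while dropping it skips those samples in each pass.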
@varshad18 Could you please share the complete code in a notebook or gist to replicate the issue reported here?
Thank you!
@sushreebarsa I double-checked my calculation for steps_per_epoch and tried using the following formula:
import math  # needed for math.ceil below

BATCH_SIZE = 4
train_size = trainImageTotalData
valid_size = validationImageTotalData
print("train size " + str(train_size))
print("valid size " + str(valid_size))
print("batch size " + str(BATCH_SIZE))
steps_per_epoch = (train_size / BATCH_SIZE)
validation_steps = (valid_size / BATCH_SIZE)
print("steps_per_epoch ="+str(steps_per_epoch))
print("validation_steps ="+str(validation_steps))
steps_per_epoch = math.ceil(steps_per_epoch)
#steps_per_epoch=steps_per_epoch-1
validation_steps = math.ceil(validation_steps)
#validation_steps=validation_steps-1
print("steps_per_epoch ="+str(steps_per_epoch))
print("validation_steps ="+str(validation_steps))
Output:
train size 714
valid size 89
batch size 4
steps_per_epoch =178.5
validation_steps =22.25
steps_per_epoch =179
validation_steps =23
This worked for me: training runs for all 179 steps with no errors. However, the most common approach is to exclude the last incomplete batch from each epoch, and when I try to do that by using (steps_per_epoch-1),
I get the following error:
KeyError: 'Failed to format this callback filepath: "/content/drive/MyDrive/FashionBody/Regression/TrainingRun/300Run2.0/checkpoint-{epoch:02d}-{val_loss:.2f}.tf". Reason: 'val_loss''
Is it okay to train 179 steps according to my train size? Or am I doing something wrong?
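The error text suggests that the checkpoint filepath is filled in from the metrics logged for the epoch, and that no val_loss value was available when the file name was formatted. The plain-Python sketch below is not the Keras implementation; it only reproduces that failure mode with `str.format`. The helper `format_checkpoint_path` and the metric values are hypothetical, and the template is the file name from the error message above with the directory omitted.

```python
# File-name template taken from the error message (directory part omitted).
filepath = "checkpoint-{epoch:02d}-{val_loss:.2f}.tf"


def format_checkpoint_path(template: str, epoch: int, logs: dict) -> str:
    """Fill the template from the epoch number and the logged metrics."""
    return template.format(epoch=epoch, **logs)


# Hypothetical metric values, for illustration only.
logs_with_validation = {"loss": 0.42, "val_loss": 0.57}
logs_without_validation = {"loss": 0.42}

print(format_checkpoint_path(filepath, 1, logs_with_validation))
# checkpoint-01-0.57.tf

try:
    format_checkpoint_path(filepath, 1, logs_without_validation)
except KeyError as err:
    # The missing placeholder name is reported, as in the thread's error.
    print("KeyError:", err)  # KeyError: 'val_loss'
```

If this is what happens here, the thing to check is whether validation metrics are actually produced in the epoch where the checkpoint is written, rather than the checkpoint path itself.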
@varshad18 Could you please confirm whether you are still using Keras 2? If so, please migrate to Keras 3 and follow this documentation here. Thank you!
This issue is stale because it has been open for 7 days with no activity. It will be closed if no further activity occurs. Thank you.
This issue was closed because it has been inactive for 7 days since being marked as stale. Please reopen if you'd like to work on this further.