By - Dry_Back_1116
Mhm, no, not really. If you have 200 samples, I would go for a batch size of 200 and see how the metrics change over 10 epochs. It often makes no sense to bump up the number of epochs without knowing your domain exactly: it could be that your model learns for 10 epochs and then diverges.
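If you want to try that literally, here's a minimal sketch of what "batch size = dataset size for 10 epochs" looks like. Everything here is made up for illustration (synthetic data, logistic regression, the learning rate), not OP's actual setup:

```python
import numpy as np

# Toy setup, purely illustrative: 200 synthetic samples, logistic
# regression trained with full-batch gradient descent for 10 epochs.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X @ rng.normal(size=5) > 0).astype(float)

w = np.zeros(5)
lr = 0.5
losses = []
for epoch in range(10):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))   # sigmoid predictions
    losses.append(-np.mean(y * np.log(p + 1e-9)
                           + (1 - y) * np.log(1 - p + 1e-9)))
    w -= lr * X.T @ (p - y) / len(X)     # one full-batch step per epoch

print([round(l, 3) for l in losses])     # watch whether it keeps dropping
```

If the logged losses flatten out early, more epochs are buying you nothing; if they start climbing, you've hit the divergence case above.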
Why don't you perform an experiment? This is computer *science* after all :D
The interaction between batch size, epochs, and model convergence is tricky, so I won't go into too much detail. However, I will highlight a few things you should keep track of in your experiments.
First, get a test set! With machine learning and deep learning, more epochs will almost always reduce training loss/error and increase training accuracy. However, that may just be overfitting.
Secondly, plot the training loss and test loss at the end of every epoch (or after every gradient update, whatever you want). The training loss shows how quickly you are converging, which is useful for judging how many epochs you actually need. The test loss will reveal overfitting: if it goes down and then comes back up, you are overfitting.
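For what it's worth, the "down, then back up" check is easy to automate once you're logging both curves. The numbers below are fabricated just to show the bookkeeping, not results from a real run:

```python
# Made-up loss curves: val/test loss dips, then climbs once the
# model starts overfitting the training set.
train_loss = [0.90, 0.60, 0.40, 0.30, 0.22, 0.17, 0.13, 0.10]
val_loss   = [0.95, 0.70, 0.55, 0.50, 0.48, 0.52, 0.60, 0.70]

# Epoch where held-out loss bottoms out, and whether it rose afterwards.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
overfitting = val_loss[-1] > val_loss[best_epoch]
print(best_epoch, overfitting)  # here the loss bottoms out at epoch 4
```

The epoch where the held-out loss bottoms out is roughly how many epochs you actually needed.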
Finally, larger batch sizes decrease computation time per epoch and tend to speed up convergence, so generally bigger is better. Also, keep your batch size a power of 2 (e.g. 8, 16, 64, 256, 1024) for weird black-magic related reasons (GPU kernels tend to be tuned for those sizes).
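If it helps to see how a batch size carves up a dataset, here's a tiny helper (the function name is mine, nothing standard):

```python
def iter_batches(n_samples, batch_size):
    """Yield (start, end) index ranges covering the dataset in order."""
    for start in range(0, n_samples, batch_size):
        yield start, min(start + batch_size, n_samples)

# 200 samples with a batch size of 64: three full batches plus a
# leftover batch of 8 (which many training loops simply drop).
sizes = [end - start for start, end in iter_batches(200, 64)]
print(sizes)  # [64, 64, 64, 8]
```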
I'd be curious to see the results of your experiment! If you want, feel free to let me know how it went.
You mean validation set, right? *Technically*, you are not supposed to touch the test set.
Hahaha, yes! You are completely correct. Since this is hyperparameter tuning, it should all be done on a *validation* set. Though I think OP will have bigger problems, and for the sake of just getting a model working and getting comfortable with the tradeoff between batch size and epochs, I don't think it matters a lot.
Once OP has figured out how the parameters impact training, then they should get a proper ML pipeline going.
If your machine's memory can fit a batch size equal to the dataset's size, then go for it, IMO. The mini-batch approach is just an approximation for when memory can't fit all training samples in one go. The number of epochs depends on multiple other factors. I think you can increase it as much as you want if you are applying early stopping, so training halts as soon as the model stops improving.
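Early stopping is simple to sketch. Here's one common "patience" variant as a toy implementation of my own, not from any particular library:

```python
def early_stopping(val_losses, patience=3):
    """Return the epoch at which training should stop: the first epoch
    where validation loss hasn't improved for `patience` epochs."""
    best, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch  # stop here; keep the weights from best_epoch
    return len(val_losses) - 1  # never triggered: trained to the end

# Made-up curve: improves until epoch 2, then stalls for 3 epochs.
print(early_stopping([0.9, 0.7, 0.6, 0.61, 0.62, 0.65, 0.7]))  # 5
```

With this in place, a large epoch count is just an upper bound, since training bails out early when it stops paying off.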
I read before that increasing your batch size beyond a certain point has minimal effect (sorry, I don't remember the source). In model training, a small batch size can actually help because it introduces noise, which can help converge to a better solution.
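The noise part is easy to check numerically: the variance of a mini-batch average (a stand-in for a mini-batch gradient estimate) shrinks as the batch grows. A quick sanity check on synthetic numbers, nothing more:

```python
import numpy as np

# Synthetic "per-sample gradients": just i.i.d. normal draws.
rng = np.random.default_rng(42)
samples = rng.normal(size=100_000)

variances = []
for batch_size in (8, 64, 512):
    usable = (len(samples) // batch_size) * batch_size
    batch_means = samples[:usable].reshape(-1, batch_size).mean(axis=1)
    variances.append(batch_means.var())
    print(batch_size, variances[-1])  # roughly 1 / batch_size
```

So small batches really do give noisier steps; whether that noise helps or hurts depends on the problem.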
This paper is also very interesting on the subject of batch sizes:
This is helpful. Thank you.