Hi @sgrvinod
Thank you for the tutorial you posted on "Attention Is All You Need". I have a small question and would appreciate an answer.
In dataloader.py you group the sequences by length, so that each batch contains sequences of similar length. Is that necessary? I understand that it speeds up training and reduces memory use, since less padding is needed. My question is whether it has any effect on the model's final performance (i.e. the quality of the trained model) if I don't group the data by length.
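To make the question concrete, here is a minimal sketch of the effect I am referring to. It is my own toy code, not the tutorial's dataloader: `padding_fraction` is a hypothetical helper, the sequence lengths are randomly generated, and I assume each batch is padded to the length of its longest sequence.

```python
import random


def padding_fraction(lengths, batch_size, group_by_length):
    """Return the fraction of token slots that are padding.

    Assumes every batch is padded to the length of its longest sequence.
    """
    # Sorting by length puts similar-length sequences into the same batch.
    order = sorted(lengths) if group_by_length else list(lengths)
    total_slots = 0
    for i in range(0, len(order), batch_size):
        batch = order[i:i + batch_size]
        total_slots += max(batch) * len(batch)
    return 1 - sum(lengths) / total_slots


# Hypothetical data: 10,000 sequences with lengths between 5 and 100 tokens.
rng = random.Random(0)
lengths = [rng.randint(5, 100) for _ in range(10_000)]

grouped = padding_fraction(lengths, batch_size=32, group_by_length=True)
ungrouped = padding_fraction(lengths, batch_size=32, group_by_length=False)
print(f"padding with grouping:    {grouped:.1%}")
print(f"padding without grouping: {ungrouped:.1%}")
```

This only measures wasted padding, which is the speed and memory side that I already understand. It says nothing about whether the trained model ends up better or worse, which is the part I am asking about.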
Thanks