
PyTorch Reshaping with None

I am currently learning about the attention mechanism from the book Dive into Deep Learning. In the book, I came across the following implementation, used in the masked softmax:

    import torch

    def sequence_mask(X, valid_len, value=-1e6):
        """X is 2D array (number_of_points, maxlen), valid_len is 1D array (number_of_points)"""
        max_len = X.size(1)
        # Position indices, shape (1, maxlen), compared with lengths, shape (number_of_points, 1).
        mask = torch.arange(max_len, dtype=torch.float32, device=X.device)[None, :] < valid_len[:, None]
        # Overwrite, in place, every position at or beyond the valid length.
        X[~mask] = value
        return X

By sequential data processing, I mean natural language processing.
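To see what indexing with None does in the mask line, here is a small sketch with made-up values (the tensors `positions` and `valid_len` below are illustrative assumptions, not data from the book). Indexing with None inserts a new axis of size 1, so the comparison broadcasts a row of positions against a column of lengths:

```python
import torch

# Illustrative inputs: two sequences with valid lengths 1 and 3, max length 4.
valid_len = torch.tensor([1, 3])                   # shape (2,)
positions = torch.arange(4, dtype=torch.float32)   # shape (4,)

row = positions[None, :]   # None adds a leading axis: shape (1, 4)
col = valid_len[:, None]   # None adds a trailing axis: shape (2, 1)

# Broadcasting expands both operands to shape (2, 4) before comparing.
mask = row < col

print(row.shape, col.shape, mask.shape)
print(mask)
```

Each row of `mask` is True for positions below that sequence's valid length and False for the rest.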

To solve that problem, we fill the remaining positions with a special token. If you are not familiar with the broadcasting mechanism, you can read about it in the PyTorch documentation. To be honest, I find an explicit reshape more readable, so my version of this function is:
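A reshape-based variant could look like the sketch below. The function name `sequence_mask_reshape`, the variable names, and the sample input are assumptions made for illustration; the only change from the book's function is that the `[None, :]` and `[:, None]` indexing is replaced by explicit `reshape` calls that produce the same shapes.

```python
import torch

def sequence_mask_reshape(X, valid_len, value=-1e6):
    """Hypothetical reshape-based variant of sequence_mask.

    X is a 2D tensor (number_of_points, max_len);
    valid_len is a 1D tensor (number_of_points).
    """
    max_len = X.size(1)
    # Row of position indices, shape (1, max_len); same as [None, :].
    positions = torch.arange(
        max_len, dtype=torch.float32, device=X.device
    ).reshape(1, max_len)
    # Column of valid lengths, shape (number_of_points, 1); same as [:, None].
    lengths = valid_len.reshape(-1, 1)
    # Broadcasts to (number_of_points, max_len).
    mask = positions < lengths
    # Overwrite, in place, every position at or beyond the valid length.
    X[~mask] = value
    return X

# Illustrative usage: two sequences of max length 4 with valid lengths 1 and 3.
X = torch.ones(2, 4)
out = sequence_mask_reshape(X, torch.tensor([1, 3]))
print(out)
```

Like the original, this modifies `X` in place and returns it, so pass a copy (for example `X.clone()`) if the unmasked tensor is still needed.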
