What batch size should I use when I have millions to billions of samples?
I have a time series dataset. Each sample is 605 (L, C), and I have about a billion samples (currently just testing with a million). What batch size should I use? Is the usual range of 64 to 512 still appropriate, or is that too small?
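To make the setup concrete, here is a rough sketch of the loading code in question (the TimeSeriesDataset class, the channel count of 4, and batch_size=256 are illustrative placeholders, not the real pipeline):

```python
import torch
from torch.utils.data import Dataset, DataLoader

class TimeSeriesDataset(Dataset):
    """Placeholder: each sample is a (605, C) float tensor plus a label."""
    def __init__(self, data, labels):
        self.data = data      # in practice a memory-mapped or streamed array
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return self.data[idx], self.labels[idx]

# Synthetic stand-in; the real billion-sample set would be memory-mapped.
data = torch.randn(1000, 605, 4)
labels = torch.randint(0, 2, (1000,))

# The question: is a batch size in the usual 64-512 range still sensible here?
loader = DataLoader(TimeSeriesDataset(data, labels),
                    batch_size=256, shuffle=True, pin_memory=True)
```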
Neural network: how to handle the case where the input is variable-size data
I use PyTorch and a neural network to train a bond classifier. The problem is that the input data is variable size: the cash flows of bonds differ, so a short-term bond may have only a few cash flows while a long-term bond may have tens or hundreds. To handle this, I extend the input nodes to the maximum number of cash flows and pad missing values with 0. Another problem is that there is a lot of time series data, and the datetime is important for training: each cash flow happens on a certain date, but the dates vary from bond to bond. For the network's input nodes, I also extend the datetimes to the maximum number of cash flows and pad missing values with 0. Suppose the maximum number of cash flows is 100; then the input size is 200, like this (a sketch of the scheme follows the example):
[date1,date2,date3…date99,date100][cash1,cash2,cash3…cash99,cash100]
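A minimal sketch of this padding scheme, assuming the dates are first converted to a numeric offset such as days since issue (encode_bond and MAX_CF are illustrative names, not from the original code):

```python
import torch

MAX_CF = 100  # maximum number of cash flows, as in the scheme above

def encode_bond(dates, cashflows):
    """Pad the date and cash-flow sequences to MAX_CF entries and
    concatenate them into one fixed-size input vector:
    [date1..date100][cash1..cash100]."""
    n = len(cashflows)
    date_vec = torch.zeros(MAX_CF)
    cash_vec = torch.zeros(MAX_CF)
    date_vec[:n] = torch.as_tensor(dates, dtype=torch.float32)
    cash_vec[:n] = torch.as_tensor(cashflows, dtype=torch.float32)
    return torch.cat([date_vec, cash_vec])  # shape (200,)

# A short-term bond with three cash flows (dates as days since issue)
x = encode_bond([182, 365, 547], [2.5, 2.5, 102.5])
```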
Why is out_features in a Linear layer sometimes higher than in_features?
I understand that out_features in a Linear layer is often lower than in_features in order to distill more meaningful features, but sometimes I see out_features higher than in_features, and sometimes equal.
For example, consider the architecture below from Swin Transformer V2 in PyTorch:
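The original layer printout is not preserved here; as a stand-in, this is a simplified sketch of the transformer MLP block where that pattern appears (Swin uses an mlp_ratio of 4 by default; the class and names here are illustrative, not Swin's actual code):

```python
import torch.nn as nn

class Mlp(nn.Module):
    """Simplified transformer MLP block: expand the width, then project back."""
    def __init__(self, dim, mlp_ratio=4.0):
        super().__init__()
        hidden = int(dim * mlp_ratio)
        self.fc1 = nn.Linear(dim, hidden)  # out_features > in_features: expand
        self.act = nn.GELU()
        self.fc2 = nn.Linear(hidden, dim)  # out_features < in_features: project back

    def forward(self, x):
        return self.fc2(self.act(self.fc1(x)))
```

The expansion gives the nonlinearity a higher-dimensional space in which to mix features before the second layer compresses the result back to the original width, which is why out_features > in_features is common in such blocks.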
ANN training in PyTorch gives me an unchanged loss function
Hello, I am learning PyTorch. My code snippet is given below.
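The snippet itself did not survive formatting. For reference only (this is not the asker's code), here is a minimal training loop in which the loss does change; a flat loss most often means one of the marked lines is missing, the learning rate is zero, or the graph is detached somewhere:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

x = torch.randn(100, 10)
y = torch.randn(100, 1)

for epoch in range(10):
    optimizer.zero_grad()          # if missing: gradients accumulate across steps
    loss = criterion(model(x), y)
    loss.backward()                # if missing: no gradients are ever computed
    optimizer.step()               # if missing: the weights are never updated
    print(epoch, loss.item())
```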