-
I don't recommend using ReLU, since the ReLU function is not differentiable at x=0 and its derivative is discontinuous between x<0 and x>0. This matters because the force is the negative gradient of the energy, so a non-smooth activation leads to non-smooth forces.
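To make this concrete, here is a minimal standalone sketch (plain NumPy, not DeePMD-kit code) that treats the activation output as a toy energy and differentiates it numerically as F = -dE/dx: ReLU produces a jump in the force at x = 0, while a smooth activation such as tanh does not.

```python
# Minimal sketch (plain NumPy, not DeePMD-kit code): treat the activation
# output as a toy "energy" E(x) and compute the force F = -dE/dx with a
# central finite difference.
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

x = np.linspace(-2.0, 2.0, 9)   # sample points straddling x = 0
h = 1e-6                        # finite-difference step

for name, f in [("relu", relu), ("tanh", np.tanh)]:
    force = -(f(x + h) - f(x - h)) / (2.0 * h)
    print(f"{name:5s}", np.round(force, 3))

# relu: the force jumps from 0 to -1 across x = 0 (its derivative is discontinuous).
# tanh: the force varies smoothly through x = 0, which is what a potential-energy
#       model needs if the predicted forces are to be continuous.
```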
-
Hello,
I am training a model with the "se_a" descriptor type, using "ReLU" as the activation function and a batch size of 1. The problem is that the loss converges slowly and there is a significant gap between the training and test losses (after 170,000 batches, the training loss is 12 and the test loss is 13). The force losses are also increasing. What is your advice for better convergence?
I have attached a plot of the total and force losses from "lcurve.out". I would appreciate any help with this issue.
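For reference, below is a hedged sketch (not taken from the thread) of the input.json fragments that correspond to the setup described above: the "se_a" descriptor, a "relu" activation function, and a batch size of 1. The key names follow my recollection of the DeePMD-kit input schema and should be checked against the documentation for your version.

```python
# Hypothetical input.json fragments for the setup described in the question.
# Key names are assumptions based on the DeePMD-kit se_a schema; verify them
# against the docs for your installed version.
import json

config_fragment = {
    "model": {
        "descriptor": {
            "type": "se_a",
            "activation_function": "relu",  # the reply above suggests a smooth choice such as "tanh"
        },
        "fitting_net": {
            "activation_function": "relu",
        },
    },
    "training": {
        "batch_size": 1,
    },
}

print(json.dumps(config_fragment, indent=2))
```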