Deleting the duplicate `import tensorflow as tf` that ran after we train the baseline GPT model and before we train the Cerebros model appears to lower val_binary_accuracy on the Cerebros NLP model from 0.97 to 0.94.
We need to run two A/B tests that re-apply this duplicate import: one with the zero mask and one without it.
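To make the comparison explicit, the conditions can be enumerated as a 2x2 grid. This is only a sketch: the factor names `reimport_tensorflow` and `zero_mask` are hypothetical labels, not identifiers from the Cerebros code. The two cells with the re-import enabled are the new runs proposed above; the other two correspond to the runs already made after the import was deleted.

```python
from itertools import product

# Hypothetical factor names; the issue only describes these conditions in prose.
FACTORS = {
    "reimport_tensorflow": (True, False),
    "zero_mask": (True, False),
}


def ab_test_grid(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


grid = ab_test_grid(FACTORS)
for run in grid:
    print(run)
```

Recording each run under one of these four labels keeps the effect of the re-import separable from the effect of the zero mask.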
This is the test without the zero mask and with the duplicative import re-applied...
Details:
For some reason, every trial run since we deleted the re-import of TensorFlow between the GPT baseline test and the Cerebros NLP trial it is compared against has produced a lower val_binary_accuracy in this configuration: about 0.94 without the re-import, versus 0.97 on 2 of 3 trials and 0.95 on the third trial with the re-import. Of the 3 trials run after deleting the re-import, 2 were without the zero mask on the embedding and 1 was with it, and all three scored about 0.94. This is probably not a spurious finding, though it could be.
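As a rough check on whether a gap like this could arise by chance, here is a minimal sketch of an exact permutation test on the six reported scores. The values are the approximate figures quoted above, not exact logs, and the helper name `permutation_p_value` is made up for this example.

```python
from itertools import combinations
from statistics import mean

# Approximate scores as quoted in this issue (assumption: not exact log values).
with_reimport = [0.97, 0.97, 0.95]
without_reimport = [0.94, 0.94, 0.94]


def permutation_p_value(a, b, tol=1e-12):
    """One-sided exact permutation test for mean(a) > mean(b)."""
    pooled = a + b
    observed = mean(a) - mean(b)
    hits = 0
    total = 0
    # Try every way of assigning len(a) of the pooled scores to group a.
    for idx in combinations(range(len(pooled)), len(a)):
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in idx]
        if mean(group_a) - mean(group_b) >= observed - tol:
            hits += 1
        total += 1
    return hits / total


p = permutation_p_value(with_reimport, without_reimport)
print(f"difference in means: {mean(with_reimport) - mean(without_reimport):.4f}")
print(f"one-sided permutation p-value: {p:.3f}")
```

With 3 trials per group there are only 20 possible splits, so 0.05 is the smallest one-sided p-value this test can produce. The observed split is the most extreme one, which is consistent with "probably not spurious, though it could be"; more trials per condition would be needed for stronger evidence.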
There are several plausible reasons:
1. TensorFlow Session or Graph Reset
Re-importing TensorFlow between the two training tasks might reset the TensorFlow session or graph, potentially clearing any accumulated state or variables from the first model. This could lead to a "clean slate" for the second model, allowing it to train more effectively.
When TensorFlow is imported, it creates a default graph. If the first model is built on this graph and not properly cleared, it might interfere with the second model's construction.
Re-importing TensorFlow could reset the graph, eliminating any potential conflicts or memory leaks.
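One caveat when weighing this hypothesis: within a single Python process, a repeated `import` statement normally returns the module object already cached in `sys.modules` rather than executing the module again. The sketch below shows this with the standard-library `json` module as a stand-in; it assumes TensorFlow follows the same import caching, which should be confirmed in the actual notebook or script.

```python
import sys
import json  # stand-in module; assumption: tensorflow is cached the same way

first = sys.modules["json"]
first.marker = "set before re-import"  # attach state to the module object

import json as json_again  # a second import statement in the same process

# The second import yields the same object, with the earlier state intact.
print(json_again is first)
print(json_again.marker)
```

If the same holds for `tensorflow`, the duplicate import by itself would not reset a graph or session, and the accuracy difference would more likely come from something else that changed alongside it.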
2. Memory Management and Garbage Collection
Re-importing TensorFlow might trigger a more thorough garbage collection or memory cleanup, which could help alleviate memory constraints.
The first model's memory allocation might not be fully released until the TensorFlow module is re-imported, potentially reducing memory fragmentation or other issues that could affect the second model's performance.
3. Random Seed and Initialization
The initialization of TensorFlow or its components might be affected by the re-import.
If the random seed is not explicitly set, re-importing TensorFlow could result in a different initialization, potentially influencing the model's performance.
4. Optimizer Initialization
The optimizer or variable initialization might be affected by the re-import.
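One way to test hypotheses 1 through 3 directly, without relying on the import, is to reset state explicitly between the two training runs. The following is a sketch only: the helper name `reset_between_runs` is hypothetical, and it assumes a TensorFlow version that provides `tf.keras.utils.set_random_seed` (2.7 or later).

```python
import gc


def reset_between_runs(seed):
    """Clear Keras global state, collect garbage, and reseed the RNGs."""
    # Imported lazily so this helper can be defined before TensorFlow is loaded.
    import tensorflow as tf

    tf.keras.backend.clear_session()      # drop global state left by the first model
    gc.collect()                          # release unreferenced Python objects
    tf.keras.utils.set_random_seed(seed)  # seeds Python, NumPy, and TensorFlow RNGs
```

Calling `reset_between_runs(some_seed)` after the GPT baseline finishes and before the Cerebros model is built, with the duplicate import still deleted, would show whether an explicit reset changes val_binary_accuracy. Repeating this with several seeds would also indicate how much of the 0.94 to 0.97 spread is seed-to-seed variation.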