a-b-test-on-re-importing-tf-after-gpt-baseline #176

Open
david-thrower opened this issue Apr 13, 2025 · 0 comments

Comments

@david-thrower (Owner)

TLDR:

  • Deleting the duplicative import tensorflow as tf that ran after we train the baseline GPT model and before we train the Cerebros model appears to lower val_binary_accuracy on the Cerebros NLP model from 0.97 to 0.94.
  • We need to run 2 A/B tests re-applying this duplicative import, with and without the zero mask.
  • This issue tracks the test without the zero mask and with the duplicative import re-applied.

Details:

For some reason, ever since we deleted the re-import of tensorflow between the GPT baseline test and the Cerebros NLP trial it is compared to, all Cerebros runs without the re-import are getting a lower val_binary_accuracy in this configuration: 0.94 without re-importing tensorflow, versus 0.97 on 2 of 3 trials with the re-import and 0.95 on the 3rd such trial. With 3 trials after deleting the re-import (2 without the zero mask on the embedding and one with it) all landing at about 0.94 val_binary_accuracy, this is probably not a spurious finding, though it could be.

There are several plausible reasons:

1. TensorFlow Session or Graph Reset

Re-importing TensorFlow between the two training tasks might reset the TensorFlow session or graph, potentially clearing any accumulated state or variables from the first model. This could lead to a "clean slate" for the second model, allowing it to train more effectively.

  • When TensorFlow is imported, it creates a default graph. If the first model is built on this graph and not properly cleared, it might interfere with the second model's construction.
  • Re-importing TensorFlow could reset the graph, eliminating any potential conflicts or memory leaks.
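If graph state is the culprit, a cleaner test than re-importing the module is to reset Keras's global state explicitly between the two training runs. A minimal sketch (the toy Sequential model below is a hypothetical stand-in for the real GPT baseline):

```python
import tensorflow as tf

# Hypothetical stand-in for the GPT baseline training step; the real
# baseline is a GPT model, not this toy network.
baseline = tf.keras.Sequential([tf.keras.layers.Dense(1)])
baseline.compile(optimizer="adam", loss="mse")

# Explicitly reset Keras's global state instead of relying on a re-import.
# This clears the default graph state and the layer-name counters
# accumulated while building the baseline model.
tf.keras.backend.clear_session()

# The Cerebros model would then be constructed on a clean slate here.
```

Comparing a trial that calls clear_session() against one that re-imports tensorflow would isolate whether graph/session reset is the mechanism behind the accuracy gap.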

2. Memory Management and Garbage Collection

Re-importing TensorFlow might trigger a more thorough garbage collection or memory cleanup, which could help alleviate memory constraints.

  • The first model's memory allocation might not be fully released until the TensorFlow module is re-imported, potentially reducing memory fragmentation or other issues that could affect the second model's performance.
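To test the memory-management hypothesis without a re-import, the baseline model's Python references can be dropped and a collection forced explicitly. A sketch (again using a toy model as a stand-in):

```python
import gc

import tensorflow as tf

# Hypothetical stand-in for the trained baseline model.
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(optimizer="adam", loss="mse")

# Drop the Python reference, clear Keras's global state, then force a
# garbage-collection pass so the baseline's objects are actually freed
# before the second model is built.
del model
tf.keras.backend.clear_session()
collected = gc.collect()  # number of unreachable objects collected
```

If this combination reproduces the 0.97 result, the effect is about freed state rather than the import itself.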

3. Random Seed and Initialization

The initialization of TensorFlow or its components might be affected by the re-import.

  • If the random seed is not explicitly set, re-importing TensorFlow could result in a different initialization, potentially influencing the model's performance.
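The seed hypothesis is easy to rule in or out by pinning all three RNGs (Python, NumPy, TensorFlow) before each trial, which makes initialization identical regardless of whether tensorflow was re-imported. The seed value 42 below is arbitrary:

```python
import tensorflow as tf

# Pin Python, NumPy, and TensorFlow seeds in one call so every trial
# starts from the same initialization.
tf.keras.utils.set_random_seed(42)
a = tf.random.uniform((3,))

# Re-pinning the same seed reproduces the same draw.
tf.keras.utils.set_random_seed(42)
b = tf.random.uniform((3,))
```

With seeds pinned this way, any remaining accuracy difference between the re-import and no-re-import trials cannot be attributed to initialization randomness.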

4. Optimizer Initialization

The optimizer or variable initialization might be affected by the re-import, for example if optimizer state (such as Adam's moment estimates) or variable initializers carry over between the two training runs.
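One way to control for this is to make sure each model gets its own freshly constructed optimizer instance rather than sharing one. A hypothetical sketch (toy models stand in for the GPT baseline and the Cerebros model):

```python
import tensorflow as tf

def build_and_compile():
    # Each call constructs a brand-new model AND a brand-new optimizer,
    # so the second model cannot inherit the first optimizer's slot
    # variables (e.g. Adam's moment estimates).
    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
        loss="mse",
    )
    return model

baseline = build_and_compile()
cerebros_like = build_and_compile()
```

If the two configurations still diverge with independent optimizer instances, shared optimizer state can be eliminated as the cause.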
