Remove consolidate_results.py
pgmpablo157321 committed Nov 27, 2024
1 parent 9414c6e commit f360438
Showing 3 changed files with 0 additions and 158 deletions.
11 changes: 0 additions & 11 deletions language/llama3-405b/README.md
@@ -152,17 +152,6 @@ if [ -e ${ACCURACY_LOG_FILE} ]; then
python evaluate-accuracy.py --checkpoint-path ${CHECKPOINT_PATH} \
--mlperf-accuracy-file ${ACCURACY_LOG_FILE} --dataset-file ${DATASET_PATH} --dtype int32
fi
# Optional: Create a pickled pandas DataFrame that is the original dataset with extra columns with output data from the
# accuracy run. The following columns will be added:
# - "gen_output_tok_id": A list of ints representing the tokenized output sequence.
# - "gen_output_text": A str representing the untokenized output sequence.
# - "gen_output_tok_len": An int representing the number of output tokens.
# - "rouge1": The rouge1 score for this sample
# - "rouge2": The rouge2 score for this sample
# - "rougeL": The rougeL score for this sample
# This file will by default be saved to 'full_output.pkl'. You can modify this with --output-pkl-path.
python consolidate_results.py --dataset-path ${DATASET_PATH} --model-dir ${CHECKPOINT_PATH}
```

For the GPU run - The above steps have been automated in `run_accuracy.sh`. You can also modify this script to use
137 changes: 0 additions & 137 deletions language/llama3-405b/consolidate_results.py

This file was deleted.
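The deleted script itself is not shown in this diff. Based on the README description above (a pickled pandas DataFrame with generated-output columns, saved to `full_output.pkl` by default) and the per-query pickle format written by `dataset.py` (see the removed code further down), a minimal sketch of what `consolidate_results.py` likely did might look like the following. The function name, the query-id-to-row mapping, and the omission of the detokenized-text and ROUGE columns are all assumptions, not the original code:

```python
# Hypothetical reconstruction sketch, NOT the original consolidate_results.py.
# Assumes query ids index directly into the dataset rows; the "gen_output_text"
# and rouge1/rouge2/rougeL columns from the README are omitted here because
# they would need the tokenizer and a ROUGE scorer.
import glob
import os
import pickle

import pandas as pd


def consolidate(dataset_path, run_outputs_dir="run_outputs",
                output_pkl_path="full_output.pkl"):
    """Merge per-query accuracy-run pickles back into the dataset DataFrame.

    Each pickle in run_outputs_dir holds {"query_ids": [...], "outputs": [...]},
    the format written by dataset.py's postProcess.
    """
    # Collect tokenized outputs keyed by query id across all run pickles.
    outputs_by_qid = {}
    for fname in glob.glob(os.path.join(run_outputs_dir, "*.pkl")):
        with open(fname, "rb") as f:
            d = pickle.load(f)
        for qid, toks in zip(d["query_ids"], d["outputs"]):
            outputs_by_qid[qid] = list(toks)

    # Attach the outputs as new columns on the original dataset.
    df = pd.read_pickle(dataset_path)
    df["gen_output_tok_id"] = [outputs_by_qid.get(i) for i in range(len(df))]
    df["gen_output_tok_len"] = [
        len(t) if t else 0 for t in df["gen_output_tok_id"]
    ]
    df.to_pickle(output_pkl_path)
    return df
```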

10 changes: 0 additions & 10 deletions language/llama3-405b/dataset.py
@@ -80,16 +80,6 @@ def postProcess(
output_seq = out_tokens
assert len(query_id_list) == len(output_seq)

# Save outputs
if not os.path.exists("run_outputs"):
os.makedirs("run_outputs")
fname = "q" + "_".join([str(i) for i in query_id_list])
fname = f"run_outputs/{fname}.pkl"
with open(fname, mode="wb") as f:
d = {"query_ids": query_id_list, "outputs": output_seq}
log.info(f"Saving outputs to {fname}")
pickle.dump(d, f)

return np.asarray(output_seq, dtype=np.int32)

def LoadSamplesToRam(self, sample_list):
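For reference, the per-query pickles that the removed `postProcess` code wrote could be read back as sketched below. The helper name and the example filename are illustrative, not from the repository:

```python
# Read one of the run_outputs pickles written by the removed postProcess code.
# Each file holds a dict with parallel "query_ids" and "outputs" lists.
import pickle


def load_run_output(path):
    with open(path, "rb") as f:
        d = pickle.load(f)
    # Map each query id to its tokenized output sequence.
    return dict(zip(d["query_ids"], d["outputs"]))
```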

