fixing classification_with_grn_and_vsn various errors #2010

Humbulani1234 · 2024-12-20T10:14:47Z

Dataset preparation errors

The structured_data example classification_with_grn_and_vsn.py appears to use the wrong dataset: the current data_url, https://archive.ics.uci.edu/static/public/20/census+income.zip, downloads a dataset that does not match the one the script expects. I believe the correct data_url is https://archive.ics.uci.edu/static/public/117/census+income+kdd.zip. A fix has been added.

A fix has been added to extract the .tar.gz file created by the call to keras.utils.get_file.

A fix has also been added to clean up the directory that the files were extracted to during download, so that the script can be run again without errors.
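The extraction and cleanup steps described above could look roughly like the following. This is a minimal sketch, not the code in this PR: the helper name extract_fresh and its arguments are hypothetical, and it assumes the download may contain a .tar.gz nested one level inside the outer archive.

```python
import shutil
from pathlib import Path


def extract_fresh(archive_path, extract_dir):
    """Unpack archive_path into an empty extract_dir and return it as a Path.

    Hypothetical helper for illustration; not part of Keras or of this PR.
    """
    extract_dir = Path(extract_dir)
    # Remove leftovers from a previous run so a rerun starts from a clean state.
    if extract_dir.exists():
        shutil.rmtree(extract_dir)
    extract_dir.mkdir(parents=True)
    # unpack_archive infers the format (.zip, .tar.gz, ...) from the extension.
    shutil.unpack_archive(str(archive_path), str(extract_dir))
    # Assumption: the outer archive may wrap a .tar.gz, so unpack those as well.
    for inner in sorted(extract_dir.rglob("*.tar.gz")):
        shutil.unpack_archive(str(inner), str(inner.parent))
    return extract_dir
```

Deleting the directory before unpacking is what makes a second run behave like the first; the archive itself is left in place so it does not have to be downloaded again.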

Additionally, the original script has the code snippet:

train_data_path = os.path.join(
    os.path.expanduser("~"), ".keras", "datasets", "adult.data"
)
test_data_path = os.path.join(
    os.path.expanduser("~"), ".keras", "datasets", "adult.test"
)

The snippet above does not account for the directory that keras.utils.get_file creates when extracting census+income+kdd.zip, so both train_data_path and test_data_path point to the wrong location. A fix has been added.
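One way to avoid hard-coding the extraction directory is to search for the data files below a root directory. The sketch below is illustrative only: find_data_file is a hypothetical helper, and the directory layout and file names in the usage comment are assumptions rather than values taken from this PR.

```python
import os


def find_data_file(root_dir, file_name):
    """Return the path of the first file called file_name under root_dir."""
    for dir_path, _dir_names, file_names in os.walk(root_dir):
        if file_name in file_names:
            return os.path.join(dir_path, file_name)
    raise FileNotFoundError(f"{file_name!r} not found under {root_dir!r}")


# Hypothetical usage; the root directory and file names are assumptions:
# datasets_dir = os.path.join(os.path.expanduser("~"), ".keras", "datasets")
# train_data_path = find_data_file(datasets_dir, "census-income.data")
# test_data_path = find_data_file(datasets_dir, "census-income.test")
```

Searching is slower than joining a fixed path, but it keeps working if the extraction step adds or renames an intermediate directory.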

Additional training errors

With dataset preparation fixed, the script hits a further error during model training. The error and an attempted solution are detailed below:

2024-12-19 21:02:15.350619: W tensorflow/core/framework/op_kernel.cc:1816] OP_REQUIRES failed at cast_op.cc:122 : UNIMPLEMENTED: Cast string to float is not supported
2024-12-19 21:02:15.350683: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: UNIMPLEMENTED: Cast string to float is not supported
Traceback (most recent call last):
  File "/home/humbulani/tensorflow-env/keras-io-master/examples/structured_data/classification_with_grn_and_vsn.py", line 513, in <module>
    model.fit(
  File "/home/humbulani/tensorflow-env/env/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py", line 122, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/humbulani/tensorflow-env/env/lib/python3.11/site-packages/tensorflow/python/framework/ops.py", line 5983, in raise_from_not_ok_status
    raise core._status_to_exception(e) from None  # pylint: disable=protected-access
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tensorflow.python.framework.errors_impl.UnimplementedError: Exception encountered when calling Functional.call().

{{function_node __wrapped__Cast_device_/job:localhost/replica:0/task:0/device:CPU:0}} Cast string to float is not supported [Op:Cast] name:

Attempted solution:

I believe I have traced the error to the following. Here is the pdb session:

> /home/humbulani/tensorflow-env/env/lib/python3.11/site-packages/keras/src/models/functional.py(245)_convert_inputs_to_tensors()
-> converted = []
(Pdb) p self._inputs
[<KerasTensor shape=(None,), dtype=float32, sparse=False, name=age>, <KerasTensor shape=(None,), dtype=float32, sparse=False, name=capital_gains>, <KerasTensor shape=(None,), dtype=float32, sparse=False, name=capital_losses>, ...]

(Pdb) p flat_inputs
[<tf.Tensor: shape=(265,), dtype=float32, numpy=
array([63., 52.,  2., 45.,  0., 43., 67., 26., 29., 53., 31., 59., 57.,...>, <tf.Tensor: shape=(265,), dtype=string, numpy=
array([b' Not in universe', b' Private', b' Not in universe', b' Private',
       b' Not in universe', b' Private', b' Not in universe',...>...]

The function _convert_inputs_to_tensors zips flat_inputs with self._inputs. As the pdb output above shows, the first pair (age) has dtype float32 on both sides. In the second pair, however, self._inputs expects float32 (capital_gains) while flat_inputs supplies a string tensor. This mismatch causes the cast error.
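The positional pairing described above can be reproduced without Keras. The following is a simplified sketch, not the Keras implementation: find_dtype_mismatches is a hypothetical helper, dtypes are plain strings, and class_of_worker is an assumed name for the string column seen in the pdb output.

```python
def find_dtype_mismatches(expected, actual):
    """Pair two lists of (name, dtype) by position and report disagreements."""
    mismatches = []
    pairs = zip(expected, actual)
    for position, ((exp_name, exp_dtype), (act_name, act_dtype)) in enumerate(pairs):
        if exp_dtype != act_dtype:
            mismatches.append((position, exp_name, exp_dtype, act_name, act_dtype))
    return mismatches


# Model inputs, numerical features first (names taken from the pdb output).
model_inputs = [("age", "float32"), ("capital_gains", "float32")]
# Dataset columns in file order; "class_of_worker" is an assumed column name.
dataset_columns = [("age", "float32"), ("class_of_worker", "string")]

print(find_dtype_mismatches(model_inputs, dataset_columns))
# prints [(1, 'capital_gains', 'float32', 'class_of_worker', 'string')]
```

Because the pairing is by position and not by name, any difference in ordering between the two sequences shows up as a dtype conflict like the one at position 1.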

I think the main issue is that the columns in the CSV file used to create the dataset are in a different order from the model inputs. The model inputs follow a fixed order: numerical features first, then categorical features.

I tried reordering the DataFrame columns to match the model inputs before writing the train and test CSV files, but pandas only renamed the columns without moving the underlying data.
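One common way to get this symptom is assigning a new list to DataFrame.columns, which relabels the columns in place, whereas indexing with a list of names moves the data together with its labels. Whether that is what happened here is a guess; the sketch below uses made-up rows and assumed column names purely to show the difference.

```python
import pandas as pd

# Toy data; the column names and values are illustrative assumptions.
df = pd.DataFrame(
    {
        "age": [63, 52],
        "class_of_worker": ["Private", "Not in universe"],
        "capital_gains": [0, 10],
    }
)
ordered = ["age", "capital_gains", "class_of_worker"]

# Assigning to .columns only relabels: the strings now sit under "capital_gains".
renamed = df.copy()
renamed.columns = ordered

# Indexing with a list of names reorders the columns and keeps data and labels together.
reordered = df[ordered]

print(renamed["capital_gains"].tolist())    # ['Private', 'Not in universe']
print(reordered["capital_gains"].tolist())  # [0, 10]
```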

More information about the original script

The original script failed with the following error, which I did not investigate because the script has other issues that occur before training.

Epoch 1/20
2024-12-19 20:30:24.244976: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: INVALID_ARGUMENT: Field 5 in record is not a valid float:  Never-married
Traceback (most recent call last):
  File "/home/humbulani/tensorflow-env/keras-io-master/examples/structured_data/classification_with_grn_and_vsn.py", line 514, in <module>
    model.fit(
  File "/home/humbulani/tensorflow-env/env/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py", line 122, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/humbulani/tensorflow-env/env/lib/python3.11/site-packages/tensorflow/python/framework/ops.py", line 5983, in raise_from_not_ok_status
    raise core._status_to_exception(e) from None  # pylint: disable=protected-access
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tensorflow.python.framework.errors_impl.InvalidArgumentError: {{function_node __wrapped__IteratorGetNext_output_types_42_device_/job:localhost/replica:0/task:0/device:CPU:0}} Field 5 in record is not a valid float:  Never-married [Op:IteratorGetNext] name:

Environment

TensorFlow == 2.16.1
Python == 3.11
Keras == 3.7.0

google-cla · 2024-12-20T10:14:52Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

Humbulani1234 · 2024-12-20T18:08:26Z

I have resolved all the problems in this script, and the model now starts training without errors.

The issue of pandas not rearranging the order of the dataframe has been fixed. However, after rearranging the dataframe there were more fundamental issues.

The main issue was that the inputs passed to Functional.call are an OrderedDict, and in the function _standardize_inputs the line flat_inputs = tree.flatten(inputs) was not sorting the OrderedDict when flattening it, as the documentation for tree.flatten says it should. This contributes to the mismatch between self._inputs (the model's inputs) and flat_inputs. The fix, in the script's process function, converts the features to a dict.
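A related way to remove the dependence on flattening order is to rebuild the features as a plain dict whose keys follow the model's input names. The sketch below is not the exact change made in this PR: align_to_inputs is a hypothetical helper, the batch is made-up data, and class_of_worker is an assumed column name.

```python
from collections import OrderedDict


def align_to_inputs(features, input_names):
    """Return a plain dict holding only input_names, in that order."""
    missing = [name for name in input_names if name not in features]
    if missing:
        raise KeyError(f"features are missing model inputs: {missing}")
    return {name: features[name] for name in input_names}


# A batch in CSV column order; values are placeholders for tensors.
batch = OrderedDict(
    [
        ("age", [63.0]),
        ("class_of_worker", [b" Private"]),
        ("capital_gains", [0.0]),
    ]
)

aligned = align_to_inputs(batch, ["age", "capital_gains"])
print(list(aligned))  # prints ['age', 'capital_gains']
```

Making the order explicit means the result does not depend on how a flattening utility treats dict versus OrderedDict, and a missing input fails early with a clear message.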

I think the behavior of tree.flatten should be reviewed and corrected.

Humbulani1234 · 2024-12-21T23:58:48Z

I will resend another.

fixing classification_with_grn_and_vsn various errors

3a6e298

github-actions bot assigned sachinprasadhs Dec 20, 2024

rectified all the errors in the file classification_with_grn_and_vsn

c3802a0

This was referenced Dec 21, 2024

Errors while executing the script movielens_recommendation_transformers.py from keras-io Humbulani1234/django-default#52

Closed

Errors running the script wide_deep_cross_networks.py in keras-io Humbulani1234/django-default#53

Closed

Humbulani1234 closed this Dec 21, 2024

Humbulani1234 deleted the classification_grn_vsn branch December 22, 2024 00:11
