I am training a Keras model to build a recommender system and running it on Spark with Horovod and hvd.KerasEstimator().
Here is my Estimator:
keras_estimator = hvd.KerasEstimator(
    num_proc=2,
    store=store,
    model=model,
    optimizer=optimizer,
    loss="mse",
    metrics=[tf.keras.metrics.RootMeanSquaredError()],
    feature_cols=['userID', 'itemID'],
    label_cols=['rating'],
    batch_size=512,
    epochs=5,
    verbose=1)
keras_model = keras_estimator.fit(train_df).setOutputCols(['rating_prob'])
The predict function is simply:
pred_df = keras_model.transform(test_df)
The model trains without any problem and I am able to get the loss for every epoch, but I struggle with predictions!
The predict function doesn't print any error and seems to work, but pred_df is impossible to manipulate.
I tried:
pred_df.show()
or pred_df.toPandas()
but both raise the same error below:
"org.apache.spark.api.python.PythonException: 'ValueError: cannot convert Spark data Type <class 'pyspark.sql.types.DecimalType'> to native python type'"
I don't understand, because my train_df and test_df have the same types!
I have tried changing the types with:
from pyspark.sql.functions import col
from pyspark.sql.types import FloatType, IntegerType

# reset data types to integer and float for tensorflow
train_df = (train_df.withColumn("itemID", col("itemID").cast(IntegerType()))
            .withColumn("userID", col("userID").cast(IntegerType()))
            .withColumn("rating", col("rating").cast(FloatType())))
test_df = (test_df.withColumn("itemID", col("itemID").cast(IntegerType()))
           .withColumn("userID", col("userID").cast(IntegerType()))
           .withColumn("rating", col("rating").cast(FloatType())))
See the data types here:
But the error is still there.
Below is an example of my data:
Can you help me solve this issue please?
Thanks in advance
I just tried changing the column types, but it doesn't change anything.
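Since the error names DecimalType, one thing that might help narrow this down is checking whether any column is still reported as a decimal after the casts. Below is a minimal sketch, assuming the input is the list of (column name, type string) pairs that Spark's DataFrame.dtypes returns. The helper name decimal_columns and the sample list are hypothetical, not taken from the real data.

```python
def decimal_columns(dtypes):
    """Return the names of columns whose Spark type string is a decimal.

    `dtypes` is expected to look like DataFrame.dtypes, e.g.
    [("userID", "int"), ("rating", "decimal(10,0)")].
    """
    return [name for name, type_str in dtypes
            if type_str.lower().startswith("decimal")]


# Hypothetical sample, only to illustrate the shape of DataFrame.dtypes
sample_dtypes = [("userID", "int"), ("itemID", "int"), ("rating", "decimal(10,0)")]
print(decimal_columns(sample_dtypes))
```

With real DataFrames this would be called as decimal_columns(train_df.dtypes) and decimal_columns(test_df.dtypes). Any column it reports is a candidate for an explicit cast before calling transform, though this check alone does not confirm where the DecimalType in the error comes from.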