tensorflow – Spark data conversion problem when predicting with Horovod KerasEstimator()

I am training a Keras model to build a recommender system and running it on Spark with Horovod and hvd.KerasEstimator().

Here is my Estimator:

keras_estimator = hvd.KerasEstimator(
  num_proc=2,
  store=store,
  model=model,
  optimizer=optimizer,
  loss="mse",
  metrics=[tf.keras.metrics.RootMeanSquaredError()],
  feature_cols=['userID','itemID'],
  label_cols=['rating'],
  batch_size=512,
  epochs=5,
  verbose=1)

keras_model = keras_estimator.fit(train_df).setOutputCols(['rating_prob'])

The predict function is simply:

pred_df = keras_model.transform(test_df)

The model trains without any problem and I am able to get the loss for every epoch, but I struggle with predictions!

The predict function doesn't print any error and seems to work, but pred_df is impossible to manipulate.

I tried to do:

pred_df.show() or pred_df.toPandas(), but both raise the same error below:

"org.apache.spark.api.python.PythonException: 'ValueError: cannot convert Spark data Type <class 'pyspark.sql.types.DecimalType'> to native python type'"

I don't understand, because my train_df and test_df have the same types!

I've tried changing the types with:

from pyspark.sql.functions import col
from pyspark.sql.types import IntegerType, FloatType

# reset data types to integer and float for tensorflow
train_df = train_df.withColumn("itemID",col("itemID").cast(IntegerType())) \
    .withColumn("userID",col("userID").cast(IntegerType())) \
    .withColumn("rating",col("rating").cast(FloatType()))

test_df = test_df.withColumn("itemID",col("itemID").cast(IntegerType())) \
    .withColumn("userID",col("userID").cast(IntegerType())) \
    .withColumn("rating",col("rating").cast(FloatType()))

See the data types here:

But the error is still there.
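One check that may help narrow this down (a sketch, not a confirmed diagnosis): since the error names DecimalType, it could be worth listing every column of test_df and pred_df whose type is still a decimal, not only the three columns cast above. The helper name decimal_columns and the "price" column in the sample data are hypothetical; the helper only looks at the (name, type string) pairs that a PySpark DataFrame exposes through its dtypes attribute, and only detects top-level decimal columns.

```python
def decimal_columns(dtypes):
    """Return the names of columns whose Spark type string is a decimal.

    `dtypes` is a list of (column_name, type_string) pairs, in the shape
    that a PySpark DataFrame exposes as `df.dtypes`.
    """
    return [name for name, dtype in dtypes if dtype.startswith("decimal")]


# Hypothetical sample, standing in for test_df.dtypes; "price" is made up.
example_dtypes = [
    ("userID", "int"),
    ("itemID", "int"),
    ("rating", "float"),
    ("price", "decimal(10,2)"),
]

remaining = decimal_columns(example_dtypes)
print(remaining)

# With a real DataFrame, any columns found could then be cast, e.g.:
#   for c in decimal_columns(test_df.dtypes):
#       test_df = test_df.withColumn(c, col(c).cast("double"))
```

If the list comes back empty for both DataFrames, the DecimalType is coming from somewhere else and this check does not explain the error.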

Below is an example of my data:

train_df

Can you help me solve this problem, please?

Thanks in advance

I just tried changing the column types, but it doesn't change anything.
