PySpark predict_batch_udf causes CUDA OOM in a Kubeflow Pipeline, but not in a Kubeflow Jupyter notebook
I’ve spent over 20 hours on this problem. I use a SentenceTransformer model to embed ~3 million text documents and write the embeddings to OpenSearch. I’m using PySpark’s predict_batch_udf and running the job in a Kubeflow Pipeline. pytorch_model.bin is ~500 MB, so loading the model should probably need less than 1 GB of memory.
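For reference, here is a minimal sketch of the kind of setup described above. It is not my exact code. The model name `MODEL_NAME_OR_PATH`, the batch sizes, and the helper names are placeholders I made up for illustration. The small estimate function assumes that every concurrent Spark task on a GPU loads its own copy of the model, which may or may not be what happens in the pipeline.

```python
def estimate_gpu_memory_gb(model_gb, concurrent_tasks, per_task_overhead_gb):
    """Rough upper bound, ASSUMING each concurrent task on one GPU holds its
    own model copy plus some overhead (CUDA context, activations)."""
    return (model_gb + per_task_overhead_gb) * concurrent_tasks


def make_predict_fn(model_name="MODEL_NAME_OR_PATH", encode_batch_size=32):
    # Placeholder model name. Imports are local so they run on the executor.
    from sentence_transformers import SentenceTransformer

    # Loaded once per Python worker, not once per batch.
    model = SentenceTransformer(model_name, device="cuda")

    def predict(inputs):
        # inputs is a 1-D numpy array of strings for a single string column.
        return model.encode(inputs.tolist(), batch_size=encode_batch_size)

    return predict


def build_embed_udf(batch_size=64):
    # Requires Spark 3.4+; imported lazily so this module loads without PySpark.
    from pyspark.ml.functions import predict_batch_udf
    from pyspark.sql.types import ArrayType, FloatType

    return predict_batch_udf(
        make_predict_fn,
        return_type=ArrayType(FloatType()),
        batch_size=batch_size,
    )
```

It would be used roughly as `df.withColumn("embedding", build_embed_udf()("text"))`, where `df` and the `text` column are stand-ins. Under the stated assumption, `estimate_gpu_memory_gb(0.5, 4, 1.0)` gives 6.0 GB for four concurrent tasks, which is far more than the ~1 GB a single model copy would suggest.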