Does it leverage DeepSpeed/ZeRO-3?
That’s PyTorch only; the current models are TensorFlow.
Oh, that's unfortunate. Can't the models be exported to PyTorch through e.g. ONNX?
There's a PyTorch + DeepSpeed repository here: https://github.com/EleutherAI/gpt-neox