Does it leverage DeepSpeed/ZeRO-3?

That’s PyTorch only; the current models are TensorFlow.

Oh, that's unfortunate. Can't the models be exported to PyTorch through, e.g., ONNX?

There's a PyTorch + DeepSpeed repository here: https://github.com/EleutherAI/gpt-neox