Can this be run locally without beefy GPUs by any chance?

ggml (https://github.com/ggerganov/ggml) has a GPT-J example; the 6B-parameter model runs happily on a CPU with 16 GB of RAM and 8 cores, at a couple of words per second, no GPU necessary.

    gptj_model_load: ggml ctx size = 13334.86 MB
    gptj_model_load: memory_size =  1792.00 MB, n_mem = 57344
    gptj_model_load: model size = 11542.79 MB / num tensors = 285
    main: number of tokens in prompt = 12

    main: mem per token = 16179460 bytes
    main:     load time =  7463.20 ms
    main:   sample time =     3.24 ms
    main:  predict time =  4887.26 ms / 232.73 ms per token
    main:    total time = 13203.91 ms
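As a sanity check on those numbers, here is a back-of-the-envelope sketch. The ~6.05B parameter count and the fp16-weights assumption are mine; the MB and ms figures are taken from the log above. It shows why the model fits in 16 GB of RAM and what the per-token timing works out to:

    # Rough memory/throughput estimate for GPT-J 6B on CPU.
    # Assumptions (not from the log): ~6.05e9 parameters, weights stored as fp16.
    N_PARAMS = 6.05e9
    BYTES_PER_PARAM = 2

    weights_mb = N_PARAMS * BYTES_PER_PARAM / 2**20
    print(f"weights: {weights_mb:,.0f} MB")  # ~11,540 MB vs. 11542.79 MB in the log

    # Adding the KV cache ("memory_size" in the log) lands near the reported ctx size.
    kv_cache_mb = 1792.0
    print(f"weights + KV cache: {weights_mb + kv_cache_mb:,.0f} MB")  # ~13,330 MB vs. 13334.86 MB

    # Throughput: the log reports 232.73 ms per token.
    print(f"tokens/sec: {1000 / 232.73:.1f}")  # ~4.3 tokens/s, i.e. a couple of words per second

So the quoted ctx size is essentially just the fp16 weights plus the KV cache, which is why 16 GB of RAM is enough with room to spare for the OS.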