We had to do this for a link shortening system (to make sure random base64 didn't contain profanity). It was a pretty fun problem. Not just the implementation, but doing the math to make sure it didn't make our shortened links easily enumerable. The implementation wasn't too bad, but we set up logging initially to spit out any random strings it decided to block. I demo'd this in front of the whole company and live tailed the logs and the first one that popped up during the demo was a big ole F bomb. It made for an excellent demo.
Fascinating! I guess an easy solution is to inject non-alphabetic characters into any generated string. I imagine a constraint was that you wanted them to be easy to type?
SMS is the biggest constraint. Unicode characters trigger lower per-segment character limits (effectively doubling the cost of a 71 char text message). And it's also important that the links can be clicked on a smartphone. So URL-safe base64 (some shorteners use base62). And numbers can be N4u6hty too, so you gotta catch those cases.
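To make the 71 char example concrete, here is a rough sketch of the segment arithmetic. It assumes the standard SMS limits (160 characters for a single GSM-7 message, 153 per part once concatenated; 70 and 67 for UCS-2) and ignores GSM-7 extension characters that count double. The function name `sms_segments` is just for illustration.

```python
import math

def sms_segments(n_chars: int, ucs2: bool) -> int:
    """Approximate number of billable segments for a message of n_chars."""
    # (single-message limit, per-part limit when concatenated)
    single, multipart = (70, 67) if ucs2 else (160, 153)
    if n_chars <= single:
        return 1
    return math.ceil(n_chars / multipart)

print(sms_segments(71, ucs2=False))  # plain GSM-7 text
print(sms_segments(71, ucs2=True))   # same length with one Unicode character in it
```

With these limits the 71 char message fits in one segment as GSM-7 but needs two as UCS-2, which is where the doubling comes from.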
Goodness the scope of the problem just exploded in my mind after this explanation.
The hardest problem with the implementation was that with a long list you can't just search for a few dozen inappropriate words (like the Twitch implementation). It would be very expensive to run hundreds or even thousands of substring checks, one per inappropriate word, against every generated string.
The solution we came to was to truncate all the inappropriate words to either 3 or 4 letters and store them in a big set. We then take our generated strings, which are usually 11 characters, and break them up into all possible substrings of lengths 3 and 4. For example, 1a2b3c4d5e6 would be broken down into 1a2 a2b 2b3 b3c 3c4 c4d 4d5 d5e 5e6 1a2b a2b3 2b3c b3c4 3c4d c4d5 4d5e d5e6. An 11 character string always has 17 such substrings (9 of length 3 and 8 of length 4). We then check all 17 against the banned set. 17 lookups into a set is pretty cheap, and as we have expanded the word set over time (e.g. adding a new language) our performance hasn't changed.
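A minimal sketch of that check, not the actual implementation. The function names, the placeholder word list, the choice to truncate longer words to their first 4 letters, and the case-insensitive matching are all assumptions made for illustration.

```python
def build_banned_set(words):
    """Truncate each word to at most 4 letters and collect the results in a set."""
    banned = set()
    for word in words:
        word = word.lower()
        if len(word) >= 3:          # assumption: words shorter than 3 are ignored
            banned.add(word[:4])    # 3-letter words stay as-is, longer ones keep 4
    return banned

def windows(s, sizes=(3, 4)):
    """All contiguous substrings of the given sizes (len(s) - n + 1 per size n)."""
    return [s[i:i + n] for n in sizes for i in range(len(s) - n + 1)]

def is_blocked(candidate, banned):
    """True if any 3- or 4-character window of the candidate is in the banned set."""
    return any(w in banned for w in windows(candidate.lower()))

# Placeholder word list standing in for the real one.
BANNED = build_banned_set(["darn", "heck", "dang", "crud", "hecking"])

print(is_blocked("xQ7DarnZ2k9", BANNED))  # contains "darn" once lowercased
print(is_blocked("1a2b3c4d5e6", BANNED))  # no banned window
```

The number of lookups depends only on the length of the generated string, not on the size of the banned set, which is why growing the word list doesn't change the cost.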
One drawback to our approach is that we do have false positives but we did the math and our space was still large enough, the cost of generating a new one was pretty low, and customers never see it so it's just not a big deal to throw out false positives.
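As a rough illustration of why throwing out false positives is cheap, here is a sketch of a generate-and-retry loop plus a union-bound estimate of the rejected fraction. The names `generate_slug` and `rejection_upper_bound`, the 64-symbol alphabet, the retry cap, and the placeholder banned set are assumptions, and the matching is assumed to be case-insensitive against lowercase entries.

```python
import secrets
import string

# URL-safe base64 alphabet: 64 symbols.
ALPHABET = string.ascii_letters + string.digits + "-_"

def is_blocked(candidate, banned):
    """True if any 3- or 4-character window (lowercased) is in the banned set."""
    s = candidate.lower()
    return any(s[i:i + n] in banned for n in (3, 4) for i in range(len(s) - n + 1))

def generate_slug(banned, length=11, max_attempts=100):
    """Draw random slugs until one passes the filter."""
    for _ in range(max_attempts):
        slug = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if not is_blocked(slug, banned):
            return slug
    raise RuntimeError("no clean slug found; banned set may be too broad")

def rejection_upper_bound(banned, length=11):
    """Union bound on the fraction of random slugs the filter rejects."""
    bound = 0.0
    for entry in banned:
        n = len(entry)
        positions = length - n + 1
        # Case-insensitive match: each letter in the entry matches 2 characters.
        variants = 2 ** sum(ch.isalpha() for ch in entry)
        bound += positions * variants / len(ALPHABET) ** n
    return min(bound, 1.0)

banned = {"darn", "heck", "dang", "crud"}  # placeholder entries, lowercase
print(generate_slug(banned))
print(rejection_upper_bound(banned))
```

The bound grows linearly with the size of the banned set, so it gives a quick way to check how much of the 64^11 space is being given up as the list expands.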