These secret scanning integrations have been very helpful. A client recently asked us to take a project open source that had started a few years ago as closed source. We of course checked over the current version of the code, and we have had linters in place to look for secrets for a while, but not in the very early days of the project. In that one codebase we had:

- An AWS IAM token for S3 upload access to a throwaway dev bucket. The bucket had already been deleted, but still... Got an email informing me that AWS had revoked the IAM token within 5 minutes.

- A Slack webhook notification URL/secret. It was committed as an example on a working branch and then git rm'ed, but it was still active. Got an email about it, and Slack revoked the token automatically within 5 minutes.

- A Mapbox API token. This one was funny. The token was indeed in there and functional, but it was in the docs/sample code for a dependency. Still, we got an email about it within the hour and were able to investigate.

Edit: In this case we intentionally kept the commit history. A safer alternative (and one we normally practice) is to start a fresh repo for the open source variant.

An overlooked vector is old commits. It’s often better to squash all commits before taking a project open source, which is a real shame for obvious reasons.

Commit histories can spill a lot of secrets that are easy to overlook.

There are tools available to help look for this sort of thing (for both you and any potential attackers). TruffleHog[1] is the first one that comes to mind for me.

I also like shhgit[2] for looking for secrets in repositories (though I don't think shhgit will look back in the git history for you).
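To give a feel for what a history scan involves, here is a minimal Python sketch. It is not how TruffleHog or shhgit work internally; it just greps the added lines of every patch in `git log -p --all` against two rough regexes. The names `PATTERNS`, `find_secrets`, and `scan_history` are made up for illustration, and the patterns are approximations (the Slack one in particular is an assumption about the URL shape), so expect both misses and false positives.

```python
import re
import subprocess

# Rough, illustrative patterns only (assumptions, not an official rule set).
# Real scanners ship far more rules and usually add entropy checks.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "slack_webhook": re.compile(r"https://hooks\.slack\.com/services/[A-Za-z0-9/_-]+"),
}


def find_secrets(patch_text):
    """Return (commit, rule, match) for each added line that matches a pattern."""
    hits = []
    commit = None
    for line in patch_text.splitlines():
        if line.startswith("commit "):
            # Header line of a new commit in `git log` output.
            commit = line.split()[1]
        elif line.startswith("+") and not line.startswith("+++"):
            # Added line in a diff; "+++" is the file header, not content.
            for rule, pattern in PATTERNS.items():
                for match in pattern.finditer(line):
                    hits.append((commit, rule, match.group(0)))
    return hits


def scan_history(repo_path="."):
    """Scan every patch reachable from any ref, including deleted files."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "-p", "--all"],
        capture_output=True,
        text=True,
        errors="replace",  # tolerate non-UTF-8 bytes in old commits
        check=True,
    )
    return find_secrets(result.stdout)
```

Calling `scan_history("/path/to/repo")` returns a list of hits you can print or review. Because it walks patches rather than the current tree, a file that was committed and later git rm'ed still shows up.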

[1]: https://github.com/dxa4481/truffleHog

[2]: https://github.com/eth0izzle/shhgit