When we've seen this, it has been because transactions were waiting on something, so they piled up and consumed shared memory. Do you have any database lookups, or calls out to external services or scripts?
Long timeouts can also contribute: if something stops responding, each waiting transaction stays around until its timeout expires.
It's typically a balance between setting reasonable timeouts and allocating enough shm. In addition, we implemented some watcher scripts that monitor shm usage. Beyond a certain threshold they set gflags to disable non-critical external calls and send us an alert.
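
For illustration only, here is a rough Python sketch of the decision logic such a watcher might use. It is not our actual script. The threshold values, the function names, and the use of /dev/shm as the thing to measure are all assumptions, and the separate lower re-enable threshold is something I added here so the flag doesn't flap. How you actually set the gflag and send the alert depends on your deployment, so that part is left out.

```python
import shutil

# Hypothetical thresholds, expressed as a fraction of total shm in use.
DISABLE_AT = 0.80   # at or above this, turn non-critical external calls off
REENABLE_AT = 0.60  # at or below this, turn them back on


def shm_usage_fraction(path="/dev/shm"):
    """Fraction of the filesystem at `path` in use (assumed stand-in for shm)."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total if usage.total else 0.0


def next_state(usage, currently_disabled,
               disable_at=DISABLE_AT, reenable_at=REENABLE_AT):
    """Return (disabled, alert) for the next polling cycle.

    `disabled` says whether non-critical external calls should be off.
    `alert` is True only on the transition from enabled to disabled.
    """
    if not currently_disabled and usage >= disable_at:
        return True, True
    if currently_disabled and usage <= reenable_at:
        return False, False
    return currently_disabled, False
```

A polling loop would call shm_usage_fraction() every few seconds, pass the result to next_state(), and then apply the flag change and send the alert with whatever tooling you already have.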