The server load was nothing extraordinary: around 100 new workflow instances per day. We were not dealing with a high-load problem.
The symptoms we would see were as follows:
- When we sent a message to the K2 web service to start a workflow instance, the response said that the web service had timed out, and we would see database deadlock errors.
- We would stop any running workflows and reboot the K2 server to get the system working again, but that would only last a few hours (or minutes) before things stopped again.
- Any existing workflows would continue running. Only new workflows could not be started.
The K2 level 1 support rep did not suggest this when we first showed him the problem; it was only after days of investigation by a level 2 support rep that K2 support thought of it. So if you are seeing similar symptoms, check your K2 blackpearl version and try updating K2 if it's not up to date.