At my startup, we had been experiencing an intermittent user problem for the last few days, with no clue as to the cause.
Users had been reporting that their "alerts" weren't showing up when they logged in... sometimes... and that the problem went away a few minutes later... sometimes.
Anytime I hear a report like that, I start dreading the work ahead, because if it's only "sometimes", there's probably not an easy way to reproduce the bug reliably. It could be data-dependent, or a browser quirk, or dependent on the phase of the moon for all I know.
So I started digging into our codebase. Alerts were displayed based on whether certain events had already passed. Was there a time-zone problem? Was the current time being extracted incorrectly? Was there a problem rendering the partial that handled that block of the homepage? Maybe the CSS was displaying it improperly if there were other dynamic elements on the page?
I started throwing logging in at various points through this controller action. Hours later, reports were still coming in (slowly, so not everyone was experiencing this problem), and the logs looked fine. Then I started to notice something. The people who were complaining the most had the fewest log entries. In fact, the production log showed that one of the users who was complaining had not accessed the page in question for two days.
Enter the browser cache.
For some unknown reason (damned if I could figure this out), this user's browser decided to cache her homepage. She wasn't seeing the alerts because her browser had chosen to cache this page at a moment when she didn't have any. Until it decided to reload the HTML for that page, she would continue to see no alerts.
With some quick googling, I found a cache buster snippet, and I'll share it here in case anyone else has had the same problem. To summarize:
CLUES THAT A BROWSER CACHE IS CAUSING YOUR BUG:
1) only some users report an issue
2) issue is intermittent
3) no errors being reported from your notification software
4) logging shows no data problems or logic errors
5) symptoms indicate "time displacement" (that's what the page would have looked like "at one point")
6) serious frustration and anger on the part of the software developer
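If those clues line up, one quick way to test the cache theory is to look at the response headers the page actually sends. Here's a minimal Ruby sketch, assuming you've already collected the headers into a plain Hash (from curl -I, the browser's network tab, wherever). The helper name storable_by_browser? is made up for illustration, and it only looks for the no-store directive rather than implementing the full HTTP caching rules.

```ruby
# Hypothetical helper: returns true when nothing in Cache-Control
# forbids the browser from storing the page.
# It only checks for "no-store"; real caching rules are more involved.
def storable_by_browser?(headers)
  # Header names are case-insensitive, so normalize them before the lookup.
  normalized = headers.each_with_object({}) do |(name, value), acc|
    acc[name.to_s.downcase] = value.to_s
  end

  directives = normalized.fetch("cache-control", "")
                         .downcase
                         .split(",")
                         .map(&:strip)

  !directives.include?("no-store")
end

# Example: a response with no caching headers at all
puts storable_by_browser?({})
```

If this returns true for a page full of per-user, time-sensitive content, the browser is at least allowed to keep a copy, which is worth ruling out before you spend hours on logging.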
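If you are a Rails developer, here's the medicine (this goes in ApplicationController):
# Method name is illustrative; call it whatever you like.
def set_cache_buster
  response.headers["Cache-Control"] = "no-cache, no-store, max-age=0, must-revalidate"
  response.headers["Pragma"] = "no-cache"
  response.headers["Expires"] = "Fri, 01 Jan 1990 00:00:00 GMT"
end
Set this as a before_filter (before_action in Rails 4 and later) on any action you want to protect.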
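To see the filter idea in isolation, here's a sketch in plain Ruby that runs without Rails. The module name CacheBuster and the Fake* classes are stand-ins I made up for illustration; the only thing assumed about a real controller is that it exposes response.headers, which Rails controllers do.

```ruby
# Cache-busting headers pulled into a module so they can be mixed into
# any controller. The module name CacheBuster is hypothetical.
module CacheBuster
  NO_CACHE_HEADERS = {
    "Cache-Control" => "no-cache, no-store, max-age=0, must-revalidate",
    "Pragma"        => "no-cache",
    "Expires"       => "Fri, 01 Jan 1990 00:00:00 GMT"
  }.freeze

  # Expects the including class to expose `response.headers`.
  def set_cache_buster
    NO_CACHE_HEADERS.each { |name, value| response.headers[name] = value }
  end
end

# Stand-ins so the sketch runs outside Rails; these are not Rails classes.
class FakeResponse
  attr_reader :headers

  def initialize
    @headers = {}
  end
end

class FakeHomeController
  include CacheBuster
  attr_reader :response

  def initialize
    @response = FakeResponse.new
  end
end

controller = FakeHomeController.new
controller.set_cache_buster
puts controller.response.headers["Cache-Control"]
```

In a real app you'd include the module (or define the method) in ApplicationController and register it only where it matters, for example before_action :set_cache_buster, only: [:index] in the controller that renders the homepage. Keeping it off static pages lets the browser cache do its job everywhere else.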