Alert Subsystem

  • Conditions are cached in a simple POJO-based, non-transactional cache
  • Data either originates only from agents, or may originate from either agents or the server. The former is checked against conditions in the agent condition cache; the latter is checked against conditions in the global condition cache
  • The alert condition cache is examined synchronously when agent messages come in; if a condition is met, a message is placed on a JMS queue
  • Out of band, processing of the JMS messages does the more expensive calculations to determine whether an alert should be triggered and, if so, triggers it
  • Alert subsystem engine code is in the packages under org.rhq.enterprise.server.alert.engine
  • org.rhq.enterprise.server.cloud.StatusManagerBean sets status bits that indicate whether a condition cache needs to be reloaded. Status bits are stored in the agent (RHQ_AGENT)
  • Data that only ever originates from agents, and is therefore checked against conditions stored in the agent condition cache:
    • measurement data
    • calltime data
    • trait data
    • event data
  • Data that may originate from agents or from within the server, and is therefore checked against conditions stored in the global condition cache:
    • availability data (agents' availability reports OR server performing backfilling due to agent-suspect-down)
    • resource operations (agents reporting operation results OR server reporting operation timeouts)
    • config changes (agents reporting the changes OR server reporting config change timeouts)
  • Global condition cache is loaded at server startup and periodically refreshed
  • Agent condition cache is loaded at agent connect time
  • When conditions themselves change OR numbers related to existing conditions (like baselines) change, related caches are made stale via status bit changes
  • StatusManagerBean.updateBy will set status bits based on changing data
  • markGlobalCache - when an alert definition changes, we assume it affects the global cache, even though we don't actually know for sure that it does. This could be made more intelligent, but the current approach seems efficient enough that optimizing it is not a concern
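
The flow described above can be illustrated with a small, self-contained sketch. This is not RHQ code: every class, field and method name below (AlertPipelineSketch, Condition, onIncomingData, drainQueue, markAgentCacheStale, reloadIfStale) is hypothetical, the "greater than threshold" comparison is an assumed stand-in for real condition logic, and an in-memory queue and set stand in for the JMS queue and the status bits. It only shows the shape of the design: pick the agent or global cache by data type, do a cheap synchronous check, defer the expensive work, and reload a cache once it has been flagged stale.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified model of the alert flow; none of these names are RHQ classes.
class AlertPipelineSketch {

    enum DataType { MEASUREMENT, CALLTIME, TRAIT, EVENT, AVAILABILITY, OPERATION, CONFIG_CHANGE }

    // Data types that only ever originate from agents use the per-agent cache.
    static final Set<DataType> AGENT_ONLY =
        EnumSet.of(DataType.MEASUREMENT, DataType.CALLTIME, DataType.TRAIT, DataType.EVENT);

    // A cached condition; the threshold comparison is an assumption made for illustration.
    static final class Condition {
        final int id;
        final DataType type;
        final double threshold;

        Condition(int id, DataType type, double threshold) {
            this.id = id;
            this.type = type;
            this.threshold = threshold;
        }
    }

    // Plain in-memory (POJO) caches with no transactional behavior.
    final Map<Integer, List<Condition>> agentCaches = new ConcurrentHashMap<>();
    final List<Condition> globalCache = new ArrayList<>();

    // Stand-in for the JMS queue: ids of conditions whose cheap check passed.
    final Queue<Integer> queue = new ArrayDeque<>();

    // Stand-in for status bits: agents whose condition cache must be reloaded.
    final Set<Integer> staleAgents = ConcurrentHashMap.newKeySet();

    // Agent-only data is checked against that agent's cache, everything else against the global cache.
    List<Condition> cacheFor(int agentId, DataType type) {
        return AGENT_ONLY.contains(type)
            ? agentCaches.getOrDefault(agentId, Collections.<Condition>emptyList())
            : globalCache;
    }

    // Synchronous step: cheap comparison only; matches are queued for later processing.
    int onIncomingData(int agentId, DataType type, double value) {
        int matched = 0;
        for (Condition c : cacheFor(agentId, type)) {
            if (c.type == type && value > c.threshold) {
                queue.add(c.id);
                matched++;
            }
        }
        return matched;
    }

    // Out-of-band step: drain the queue; the expensive alert evaluation would happen here.
    List<Integer> drainQueue() {
        List<Integer> processed = new ArrayList<>();
        Integer id;
        while ((id = queue.poll()) != null) {
            processed.add(id);
        }
        return processed;
    }

    // Flag an agent's cache as stale, e.g. after a condition or baseline change.
    void markAgentCacheStale(int agentId) {
        staleAgents.add(agentId);
    }

    // Replace the agent's cache only if it was flagged stale; returns true if reloaded.
    boolean reloadIfStale(int agentId, List<Condition> fresh) {
        if (!staleAgents.remove(agentId)) {
            return false;
        }
        agentCaches.put(agentId, new ArrayList<>(fresh));
        return true;
    }
}
```

Keeping the synchronous step down to a cache lookup and a comparison keeps the handling of incoming agent messages cheap, while anything costly is deferred to whatever consumes the queue.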