
Key: RHQ-1122
Type: Improvement
Status: Open
Priority: Critical
Assignee: Unassigned
Reporter: John Mazzitelli
Votes: 0
Watchers: 0
RHQ Project

throttle events

Created: 14/Nov/08 03:11 PM   Updated: 19/Nov/08 11:54 AM
Component/s: FX - Events, Core Server
Fix Version/s: None

Time Tracking:
Not Specified

Issue Links:
Dependency
 
Relation
 

Date of First Response: 19/Nov/08 11:24 AM


Description
We need a way to throttle the number of events we store in the database.

If, for example, a JBossAS server resource is capturing events from its log4j at the WARN level, and something goes horribly wrong in that managed resource that causes it to emit WARN messages indefinitely, we could blow up our server by asking it to insert an abnormal number of events (see the linked issue for an example of this happening).

We should have a threshold (perhaps configurable on a per-resource basis, or globally for the whole event subsystem) that says, "if we get more than X events in an event report, only insert X-N events", or maybe a time-based rule in the plugin container like "if we get X events in Y seconds, only report back X-N events".
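A minimal sketch of the two threshold ideas above, written in Python for brevity. The names `cap_report` and `WindowThrottle` are hypothetical and not part of RHQ, and the sketch simply keeps at most a fixed maximum rather than the "X-N" count suggested above. Timestamps are plain numbers in seconds.

```python
from collections import deque


def cap_report(events, max_events):
    """Keep at most max_events from one event report.

    Returns (kept_events, dropped_count). Assumption: the earliest
    events in the report are the ones worth keeping.
    """
    kept = events[:max_events]
    return kept, len(events) - len(kept)


class WindowThrottle:
    """Accept at most max_events within any window of window_seconds."""

    def __init__(self, max_events, window_seconds):
        self.max_events = max_events
        self.window_seconds = window_seconds
        self._accepted = deque()  # timestamps of accepted events, oldest first

    def allow(self, timestamp):
        # Forget accepted events that have aged out of the window.
        while self._accepted and timestamp - self._accepted[0] >= self.window_seconds:
            self._accepted.popleft()
        if len(self._accepted) < self.max_events:
            self._accepted.append(timestamp)
            return True
        return False
```

Reporting the dropped count alongside the kept events would let the server record that throttling happened, instead of silently losing the information.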

Perhaps we do some kind of filtering: if we get similar events within X seconds, only send up one of them.
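The filtering idea could be sketched as follows. What counts as "similar" is not defined in this issue; the sketch assumes two events are similar when their source, severity, and detail text are identical. `SimilarEventFilter` is a hypothetical name, not an RHQ class.

```python
class SimilarEventFilter:
    """Forward only the first of a run of similar events per time window."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._last_sent = {}  # similarity key -> timestamp of last forwarded event

    def should_send(self, source, severity, detail, timestamp):
        # Assumption: "similar" means an exact match on all three fields.
        key = (source, severity, detail)
        last = self._last_sent.get(key)
        if last is not None and timestamp - last < self.window_seconds:
            return False  # a similar event was forwarded recently
        self._last_sent[key] = timestamp
        return True
```

The dictionary here grows with the number of distinct events seen, so a real filter would also need to evict old keys.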

In short, we need a throttling mechanism to avoid inserting too many events in the database.


Comments
John Mazzitelli - 19/Nov/08 11:20 AM
This is critical; something has to be done. During perf testing, purging almost 8M rows of event data took a long time. We need to make sure we do not put too many events in the database.

For one, I suggest we limit the events in the database to 7 days' worth. See RHQ-1064.

Joseph Marques - 19/Nov/08 11:24 AM
almost sounds like we need rhq_event_r01, rhq_event_r02, .... ; )

Charles Crouch - 19/Nov/08 11:54 AM
Simply limiting the data in the events table to 7 days' worth isn't going to help by itself; e.g., we generated over 7M rows in roughly 4 days. In fact, it is going to reduce the usability of this feature for people who have a low volume of events but want to keep more history.

As mentioned in the main description, we need to limit the rate at which events are coming in. The main difference I see between this and metrics is that we have *no* control over the rate at which events are added. At least with metrics we have an idea of the rate, based on the default metric collection intervals and the environment size. So one option would be to put an upper limit on "event density", e.g. make sure the event table won't contain more than 1000/10000/... events in any given one-hour time slice. Then we can tune our purging policy to this "event density" so that we know any one run of the purge job won't ever be asked to delete more than 100k/1M rows at once.

Ensuring a maximum "event density" is going to be tough across multiple event sources. Maybe we assume most people won't have more than 20/100/... event sources and just limit each event source to 1/20th, 1/100th, ... of the max insertion rate.
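As a rough worked example of the arithmetic behind this proposal, here is a small Python sketch. The function names are hypothetical, and the purge bound assumes the purge job runs on a regular schedule and only deletes rows that aged out since its previous run.

```python
def per_source_limit(max_events_per_hour, assumed_sources):
    """Split a global hourly event budget evenly across event sources."""
    return max_events_per_hour // assumed_sources


def max_rows_per_purge(max_events_per_hour, hours_between_purges):
    """Upper bound on rows one purge run must delete, given the density cap."""
    return max_events_per_hour * hours_between_purges


# A cap of 10000 events/hour shared by an assumed 100 sources.
print(per_source_limit(10000, 100))   # 100
# With that cap and a daily purge job, one run deletes at most this many rows.
print(max_rows_per_purge(10000, 24))  # 240000
```

An even split wastes budget when most sources are quiet, which is the usability trade-off noted above for low-volume users.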