ESM Content Authoring Best Practices
- Pavan Raja

- Apr 8, 2025
- 26 min read
Summary:
This article collects best practices for authoring efficient ArcSight ESM content. It covers organizing and naming resources around use cases, writing useful descriptions, working with customers, active lists, data monitors, and filters, packaging content cleanly, tuning queries, reports, and trends, and building standard, lightweight, and pre-persistence rules whose conditions, aggregation, and variables are designed with performance in mind.
Details:
This document aims to help content authors build efficient content for the ArcSight Enterprise Security Manager (ESM) platform by explaining how to handle common authoring scenarios effectively and with awareness of their potential performance impact. The information has been compiled from resources on an internal ArcSight wiki, which is not accessible to all users because of its restricted availability and specific audience.
ESM content can be kept coherent and easy to understand through a few consistent practices: naming conventions, sensible placement within the resource tree, and the management features introduced in newer versions such as ESM 6.5c, which make it easier to share content among independent installations.
For development and testing, authors commonly create content in personal folders before moving it to another area for production or sharing. During that move it is crucial not to leave supporting resources behind, such as the filters and variables the content depends on.
The article highlights that exporting content from one system and importing it into another, especially where account IDs differ, can create duplicate user groups in the target system when resources are still located under personal groups (for example, personal filter groups). This is not harmful, but it is undesirable because it clutters the production system.
To avoid this issue, the article suggests developing content outside of personal groups when it will be transferred to another independent system. It gives examples of how different teams could structure their development spaces, such as "/All <Resource Type>s/ArcSight Solutions/development/<use case>/" or "/All <Resource Type>s/<company>/development/<use case>/".
The article also mentions that the term "Use Case" can refer to different things—from a problem statement to a collection of content addressing the issue, and even to a resource collecting related content. Regardless of how it is defined, the author recommends organizing related resources in groups as much as possible. For example, for detecting brute force login attacks, related resources could be grouped together under a use case structure.
The provided text outlines a method for organizing and naming resources within an ArcSight system related to detecting attacks such as brute force logins. It suggests grouping similar types of resources together and providing clear, descriptive names for each resource.
**Grouping Resources:**
**All Rules/ArcSight Solutions/Attacks on Internal Systems/Brute Force Logins/**: Includes rules related to detecting failed and successful brute force login attempts.
**All Reports/ArcSight Solutions/Attacks on Internal Systems/Brute Force Logins/**: Aggregates reports focused on identifying successful brute force attacks.
**All Queries/ArcSight Solutions/Attacks on Internal Systems/Brute Force Logins/**: Contains queries designed to analyze data related to brute force login attempts.
**All Active Lists/ArcSight Solutions/Attacks on Internal Systems/**: Lists potential attackers attempting brute force logins, including those that could be compromised accounts.
**All Filters/ArcSight Solutions/**: Includes filters used in rules and queries for detecting both failed and successful login events.
**Naming Resources:**
While it is acceptable to use concise naming conventions for certain resources like variables (programmers often name them succinctly), the text recommends a balance between brevity and clarity when naming dashboards, data monitors, query viewers, active channels, and reports.
An example of a **bad** name is "VPN Logins Collected Daily for the Past Week, Successful and Unsuccessful Attempts"; a **decent** alternative is "Successful and Unsuccessful Daily VPN Logins – Past Week", and a **better** one is "Daily VPN Login Attempts for the Past Week."
The rationale behind these recommendations is to ensure that resource titles clearly communicate what each item covers, thereby facilitating easier navigation and understanding by users. This clarity helps in quick identification of relevant resources based on specific needs or alerts raised by the system.
The text discusses guidelines and suggestions for creating effective titles and descriptions in reports, emphasizing brevity and relevance without omitting necessary information. It advises against using numbers like "Top 10 Attackers" as it can be better described simply by "Top Attackers." The title of a report's resource is typically its name, which should not include numbers unless they are essential for clarity. Additionally, the ability to set row limits and override the default title allows for flexibility in presenting information at runtime.
Regarding descriptions, the text recommends avoiding specifics such as field names or URIs from related resources. Descriptions should be concise and avoid technical jargon that might confuse users not familiar with the terminology. It's suggested that if a resource changes, its description should also be reviewed and potentially updated. To streamline this process, it is advised to:
Avoid listing specifics like field names or specific URIs from related resources.
Do not paste descriptions directly from Word documents; they often contain non-ASCII characters (such as smart quotes) that can cause issues in the system.
Keep descriptions concise and avoid technical jargon that might confuse users unfamiliar with the subject matter.
Finally, the text introduces additional resource types that content can make use of. Customers are particularly relevant to Managed Security Service Providers (MSSPs) but can also be used within single-customer installations; creating a customer resource is recommended even when there are no physical or virtual divisions within an organization, and all rules should aggregate on customer.
The text outlines guidelines for handling customer and network resources, emphasizing the importance of including customer columns in active lists and queries so that information can be retrieved per customer. Each customer should have their own user groups and notification destinations, with rules designed to operate within these parameters. It also highlights best practices for active lists (ALs), which continue to gain features such as multi-mapped entries and case-insensitive columns. Key field selection in ALs is crucial: use the minimum number of fields that still uniquely identify an entry, and choose fields that are reliably populated (non-null) in the events of interest. IP address fields should always be paired with their corresponding zone fields when used as keys.
To effectively track and manage various resources like connectors and systems within your organization's infrastructure, using appropriate fields in your system (such as ArcLog for Log Management) is crucial. Here are key points from the provided text regarding how to set up such a system:
1. **Customer Field**: Always use the customer field as a primary key since it helps quickly identify and select resources associated with a specific customer or business unit. This makes managing multiple customers' data easier, especially in scenarios where different connectors or systems might be used by each client.
2. **Key Fields**: In many cases, specifying both 'key' fields (like Customer as a key) and non-key data fields is beneficial. For instance:
**Connectors**: Use columns like ConnectorID (as it's unique), ConnectorZone, ConnectorAddress, etc., where ConnectorID is the primary key, and other details help in tracking specific connector attributes.
**Systems**: The setup can vary based on what data fields are available or necessary:
**Column Set 1** uses SystemZone (key), SystemAddress (key), and SystemHostName, which works well when systems use static IP addresses.
**Column Set 2** uses SystemZone (key) and SystemHostName (key), appropriate when DHCP is in use; the remaining columns carry supporting detail.
**Column Set 3** makes all three fields keys (SystemZone, SystemAddress, and SystemHostName), ensuring comprehensive capture when the network environment is less predictable; a short sketch after this list illustrates how the key choice affects entries.
**Column Set 4** leverages Asset information where possible, with AssetID as a primary key for unique identification across the system.
3. **Non-Key Data Fields**: If you need features such as case-insensitive columns when checking list entries, consider adding a dummy string field as a non-key column; since duplicate keys are not allowed, this accommodates the requirement without compromising data integrity or lookup speed.
4. **Consistency in Keying**: Use the same key fields consistently across the different column sets so that tracking of connectors and systems stays uniform and accurate, which makes changes easier to follow and decisions easier to base on current data.
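To make the key-field trade-off concrete, here is a minimal Python sketch (not ESM syntax: the active list is modeled as a plain dictionary, and the column names simply mirror the illustrative sets above). It shows how the choice of key columns determines whether entries silently overwrite one another, which is also why IP address keys should always be paired with their zone:

```python
# Model an active list as a dict keyed on a tuple of key-field values.
events = [
    {"SystemZone": "Corp", "SystemAddress": "10.0.0.5", "SystemHostName": "hr-laptop"},
    # DHCP later hands the same address to a different host:
    {"SystemZone": "Corp", "SystemAddress": "10.0.0.5", "SystemHostName": "dev-laptop"},
]

# Column Set 1 style: key on zone + address only.
by_address = {}
for e in events:
    by_address[(e["SystemZone"], e["SystemAddress"])] = e   # second event overwrites the first

# Column Set 3 style: key on all three fields.
by_all = {}
for e in events:
    by_all[(e["SystemZone"], e["SystemAddress"], e["SystemHostName"])] = e

print(len(by_address))  # 1 -> the hr-laptop entry was silently replaced
print(len(by_all))      # 2 -> each host keeps its own entry
```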
The provided text discusses several key aspects related to data monitors in a software context. Here's a summary of the main points:
1. **Maximum Alarm Frequency**: It is recommended that the Maximum Alarm Frequency should not exceed twice the Sampling Interval. This ensures that changes are noted across multiple intervals, preventing rapid fluctuations from being overlooked by the system.
2. **Resource Utilization**: Data monitors primarily consume resources in terms of process time and memory. The lightest types include Last State (number of groups), Event Graph (number of groups and nodes), and Reconciliation (number of groups). Heavier types are Moving Average, Statistics, and Top Value DMs which require more computational resources based on the number of groups and size of buckets.
3. **Performance Impact**: The heaviest data monitors are crucial for correlation but also have a significant performance impact. They consume memory when caching events. It is important to be aware of this impact and monitor their efficiency.
4. **Managing Data Monitors**: You can use the manage.jsp page to view the number of data monitors (DMs) and their names. This helps in understanding the system's load due to DMs. The tool allows you to click on specific components like DataMonitorProbeRegistry and DataMonitorProbe to get detailed information about each DM, including performance metrics.
By following these guidelines, one can optimize the use of data monitors for better performance and accuracy without overloading the system with unnecessary resource usage.
The article discusses the comparison between Last N Events Data Monitor (DM) and Query Viewer (QV) for viewing recent events. It highlights potential issues with the Last N Events DM when dealing with high event rates or when aiming to view all events from the last minute, suggesting that a QV might be more appropriate in such cases. The article also emphasizes that QVs have an advantage over DMs as they can collect data across a longer time range and are less reliant on checkpoints like DMs, but if the goal is simply to show recent events without concern for event rates or N value, the Last N Events DM could be more efficient in terms of processing.
The passage discusses the use of filters across various resources such as rules, queries, data monitors, and more. It highlights that filters can use both local and global variables but face restrictions when certain conditions or variable functions are involved. For instance, using an "InActiveList" condition in a filter disqualifies it from being used in active channels, while arithmetic or Java Mathematical Expression (JME) variables cannot be employed in queries, trends, reports, or query viewers because of the specific limitations of those contexts.
The passage suggests that for optimal functionality across multiple resources like rules and data monitors, filters should be designed with rule conditions in mind, as they are optimized for such use cases. It advises simplifying query conditions by avoiding complex variable functions and lookups to improve performance and usability.
It warns against referencing active lists within filters, as this practice renders the filters unusable in active channels. If an active list must be referenced, the "GetActiveListValue" variable function can provide a workaround for the restriction on referencing active lists in active channels.
To address issues related to pre-populated active lists and cross-package contamination, follow these steps:
1. **Using Export Format for the Initial Package**: Create the first package using the "export" format, which excludes all active and session list data from the export. Also check the "exclude reference IDs" box.
2. **Setting Up the Second Package**: Create a second package, this time using the "default" format, which does include list data. This second package carries the specific pre-populated active lists that are needed, and it should be set up as a dependency when exporting the first package.
3. **Excluding Pre-populated Lists from the First Package**: Exclude from the first package any active lists that are meant to ship with static data entries, since the export format would strip their data. Ensure that the TTL of these pre-populated lists is set to 0 so the static entries do not expire.
4. **Including Specific Lists in Second Package**: Explicitly include the individual active lists with static data in the second package where they are needed for lookup or checking conditions.
5. **Exporting Packages with Dependencies**: When exporting the first package, make sure it includes a reference to the second package within its .arb file to ensure both packages are included together when managing resource dependencies.
6. **Removing Event Fields from Package**: To avoid issues during uninstallation due to locked group fields, follow these steps in the package editor:
Navigate to the "Resources" tab.
Sort and add "/All Fields/ArcSight System/Event Fields" under the "Removed Resource" column while checking the "If Not Included" checkbox. This step ensures that Event Fields are not included by default, preventing potential errors during package installation or uninstallation.
7. **Preventing Cross Package Contamination**: When maintaining packages over time, ensure that each resource appears in only one package; otherwise, modifications made through one package can conflict with the version of the same resource carried in another, leading to inconsistencies.
The provided text highlights the importance of carefully selecting and managing resources within packages, particularly when dealing with filters that might be part of larger system components like the ArcSight Core package. The summary emphasizes the following points:
1. **Package Resources Management**: It is crucial to review the resources included in a package through the package navigator to avoid unnecessary exports or imports. Exclude any resources that are not required for the specific functionality being packaged, such as filters or field sets that should not be part of a custom intrusion monitoring (IM) package if they belong to another package like ArcSight Core.
2. **Example of Package Modification**: A screenshot demonstrates how to properly modify an Intrusion Monitoring package by removing unnecessary resources from the 'Removed Resources' tab in the package navigator, specifically mentioning the removal of the '/All Filters/ArcSight System' entry. This example shows how to prevent including extraneous elements that could cause conflicts or data loss upon reinstallation.
3. **Potential Issues with Unnecessary Filters**: Including filters (like the Non-ArcSight Internal Events filter) from other packages can pose risks, such as overwriting modifications when reinstalling the package or uninstalling and reimporting it. This is because these filters are not part of the custom intrusion monitoring package and might be updated or changed in future versions without preserving local changes.
4. **Query Performance Optimization**: The text discusses how content authors can influence query performance, especially with variables that use functions like 'get list data'. It suggests optimizing queries by simplifying conditions to reduce processor usage, although it acknowledges the limitation of not always avoiding expensive operations entirely.
5. **Conclusion and Recommendations**: The summary concludes with a note on the challenges in detecting such issues without constant manual checks and recommends careful resource management within packages to avoid potential problems. This includes being vigilant about which filters or resources are included, especially when they belong to other system components.
Overall, the text underscores the importance of package design and maintenance in ensuring optimal performance and avoiding conflicts in large security information management systems like ArcSight.
In software engineering, there's a light-hearted saying about optimization: "Do not optimize anything." This is because when working with queries in systems like ArcSight ESM, the conditions in the query editor are fundamentally different from those in rule editors. While the interfaces may look similar, they operate at two distinct levels: rules work within live event flows (in memory), whereas queries are translated into SQL for database interactions.
This distinction is crucial because reordering conditions in the query editor has minimal impact on the generated SQL, which ArcSight ESM's query generator optimizes specifically for the underlying Oracle or CORR-Engine database. There is no short-circuit evaluation in SQL the way there is in rules, so there is little value in ordering query conditions the way you would for rules.
When you want to see what a query really does, running a query viewer or report that uses it will write the raw SQL generated by ArcSight ESM to the server.log file. This tends to provoke one of two reactions: disinterest at the complexity on display, or an urge to master SQL.
Regarding data manipulation and queries, it's often recommended to use Sum(Aggregated_Event_Count) instead of Count(EventID), especially when dealing with aggregated events. The query generator automatically adjusts for this, making both metrics equivalent; however, Sum(Aggregated_Event_Count) is considered more intuitive for users since it directly reflects what’s happening at the event level. If fixing a resource using Count(EventID) is convenient, then go ahead, but there's no need to overhaul all your content.
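To see why the two metrics line up, here is a hedged Python sketch (plain dictionaries standing in for database rows; the field names are assumptions, not ESM's schema). It shows what a naive row count versus a sum of the aggregated-event counts would return, which is why Sum(Aggregated_Event_Count) says directly what is meant and why the query generator adjusts Count(EventID) to match it:

```python
# Each row is one stored event; rows produced by connector-side aggregation
# carry the number of base events they represent in aggregatedEventCount.
rows = [
    {"eventId": 101, "aggregatedEventCount": 1},   # a single base event
    {"eventId": 102, "aggregatedEventCount": 25},  # 25 base events rolled into one row
    {"eventId": 103, "aggregatedEventCount": 4},
]

count_rows      = len(rows)                                     # naive Count(EventID): 3 rows
sum_base_events = sum(r["aggregatedEventCount"] for r in rows)  # Sum(AggregatedEventCount): 30 base events

print(count_rows, sum_base_events)
```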
The provided text discusses several aspects related to handling SQL queries and reports in a specific context, likely within an information management system or database environment. Here's a summary of the key points:
1. **Complexity of SQL Translation**: Dealing with the code that translates SQL from query objects is highly complex, making it challenging for non-experts to significantly enhance query performance through manual adjustments. It notes some general considerations such as the high cost of operations like AL lookups and certain variable functions, which should be taken into account when managing queries.
2. **Short-Circuit Evaluation**: This concept does not automatically apply to all conditions in SQL queries. Developers need to consider this when designing query structures.
3. **Reports vs. Queries**: While it's possible to create reports using lists or trends without directly querying the data, such methods have limitations. Writing a query for a report provides more flexibility and is generally recommended, especially for parameterization and reuse in other reporting contexts like query viewers or related trend creation.
4. **Custom Parameters**: Custom parameters can be used to link multiple queries by synchronizing their start and end times. It's important not to include spaces in custom parameter names (e.g., use "StartTime" instead of "Start Time") and to ensure that the custom parameter applies to all relevant queries within a report or system.
5. **Charts and Data Selection**: When using charts, it’s advisable to avoid selecting too many fields as this might lead to data redundancy or confusion in presentation. Developers should consider the implications of field selection on both query performance and chart display clarity.
In summary, the text highlights several practical considerations for improving query management and reporting efficiency within a system, emphasizing expert knowledge of SQL operations and flexibility in report generation through structured querying.
The text discusses best practices for creating charts with limited data series, emphasizing the use of queries that select only two fields when dealing with a single series chart. It suggests avoiding confusion by ensuring proper grouping through SELECT fields in SQL queries and minimizing differentiation caused by other fields.
For instance, it advises using wildcard characters like '%' in parameterized queries to allow for flexibility when focusing on specific data related to individual customers or zones. This method allows easier testing of results and adjustment without changing the entire query setup. The text recommends enhancing usability with a filter system that defaults to "All Customers" but can be adjusted for any single customer, simplifying report creation and management.
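A minimal sketch of the wildcard idea (plain Python standing in for the report's parameterized query; the LIKE-style helper and the parameter name are assumptions): a parameter that defaults to "%" matches every customer, while any specific value narrows the result without changing the query itself.

```python
import re

def sql_like(value: str, pattern: str) -> bool:
    """Approximate SQL LIKE semantics: '%' matches any run of characters."""
    parts = [re.escape(p) for p in pattern.split("%")]
    return re.match("^" + ".*".join(parts) + "$", value) is not None

rows = [
    {"customer": "Acme Corp", "eventCount": 120},
    {"customer": "Globex", "eventCount": 45},
]

customer_param = "%"          # report default: all customers
# customer_param = "Acme%"    # narrowed at run time without editing the query

print([r for r in rows if sql_like(r["customer"], customer_param)])
```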
Additionally, it highlights how limiting data to a single entity (like a customer) while including their name in table displays might appear unconventional or confusing due to the fixed nature of such fields within tables. However, these practices help maintain clarity and ease of use when filtering specific segments of data for analysis or reporting purposes.
The summary discusses various aspects related to creating and managing reports in a system, focusing on the creation of two specific reports ("My Important Events" and "My Important Events for All Customers") and techniques for customizing chart titles. It also covers best practices for setting the start time for trends and backfilling data.
For report generation, the author suggests creating two separate reports: one tailored to individual customers and another that includes all customers without specific filtering capabilities. The trick here is using wildcard characters in queries to filter data based on certain criteria.
Regarding chart titles, it is recommended to make them dynamic by incorporating custom parameters (like "ChartRowLimit") which allows users to adjust the settings without altering the underlying report configuration. This flexibility enhances usability and user engagement.
In terms of trend management, start time can be set in two ways: via the schedule tab or through imported trend attributes. The hour:minute:second should ideally be set to 12:00:00 AM for optimal data collection across days or hours. For backfilling data into trends, the ability to set a past date as the start time is available but limited by online retention policies.
In summary, this guide provides practical advice on enhancing report functionality and managing data overviews in the system through strategic use of parameters, dynamic attributes, and adherence to best practices for trend initiation.
The article explains how the "Prioritized Attack Counts by Service" trend works in terms of data collection and display. By default, this trend starts collecting data from one day before the current date ($Today - 1d). If installed on March 29th, it will collect data starting from March 28th.
The trend is disabled by default, so if enabled later, say on April 3rd at 2:47 PM, its first run will be scheduled for midnight of April 4th. The trend will then backfill data starting from this date. When the trend data is checked on subsequent days (April 4th and April 5th), it should display the appropriate historical data as per the schedule.
To get data from before the start time into the trend, you can change the trend's start time. For trends that run daily with a 24-hour interval, be sure to select appropriate time parameters in the query to avoid performance issues. The timestamp field in the query can be used to determine the day for the report, and adjustments may be needed depending on whether the trend collects long-term data or shorter-term results such as hourly values over a day.
When working with trend queries, there are specific timestamp options to consider based on whether you're querying events or resources like assets and cases. For events, you can use either the End Time or the Manager Receipt Time as your timestamp field. Resources such as assets, cases, notifications, etc., typically have a Creation Time or Modification Time that can be used for snapshot trends in queries.
It's important to avoid using time functions like Hour, Day, etc., within trend queries unless you specifically need to aggregate and group by this time. Using these functions may store the value as a string, making it difficult to filter rows based on specific time or date ranges. Instead, use the full timestamp for flexibility in reports and other uses.
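The point about keeping the full timestamp is easy to see in a small sketch (hedged: plain Python datetimes rather than ESM trend columns). A value reduced to an hour-of-day string can still be grouped, but it can no longer be filtered by a date range, whereas the full timestamp supports both:

```python
from datetime import datetime

events = [
    {"endTime": datetime(2025, 3, 28, 9, 15)},
    {"endTime": datetime(2025, 3, 28, 23, 40)},
    {"endTime": datetime(2025, 3, 29, 9, 5)},
]

# Storing only Hour(End Time) as a string: grouping works, range filtering does not.
hour_strings = [e["endTime"].strftime("%H") for e in events]      # ['09', '23', '09']

# Keeping the full timestamp: both range filtering and grouping remain possible.
start, end = datetime(2025, 3, 28), datetime(2025, 3, 29)
in_range = [e for e in events if start <= e["endTime"] < end]     # the two March 28th events
by_hour = {}
for e in in_range:
    by_hour[e["endTime"].hour] = by_hour.get(e["endTime"].hour, 0) + 1

print(hour_strings, len(in_range), by_hour)
```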
For rules, there are four critical areas: the rule type, conditions, aggregation, and actions; variables can be used throughout these areas to help a rule carry out its function.
Confusion often arises between correlation events and correlated events. Correlation events are produced by a rule or data monitor as part of the correlation process; correlated events, by contrast, are the base events that a rule has matched. Neither should be confused with audit events, which relate to the same resources but serve a different purpose. The name of a rule's correlation event typically matches the rule's name, while a moving average data monitor generates correlation events based on its own criteria.
In ArcSight Enterprise Security Manager (ESM) 6.5, rules play a significant role in processing events, and they come in three types: Standard, Lightweight, and Pre-Persistence. Each type has specific capabilities and constraints for managing event consumption and correlation effectively.
### Rule Types
1. **Standard Rules**: These are the most flexible type of rules that can perform any function typically associated with rule execution. They offer extensive capabilities for data manipulation and processing without restrictions on actions or triggers.
2. **Lightweight Rules**: Designed for maintaining active and session lists, these rules do not send correlation events when they trigger. They are ideal for simple tasks such as adding entries to or removing them from a list; standard rules that exist only to maintain lists should be converted to lightweight rules unless their correlation events need to be consumed by other rules.
3. **Pre-Persistence Rules**: Similar in function to Lightweight rules but with the limitation that they can only set event fields. Setting these fields requires caution as there's a risk of overwriting existing data. Any changes affecting events from specific devices need reevaluation whenever updates occur in the device, connector, or parser configurations.
### Testing Guidelines
**Rule Testing**: While standard rules have well-established testing methods, lightweight and pre-persistence rules require different approaches for effective testing:
Begin by treating them as standard rules but with specific trigger settings (e.g., On Every Event for Lightweight Rules, Set Event Field actions for Pre-Persistence Rules).
Monitor correlation events to ensure modifications are correctly applied in pre-persistence rules.
Transition from standard to lightweight or pre-persistence after confirming correct operation by checking the impact on event fields and correlations.
This structure ensures that while pre-persistence and lightweight rules have limitations, they can still be used effectively with testing procedures designed for their particular characteristics.
### Categorizing Correlation Events
When creating rules for handling security events, it is important to categorize the correlation events they fire, because categorization tells the rest of the system how to treat the event data. There are two main types of rule firings, distinguished by their purpose:
1. **Categorization for reporting**: Used when the rule simply reports on a condition and no further conclusions are expected from other rules. The relevant categories are Object, Behavior, Device Group, Significance, and Outcome, which specify the type of object involved (e.g., /Host/Application), the behavior (e.g., /Execute/Response), the device group, the significance level (e.g., /Informational/Warning), and whether the activity succeeded (e.g., /Success).
2. **Aggregation for consumption by other rules**: Used when the rule passes information on so other rules can draw further conclusions or take additional action. In this case, set Category Device Group (e.g., /Security Information Manager) and fill in the remaining categories from the base events. For example, a rule that aggregates failed login attempts over a period of time might set Category Behavior to /Authentication/Verify, Category Technique to /Brute Force/Login, and Category Outcome to /Attempt. Other rules can then consume this correlation event and draw further conclusions, such as confirming a successful brute force attack when a subsequent successful login follows.
Proper categorization matters because it gives the system the context it needs: rules can report specific conditions, feed further analysis by other rules, or confirm complex scenarios such as brute force attacks.
### Ordering Rule Conditions
Performance testing of the rule engine shows that processing cost depends directly on the order and specificity of the conditions evaluated. Placing specific conditions first reduces unnecessary checks, while leading with only general conditions increases system load. Best practice is to place specific conditions up front, general conditions toward the end, and to move conditions that match rarely but are expensive to evaluate toward the bottom of the rule. As an example, consider ArcSight Administration's "/All Rules/Real-time Rules/ArcSight Administration/ESM/System Health/Resources/Rules/Excessive Rule Recursion" rule: replace "Type = Base" with "Type != Correlation" and place it at the top, since only a few audit events meet this condition, and simplify the specific condition for the system-generated base event to reduce processing time.
When rearranging conditions in the rule editor, consider working in temporary filters or saving after every change so no work is lost. The adjusted rule conditions might then look like: "Type != Correlation" at the top, "Device Event Category = /Rule/Warning/Loop" immediately below, and "Type = Base" at the bottom. This ordering evaluates the most useful conditions first and leaves the broad base-event check for last.
The text discusses the performance costs associated with various conditions in a rules engine and provides insights into why certain operators are more expensive than others. It highlights that InActiveList and HasVulnerability are among the most costly conditions to evaluate due to their complexity. MatchesFilter can also be expensive, especially when it involves heavy variable functions or complex filters.
The least expensive condition types mentioned are String Operators, which suggests they are relatively straightforward in terms of evaluation time. The text also mentions that while there is a cost associated with the MatchesFilter, it's not fully detailed, and more data on timing values by type would be beneficial for better understanding.
Additionally, the document briefly touches on how rules operate under AND and OR conditions, explaining short-circuit evaluation: in an AND condition, if one part evaluates to false, the entire condition is false regardless of the other part; similarly, in an OR condition, if one part is true, the whole condition is true without considering the second part.
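A small Python illustration of the same short-circuit behavior (Python's `and` operator skips the second operand exactly as described, so a cheap, selective check placed first can spare an expensive one; the condition functions here are purely illustrative):

```python
evaluations = {"cheap": 0, "expensive": 0}

def cheap_check(event):
    evaluations["cheap"] += 1
    return event["type"] != "Correlation"

def expensive_check(event):
    evaluations["expensive"] += 1
    return event["attacker"] in {"10.1.1.1", "10.2.2.2"}   # stand-in for an active list lookup

events = [{"type": "Correlation", "attacker": "10.9.9.9"} for _ in range(1000)]

for e in events:
    # AND condition: when the cheap check already fails, the expensive one never runs.
    cheap_check(e) and expensive_check(e)

print(evaluations)   # {'cheap': 1000, 'expensive': 0}
```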
In summary, this text provides a basic overview of how different conditions and operators perform in terms of cost and efficiency within a rules engine environment, with recommendations for optimizing performance based on the relative costs of various operations.
The article emphasizes the importance of correctly ordering conditions in logical expressions (OR conditions) for optimal performance. It suggests evaluating less expensive and more effective conditions earlier to minimize processing time. This is based on balancing two concepts: the cost of evaluating each condition and their effectiveness in eliminating potential events.
For example, if Condition A is cheap to evaluate but matches most events (so it eliminates very few), while Condition B costs a little more to evaluate but eliminates a large share of events, it can be more beneficial to evaluate Condition B first because its selectivity outweighs its higher cost. Over a large set of data (thousands of events), this noticeably reduces overall processing time.
The article also provides a practical demonstration using two conditions where:
Condition A takes 2 time units and matches 90% of potential events.
Condition B takes 3 time units and matches only 50% of potential events.
Considering the evaluation order: if you evaluate Condition A first, the initial cost is 1,000 × 2 = 2,000 time units per thousand events, and Condition B is then evaluated on the 900 matching events for another 900 × 3 = 2,700 units, giving a total of 4,700 time units. If you instead evaluate Condition B first, the initial cost is 1,000 × 3 = 3,000 units, and Condition A is then evaluated on the remaining 500 events for another 500 × 2 = 1,000 units, giving a total of 4,000 time units.
This highlights that evaluation cost alone is not the deciding factor: putting the cheaper but less selective condition first actually increased the total cost, while the slightly more expensive but far more selective condition was the better one to evaluate first. Order conditions by weighing both what they cost to evaluate and how many potential matches they eliminate.
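The arithmetic behind that comparison is easy to reproduce in a hedged sketch (the numbers are the illustrative ones above: cost in arbitrary time units, match rate as the fraction of events a condition passes):

```python
def and_rule_cost(n_events, first, second):
    """Total cost when `first` is checked on every event and `second` only on
    the events that `first` matched (AND short-circuit)."""
    cost_first, match_first = first
    cost_second, _ = second
    return n_events * cost_first + n_events * match_first * cost_second

cond_a = (2, 0.9)   # 2 time units, matches 90% of events
cond_b = (3, 0.5)   # 3 time units, matches 50% of events

print(and_rule_cost(1000, cond_a, cond_b))  # A first: 2000 + 900*3 = 4700
print(and_rule_cost(1000, cond_b, cond_a))  # B first: 3000 + 500*2 = 4000
```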
This text is about optimizing rule processing in a system that involves analyzing thousands of events to identify specific patterns or threats. The author suggests several strategies to improve efficiency and effectiveness:
1. **Condition Ordering**: Place "Device Product = ArcSight" before "Type != Correlation", and move the "Target Address in the Hostile List" check lower in the condition list, since a list lookup is more computationally expensive than simple field checks.
2. **Device Event Category (DEC)**: Set the Device Event Category (DEC) for rule correlation events to "/Rule/Fire/something", ensuring consistency across related rules. Adjust the level of resolution based on the specific needs and structure of your rules.
3. **Aggregation in Rules**: Explicitly add all necessary data to the aggregation section of a rule, especially if you need to use fields like "name". Use tools within the Rules Editor to manage aggregations properly to avoid confusion with base event names.
4. **Aggregation Types**: Understand and utilize two types of aggregation conditions:
**Common Aggregation (identical fields)** which is widely used, and
**Unique Field Aggregation**, whose implications need to be managed carefully based on the specific analysis goals.
5. **Example Application**: As a real-world example, consider rules designed to detect attacks from multiple attackers targeting a single system. This involves looking at fields that indicate unique attackers among the event sources; a sketch of this pattern follows below.
Overall, these tips aim to improve rule processing speed and accuracy by optimizing both the setup and execution phases of rule-based analysis in this type of system.
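Here is a hedged sketch of the "multiple attackers, one target" example from item 5 (plain Python rather than rule-editor aggregation settings; the field names and threshold are illustrative, and the time window is omitted for brevity). It aggregates on the identical target address while requiring a number of unique attacker addresses before firing:

```python
from collections import defaultdict

THRESHOLD = 5                        # illustrative: distinct attackers needed to fire
attackers_by_target = defaultdict(set)

def on_event(event):
    """Stand-in for 'unique field' aggregation on attackerAddress,
    grouped by the identical field targetAddress."""
    seen = attackers_by_target[event["targetAddress"]]
    seen.add(event["attackerAddress"])
    if len(seen) >= THRESHOLD:
        print(f"correlation event: {len(seen)} unique attackers against {event['targetAddress']}")
        seen.clear()                 # roughly analogous to consuming the matched events

for i in range(7):
    on_event({"attackerAddress": f"203.0.113.{i}", "targetAddress": "10.0.0.20"})
```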
The text discusses several aspects related to rule performance characteristics in a system or software that handles security events or similar data. Here's a summary based on the provided information:
1. **Rule Performance Characteristics**:
A rule can be free of logical issues and still fail to populate an active list with attacker-related information as intended, because the rules engine cannot determine which piece of information it should use.
Two types of partial matches can occur: in join rules, where events matching one alias are held while the remaining aliases have not yet matched, and with aggregation thresholds, where a rule requires a certain number of events within a given time frame (e.g., 5 events in 2 minutes).
2. **Partial Matches**:
**Join Rules**: Events are kept in memory if they partially match, but the number of aliases used should be minimized to avoid confusion and reduce computational load. Using the "Consume After Match" option per alias can help by ensuring each event is only used once for triggering rules.
**Aggregation Threshold**: Events that meet this condition are held until the specified time frame has passed, regardless of additional matches during that period. The behavior with more than required events isn't clearly defined but could affect performance or memory usage.
3. **Selectivity of Aliases (Join Rules)**:
Ensure individual aliases do not match too many events, to keep performance up and computational overhead down. Fewer aliases in a join rule are preferable (two rather than three, for example), as this generally improves efficiency.
Utilize the "Consume After Match" option for each alias to ensure each event's data isn't retained longer than necessary for matching rules.
4. **Time Windows**:
Shorter time windows are preferable as they can improve performance and reduce unnecessary processing. If more than two minutes is required, consider using an active list instead of extending the time window unnecessarily.
5. **High-level join rule example**:
For instance, to detect a login event followed by a connection from that system to another sensitive system, a join rule can be used but may require a long time window. Alternatively, employing a session list and lightweight rules could efficiently monitor logins without extending the time window significantly.
In summary, the text provides practical guidance on how to optimize performance when dealing with partial matches and time windows in event processing systems, emphasizing efficiency through minimizing unnecessary data handling and computational tasks.
Continuing that example, the text describes creating an active list (AL) or session list to monitor access events on the sensitive system. A lightweight rule updates the list with each relevant event, and the count of entries within a specified time window is then checked against a predefined threshold.
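A minimal sketch of that pattern (hedged: a Python deque stands in for the active or session list, and the 2-minute window and 5-event threshold are the illustrative values used earlier). A lightweight step records each event, and a check fires once the count inside the window reaches the threshold:

```python
from collections import deque
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=2)   # illustrative time window
THRESHOLD = 5                   # illustrative event count

recent = deque()                # stands in for the list entries

def record_and_check(timestamp):
    # Lightweight step: add the event, drop entries that fell out of the window.
    recent.append(timestamp)
    while recent and timestamp - recent[0] > WINDOW:
        recent.popleft()
    # Threshold check: fire once enough events fall inside the window.
    if len(recent) >= THRESHOLD:
        print(f"threshold reached: {len(recent)} events within {WINDOW}")
        recent.clear()

base = datetime(2025, 3, 28, 9, 0, 0)
for seconds in (0, 10, 20, 130, 140, 150, 160, 170):
    record_and_check(base + timedelta(seconds=seconds))
```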
Aggregation, or grouping, is another key aspect, where only necessary fields should be selected for better performance. Grouping by unique identifiers like event ID may not work efficiently due to high variability in values. Instead, consider using less varied fields such as request URL host. For single events, the On Every Event action can use these aggregated fields.
Variables are a useful feature that supports different types of resources and functionalities within them, including active channels, filters, queries, and rules. There are two main classes of variables: global and local. Global variables can be used by other resources, while local variables have limited scope to the resource where they are defined. In some cases, local variables can be promoted to become global.
The text discusses two main points related to data handling and variable naming in specific contexts:
1. **Variable Naming**:
It is suggested that spaces can be included in variable names, particularly with global variables, as it makes the use of these variables less error-prone in velocity macros.
Using spaces in variable names automatically enhances their presentation when used in other resources, eliminating the need for aliases.
2. **Timestamps**:
When creating date/timestamp fields in queries, it is advised against chaining multiple variables to form such fields. Instead, utilize functions provided in the select Fields tab or directly use the End Time field with appropriate time-related functions to get a formatted date if needed. This approach avoids complications and ensures clarity in timestamp handling.
In summary, the text provides practical tips for optimizing variable naming practices and managing timestamps within specific data management scenarios, aiming to improve efficiency and accuracy.
