Create and Manage IECs and IECMs
The video below demonstrates in detail how to create and manage Incident Error Codes (IECs) and their Incident Error Code Mappings (IECMs) to eliminate noise and increase granularity (19 min 17 sec).
Please watch the Video before reading the Article below.
Time Stamps:
0:06 - Initiating the Process of Managing the Incident Error Codes
To initiate the process of managing the Incident Error Codes, navigate to the Discovery Status Module.
Click on the { Discovery Admin } UI Action Button to navigate to the Troubleshooting List View, which lists the records of all previous Troubleshooting runs.
Click on the Troubleshooting Record to navigate to the Troubleshooting Form.
Click on the { Log Analysis } UI Action to navigate to the Log Analysis List View.
Group the Records in the Log Analysis List View by the Incident Error Code.
To do this, click on the Incident Error Code column header and select Group By Incident Error Code to group the results of the Troubleshooting.
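The effect of grouping can be pictured as a simple count per Incident Error Code. The following is a minimal sketch, assuming the Log Analysis records have been exported as dictionaries with a hypothetical `incident_error_code` key (the real column name may differ):

```python
from collections import Counter


def count_by_iec(records: list[dict]) -> Counter:
    """Count Log Analysis records per Incident Error Code."""
    # 'incident_error_code' is a hypothetical key used for illustration only.
    return Counter(r["incident_error_code"] for r in records)


# Illustrative records only; not real Discovery Admin output.
records = [
    {"incident_error_code": "P3.SSH.01", "dns": "host44.example.com"},
    {"incident_error_code": "P3.SSH.01", "dns": "host52.example.com"},
    {"incident_error_code": "P3.SSH.01.Critical", "dns": "db44.example.com"},
]

for iec, total in count_by_iec(records).most_common():
    print(iec, total)
```

Large groups under a single code are usually the first candidates for segregation into more specific Incident Error Codes.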
1:00 - Creation of new Incident Error Codes for Segregation
Example 1: The Incident Error Code is P3.SSH.01 and the Device DNS Name contains '44'.
Example 2: The Incident Error Code is P3.SSH.01 and the Device DNS Name contains '52'.
3:16 - Applying filters for the Creation of a New Incident Error Code with SUFFIX 'Critical'
To create a new Incident Error Code the primary step is the creation of a filter on the Log Analysis List View.
For the creation of any new Incident Error Code, the FIRST Filter should always be "Error Code" IS. Using this as the FIRST Filter implies the creation of a new Incident Error Code.
For the first example, the second filter is Device DNS Name contains "44". To add a second filter, click on the AND button and configure the second filter.
Once the filter is configured, click on the RUN button to run the filters. Now all the results adhering to the filter will be shown.
Once the desired results are obtained, to create a new Incident Error Code, click on the { Incident Error Code } UI Action.
A dialogue appears on the screen, where we have to input the SUFFIX for our new Incident Error Code.
Based on the example above, input the SUFFIX as 'Critical'.
This redirects us to the new Incident Error Code Form with the IEC and corresponding IECM.
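The two filter conditions from Example 1 can be pictured as a ServiceNow encoded query, where `=` means IS, `LIKE` means CONTAINS, and `^` joins conditions with AND. This is only a sketch: the field names `error_code` and `device_dns_name` are assumptions, since this Article does not give the actual column names on the Log Analysis Table.

```python
def build_filter(error_code: str, dns_fragment: str) -> str:
    """Return an encoded query: Error Code IS <code> AND Device DNS Name CONTAINS <fragment>."""
    # Hypothetical field names; the real Log Analysis columns may differ.
    first = f"error_code={error_code}"               # FIRST condition: 'Error Code' IS
    second = f"device_dns_nameLIKE{dns_fragment}"    # second condition: CONTAINS
    return "^".join([first, second])                 # '^' means AND


print(build_filter("P3.SSH.01", "44"))
```

Keeping 'Error Code IS' as the first condition mirrors the rule described above for creating a new Incident Error Code.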
5:55 - Applying filters for the Creation of a New Incident Error Code with SUFFIX 'Ignore'
To create the second Incident Error Code, navigate to the Log Analysis List View and apply the filters for the second example in the same way as in the example above.
Click on the { Incident Error Code } UI Action button and enter the SUFFIX of the new Incident Error Code.
Based on the example above, input the SUFFIX as 'Ignore.Not_Critical'.
This redirects us to the new Incident Error Code Form with the IEC and corresponding IECM.
Note that the Active flag for the new Incident Error Code will be False, as it is detected from the SUFFIX that the code has to be ignored. As a result, it will not be considered during the process of Incident Generation.
7:58 - Adding new rules to the existing Incident Error Codes
Navigate to the Incident Error Code created in the example above (with SUFFIX = 'Critical').
To add a new IECM to an existing IEC, it is not necessary to recreate the filters manually. Navigate to the IECM at the bottom of the IEC. On the IECM Form, click the hyperlink to navigate to the Log Analysis List View with the previously applied filters.
This can be very helpful when the filters are very complex and it eliminates the need to recreate the filters manually. Update the new rules by configuring the new filter and then clicking on the { Incident Error Code } UI Action.
Enter the same SUFFIX as the previously created IEC.
Matching the SUFFIX to a previously created IEC's SUFFIX will add the new IECM to the existing IEC.
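The relationship between the SUFFIX and the IEC can be sketched as a lookup keyed by the IEC name: a new SUFFIX creates a new IEC, while a SUFFIX that matches an existing IEC adds another IECM to it. This is an illustration of the behavior described above, not Discovery Admin's implementation. The dot separator between Error Code and SUFFIX, and the filter strings, are assumptions.

```python
def add_rule(iecs: dict[str, list[str]], error_code: str, suffix: str, filter_query: str) -> str:
    """Attach a filter (IECM) to the IEC named <error_code>.<suffix>, creating the IEC if needed."""
    name = f"{error_code}.{suffix}"          # assumed naming: Error Code + '.' + SUFFIX
    iecs.setdefault(name, []).append(filter_query)
    return name


iecs: dict[str, list[str]] = {}
# Hypothetical filters, written as encoded-query-style strings for illustration.
add_rule(iecs, "P3.SSH.01", "Critical", "error_code=P3.SSH.01^device_dns_nameLIKE44")
add_rule(iecs, "P3.SSH.01", "Critical", "error_code=P3.SSH.01^device_dns_nameLIKE45")

print(len(iecs), len(iecs["P3.SSH.01.Critical"]))
```

In this sketch the second call reuses the SUFFIX 'Critical', so it results in one IEC with two IECMs.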
11:55 - Order of Incident Error Codes
When more than one Incident Error Code Mapping matches a Log error, the 'order' of the matching IECMs determines which Incident Error Code is associated with that Log error.
IECMs with a lower 'order' are given lower precedence.
IECMs with a higher 'order' are given higher precedence.
The default value of the 'order' for any IECM created by the Customer is 20,000 (5 digits). This value can be changed based on the requirements to resolve conflicting IECMs.
The Specific / Exception IECMs have an 'order' higher than 20,000, thus giving them higher precedence. This also applies to unique scenarios where you may need to Ignore certain IP Addresses from previously configured IECs.
15:43 - Creation of a new Troubleshooting for verifying the changes
Select the Discovery Records from the Discovery Status module and then click on the { Discovery Admin } UI Action button, which starts the troubleshooting of the selected records.
Once the Troubleshooting is complete, view the results of the Troubleshooting in the Log Analysis Table, where we can group the results by the Incident Error Code and verify the results of the above configurations.
17:53 - Deleting a record
Navigate to the IEC Form and click on Delete. This will delete all the IECMs associated with this IEC.
To delete an IECM (and retain the IEC), navigate to the IECM at the bottom of the IEC Form and then click on Delete.
Alternatively, switching the Active Flag on the IECM Form to False will prevent the IECM from being processed.
The Incident Error Code and the Incident Error Code Mapping are the key to contextualizing the results of Discovery Admin to better align with the results of ServiceNow Discovery in your environment.
The FIRST step for creating a new Incident Error Code is to create a filter on the Log Analysis List View, with the FIRST filter condition being 'Error Code IS'.
NOTE (again): The FIRST filter condition should be 'Error Code IS' (NOT 'Incident Error Code IS')
This filter lets Discovery Admin know that a new Incident Error Code needs to be created.
After the FIRST filter, any other filter on the Log Analysis Table can be appended to meet the requirements.
The distinction between Error Code and Incident Error Code is nuanced and significant.
Error Codes are the default codes that are generated by Discovery Admin and CANNOT be modified.
In contrast, the Incident Error Code complements the Error Code.
The Incident Error Code is designed to be extended and enhanced by following the steps and guidelines outlined in the Video above and this Article.
This permits the Customer to further contextualize the results of Discovery Admin to meet their specific needs, providing flexibility in organizing and tracking different types of errors.
The first three segments of the Incident Error Code are the same as the corresponding Error Code. An actionable Error Code does NOT have a suffix, but an Incident Error Code can be extended to have an Optional Suffix.
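The naming rule can be sketched as a small parser that separates the Error Code from the optional suffix. It assumes that segments are separated by dots, as in the examples in this Article (for instance P3.SSH.01 with the suffix Ignore.Not_Critical); the function name is hypothetical.

```python
def split_iec(iec_name: str) -> tuple[str, str]:
    """Split an IEC name into (Error Code, optional suffix)."""
    parts = iec_name.split(".")
    error_code = ".".join(parts[:3])   # first three segments match the Error Code
    suffix = ".".join(parts[3:])       # empty string when there is no suffix
    return error_code, suffix


print(split_iec("P3.SSH.01"))
print(split_iec("P3.SSH.01.Critical"))
print(split_iec("P3.SSH.01.Ignore.Not_Critical"))
```

An empty suffix corresponds to an Incident Error Code that has not been extended beyond its Error Code.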
Discovery Admin ships with several Incident Error Codes which have been extended (using the approach above) to provide additional context for DNS and CMDB Lookups.
Note that if an IEC is not actionable or it is not scoped for Incident Generation, we MUST include the suffix 'Ignore' in the name of the Incident Error Code.
NOTE: 'Ignore' is case-sensitive. Do not use 'IGNORE' or 'ignore' or any other variation.
This aligns with the out-of-the-box design for non-actionable Incident Error Codes. It also sets the 'active' flag for that Incident Error Code to 'false', indicating that no Incident will be generated for that particular Incident Error Code.
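The case-sensitive 'Ignore' rule can be illustrated with a short check. How Discovery Admin actually detects the keyword is not documented here, so this sketch assumes a case-sensitive match on a dot-separated segment of the IEC name; the function name is hypothetical.

```python
def is_active(iec_name: str) -> bool:
    """Assumed rule: the IEC is inactive when any dot-separated segment is exactly 'Ignore'."""
    return "Ignore" not in iec_name.split(".")


print(is_active("P3.SSH.01.Critical"))
print(is_active("P3.SSH.01.Ignore.Not_Critical"))
print(is_active("P3.SSH.01.ignore"))  # wrong case: not recognized as 'Ignore'
```

Under this assumption, a lowercase 'ignore' would leave the IEC active, which is why the exact spelling matters.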
If the particular Incident Error Code is no longer needed, we can do one of the following:
Filter out the IEC from any Reports that it may be a part of.
Switch the 'active' flag on the IEC from 'true' to 'false'.
Delete the IEC. Deleting the IEC automatically deletes any related IECMs.
NOTE: Do NOT delete (or rename) any out-of-the-box Incident Error Code(s). Instead, create new IECs following the steps and guidelines outlined in the Video above and this Article
If a particular Incident Error Code Mapping is no longer needed, we can do one of the following:
Switch the 'active' flag on the IECM from 'true' to 'false'.
Delete the IECM. Deleting the IECM does NOT automatically delete the corresponding IEC.
When more than one Incident Error Code Mapping matches a Log error, the 'order' of the matching IECMs determines which Incident Error Code is associated with that Log error.
IECMs with a lower 'order' are given lower precedence.
IECMs with a higher 'order' are given higher precedence.
NOTE: This is different than how ServiceNow interprets 'order' where the lower 'order' implies higher precedence.
IECM 'order' values with 3 digits (DNS) and 4 digits (CMDB Lookup) are reserved for our internal use.
The default value of the 'order' for any IECM created by the Customer is 20,000 (5 digits). This value can be changed based on the requirements to resolve conflicting IECMs.
We recommend assigning the values for the 'order' in the range of 15,000 to 25,000.
As a result, any IEC created by the Customer with an IECM 'order' between 15,000 and 25,000 takes precedence, by design, over a corresponding out-of-the-box IECM, because the out-of-the-box IECMs use the reserved 3-digit and 4-digit 'order' values.
The Generic / Catch-All IECMs have an 'order' lower than 20,000, thus giving them lower precedence.
The Default IECMs have an 'order' of 20,000. This is fine for most use cases as rules are usually unique and do not overlap.
The Specific / Exception IECMs have an 'order' higher than 20,000, thus giving them higher precedence. This also applies to unique scenarios where you may need to Ignore certain IP Addresses from previously configured IECs.
This concept is important if you observe that Discovery Admin is not generating the expected IEC.
In this scenario, review the IECMs (corresponding to the IECs) for conflicting / overlapping rules and fix the 'order' for those IECMs based on the guidelines above.
See the above video from Timestamp 11:55 to 14:20 for an additional explanation about the importance of the 'order' of the IECM.
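The precedence rule can be sketched as "among the active IECMs whose filter matches, the one with the highest 'order' wins". This is a simplified model under stated assumptions, not Discovery Admin's actual code: the `Iecm` class, the record keys, the IEC names, and the IP address are all hypothetical, and how ties between equal 'order' values are resolved is not covered by this Article.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Iecm:
    iec: str                          # name of the IEC this mapping points to
    order: int                        # higher 'order' means higher precedence
    matches: Callable[[dict], bool]   # stand-in for the Log Analysis List Filter
    active: bool = True


def resolve_iec(log: dict, mappings: list[Iecm]) -> Optional[str]:
    """Return the IEC of the matching active IECM with the highest 'order'."""
    candidates = [m for m in mappings if m.active and m.matches(log)]
    if not candidates:
        return None
    return max(candidates, key=lambda m: m.order).iec


mappings = [
    # Default rule (order 20,000): DNS name contains '44'.
    Iecm("P3.SSH.01.Critical", 20000, lambda r: "44" in r["dns"]),
    # Exception rule (order above 20,000): ignore one specific IP address.
    Iecm("P3.SSH.01.Ignore.Lab", 21000, lambda r: r["ip"] == "10.0.0.5"),
]

print(resolve_iec({"dns": "host44.example.com", "ip": "10.0.0.5"}, mappings))
print(resolve_iec({"dns": "host44.example.com", "ip": "10.0.0.6"}, mappings))
```

In the first call both rules match and the exception rule wins because its 'order' is higher; in the second call only the default rule matches.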
UI Tip: Mouse over the attributes on the Forms to read the Hints, which further explain what each attribute is.
UI Tip: Make sure you Run the filter on the Log Analysis List View, before clicking the { Incident Error Code } UI Action, as it needs the updated breadcrumb to trigger the creation of the new IEC.
Design Tip: The effort put in to configure IECs pays dividends, as the rules are automatically applied to every subsequent Troubleshooting run by Discovery Admin.
Design Tip: When designing the rules and filters to create new IECs, we get the best query performance with the 'STARTS WITH' filter, followed by the 'IS' filter followed by the 'CONTAINS' filter.
Design Tip: DO NOT use spaces in the IEC suffix (or in any part of the IEC Name). Use an underscore or dot instead.
Design Tip: DO NOT use any IEC Name in the IEC suffix.
Design Tip: One IEC can have more than one IECM.
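The suffix naming tips above (no spaces, and the exact spelling 'Ignore') can be checked before a suffix is entered in the dialog. The helper below is a hypothetical pre-check written for illustration; it is not part of Discovery Admin.

```python
import re


def suffix_problems(suffix: str) -> list[str]:
    """Return a list of naming problems found in a proposed IEC suffix."""
    problems = []
    if re.search(r"\s", suffix):
        problems.append("contains whitespace; use '_' or '.' instead")
    for segment in suffix.split("."):
        # 'Ignore' is case-sensitive, so flag any other capitalization.
        if segment.lower() == "ignore" and segment != "Ignore":
            problems.append(f"'{segment}' should be written as 'Ignore'")
    return problems


print(suffix_problems("Ignore.Not_Critical"))
print(suffix_problems("Not Critical"))
print(suffix_problems("IGNORE.Lab"))
```

An empty list means the suffix passes both checks in this sketch.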
Configuration Tip: The filters for creating a new IEC are CASE-SENSITIVE. Though the filters in the UI are NOT case sensitive (i.e. you will still see the results on the List View), the filter is case sensitive when applied in the backend (i.e. when ServiceNow is processing these rules during the Troubleshooting).
Keep this in mind when creating filters on String fields.
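The difference between the two behaviors can be illustrated with two CONTAINS checks, one that ignores case (as the List View appears to) and one that does not (as the backend processing does). Both functions are hypothetical stand-ins for illustration, not the actual ServiceNow or Discovery Admin logic.

```python
def ui_contains(value: str, needle: str) -> bool:
    """Case-insensitive CONTAINS, modelling what the List View shows."""
    return needle.lower() in value.lower()


def backend_contains(value: str, needle: str) -> bool:
    """Case-sensitive CONTAINS, modelling how the rule is applied during Troubleshooting."""
    return needle in value


dns_name = "PROD-DB-44.example.com"   # illustrative value only
print(ui_contains(dns_name, "prod"))
print(backend_contains(dns_name, "prod"))
print(backend_contains(dns_name, "PROD"))
```

A filter typed as 'prod' would appear to work in the List View but would not match this record when the rule is processed, so the filter value should use the same case as the stored data.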
Configuration Tip: We should use the keyword 'Ignore' instead of 'ignore' or 'IGNORE' to align with the IEC naming convention to Eliminate Noise with IECs.
Configuration Tip: ANY attribute on the Log Analysis Table (except the Incident Error Code Reference attribute and the attributes in the IP Address Insights section) can be used to create IECs via the Log Analysis List View Filter. Discovery Admin also supports dot-walking to Reference Fields on the Log Analysis Table via the Log Analysis List View Filter.
Configuration Tip: For Filter Conditions that include 'compounded negative conditions', for example 'Attribute1 DOES NOT CONTAIN abc' OR 'Attribute1 DOES NOT CONTAIN xyz', the Filter Condition should have an AND instead of an OR between these conditions. In other words, to exclude records that contain any one of several values, join the negative conditions with AND when you build them in the List View Filter. With OR, a record passes as soon as any one negative condition is true, so a value that contains only 'abc' would still be returned.
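The reason for using AND follows from De Morgan's law: "does not contain abc or xyz" is the same as "does not contain abc AND does not contain xyz". A minimal sketch with illustrative values (the function names are hypothetical):

```python
def passes_with_or(value: str) -> bool:
    """'DOES NOT CONTAIN abc' OR 'DOES NOT CONTAIN xyz'."""
    return ("abc" not in value) or ("xyz" not in value)


def passes_with_and(value: str) -> bool:
    """'DOES NOT CONTAIN abc' AND 'DOES NOT CONTAIN xyz'."""
    return ("abc" not in value) and ("xyz" not in value)


for value in ["host-abc", "host-xyz", "host-01"]:
    print(value, passes_with_or(value), passes_with_and(value))
```

With OR, "host-abc" and "host-xyz" both pass the filter even though each contains an excluded value; with AND, only "host-01" passes.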
Configuration Tip: Visually validate the results of the created filter in the Log Analysis List View before creating the new IEC.
(To validate an IECM that has already been created, click on the URL corresponding to the attribute 'Log Analysis List Filter' on the IECM Form.)
What you see in these results is what will be filtered when Discovery Admin is applying the rules during the analysis.
HOWEVER, if you don't see the expected IEC with Discovery Admin after validating the filter, MAKE SURE there are no TYPOS in your existing Filter.
If there are no typos, see the next Configuration Tip.
Configuration Tip: Referencing the tip above, if you still don't see the expected IEC, the likely cause is a conflicting ORDER configured in the corresponding Incident Error Code Mapping.
To view a list of ALL Incident Error Code Mappings, navigate to the Incident Error Code Mapping List View via the { IECM List View } UI Action on the Incident Error Code Mapping Form.
On the IECM List View, make sure the ORDER on the IECM corresponding to the IEC you want to see is higher than the ORDER on the IECM for the IEC you do not want to see (while ensuring it does not override other IECs that have a higher priority).
Filtering on the conflicting IECs in the IECM List will help focus on the IECMs that need to be assessed for updating the ORDER.
Once the ORDER has been updated, it is recommended to do a targeted ServiceNow Discovery on that IP Address and leverage the { Troubleshoot with Discovery Admin } UI Action on the Discovery Status Form to quickly confirm that you are getting the expected IEC with Discovery Admin.
Configuration Tip: Add an explanation in the Description field of the IECM to explain what the filter is supposed to do.
Configuration Tip: To create additional IECMs for the same IEC, reference the tip in the Video above (see Timestamp from 8:22 to 11:18) to navigate to the IECM and leverage its existing filter to create a new IECM for that IEC.
Configuration Tip: Once you create a new IEC / IECM, you need to run Discovery Admin again to see the results with the updated IECs and IECMs. Discovery Admin will NOT retroactively apply the new rules to the analysis that is already complete.