Friday, February 18, 2011

EMC Ionix: Integration Basics (part 2)


Abstract:
Higher-level integrations with network management frameworks are normally facilitated through command-line processes. SMARTS, the producer of a product called InCharge, which was a market leader in event correlation, was later purchased by EMC, which consolidated the product into the Ionix framework. In the EMC Ionix framework, a higher-level enterprise management system integration utility ("sm_ems") simplifies integration.

Integration Point:
The Managers, Open Integration, and Service Assurance Manager can be integrated with via the following command:
  • sm_ems
    Performs individual queries and updates to a manager or manager-of-managers
The "sm_ems" command can be used for basic interfacing from external languages.

SM_EMS:
The "sm_ems" command offers the following options:

SparcSolaris/User777$ sm_ems --help
Usage: sm_ems [options...] [command]
Options:

--server=[name] The name of the server. Also -s.
--broker=[location] Alternate Broker location as host:port.
--system=[nameOrAddr]
Specify the name or IP address of the system this alarm is associated
with. The event will automatically be associated with this system in
the ICOI topology. The system name is canonicalized using host
name lookups. If the system does not exist in the topology it may
be created automatically if the -create-system option is specified.
Also -t.
--create-system
Indicates that the system should automatically be created if it does
not exist in the topology. The class defaults to Node, or use the
--element-class option to specify the class name.
Also -c.
--element-class=[className]
Class name to be used if the system specified by --system
option is not found in the InCharge topology and --create-system
is specified.
Also -e .
--element-name=[InstanceName]
Instance name to be used with --element-class option
Options provided with --element-class and --element-name will be used
to create the object. --system should not be used if --element-class and
--element-name are mentioned.
is specified.
Also -v .
--create-element
Indicates that the element-class and element-name should automatically
be created if it does not exist in the topology.
Also -C.
--aggregate-element-class=[className]
Aggregate Event Class name to be used if you want to generate an Aggregate
Also -E .
--aggregate-element-name=[InstanceName]
Aggregate Instance name to be used with --aggregate-element-class option
Using --aggregate-element-class and --aggregate-element-name Aggregate Event
will be created.
Also -V .
--aggregate-event-name=[aggregate Event Name]
Aggregate Event name to be used if you want to generate an Aggregate
Also -g .
--audit=[msg]
Optional text to include in the description field of the
audit log entry created for the action. Note that this
option is ignored for the add-audit-log command.
Also -a.
--traceServer Enable tracing of server communications.
--source-event-type=[eventType]
Optional source event type for notification. If not specified, no source event type will be passed in the notify() call, which will result in the server inserting
a default value (typically "UNKNOWN") into the SourceEventType attribute.
This option only works with a server newer than 6.2-SP2.

Commands:
notify [class] [name] [event] [src] [type] [clear-mode] [[attr]=[val] ...]
Notify an occurrence of the notification identified by
[class] [name] and [event].

[src] indicates the name of the application
generating the notification. Note that a subsequent
invocation to clear this notification must specify
the same value for [src]

[type] indicates the nature of the event and it
must have the value 'momentary' or 'durable'. A
momentary event has meaning only at a specific point
in time; it has no duration. An authentication failure
event is a good example. A durable event has a duration
over which the event is active and after which the
event is no longer active. An example of a durable
event is a link failure.

[clear-mode] indicates the mechanism by which the event
will be cleared. This parameter is ignored when the
type is discrete. The value 'source' indicates that
the notification will be cleared automatically by the
source when the event goes away. A value of [n]
indicates that the notification should expire in [n]
seconds. A value of 'none' indicates that the notification
should not expire and that the source will not generate
a clear event; this implies that the actual duration of
the occurrence will not be known. In this case the
system clears the event when it is acknowledged.

[attr]=[val] ... are optional attribute/value
pairs where [attr] is the attribute name and
[val] is the value. These parameters may be used
to set additional attribute values for the notification
object.

update [class] [name] [event] [attr]=[value]
Update one or more the attributes of an event.

clear [class] [name] [event] [src]
Clear an occurrence of the notification identified by
[class], [name], and [event]. [source]
indicates the name of the application generating
the clear.
assign [class] [name] [event] [owner]
Assign ownership of the notification identified by
[class], [name], and [event] to [owner].

release [class] [name] [event]
Release ownership of the notification identified by
[class], [name], and [event]. The caller
must be the owner of the notification.

acknowledge [class] [name] [event]
Acknowledge the notification identified by
[class], [name], and [event]. The
caller must be the owner of the notification in
order to acknowledge it.

unacknowledge [class] [name] [event] [owner]
Unacknowledge the notification identified by
[class], [name], and [event]. The
caller must be the owner of the notification in
order to unacknowledge it.

add-audit-log [class] [instance] [event] [message]
Add a user note containing [message] to the audit
log for the notification identified by [class]
[instance], and [event]. Note that the --audit will
be ignored for this option.

print [class] [name] [event]
Print the properties including the audit log for the
notification identified by [class] [name] and [event].

summarize [NL name]
Print a summary of all notifications of
all NL events


Standard Options:
--help Print help and exit.
--version Print program version and exit.
--daemon Run process as a daemon.
--logname=[name] Use [name] to identify sender in the system log.
Default: The program's name.
--loglevel=[level] Minimum system logging level. Default: Error.
--errlevel=[level] Minimum error printing level. Default: Warning.
--tracelevel=[level] Minimum stack trace level. Default: Fatal.
[level]: One of None, Emergency, Alert,
Critical, Error, Warning, Notice, Informational,
or Debug. Fatal is a synonym for Critical.
--facility=[facility] Non-Windows only. A case-insensitive string which
identifies the facility to use for syslog messages.
[facility]: One of Cron, Daemon, Kern, Local0-Local7,
Lpr, Mail, News, Uucp, User. Default: Daemon.
--output[=[file]] Redirect server output (stdout and stderr). The
file name is [file], or the --logname value if
[file] is omitted. Log files are always placed
in $SM_LOGFILES or $SM_WRITEABLE/logs.
--accept=[host-list] Accept connections only from hosts on
[host-list], a comma-separated list of host
names and IP addresses. --accept=any allows
any host to connect. Default: --accept=any.
--useif=[ip-address] Use this IP address as the source/destination
interface address for SNMP and ICMP packets.
-- Stop scanning for options.
For more information:
file:/opt/InCharge7/SAM/smarts/doc/html/usage/index.html
http://www.EMC.com/
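
The "notify" and "clear" descriptions in the help text above can be combined into a small dry-run sketch. It only prints the command lines and does not contact a server. The server name, the source name "my-script", and the sample class, instance, and event values are assumptions for illustration, and the arguments are assumed to contain no spaces.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) sm_ems notify/clear command lines.
# SERVER, SRC and the sample arguments below are hypothetical values.
SERVER="SAM-27"
SRC="my-script"

# notify [class] [name] [event] [src] [type] [clear-mode]
# $1=class $2=name $3=event $4=type (momentary|durable) $5=clear-mode
build_notify() {
    printf 'sm_ems --server=%s notify %s %s %s %s %s %s\n' \
        "$SERVER" "$1" "$2" "$3" "$SRC" "$4" "$5"
}

# clear [class] [name] [event] [src]
# The help text requires the same [src] that was used for the notify.
build_clear() {
    printf 'sm_ems --server=%s clear %s %s %s %s\n' \
        "$SERVER" "$1" "$2" "$3" "$SRC"
}

build_notify Router ABC_EXAMPLE01 Down durable source
build_clear  Router ABC_EXAMPLE01 Down
```

Keeping the source name in one variable makes it harder to raise a notification under one source and then try to clear it under another.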

One of the most powerful "sm_ems" commands is "summarize", which quickly lists the notifications held by a manager.

SparcSolaris/user777$ sm_ems --server=SAM-27 summarize ALL_NOTIFICATIONS

ClassDisplayName = Router
InstanceDisplayName = ABC_CUAUHTEMOC99
EventDisplayName = Down
Active = TRUE
Acknowledged = FALSE
Category = Availability
TroubleTicketID =
Owner =

ClassDisplayName = Interface
InstanceDisplayName = IF-ABC_CUAUHTEMOC99/106 [VoiceEncapPeer20018]
EventDisplayName = Down
Active = TRUE
Acknowledged = FALSE
Category = Availability
TroubleTicketID =
Owner =
...

Note that the output above contains two types of records:
  • Interface Record
    An Interface Record belongs to the Interface class; its display name carries an "IF-" prefix and a "/#" after the device name.
    (The "#" is the SNMP ifIndex of the interface. It can change on the device after a reboot or other reconfiguration; Ionix will only recognize the change after a re-discovery.)
  • Device Record
    A Device Record has no "IF-" style prefix on its display name; in this example it belongs to the Router class.
For simplicity, device names are always given a prefix when loaded into SMARTS; the prefix is "ABC_" in the above example.
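
The distinction between the two record types can be sketched in awk as follows. This is a minimal illustration that reads summarize-style text on standard input; the sample records are copied from the output above, and the function name is an assumption. On Solaris, "nawk" would take the place of "awk".

```shell
#!/bin/sh
# Sketch: label each record in summarize-style output as Interface or Device,
# based only on the shape of InstanceDisplayName.
classify_records() {
    awk '
    /^InstanceDisplayName/ {
        name = $0
        sub(/^InstanceDisplayName = /, "", name)
        # Interface records look like "IF-<device>/<ifIndex> ..."
        if (name ~ "^IF-[^/]+/[0-9]+") kind = "Interface"
        else                           kind = "Device"
        printf "%s\t%s\n", kind, name
    }'
}

# Sample input standing in for real "sm_ems ... summarize" output.
classify_records <<'EOF'
ClassDisplayName = Router
InstanceDisplayName = ABC_CUAUHTEMOC99
EventDisplayName = Down

ClassDisplayName = Interface
InstanceDisplayName = IF-ABC_CUAUHTEMOC99/106 [VoiceEncapPeer20018]
EventDisplayName = Down
EOF
```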

The output of the "sm_ems" command can easily be parsed with POSIX awk to produce extracts, run integrity checks against external systems, and feed external management systems.

An example follows to parse Device up/down types of events using the "sm_ems" command where the host name prefix is "ABC_":
SparcSolaris/User777$ sm_ems --server=SAM-27 summarize ALL_NOTIFICATIONS | nawk '
BEGIN { Pattern="%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n" }
# clear vars on new record (every record starts with ClassDisplayName)
/^Class/ { Class="" ; Inst=""; Event=""; Active="";
           Ack=""; Cat=""; TT=""; Owner="" ; Tag="" }
# read record; the instance name may contain spaces, so strip the label from $0
/^Class/ { Class=$3 }
/^Insta/ { Inst=$0 ; sub("^InstanceDisplayName = ","",Inst) }
/^Event/ { Event=$3 }
/^Activ/ { Active=$3 }
/^Ackno/ { Ack=$3 }
/^Categ/ { Cat=$3 }
/^Troub/ { TT=$3 }
/^Owner/ { Owner=$3 }
# tag interesting records: device names that start with the "ABC_" prefix
/^Insta/ && $3~/^ABC_/ { Tag="Yes" }
# print interesting record in columns (Owner is the last line of a record)
/^Owner/ && Tag=="Yes" { printf Pattern,Class,Inst,Event,Active,Ack,Cat,TT,Owner }'

Node ABC_ANEA03_ID Down FALSE TRUE Availability AR000000003967636 SYSTEM
Node ABC_ANVW04_ID Down FALSE TRUE Availability AR000000003968578 SYSTEM
Node ABC_ANSM12_BR Down FALSE TRUE Availability AR000000003968469 SYSTEM
...
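
Columnar output of this kind also lends itself to the integrity checks with external systems mentioned earlier. The following is a sketch only: it assumes a hypothetical inventory file with one device name per line, and tab-separated rows with the device name in the second column.

```shell
#!/bin/sh
# Sketch: list devices that have notifications but are absent from an
# external inventory. The inventory file format is an assumption.
# $1    = inventory file, one device name per line
# stdin = tab-separated rows, device name in column 2
missing_from_inventory() {
    awk -F '\t' -v inv="$1" '
    BEGIN {
        # Load the known device names before reading any rows.
        while ((getline line < inv) > 0) known[line] = 1
        close(inv)
    }
    !($2 in known) { print $2 }'
}

# Demo with a temporary inventory that knows only one of the two devices.
inventory=$(mktemp)
printf 'ABC_ANEA03_ID\n' > "$inventory"
printf 'Node\tABC_ANEA03_ID\tDown\nNode\tABC_ANVW04_ID\tDown\n' |
    missing_from_inventory "$inventory"
rm -f "$inventory"
```

Loading the inventory in the BEGIN block, rather than relying on file order on the command line, keeps the check correct even when the inventory file is empty.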


The beauty of "nawk" in conjunction with "sm_ems" is how easily one can move from reporting to interfacing with foreign Ionix systems.

To replicate the notifications from a source SAM to a destination SAM, only a couple more nawk statements are required: print out the corresponding "sm_ems" command for each record and pipe the result to a shell.
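
A minimal sketch of that idea follows. It prints one "notify" command per active device record and leaves the final pipe to a shell to the operator. Several things here are assumptions rather than facts from the product documentation: the destination server name "SAM-28", the source name "SAM-27-replica", the choice of "durable" and "source" for [type] and [clear-mode], and the use of the display names printed by "summarize" as the [class], [name], and [event] arguments.

```shell
#!/bin/sh
# Sketch: turn summarize-style records into sm_ems notify commands for a
# second SAM. Commands are printed only; appending "| sh" would run them.
# $1 = destination server (hypothetical), $2 = source name (hypothetical)
gen_notify_cmds() {
    awk -v dest="$1" -v src="$2" '
    /^ClassDisplayName/    { class = $3; inst = ""; event = ""; active = "" }
    /^InstanceDisplayName/ { inst = $0; sub(/^InstanceDisplayName = /, "", inst) }
    /^EventDisplayName/    { event = $3 }
    /^Active/              { active = $3 }
    # Owner closes each record. Emit only active ABC_ devices, and only
    # values made of safe characters, since the output is meant for a shell.
    /^Owner/ && active == "TRUE" &&
        inst  ~ /^ABC_[A-Za-z0-9_]+$/ &&
        class ~ /^[A-Za-z0-9_]+$/ && event ~ /^[A-Za-z0-9_]+$/ {
        printf "sm_ems --server=%s notify %s %s %s %s durable source\n",
               dest, class, inst, event, src
    }'
}

# Sample record copied from the summarize output shown earlier.
gen_notify_cmds SAM-28 SAM-27-replica <<'EOF'
ClassDisplayName = Router
InstanceDisplayName = ABC_CUAUHTEMOC99
EventDisplayName = Down
Active = TRUE
Acknowledged = FALSE
Category = Availability
TroubleTicketID =
Owner =
EOF
```

Reviewing the printed commands before piping them to a shell is a cheap safeguard, because anything the filter lets through will be executed verbatim.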

Conclusion:
The "sm_ems" command provides a simple integration point into Ionix for reporting, and can also facilitate the movement of notifications to foreign systems with standard POSIX commands like "awk".
