VNXCollector – DIY EMC VNX Monitoring and Reporting

UPDATE

This tool has been improved and integrated into a new tool, Universal Storage Collector.

Here is one more tool I have created for working with EMC VNX, CLARiiON and Celerra storage systems.

VNXCollector is a tool for gathering information from EMC CLARiiON/Celerra/VNX and sending it to Graphite.

What’s wrong with EMC VNX Monitoring and Reporting

VNX Monitoring and Reporting automatically collects block and file storage statistics along with configuration data, and stores them into a database that can be viewed from dashboards and reports.

What’s wrong:

  1. It needs a lot of resources.
    For up to 10 storage systems we need 8 CPU cores and 32 GB RAM.
  2. It’s not scalable.
    What if you need to monitor 20-30 systems? Could one VNX M&R instance gather data from all these systems? Is it practical to run several independent VNX M&R instances?
  3. It collects mostly configuration information, but not performance data.
    You can’t change which parameters VNX M&R collects.
  4. You are very limited in changing its reports.
  5. Companies already have their own systems for monitoring their infrastructure.
    And they need a tool that can integrate with these systems.

So we need a new, usable tool, one without these problems.

Collecting data from EMC VNX, CLARiiON and Celerra

What can we use to gather data from EMC VNX, CLARiiON and Celerra?

The SNIA’s Storage Management Initiative (SMI) unifies the storage industry to develop and standardize interoperable storage management technologies. SMI develops the Storage Management Initiative Specification – SMI-S – the international storage management standard.

EMC offers the SMI-S Provider, which allows a client application to retrieve information about a Symmetrix array, a CLARiiON array, and the VNX family of storage systems, as well as to change the configuration of these EMC storage systems.

To use SMI-S for VNX Block and CLARiiON we need to install the EMC SMI-S Provider on a computer. VNX File and Celerra have the SMI-S Provider installed by default on the Control Station.

But SMI-S from EMC provides only configuration information. It doesn’t provide performance data. So, we can’t use it.

Gathering data from VNX Block and CLARiiON

The only instrument we can use to gather data from VNX Block and CLARiiON is the naviseccli utility.

For example, to get information about an SP we can use the command: naviseccli -h spa getsp

[Screenshot: cam05-getsp]

Naviseccli can also provide output in XML format:

[Screenshot: dev05-port]

But it looks like the EMC programmers were drunk when they added XML output to naviseccli. Some examples:

[Screenshots: dev01-getall-sp, dev05-getall-cache, camvm-getdisk-xml2]

The next problem with naviseccli is that its output differs between firmware revisions and model generations.

Information about cache from VNX1:

[Screenshot: dev01-getallcache]

And from VNX2:

[Screenshot: dev11-getallcache]

Information about disks; both outputs are from VNX1, but with different firmware versions:

[Screenshots: cam05-getdisk, dev01-getdisk]

To use naviseccli we need to know how to parse its output, and we have to take all these quirks into account.

Gathering data from VNX File and Celerra

Gathering performance data from VNX File and Celerra is a much easier task. We can use the standard server_stats command on the Control Station.

[Screenshots: file.basic, file.nfs]

Its output is easy to parse when it is in CSV format.

To use server_stats we need to be able to connect to the Control Station over SSH.

VNXCollector

Now it’s time to talk about the tool I have created.

Architecture

Many companies use Graphite to monitor their infrastructure.

Graphite does two things:

  1. Store numeric time-series data
  2. Render graphs of this data on demand

Graphite does not collect data itself, but sending data to Graphite is very simple.

Graphite consists of 3 software components:

  1. carbon – a Twisted daemon that listens for time-series data
  2. whisper – a simple database library for storing time-series data (similar in design to RRD)
  3. graphite webapp – A Django webapp that renders graphs on-demand using Cairo

Getting data into Graphite is very flexible. There are three main methods for sending data to Graphite: Plaintext, Pickle, and AMQP.

The plaintext protocol is the most straightforward protocol supported by Carbon.
The data sent must be in the following format:
<metric path> <metric value> <metric timestamp>.

Carbon will then help translate this line of text into a metric that the web interface and Whisper understand.

So, VNXCollector gathers data from VNX/CLARiiON/Celerra and sends it to carbon using the plaintext protocol.
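To make the plaintext protocol concrete, here is a minimal Python sketch. It is not the code VNXCollector itself uses (that is written in Scala); the helper names format_metric and send_metrics and the metric path are made up for illustration. The address and port are the ones from the carbon element in the example configuration file.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one plaintext-protocol line: '<metric path> <metric value> <metric timestamp>'."""
    if timestamp is None:
        timestamp = time.time()
    # Carbon expects one metric per line, terminated by a newline.
    return f"{path} {value} {int(timestamp)}\n"


def send_metrics(lines, host="127.0.0.1", port=2003, timeout=5.0):
    """Send already formatted lines to a carbon plaintext listener over TCP."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)


# Hypothetical metric path, chosen only for this example.
line = format_metric("vnx.vnx1u.sp.spa.Prct_Busy", 12.5, 1500000000)
```

With a carbon listener running, send_metrics([line]) would deliver the metric; without one, the connection attempt raises an OSError.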

But that is not the whole puzzle. For better visual output I prefer to use Grafana.
Grafana is most commonly used for visualizing time series data for Internet infrastructure and application analytics but many use it in other domains including industrial sensors, home automation, weather, and process control.

As a result, we have the following architecture:
VNX – VNXCollector – Carbon – Whisper – Graphite – Grafana – Web browser

Configuration file

All configuration information for VNXCollector is kept in a single XML file.
Here is an example of this file.

<?xml version="1.0" encoding="UTF-8"?>
<collector>
  <configuration>
    <errorlog>/opt/Collector/log/collector-error.log</errorlog>
    <carbon address="127.0.0.1" port="2003"></carbon>
    <interval>10</interval>
    <replacelist>
      <replace what="SP " value=""></replace>
      <replace what="Bus " value=""></replace>
      <replace what="=" value=""></replace>
      <replace what="/" value=""></replace>
      <replace what="Enclosure" value="_"></replace>
      <replace what="Disk" value="_"></replace>
      <replace what="%" value="Prct"></replace>
      <replace what="." value="_"></replace>
      <replace what="CIFS " value=""></replace>
      <replace what="SMB Operation " value=""></replace>
      <replace what="NFS Op " value=""></replace>
      <replace what="NFS Export " value=""></replace>
      <replace what="NFS " value=""></replace>
      <replace what="dVol " value=""></replace>
      <replace what="MetaVol " value=""></replace>
      <replace what="device " value=""></replace>
      <replace what="Filesystem " value=""></replace>
      <replace what="Network " value=""></replace>
      <replace what="IP address " value=""></replace>
      <replace what="Client " value=""></replace>
      <replace what="id" value=""></replace>
    </replacelist>
  </configuration>
  <vnxblock>
    <cmd>/opt/Navisphere/bin/naviseccli -User #username -Password #password -Scope 0 -Address #address</cmd>
    <method>
      <name>sp</name>
      <type>simple</type>
      <cmd>getall -sp</cmd>
      <paramlist>
        <param>Prct Busy</param>
        <param>Read_requests</param>
        <param>Write_requests</param>
        <param>Blocks_read</param>
        <param>Blocks_written</param>
      </paramlist>
    </method>
    <method>
      <name>cache</name>
      <type>simple</type>
      <cmd>getall -cache</cmd>
      <paramlist>
        <param>Prct Dirty Cache Pages</param>
        <param>Prct Cache Pages Owned</param>
      </paramlist>
    </method>
    <method>
      <name>cache2</name>
      <type>simple</type>
      <cmd>cache -sp -info -perfData</cmd>
      <paramlist>
        <param>Read Hit Ratio</param>
        <param>Write Hit Ratio</param>
        <param>Dirty Cache Pages (MB)</param>
      </paramlist>
    </method>
    <method>
      <name>fastcache</name>
      <type>simple</type>
      <cmd>cache -fast -info</cmd>
      <paramlist>
        <param>Percentage Dirty SPA</param>
        <param>MBs Flushed SPA</param>
        <param>Percentage Dirty SPB</param>
        <param>MBs Flushed SPB</param>
      </paramlist>
    </method>
    <method>
      <name>port</name>
      <type>flat</type>
      <cmd>port -list -reads -writes -bread -bwrite -qfull</cmd>
      <paramlist>
        <param>Reads</param>
        <param>Writes</param>
        <param>Blocks Read</param>
        <param>Blocks Written</param>
        <param>Queue Full/Busy</param>
      </paramlist>
      <pattern>SP Name</pattern>
      <headerlist>
        <header pos="right" sep=":"></header>
        <header pos="right" sep=":"></header>
      </headerlist>
    </method>
    <method>
      <name>disk</name>
      <type>flat</type>
      <cmd>getdisk -all</cmd>
      <paramlist>
        <param>Read Requests</param>
        <param>Write Requests</param>
        <param>Kbytes Read</param>
        <param>Kbytes Written</param>
        <param>Hard Read Errors</param>
        <param>Hard Write Errors</param>
        <param>Soft Read Errors</param>
        <param>Soft Write Errors</param>
        <param>Busy Ticks</param>
        <param>Busy Ticks SPA</param>
        <param>Busy Ticks SPB</param>
        <param>Queue Length</param>
      </paramlist>
      <pattern>^Bus [0-9]</pattern>
      <headerlist>
        <header pos="all"></header>
      </headerlist>
    </method>
  </vnxblock>
  <vnxfile>
    <cmd>export NAS_DB=/nas; /nas/bin/server_stats #server -count 1 -terminationsummary only -format csv -monitor #cmd</cmd>
    <methods>
      <method name="basic" cmd="basic-std" type="simple"></method>
      <method name="cache" cmd="caches-std" type="simple"></method>
      <method name="cifs" cmd="cifs-std" type="simple"></method>
      <method name="cifsOps" cmd="cifsOps-std" type="composite"></method>
      <method name="nfs" cmd="nfs-std" type="simple"></method>
      <method name="nfsOps" cmd="nfsOps-std" type="composite"></method>
      <method name="disk" cmd="diskVolumes-std" type="composite"></method>
      <method name="meta" cmd="metaVolumes-std" type="composite"></method>
      <method name="net" cmd="netDevices-std" type="composite"></method>
      <method name="cifsClient" cmd="cifs.client" type="composite"></method>
      <method name="nfsClient" cmd="nfs.client" type="composite"></method>
      <method name="nfsExport" cmd="nfs.export" type="composite"></method>
      <method name="nfsFilesystem" cmd="nfs.filesystem" type="composite3"></method>
      <method name="storeVolume" cmd="store.volume" type="composite2"></method>
    </methods>
  </vnxfile>
  <vnx>
    <name>vnx1u</name>
    <block>
      <username>sysadmin</username>
      <password>sysadmin</password>
      <methods>
        <method name="sp" title="sp.spa" address="10.10.10.2"></method>
        <method name="sp" title="sp.spb" address="10.10.10.3"></method>
        <method name="cache" title="cache.spa" address="10.10.10.2"></method>
        <method name="cache" title="cache.spb" address="10.10.10.3"></method>
        <method name="port" title="port" address="10.10.10.2"></method>
        <method name="disk" title="disk" address="10.10.10.2"></method>
      </methods>
    </block>
    <file>
      <cs>10.10.10.1</cs>
      <username>nasadmin</username>
      <password>nasadmin</password>
      <servers>
        <server>server_2</server>
      </servers>
      <methods>
        <method>basic</method>
        <method>cache</method>
        <method>nfs</method>
        <method>nfsOps</method>
        <method>nfsClient</method>
        <method>nfsExport</method>
        <method>nfsFilesystem</method>
        <method>storeVolume</method>
        <method>disk</method>
        <method>meta</method>
        <method>net</method>
      </methods>
    </file>
  </vnx>
  <vnx>
    <name>vnx2u</name>
    <block>
      <username>sysadmin</username>
      <password>sysadmin</password>
      <methods>
        <method name="sp" title="sp.spa" address="10.10.20.2"></method>
        <method name="sp" title="sp.spb" address="10.10.20.3"></method>
        <method name="cache2" title="cache.spa" address="10.10.20.2"></method>
        <method name="cache2" title="cache.spb" address="10.10.20.3"></method>
        <method name="port" title="port" address="10.10.20.2"></method>
        <method name="disk" title="disk" address="10.10.20.2"></method>
      </methods>
    </block>
    <file>
      <cs>10.10.20.1</cs>
      <username>nasadmin</username>
      <password>nasadmin</password>
      <servers>
        <server>server_2</server>
      </servers>
      <methods>
        <method>basic</method>
        <method>cache</method>
        <method>nfs</method>
        <method>nfsOps</method>
        <method>nfsClient</method>
        <method>nfsExport</method>
        <method>nfsFilesystem</method>
        <method>storeVolume</method>
        <method>disk</method>
        <method>meta</method>
        <method>net</method>
      </methods>
    </file>
  </vnx>
  <vnx>
    <name>vnx1b</name>
    <block>
      <username>sysadmin</username>
      <password>sysadmin</password>
      <methods>
        <method name="sp" title="sp.spa" address="10.10.30.1"></method>
        <method name="sp" title="sp.spb" address="10.10.30.2"></method>
        <method name="cache" title="cache.spa" address="10.10.30.1"></method>
        <method name="cache" title="cache.spb" address="10.10.30.2"></method>
        <method name="fastcache" title="fastcache" address="10.10.30.1"></method>
        <method name="port" title="port" address="10.10.30.1"></method>
        <method name="disk" title="disk" address="10.10.30.1"></method>
      </methods>
    </block>
  </vnx>
  <vnx>
    <name>cellera4</name>
    <block>
      <username>nasadmin</username>
      <password>nasadmin</password>
      <methods>
        <method name="sp" title="sp.spa" address="10.10.40.2"></method>
        <method name="sp" title="sp.spb" address="10.10.40.3"></method>
        <method name="cache" title="cache.spa" address="10.10.40.2"></method>
        <method name="cache" title="cache.spb" address="10.10.40.3"></method>
        <method name="port" title="port" address="10.10.40.2"></method>
        <method name="disk" title="disk" address="10.10.40.2"></method>
      </methods>
    </block>
    <file>
      <cs>10.10.40.1</cs>
      <username>nasadmin</username>
      <password>nasadmin</password>
      <servers>
        <server>server_2</server>
      </servers>
      <methods>
        <method>basic</method>
        <method>cache</method>
        <method>cifs</method>
        <method>cifsOps</method>
        <method>cifsClient</method>
        <method>nfs</method>
        <method>nfsOps</method>
        <method>nfsClient</method>
        <method>disk</method>
        <method>meta</method>
        <method>net</method>
      </methods>
    </file>
  </vnx>
</collector>

Let’s see what’s inside.

Lines 3 – 30 of the file, configuration – general configuration information:

  • errorlog – the file that errors are written to
  • carbon – the address and port of the carbon daemon
  • interval – the interval in minutes between data-gathering runs
  • replacelist – a list of substitution pairs applied to the naviseccli and server_stats output.
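As an illustration of what a replacelist does, here is a small Python sketch. It assumes the pairs are applied in file order as plain substring replacements; that is my reading of the configuration, not something checked against the Scala source. The helper names and the trimmed sample list are only for this example.

```python
import xml.etree.ElementTree as ET

# A trimmed replacelist using entries from the example configuration.
SAMPLE = """<replacelist>
  <replace what="SP " value=""></replace>
  <replace what="%" value="Prct"></replace>
  <replace what="." value="_"></replace>
  <replace what="NFS Op " value=""></replace>
  <replace what="NFS " value=""></replace>
</replacelist>"""


def load_replacements(xml_text):
    """Return the (what, value) pairs in document order."""
    root = ET.fromstring(xml_text)
    return [(r.get("what"), r.get("value", "")) for r in root.iter("replace")]


def apply_replacements(text, pairs):
    """Apply every pair in order; assumed semantics: plain substring replacement."""
    for what, value in pairs:
        text = text.replace(what, value)
    return text


pairs = load_replacements(SAMPLE)
cleaned = apply_replacements("% Busy", pairs)
```

Order matters under this assumption: "NFS Op " has to come before "NFS ", otherwise the shorter entry would strip "NFS " first and leave "Op " behind.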

Lines 31 – 115, vnxblock – describes how to work with VNX Block and CLARiiON:

  1. cmd – the command to call to gather data from VNX Block or CLARiiON
  2. method – how to gather data for the different objects: SP, Cache, FAST Cache, Port, Disk.
    There are two types of methods: simple and flat.
    Simple means that the output has one value of each parameter for a single instance.
    Flat means that the output is a list in which each instance name is followed by that instance’s parameters.
    For simple methods we specify which parameters we want to gather.
    For flat methods we also specify a pattern that separates the instances, and the number and format of the header lines.
    For example, for port we have two strings in the header:
    1) First – the SP name (SP A or SP B)
    2) Second – the port number (0, 1 and so on)
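The difference between the two method types can be sketched in Python as follows. This is a simplified illustration, not the tool's Scala code: the sample text is a hand-made, heavily trimmed imitation of getdisk output, only "Name: value" lines are handled, and the headerlist processing is left out.

```python
import re

# Hand-made sample in the shape of "getdisk -all" output (illustrative values only).
SAMPLE_DISKS = """\
Bus 0 Enclosure 0  Disk 0
Read Requests:   1200
Write Requests:  3400

Bus 0 Enclosure 0  Disk 1
Read Requests:   560
Write Requests:  780
"""


def parse_simple(output, params):
    """Simple method: one instance, pick the wanted 'Name: value' lines."""
    result = {}
    for line in output.splitlines():
        name, sep, value = line.partition(":")
        name = name.strip()
        if sep and name in params:
            result[name] = value.strip()
    return result


def parse_flat(output, pattern, params):
    """Flat method: a line matching `pattern` starts a new instance."""
    start = re.compile(pattern)
    instances = {}
    header, body = None, []
    for line in output.splitlines():
        if start.search(line):
            if header is not None:
                instances[header] = parse_simple("\n".join(body), params)
            header, body = line.strip(), []
        elif header is not None:
            body.append(line)
    if header is not None:  # flush the last instance
        instances[header] = parse_simple("\n".join(body), params)
    return instances


disks = parse_flat(SAMPLE_DISKS, r"^Bus [0-9]", ["Read Requests", "Write Requests"])
```

The pattern ^Bus [0-9] is the one from the disk method in the configuration file; each matching line becomes the key for the parameters that follow it.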

Lines 116 – 134, vnxfile – describes how to work with VNX File and Celerra:

  1. cmd – the command to call to gather data from VNX File or Celerra
  2. method – how to gather data for the different parameters.
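The cmd entries contain placeholders such as #server and #cmd. A minimal Python sketch of how such a template could be filled in is shown here; plain textual substitution is an assumption on my side, and build_command is a hypothetical helper, not a function from the tool.

```python
# The vnxfile command template from the example configuration.
FILE_TEMPLATE = (
    "export NAS_DB=/nas; /nas/bin/server_stats #server -count 1 "
    "-terminationsummary only -format csv -monitor #cmd"
)


def build_command(template, values):
    """Replace each '#name' placeholder with its value (assumed semantics)."""
    command = template
    # Longer names first, so one placeholder can never clobber the prefix of another.
    for name in sorted(values, key=len, reverse=True):
        command = command.replace("#" + name, values[name])
    return command


command = build_command(FILE_TEMPLATE, {"server": "server_2", "cmd": "basic-std"})
```

Here server_2 comes from a server element and basic-std from the cmd attribute of the basic method; the resulting string is what would then be run on the Control Station over SSH.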

Lines 135 – 258, vnx – describe the individual VNX/CLARiiON/Celerra systems:

  1. name – the name of the system
  2. block – which methods we use to gather block data, plus the username and password.
  3. file – which methods we use to gather file data, plus the Control Station address, username and password.
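To show how these per-system sections fit together, here is a short Python sketch that lists the block-side work described in a vnx section. It uses a trimmed copy of the vnx1u entry; block_jobs is a hypothetical helper for this example and says nothing about how the Scala code is organised.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the vnx1u section from the example configuration.
SAMPLE = """<collector>
  <vnx>
    <name>vnx1u</name>
    <block>
      <username>sysadmin</username>
      <password>sysadmin</password>
      <methods>
        <method name="sp" title="sp.spa" address="10.10.10.2"></method>
        <method name="sp" title="sp.spb" address="10.10.10.3"></method>
      </methods>
    </block>
  </vnx>
</collector>"""


def block_jobs(xml_text):
    """Return (system, method name, title, address) for every block method."""
    root = ET.fromstring(xml_text)
    jobs = []
    for vnx in root.findall("vnx"):
        system = vnx.findtext("name")
        block = vnx.find("block")
        if block is None:  # a file-only system has no block section
            continue
        for method in block.findall("methods/method"):
            jobs.append((system, method.get("name"), method.get("title"), method.get("address")))
    return jobs


jobs = block_jobs(SAMPLE)
```

Each tuple ties a system to a method defined under vnxblock (by name) and to the SP address that the command should be pointed at.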

Source Code

VNXCollector is written in Scala. Why? Why not? 🙂

Scala has full support for functional programming and a very strong static type system. Scala source code is intended to be compiled to Java bytecode, so that the resulting executable code runs on a Java virtual machine. Java libraries may be used directly in Scala code.

We can compile VNXCollector to a fat jar file, so we can run it on any system that has Java and naviseccli installed.

The source code of the VNXCollector is available here: https://github.com/vzaigrin/VNXCollector

The Results

Let’s see what we can get from this tool.

In the following screenshots we can see data from SP, Cache, FAST Cache, Ports and Disks.

[Screenshots: g1, g2, g3, g4, g5, g6]

And some outputs from VNX File:

[Screenshots: metavols, net, nfs.base, nfs.clients1, nfs.export, nfs.filesystems, volumes1]

The Conclusion

I hope this tool is useful and will help storage administrators see what’s going on in their systems.

11 thoughts on “VNXCollector – DIY EMC VNX Monitoring and Reporting”

  1. Hi there, really interested in this; I’ve successfully configured graphite and grafana; however I cannot for the life of me find out how to install the VNXcollector, I’m guessing I need to compile this but I do lack the knowledge; appreciate it if you can share some pointers? Thanks a lot in advance!

  2. Thank you for this. I have a question. I only have VNX block and XTREMIO, do I need to edit the collector xml file? I am getting this when I run the start.sh ] [vnx-akka.actor.default-dispatcher-5] [akka://vnx/user/cellera4] timeout: socket is not established
    com.jcraft.jsch.JSchException: timeout: socket is not established

    • Yes, you need to edit collector.xml.
      The last sections, named “vnx” (starting at line 135), describe your VNX systems. If you have a block-only VNX, you don’t need the “file” subsection in the “vnx” section.
      “vnx1b” (lines 207 – 222) is an example of a block-only VNX1. You can use it. But if you work with VNX2 you should change the “cache” methods to “cache2”.

  3. I am trying to find the commands you use for navicli to grab the output to XML. I cannot seem to find anything with the -Xml flag in the configuration files. Can you help please?

  4. I was trying to find the commands for navicli you used to export the information with the -Xml flag, but I cannot seem to find it in the configuration files. If there a specific file where this info is listed? Do I have to open the project in Eclipse/Scala?

      • Oh okay, so, the output is not in XML from the commands — but you split the info as need be and then parse it that way?

      • Yes, the output from naviseccli is not in XML format. The output in XML format is terrible (thanks to EMC’s programmers).
        I call naviseccli for SP, cache, FAST Cache, ports and disks, and then parse that output.
        I use two methods to parse that output. I call them “simple” and “flat”.
        “Simple” means that the output has one value of each parameter for a single instance.
        “Flat” means that the output is a list in which each instance name is followed by that instance’s parameters.
