Universal Storage Collector

First, I created a VNX Collector. Then I created a VPLEX Collector. These tools are very similar. My next task was to periodically extract performance data from NAR files. At that point I decided to create a universal tool that collects performance data from different storage systems and outputs it to different collectors. Later I decided to add gathering of performance data from EMC VMAX.

And now, after some months of work, I present Universal Storage Collector, a modular and flexible tool.

What is it

Modular tool. This means that it is easy to add an extractor for a new type of storage system, or an output to a new data collector. The modularity is at the source-code level: the tool does not use binary plugins at this time, but it is easy to add new code.

Flexible tool. This means that it can work with different storage systems. Each storage system has its own extractor, its own output to a collector, and its own polling interval.

Extractors. At this time the tool can extract performance data from:

  • EMC CLARiiON/VNX block using naviseccli
  • EMC Celerra/VNX file
  • EMC CLARiiON/VNX block using NAR files
  • EMC VPLEX
  • EMC VMAX from Unisphere reports

Output. At this time the tool can output to Carbon (Graphite) and InfluxDB.

How it works

This tool has one configuration file. In this file we specify:

  • The list of extractors with their common parameters.
  • The list of outputs with their parameters.
  • The list of storage systems we want to work with.

For each storage system we should specify:

  • name
  • class: the storage system type (VNX, VMAX, VPLEX and so on)
  • type: block or file, for unified storage systems (optional)
  • interval: the time between polls
  • extractor: one from the list of extractors, with the specific parameters for this storage system
  • output: one from the list of outputs

Combinations of name, class and type are unique. We can have (array1, vnx, block) and (array1, vnx, file), but we cannot have two (array1, vnx, block) entries with different extractors. Only one will survive: the first one in the list.
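The uniqueness rule above can be sketched in a few lines of Python. This is only an illustration, not the tool's actual code: the SystemDef class, its field names, the interval unit (seconds) and the extractor names other than vnxblock are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SystemDef:
    """One storage system entry (hypothetical field names)."""
    name: str
    cls: str                    # "class" in the article; renamed, class is a Python keyword
    type: Optional[str] = None  # "block" or "file"; optional
    interval: int = 60          # assumed to be seconds between polls
    extractor: str = ""
    output: str = ""


def unique_systems(defs):
    """Keep only the first definition for each (name, class, type) key."""
    seen = set()
    kept = []
    for d in defs:
        key = (d.name, d.cls, d.type)
        if key in seen:
            continue            # a later duplicate is dropped
        seen.add(key)
        kept.append(d)
    return kept


systems = unique_systems([
    SystemDef("array1", "vnx", "block", 60, "vnxblock", "carbon"),
    SystemDef("array1", "vnx", "file", 60, "vnxfile", "carbon"),
    SystemDef("array1", "vnx", "block", 60, "vnxnar", "influxdb"),  # same key as the first
])
print([(s.name, s.cls, s.type, s.extractor) for s in systems])
```

The third entry is dropped because its (name, class, type) key was already taken by the first one, so array1 keeps the vnxblock extractor.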

After starting, this tool:

  • checks each extractor definition
  • checks each output definition
  • checks each storage system definition and creates a concrete extractor and output instance for each of them
  • creates the list of actors, where each actor is a concrete storage system
  • ‘asks’ each actor at its interval

When an actor receives the ‘ask’ signal, it begins a transmission to its output, sends an ‘ask’ message to its extractor, and then stops the transmission.

When a concrete extractor receives the ‘ask’ signal, it extracts data from its storage system and sends it directly to its output.
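The message flow described above can be sketched with plain Python objects. This is a simplified illustration that uses ordinary method calls instead of a real actor framework. All class names, method names and the sample metric are hypothetical and are not taken from the tool's source code.

```python
class ListOutput:
    """Stand-in output: records what it receives instead of sending it."""

    def __init__(self):
        self.events = []

    def start(self):
        self.events.append("start")

    def out(self, metric, value):
        self.events.append((metric, value))

    def stop(self):
        self.events.append("stop")


class FakeExtractor:
    """Stand-in extractor: emits fixed samples straight to its output."""

    def __init__(self, output, samples):
        self.output = output
        self.samples = samples

    def ask(self):
        for metric, value in self.samples.items():
            self.output.out(metric, value)


class Actor:
    """One storage system with its own interval, extractor and output."""

    def __init__(self, name, interval, extractor, output):
        self.name = name
        self.interval = interval  # seconds (assumed unit)
        self.extractor = extractor
        self.output = output

    def ask(self):
        self.output.start()       # begin transmission
        try:
            self.extractor.ask()  # extractor writes directly to the output
        finally:
            self.output.stop()    # stop transmission even if extraction fails


def poll(actors, tick):
    """Ask every actor whose interval divides the current tick (in seconds)."""
    for actor in actors:
        if tick % actor.interval == 0:
            actor.ask()


out = ListOutput()
actor = Actor("array1", 60, FakeExtractor(out, {"sp_a.utilization": 12.5}), out)
poll([actor], 30)  # not due yet, nothing happens
poll([actor], 60)  # due: start, extract, stop
print(out.events)
```

The point of the sketch is the ordering: the actor brackets the extraction with start and stop on the output, while the extractor itself sends its data straight to that output.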

Setup

To start this tool, we should:

  • create a home directory for it
  • create a conf subdirectory and put the configuration file collector.xml into it
  • create a log subdirectory for the error log file
  • create a pool subdirectory, if you plan to use the extractor for VNX NAR files
  • set the USC_HOME environment variable to the home directory
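The directory layout from the list above can be prepared and checked with a short script. This is a minimal sketch: the helper names create_layout and missing_items are made up for this example, and the script does not write collector.xml or set USC_HOME for you (the variable has to be set in the environment that launches the tool).

```python
import os
import tempfile
from pathlib import Path


def create_layout(home, with_nar_pool=False):
    """Create conf and log (and optionally pool) under the home directory."""
    home = Path(home)
    subdirs = ["conf", "log"] + (["pool"] if with_nar_pool else [])
    for sub in subdirs:
        (home / sub).mkdir(parents=True, exist_ok=True)
    return home


def missing_items(home):
    """Return the setup items from the list above that are still missing."""
    home = Path(home)
    missing = []
    if not (home / "conf" / "collector.xml").is_file():
        missing.append("conf/collector.xml")
    if not (home / "log").is_dir():
        missing.append("log")
    if "USC_HOME" not in os.environ:
        missing.append("USC_HOME")
    return missing


# Demonstration in a throwaway directory; use your real home directory instead.
with tempfile.TemporaryDirectory() as tmp:
    create_layout(tmp, with_nar_pool=True)
    print(missing_items(tmp))  # collector.xml still has to be copied in by hand
```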

The source code of this tool is available at https://github.com/vzaigrin/UniversalStorageCollector

One response to “Universal Storage Collector”

      • Thanks for your reply.
        In fact I just found a thread in your GitHub about it; it seems monitors and sinks need to be created first…
        I’ll give your awesome work a try and keep you informed.

        Best regards

      • Hi

        In fact I’m playing with the VNX NAR extractor and I needed to see whether it extracts the storage pool stats from NAR files, because I can’t see them in Graphite.
        Maybe the object must be added in collector.xml?

    • The VNXNar extractor is the least configurable extractor, because it is hard to make it flexible.
      But there is not a lot of data about pools in NAR files: only FAST Cache and MLU measurements.

      When USC extracts data about LUNs and disks, it reports which pool they belong to.
      In Grafana you can sum over all LUNs from one pool, and this gives you the information about the pool.

      To check what USC outputs to Carbon you can use the ‘nc’ utility: nc -k -l 2003
      Here is an example of such output: https://vzaigrin.files.wordpress.com/2017/01/vnxnar.png
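For readers who capture such output with nc, the per-pool summing suggested above can also be tried offline. The sketch below assumes the Carbon plaintext format (metric path, value and timestamp separated by spaces, one metric per line). The sample metric paths and the position of the pool name inside the path are hypothetical and must be adjusted to what USC really emits.

```python
from collections import defaultdict


def parse_carbon_line(line):
    """Split one plaintext Carbon line into (path, value, timestamp)."""
    path, value, timestamp = line.split()
    return path, float(value), int(timestamp)


def sum_by_component(lines, index):
    """Sum values grouped by one dot-separated component of the metric path."""
    totals = defaultdict(float)
    for line in lines:
        path, value, _ = parse_carbon_line(line)
        totals[path.split(".")[index]] += value
    return dict(totals)


# Hypothetical capture; real paths depend on the extractor configuration.
sample = [
    "array1.pool0.lun1.read_iops 100 1484000000",
    "array1.pool0.lun2.read_iops 50 1484000000",
    "array1.pool1.lun3.read_iops 25 1484000000",
]
print(sum_by_component(sample, 1))  # {'pool0': 150.0, 'pool1': 25.0}
```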

      • Hi Vadim

        Thanks for your reply. VNX performance stats are indeed pretty poor; I’ll give up until the next array.

        Have a nice day

  1. Hi Vadim,

    I’m playing with the VPLEX extractor and it seems to me that every time you pull a CSV file you open a new SSH connection. Wouldn’t it be possible to pull all CSV files over a single SSH connection? It should even be faster, shouldn’t it?

    • Hello.

      Sorry for the delayed answer.

      Yes, for each monitor (CSV file) the VPLEX extractor executes SSH.once with a ‘tail’ command.
      It is possible to start an SSH session before the ‘monitors foreach’ loop, execute the command inside the loop, and close the session after the loop.
      But I don’t have access to VPLEX hardware anymore,
      so I can’t try this.

  2. Hello Vadim.

    I’m having problems compiling your tool. It seems that the janalyse-ssh repo is not working; I tried with Maven and a newer version, but I’m getting errors.

    Maybe it’s just me doing it wrong (first time compiling Scala 😉). I’m trying to compile with: sudo sbt compile build.sbt

    Hope you can help me, thanks for your work!
    Regards

  3. Hi Vadim! I took a look at the ‘vnxblock’ section of the configuration file and saw that you take SP utilization from the value returned by the SP. I think this is not quite correct, because that value shows SP utilization since the last statistics clear or SP reboot. I found an article, https://thesanguy.com/2012/10/24/automating-storage-processor-utilization-alerts-with-emc-performance-manager/, where a more accurate SP utilization value is calculated. What algorithm do you use to calculate SP utilization?
