Ganglia

Last Modified: 6/07/2011. Ubuntu 10.10, Ganglia 3.1x (Ubuntu packages)

What is Ganglia

Ganglia is a grid monitoring system.

Overview/Concepts

    Major components.
  1. The stats-generators. The nodes you want to monitor. These need the package ganglia-monitor.
  2. The stats-collector(s). The machine(s) that collect the stats sent by gmond. These need the package ganglia-webfrontend.
  3. The web-interface. This displays the data. It needs the package ganglia-webfrontend.

One of the tricky things about Ganglia is how its components connect to each other. It isn't that tricky, but it trips people up, so let's spell it out:

  • The stats-generators send their stats on to a stats-collector box. The stats-gen and stats-col boxes run the same daemon (gmond, from ganglia-monitor) to do this, but with different configs.
  • The stats-col box provides the data to the web front end, so gmetad on that box also listens on a port (8652 by default) as well as making connections to its local gmond.
  • The web frontend connects to the stats-col box (gmetad is the daemon) and gets the info.
  • You can also talk to ganglia-monitor (gmond) and dump its XML directly for troubleshooting purposes with: nc localhost 8649
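
The nc check above can also be scripted. The following is a minimal sketch, assuming gmond writes its whole XML dump and then closes the connection, and that the dump nests HOST elements inside CLUSTER elements with NAME attributes. The function names fetch_gmond_xml and hosts_by_cluster are made up for this illustration.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host="localhost", port=8649, timeout=5.0):
    """Read everything gmond writes on its tcp_accept_channel port."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # assumption: gmond closes the socket after the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def hosts_by_cluster(xml_bytes):
    """Map each cluster name to the sorted list of host names it reports."""
    root = ET.fromstring(xml_bytes)
    return {
        cluster.get("NAME"): sorted(h.get("NAME") for h in cluster.iter("HOST"))
        for cluster in root.iter("CLUSTER")
    }


if __name__ == "__main__":
    # Requires a gmond listening locally, as in the nc example.
    print(hosts_by_cluster(fetch_gmond_xml()))
```

If a node is missing from the result on the collector box, its udp_send_channel is the first thing to check.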

Installation

  • On Ubuntu nodes: sudo apt-get install ganglia-monitor (this installs the service ganglia-monitor, which is gmond)
  • On Ubuntu web interface machine: sudo apt-get install ganglia-webfrontend (this installs gmetad and the web frontend)

Configuration

    Stats Generation Boxes

    ganglia-monitor (gmond)

    # gmond on stats-generation boxes (netstat view)
    udp        0      0 192.168.157.40:50835    192.168.157.10:8650     ESTABLISHED -  
    
    # CONFIG stats gen box - /etc/ganglia/gmond.conf
    cluster {
      name = "hadoop"
      owner = "unspecified"
      latlong = "unspecified"
      url = "unspecified"
    }
    
    udp_send_channel {
      host = hadoop-1.skyboximaging.com
      port = 8650
    }
    

    Stats Collection Box(es)

    ganglia-monitor (gmond)

    # gmond (ganglia-monitor) on collector box
    tcp        0      0 0.0.0.0:8649 (serves up XML to gmetad on this port)
    udp        0      0 0.0.0.0:8650 (receives data from other gmond on this port) 
    
    cluster { 
      name = "Hadoop Dev" 
      owner = "Skybox" 
      latlong = "unspecified" 
      url = "unspecified" 
    } 
    
    /* The host section describes attributes of the host, like the location */ 
    host { 
      location = "unspecified" 
    } 
    
    udp_recv_channel {
      port = 8650
    }
    
    /* You can specify as many tcp_accept_channels as you like to share
       an xml description of the state of the cluster */
    tcp_accept_channel {
      port = 8649
    }
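
The most common mistake is a node sending to a port the collector is not receiving on. Here is a rough sketch of a check for that, assuming you have the text of both gmond.conf files at hand. The helpers channel_settings and send_matches_recv are hypothetical names, and the regex-based parsing is deliberately naive (it does not understand comments or nested braces).

```python
import re


def channel_settings(conf_text, section):
    """Return one {key: value} dict per `section { ... }` block found."""
    blocks = re.findall(r"\b%s\s*\{([^}]*)\}" % re.escape(section), conf_text)
    settings = []
    for body in blocks:
        # Matches lines such as `port = 8650` or `host = "example"`.
        pairs = re.findall(r'(\w+)\s*=\s*"?([^"\s]+)"?', body)
        settings.append(dict(pairs))
    return settings


def send_matches_recv(node_conf, collector_conf):
    """True if every port a node sends to is a port the collector receives on."""
    send_ports = {c.get("port") for c in channel_settings(node_conf, "udp_send_channel")}
    recv_ports = {c.get("port") for c in channel_settings(collector_conf, "udp_recv_channel")}
    return bool(send_ports) and send_ports <= recv_ports


if __name__ == "__main__":
    node = "udp_send_channel {\n  host = hadoop-1.skyboximaging.com\n  port = 8650\n}\n"
    collector = "udp_recv_channel {\n  port = 8650\n}\ntcp_accept_channel {\n  port = 8649\n}\n"
    print(send_matches_recv(node, collector))
```

This only compares port numbers; it does not verify that the host in udp_send_channel actually resolves to the collector box.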
    

    ganglia-webfrontend (gmetad)

    # gmetad - netstat view - listening for the web gui
    # change the web port at (you shouldn't need to, just FYI):
    # /usr/share/ganglia-webfrontend/conf.php
    tcp        0      0 0.0.0.0:8652            0.0.0.0:*               LISTEN      -  
    
    # CONFIG /etc/ganglia/gmetad.conf
    data_source "Hadoop Dev" localhost
    gridname "Skybox"
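
To see which gmond endpoints gmetad will poll, the data_source lines can be read back out of gmetad.conf. This is a small sketch under two assumptions that are not spelled out on this page: a data_source line has the form name, optional polling interval, then one or more host[:port] entries, and gmetad falls back to port 8649 when no port is given. parse_data_sources is a name invented for this example.

```python
import shlex

DEFAULT_GMOND_PORT = 8649  # assumption: gmetad's default when no port is listed


def parse_data_sources(conf_text):
    """Return a list of dicts describing each data_source line."""
    sources = []
    for line in conf_text.splitlines():
        tokens = shlex.split(line, comments=True)  # drops '#' comments
        if len(tokens) < 2 or tokens[0] != "data_source":
            continue
        name, rest = tokens[1], tokens[2:]
        interval = None
        if rest and rest[0].isdigit():  # optional polling interval in seconds
            interval = int(rest[0])
            rest = rest[1:]
        hosts = []
        for addr in rest:
            host, _, port = addr.partition(":")
            hosts.append((host, int(port) if port else DEFAULT_GMOND_PORT))
        sources.append({"name": name, "interval": interval, "hosts": hosts})
    return sources


if __name__ == "__main__":
    conf = 'data_source "Hadoop Dev" localhost\ngridname "Skybox"\n'
    for source in parse_data_sources(conf):
        print(source)
```

With the config above, the single source points at localhost on the default port, which is the tcp_accept_channel of the collector's own gmond.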
    

    Hadoop Statistics in Ganglia

    In hadoop-metrics.properties, do this:

    dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
    dfs.period=10
    dfs.servers=(_your_gmetad_host_:_your_gmond_sending_port_)
    
    mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
    mapred.period=10
    mapred.servers=(_your_gmetad_host_:_your_gmond_sending_port_)
    
    jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
    jvm.period=10
    jvm.servers=(_your_gmetad_host_:_your_gmond_sending_port_)
    
    rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
    rpc.period=10
    rpc.servers=(_your_gmetad_host_:_your_gmond_sending_port_)
    
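    Example: (_your_gmetad_host_:_your_gmond_sending_port_) ==> ganglia.example.com:8649. This should match the host and port of the udp_send_channel in gmond.conf on your nodes (with the gmond.conf shown earlier on this page, that would be hadoop-1.skyboximaging.com:8650).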
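
Since the four blocks differ only in their prefix, they can be generated rather than typed by hand. A minimal sketch follows; the function name ganglia_metrics_properties is made up here, and the host and port in the example call are placeholders to replace with your own values.

```python
CONTEXTS = ("dfs", "mapred", "jvm", "rpc")
GANGLIA_CLASS = "org.apache.hadoop.metrics.ganglia.GangliaContext31"


def ganglia_metrics_properties(host, port, period=10):
    """Build the Ganglia section of hadoop-metrics.properties as one string."""
    lines = []
    for ctx in CONTEXTS:
        lines.append("%s.class=%s" % (ctx, GANGLIA_CLASS))
        lines.append("%s.period=%d" % (ctx, period))
        lines.append("%s.servers=%s:%d" % (ctx, host, port))
        lines.append("")  # blank line between contexts, as in the listing above
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder target; use the host and port from your nodes' gmond.conf.
    print(ganglia_metrics_properties("ganglia.example.com", 8649))
```

Paste the output into hadoop-metrics.properties in place of any existing dfs, mapred, jvm, and rpc entries.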

    Usage

    TODO

    Interpreting the Results

    TODO

    References

  • Ganglia on Sourceforge
  • http://www.ibm.com/developerworks/wikis/display/WikiPtype/ganglia
  • http://agiletesting.blogspot.com/2010/09/quick-note-on-installing-and.html
  • http://sourceforge.net/apps/trac/ganglia/wiki/ganglia_quick_start
  • http://www.allgoodbits.org/articles/view/5
  • Hadoop Metrics
  • Setting up Hadoop Metrics
  • http://blog.stlhadoop.org/2010/11/ganglia-hadoop-monitoring.html
