Kieker Data Bridge (KDB)

The following post was originally published by Reiner Jung on his personal blog oiloftrop.

Kieker is a Java-based monitoring and analysis framework that can be used to instrument any kind of Java application, either by directly introducing instrumentation code or by using AOP techniques such as AspectJ. Furthermore, it can be introduced into servlet contexts. Monitoring data can be stored in files or databases, or passed through messaging services, and later be processed with our analysis tools. For more detail, you may read the user guide or visit our wiki and ticket site.

Not all applications are written in Java, and the Java framework cannot directly instrument programs written in other languages. This shortcoming has been addressed for particular languages in the DynaMod (C#, VB6, Cobol) and MENGES (IEC61131-3 languages) projects, which ended in 2012. In the active project Pubflow, Kieker is used to instrument different languages, including Perl and Java. In the near future, additional monitoring scenarios will be addressed in the iObserve project. Therefore, I decided to build a commonly usable Kieker Data Bridge (KDB), which makes it possible to add support for new host languages in a more elegant way.

KDB is not yet available in binary form or as an installation package, but its sources are publicly available in a git repository.

Public read-only access

git clone http://git.kieker-monitoring.net/de.cau.se.instrumentation.language.git/

Read write access (login required)

git clone git@git.kieker-monitoring.net:de.cau.se.instrumentation.language

Note: The Kieker Data Bridge is currently being integrated into the main Kieker repository. Therefore, the specified URLs will change in the near future.

The Kieker Data Bridge

The Kieker Data Bridge (KDB) is designed to support a wide range of monitoring sources, to allow monitoring to be added to any language, and to be extensible with respect to the means of data relay. Furthermore, it can be integrated into any other Java application, as it consists of a library providing all the functionality and two service implementations: a command line application and an Eclipse plugin.

Kieker Data Bridge Core

The core of the KDB is implemented in ServiceContainer. The class provides central service hooks for Kieker and a main loop, implemented by the run() method, for retrieving records and storing them with a Kieker MonitoringWriter.

The constructor takes two parameters. The first is a Kieker configuration object, which is used to set up the Kieker MonitoringWriter. It can be created with different factory methods provided by the Kieker framework through the ConfigurationFactory. The second is a service connector conforming to the IServiceConnector interface. This interface defines three hooks for a service connector: a connector setup, a connector shutdown and a record receiver method. In detail they are:

  • setup() is used to set up a data source. This can mean opening a socket, connecting to a queuing service, or attaching to another kind of data source, like RMI, Corba, OLE, etc.
  • close() is used to close and cleanup the source connection.
  • deserialize() is used to retrieve and deserialize data from a data source.
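
The life cycle of these three hooks can be illustrated with a small sketch. It does not use the real Kieker types: SketchServiceConnector, InMemoryConnector and ConnectorLifecycleDemo are hypothetical names, the hooks return plain strings instead of monitoring records, and the assumption that deserialize() signals the end of the data with null is mine, not taken from the KDB sources.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

/** Hypothetical stand-in for the connector interface; the real signatures may differ. */
interface SketchServiceConnector {
    void setup();          // open the data source
    String deserialize();  // next record, or null when the source is exhausted (assumption)
    void close();          // release the data source
}

/** Toy connector that serves records from memory instead of a socket or queue. */
class InMemoryConnector implements SketchServiceConnector {
    private final String[] payload;
    private Queue<String> source;

    InMemoryConnector(String... payload) {
        this.payload = payload;
    }

    @Override
    public void setup() {
        source = new ArrayDeque<>(Arrays.asList(payload));
    }

    @Override
    public String deserialize() {
        return source.poll();
    }

    @Override
    public void close() {
        source = null;
    }
}

class ConnectorLifecycleDemo {
    /** Calls setup(), drains the connector and always closes it; returns the record count. */
    static long drain(SketchServiceConnector connector) {
        long count = 0;
        connector.setup();
        try {
            while (connector.deserialize() != null) {
                count++; // a real service would hand the record to a writer here
            }
        } finally {
            connector.close();
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(drain(new InMemoryConnector("a", "b", "c"))); // prints 3
    }
}
```

The try/finally block is a design choice of the sketch: it makes sure close() runs even if reading a record fails.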

Besides retrieving, deserializing and storing monitoring records, a user might want to know what is going on or what is going wrong. Therefore, the ServiceContainer provides a listener registration for IServiceListener implementations, which must implement a handleEvent method.

public interface IServiceListener {

	/**
	 * Called by the main service loop to inform the listener about processed
	 * records and an optional message.
	 * 
	 * @param recordCount number of processed records
	 * @param message optional message (could be null)
	 */
	void handleEvent(long recordCount, String message);
}

The method has two arguments, which represent the number of records transferred and an optional message, which can be null.

In real-world use cases, thousands of records could be received every second. Informing a server application about every record that is received without an error would lead to a slow service that is mostly occupied with informing the user about new records rather than actually transferring data. To avoid this, the internal update method is only called for every 100th record. As this might still be too often, the method setListenerUpdateInterval allows a different update interval to be defined.

/**
 * Set the update interval for the listener information. The default is 100 records.
 *
 * @param listenerUpdateInterval the new update interval in number of records
 */
public void setListenerUpdateInterval(final long listenerUpdateInterval) {
	this.listenerUpdateInterval = listenerUpdateInterval;
}
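
To make the effect of the update interval concrete, here is a minimal sketch of a counter that informs a listener only on every n-th record. ThrottledNotifier, its nested Listener interface and recordProcessed() are hypothetical names for illustration; the actual loop in ServiceContainer may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical skeleton showing how a listener can be informed on every n-th record only. */
class ThrottledNotifier {
    /** Same shape as the handleEvent method shown above. */
    interface Listener {
        void handleEvent(long recordCount, String message);
    }

    private final Listener listener;
    private long listenerUpdateInterval = 100; // default stated in the text
    private long recordCount;

    ThrottledNotifier(Listener listener) {
        this.listener = listener;
    }

    void setListenerUpdateInterval(long listenerUpdateInterval) {
        this.listenerUpdateInterval = listenerUpdateInterval;
    }

    /** To be called once per stored record. */
    void recordProcessed() {
        recordCount++;
        if (recordCount % listenerUpdateInterval == 0) {
            listener.handleEvent(recordCount, null); // no message for a regular update
        }
    }

    public static void main(String[] args) {
        List<Long> seen = new ArrayList<>();
        ThrottledNotifier notifier = new ThrottledNotifier((count, message) -> seen.add(count));
        for (int i = 0; i < 250; i++) {
            notifier.recordProcessed();
        }
        System.out.println(seen); // prints [100, 200]
    }
}
```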

At present, the Kieker Data Bridge supports five different connection realizations. First, it can act as a service waiting for one incoming connection from a client providing monitoring records. Second, it can be run as a service allowing multiple sources to connect and reconnect. Third, it can connect itself to a monitoring record provider by acting as a client. Fourth, it can be a JMS listener. And fifth, as the setup of a JMS messaging queue might be difficult, it can provide one itself and auto-connect to it.

Record Formats

These five implementations are able to receive monitoring data. To make any sense of it, the data must follow a predefined format. Right now, two format schemes have been defined and are used in the TCP (binary) and JMS (binary and textual) implementations.

In general, both formats must be able to identify record types. These types are encoded with numbers mapping to Kieker IMonitoringRecord classes.

For all currently implemented services, this mapping must be provided as a Map<Integer, Class<IMonitoringRecord>>, which has to be composed by the server application. The two present implementations use a mapping configuration and classes dynamically loaded from URIs.

Binary Record Format

The binary record format is defined with a TCP connection in mind, which ensures that, apart from connection interruptions, all data is received exactly as it was sent. Therefore, no additional transmission control is defined.

A record starts with a 32 bit signed integer (as Java is not able to handle unsigned values) identifying the data type. The rest of the data stream is determined by the data structure implemented in the corresponding IMonitoringRecord described in its TYPES property. Each property is read, interpreted and then stored in a record of the right type.

All numeric values are in network byte order, which is big endian. As Java supports primitive types (lower case) and wrapper classes for primitive types, the deserialization routine must support both. The TYPES array of an IMonitoringRecord must be used to identify the correct Java type. The following primitive types are supported (conforming to the serialization described in DataInput).

  • Boolean one byte (non-zero = true, zero = false)
  • Byte one byte (signed 8 bit)
  • Short two bytes (signed 16 bit)
  • Integer four bytes (signed 32 bit)
  • Long eight bytes (signed 64 bit)
  • Float four bytes (IEEE 754 floating-point “single format” bit layout)
  • Double eight bytes (IEEE 754 floating-point “double format” bit layout)
  • String four bytes indicating the buffer length of the String, followed by n bytes representing the buffer content. The String is in UTF-8.
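
The layout above can be decoded with the standard DataInputStream, which reads big-endian values. The following is a minimal sketch, not the KDB implementation: BinaryRecordSketch and its methods are hypothetical names, and the field types are passed in directly, whereas the bridge would look them up from the record type id.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class BinaryRecordSketch {
    /** Reads one value of the given type; accepts primitive types and their wrapper classes. */
    static Object readValue(DataInputStream in, Class<?> type) throws IOException {
        if (type == boolean.class || type == Boolean.class) return in.readBoolean();
        if (type == byte.class || type == Byte.class) return in.readByte();
        if (type == short.class || type == Short.class) return in.readShort();
        if (type == int.class || type == Integer.class) return in.readInt();
        if (type == long.class || type == Long.class) return in.readLong();
        if (type == float.class || type == Float.class) return in.readFloat();
        if (type == double.class || type == Double.class) return in.readDouble();
        if (type == String.class) {
            byte[] buffer = new byte[in.readInt()]; // four-byte length prefix
            in.readFully(buffer);
            return new String(buffer, StandardCharsets.UTF_8);
        }
        throw new IOException("Unsupported type " + type.getName());
    }

    /** Decodes one record: element 0 is the type id, the remaining elements are the fields. */
    static Object[] decode(byte[] data, Class<?>[] types) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            Object[] result = new Object[types.length + 1];
            result[0] = in.readInt(); // 32 bit signed record type id
            for (int i = 0; i < types.length; i++) {
                result[i + 1] = readValue(in, types[i]);
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Build a sample record by hand; ByteBuffer writes big endian by default.
        byte[] name = "main()".getBytes(StandardCharsets.UTF_8);
        byte[] data = ByteBuffer.allocate(4 + 8 + 4 + name.length)
                .putInt(1).putLong(42L).putInt(name.length).put(name).array();
        Object[] values = decode(data, new Class<?>[] {long.class, String.class});
        System.out.println(Arrays.toString(values)); // prints [1, 42, main()]
    }
}
```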

Textual Record Format

The textual format is similar to the binary format. However, the received package is one large String containing all values. The values are separated by a semicolon (;). Therefore, a semicolon inside a string value must be escaped with a backslash (\). The whole text is encoded in UTF-8 and the usual Java parse* methods are used to convert values to numbers. A String in the textual representation is NOT preceded by an Integer indicating its length, because the String ends at the next unescaped ; or at the end of the message.
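
Splitting such a message could look like the following sketch. TextualRecordSketch and splitFields are hypothetical names. The sketch assumes that a backslash makes the following character literal (so a backslash itself would be written as two backslashes) and that the first value is the record type id, as in the binary format; neither detail is spelled out above.

```java
import java.util.ArrayList;
import java.util.List;

class TextualRecordSketch {
    /** Splits a message on unescaped ';' and removes the escaping backslashes. */
    static List<String> splitFields(String message) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < message.length(); i++) {
            char c = message.charAt(i);
            if (c == '\\' && i + 1 < message.length()) {
                current.append(message.charAt(++i)); // assumption: next character is literal
            } else if (c == ';') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString()); // the last field ends at the end of the message
        return fields;
    }

    public static void main(String[] args) {
        List<String> fields = splitFields("3;42;a\\;b");
        int typeId = Integer.parseInt(fields.get(0)); // assumed position of the type id
        long value = Long.parseLong(fields.get(1));
        System.out.println(typeId + " " + value + " " + fields.get(2)); // prints 3 42 a;b
    }
}
```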

Instrumentation in Perl

The Perl instrumentation is a work in progress and was initially developed by Nis Börge Wechselberg in his bachelor thesis. At present, an example implementation is available in a special repository. It comprises monitoring records corresponding to the event-based set of Kieker monitoring records: BeforeOperationEvent.java (OperationEntryEvent.pm), AfterOperationEvent.java (OperationExitEvent.pm), and Trace.java (Trace.pm). In the future, the different record types can be generated with an instrumentation record description language, which will provide generators for a wide range of host languages.

Monitoring probes in Perl create new records by directly instantiating a record class, filling in parameter values and finally calling a writer instance to store the records. The writer used in conjunction with the Kieker Data Bridge is JMSWriter.pm. This writer requires an active JMS message queue service. In our studies we used ActiveMQ. ActiveMQ provides support for the Stomp message protocol, used by Perl, and a Stomp-JMS mapping.

Instrumentation in C

The instrumentation of C code, or of other languages able to use C object files, is in its infancy. However, the basic primitives for monitoring programs written in C have been implemented. The code is available in the KDB repository. The library code and one example record type are located in src/kieker and src/kieker/records, respectively.

  • src/kieker
    • socket.[hc] implements some convenience functions to establish and handle TCP connections.
    • binary_serializer.[hc] implements primitives for data serialization conforming to the format defined above.
  • src/kieker/records
    • operation_execution_record.[hc] implements a Kieker record structure for an operation execution record, which conforms to its Java counterpart OperationExecutionRecord.java, and a specialized serialization function for that record type.

It is intended to extend the C instrumentation library and use a generator to produce a wide range of record types. Furthermore, an automated weaving mechanism should be implemented to ease the use of instrumentation for C-based languages.

The Command Line Server

The current KDB implements two servers based on the KDB core. One is the command line server. The command line server (CLI) provides most functions of the KDB and has a rich set of options.

usage: cli-kieker-service [-d] [-h <hostname>] [-k <configuration>] -L
       <paths> [-l <jms-url>] -m <map-file> [-p <number>] [-s] -t <type>
       [-u <username>] [-v <arg>] [-w <password>]
 -d,--daemon                   detach from console; TCP server allows
                               multiple connections
 -h,--host <hostname>          connect to server named <hostname>
 -k,--kieker <configuration>   kieker configuration file
 -L,--libraries <paths>        List of library paths separated by :
 -l,--url <jms-url>            URL for JMS server
 -m,--map <map-file>           Class name to id (integer or string)
                               mapping
 -p,--port <number>            listen at port (tcp-server or jms-embedded)
                               or connect to port (tcp-client)
 -s,--stats                    output performance statistics
 -t,--type <type>              select the service type: tcp-client,
                               tcp-server, tcp-single-server, jms-client,
                               jms-embedded
 -u,--user <username>          user name for a JMS service
 -v,--verbose <arg>            output processing information
 -w,--password <password>      password for a JMS service

The primary option is -t. It determines which type of data source the bridge will use and which other parameters are required.

A tcp-client requires a port and a host name to connect to and receive data from the data source. A tcp-server or tcp-single-server opens a port for listening. Therefore, only the port is necessary. A jms-client requires a URL of a JMS service. In addition, that service might require a user name and password.
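
The dependency between the service type and the other options can be summarized in a small lookup. This is only a sketch of how I read the usage text, not code from the CLI server: ServiceTypeOptionsSketch and its methods are hypothetical names, and listing -p as required for jms-embedded is an assumption derived from the description of the -p option.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ServiceTypeOptionsSketch {
    /** Returns the type-specific options that must accompany the given -t value. */
    static List<String> requiredOptions(String type) {
        switch (type) {
            case "tcp-client":
                return Arrays.asList("-h", "-p"); // host and port to connect to
            case "tcp-server":
            case "tcp-single-server":
                return Arrays.asList("-p");       // port to listen at
            case "jms-client":
                return Arrays.asList("-l");       // JMS URL; -u and -w only if the service asks
            case "jms-embedded":
                return Arrays.asList("-p");       // assumption based on the -p description
            default:
                throw new IllegalArgumentException("Unknown service type: " + type);
        }
    }

    /** Lists the required options that are absent from the given command line options. */
    static List<String> missingOptions(String type, Set<String> given) {
        List<String> missing = new ArrayList<>();
        for (String option : requiredOptions(type)) {
            if (!given.contains(option)) {
                missing.add(option);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Set<String> given = new HashSet<>(Arrays.asList("-p"));
        System.out.println(missingOptions("tcp-client", given)); // prints [-h]
    }
}
```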

Besides the configuration for the data source, the server requires a set of Kieker MonitoringRecords, which must be provided by a library. Normally this is kieker-1.6.jar. If a user defines new record types, they must be provided in the same way.

In addition, a mapping file (-m filename) must be specified, which maps numerical ids to fully qualified Java class names, as shown in the following listing.

1=kieker.common.record.flow.trace.operation.BeforeOperationEvent
2=kieker.common.record.flow.trace.operation.AfterOperationEvent
3=kieker.common.record.flow.trace.Trace
10=kieker.common.record.controlflow.OperationExecutionRecord

Ids and names are separated by an equals sign (=). Finally, a Kieker configuration file should be specified. If no configuration is provided, the server tries to use the default configuration.
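
Reading a mapping file of this shape is straightforward. The sketch below is an illustration under assumptions rather than the parser used by the CLI server: MappingFileSketch and parse are hypothetical names, only integer ids are handled, and the class names are kept as strings instead of being loaded from the libraries given with -L.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MappingFileSketch {
    /** Parses lines of the form id=fully.qualified.ClassName into an id-to-name map. */
    static Map<Integer, String> parse(List<String> lines) {
        Map<Integer, String> mapping = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate blank lines
            }
            int separator = trimmed.indexOf('=');
            if (separator < 0) {
                throw new IllegalArgumentException("Missing '=' in: " + line);
            }
            Integer id = Integer.valueOf(trimmed.substring(0, separator).trim());
            mapping.put(id, trimmed.substring(separator + 1).trim());
        }
        return mapping;
    }

    public static void main(String[] args) {
        Map<Integer, String> mapping = parse(Arrays.asList(
                "1=kieker.common.record.flow.trace.operation.BeforeOperationEvent",
                "10=kieker.common.record.controlflow.OperationExecutionRecord"));
        System.out.println(mapping.get(10));
        // prints kieker.common.record.controlflow.OperationExecutionRecord
    }
}
```

Turning the names into classes would additionally need a class loader that covers the record libraries, which is left out here.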

The Eclipse Plugin

When using Kieker in a development environment, rather than a pure monitoring environment, an integration into Eclipse can be helpful. The Eclipse plugin for KDB provides such an integration.

The Eclipse plugin comes with a run configuration (Kieker Service) for KDB, where the same options are available. The configuration is organized in two tabs: one for the basic connectivity and project settings, and a second one to configure the mapping. While the plugin uses normal Kieker configuration files, the mapping is serialized with Eclipse services and therefore the mapping file from the CLI server cannot be reused directly.
