Labeled tab-separated values

From Wikipedia, the free encyclopedia
Filename extension: .ltsv
Type of format: multiplatform, serial data streams
Container for: database and log information organized as field-separated lists

Labeled Tab-separated Values (LTSV) is a variant of the Tab-separated values (TSV) format. Each record in an LTSV file is a single line. Fields are separated by TAB characters, and each field consists of a label and a value separated by ':'. Only the first ':' in a field acts as the separator, so a value may itself contain colons. As with plain TSV, a line can be parsed simply by splitting on TAB. Because every field carries its own unique label, fields can appear in any order and new fields can be added freely.
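The rules above can be illustrated with a minimal Python sketch. The function names `parse_ltsv_line` and `format_ltsv_line` are hypothetical, chosen only for this example. Skipping empty fields and raising an error when a field has no ':' are assumptions of the sketch, not requirements stated here.

```python
def parse_ltsv_line(line: str) -> dict:
    """Parse one LTSV record into a dict mapping label -> value."""
    record = {}
    for field in line.rstrip("\r\n").split("\t"):
        if not field:
            continue  # assumption: silently skip empty fields
        # Split on the first ':' only, so values may contain colons.
        label, sep, value = field.partition(":")
        if not sep:
            raise ValueError(f"field has no ':' separator: {field!r}")
        record[label] = value
    return record


def format_ltsv_line(record: dict) -> str:
    """Serialize a dict back into one LTSV line (no trailing newline)."""
    return "\t".join(f"{label}:{value}" for label, value in record.items())


# Sample record invented for illustration.
line = "host:127.0.0.1\ttime:[10/Oct/2000:13:55:36 -0700]\tstatus:200"
record = parse_ltsv_line(line)
```

Because each value is looked up by label rather than by position, a line with the same fields in a different order parses to an equal dictionary.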

As a replacement for Common Log Format

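Common Log Format and its extended variant, Combined Log Format, are widely used as standard web server log formats. However, both are notoriously hard to parse, which complicates later analysis: fields are separated by spaces, yet the timestamp and the quoted fields may themselves contain spaces, and quoted fields may contain escaped quotes. Here is a sample parser in Perl: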

    my @common     = qw/host ident user time req status size/;
    my @combined   = qw/referer ua/;
    my @re_unquote = ( qr/\"(.*?)\"/, qr/\"((?:\\[\\\"]|.)*?)\"/ );
    my @re_common  = map {
        qr{
        \A
        (\S+)     [ ] # host
        (\S+)     [ ] # ident
        (\S+)     [ ] # user
        (\[.*?\]) [ ] # time
        $_        [ ] # req
        (\S+)     [ ] # status
        (\S+)         # size
      }msx
    } @re_unquote;
    my @re_combined = map { qr/\G\s+$_ $_/ms } @re_unquote;
 
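    # Parse one Common/Combined Log Format line into a hash reference.
    sub parse_line {
        my $line = shift;
        my %rec;
        # Select the escape-aware patterns (index 1) only when the line
        # contains a backslash-escaped quote; otherwise use index 0.
        my $escaped = index( $line, '\"' ) >= 0 ? 1 : 0;
        # /gc keeps pos($line) after the common fields so \G resumes there.
        @rec{@common}   = ( $line =~ m/$re_common[$escaped]/gc );
        @rec{@combined} = ( $line =~ m/$re_combined[$escaped]/ );
        return \%rec;
    }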

LTSV makes it as simple as the following:

    sub parse_line_ltsv {
        +{ map { split ':', $_, 2 } split "\t", shift };
    }

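To log in LTSV instead of Common Log Format on Apache HTTP Server, define a log format with the following directive. It only assigns the nickname combined_ltsv to the format; a CustomLog directive must then reference that nickname for the log to be written.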

    LogFormat "host:%h\tident:%l\tuser:%u\ttime:%t\treq:%r\tstatus:%>s\tsize:%b\treferer:%{Referer}i\tua:%{User-Agent}i" combined_ltsv
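As a rough illustration of the kind of analysis this makes easy, the Python sketch below counts requests per status code. The two log lines in `SAMPLE_LOG` are invented for the example and only resemble what such a format might produce. The helper name `read_records` is likewise hypothetical.

```python
from collections import Counter

# Invented sample data using the labels from the directive above.
SAMPLE_LOG = (
    "host:127.0.0.1\tident:-\tuser:frank\ttime:[10/Oct/2000:13:55:36 -0700]\t"
    "req:GET /apache_pb.gif HTTP/1.0\tstatus:200\tsize:2326\t"
    "referer:http://www.example.com/start.html\tua:Mozilla/4.08 [en] (Win98; I ;Nav)\n"
    "host:127.0.0.1\tident:-\tuser:-\ttime:[10/Oct/2000:13:55:40 -0700]\t"
    "req:GET /missing.html HTTP/1.0\tstatus:404\tsize:-\treferer:-\tua:-\n"
)


def read_records(text):
    """Yield one dict per non-empty line, splitting each field on its first ':'."""
    for line in text.splitlines():
        if line:
            yield dict(field.split(":", 1) for field in line.split("\t"))


records = list(read_records(SAMPLE_LOG))
status_counts = Counter(rec["status"] for rec in records)
```

No regular expression is needed: the time, request, referer, and user-agent values keep their embedded spaces and colons because only TAB and the first ':' of each field are treated as delimiters.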
