Tutorial: Postprocessing with TAWK

This tutorial presents tawk functionality through various scenarios. tawk works just like awk, but provides access to the columns via their names. In addition, it provides access to helper functions, such as host() or port(). For an overview, refer to the Alphabetical List of TAWK Functions. Custom functions can be added in the folder named t2custom where they will be automatically loaded.
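
To make the idea of name-based column access concrete, here is a minimal plain-awk sketch. It is not tawk's implementation; it assumes tab-separated input whose header row starts with "%", and the helper name col_by_name and the sample data are hypothetical.

```shell
# Plain-awk sketch of name-based column access (NOT how tawk does it).
# Assumes tab-separated input whose header row starts with "%".
col_by_name() {   # usage: col_by_name NAME FILE
    awk -F'\t' -v name="$1" '
        /^%/ {                                  # header row
            sub(/^%[ \t]*/, "", $1)             # drop the "%" marker
            for (i = 1; i <= NF; i++) idx[$i] = i
            next
        }
        (name in idx) { print $idx[name] }      # look the column up by name
    ' "$2"
}

# Hypothetical sample data
sample=$(mktemp)
printf '%%dir\tflowInd\tsrcIP\nA\t1\t10.0.0.1\nB\t1\t10.0.0.2\n' > "$sample"
col_by_name srcIP "$sample"
```

With tawk, the header bookkeeping is unnecessary because the column name can be used directly in the program.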


This tutorial assumes a working knowledge of awk.


gawk version 4.1 is required.

  • Kali/Ubuntu: sudo apt-get install gawk
  • Arch: sudo pacman -S gawk
  • Fedora/Red Hat: sudo yum install gawk
  • Gentoo: sudo emerge gawk
  • OpenSUSE: sudo zypper install gawk
  • Mac OS X: brew install gawk (Homebrew package manager)


The recommended way to install tawk is to install t2_aliases as documented in README.md:

  • Append the following lines to ~/.bashrc:
if [ -f "$T2HOME/scripts/t2_aliases" ]; then
    . "$T2HOME/scripts/t2_aliases"           # Note the leading `.'
fi
  • Make sure to replace $T2HOME with the actual path, e.g., $HOME/tranalyzer-0.8.4/plugins.

Documentation (Man Pages)

The man pages for tawk and t2nfdump (more on that later) can be installed by running: ./install.sh man. Once installed, they can be consulted by running man tawk and man t2nfdump respectively.

General Introduction

Command line options

First, run tawk -h to list the available command line options:

$ tawk -h
    tawk [OPTION...] 'program' file_flows.txt
    tawk [OPTION...] -I file_flows.txt 'program'

Optional arguments:
    -I file             Alternative way to specify the input file
    -s char             First character for the row listing the columns name
    -F fs               Use 'fs' for the input field separator
    -n                  Load nfdump functions
    -e                  Load examples
    -X xerfile          Specify the .xer file to use with -k and -x options
    -x outfile          Run the fextractor on the extracted data
    -k                  Run Wireshark on the extracted data
    -t                  Do not validate column names
    -H                  Do not output the header (column names)
    -c[=u]              Output command line as a comment
                        (use -c=u for UTC instead of localtime)

Help and documentation arguments:
    -l[=n], --list[=n]  List column names and numbers
    -g[=n], --func[=n]  List available functions

    -d fname            Display function 'fname' documentation
    -V vname[=value]    Display variable 'vname' documentation

    -D                  Display tawk PDF documentation

    -?, -h, --help      Show help options and exit

-s Option

The -s option can be used to specify the starting character(s) of the row containing the column names (default: "%"). If several rows start with the specified character(s), the last one is used as the column names. To change this behaviour, the line number can be specified as well. For example, if rows 1 to 5 start with "#" and row 3 contains the column names, specify the separator as follows: tawk -s '#NR==3'. If the row with the column names does not start with a special character, use -s '' or -s 'NR==2'.
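
The default rule (the last row starting with the given character is the header) can be sketched in plain awk as follows. This only illustrates the selection rule described above, not tawk's own code; the helper name header_row and the sample file are hypothetical.

```shell
# Print the last row that starts with PREFIX (illustration only).
header_row() {   # usage: header_row PREFIX FILE
    awk -v p="$1" '
        index($0, p) == 1 { h = $0 }     # remember the most recent match
        END { if (h != "") print h }
    ' "$2"
}

# Hypothetical file with several comment rows before the data
sample=$(mktemp)
printf '#generated by some tool\n#version 1\n#srcIP\tdstIP\n1.2.3.4\t5.6.7.8\n' > "$sample"
header_row '#' "$sample"
```

Picking a fixed row instead corresponds to a plain awk 'NR == 3' file.txt.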

What features (columns) are available?

$ tawk -l file_flows.txt

What functions are available?

$ tawk -g file_flows.txt

Alternatively, refer to the Alphabetical List of TAWK Functions.

How to use a specific function?

$ tawk -d function_name

How to interpret a specific column?

$ tawk -V colName
$ tawk -V colName=value

Ignore all flows between private IPs
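
As a rough plain-awk picture of this scenario, the sketch below drops rows where both addresses are in the RFC 1918 ranges. It assumes IPv4 addresses, tab-separated input and a "%" header row; the helper names drop_private_pairs and is_rfc1918 are hypothetical and the column positions are passed explicitly.

```shell
# Drop flows where BOTH endpoints are RFC 1918 addresses (IPv4 only).
drop_private_pairs() {   # usage: drop_private_pairs SRC_COL DST_COL FILE
    awk -F'\t' -v s="$1" -v d="$2" '
        function is_rfc1918(ip) {
            # 10.0.0.0/8, 192.168.0.0/16, 172.16.0.0/12
            return ip ~ /^(10\.|192\.168\.|172\.(1[6-9]|2[0-9]|3[01])\.)/
        }
        /^%/ { print; next }                    # keep the header row
        !(is_rfc1918($s) && is_rfc1918($d))     # keep rows with a public endpoint
    ' "$3"
}

# Hypothetical sample data
sample=$(mktemp)
printf '%%srcIP\tdstIP\n10.0.0.1\t192.168.1.1\n10.0.0.1\t8.8.8.8\n' > "$sample"
drop_private_pairs 1 2 "$sample"
```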

Replace the protocol number by its string representation, e.g., 6 -> TCP
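
The replacement can be pictured with a small lookup table in plain awk. This is a sketch under assumptions: tab-separated input, a "%" header row, and only three protocols in the table. The helper name proto_names is hypothetical.

```shell
# Replace protocol numbers by names in column COL (illustration only).
proto_names() {   # usage: proto_names COL FILE
    awk -F'\t' -v OFS='\t' -v c="$1" '
        BEGIN { name[1] = "ICMP"; name[6] = "TCP"; name[17] = "UDP" }
        /^%/ { print; next }                    # keep the header row
        {
            if (($c) in name) $c = name[$c]     # unknown numbers stay as-is
            print
        }
    ' "$2"
}

# Hypothetical sample data
sample=$(mktemp)
printf '%%flowInd\tl4Proto\n1\t6\n2\t17\n' > "$sample"
proto_names 2 "$sample"
```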

Replace the Unix timestamp used for timeFirst and timeLast by their value in UTC

Replace the Unix timestamp used for timeFirst and timeLast by their values in localtime
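
Both timestamp conversions above can be pictured with gawk's strftime(), whose third argument selects UTC (non-zero) or local time (zero). The sketch assumes tab-separated input with a "%" header row and truncates fractional seconds; the helper name fmt_time is hypothetical.

```shell
# Convert a Unix timestamp column to a readable date (gawk required).
fmt_time() {   # usage: fmt_time COL UTC_FLAG FILE   (UTC_FLAG: 1 = UTC, 0 = local)
    gawk -F'\t' -v OFS='\t' -v c="$1" -v utc="$2" '
        /^%/ { print; next }                    # keep the header row
        {
            # int() drops the fractional part of the timestamp
            $c = strftime("%Y-%m-%d %H:%M:%S", int($c), utc + 0)
            print
        }
    ' "$3"
}

# Hypothetical sample data
sample=$(mktemp)
printf '%%timeFirst\n1234567890.123456\n' > "$sample"
fmt_time 1 1 "$sample"    # UTC
fmt_time 1 0 "$sample"    # local time
```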

Inspect the flow number 1234 in the flow file
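
Selecting a flow by its index amounts to filtering on one named column. A minimal plain-awk sketch, assuming tab-separated input, a "%" header row and a flow index column named flowInd (the helper name rows_where is hypothetical):

```shell
# Keep the header and every row whose column NAME equals VALUE.
rows_where() {   # usage: rows_where NAME VALUE FILE
    awk -F'\t' -v name="$1" -v val="$2" '
        /^%/ {
            print                               # keep the header row
            h = $0; sub(/^%[ \t]*/, "", h)      # header without the marker
            n = split(h, cols, "\t")
            for (i = 1; i <= n; i++) idx[cols[i]] = i
            next
        }
        (name in idx) && $idx[name] == val
    ' "$3"
}

# Hypothetical sample data: flow 1234 has an A and a B direction
sample=$(mktemp)
printf '%%dir\tflowInd\nA\t1234\nB\t1234\nA\t5\n' > "$sample"
rows_where flowInd 1234 "$sample"
```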

Follow a specific flow, e.g., the flow with flow index 1234, in the packet file

Inspect the packet number 1234 in the packet file

Extract all flows whose HTTP Host: header matches google using Wireshark field names

Extract the DNS query field from all flows where at least one DNS answer was seen (using Wireshark field names)

Open all ICMP flows involving the network in Wireshark

Create a PCAP file with all TCP flows on port 80 or 8080

Writing a tawk Function

  • Ideally one function per file (where the filename is the name of the function)
  • Private functions are prefixed with an underscore
  • Always declare local variables 8 spaces after the function arguments
  • Local variables are prefixed with an underscore
  • Use uppercase letters and two leading and two trailing underscores for global variables
  • Include all referenced functions
  • Files should be structured as follows:
  • Copy your files into the t2custom folder.
  • To have your functions automatically loaded, include them in the file t2custom/t2custom.load.
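
The naming conventions listed above can be illustrated with a small awk function file. Everything in this sketch is hypothetical (the function kbytes, its helper _round1, the temporary folder); it only shows the conventions for private functions and local variables and is loaded here with plain awk rather than through t2custom.

```shell
dir=$(mktemp -d)

# One function per file; the file is named after the function.
cat > "$dir/kbytes" <<'EOF'
# Private helper: name prefixed with an underscore
function _round1(x) {
    return int(x * 10 + 0.5) / 10
}

# kbytes(n): convert a byte count to kilobytes, rounded to one decimal.
# The local variable _kb is declared 8 spaces after the argument.
function kbytes(n,        _kb) {
    _kb = n / 1024
    return _round1(_kb)
}
EOF

# Try the function with plain awk
kbytes_demo() {   # usage: kbytes_demo BYTES
    printf 'BEGIN { print kbytes(%s) }\n' "$1" > "$dir/main.awk"
    awk -f "$dir/kbytes" -f "$dir/main.awk"
}
kbytes_demo 1536
```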

Using tawk Within Scripts

To use tawk from within a script:

  • Create a TAWK variable pointing to the script: TAWK="$T2HOME/scripts/tawk/tawk" (make sure to replace $T2HOME with the actual path to the scripts folder)
  • Call tawk as follows: $TAWK 'dport(80)' file.txt
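
A slightly more defensive version of the two steps above might look as follows. The wrapper name run_tawk is hypothetical, and the check for an executable file is an addition for illustration, not something tawk requires.

```shell
#!/usr/bin/env bash
# Path to the tawk script; T2HOME must point to the actual installation.
TAWK="${TAWK:-${T2HOME:-}/scripts/tawk/tawk}"

run_tawk() {   # usage: run_tawk [OPTION...] 'program' file
    if [ ! -x "$TAWK" ]; then
        echo "tawk not found or not executable: $TAWK" >&2
        return 127
    fi
    "$TAWK" "$@"
}

# Example from the text:
#   run_tawk 'dport(80)' file.txt
```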

Using tawk With Non-Tranalyzer files

tawk can also be used with files which were not produced by Tranalyzer.

  • The input field separator can be specified with the -F option, e.g., tawk -F ',' 'program' file.csv
  • The row listing the column names can start with any character, specified with the -s option, e.g., tawk -s '#' 'program' file.txt
  • Column names must not be equal to a function name (unless the -t option is used, tawk renames conflicting columns by appending an underscore)
  • Valid column names must start with a letter (a-z, A-Z) and can be followed by any number of alphanumeric characters or underscores
  • If no column names are present, use the -t option to prevent tawk from trying to validate the column names.
  • If the column names are different from those used by Tranalyzer, refer to the next section.
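
Before feeding a foreign file to tawk, the naming rule above can be checked with a few lines of plain awk. This sketch only tests the "letter first, then alphanumerics or underscores" rule on the first row of a file; it does not check for clashes with function names, and the helper name check_names is hypothetical.

```shell
# Report, for each name in the first row, whether it matches the naming rule.
check_names() {   # usage: check_names SEPARATOR FILE
    awk -F"$1" '
        NR == 1 {
            for (i = 1; i <= NF; i++)
                print $i, (($i ~ /^[A-Za-z][A-Za-z0-9_]*$/) ? "ok" : "invalid")
            exit
        }
    ' "$2"
}

# Hypothetical CSV file
sample=$(mktemp)
printf 'src_ip,2nd,dst-port,bytes\n1.2.3.4,x,80,100\n' > "$sample"
check_names ',' "$sample"
```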

Mapping External Column Names to Tranalyzer Column Names

If the column names are different from those used by Tranalyzer, a mapping between the different names can be made in the file scripts/tawk/my_vars. The format of the file is as follows:

Once edited, run tawk with the -i $T2HOME/scripts/tawk/my_vars option and the external column names will be automatically used by tawk functions, such as tuple2(). For more details, refer to the my_vars file itself.

Using tawk with Bro Files

To use tawk with Bro log files, use the following command:


For more examples, refer to the tawk -d option, e.g., tawk -d aggr: every function is documented there and comes with a set of examples. For more complex examples, have a look at the scripts/t2fm/tawk/ folder. The complete documentation can be consulted by running tawk -d all.

See Also