Tutorial: Deep Packet Inspection (nDPI)

Description

This tutorial covers the Deep Packet Inspection (DPI) features of T2. T2 implements a wrapper for nDPI, a well-known library widely used by researchers and practitioners. It thus gives the user an effective mechanism for selecting flows by L7 application, which makes it easy to produce training and test files for AI experiments.

Preparation

Before we start, T2 must be prepared. If you have not completed the previous tutorials, just follow the procedure described below.

First, restore T2 to a pristine state by removing all unnecessary or older plugins from the plugin folder ~/.tranalyzer/plugins, then compile the following plugins:

$ t2build -e
Are you sure you want to empty the plugin folder '/home/wurst/.tranalyzer/plugins' (y/N)? y
Plugin folder emptied
$ t2build tranalyzer2 basicFlow tcpStates portClassifier nDPI txtSink
...
BUILD SUCCESSFUL

If you have not yet created separate data and results directories, please do so now in another bash window; this simplifies your workflow:

$ mkdir ~/data
$ mkdir ~/results
$ cd ~/data

The anonymized sample PCAP used in this tutorial can be downloaded here: faf-exercise.pcap. Please extract it into your data folder. Now you are all set for flow-based nDPI experiments with T2.

Flow based nDPI

For network admins and researchers, the L7 type of the traffic is of great interest. Selecting flows by this feature makes it easy to weed out (un)interesting traffic, reduce the number of flows, or label flows for later AI training and testing. For the latter, the nDPI plugin supplies a numerical output in addition to the human-readable one.

To begin, let us look into the nDPI plugin directory.

$ cd nDPI
$ ls
AUTHORS  autogen.sh  ChangeLog  clean.sh  configure.ac  COPYING  doc  Makefile.am  new_ndpi_prepatch.sh  NEWS  prototex  README  src  t2plconf  test
$

Note the new_ndpi_prepatch.sh script, which fetches the newest version of nDPI; please refer to the documentation in the doc folder for details. Now move to the src directory:

$ cd src
$ ls
Makefile.am  nDPI  nDPI.c  nDPI.h
$

Besides the nDPI plugin files, there is the nDPI folder with all the C code and libraries from the open source nDPI project. Open the nDPI.h file to look at the configuration:

$ vi nDPI.h

We leave the numerical classification switched off. It is useful for machine learning, but here we want to compare the nDPI classification with the meaning of the L4 ports provided by portClassifier, which is easier with the human-readable output. If you like, you can switch it on. If nDPI is not sure about a classification, T2 helps a bit when the flow terminates; this feature is enabled by default. If you changed the configuration, you need to rebuild nDPI; otherwise you can run T2 right away:

$ t2 -r ~/data/faf-exercise.pcap -w ~/results/
================================================================================
Tranalyzer 0.8.6 (Anteater), Tarantula. PID: 15321
================================================================================
[INF] Creating flows for L2, IPv4, IPv6
Active plugins:
    01: basicFlow, 0.8.6
    02: portClassifier, 0.8.6
    03: nDPI, 0.8.6
    04: tcpStates, 0.8.6
    05: txtSink, 0.8.6
[INF] basicFlow: IPv4 Ver: 3, Rev: 01072019, Range Mode: 0, subnet ranges loaded: 312743 (312.74 K)
[INF] basicFlow: IPv6 Ver: 3, Rev: 01072019, Range Mode: 0, subnet ranges loaded: 21494 (21.49 K)
Processing file: /home/wurst/data/faf-exercise.pcap
Link layer type: Ethernet [EN10MB/1]
Dump start: 1258544215.037210 sec (Wed 18 Nov 2009 11:36:55 GMT)
Dump stop : 1258594491.683288 sec (Thu 19 Nov 2009 01:34:51 GMT)
Total dump duration: 50276.646078 sec (13h 57m 56s)
Finished processing. Elapsed time: 0.005900 sec
Finished unloading flow memory. Time: 0.005919 sec
Percentage completed: 100.00%
Number of processed packets: 5902 (5.90 K)
Number of processed bytes: 4993414 (4.99 M)
Number of raw bytes: 4993414 (4.99 M)
Number of pcap bytes: 5087870 (5.09 M)
Number of IPv4 packets: 5902 (5.90 K) [100.00%]
Number of A packets: 1986 (1.99 K) [33.65%]
Number of B packets: 3916 (3.92 K) [66.35%]
Number of A bytes: 209315 (209.31 K) [4.19%]
Number of B bytes: 4784099 (4.78 M) [95.81%]
Average A packet load: 105.40
Average B packet load: 1221.68 (1.22 K)
--------------------------------------------------------------------------------
nDPI: Number of flows classified: 72 [100.00%]
tcpStates: Aggregated anomaly flags: 0x4a
--------------------------------------------------------------------------------
Headers count: min: 3, max: 3, average: 3.00
Number of TCP packets: 5902 (5.90 K) [100.00%]
Number of TCP bytes: 4993414 (4.99 M) [100.00%]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Number of processed   flows: 72
Number of processed A flows: 36 [50.00%]
Number of processed B flows: 36 [50.00%]
Number of request     flows: 36 [50.00%]
Number of reply       flows: 36 [50.00%]
Total   A/B    flow asymmetry: 0.00
Total req/rply flow asymmetry: 0.00
Number of processed   packets/flows: 81.97
Number of processed A packets/flows: 55.17
Number of processed B packets/flows: 108.78
Number of processed total packets/s: 0.12
Number of processed A+B packets/s: 0.12
Number of processed A   packets/s: 0.04
Number of processed   B packets/s: 0.08
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Number of average processed flows/s: 0.00
Average full raw bandwidth: 795 b/s
Average full bandwidth : 792 b/s
Max number of flows in memory: 18 [0.01%]
Memory usage: 0.06 GB [0.09%]
Aggregate flow status: 0x0000000000004000
[INF] IPv4
$

The end report states that nDPI was able to classify all 72 flows. So let’s look into the flow file, which T2 wrote to your results folder (run the command from there):

$ tcol faf-exercise_flows.txt
%dir  flowInd  flowStat            timeFirst          timeLast           duration    numHdrDesc  numHdrs  hdrDesc       srcMac             dstMac             ethType  ethVlanID  srcIP           srcIPCC  srcIPWho                   srcPort  dstIP           dstIPCC  dstIPWho                   dstPort  l4Proto  dstPortClassN  dstPortClass  nDPIclass      tcpStates
A     1        0x0000000000004000  1258544215.037210  1258544215.372742  0.335532    1           3        eth:ipv4:tcp  00:0b:db:4f:6b:10  00:19:e3:e7:5d:23  0x0800              192.168.1.104   09       "Private network"          1258     77.67.44.206    fr       "GTT Communications Inc."  80       6        80             http          "HTTP"         0x00
B     1        0x0000000000004001  1258544215.202900  1258544215.537951  0.335051    1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              77.67.44.206    fr       "GTT Communications Inc."  80       192.168.1.104   09       "Private network"          1258     6        80             http          "HTTP"         0x00
A     2        0x0000000000004000  1258544216.385370  1258544216.723144  0.337774    1           3        eth:ipv4:tcp  00:0b:db:4f:6b:10  00:19:e3:e7:5d:23  0x0800              192.168.1.104   09       "Private network"          1259     77.67.44.206    fr       "GTT Communications Inc."  80       6        80             http          "HTTP"         0x00
B     2        0x0000000000004001  1258544216.551313  1258544216.888595  0.337282    1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              77.67.44.206    fr       "GTT Communications Inc."  80       192.168.1.104   09       "Private network"          1259     6        80             http          "HTTP"         0x00
A     3        0x0000000000004000  1258544216.908284  1258544217.008468  0.100184    1           3        eth:ipv4:tcp  00:0b:db:4f:6b:10  00:19:e3:e7:5d:23  0x0800              192.168.1.104   09       "Private network"          1260     198.189.255.75  --       "--"                       80       6        80             http          "HTTP"         0x00
B     3        0x0000000000004001  1258544216.915576  1258544217.008019  0.092443    1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              198.189.255.75  --       "--"                       80       192.168.1.104   09       "Private network"          1260     6        80             http          "HTTP"         0x00
...
A     12       0x0000000000004000  1258563573.941668  1258563576.594009  2.652341    1           3        eth:ipv4:tcp  00:0b:db:63:5b:d4  00:19:e3:e7:5d:23  0x0800              192.168.1.103   09       "Private network"          1397     192.168.1.1     09       "Private network"          25       6        25             smtp          "SMTP"         0x00
B     12       0x0000000000004001  1258563573.941709  1258563576.594045  2.652336    1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:63:5b:d4  0x0800              192.168.1.1     09       "Private network"          25       192.168.1.103   09       "Private network"          1397     6        25             smtp          "SMTP"         0x08
A     13       0x0000000000004000  1258565030.304653  1258565030.420837  0.116184    1           3        eth:ipv4:tcp  00:0b:db:63:5b:d4  00:19:e3:e7:5d:23  0x0800              192.168.1.103   09       "Private network"          1749     192.168.1.1     09       "Private network"          25       6        25             smtp          "SMTP"         0x00
B     13       0x0000000000004001  1258565030.304696  1258565030.420877  0.116181    1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:63:5b:d4  0x0800              192.168.1.1     09       "Private network"          25       192.168.1.103   09       "Private network"          1749     6        25             smtp          "SMTP"         0x08
A     14       0x0000000000004000  1258565174.919134  1258565175.037809  0.118675    1           3        eth:ipv4:tcp  00:0b:db:63:5b:d4  00:19:e3:e7:5d:23  0x0800              192.168.1.103   09       "Private network"          1755     192.168.1.1     09       "Private network"          25       6        25             smtp          "SMTP"         0x00
...
A     33       0x0000000000004000  1258587444.865917  1258587445.631435  0.765518    1           3        eth:ipv4:tcp  00:0b:db:4f:6b:10  00:19:e3:e7:5d:23  0x0800              192.168.1.104   09       "Private network"          1908     198.189.255.75  --       "--"                       80       6        80             http          "HTTP"         0x02
B     33       0x0000000000004001  1258587444.873221  1258587445.638482  0.765261    1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              198.189.255.75  --       "--"                       80       192.168.1.104   09       "Private network"          1908     6        80             http          "HTTP"         0x02
A     34       0x0000000000004000  1258587445.990733  1258587446.040428  0.049695    1           3        eth:ipv4:tcp  00:0b:db:4f:6b:10  00:19:e3:e7:5d:23  0x0800              192.168.1.104   09       "Private network"          1910     198.189.255.75  --       "--"                       80       6        80             http          "HTTP"         0x02
B     34       0x0000000000004001  1258587445.998250  1258587446.047471  0.049221    1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              198.189.255.75  --       "--"                       80       192.168.1.104   09       "Private network"          1910     6        80             http          "HTTP"         0x02
A     36       0x0000000000004000  1258594163.408285  1258594191.015208  27.606923   1           3        eth:ipv4:tcp  00:08:74:38:01:b4  00:19:e3:e7:5d:23  0x0800              192.168.1.105   09       "Private network"          49330    143.166.11.10   us       "Dell"                     64334    6        64334          unknown       "FTP_DATA"     0x42
B     36       0x0000000000004001  1258594163.487027  1258594185.427506  21.940479   1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:08:74:38:01:b4  0x0800              143.166.11.10   us       "Dell"                     64334    192.168.1.105   09       "Private network"          49330    6        64334          unknown       "FTP_DATA"     0x02
A     35       0x0000000000004000  1258594162.928342  1258594185.618346  22.690004   1           3        eth:ipv4:tcp  00:08:74:38:01:b4  00:19:e3:e7:5d:23  0x0800              192.168.1.105   09       "Private network"          49329    143.166.11.10   us       "Dell"                     21       6        21             ftp           "FTP_CONTROL"  0x02
B     35       0x0000000000004001  1258594163.008594  1258594491.683288  328.674694  1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:08:74:38:01:b4  0x0800              143.166.11.10   us       "Dell"                     21       192.168.1.105   09       "Private network"          49329    6        21             ftp           "FTP_CONTROL"  0x42

If you scroll to the right, you will notice the nDPIclass column classifying the traffic. For this simple traffic, most ports actually match their conventional meaning; the exception is the FTP data flow on port 64334, which portClassifier reports as unknown while nDPI identifies it as FTP_DATA. As NDPI_OUTPUT_STATS is enabled, nDPI also supplies a separate traffic type statistics file, shown below.

$ cat faf-exercise_nDPI.txt
# Protocol ID	Packets	Bytes	Description
  1	                  22 [  0.37%]	                2595 [  0.05%]	FTP_CONTROL
  3	                 894 [ 15.15%]	              148980 [  2.98%]	SMTP
  7	                 371 [  6.29%]	              304381 [  6.10%]	HTTP
175	                4615 [ 78.19%]	             4537458 [ 90.87%]	FTP_DATA
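
As a quick sanity check, the per-protocol counts in this file should add up to the totals of the end report (5902 packets, 4993414 bytes). The following is a minimal sketch using plain awk; the helper name ndpi_totals is hypothetical, and it assumes the columns are tab-separated as in the listing above. The sample rows are taken from that listing with the padding shortened.

```shell
# ndpi_totals [FILE...]: sum the Packets and Bytes columns of an nDPI
# statistics file (hypothetical helper, reads stdin if no file is given).
# Assumption: columns are tab-separated. The "[ xx.xx%]" part is ignored
# because awk only converts the leading number of each field.
ndpi_totals() {
    awk -F'\t' '
        !/^#/ { pkts += $2; bytes += $3 }    # skip the "# Protocol ID" header
        END   { printf "%d packets, %d bytes\n", pkts, bytes }' "$@"
}

# Rows from faf-exercise_nDPI.txt (padding shortened)
printf '%s\t%s\t%s\t%s\n' \
    '# Protocol ID' 'Packets'         'Bytes'              'Description' \
    '  1'           '  22 [  0.37%]'  '   2595 [  0.05%]'  'FTP_CONTROL' \
    '  3'           ' 894 [ 15.15%]'  ' 148980 [  2.98%]'  'SMTP' \
    '  7'           ' 371 [  6.29%]'  ' 304381 [  6.10%]'  'HTTP' \
    '175'           '4615 [ 78.19%]'  '4537458 [ 90.87%]'  'FTP_DATA' |
    ndpi_totals
```

For these rows the sums are 5902 packets and 4993414 bytes, which agrees with the "Number of processed packets" and "Number of processed bytes" lines of the end report. On real data, pass the statistics file instead, e.g. ndpi_totals ~/results/faf-exercise_nDPI.txt.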

The file can be sorted and manipulated with the protStat script.

To sort the file by packets, run:

$ protStat faf-exercise_nDPI.txt
Protocol ID                        Packets                               Bytes      Description
175                         4615 [ 78.19%]                   4537458 [ 90.87%]      FTP_DATA
  3                          894 [ 15.15%]                    148980 [  2.98%]      SMTP
  7                          371 [  6.29%]                    304381 [  6.10%]      HTTP
  1                           22 [  0.37%]                      2595 [  0.05%]      FTP_CONTROL

And to sort it by bytes, run it with the -b option as follows:

$ protStat -b faf-exercise_nDPI.txt | tcol
Protocol ID                        Packets                               Bytes      Description
175                         4615 [ 78.19%]                   4537458 [ 90.87%]      FTP_DATA
  7                          371 [  6.29%]                    304381 [  6.10%]      HTTP
  3                          894 [ 15.15%]                    148980 [  2.98%]      SMTP
  1                           22 [  0.37%]                      2595 [  0.05%]      FTP_CONTROL
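
Once the statistics show which classes dominate, the matching flows can be pulled out of the flow file. The following is a minimal sketch with plain awk, not a T2 tool: the helper name select_ndpi is hypothetical, and it assumes that the flow file is tab-separated, that its first line is the column header, and that nDPIclass values are quoted as in the tcol output above. The column is located by name, so the sketch does not depend on which plugins are loaded. The sample rows are a trimmed subset of the columns shown earlier.

```shell
# select_ndpi CLASS [FILE...]: print the header line and every flow whose
# nDPIclass equals CLASS (hypothetical helper, reads stdin if no file given).
# Assumptions: tab-separated flow file, header on the first line,
# class names wrapped in double quotes.
select_ndpi() (
    class=$1; shift
    awk -F'\t' -v want="$class" '
        NR == 1 {
            for (i = 1; i <= NF; i++) if ($i == "nDPIclass") col = i
            if (!col) { print "nDPIclass column not found" > "/dev/stderr"; exit 1 }
            print; next
        }
        { v = $col; gsub(/"/, "", v) }   # strip the surrounding quotes
        v == want' "$@"
)

# Trimmed sample: only a few columns of the A flows 1, 12, 35 and 36
printf '%s\t%s\t%s\t%s\t%s\n' \
    '%dir' 'flowInd' 'dstPort' 'dstPortClass' 'nDPIclass' \
    'A'    '1'       '80'      'http'         '"HTTP"' \
    'A'    '12'      '25'      'smtp'         '"SMTP"' \
    'A'    '35'      '21'      'ftp'          '"FTP_CONTROL"' \
    'A'    '36'      '64334'   'unknown'      '"FTP_DATA"' |
    select_ndpi FTP_DATA
```

This prints the header followed by flow 36 only. On real data, pass the flow file instead of the sample, e.g. select_ndpi HTTP ~/results/faf-exercise_flows.txt, and redirect the output to a file to obtain a per-class subset for training or testing.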

Now run t2 on the other PCAPs or on your own and see how nDPI performs. As expected, it still has problems with encrypted traffic.
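
To judge at a glance how the port-based and the DPI-based classification relate on a new capture, one option is to count how often each (dstPortClass, nDPIclass) pair occurs. The following is a minimal sketch with plain awk and sort; the helper name class_pairs is hypothetical, and it assumes a tab-separated flow file with the header on the first line and both portClassifier and nDPI loaded. The sample rows are a trimmed subset of the flows shown earlier.

```shell
# class_pairs [FILE...]: count the flows per (dstPortClass, nDPIclass) pair,
# most frequent pair first (hypothetical helper, reads stdin if no file given).
# Assumptions: tab-separated flow file, header on the first line of each file.
class_pairs() {
    awk -F'\t' '
        FNR == 1 {                        # locate both columns by name
            pc = nc = 0
            for (i = 1; i <= NF; i++) {
                if ($i == "dstPortClass") pc = i
                if ($i == "nDPIclass")    nc = i
            }
            next
        }
        pc && nc { n = $nc; gsub(/"/, "", n); count[$pc "\t" n]++ }
        END { for (k in count) print count[k] "\t" k }' "$@" |
        LC_ALL=C sort -t "$(printf '\t')" -k1,1nr -k2,3
}

# Trimmed sample: A flows 1, 2, 3 (HTTP), 12, 13 (SMTP), 35 and 36 (FTP)
printf '%s\t%s\t%s\n' \
    'flowInd' 'dstPortClass' 'nDPIclass' \
    '1'       'http'         '"HTTP"' \
    '2'       'http'         '"HTTP"' \
    '3'       'http'         '"HTTP"' \
    '12'      'smtp'         '"SMTP"' \
    '13'      'smtp'         '"SMTP"' \
    '35'      'ftp'          '"FTP_CONTROL"' \
    '36'      'unknown'      '"FTP_DATA"' |
    class_pairs
```

Pairs such as unknown/FTP_DATA show where nDPI adds information that the port number alone does not provide. On encrypted traffic the interesting rows are the ones where a single port class spreads over many nDPI classes, or the reverse.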

Have fun!