Tutorial: Regex the flow

PCRE Regex

In this tutorial we will show you how to transform T2 into a regex-based IDS or a flow labeller for AI training. The regex_pcre plugin implements a full PCRE regex machine in which rule trees can be constructed that reach across flow boundaries. To improve performance, L3/4 header parameters can be preselected before a regex is applied.

Preparation

Before we start we need to prepare T2. If you did not complete the previous tutorials, just follow the procedure described below.

First, I recommend restoring T2 to a pristine state by removing all unnecessary or older plugins from the plugin folder ~/.tranalyzer/plugins and compiling the plugins listed below.

$ t2build -e
Are you sure you want to empty the plugin folder '/home/wurst/.tranalyzer/plugins' (y/N)? y
Plugin folder emptied
$ t2build tranalyzer2 basicFlow tcpStates regex_pcre txtSink
...
BUILD SUCCESSFUL

$

If you have not created separate data/ and results/ directories yet, please do so now in another terminal window; it facilitates your workflow:

$ mkdir ~/data
$ mkdir ~/results
$

Download the sample pcap if you have not already: faf-exercise.pcap. Now you’re all set.

regex_pcre plugin

The regex plugin produces flow-based output if a rule matches. It also implements the ALARM mode, which only releases flows when rules match. A rule set can be a single rule or a collection of rules forming a tree that operates on many flows.

The configuration of regex_pcre is essentially controlled by two .h files located in the src/ folder.

  • regfile_pcre.h
  • regex_pcre.h

regfile_pcre.h defines the ingredients of the regfile.txt, containing all rules.

$ regex_pcre
$ cd src
$ vi regfile_pcre.h

Currently only 4 predecessors are allowed in a rule set; you may increase this number if needed.

$ vi regex_pcre.h

regexfile

REXPOSIX_FILE defines the regex file containing all rules to be tested against every packet of a flow. The rule trees that can be built are very powerful, but also confusing for the uninitiated, so let’s have a look at some examples of the different rule types:

$ tcol regexfile.txt
#ID     PreID   Flags   ClassID Severity        Sel     Regexmode       FlwStat Proto   srcPort dstPort offset  Regex
# standalone rule: Alarm, start L7, Regexmode: default, select FlwStat: Req; Proto, dstPort
1       0       0x10    15      3       0x8000000d      0x0000000       0x00000000      6       0       80      0       (OPTIONS|GET|HEAD|POST|PUT|DELETE|TRACE|CONNECT)[^\r\n]*\/u7avi*\.bin
# standalone rule: Alarm, disabled, start L7, select Regexmode: (PCRE_CASELESS|PCRE_DOTALL), FlwStat: Teredo, IPv6, Vlan, Repl; Proto, srcPort
3       0       0x10    15      3       0x0800000e      0x0000005       0x00088101      6       80      0       0       \x31\xDB\x8D\x43\x0D\xCD\x80\x66.*\x31
# standalone rule: Alarm, start L7, Regexmode: default, FlwStat: IPv4, Rply
4       0       0x10    15      3       0x8000000c      0x0000000       0x00004001      6       80      0       20      \x38\x55\x42\x66\xe2\xb5\x34.*\xb5\x95\xbb
# standalone rule, Alarm, start L7, select Regexmode: (PCRE_CASELESS|PCRE_DOTALL)
100     0       0x10    1       0       0x88000000      0x0000005       0x00000000      6       0       80      0       ^http/1.0
# root rules to following tree, Reset if leaf fires
202     0       0x40    10      4       0x80000000      0x0000000       0x00000001      6       0       80      0       (GET|PUT).*update/u7avi1777u1705ff.bin
203     202,4   0x41    20      4       0x88000000      0x0000005       0x00000001      6       0       80      0       302 (?i)Found
# sucessors and predesessors, Reset if leaf fires
204     202,203 0x41    43      5       0x80000000      0x0000000       0x00000001      6       0       21      0       (?i)\.exe
# successors 206 & 205 to 204 AND ruleset, dont reset tree if 205 fires
205     204     0x16    40      4       0x80000002      0x0000000       0x00000000      6       0       20      0       ^get .*porno.*
206     204     0x56    35      6       0x8000000c      0x0000000       0x00000001      6       0       21      0       igfxzoom\.exe
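
Before adding a pattern to regfile.txt, it can help to try it on a sample payload. The following is a minimal sketch using Python's re module, which is not PCRE but behaves the same for the simple patterns used here. The payloads are invented for illustration, and the mapping of Regexmode 0x5 to re.IGNORECASE | re.DOTALL is an approximation of the PCRE_CASELESS|PCRE_DOTALL mode named in the rule comments.

```python
import re

# Pattern of rule 1 from the listing above, written as a bytes pattern
# because packet payloads are bytes, not text.
RULE_1 = re.compile(
    rb"(OPTIONS|GET|HEAD|POST|PUT|DELETE|TRACE|CONNECT)[^\r\n]*\/u7avi*\.bin"
)

# Pattern of rule 100. Its Regexmode is approximated with Python flags
# (assumption: re.IGNORECASE | re.DOTALL stands in for the PCRE options).
RULE_100 = re.compile(rb"^http/1.0", re.IGNORECASE | re.DOTALL)

# Hypothetical payloads, invented for this sketch.
request = b"GET /update/u7avi.bin HTTP/1.1\r\nHost: example.org\r\n\r\n"
other = b"GET /update/u7avi1777u1705ff.bin HTTP/1.1\r\n\r\n"
reply = b"HTTP/1.0 200 OK\r\n\r\n"


def fires(pattern, payload):
    """Return True if the pattern matches anywhere in the payload."""
    return pattern.search(payload) is not None


for name, pattern in (("rule 1", RULE_1), ("rule 100", RULE_100)):
    for payload in (request, other, reply):
        print(name, payload[:24], fires(pattern, payload))
```

Note that u7avi* means "u7av followed by any number of i", so the second payload does not match rule 1 even though it contains u7avi.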

t2build invokes the regconv script which transforms regfile.txt to a T2 compatible regexfile.txt and copies it under the plugin directory. After changing regfile.txt, always invoke t2build -f.

Each rule has an ID, which does not necessarily need to be unique, so that it can be linked by the predecessor field PreID. The latter denotes that a rule only fires if the predecessor ID also fired.

The regfile.txt file reflects the following rule tree:

 1    3            4
                   |
       202:RST - 203:202&4,RST
            \      /
             \    /
          204:1&2,RST
             /   \
            /     \
    205:1&2,RST   206:1&2
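
The predecessor links can also be read programmatically. The sketch below parses three rows copied from the listing above; it assumes whitespace-separated columns with the regex as the 13th column, and that a PreID of 0 means "no predecessor". The function name parse_rules is made up for this example.

```python
# Three rows copied from the listing above. The regex is the last column
# and may itself contain spaces, so the split is limited to 13 fields.
SAMPLE = r"""
#ID     PreID   Flags   ClassID Severity        Sel     Regexmode       FlwStat Proto   srcPort dstPort offset  Regex
202     0       0x40    10      4       0x80000000      0x0000000       0x00000001      6       0       80      0       (GET|PUT).*update/u7avi1777u1705ff.bin
203     202,4   0x41    20      4       0x88000000      0x0000005       0x00000001      6       0       80      0       302 (?i)Found
204     202,203 0x41    43      5       0x80000000      0x0000000       0x00000001      6       0       21      0       (?i)\.exe
"""


def parse_rules(text):
    """Return a list of rules with ID, predecessor IDs, flags and regex."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.split(None, 12)  # at most 13 fields, regex kept whole
        # Assumption: PreID 0 marks a rule without predecessor.
        preds = [int(p) for p in cols[1].split(",") if p != "0"]
        rules.append({
            "id": int(cols[0]),
            "preds": preds,
            "flags": int(cols[2], 16),
            "regex": cols[12],
        })
    return rules


rules = parse_rules(SAMPLE)
for rule in rules:
    print(rule["id"], "<-", rule["preds"], hex(rule["flags"]), rule["regex"])
```

A list is used instead of a dictionary keyed by ID because IDs are not guaranteed to be unique.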

The Flags define the modes of operation, internal states of the pcre engine and action on alarm in the flow shown below:

Flags

code description
0x00 solitary node
0x01 and(pred1, pred2, …)
0x02 or(pred1, pred2, …)
0x03 xor(pred1, pred2, …)
0x04 leaf
0x08 -
0x10 Print alarm to flow file
0x20 future: rule active only in flow boundary
0x40 Reset REG_F_MTCH tree if match
0x80 Internal: regex match

The first 2 bits define the operation on the predecessors, such as and, or, xor. Hence, a specific rule with predecessors can only fire if the operation on the results of its predecessors evaluates to true.
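
To read a Flags value from the rule file, it can be split into the predecessor operation (lowest two bits) and the remaining bits. The following sketch only uses the codes from the table above; the names decode_flags, PRED_OPS and FLAG_BITS are made up for this example.

```python
# Codes taken from the Flags table above.
PRED_OPS = {0x00: "solitary node", 0x01: "and", 0x02: "or", 0x03: "xor"}
FLAG_BITS = {
    0x04: "leaf",
    0x10: "print alarm to flow file",
    0x20: "future: active only in flow boundary",
    0x40: "reset tree if match",
    0x80: "internal: regex match",
}


def decode_flags(flags):
    """Split a Flags value into predecessor operation and set bits."""
    op = PRED_OPS[flags & 0x03]  # lowest two bits select and / or / xor
    bits = [name for bit, name in FLAG_BITS.items() if flags & bit]
    return op, bits


# Flags values that appear in the rule listing of this tutorial.
for value in (0x10, 0x40, 0x41, 0x56):
    print(f"0x{value:02x}", decode_flags(value))
```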

ClassID and Severity describe the class and severity of an alarm. You may choose these numbers at your discretion. By default, they will be displayed in the flow output.

The Sel column controls the activation of the following fields in the packet selection process:

Dir, Proto, srcPort, dstPort

and in which layer the application of the regex rule starts.

Sel

code object
0x0001 Activate flowStat
0x0002 Activate l4Proto
0x0004 Activate srcPort
0x0008 Activate dstPort
0x0010 Activate -
0x0020 Activate -
0x0040 Activate -
0x0080 Activate -
0x0100 Activate -
0x0200 Activate -
0x0400 Activate -
0x0800 Activate -
0x1000 Offset start L2 header
0x2000 Offset start L3 header
0x4000 Offset start L4 header
0x8000 Offset start L7 header

flowStat denotes the first 16 bits of the flow status; hence the requesting or replying flow, IPv4/6 etc. can be selected. A maximum of 12 different parameters can be selected in the Sel column. If you add columns to the regfile.txt file, HDRSELMX in regfile_pcre.h has to be increased accordingly.

default output

Run t2 on the pcap in default configuration and look at the end report and flow file.

$ t2 -r ~/data/faf-exercise.pcap -w ~/results/
===============================================================================
Tranalyzer 0.8.7 (Anteater), Tarantula. PID: 17047
================================================================================
[INF] Creating flows for L2, IPv4, IPv6
Active plugins:
    01: basicFlow, 0.8.7
    02: tcpStates, 0.8.7
    03: regex_pcre, 0.8.7
    04: txtSink, 0.8.7
[INF] basicFlow: IPv4 Ver: 4, Rev: 20102019, Range Mode: 0, subnet ranges loaded: 310391 (310.39 K)
[INF] basicFlow: IPv6 Ver: 4, Rev: 20102019, Range Mode: 0, subnet ranges loaded: 21495 (21.50 K)
[INF] regex_pcre: 9 regexes loaded
Processing file: /home/user/data/faf-exercise.pcap
Link layer type: Ethernet [EN10MB/1]
Dump start: 1258544215.037210 sec (Wed 18 Nov 2009 11:36:55 GMT)
Dump stop : 1258594491.683288 sec (Thu 19 Nov 2009 01:34:51 GMT)
Total dump duration: 50276.646078 sec (13h 57m 56s)
Finished processing. Elapsed time: 0.029434 sec
Finished unloading flow memory. Time: 0.029452 sec
Percentage completed: 100.00%
Number of processed packets: 5902 (5.90 K)
Number of processed bytes: 4993414 (4.99 M)
Number of raw bytes: 4993414 (4.99 M)
Number of pcap bytes: 5087870 (5.09 M)
Number of IPv4 packets: 5902 (5.90 K) [100.00%]
Number of A packets: 1986 (1.99 K) [33.65%]
Number of B packets: 3916 (3.92 K) [66.35%]
Number of A bytes: 209315 (209.31 K) [4.19%]
Number of B bytes: 4784099 (4.78 M) [95.81%]
Average A packet load: 105.40
Average B packet load: 1221.68 (1.22 K)
--------------------------------------------------------------------------------
tcpStates: Aggregated anomaly flags: 0x4a
regex_pcre: 6 alarms in 4 flows with max severity: 6
--------------------------------------------------------------------------------
Headers count: min: 3, max: 3, average: 3.00
Number of TCP packets: 5902 (5.90 K) [100.00%]
Number of TCP bytes: 4993414 (4.99 M) [100.00%]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Number of processed   flows: 72
Number of processed A flows: 36 [50.00%]
Number of processed B flows: 36 [50.00%]
Number of request     flows: 36 [50.00%]
Number of reply       flows: 36 [50.00%]
Total   A/B    flow asymmetry: 0.00
Total req/rply flow asymmetry: 0.00
Number of processed   packets/flows: 81.97
Number of processed A packets/flows: 55.17
Number of processed B packets/flows: 108.78
Number of processed total packets/s: 0.12
Number of processed A+B packets/s: 0.12
Number of processed A   packets/s: 0.04
Number of processed   B packets/s: 0.08
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Number of average processed flows/s: 0.00
Average full raw bandwidth: 795 b/s
Average full bandwidth : 792 b/s
Max number of flows in memory: 18 [0.01%]
Memory usage: 0.05 GB [0.08%]
Aggregate flow status: 0x0000000000004000
[WRN] 6 alarms in 4 flows [5.56%]
[INF] IPv4
$

regex_pcre reports 6 alarms in 4 flows out of 72 flows. If you look at the flow file, IDs 4, 100 and 206 produce these alarms, all of which have the print bit (0x10) set.

$ tawk '$RgxCnt' faf-exercise_flows.txt | tcol
%dir  flowInd  flowStat            timeFirst          timeLast           duration   numHdrDesc  numHdrs  hdrDesc       srcMac             dstMac             ethType  ethVlanID  srcIP           srcIPCC  srcIPWho                       srcPort  dstIP          dstIPCC  dstIPWho           dstPort  l4Proto  tcpStates  RgxCnt  RID_RCTyp_RSev
B     3        0x0000000000004001  1258544216.915576  1258544217.008019  0.092443   1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              198.189.255.75  us       "California State University"  80       192.168.1.104  07       "Private network"  1260     6        0x00       1       4_15_3
B     33       0x0000000000004001  1258587444.873221  1258587445.638482  0.765261   1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              198.189.255.75  us       "California State University"  80       192.168.1.104  07       "Private network"  1908     6        0x02       2       100_1_0;100_1_0
B     34       0x0000000000004001  1258587445.998250  1258587446.047471  0.049221   1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              198.189.255.75  us       "California State University"  80       192.168.1.104  07       "Private network"  1910     6        0x02       1       100_1_0
B     36       0x0000000000004001  1258594163.487027  1258594185.427506  21.940479  1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:08:74:38:01:b4  0x0800              143.166.11.10   us       "Dell"                         64334    192.168.1.105  07       "Private network"  49330    6        0x02       2       206_35_6;206_35_6
$

Expert mode, timestamp, Alarm aggregation

Switch on aggregation of alarms, expert mode and the output of the timestamp just to see all the info provided to you. Recompile regex_pcre and run t2 on the pcap.

$ t2conf regex_pcre -D AGGR=1 -D EXPERTMODE=1 -D PKTTIME=1
$ t2build regex_pcre
...
$  t2 -r ~/data/faf-exercise.pcap -w ~/results/
...
--------------------------------------------------------------------------------
tcpStates: Aggregated anomaly flags: 0x4a
regex_pcre: 4 alarms in 4 flows with max severity: 6
--------------------------------------------------------------------------------
...
Aggregate flow status: 0x0002000000004000
[WRN] 4 alarms in 4 flows [5.56%]
[INF] IPv4
$

Since we aggregate, duplicate alarms are suppressed, which explains the reduction by two alarms. The remaining alarms still come from IDs 4, 100 and 206.
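
The two alarm counts can be reproduced from the RID_RCTyp_RSev column of the default run. The sketch below does not reflect how the plugin aggregates internally; it merely assumes that identical entries within one flow are counted once when aggregation is on.

```python
# RID_RCTyp_RSev values of the four alarm flows from the default run,
# keyed by flowInd.
default_run = {
    3: "4_15_3",
    33: "100_1_0;100_1_0",
    34: "100_1_0",
    36: "206_35_6;206_35_6",
}


def count_alarms(per_flow, aggregate):
    """Sum the alarms over all flows, optionally dropping duplicates."""
    total = 0
    for field in per_flow.values():
        alarms = field.split(";")
        if aggregate:
            # Assumption: identical entries within a flow count once.
            alarms = set(alarms)
        total += len(alarms)
    return total


print("without aggregation:", count_alarms(default_run, False))
print("with aggregation:   ", count_alarms(default_run, True))
```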

$ tawk '$RgxCnt' faf-exercise_flows.txt | tcol
%dir  flowInd  flowStat            timeFirst          timeLast           duration   numHdrDesc  numHdrs  hdrDesc       srcMac             dstMac             ethType  ethVlanID  srcIP           srcIPCC  srcIPWho                       srcPort  dstIP          dstIPCC  dstIPWho           dstPort  l4Proto  tcpStates  RgxCnt  RID_RCTyp_RSev_NPkt_BPos_RTme
B     3        0x0000000000004001  1258544216.915576  1258544217.008019  0.092443   1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              198.189.255.75  us       "California State University"  80       192.168.1.104  07       "Private network"  1260     6        0x00       1       4_15_3_18_12_1258544216.000960
B     33       0x0000000000004001  1258587444.873221  1258587445.638482  0.765261   1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              198.189.255.75  us       "California State University"  80       192.168.1.104  07       "Private network"  1908     6        0x02       1       100_1_0_1_0_1258587444.000924
B     34       0x0000000000004001  1258587445.998250  1258587446.047471  0.049221   1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              198.189.255.75  us       "California State University"  80       192.168.1.104  07       "Private network"  1910     6        0x02       1       100_1_0_1_0_1258587446.000016
B     36       0x0000000000004001  1258594163.487027  1258594185.427506  21.940479  1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:08:74:38:01:b4  0x0800              143.166.11.10   us       "Dell"                         64334    192.168.1.105  07       "Private network"  49330    6        0x02       1       206_35_6_3058_89_1258594184.000878
$

The first three numbers are the same as in the default case. The next are packet number, byte position in the packet and the time stamp.
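
For post-processing, the alarm column can be split into its components. This is a minimal sketch that assumes entries are separated by ";" and fields by "_", as seen in the outputs above; the Alarm type and parse_alarms are names invented for this example. The timestamp is kept as a string to avoid floating-point rounding.

```python
from typing import NamedTuple, Optional


class Alarm(NamedTuple):
    rule_id: int
    class_id: int
    severity: int
    packet_no: Optional[int] = None   # only present in expert mode
    byte_pos: Optional[int] = None    # only present in expert mode
    timestamp: Optional[str] = None   # only present with PKTTIME=1


def parse_alarms(field):
    """Parse a RID_RCTyp_RSev[_NPkt_BPos_RTme] column value."""
    alarms = []
    for entry in field.split(";"):
        parts = entry.split("_")
        numbers = [int(p) for p in parts[:5]]        # up to five integers
        stamp = parts[5] if len(parts) > 5 else None
        alarms.append(Alarm(*numbers, timestamp=stamp))
    return alarms


print(parse_alarms("100_1_0;100_1_0"))
print(parse_alarms("206_35_6_3058_89_1258594184.000878"))
```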

Regex based pcap extraction

As described in the pcap extraction tutorial, the regex_pcre plugin can also extract packets on an alarm basis. The pcapd plugin acts on the FL_ALARM bit, which a firing regex rule sets in flowStat if SALRMFLG is enabled.

Recompile regex_pcre and run t2 on the pcap.

$ t2conf regex_pcre -D SALRMFLG=1
$ t2build regex_pcre pcapd
...

$  t2 -r ~/data/faf-exercise.pcap -w ~/results/
...
--------------------------------------------------------------------------------
tcpStates: Aggregated anomaly flags: 0x4a
regex_pcre: 4 alarms in 4 flows with max severity: 6
pcapd: number of packets extracted: 3233 (3.23 K) [54.78%]
--------------------------------------------------------------------------------
...
Aggregate flow status: 0x0002000000004000
[WRN] 4 alarms in 4 flows [5.56%]
[INF] IPv4
$

If you look into the results/ directory, you also see faf-exercise_pcapd.pcap created by pcapd.

$ cd ~/results
$ ls
faf-exercise_flows.txt  faf-exercise_headers.txt  faf-exercise_pcapd.pcap
$

Now run t2 on faf-exercise_pcapd.pcap, but unload pcapd or switch off SALRMFLG to prevent creating the same pcap again.

$ t2build -u pcapd
...
$ t2 -r ./faf-exercise_pcapd.pcap -w ~/results/
--------------------------------------------------------------------------------
tcpStates: Aggregated anomaly flags: 0x03
regex_pcre: 3 alarms in 3 flows with max severity: 5
pcapd: number of packets extracted: 3233 (3.23 K) [100.00%]
--------------------------------------------------------------------------------
Headers count: min: 3, max: 3, average: 3.00
Number of TCP packets: 3233 (3.23 K) [100.00%]
Number of TCP bytes: 4572148 (4.57 M) [100.00%]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Number of processed   flows: 11
Number of processed A flows: 10 [90.91%]
Number of processed B flows: 1 [9.09%]
Number of request     flows: 4 [36.36%]
Number of reply       flows: 7 [63.64%]
...
Aggregate flow status: 0x0002000000004000
[WRN] 3 alarms in 3 flows [27.27%]
[INF] IPv4
$

Only three alarms. The flow file shows that ID 206 is now missing. Why?

$ tawk '$RgxCnt' faf-exercise_pcapd_flows.txt | tcol
%dir  flowInd  flowStat            timeFirst          timeLast           duration  numHdrDesc  numHdrs  hdrDesc       srcMac             dstMac             ethType  ethVlanID  srcIP           srcIPCC  srcIPWho                       srcPort  dstIP          dstIPCC  dstIPWho           dstPort  l4Proto  tcpStates  RgxCnt  RID_RCTyp_RSev_NPkt_BPos_RTme
B     2        0x0002000000004001  1258544216.960826  1258544217.008019  0.047193  1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              198.189.255.75  us       "California State University"  80       192.168.1.104  07       "Private network"  1260     6        0x01       1       4_15_3_1_12_1258544216.000960
A     7        0x0002000000004001  1258587444.924436  1258587445.638482  0.714046  1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              198.189.255.75  us       "California State University"  80       192.168.1.104  07       "Private network"  1908     6        0x03       1       100_1_0_1_0_1258587444.000924
A     8        0x0002000000004001  1258587446.016254  1258587446.047471  0.031217  1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              198.189.255.75  us       "California State University"  80       192.168.1.104  07       "Private network"  1910     6        0x03       1       100_1_0_1_0_1258587446.000016
$

If you look for the FL_ALARM bit in flowStat, you see all flows which produced an alarm, including the ones where no alarm is printed in the flow file.

$ tawk 'bitsanyset($flowStat, 0x0002000000000000)' faf-exercise_pcapd_flows.txt | tcol
%dir  flowInd  flowStat            timeFirst          timeLast           duration   numHdrDesc  numHdrs  hdrDesc       srcMac             dstMac             ethType  ethVlanID  srcIP           srcIPCC  srcIPWho                       srcPort  dstIP           dstIPCC  dstIPWho                       dstPort  l4Proto  tcpStates  RgxCnt  RID_RCTyp_RSev_NPkt_BPos_RTme
A     1        0x0002000000004000  1258544216.554751  1258544216.723144  0.168393   1           3        eth:ipv4:tcp  00:0b:db:4f:6b:10  00:19:e3:e7:5d:23  0x0800              192.168.1.104   07       "Private network"              1259     77.67.44.206    us       "GTT Communications Inc."      80       6        0x03       0
A     2        0x0002000000004000  1258544216.929764  1258544217.008468  0.078704   1           3        eth:ipv4:tcp  00:0b:db:4f:6b:10  00:19:e3:e7:5d:23  0x0800              192.168.1.104   07       "Private network"              1260     198.189.255.75  us       "California State University"  80       6        0x01       0
B     2        0x0002000000004001  1258544216.960826  1258544217.008019  0.047193   1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              198.189.255.75  us       "California State University"  80       192.168.1.104   07       "Private network"              1260     6        0x01       1       4_15_3_1_12_1258544216.000960
A     3        0x0002000000004001  1258544217.346549  1258544217.513942  0.167393   1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              77.67.44.206    us       "GTT Communications Inc."      80       192.168.1.104   07       "Private network"              1261     6        0x03       0
A     4        0x0002000000004001  1258544217.752541  1258544217.919686  0.167145   1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              77.67.44.206    us       "GTT Communications Inc."      80       192.168.1.104   07       "Private network"              1263     6        0x03       0
A     5        0x0002000000004001  1258544218.127308  1258544218.294696  0.167388   1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              77.67.44.206    us       "GTT Communications Inc."      80       192.168.1.104   07       "Private network"              1265     6        0x03       0
A     6        0x0002000000004001  1258562467.761692  1258562509.653962  41.892270  1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              63.245.221.11   us       "Mozilla Corporation"          80       192.168.1.104   07       "Private network"              1379     6        0x03       0
A     7        0x0002000000004001  1258587444.924436  1258587445.638482  0.714046   1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              198.189.255.75  us       "California State University"  80       192.168.1.104   07       "Private network"              1908     6        0x03       1       100_1_0_1_0_1258587444.000924
A     8        0x0002000000004001  1258587446.016254  1258587446.047471  0.031217   1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              198.189.255.75  us       "California State University"  80       192.168.1.104   07       "Private network"              1910     6        0x03       1       100_1_0_1_0_1258587446.000016
A     10       0x0002000000004000  1258594164.127154  1258594185.427506  21.300352  1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:08:74:38:01:b4  0x0800              143.166.11.10   us       "Dell"                         64334    192.168.1.105   07       "Private network"              49330    6        0x03       0
A     9        0x0002000000004000  1258594163.487490  1258594185.618346  22.130856  1           3        eth:ipv4:tcp  00:08:74:38:01:b4  00:19:e3:e7:5d:23  0x0800              192.168.1.105   07       "Private network"              49329    143.166.11.10   us       "Dell"                         21       6        0x03       0
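
The same bit test can be done outside of tawk. In the sketch below, the FL_ALARM mask is the one used with bitsanyset above; the IPv4 (0x4000) and reply (0x0001) bits are inferred from the rule comments in the rule file ("FlwStat: IPv4, Rply" with 0x00004001) and should be treated as assumptions. The function name describe is made up for this example.

```python
FL_ALARM = 0x0002000000000000  # mask used with bitsanyset above
IPV4 = 0x4000                  # assumption, inferred from the rule comments
REPLY = 0x0001                 # assumption, inferred from the rule comments


def describe(flow_stat_hex):
    """Report a few flowStat bits of a hex string from the flow file."""
    value = int(flow_stat_hex, 16)
    return {
        "alarm": bool(value & FL_ALARM),
        "ipv4": bool(value & IPV4),
        "reply": bool(value & REPLY),
    }


# Values copied from the flow files shown in this tutorial.
for stat in ("0x0002000000004001", "0x0002000000004000", "0x0000000000004001"):
    print(stat, describe(stat))
```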

For forensic purposes, it is useful to also extract the flow direction which did not produce an alarm, but is part of the alarm process of the opposite flow.

Extract also the opposite flows

To also extract the opposite flow of an alarm flow, the constant PD_OPP has to be enabled and the pcapd plugin recompiled. Then rerun t2 on faf-exercise.pcap.

$ t2conf pcapd -D PD_OPP=1
$ t2build pcapd
...
$ t2 -r ./faf-exercise.pcap -w ~/results/
...
--------------------------------------------------------------------------------
tcpStates: Aggregated anomaly flags: 0x4a
regex_pcre: 4 alarms in 4 flows with max severity: 6
pcapd: number of packets extracted: 4775 (4.78 K) [80.90%]
--------------------------------------------------------------------------------
...
Aggregate flow status: 0x0002000000004000
[WRN] 4 alarms in 4 flows [5.56%]
[INF] IPv4
$ t2build -u pcapd
...
$ t2 -r ./faf-exercise_pcapd.pcap -w ~/results/
...
--------------------------------------------------------------------------------
tcpStates: Aggregated anomaly flags: 0x43
regex_pcre: 3 alarms in 3 flows with max severity: 5
--------------------------------------------------------------------------------
Headers count: min: 3, max: 3, average: 3.00
Number of TCP packets: 4775 (4.78 K) [100.00%]
Number of TCP bytes: 4699757 (4.70 M) [100.00%]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Number of processed   flows: 20
Number of processed A flows: 10 [50.00%]
Number of processed B flows: 10 [50.00%]
Number of request     flows: 10 [50.00%]
Number of reply       flows: 10 [50.00%]
...
Aggregate flow status: 0x0002000000004000
[WRN] 3 alarms in 3 flows [15.00%]
[INF] IPv4

Now we have 20 flows, because the opposite A or B flow is also extracted, as you can see in the flow file below.

$ tcol faf-exercise_pcapd_flows.txt
%dir  flowInd  flowStat            timeFirst          timeLast           duration    numHdrDesc  numHdrs  hdrDesc       srcMac             dstMac             ethType  ethVlanID  srcIP           srcIPCC  srcIPWho                       srcPort  dstIP           dstIPCC  dstIPWho                       dstPort  l4Proto  tcpStates  RgxCnt  RID_RCTyp_RSev_NPkt_BPos_RTme
A     1        0x0002000000004000  1258544216.554751  1258544216.723144  0.168393    1           3        eth:ipv4:tcp  00:0b:db:4f:6b:10  00:19:e3:e7:5d:23  0x0800              192.168.1.104   07       "Private network"              1259     77.67.44.206    us       "GTT Communications Inc."      80       6        0x01       0
B     1        0x0000000000004001  1258544216.720958  1258544216.888595  0.167637    1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              77.67.44.206    us       "GTT Communications Inc."      80       192.168.1.104   07       "Private network"              1259     6        0x01       0
A     2        0x0002000000004000  1258544216.929764  1258544217.008468  0.078704    1           3        eth:ipv4:tcp  00:0b:db:4f:6b:10  00:19:e3:e7:5d:23  0x0800              192.168.1.104   07       "Private network"              1260     198.189.255.75  us       "California State University"  80       6        0x01       0
B     2        0x0002000000004001  1258544216.936827  1258544217.008019  0.071192    1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              198.189.255.75  us       "California State University"  80       192.168.1.104   07       "Private network"              1260     6        0x01       1       4_15_3_18_12_1258544216.000960
A     3        0x0000000000004000  1258544217.347008  1258544217.348506  0.001498    1           3        eth:ipv4:tcp  00:0b:db:4f:6b:10  00:19:e3:e7:5d:23  0x0800              192.168.1.104   07       "Private network"              1261     77.67.44.206    us       "GTT Communications Inc."      80       6        0x03       0
B     3        0x0002000000004001  1258544217.346549  1258544217.513942  0.167393    1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              77.67.44.206    us       "GTT Communications Inc."      80       192.168.1.104   07       "Private network"              1261     6        0x03       0
A     4        0x0000000000004000  1258544217.753003  1258544217.754495  0.001492    1           3        eth:ipv4:tcp  00:0b:db:4f:6b:10  00:19:e3:e7:5d:23  0x0800              192.168.1.104   07       "Private network"              1263     77.67.44.206    us       "GTT Communications Inc."      80       6        0x03       0
B     4        0x0002000000004001  1258544217.752541  1258544217.919686  0.167145    1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              77.67.44.206    us       "GTT Communications Inc."      80       192.168.1.104   07       "Private network"              1263     6        0x03       0
A     5        0x0000000000004000  1258544218.127768  1258544218.129260  0.001492    1           3        eth:ipv4:tcp  00:0b:db:4f:6b:10  00:19:e3:e7:5d:23  0x0800              192.168.1.104   07       "Private network"              1265     77.67.44.206    us       "GTT Communications Inc."      80       6        0x03       0
B     5        0x0002000000004001  1258544218.127308  1258544218.294696  0.167388    1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              77.67.44.206    us       "GTT Communications Inc."      80       192.168.1.104   07       "Private network"              1265     6        0x03       0
A     6        0x0000000000004000  1258562467.900050  1258562509.633370  41.733320   1           3        eth:ipv4:tcp  00:0b:db:4f:6b:10  00:19:e3:e7:5d:23  0x0800              192.168.1.104   07       "Private network"              1379     63.245.221.11   us       "Mozilla Corporation"          80       6        0x01       0
B     6        0x0002000000004001  1258562467.761692  1258562509.653962  41.892270   1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              63.245.221.11   us       "Mozilla Corporation"          80       192.168.1.104   07       "Private network"              1379     6        0x01       0
A     7        0x0000000000004000  1258587444.924890  1258587445.631435  0.706545    1           3        eth:ipv4:tcp  00:0b:db:4f:6b:10  00:19:e3:e7:5d:23  0x0800              192.168.1.104   07       "Private network"              1908     198.189.255.75  us       "California State University"  80       6        0x03       0
B     7        0x0002000000004001  1258587444.924436  1258587445.638482  0.714046    1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              198.189.255.75  us       "California State University"  80       192.168.1.104   07       "Private network"              1908     6        0x03       1       100_1_0_1_0_1258587444.000924
A     8        0x0000000000004000  1258587446.016701  1258587446.040428  0.023727    1           3        eth:ipv4:tcp  00:0b:db:4f:6b:10  00:19:e3:e7:5d:23  0x0800              192.168.1.104   07       "Private network"              1910     198.189.255.75  us       "California State University"  80       6        0x03       0
B     8        0x0002000000004001  1258587446.016254  1258587446.047471  0.031217    1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:0b:db:4f:6b:10  0x0800              198.189.255.75  us       "California State University"  80       192.168.1.104   07       "Private network"              1910     6        0x03       1       100_1_0_1_0_1258587446.000016
A     10       0x0002000000004000  1258594164.127154  1258594185.427506  21.300352   1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:08:74:38:01:b4  0x0800              143.166.11.10   us       "Dell"                         64334    192.168.1.105   07       "Private network"              49330    6        0x03       0
B     10       0x0000000000004001  1258594164.127586  1258594191.015208  26.887622   1           3        eth:ipv4:tcp  00:08:74:38:01:b4  00:19:e3:e7:5d:23  0x0800              192.168.1.105   07       "Private network"              49330    143.166.11.10   us       "Dell"                         64334    6        0x43       0
A     9        0x0002000000004000  1258594163.487490  1258594185.618346  22.130856   1           3        eth:ipv4:tcp  00:08:74:38:01:b4  00:19:e3:e7:5d:23  0x0800              192.168.1.105   07       "Private network"              49329    143.166.11.10   us       "Dell"                         21       6        0x03       0
B     9        0x0000000000004001  1258594163.565990  1258594491.683288  328.117298  1           3        eth:ipv4:tcp  00:19:e3:e7:5d:23  00:08:74:38:01:b4  0x0800              143.166.11.10   us       "Dell"                         21       192.168.1.105   07       "Private network"              49329    6        0x43       0
$

So now you have the basics of the regex_pcre plugin. Create your own rules and test them on your own traffic.

You can also use the findexer instead of pcapd. Refer to the pcap extraction tutorial. We are always happy to get feedback to improve the Anteater.

Do not forget to reset the configuration of regex_pcre and pcapd:

$ t2conf regex_pcre -D SALRMFLG=0 -D AGGR=0 -D EXPERTMODE=0 -D PKTTIME=0
$ t2conf pcapd -D PD_OPP=0
$ t2build regex_pcre
...
$

Have fun!