Tutorial: Geolocation and WHOIS behind it

Description

This tutorial details the different features of T2 concerning geolocation and the determination of the organization behind an IP address. There are two options:

  • basicFlow, T2 geolocation and organization
  • geoip, open source geolocation geoIP, maxmind DB

Note that geoip is considerably slower than basicFlow. On the other hand, geoip currently produces more information, such as the city, which basicFlow will also provide in the next release of 0.8.4 while still being faster.

Preparation

First, we need to prepare T2. If you did not complete the previous tutorials, just follow the procedure described below.

Restore T2 to a pristine state by removing all unnecessary or older plugins from the plugin folder ~/.tranalyzer/plugins, then compile the following plugins.

$ t2build -e
Are you sure you want to empty the plugin folder '/home/wurst/.tranalyzer/plugins' (y/N)? y
Plugin folder emptied
$ t2build tranalyzer2 basicFlow basicStats tcpStates connStat txtSink
...
BUILD SUCCESSFUL

If you did not create separate data and results directories yet, please do it now in another bash window. This facilitates your workflow:

$ mkdir ~/data
$ mkdir ~/results
$ cd ~/data

The anonymized sample PCAP used in this tutorial can be downloaded here: faf-exercise.pcap. Please extract it into your data folder. Now you are all set for the T2 IP labeling experiments.

basicFlow subnet and IP labeling

T2 provides its own geolabeling and IP identification service, so there is no need anymore to look up every IP address in a maxmind DB or via whois. The necessary files are updated with each version of T2. The bzip2 compressed subnet files for IPv4/6 are extracted by the autogen.sh script or by t2build using the programs under utils. We will look at them below.

$ basicFlow
$ ls
AUTHORS  autogen.sh  ChangeLog  configure.ac  COPYING  doc  Makefile.am  NEWS  README  src  subnets4.txt.bz2  subnets6.txt.bz2  t2plconf  tests  tor  utils
$

Now move to the src directory. The subnetHL4/6 files contain our binary-vector search algorithm. All .h files contain configuration constants.

$ cd src
$ ls
basicFlow.c  basicFlow.h  Makefile.am  subnetHL4.c  subnetHL4.h  subnetHL6.c  subnetHL6.h  utils.h
$

Open basicFlow.h and look for the user defined switches concerning subnets as shown below:

$ vi basicFlow.h

BFO_SUBNET_TEST activates the subnet labeling. It is switched on by default. If the GRE, L2TP or TEREDO output switches (not shown here) are activated, the subnet labeling can be activated separately for these addresses. We leave them off because the pcaps in this tutorial do not contain any of these encapsulations.

To be close to the default geoip plugin output, we switch on the Autonomous System Number (ASN) and the latitude, longitude output as indicated below. We leave the HEX option off. It toggles between a human readable whois output and a hex coded one. The latter is a powerful selection mechanism when searching large flow files.

Now open utils.h

$ vi utils.h

The SUBRNG constant defines the search mode, either CIDR or ranges. The range mode has the advantage that any range can be defined by one single line whereas the CIDR notation would need many lines in the subnet file. We leave it at the default CIDR.
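To make the difference between the two modes concrete, here is a small Python sketch. It is not part of T2 and only uses the standard ipaddress module. It shows how many CIDR lines a single arbitrary range would need. The function name range_to_cidrs and the example range are chosen for illustration only.

```python
import ipaddress

def range_to_cidrs(first, last):
    """Return the minimal list of CIDR blocks covering first..last (inclusive)."""
    start = ipaddress.ip_address(first)
    end = ipaddress.ip_address(last)
    return [str(net) for net in ipaddress.summarize_address_range(start, end)]

# One line in range mode, but two lines in CIDR mode:
print(range_to_cidrs("192.168.1.64", "192.168.1.103"))
# ['192.168.1.64/27', '192.168.1.96/29']
```

The less a range is aligned to power-of-two boundaries, the more CIDR lines it needs.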

The WHOLEN constant defines the length of the WHOIS column in the basicFlow output including the “\0”.

SUBVERS defines the subnet version. Different versions are NOT compatible. t2build will warn you if there is a discrepancy. So leave it at the default value.

Save all open files and rebuild basicFlow, basicStats and connStat, because basicStats and connStat depend on the subnetHL4.c routines if BFO_SUBNET_TEST is activated. You may also rebuild all plugins built so far, which is shorter to type. Then run t2 on the pcap. Instead of editing the files by hand, you can also use the t2conf command:

$ t2conf basicFlow -D BFO_SUBNET_ASN=1 -D BFO_SUBNET_LL=1
$ t2build -R
...
$ t2 -r ~/data/faf-exercise.pcap -w ~/results
================================================================================
Tranalyzer 0.8.5 (Anteater), Tarantula. PID: 12542
================================================================================
[INF] Creating flows for L2, IPv4, IPv6
Active plugins:
    01: basicFlow, 0.8.5
    02: basicStats, 0.8.5
    03: tcpStates, 0.8.5
    04: connStat, 0.8.5
    05: txtSink, 0.8.5
[INF] basicFlow: IPv4 Ver: 3, Rev: 01072019, Range Mode: 0, subnet ranges loaded: 312747 (312.75 K)
[INF] basicFlow: IPv6 Ver: 3, Rev: 01072019, Range Mode: 0, subnet ranges loaded: 21494 (21.49 K)
Processing file: /home/wurst/faf-exercise.pcap
Link layer type: Ethernet [EN10MB/1]
Dump start: 1258544215.037210 sec (Wed 18 Nov 2009 11:36:55 GMT)
Dump stop : 1258594491.683288 sec (Thu 19 Nov 2009 01:34:51 GMT)
Total dump duration: 50276.646078 sec (13h 57m 56s)
Finished processing. Elapsed time: 0.004831 sec
Finished unloading flow memory. Time: 0.004860 sec
Percentage completed: 100.00%
Number of processed packets: 5902 (5.90 K)
Number of processed bytes: 4993414 (4.99 M)
Number of raw bytes: 4993414 (4.99 M)
Number of pcap bytes: 5087870 (5.09 M)
Number of IPv4 packets: 5902 (5.90 K) [100.00%]
Number of A packets: 1986 (1.99 K) [33.65%]
Number of B packets: 3916 (3.92 K) [66.35%]
Number of A bytes: 209315 (209.31 K) [4.19%]
Number of B bytes: 4784099 (4.78 M) [95.81%]
Average A packet load: 105.40
Average B packet load: 1221.68 (1.22 K)
--------------------------------------------------------------------------------
basicStats: Biggest Talker: 143.166.11.10 (US): 3101 (3.10 K) [52.54%] packets
basicStats: Biggest Talker: 143.166.11.10 (US): 4436320 (4.44 M) [88.84%] bytes
tcpStates: Aggregated anomaly flags: 0x4a
connStat: Number of unique source IPs: 25
connStat: Number of unique destination IPs: 26
connStat: Number of unique source/destination IPs connections: 10
connStat: Max unique number of source IP / destination port connections: 18
connStat: IP prtcon/sdcon, prtcon/scon: 1.800000, 0.720000
connStat: Source IP with max connections: 192.168.1.104: 2 connections
connStat: Destination IP with max connections: 77.67.44.206 (FR): 1 connections
--------------------------------------------------------------------------------
Headers count: min: 3, max: 3, average: 3.00
Number of TCP packets: 5902 (5.90 K) [100.00%]
Number of TCP bytes: 4993414 (4.99 M) [100.00%]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Number of processed   flows: 72
Number of processed A flows: 36 [50.00%]
Number of processed B flows: 36 [50.00%]
Number of request     flows: 36 [50.00%]
Number of reply       flows: 36 [50.00%]
Total   A/B    flow asymmetry: 0.00
Total req/rply flow asymmetry: 0.00
Number of processed   packets/flows: 81.97
Number of processed A packets/flows: 55.17
Number of processed B packets/flows: 108.78
Number of processed total packets/s: 0.12
Number of processed A+B packets/s: 0.12
Number of processed A   packets/s: 0.04
Number of processed   B packets/s: 0.08
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Number of average processed flows/s: 0.00
Average full raw bandwidth: 795 b/s
Average full bandwidth : 792 b/s
Max number of flows in memory: 18 [0.01%]
Memory usage: 0.07 GB [0.11%]
Aggregate flow status: 0x0000000000004000
[INF] IPv4
$

Note that biggest talkers and connectors are now labeled with the country acronym, if one is found.

Let’s print the essential columns of the flow file relevant to geolocation and whois.

$ tawk '{ print $srcIP, $srcIPASN, $srcIPCC, $srcIPWho, $srcIPLat_Lng_relP, $dstIP, $dstIPASN, $dstIPCC, $dstIPWho, $dstIPLat_Lng_relP }' faf-exercise_flows.txt | sort -Vru -k1,1 | tcol
srcIP           srcIPASN  srcIPCC  srcIPWho                       srcIPLat_Lng_relP      dstIP          dstIPASN  dstIPCC  dstIPWho                   dstIPLat_Lng_relP
198.189.255.75  0         us       "California State University"  666_666_-1             192.168.1.104  0         09       "Private network"          666_666_-1
192.168.1.105   0         09       "Private network"              666_666_-1             192.168.1.1    0         09       "Private network"          666_666_-1
192.168.1.104   0         09       "Private network"              666_666_-1             77.67.44.206   3257      fr       "GTT Communications Inc."  48.71667_2.25_80
192.168.1.103   0         09       "Private network"              666_666_-1             192.168.1.1    0         09       "Private network"          666_666_-1
192.168.1.102   0         09       "Private network"              666_666_-1             192.168.1.1    0         09       "Private network"          666_666_-1
192.168.1.1     0         09       "Private network"              666_666_-1             192.168.1.103  0         09       "Private network"          666_666_-1
143.166.11.10   0         us       "Dell"                         30.51748_-97.67207_80  192.168.1.105  0         09       "Private network"          666_666_-1
77.67.44.206    3257      fr       "GTT Communications Inc."      48.71667_2.25_80       192.168.1.104  0         09       "Private network"          666_666_-1
63.245.221.11   395642    us       "Mozilla Corporation"          38.6409_-121.5228_80   192.168.1.104  0         09       "Private network"          666_666_-1
$

A 666 in the latitude, longitude column means that there is no location defined, which is also indicated by the radius -1. If you look in the subnets4.txt file you can confirm the IPv4 labeling. We will look at these files in detail below, in the section "Internal WHOIS subnet your own".
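If you post-process the flow file with your own scripts, the position column has to be split and the 666 sentinel filtered out. A minimal Python sketch, assuming the three values are always joined by underscores as in the output above (parse_position is a hypothetical helper, not a T2 function):

```python
def parse_position(field):
    """Split a Lat_Lng_relP value such as '48.71667_2.25_80' into three floats.

    Returns None when no location is defined (666 sentinel or radius -1).
    """
    lat, lng, radius = (float(x) for x in field.split("_"))
    if lat == 666 or lng == 666 or radius < 0:
        return None
    return lat, lng, radius

print(parse_position("48.71667_2.25_80"))  # (48.71667, 2.25, 80.0)
print(parse_position("666_666_-1"))        # None
```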

TOR address labeling

By default, TOR addresses are integrated into the subnet file by the subconv script under basicFlow/utils when t2build or autogen.sh is invoked. To switch this off, just edit the autogen.sh file and remove the -t option. Below, a flow file is shown where TOR addresses are present. I currently do not have an anonymized pcap for you to play with. I’m on it.

$ t2 -r ~/data/wurst.pcap -w ~/results
...
Aggregate flow status: 0x010038f2c098fb04
[WRN] L3 SnapLength < Length in IP header
[WRN] Consecutive duplicate IP ID
[WRN] IPv4/6 fragmentation header packet missing
[WRN] IPv4/6 packet fragmentation sequence not finished
[INF] IPv4
[INF] IPv6
[INF] IPv4/6 fragmentation
[INF] VLAN encapsulation
[INF] MPLS encapsulation
[INF] L2TP encapsulation
[INF] PPP/HDLC encapsulation
[INF] GRE encapsulation
[INF] AYIYA tunnel
[INF] Teredo tunnel
[INF] CAPWAP/LWAPP tunnel
[INF] Ethernet flows
[INF] Authentication Header (AH)
[INF] Encapsulating Security Payload (ESP)
[INF] TOR addresses

Note that the end report indicates that TOR addresses are present. In the flow file, TOR addresses are labeled with a "TOR," prefix in the Who column. Alternatively, just select all TOR traffic with the TORADD bit in flowStat as shown below.

$ tawk '{ if (bitsanyset($flowStat,0x0100000000000000)) print $dir, $flowInd, $flowStat, $srcIP, $srcIPASN, $srcIPCC, $srcIPWho, $srcIPLat_Lng_relP, $dstIP, $dstIPASN, $dstIPCC, $dstIPWho, $dstIPLat_Lng_relP }' wurst_flows.txt | tcol 
%dir  flowInd  flowStat            srcIP         srcIPASN  srcIPCC  srcIPWho                       srcIPLat_Lng_relP    dstIP         dstIPASN  dstIPCC  dstIPWho                       dstIPLat_Lng_relP
A     29388    0x0100000000004300  N.U.D.E   	 3303      ch       "Bluewin"                      46.20222_6.14569_80  L.O.L.U       8437      at       "TOR,Hutchison Drei Austria "  16.37208_48.20849_1
B     29388    0x0100000000004301  L.O.L.U       8437      at       "TOR,Hutchison Drei Austria "  16.37208_48.20849_1  N.U.D.E       3303      ch       "Bluewin"                      46.20222_6.14569_80
$
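The same bit test can be done outside of tawk. The following Python sketch assumes only what the command above shows, namely that the TOR address bit in flowStat is 0x0100000000000000. The function name has_tor_address is made up for this example.

```python
TORADD = 0x0100000000000000  # flowStat bit tested in the tawk command above

def has_tor_address(flow_stat):
    """Return True if the hex string flow_stat has the TOR address bit set."""
    return (int(flow_stat, 16) & TORADD) != 0

print(has_tor_address("0x0100000000004300"))  # True, flow 29388 above
print(has_tor_address("0x0000000000004000"))  # False
```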

geoip plugin

The geoip plugin supports the open source GeoLiteCity (legacy geoIP) and Maxmind2 DBs. Note that the legacy geoIP DB is not supported upstream anymore. As a lot of people still use it, we will continue to support it in our plugin.

Now move to the geoip plugin and look into it:

$ geoip
$ ls
AUTHORS  autogen.sh  ChangeLog  configure.ac  COPYING  doc  GeoLite2-City.mmdb.gz  GeoLiteCity.dat.gz  GeoLiteCityv6.dat.gz  Makefile.am  NEWS  README  scripts  src  t2plconf  tests
$

Note the geoIP DBs GeoLiteCity.dat.gz and GeoLiteCityv6.dat.gz as well as the maxmind2 DB GeoLite2-City.mmdb.gz. If you move into the scripts folder you will see two scripts:

  • genkml.sh (map coordinates to google earth)
  • updatedb.sh (update DB)

The first maps a flow file to a KML google earth file to produce an earth view with the location of the various IPs. The second updates the DBs. Please refer to the documentation and the doc folder for detailed information.

Now move to the src directory and look into the .h file:

$ cd src
$ ls
geoip.c  geoip.h  Makefile.am
$ vi geoip.h

The most important setting is the type of DB: since 0.8.4, the default is the maxmind DB. As you can see, the classification of srcIP and dstIP can be enabled separately. The output of country, city, language, etc. can also be enabled individually. For this tutorial we leave everything in the default configuration as shown below.

...
// user defines
#define GEOIP_LEGACY     0 // Whether to use GeoLite2 (0) or the GeoLite legacy database (1)

#define GEOIP_SRC        1 // whether or not to display geo info for the source IP
#define GEOIP_DST        1 // whether or not to display geo info for the destination IP

#define GEOIP_CONTINENT  2 // 0: no continent, 1: name (GeoLite2), 2: two letters code
#define GEOIP_COUNTRY    2 // 0: no country, 1: name, 2: two letters code, 3: three letters code (Legacy)
#define GEOIP_CITY       1 // whether or not to display the city of the IP
#define GEOIP_POSTCODE   1 // whether or not to display the postal code of the IP
#define GEOIP_POSITION   1 // whether or not to display the position (latitude, longitude) of the IP
#define GEOIP_METRO_CODE 0 // whether or not to display the metro (dma) code of the IP (US only)

#if GEOIP_LEGACY == 0
#define GEOIP_ACCURACY   1    // whether or not to display the accuracy (GeoLite2)
#define GEOIP_TIMEZONE   1    // whether or not to display the time zone (GeoLite2)
#define GEOIP_LANG       "en" // Output language: en, de, fr, es, ja, pt-BR, ru, zh-CN, ...
#define GEOIP_BUFSIZE    64   // buffer size
#else // GEOIP_LEGACY == 1
#define GEOIP_REGION     1 // 0: no region,  1: name, 2: code
#define GEOIP_AREA_CODE  0 // whether or not to display the telephone area code of the IP
#define GEOIP_NETMASK    1 // 0: no netmask, 1: netmask as int (cidr),
                           // 2: netmask as hex (IPv4 only), 3: netmask as IP (IPv4 only)
#define GEOIP_DB_CACHE   2 // 0: read DB from file system (slower, least memory)
                           // 1: index cache (cache frequently used index only)
                           // 2: memory cache (faster, more memory)
#endif // GEOIP_LEGACY == 1

#define GEOIP_UNKNOWN    "--" // Representation of unknown locations (GeoIP's default)
...

Now compile the plugin and rerun T2 on the same pcap.

$ t2build geoip
...
$ t2 -r ~/data/faf-exercise.pcap -w ~/results/
...
$

To compare with the basicFlow output, the geoip columns are printed next to the basicFlow position column dstIPLat_Lng_relP:

$ tawk '{ print $srcIP, $dstIP, $dstIPLat_Lng_relP, $srcIpContinent, $srcIpCountry, $srcIpCity, $srcIpPostcode, $srcIpAccuracy, $srcIpLat, $srcIpLong, $srcIpTimeZone, $dstIpContinent, $dstIpCountry, $dstIpCity, $dstIpPostcode, $dstIpAccuracy, $dstIpLat, $dstIpLong, $dstIpTimeZone }' faf-exercise_flows.txt | sort -Vru -k1,1 | tcol
srcIP           dstIP          dstIPLat_Lng_relP  srcIpContinent  srcIpCountry  srcIpCity     srcIpPostcode  srcIpAccuracy  srcIpLat   srcIpLong    srcIpTimeZone          dstIpContinent  dstIpCountry  dstIpCity  dstIpPostcode  dstIpAccuracy  dstIpLat   dstIpLong  dstIpTimeZone
198.189.255.75  192.168.1.104  666_666_-1         NA              US            "Long Beach"  90802          5              33.770600  -118.182000  "America/Los_Angeles"  --              --            "--"       --             0              0.000000   0.000000   ""
192.168.1.105   192.168.1.1    666_666_-1         --              --            "--"          --             0              0.000000   0.000000     ""                     --              --            "--"       --             0              0.000000   0.000000   ""
192.168.1.104   77.67.44.206   48.71667_2.25_80   --              --            "--"          --             0              0.000000   0.000000     ""                     EU              IE            "--"       --             200            53.347200  -6.243900  "Europe/Dublin"
192.168.1.103   192.168.1.1    666_666_-1         --              --            "--"          --             0              0.000000   0.000000     ""                     --              --            "--"       --             0              0.000000   0.000000   ""
192.168.1.102   192.168.1.1    666_666_-1         --              --            "--"          --             0              0.000000   0.000000     ""                     --              --            "--"       --             0              0.000000   0.000000   ""
192.168.1.1     192.168.1.103  666_666_-1         --              --            "--"          --             0              0.000000   0.000000     ""                     --              --            "--"       --             0              0.000000   0.000000   ""
143.166.11.10   192.168.1.105  666_666_-1         NA              US            "--"          --             1000           37.751000  -97.822000   "--"                   --              --            "--"       --             0              0.000000   0.000000   ""
77.67.44.206    192.168.1.104  666_666_-1         EU              IE            "--"          --             200            53.347200  -6.243900    "Europe/Dublin"        --              --            "--"       --             0              0.000000   0.000000   ""
63.245.221.11   192.168.1.104  666_666_-1         NA              US            "San Jose"    95124          50             37.256300  -121.922900  "America/Los_Angeles"  --              --            "--"       --             0              0.000000   0.000000   ""
...
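The two sources do not always agree: for 77.67.44.206, basicFlow reports a position in France, geoip one in Ireland. To quantify such a difference you can compute the great-circle distance between the two positions. The following Python sketch uses the standard haversine formula with a mean Earth radius of 6371 km. It is an illustration only and not part of either plugin.

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in km between two points given in degrees."""
    earth_radius_km = 6371.0  # mean Earth radius
    dlat = radians(lat2 - lat1)
    dlng = radians(lng2 - lng1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlng / 2) ** 2)
    return 2 * earth_radius_km * asin(sqrt(a))

# 77.67.44.206: basicFlow position vs. geoip position, taken from the outputs above
print(round(haversine_km(48.71667, 2.25, 53.3472, -6.2439)), "km")
```

The result is several hundred kilometers, far more than the accuracy radius of either source, so treat both positions as approximate.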

Hex code labeling

As mentioned above, t2 supports hex code labeling, which is a powerful flow selection mechanism, as integer AND operations are much faster than string comparisons. Open basicFlow.h and set BFO_SUBNET_HEX to 1, then rebuild all and rerun t2 as indicated below:

$ t2conf basicFlow -D BFO_SUBNET_HEX=1
$ t2build -R
...
$ t2 -r ~/data/faf-exercise.pcap -w ~/results/
...
$

Now the strings are gone and replaced by 32 bit hex numbers:

$ tawk '{ print $srcIP, $srcIPASN, $srcIPCC, $srcIPLat_Lng_relP, $dstIP, $dstIPASN, $dstIPCC, $dstIPLat_Lng_relP }' faf-exercise_flows.txt | sort -Vru -k1,1 | tcol
srcIP		srcIPASN	srcIPCC		srcIPLat_Lng_relP	dstIP		dstIPASN	dstIPCC		dstIPLat_Lng_relP
198.189.255.75	0		0x00000000	0_0_0			192.168.1.104	0		0x010136e0	666_666_-1
192.168.1.105	0		0x010136e0	666_666_-1		192.168.1.1	0		0x010136e0	666_666_-1
192.168.1.104	0		0x010136e0	666_666_-1		77.67.44.206	3257		0x4b0093da	48.71667_2.25_80
192.168.1.103	0		0x010136e0	666_666_-1		192.168.1.1	0		0x010136e0	666_666_-1
192.168.1.102	0		0x010136e0	666_666_-1		192.168.1.1	0		0x010136e0	666_666_-1
192.168.1.1	0		0x010136e0	666_666_-1		192.168.1.103	0		0x010136e0	666_666_-1
143.166.11.10	0		0xe90068d8	30.51748_-97.67207_80	192.168.1.105	0		0x010136e0	666_666_-1
77.67.44.206	3257		0x4b0093da	48.71667_2.25_80	192.168.1.104	0		0x010136e0	666_666_-1
63.245.221.11	395642		0xe900fd1a	38.6409_-121.5228_80	192.168.1.104	0		0x010136e0	666_666_-1

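The 32 bit hex coding can be read off the examples in this section: the upper 8 bits (mask 0xff000000) hold the country code and bit 0x00800000 flags a TOR address, while the remaining lower bits belong to the organization code.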

The code to text resolution can be found in

  • who4CntryCds.txt
  • who4/6OrgCds.txt

You can now select all flows of a certain country and/or organization with a simple tawk script, for example all flows whose source IP is located in the US (country code 0xe9):

$ tawk 'and(strtonum($srcIPCC), 0xff000000) == 0xe9000000 { print $srcIP, $srcIPASN, $srcIPCC, $srcIPLat_Lng_relP, $dstIP, $dstIPASN, $dstIPCC, $dstIPLat_Lng_relP }' faf-exercise_flows.txt | sort -Vru -k1,1 | tcol
srcIP		srcIPASN	srcIPCC		srcIPLat_Lng_relP	dstIP		dstIPASN	dstIPCC		dstIPLat_Lng_relP
143.166.11.10	0		0xe90068d8	30.51748_-97.67207_80	192.168.1.105	0		0x010136e0	666_666_-1
63.245.221.11	395642		0xe900fd1a	38.6409_-121.5228_80	192.168.1.104	0		0x010136e0	666_666_-1

In $srcIPCC or $dstIPCC, the bit 0x00800000 indicates a TOR address. Alternatively, you can select TOR flows just with the $flowStat bit, as shown below on the pcap I did not anonymize yet.

$ tawk '{ if (bitsanyset($flowStat,0x0100000000000000)) print $dir, $flowInd, $flowStat, $srcIP, $srcIPASN, $srcIPCC, $srcIPLat_Lng_relP, $dstIP, $dstIPASN, $dstIPCC, $dstIPLat_Lng_relP }' wurst_flows.txt | tcol
%dir  flowInd  flowStat            srcIP         srcIPASN  srcIPCC     srcIPLat_Lng_relP    dstIP         dstIPASN  dstIPCC     dstIPLat_Lng_relP
A     29388    0x0100000000004300  K.A.C.K  	 3303      0x2c003a60  46.20222_6.14569_80  S.H.I.T   	  8437      0x0f80c5a4  16.37208_48.20849_1
B     29388    0x0100000000004301  S.H.I.T   	 8437      0x0f80c5a4  16.37208_48.20849_1  K.A.C.K   	  3303      0x2c003a60  46.20222_6.14569_80
$
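The two masks used in this section can be combined into a tiny decoder. The Python sketch below relies only on what the text and the tawk commands show: the country sits in the upper 8 bits (mask 0xff000000) and 0x00800000 is the TOR bit. The helper names are made up for this example and nothing is assumed about the remaining bits.

```python
COUNTRY_MASK = 0xff000000  # mask used in the country selection above
TOR_BIT = 0x00800000       # bit flagging a TOR address

def country_byte(code):
    """Return the country part (upper 8 bits) of a hex coded CC value."""
    return (int(code, 16) & COUNTRY_MASK) >> 24

def is_tor(code):
    """Return True if the TOR bit is set in a hex coded CC value."""
    return (int(code, 16) & TOR_BIT) != 0

# Codes taken from the flow files above
for code in ("0xe90068d8", "0x010136e0", "0x2c003a60", "0x0f80c5a4"):
    print(code, hex(country_byte(code)), is_tor(code))
```

Only 0x0f80c5a4, the address labeled "TOR,Hutchison Drei Austria " earlier, has the TOR bit set.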

Internal WHOIS subnet your own

Which admin has never asked himself WHO, WHERE and WHY on earth somebody is doing what he is doing, or how to find an in-house IP such as 10.23.4.5? Yeah, I did, lots of times, and got weary of looking things up in Excel sheets, logs or, if I was lucky, DBs. Now try to do that for 1000 addresses and hand over a report in no time.

As the private IPv4/6 address space is hopefully only documented inside your organization, we need to build our own subnet file. Building one is fairly easy if the IP to location and organization mapping is available as a tab-separated or CSV file. So that you can expand or rewrite the current subnet files, T2 ships the .txt versions together with scripts to convert them to the T2 compatible binary format. That is the reason why the initial build of basicFlow takes a bit longer.

Let’s now look at the basicFlow directory after the plugin is compiled. The HL.txt files are intermediate files on the way to the binary format HL.bin. The original is the decompressed subnets4/6.txt file, which contains all the information.

$ ls
aclocal.m4  autom4te.cache  config.h     config.status  COPYING  libtool   Makefile.am  README    subnets4_HLP.bin  subnets4.txt      subnets6_HLP.txt  subnets6.txt.bz2  tor
AUTHORS     build-aux       config.h.in  configure      doc      m4        Makefile.in  src       subnets4_HLP.txt  subnets4.txt.bz2  subnets6_HL.txt   t2plconf          utils
autogen.sh  ChangeLog       config.log   configure.ac   INSTALL  Makefile  NEWS         stamp-h1  subnets4_HL.txt   subnets6_HLP.bin  subnets6.txt      tests
$

Open subnets4.txt

$ lsx subnets4.txt
#                                   3    21062019
# IP CIDR                           Msk  IP range                         CtryWhoCode  ASN    Uncert  Latitude    Longitude    WhoCntry  Org
# Begin IPv4 private address space
10.0.0.0/8                          8    10.0.0.0-10.255.255.255          0x010136e0   0      -1.0    666.000000  666.000000   01        Private network
127.0.0.0/8                         8    127.0.0.0-127.255.255.255        0x0100e47e   0      -1.0    666.000000  666.000000   02        Loopback address
100.64.0.0/10                       10   100.64.0.0-100.127.255.255       0x0101616f   0      -1.0    666.000000  666.000000   03        Shared address space
169.254.0.0/16                      16   169.254.0.0-169.254.255.255      0x0100e245   0      -1.0    666.000000  666.000000   04        Link-local address
172.16.0.0/12                       12   172.16.0.0-172.31.255.255        0x010136e0   0      -1.0    666.000000  666.000000   05        Private network
192.0.0.0/24                        24   192.0.0.0-192.0.0.255            0x010136e0   0      -1.0    666.000000  666.000000   06        Private network
192.0.2.0/24                        24   192.0.2.0-192.0.2.255            0x01017a78   0      -1.0    666.000000  666.000000   07        TEST-NET-1
192.88.99.0/24                      24   192.88.99.0-192.88.99.255        0x0100b462   0      -1.0    666.000000  666.000000   08        IPv6 to IPv4 relay
192.168.0.0/16                      16   192.168.0.0-192.168.255.255      0x010136e0   0      -1.0    666.000000  666.000000   09        Private network
198.18.0.0/15                       15   198.18.0.0-198.119.255.255       0x010136e0   0      -1.0    666.000000  666.000000   10        Private network
198.51.100.0/16                     16   198.51.100.0-198.51.100.255      0x01017a79   0      -1.0    666.000000  666.000000   11        TEST-NET-2
203.0.113.0/24                      24   203.0.113.0-203.0.113.255        0x01017a7a   0      -1.0    666.000000  666.000000   12        TEST-NET-3
224.0.0.0/4                         4    224.0.0.0-239.255.255.255        0x0100fd73   0      -1.0    666.000000  666.000000   13        Multicast
240.0.0.0/4                         4    240.0.0.0-255.255.255.254        0x01014696   0      -1.0    666.000000  666.000000   14        Reserved
255.255.255.255/32                  32   255.255.255.255-255.255.255.255  0x01003609   0      -1.0    666.000000  666.000000   15        Broadcast
# End IPv4 privat address space
1.0.0.0/24                          24   1.0.0.0-1.0.0.255                0xe9000a24   13335  80.0    34.052230   -118.243680  us        APNIC Research and Development
1.0.1.0/24                          24   1.0.1.0-1.0.1.255                0x31003c7c   0      80.0    26.061390   119.306110   cn        CHINANET FUJIAN PROVINCE NETWORK
1.0.4.0/22                          22   1.0.4.0-1.0.7.255                0x1001b2e9   0      80.0    -37.814000  144.963320   au        Wirefreebroadband Pty Ltd
...

You can now write your own subnet file or modify the original one. Make a copy of subnets4.txt first, so that you have an easy way to restore the default. Let’s define the 192.168. network a bit more precisely by adding two more lines describing the Knoedelrutschen company with one /24 and one /28 network:

...
192.168.0.0/16                      16   192.168.0.0-192.168.255.255      0x010136e0   0      -1.0    666.000000  666.000000   09        Private network
# Begin Knoedelrutschen company internal network
192.168.1.0/24                      24   192.168.1.0-192.168.1.255        0x010136e0   0      -1.0    666.000000  666.000000   eu        Knoedelrutschen Inc
# Begin Knoedelrutschen company internal sub networks
192.168.1.0/28                      28   192.168.1.0-192.168.1.15         0x010136e0   0       0.05   48.856892   2.350850     fr        KRI, Managers, Eifeltower, over paid
# End Knoedelrutschen company internal sub networks
198.18.0.0/15                       15   198.18.0.0-198.119.255.255       0x010136e0   0      -1.0    666.000000  666.000000   10        Private network
...
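When writing your own lines, the IP range column should agree with the CIDR column. Instead of computing it by hand, you can derive it, for example with the following Python sketch based on the standard ipaddress module. The helper name cidr_to_range is made up for this example and is not one of the T2 utilities.

```python
import ipaddress

def cidr_to_range(cidr):
    """Return the 'first-last' range string for a CIDR block.

    strict=True raises a ValueError if the address has host bits set,
    which catches typos such as a wrong mask.
    """
    net = ipaddress.ip_network(cidr, strict=True)
    return f"{net.network_address}-{net.broadcast_address}"

print(cidr_to_range("192.168.1.0/24"))  # 192.168.1.0-192.168.1.255
print(cidr_to_range("192.168.1.0/28"))  # 192.168.1.0-192.168.1.15
```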

Because autogen.sh decompresses subnets4.txt.bz2 and thus overwrites the subnet file, we first need to bzip2 our subnets4.txt and then build basicFlow with the -f option. For beginners, that is the easiest way to regenerate the binary and copy it to the .tranalyzer/plugins folder. Then rerun t2 on the pcap.

$ bzip2 -cf subnets4.txt > subnets4.txt.bz2
$ t2build -f basicFlow

...
$ t2 -r ~/data/faf-exercise.pcap -w ~/results/
...
$

Now open the flow file and you will see your IP labeling.

$ tawk '{ print $srcIP, $srcIPASN, $srcIPCC, $srcIPWho, $srcIPLat_Lng_relP, $dstIP, $dstIPASN, $dstIPCC, $dstIPWho, $dstIPLat_Lng_relP }' faf-exercise_flows.txt | sort -Vru -k1,1
srcIP	srcIPASN	srcIPCC	srcIPWho			srcIPLat_Lng_relP	dstIP		dstIPASN	dstIPCC	dstIPWho			dstIPLat_Lng_relP
198.189.255.75	0	us	"California State University"	666_666_-1		192.168.1.104	0		eu	"Knoedelrutschen Inc"		666_666_-1
192.168.1.105	0	eu	"Knoedelrutschen Inc"		666_666_-1		192.168.1.1	0		fr	"KRI, Managers, Eifeltower, "	48.85689_2.35085_0.05
192.168.1.104	0	eu	"Knoedelrutschen Inc"		666_666_-1		77.67.44.206	3257		fr	"GTT Communications Inc."	48.71667_2.25_80
192.168.1.103	0	eu	"Knoedelrutschen Inc"		666_666_-1		192.168.1.1	0		fr	"KRI, Managers, Eifeltower, "	48.85689_2.35085_0.05
192.168.1.102	0	eu	"Knoedelrutschen Inc"		666_666_-1		192.168.1.1	0		fr	"KRI, Managers, Eifeltower, "	48.85689_2.35085_0.05
192.168.1.1	0	fr	"KRI, Managers, Eifeltower, "	48.85689_2.35085_0.05	192.168.1.103	0		eu	"Knoedelrutschen Inc"		666_666_-1
143.166.11.10	0	us	"Dell"				30.51748_-97.67207_80	192.168.1.105	0		eu	"Knoedelrutschen Inc"		666_666_-1
77.67.44.206	3257	fr	"GTT Communications Inc."	48.71667_2.25_80	192.168.1.104	0		eu	"Knoedelrutschen Inc"		666_666_-1
63.245.221.11	395642	us	"Mozilla Corporation"		38.6409_-121.5228_80	192.168.1.104	0		eu	"Knoedelrutschen Inc"		666_666_-1
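Note that the manager label is cut off after "Eifeltower, ". This is the effect of the WHOLEN constant in utils.h. The Python sketch below emulates such a fixed-size string buffer. The value 28 is an assumption inferred from the 27 character labels in the output above, so check utils.h for the actual value.

```python
def truncate_who(label, wholen=28):
    """Emulate a fixed-size C string buffer of wholen bytes.

    One byte is reserved for the terminating NUL, so at most
    wholen - 1 characters of the label survive.
    wholen=28 is an assumed value, not read from utils.h.
    """
    return label[:wholen - 1]

print(repr(truncate_who("KRI, Managers, Eifeltower, over paid")))
# 'KRI, Managers, Eifeltower, '
```

If you need longer labels, increase WHOLEN and rebuild. Otherwise keep the distinguishing part of a label at its beginning.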

As the most important part of a company is the engineering department, let’s expand the network definition by one more /26 network:

...
192.168.0.0/16                      16   192.168.0.0-192.168.255.255      0x010136e0   0      -1.0    666.000000  666.000000   09        Private network
# Begin Knoedelrutschen company internal network
192.168.1.0/24                      24   192.168.1.0-192.168.1.255        0x010136e0   0       1000.0 666.000000  666.000000   eu	 Knoedelrutschen Inc
# Begin Knoedelrutschen company internal sub networks
192.168.1.0/28                      28   192.168.1.0-192.168.1.15         0x010136e0   0       1.5    48.856892   2.350850     fr        KRI, Managers, Eifeltower, over paid
192.168.1.64/26                     26   192.168.1.64-192.168.1.127       0x010136e0   0       0.01   46.947990   7.459672     ch        Engineers, Bern, @bears
# End Knoedelrutschen company internal sub networks
198.18.0.0/15                       15   198.18.0.0-198.119.255.255       0x010136e0   0      -1.0    666.000000  666.000000   10        Private network
...

Compress to bzip2, recompile and rerun t2.

$ bzip2 -cf subnets4.txt > subnets4.txt.bz2
$ t2build -f basicFlow

...
$ t2 -r ~/data/faf-exercise.pcap -w ~/results/
$

Note that the engineers are now properly labeled. An address located outside the managers and engineers networks would be labeled as Knoedelrutschen Inc.

$ tawk '{ print $srcIP, $srcIPASN, $srcIPCC, $srcIPWho, $srcIPLat_Lng_relP, $dstIP, $dstIPASN, $dstIPCC, $dstIPWho, $dstIPLat_Lng_relP }' faf-exercise_flows.txt | sort -Vru -k1,1
srcIP	srcIPASN	srcIPCC	srcIPWho			srcIPLat_Lng_relP	dstIP		dstIPASN	dstIPCC	dstIPWho			dstIPLat_Lng_relP
198.189.255.75	0	us	"California State University"	666_666_-1		192.168.1.104	0		ch	"Engineers, Bern, @bears"	46.94799_7.459672_0.01
192.168.1.105	0	ch	"Engineers, Bern, @bears"	46.94799_7.459672_0.01	192.168.1.1	0		fr	"KRI, Managers, Eifeltower, "	48.85689_2.35085_0.05
192.168.1.104	0	ch	"Engineers, Bern, @bears"	46.94799_7.459672_0.01	77.67.44.206	3257		fr	"GTT Communications Inc."	48.71667_2.25_80
192.168.1.103	0	ch	"Engineers, Bern, @bears"	46.94799_7.459672_0.01	192.168.1.1	0		fr	"KRI, Managers, Eifeltower, "	48.85689_2.35085_0.05
192.168.1.102	0	ch	"Engineers, Bern, @bears"	46.94799_7.459672_0.01	192.168.1.1	0		fr	"KRI, Managers, Eifeltower, "	48.85689_2.35085_0.05
192.168.1.1	0	fr	"KRI, Managers, Eifeltower, "	48.85689_2.35085_0.05	192.168.1.103	0		ch	"Engineers, Bern, @bears"	46.94799_7.459672_0.01
143.166.11.10	0	us	"Dell"				30.51748_-97.67207_80	192.168.1.105	0		ch	"Engineers, Bern, @bears"	46.94799_7.459672_0.01
77.67.44.206	3257	fr	"GTT Communications Inc."	48.71667_2.25_80	192.168.1.104	0		ch	"Engineers, Bern, @bears"	46.94799_7.459672_0.01
63.245.221.11	395642	us	"Mozilla Corporation"		38.6409_-121.5228_80	192.168.1.104	0		ch	"Engineers, Bern, @bears"	46.94799_7.459672_0.01
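The labeling above behaves like a most-specific-match lookup: of all subnets containing an address, the smallest one wins. The following Python sketch reproduces this behavior with a simple linear scan over the lines we added. It only illustrates the expected result and is not T2's implementation, which uses the binary-vector search in subnetHL4.c.

```python
import ipaddress

# (CIDR, label) pairs taken from the subnet lines above
SUBNETS = [
    ("192.168.0.0/16", "Private network"),
    ("192.168.1.0/24", "Knoedelrutschen Inc"),
    ("192.168.1.0/28", "KRI, Managers, Eifeltower, over paid"),
    ("192.168.1.64/26", "Engineers, Bern, @bears"),
]

def label(ip):
    """Return the label of the longest-prefix subnet containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    matches = []
    for cidr, who in SUBNETS:
        net = ipaddress.ip_network(cidr)
        if addr in net:
            matches.append((net.prefixlen, who))
    if not matches:
        return None
    # The longest prefix is the most specific subnet
    return max(matches)[1]

print(label("192.168.1.104"))  # Engineers, Bern, @bears
print(label("192.168.1.1"))    # KRI, Managers, Eifeltower, over paid
print(label("192.168.1.200"))  # Knoedelrutschen Inc
```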

So far we have been using the CIDR mode, so let’s now test the range mode. Open utils.h and set SUBRNG to 1, or use the t2conf command below.

$ t2conf basicFlow -D SUBRNG=1
$

Now t2 uses the third column (IP range) of the subnet file. Add a new network as listed below. If an entry has a dash in the CIDR column and CIDR mode is configured, the entry is ignored, as the range is definitely not CIDR. You can have any values in the CIDR or range column, as a non-CIDR range would consist of several CIDR rows. Here we clearly have a non-CIDR network and we are in range mode anyway. The SW and HW engineers are now separated.

...
192.168.0.0/16                      16   192.168.0.0-192.168.255.255      0x010136e0   0      -1.0    666.000000  666.000000   09        Private network
# Begin Knoedelrutschen company internal network
192.168.1.0/24                      24   192.168.1.0-192.168.1.255        0x010136e0   0       1000.0 666.000000  666.000000   eu	 Knoedelrutschen Inc
# Begin Knoedelrutschen company internal sub networks
192.168.1.0/28                      28   192.168.1.0-192.168.1.15         0x010136e0   0       1.5    48.856892   2.350850     fr        KRI, Managers, Eifeltower, over paid
192.168.1.64/26                     26   192.168.1.64-192.168.1.103       0x010136e0   0       0.01   46.947990   7.459672     ch        HW-Engineers, Bern, @bears
-                                   26   192.168.1.104-192.168.1.108      0x010136e0   0       0.01   46.947990   7.459672     ch        SW-Engineers, Bern, @bears
# End Knoedelrutschen company internal sub networks
198.18.0.0/15                       15   198.18.0.0-198.119.255.255       0x010136e0   0      -1.0    666.000000  666.000000   10        Private network
...

So again: bzip2, rebuild and rerun t2.

$ bzip2 -cf subnets4.txt > subnets4.txt.bz2
$ t2build -f basicFlow

...
$ t2 -r ~/data/faf-exercise.pcap -w ~/results/
$

If you look into the flow file, you will now discover that there are also SW-Engineers:

$ tawk '{ print $srcIP, $srcIPASN, $srcIPCC, $srcIPWho, $srcIPLat_Lng_relP, $dstIP, $dstIPASN, $dstIPCC, $dstIPWho, $dstIPLat_Lng_relP }' faf-exercise_flows.txt | sort -Vru -k1,1
srcIP	srcIPASN	srcIPCC	srcIPWho			srcIPLat_Lng_relP	dstIP		dstIPASN	dstIPCC	dstIPWho			dstIPLat_Lng_relP
198.189.255.75	0	us	"California State University"	666_666_-1		192.168.1.104	0		ch	"SW-Engineers, Bern, @bears"	46.94799_7.459672_0.01
192.168.1.105	0	ch	"SW-Engineers, Bern, @bears"	46.94799_7.459672_0.01	192.168.1.1	0		fr	"KRI, Managers, Eifeltower, "	48.85689_2.35085_0.05
192.168.1.104	0	ch	"SW-Engineers, Bern, @bears"	46.94799_7.459672_0.01	77.67.44.206	3257		fr	"GTT Communications Inc."	48.71667_2.25_80
192.168.1.103	0	ch	"HW-Engineers, Bern, @bears"	46.94799_7.459672_0.01	192.168.1.1	0		fr	"KRI, Managers, Eifeltower, "	48.85689_2.35085_0.05
192.168.1.102	0	ch	"HW-Engineers, Bern, @bears"	46.94799_7.459672_0.01	192.168.1.1	0		fr	"KRI, Managers, Eifeltower, "	48.85689_2.35085_0.05
192.168.1.1	0	fr	"KRI, Managers, Eifeltower, "	48.85689_2.35085_0.05	192.168.1.103	0		ch	"HW-Engineers, Bern, @bears"	46.94799_7.459672_0.01
143.166.11.10	0	us	"Dell"				30.51748_-97.67207_80	192.168.1.105	0		ch	"SW-Engineers, Bern, @bears"	46.94799_7.459672_0.01
77.67.44.206	3257	fr	"GTT Communications Inc."	48.71667_2.25_80	192.168.1.104	0		ch	"SW-Engineers, Bern, @bears"	46.94799_7.459672_0.01
63.245.221.11	395642	us	"Mozilla Corporation"		38.6409_-121.5228_80	192.168.1.104	0		ch	"SW-Engineers, Bern, @bears"	46.94799_7.459672_0.01

And don’t forget to reset the configs of basicFlow for the next tutorials:

$ t2conf basicFlow -D BFO_SUBNET_ASN=0 -D BFO_SUBNET_LL=0 -D BFO_SUBNET_HEX=0 -D SUBRNG=0
$ t2build -R
...
$

Have fun!