Tutorial: Data Carving
This example is prepared for Unix systems (Linux and macOS) only. To ensure that no old or unnecessary plugins are loaded, first clean your plugin directory:
$ t2build -e
$ ls
then recompile T2:
$ t2build basicFlow basicStats tcpStates txtSink
As noted in previous tutorials, make sure you have a data and a results directory; it is good practice to keep the original data separate from the processed flows.
$ mkdir data results
Now you are all set for the following tutorial.
Plugins and Configuration
In the good old times before 2012, a lot of traffic was not encrypted, so content could be extracted from the packets defining a flow. This process is called data carving. Tranalyzer (T2) has this ability, but each plugin operating on unencrypted data has to implement it itself. Hence, the following plugins provide a data carving mode:
To illustrate the configuration and application of the data carving mode, let's have a look at the more complex plugin httpSniffer.
First move to the plugin's source directory and open its header file:
$ cd httpSniffer/src
$ vi httpSniffer.h
then move to the following block
// data carving modes
#define HTTP_SAVE_IMAGE 0 // 1: Save images in files under HTTP_IMAGE_PATH; 0: Dont save images
#define HTTP_SAVE_VIDEO 0 // 1: Save videos in files under HTTP_VIDEO_PATH; 0: Dont save videos
#define HTTP_SAVE_AUDIO 0 // 1: Save audios in files under HTTP_TEXT_PATH; 0: Dont save audios
#define HTTP_SAVE_MSG   0 // 1: Save messages in files under HTTP_MSG_PATH; 0: Dont save pdfs
#define HTTP_SAVE_TEXT  0 // 1: Save texts in files under HTTP_TEXT_PATH; 0: Dont save text
#define HTTP_SAVE_APPL  0 // 1: Save applications in files under HTTP_TEXT_PATH; 0: Dont save applications
#define HTTP_SAVE_PUNK  0 // 1: Save put/else content in files under HTTP_PUNK_PATH; 0: Dont save put content
If any of these defines is set to 1, the plugin will save all pictures, videos, audio files, etc. to the corresponding path under HTTP_PATH, defined below. If you want to keep the data even after turning off your computer, choose a directory outside /tmp, e.g. in your home path. Note that a considerable amount of data will be written to your storage if your pcap is large.
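If you prefer not to open an editor, such a define can also be toggled from the command line. The following sketch operates on a throwaway demo copy of the define line so it stays self-contained; in practice you would point sed at httpSniffer.h itself.

```shell
# Self-contained sketch: toggle a carving define without an editor.
# We work on a demo copy; in practice run the sed line on httpSniffer.h.
cat > /tmp/httpSniffer_demo.h <<'EOF'
#define HTTP_SAVE_IMAGE 0 // 1: Save images in files under HTTP_IMAGE_PATH; 0: Dont save images
EOF
# -i.bak works with both GNU sed (Linux) and BSD sed (macOS)
sed -i.bak 's/HTTP_SAVE_IMAGE 0/HTTP_SAVE_IMAGE 1/' /tmp/httpSniffer_demo.h
grep HTTP_SAVE_IMAGE /tmp/httpSniffer_demo.h
```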
The resulting file name of each item, for the specific packet and flow, is defined as follows:
Hence, extracted content can be directly linked to the very flow, direction, packet and MIME type. This facilitates automated searching and the correlation of content with flow metadata.
// User defined storage boundary conditions
#define HTTP_PATH       "/tmp/"                 // root path
#define HTTP_IMAGE_PATH HTTP_PATH"httpPicture/" // Path for pictures
#define HTTP_VIDEO_PATH HTTP_PATH"httpVideo/"   // Path for videos
#define HTTP_AUDIO_PATH HTTP_PATH"httpAudio/"   // Path for audios
#define HTTP_MSG_PATH   HTTP_PATH"httpMSG/"     // Path for messages
#define HTTP_TEXT_PATH  HTTP_PATH"httpText/"    // Path for texts
#define HTTP_APPL_PATH  HTTP_PATH"httpAppl/"    // Path for applications
#define HTTP_PUNK_PATH  HTTP_PATH"httpPunk/"    // Path for Post / else / unknown content
#define HTTP_NONAME     "nudel"                 // name of files without name
Imagine you only see the B flow: you can extract the content, but you do not have the file name. In this case, HTTP_NONAME defines a default name, followed by the same information as denoted above.
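Nameless content can therefore be located simply by searching for the default name. The _7_A_1 style suffix below is only an assumed illustration of the naming scheme (flow index, direction, packet number); mock files keep the snippet runnable without a prior T2 run.

```shell
# Mock carved-output directory (stand-in for /tmp/httpPunk)
mkdir -p /tmp/httpPunk_demo
touch /tmp/httpPunk_demo/nudel_7_A_1      # content without a name on the wire (assumed suffix)
touch /tmp/httpPunk_demo/report.pdf_8_B_2 # content with a proper name (assumed suffix)
# Everything that had no file name starts with HTTP_NONAME ("nudel")
find /tmp/httpPunk_demo -name 'nudel*'
```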
After setting the desired defines to 1, leave your editor and type
$ cd ..
$ ./autogen.sh
$ t2build httpSniffer
Prepare a directory where your pcaps reside and one where T2 should store the flow files. If you do not have a pcap, download this file (source: malware-traffic-analysis.net) and copy it into your data directory.
Now run T2:
$ t2 -r ~/data/yourpcap.pcap -w ~/results
or in case you use the traffic from the link above:
$ t2 -r ~/data/2015-05-08-traffic-analysis-exercise.pcap -w ~/results
Now move to the storage directory and have a look at the carved content:
$ cd /tmp
$ ls
httpPicture httpVideo ...
$ cd httpPicture
$ ls
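Because carved files keep whatever name or extension the server advertised, file(1) is handy for checking their real content type. The demo below fabricates a mislabeled picture (with an assumed name suffix) so it runs without a prior T2 run; after a real run, point it at /tmp/httpPicture instead.

```shell
# Demo: a GIF hiding behind a .jpg name, as may happen with carved content
mkdir -p /tmp/httpPicture_demo
printf 'GIF89a' > /tmp/httpPicture_demo/logo.jpg_3_A_1  # assumed name suffix
# file(1) looks at the magic bytes, not at the extension
cd /tmp/httpPicture_demo && file -- *
```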
The same procedure works for smtpDecode: open its header, set SMTP_SAVE to 1 in smtpDecode.h, recompile, rerun T2 and move into the SMTP output directory:
$ cd smtpDecode/src
$ vi smtpDecode.h
$ t2build smtpDecode
$ t2 -r ~/data/faf-exercise.pcap -w ~/results
$ cd /tmp/SMTPFILES
$ ls
...
Now you can read the emails without writing scripts. And in your flow file, all file names have the flowIndex attached, so you can correlate flows with the extracted files under /tmp/SMTPFILES.
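The correlation itself is easy to script. The snippet below uses mock data and two assumptions — that the flow index is the trailing _<number> of a carved file name, and that it sits in the first column of the flow file; check httpSniffer.h and the header line of your flow file for the actual layout.

```shell
# Mock flow file: first column assumed to hold flowInd
printf '42\t10.0.0.5\t93.184.216.34\n43\t10.0.0.6\t203.0.113.9\n' > /tmp/demo_flows.txt
fname="invoice.eml_42"   # assumed carved-file naming
flowInd=${fname##*_}     # strip everything up to the last '_'
# Print the flow record(s) matching the carved file
awk -F'\t' -v id="$flowInd" '$1 == id' /tmp/demo_flows.txt
```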