Exported .evtx files may contain corrupted data – Check interpretation of forensic tools.
Author: Jeffrey Wassenaar
As forensic investigators, we truly love log files. During the investigation of a system with a Microsoft Windows operating system, Windows Event Log files (.evtx) can be very useful. System events (such as logons) are logged, but applications also tend to save event records in such files. A lot of research has been done since .evtx files were introduced in Windows, and several tools have been developed to extract and analyze event records.
During our forensic analysis in the field, we have discovered anomalies in certain Windows Event Log files. Research and analysis of these specific anomalies have led to findings that may be of forensic importance for investigators doing similar work. This blog post sheds light on the anomalies and the way in which they are introduced. It then gives recommendations on how to ensure that the conclusions you draw from forensic analysis of records from Windows Event Log files are actually correct and solid.
Depending on the scenario of your forensic investigation or incident response case, Windows Event Log files are either extracted from disk images, or may be extracted from a live system. Some first responders or system administrators may be tempted to use the tooling of the Microsoft Windows operating system to export and save event log files from a live system. When they do (using either wevtutil or Windows Event Viewer itself), anomalies are introduced. Depending on which forensic investigation tool you use, the anomalies may lead to incorrect analysis findings and conclusions. To understand what is going on, let’s take a look at the Windows Event Log file format.
A Windows Event Log file starts with a file header, followed by several chunks. The file header contains information about the number of chunks, the IDs of the first and last chunk, etc. A file chunk starts with a chunk header, followed by event records. The chunk header contains information about the number of records, the IDs of the first and last record in this chunk, etc. An event record consists of a record header followed by the record data. The record header contains information about the record size, a record ID, a timestamp, etc. The record data is formatted as a binary XML structure in which the actual event record data fields are stored.
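To make the record header concrete, the following minimal Python sketch decodes one. The field layout used here (a 4-byte signature, a 32-bit record size, a 64-bit record ID and a 64-bit FILETIME timestamp, all little-endian) is an assumption based on publicly available format documentation, and the function names are ours and purely illustrative.

```python
import struct
from datetime import datetime, timedelta, timezone

# Assumed record header layout: signature, size, record ID, FILETIME.
RECORD_SIGNATURE = b"\x2a\x2a\x00\x00"
RECORD_HEADER = struct.Struct("<4sIQQ")
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime):
    """Convert a FILETIME (100 ns ticks since 1601-01-01 UTC) to a datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def parse_record_header(buf, offset=0):
    """Decode the header fields of the event record starting at `offset`."""
    signature, size, record_id, written = RECORD_HEADER.unpack_from(buf, offset)
    if signature != RECORD_SIGNATURE:
        raise ValueError("no event record signature at offset %d" % offset)
    return {
        "size": size,
        "record_id": record_id,
        "written": filetime_to_datetime(written),
    }
```

Note that a header timestamp of 0 decodes to 1601-01-01, which is a convenient marker for the invalid timestamps discussed further on.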
A common configuration of Windows Event logging sets a limit on the file size of a log file. After the first initialization of an event log file, the log file will continue to expand as events are added. When the maximum file size is reached, the oldest information will be overwritten with the most recent log records (a roll-over). To do so, the file chunk containing the oldest information is identified. This file chunk in the file will then be used to store the new log records. This means that the oldest information of a Windows Event Log file may actually reside in the middle of the file.
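The effect of a roll-over on the physical order of records can be illustrated with a small simulation. This is a deliberately simplified sketch: it assumes fixed-capacity chunks that are reused strictly in round-robin order, and the function names and parameters are hypothetical, not part of any Windows API.

```python
def write_records(num_chunks, records_per_chunk, total_records):
    """Simulate a size-limited log; return the record IDs held by each chunk slot."""
    chunks = [[] for _ in range(num_chunks)]
    slot = -1
    for record_id in range(1, total_records + 1):
        if slot == -1 or len(chunks[slot]) == records_per_chunk:
            # Current chunk is full: move on and reuse the slot that
            # holds the oldest records, discarding its old contents.
            slot = (slot + 1) % num_chunks
            chunks[slot] = []
        chunks[slot].append(record_id)
    return chunks


def oldest_chunk_index(chunks):
    """Return the position in the file of the chunk holding the oldest record."""
    used = [i for i, chunk in enumerate(chunks) if chunk]
    return min(used, key=lambda i: chunks[i][0])


layout = write_records(num_chunks=4, records_per_chunk=2, total_records=11)
print(layout)                      # record IDs per chunk, in file order
print(oldest_chunk_index(layout))  # position of the oldest chunk
```

With four chunks of two records each and eleven records written, the oldest surviving records end up in the third chunk of the file, while the first two chunks hold the newest ones.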
All event log records contain a (sequential) record ID and a timestamp that describes when the record was created. The record header of each record also contains a record ID and a timestamp. When new records are written to the event log file, the same record ID and timestamp are written to both the header and the record content.
Export: sorting and renumbering
When wevtutil or Windows Event Viewer is used to export the contents of an event log file to an .evtx file, a new file is created. Within this new file, a file header is created, and new chunks are added to it. The oldest event record from the source is identified and written as the first record in the exported .evtx file. After the export is complete, the new event log file will be completely sequential in terms of record IDs and will end with the most recent record. Nothing wrong with a bit of sorting, right?
While exporting, all actual record data is left intact. However, the record headers are rewritten. During the export, the oldest record will be numbered as record ID 1 in the record header. This means that for an event log that had rolled over before an export, record IDs in the record headers will be changed. In these cases, since the actual record content is left intact, a mismatch between record ID of the header and the record ID inside the record is introduced. That has not proven to be a real problem though.
While rewriting headers, the wevtutil tool (which is also what Event Viewer uses to export files) rewrites the timestamps of the record headers as well. For some reason (bug or feature), the timestamp of record 2 is written to the header of record 1. In a similar way, the timestamp of record 3 is written to the header of record 2. At the end of the file, because there is no following record to take a timestamp from, the timestamp inside the header of the last record is set to 0. This means that the header timestamps of all event records except the last one are post-dated. Depending on the frequency of the generated events, the post-dating of header timestamps may range from microseconds to days.
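The renumbering and the timestamp shift described above can be mimicked in a few lines of Python. This is only a model of the observed behaviour, not a reimplementation of wevtutil; the function names are hypothetical, and a zero header timestamp is represented by the date a zero FILETIME decodes to (1601-01-01).

```python
from datetime import datetime, timedelta

# What a header timestamp of 0 decodes to when read as a FILETIME.
FILETIME_ZERO = datetime(1601, 1, 1)


def simulate_export(records):
    """Model the header rewrite for (record_id, timestamp) pairs, oldest first."""
    exported = []
    for index, (content_id, content_ts) in enumerate(records):
        is_last = index == len(records) - 1
        exported.append({
            "header_id": index + 1,  # headers are renumbered from 1
            # Each header receives the timestamp of the NEXT record;
            # the last header has no successor and is set to zero.
            "header_ts": FILETIME_ZERO if is_last else records[index + 1][1],
            "content_id": content_id,  # record content is left intact
            "content_ts": content_ts,
        })
    return exported


def post_dating(entry):
    """Return how far the header timestamp lies after the content timestamp."""
    return entry["header_ts"] - entry["content_ts"]


start = datetime(2019, 5, 31, 12, 0, 0)
source = [
    (101, start),
    (102, start + timedelta(seconds=5)),
    (103, start + timedelta(days=2)),
]
for entry in simulate_export(source):
    print(entry["header_id"], entry["content_id"], entry["header_ts"], entry["content_ts"])
```

In this example the header of the first record is post-dated by five seconds and that of the second by almost two days, simply because of the gap until the next event.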
The post-dating in record header timestamps may not be a problem though. After all, the actual record data is left unchanged. Bad news: some tools use timestamps from record headers, instead of those from the actual record data.
The following picture shows the output from a small test script (which uses the python-evtx library from William Ballenthin) for a file exported using wevtutil, and for the same file extracted from the disk image. The output shows the timestamp and record ID from both the header and the content of records. The data from the three oldest and the three most recent records are shown.
Note that in the data that was exported using wevtutil, there is a mismatch in timestamp data between record headers and record content. Also note the invalid timestamp in the record header of the most recent record.
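A comparison of this kind could look like the sketch below. It is not the original test script. The `check_file` helper assumes that python-evtx is installed and exposes `Evtx.Evtx.Evtx` with `records()`, `record_num()`, `timestamp()` and `xml()`; please verify this against the version you use. The other function names are ours.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}


def parse_system_time(text):
    """Parse a SystemTime attribute, tolerating 'T', 'Z' and extra digits."""
    text = text.strip().rstrip("Z").replace("T", " ")
    if "." in text:
        head, fraction = text.split(".", 1)
        text = head + "." + fraction[:6].ljust(6, "0")  # microsecond precision
    return datetime.fromisoformat(text)


def compare_record(header_id, header_ts, xml_text):
    """Compare record header values with those stored inside the record XML."""
    system = ET.fromstring(xml_text).find("e:System", NS)
    content_id = int(system.find("e:EventRecordID", NS).text)
    content_ts = parse_system_time(system.find("e:TimeCreated", NS).get("SystemTime"))
    header_ts = header_ts.replace(tzinfo=None)  # compare as naive UTC values
    return {
        "header_id": header_id, "content_id": content_id,
        "header_ts": header_ts, "content_ts": content_ts,
        "id_mismatch": header_id != content_id,
        "ts_mismatch": header_ts != content_ts,
    }


def check_file(path):
    """Yield a comparison for every record in an .evtx file (needs python-evtx)."""
    import Evtx.Evtx as evtx  # assumed API, imported lazily
    with evtx.Evtx(path) as log:
        for record in log.records():
            yield compare_record(record.record_num(), record.timestamp(), record.xml())
```

Any record for which `ts_mismatch` is true deserves a closer look before its timestamp ends up in a timeline.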
Check the source of the event log files you examine and make sure that you understand which timestamps are shown by the tooling of your choice. Interpretation of the timestamp from event log record headers may lead to a difference ranging from microseconds up to days. In the end, the misinterpretation of evidence may put an innocent man in jail, or a criminal out on the street.
Comparison of popular tools
The following table lists a couple of popular tools for the analysis of Windows Event Log files, and shows which timestamp (from record content, from record header or both) each tool displays.
When using python-evtx (v0.6.1), Plaso (plaso-20190531) or libevtx (libevtx-20181227) for the analysis of event log records, make sure that you check the timestamps from both the record header and the record content. Microsoft and the developers of these tools have been notified.
Andreas Schuster – Introducing the Microsoft Vista Log File Format: https://www.dfrws.org/sites/default/files/session-files/paper-introducing_the_microsoft_vista_log_file_format.pdf
Brandon Charter – EVTX and Windows Event Logging: https://www.sans.org/reading-room/whitepapers/logging/evtx-windows-event-logging-32949
Joachim Metz – Windows Event Viewer Log (EVT) format: https://github.com/libyal/libevt/blob/master/documentation/Windows%20Event%20Log%20(EVT)%20format.asciidoc
Microsoft – Event Log File Format: https://docs.microsoft.com/en-us/windows/desktop/eventlog/event-log-file-format