omerbenamram / evtx
A Fast (and safe) parser for the Windows XML Event Log (EVTX) format
License: Apache License 2.0
Would you consider implementing a continuous log monitoring option, e.g. "-d --run-as-service"?
The idea is to monitor a single evtx log for changes and write them to STDOUT or to an XML/JSON file, so that new records can be streamed to another host for processing.
As it works now, evtx_dump exits as soon as it finishes processing the evtx log file.
Awesome work by the way! Thank you!
Here is an example of missing data. (See Data tags).
H_Application.evtx.evtx_dump.xml
Record 3308
<?xml version="1.0" encoding="utf-8"?>
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="ESENT">
</Provider>
<EventID Qualifiers="0">916</EventID>
<Level>4</Level>
<Task>1</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2018-08-09 07:21:00.046087 UTC">
</TimeCreated>
<EventRecordID>3308</EventRecordID>
<Channel>Application</Channel>
<Computer>DESKTOP-1N4R894</Computer>
<Security>
</Security>
</System>
<EventData>
<Data></Data>
<Binary></Binary>
</EventData>
</Event>
Record 3309
<?xml version="1.0" encoding="utf-8"?>
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="ESENT">
</Provider>
<EventID Qualifiers="0">916</EventID>
<Level>4</Level>
<Task>1</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2018-08-09 08:22:00.061763 UTC">
</TimeCreated>
<EventRecordID>3309</EventRecordID>
<Channel>Application</Channel>
<Computer>DESKTOP-1N4R894</Computer>
<Security>
</Security>
</System>
<EventData>
<Data></Data>
<Binary></Binary>
</EventData>
</Event>
Compared to H_Application.evtx.evtxecmd.xml
<?xml version="1.0" encoding="utf-16"?>
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="ESENT" />
<EventID Qualifiers="0">916</EventID>
<Level>4</Level>
<Task>1</Task>
<Keywords>EventlogClassic</Keywords>
<TimeCreated SystemTime="2018-08-09 07:21:00.0460872" />
<EventRecordID>3308</EventRecordID>
<Channel>Application</Channel>
<Computer>DESKTOP-1N4R894</Computer>
<Security />
</System>
<EventData>
<Data>svchost, 2672,G,98, EseDiskFlushConsistency, ESENT, 0x800000</Data>
<Binary></Binary>
</EventData>
</Event>
<?xml version="1.0" encoding="utf-16"?>
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="ESENT" />
<EventID Qualifiers="0">916</EventID>
<Level>4</Level>
<Task>1</Task>
<Keywords>EventlogClassic</Keywords>
<TimeCreated SystemTime="2018-08-09 08:22:00.0617638" />
<EventRecordID>3309</EventRecordID>
<Channel>Application</Channel>
<Computer>DESKTOP-1N4R894</Computer>
<Security />
</System>
<EventData>
<Data>svchost, 2672,G,98, EseDiskFlushConsistency, ESENT, 0x800000</Data>
<Binary></Binary>
</EventData>
</Event>
You can find the RAW evtxs here:
https://www.dropbox.com/s/0vejq9lsjq1cskq/DEFCON_2018_DESKTOP_KAPE_EVTX_SET.zip?dl=0
You can find the output reports here:
https://www.dropbox.com/s/emx7lbkmq6xrwuc/DEFCON_2018_DESKTOP_EVTX_COMPARISON.zip?dl=0
In this example the file that contains the data depicted here can be found in the DEFCON_2018_DESKTOP_KAPE_EVTX_SET.zip set [\H\Windows\system32\winevt\logs\Application.evtx]
How hard would it be to implement JSON output from an XML string? This would help with testing, where you want to validate the output JSON structure against specific XML events. It could also be useful for something that retrieves the XML string through the Windows API and wants to convert it to JSON while maintaining a 1-to-1 structure with this library.
I get the error :
Failed to dump the next record.
Caused by:
0: Failed to parse record number 341
1: An error occurred while trying to serialize binary xml to output.
2: Building a JSON document failed with message: This is a bug - expected current value to exist, and to be an object type.
Check that the value is not Value::null
Unfortunately, I have no additional info to provide from the output, and it seems to fail on all records.
Looking for a way to quickly output the date range of a particular evtx ie:
Oldest log: 2/2/20
Newest log: 3/15/20
Something like what this cmdlet does:
Get-WinEvent -Path 'C:\workspace\Security.evtx' -MaxEvents 1 -oldest | Select-Object -Property TimeCreated
Any ideas?
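One possible approach, sketched under assumptions: scan every record once and keep the smallest and largest timestamp, since the first record in a file is not necessarily the oldest. The sketch below is stdlib-only and takes timestamps as plain u64 values (for example FILETIME ticks); `TimeRange` and `time_range` are hypothetical names, not part of the evtx crate.

```rust
/// Oldest and newest timestamp seen in a single pass.
/// Assumption: timestamps are comparable integers, e.g. FILETIME ticks.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct TimeRange {
    pub oldest: u64,
    pub newest: u64,
}

/// Returns None for an empty input, otherwise the min/max of all timestamps.
pub fn time_range<I>(timestamps: I) -> Option<TimeRange>
where
    I: IntoIterator<Item = u64>,
{
    let mut iter = timestamps.into_iter();
    let first = iter.next()?;
    let mut range = TimeRange { oldest: first, newest: first };
    for ts in iter {
        range.oldest = range.oldest.min(ts);
        range.newest = range.newest.max(ts);
    }
    Some(range)
}
```

With a parser that yields one timestamp per record, the iterator of timestamps can be fed straight into `time_range` without collecting the records first.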
I am wanting to create a custom output. Something very similar to evtx::json_output::JsonOutput, but I want to be able to tweak how the json is generated just a little bit to get around some hurdles of ingesting the json into Elastic. Is it possible in 0.4.1 to create a custom output structure that implements BinXmlOutput? My issue is that some of the required modules are not public (for example evtx::model::xml::XmlElement).
Thanks in advance for the help.
Some fields in the JSON output have spaces in their names, e.g., .Event.EventData.Error Description. This complicates forwarding the resulting JSON into various other tools (logstash, in my case).
I've worked out a jq command to replace the spaces in field names with underscores, but it would be convenient to have a command-line flag to do it, producing, for example, .Event.EventData.Error_Description instead.
FWIW, for anybody else who stumbles across this and wants to know how I'm doing it with jq:
evtx \
--threads 1 \
--format jsonl \
--separate-json-attributes \
"${FNAME}" 2>/dev/null | jq -c 'def walk(f):
. as $in
| if type == "object" then
reduce keys[] as $key
( {}; . + {($key | gsub(" "; "_")): ($in[$key] | walk(f))} )
elif type == "array" then
map( walk(f) )
else
f
end;
walk(.)' > "${FNAME_JSON}"
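For comparison, the same recursive walk can be sketched in Rust. This is only an illustration of the technique, not how the library builds its JSON: the `Json` enum below is a hypothetical stand-in for a real JSON value type, and `rename_keys` is a made-up name.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a JSON value (hypothetical, for illustration only).
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Null,
    Bool(bool),
    Number(f64),
    String(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

/// Recursively replaces spaces in object keys with underscores.
/// Values other than objects and arrays are returned unchanged.
pub fn rename_keys(value: Json) -> Json {
    match value {
        Json::Object(map) => Json::Object(
            map.into_iter()
                .map(|(key, inner)| (key.replace(' ', "_"), rename_keys(inner)))
                .collect(),
        ),
        Json::Array(items) => Json::Array(items.into_iter().map(rename_keys).collect()),
        other => other,
    }
}
```

Swapping `key.replace(' ', "_")` for another key transformation would cover similar requests without changing the traversal.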
In a recent discussion, it became clear to me that there's a desire for evtx tooling that supports an offline database of templates. Here's some relevant background on the topic:
The forensikblog.de post describes exactly my goal: to process the resource directory of PE files and collect evtx templates for subsequent use. For example, to put the templates in a sqlite database, carve evtx records from unallocated space, and render the records using the templates from the database. Going forward, I expect to use this evtx library over python-evtx, for many reasons :-).
However, getting this to work may take some changes to this evtx library. I'll describe what I find in this thread. I hope that we can work together to support all these use cases!
Incidentally, I've been chatting with @forensicmatt, who's also interested in working with evtx templates, so he may chime in too.
wevt_template is my work-in-progress project for extracting evtx templates from PE files.
Here is services.exe (renamed to gif extension) that I'll reference below.
In the attached services.exe, at offset 0xA3020 with length 0x4b7e, is the embedded instrumentation manifest that includes evtx templates:
00000000: 43 52 49 4d 7c 4b 00 00 05 00 01 00 03 00 00 00 CRIM|K..........
00000010: 5b 71 63 00 da ee 07 40 94 29 ad 52 6f 62 69 6e [qc....@.).Robin
00000020: 4c 00 00 00 97 4c 18 06 01 52 0e 48 92 af 3a 36 L....L...R.H..:6
00000030: 26 c5 b1 40 f4 08 00 00 d1 08 59 55 d7 a6 95 46 &..@......YU...F
00000040: 8e 1e 26 93 1d 20 12 f4 48 0c 00 00 57 45 56 54 ..&.. ..H...WEVT
00000050: a8 08 00 00 01 00 00 90 08 00 00 00 05 00 00 00 ................
00000060: 9c 00 00 00 07 00 00 00 08 01 00 00 0d 00 00 00 ................
00000070: d8 04 00 00 02 00 00 00 24 05 00 00 00 00 00 00 ........$.......
00000080: a4 05 00 00 01 00 00 00 e4 05 00 00 03 00 00 00 ................
00000090: f4 06 00 00 04 00 00 00 68 07 00 00 43 48 41 4e ........h...CHAN
000000a0: 6c 00 00 00 01 00 00 00 00 00 00 00 b8 00 00 00 l...............
000000b0: 10 00 00 00 ff ff ff ff 50 00 00 00 4d 00 69 00 ........P...M.i.
000000c0: 63 00 72 00 6f 00 73 00 6f 00 66 00 74 00 2d 00 c.r.o.s.o.f.t.-.
000000d0: 57 00 69 00 6e 00 64 00 6f 00 77 00 73 00 2d 00 W.i.n.d.o.w.s.-.
000000e0: 53 00 65 00 72 00 76 00 69 00 63 00 65 00 73 00 S.e.r.v.i.c.e.s.
000000f0: 2f 00 44 00 69 00 61 00 67 00 6e 00 6f 00 73 00 /.D.i.a.g.n.o.s.
00000100: 74 00 69 00 63 00 00 00 54 54 42 4c d0 03 00 00 t.i.c...TTBL....
00000110: 02 00 00 00 54 45 4d 50 c0 00 00 00 01 00 00 00 ....TEMP........
00000120: 01 00 00 00 a8 01 00 00 01 00 00 00 fe be 19 ab ................
00000130: f0 23 65 5f 2f fd 44 4c 0b e7 4f 99 0f 01 01 00 .#e_/.DL..O.....
00000140: 01 ff ff 5e 00 00 00 44 82 09 00 45 00 76 00 65 ...^...D...E.v.e
00000150: 00 6e 00 74 00 44 00 61 00 74 00 61 00 00 00 02 .n.t.D.a.t.a....
...
Notably, this services.exe from Win10 2020H1 uses the CRIM version 5.1 (in contrast to the libexe description for version 3.1). We'll see why this matters in a moment.
At 0xA306C is the start of an event provider structure (WEVT) for Microsoft-Windows-Services/Diagnostic:
00000000 57 45 56 54 a8 08 00 00 01 00 00 90 08 00 00 00 |WEVT¨...........|
00000010 05 00 00 00 9c 00 00 00 07 00 00 00 08 01 00 00 |................|
00000020 0d 00 00 00 d8 04 00 00 02 00 00 00 24 05 00 00 |....Ø.......$...|
00000030 00 00 00 00 a4 05 00 00 01 00 00 00 e4 05 00 00 |....¤.......ä...|
00000040 03 00 00 00 f4 06 00 00 04 00 00 00 68 07 00 00 |....ô.......h...|
...
At 0xA3128 is the template table (TTBL) and finally at 0xA315C is a binary XML template structure. Ideally, we'd be able to parse the data using this evtx library. I'm currently using the following to parse the data:
// Tokenize the raw template bytes in `buf`.
let de = evtx::binxml::deserializer::BinXmlDeserializer::init(
    &buf,
    0x0,  // start at the beginning of `buf`
    None, // no chunk context
    false,
    encoding::all::WINDOWS_1252,
);
let mut iterator = de.iter_tokens(None)?;
// Log every token until the iterator is exhausted.
while let Some(t) = iterator.next() {
    debug!("token: {:#x?}", t);
}
Anyways, here is the binary template:
00000000 0f 01 01 00 01 ff ff 5e 00 00 00 44 82 09 00 45 |.....ÿÿ^...D...E|
00000010 00 76 00 65 00 6e 00 74 00 44 00 61 00 74 00 61 |.v.e.n.t.D.a.t.a|
00000020 00 00 00 02 41 ff ff 3d 00 00 00 8a 6f 04 00 44 |....Aÿÿ=....o..D|
00000030 00 61 00 74 00 61 00 00 00 25 00 00 00 06 4b 95 |.a.t.a...%....K.|
00000040 04 00 4e 00 61 00 6d 00 65 00 00 00 05 01 09 00 |..N.a.m.e.......|
00000050 47 00 72 00 6f 00 75 00 70 00 4e 00 61 00 6d 00 |G.r.o.u.p.N.a.m.|
00000060 65 00 02 0d 00 00 01 04 04 00 00 00 00 00 00 00 |e...............|
...
Unfortunately, this doesn't parse well with the code from this library. Let me explain what I see:
00000000 0f 01 01 00 BinXmlFragmentHeader{version 1.1, flags: 0x0}
01 OpenStartElement
ff ff dependency identifier
5e 00 00 00 data size=0x5E
44 82 <<< hash???
09 00 number of characters in following wstring
45 wstring="EventData"
00000010 00 76 00 65 00 6e 00 74 00 44 00 61 00 74 00 61
00000020 00 00 00 end(wstring="EventData")
02 CloseStartElement
41 OpenStartElement with Attributes
ff ff 3d ...
My guess is that in (at least) format version 5.1 (or 4+???), strings are stored inline rather than as references. I think the structure for tag 01 is maybe:
struct OpenStartElementNoAttributes {
    tag: u8, // == 0x01
    dependency_identifier: Option<u16>, // 0xFFFF -> None
    data_size: u32,
    name_hash: u16, // unknown algorithm
    name_character_count: u16,
    name: OsString<utf16>, // name_character_count + trailing NULL character
}
This inline string strategy seems to be used in other parts of the template, too.
I think these strings share a structure with the BinXmlName described by libevtx:
Offset  Size  Description
0       4     Unknown
4       2     Name hash (which hash algorithm?)
6       2     Number of characters
8       ...   UTF-16 little-endian string with an end-of-string character
So, I wonder if it's reasonable to extend read_open_start_element to support this variant of the format. And if so, how to manage the set of features that each variant may support (evtx-file-mode vs WEVT_MODE vs ....).
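To make the guess concrete, here is a stdlib-only sketch that decodes the tag 01 layout proposed above (tag, dependency identifier, data size, name hash, character count, inline UTF-16LE name with a trailing NUL). It is not the library's read_open_start_element; the struct and function names are hypothetical and the layout is only the hypothesis from this thread.

```rust
/// Decoded form of the guessed inline-name variant (tag 0x01 only).
#[derive(Debug, PartialEq)]
pub struct InlineOpenStartElement {
    pub dependency_identifier: Option<u16>, // 0xFFFF is read as "none"
    pub data_size: u32,
    pub name_hash: u16, // algorithm unknown, kept as-is
    pub name: String,
    pub consumed: usize, // bytes used, including the tag and the trailing NUL
}

fn read_u16(buf: &[u8], at: usize) -> Option<u16> {
    let b = buf.get(at..at + 2)?;
    Some(u16::from_le_bytes([b[0], b[1]]))
}

fn read_u32(buf: &[u8], at: usize) -> Option<u32> {
    let b = buf.get(at..at + 4)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Parses one element starting at the tag byte; returns None on any mismatch
/// or truncated input instead of panicking.
pub fn parse_inline_open_start_element(buf: &[u8]) -> Option<InlineOpenStartElement> {
    if *buf.get(0)? != 0x01 {
        return None;
    }
    let dependency = read_u16(buf, 1)?;
    let data_size = read_u32(buf, 3)?;
    let name_hash = read_u16(buf, 7)?;
    let char_count = read_u16(buf, 9)? as usize;

    let name_start = 11;
    let name_end = name_start + char_count * 2;
    let units: Vec<u16> = buf
        .get(name_start..name_end)?
        .chunks_exact(2)
        .map(|c| u16::from_le_bytes([c[0], c[1]]))
        .collect();
    // The name must be followed by a UTF-16 NUL terminator.
    if read_u16(buf, name_end)? != 0 {
        return None;
    }
    Some(InlineOpenStartElement {
        dependency_identifier: if dependency == 0xFFFF { None } else { Some(dependency) },
        data_size,
        name_hash,
        name: String::from_utf16(&units).ok()?,
        consumed: name_end + 2,
    })
}
```

Fed the template bytes shown earlier, starting right after the four-byte fragment header, this layout yields the name "EventData" and leaves the cursor on the 02 (CloseStartElement) byte, which is at least consistent with the hypothesis.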
In a subsequent discussion, assuming we can parse out these templates, we can chat about how to apply them to data carved from unallocated space. I haven't gotten this far yet :-)
Is it possible to tail evtx files, perhaps using a custom ReadSeek?
tl;dr: make type RecordId public so it can be used.
I'd like to store the RecordId instance for later processing. While I know the RecordId is a u64, it would be nice if I could simply refer to RecordId.
As in this contrived example:
use ::evtx::EvtxParser;
use ::evtx::RecordId;
use std::path::PathBuf;

fn main() {
    let fp = PathBuf::new();
    let mut parser = EvtxParser::from_path(fp).unwrap();
    let mut ids: Vec<RecordId> = vec![];
    for record in parser.records() {
        // Keep the ids of successfully parsed records; skip failures.
        if let Ok(r) = record {
            ids.push(r.event_record_id);
        }
    }
}
Currently, that code does not compile
error[E0432]: unresolved import `evtx::RecordId`
--> src\main.rs:2:5
|
2 | use ::evtx::RecordId;
| ^^^^^^^^^^^^^^^^ no `RecordId` in the root
Using evtx version 0.8.1.
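Until the alias is exported, a local alias gives the same readability. A minimal sketch, assuming RecordId remains a plain u64 as described above; `collect_ids` is a hypothetical helper that stands in for the loop over parser results.

```rust
/// Local stand-in for the crate-private alias (assumption: it is a u64).
pub type RecordId = u64;

/// Collects ids from a sequence of parse results, skipping failed records.
pub fn collect_ids<E>(results: impl IntoIterator<Item = Result<RecordId, E>>) -> Vec<RecordId> {
    results.into_iter().filter_map(Result::ok).collect()
}
```

The drawback is that the local alias silently diverges if the crate ever changes the underlying type, which is the argument for exporting it.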
Hi, thanks for the library. I have a question about installing the library on Windows, as it fails with the error below. I am using the latest Python 3.9.
Collecting evtx
Using cached evtx-0.6.8.tar.gz (2.2 kB)
ERROR: Command errored out with exit status 1:
command: 'c:\users\user\appdata\local\programs\python\python39\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\user\AppData\Local\Temp\pip-install-yxibz94l\evtx_04d644cd67554da6b2028e9d4b820743\setup.py'"'"'; file='"'"'C:\Users\user\AppData\Local\Temp\pip-install-yxibz94l\evtx_04d644cd67554da6b2028e9d4b820743\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' egg_info --egg-base 'C:\Users\user\AppData\Local\Temp\pip-pip-egg-info-cylfeyzm'
cwd: C:\Users\user\AppData\Local\Temp\pip-install-yxibz94l\evtx_04d644cd67554da6b2028e9d4b820743
Complete output (5 lines):
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\user\AppData\Local\Temp\pip-install-yxibz94l\evtx_04d644cd67554da6b2028e9d4b820743\setup.py", line 34, in <module>
RustExtension(
TypeError: __init__() got an unexpected keyword argument 'target'
----------------------------------------
WARNING: Discarding https://files.pythonhosted.org/packages/33/18/b32715bae61c4fe6a7cdb79aafccb0d4797a1bfef028e9689197af214966/evtx-0.6.8.tar.gz#sha256=414507b79fe997a35fbf05ae57dd2f55a7acfc669b19d9125a894ffe40dbeade (from https://pypi.org/simple/evtx/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
Hello!
We stumbled upon the error thread 'main' panicked at 'invalid or out-of-range date' while using the evtx library.
We are wondering whether this is the expected behavior and, if not, whether there is a workaround.
It seems that when the evtx library processes a "faulty" event, it fails with the aforementioned panic.
Used command:
./evtx_dump-v0.7.2-x86_64-unknown-linux-gnu <filename>.evtx -f <filename>.json --no-confirm-overwrite -ojson --no-indent
Error:
thread '<unnamed>' panicked at 'invalid or out-of-range date', /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.19/src/naive/date.rs:173:51
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread '<unnamed>' panicked at 'invalid or out-of-range date', /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.19/src/naive/date.rs:173:51
thread '<unnamed>' panicked at 'invalid or out-of-range date', /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.19/src/naive/date.rs:173:51
We looked inside our evtx file with Windows Event Viewer. We found that the evtx command failed on events containing the following data:
<EventData>
<Data Name="IdentificationGUID">{00280040-0022-0049-6400-65006e007400}</Data>
<Data Name="ProtectorGUID">{00660069-0069-0063-6100-740069006f00}</Data>
<Data Name="ProtectorType">0x47006e</Data>
<Data Name="UnlockTime">1601-01-01T00:00:00.0000000Z</Data>
</EventData>
specifically on the "UnlockTime" field (see the attached image).
The associated schema looks fine, though:
Template : <template xmlns="http://schemas.microsoft.com/win/2004/08/events">
<data name="IdentificationGUID" inType="win:GUID" outType="xs:GUID"/>
<data name="ProtectorGUID" inType="win:GUID" outType="xs:GUID"/>
<data name="ProtectorType" inType="win:HexInt32" outType="win:HexInt32"/>
<data name="UnlockTime" inType="win:SYSTEMTIME" outType="xs:dateTime"/>
</template>
We found topics similar to this case:
Therefore, we suppose that the raw evtx file contains an "UnlockTime" value of 0.
Windows Event Viewer supports and displays the value "1601-01-01T00:00:00.0000000Z", while the evtx library does not.
Looking at the code, we found that the library uses the chrono function from_ymd, which can panic with this error.
As a result, if any event has an invalid "UnlockTime" value, the whole evtx file cannot be processed.
If this is the expected behavior, would it be possible to add an option that lets the user process the whole file while skipping faulty events, as a workaround?
If not, could switching from from_ymd to from_ymd_opt fix it? Affected events would then have an empty "UnlockTime" value.
In any case, thank you for your work !
Regards.
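To illustrate the difference between a panicking constructor and an `_opt`-style one, here is a stdlib-only sketch of a checked date constructor. It does not use chrono and is not the library's code; `checked_ymd` and `YearMonthDay` are hypothetical names, and the assumption is that a zeroed SYSTEMTIME arrives as year 0, month 0, day 0.

```rust
/// Calendar part of a date that has passed validation.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct YearMonthDay {
    pub year: u16,
    pub month: u16,
    pub day: u16,
}

fn is_leap(year: u16) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

/// Number of days in the given month, or None if the month is out of range.
fn days_in_month(year: u16, month: u16) -> Option<u16> {
    match month {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => Some(31),
        4 | 6 | 9 | 11 => Some(30),
        2 => Some(if is_leap(year) { 29 } else { 28 }),
        _ => None,
    }
}

/// Returns None for impossible dates instead of panicking, so the caller
/// can emit an empty value and keep processing the remaining records.
pub fn checked_ymd(year: u16, month: u16, day: u16) -> Option<YearMonthDay> {
    let max_day = days_in_month(year, month)?;
    if day == 0 || day > max_day {
        return None;
    }
    Some(YearMonthDay { year, month, day })
}
```

With this shape, an all-zero date becomes `None` at the call site, and the decision about what to print for it is left to the output code rather than aborting the whole file.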
I was doing some tests between a couple different tools.
With the linked to file down below, this library is missing the following events (tracked via EventRecordID): [14358, 14359, 14360, 14361, 14362, 14363, 14364, 14365, 14366, 14367, 14368, 14369, 14370, 14371, 14372, 14373, 14374, 14375, 14376, 14377, 14378, 14379, 14380, 14381, 14382, 14383, 14384, 14385, 14386, 14387, 14388, 14389, 14390, 14391, 14392, 14393, 14394, 14395, 14396, 14397, 14398, 14399, 14400, 14401, 14402, 14403, 14404, 14405, 14406, 14407, 14408, 14409, 14410, 14411, 14412, 14413, 14414, 14415, 14416, 14417, 14418, 14419, 14420, 14421, 14422, 14423, 14424, 14425, 14426, 14427, 14428, 14429, 14430, 14431, 14432, 14433, 14434, 14435, 14436, 14437, 14438, 14439, 14440, 14441, 14442, 14443, 14444, 14445, 14446, 14447, 14448, 14449, 14450, 14451, 14452, 14453, 14454, 14455, 14456, 14457, 14458, 14459, 14460, 14461, 14462, 14463, 14464, 14465, 14466, 14467, 14468, 14469, 14470, 14471, 14472, 14473, 14474, 14475, 14476, 14477, 14478, 14479, 14480, 14481, 14482, 14483, 14484, 14485, 14486, 14487, 14488, 14489, 14490, 14491, 14492, 14493, 14494, 14495, 14496, 14497, 14498, 14499, 14500, 14501, 14502, 14503, 14504, 14505, 14506, 14507, 14508, 14509, 14510, 14511, 14512, 14513, 14514, 14515, 14516, 14517, 14518, 14519, 14520, 14521, 14522, 14523, 14524, 14525, 14526, 14527, 14528, 14529, 14530, 14531, 14532, 14533, 14534, 14535, 14536, 14537, 14538, 14539, 14540, 14541, 14542, 14543, 14544, 14545, 14546, 14547, 14548, 14549, 14550, 14551, 14552, 14553, 14554, 14555, 14556, 14557, 14558, 14559, 14560, 14561, 14562, 14563, 14564, 14565, 14566, 14567, 14568, 14569, 14570, 14571, 14572, 14573, 14574, 14575, 14576, 14577, 14578, 14579, 14580, 14581, 14582, 14583, 14584, 14585, 14586, 14587, 14588, 14589, 14590, 14591, 14592, 14593, 14594, 14595, 14596, 14597, 14598, 14599, 14600, 14601, 14602, 14603, 14604, 14605, 14606, 14607, 14608, 14609, 14610, 14611, 14612, 14613, 14614, 14615, 14616, 14617, 14618, 14619, 14620, 14621]
It almost looks like there is a block being skipped? Maybe due to a range index?
Test data I was using (link expires 6/1/2019):
https://www.dropbox.com/s/kdy4fxp3ndvq3r9/event_testing.zip?dl=0
The zip has the output from the other tools used for comparison. See below for tool info.
Rust info:
evtx version = "0.1.6"
stable-x86_64-pc-windows-msvc (default)
rustc 1.34.0 (91856ed52 2019-04-10)
Other tools used for comparison:
libevtx - evtxexport.exe [Metz]
https://github.com/libyal/libevtx/releases/tag/20181227
EvtxECmd [Zim]
https://github.com/EricZimmerman/evtx
An event log can be so large that it has more chunks than the header's u16 chunk count can represent. We can instead calculate the chunk count from the size of the evtx stream, the header size, and the chunk size. This allows parsing chunks whose index is greater than u16::MAX. Will submit a PR.
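The calculation can be sketched as follows, assuming the usual EVTX layout of a 4096-byte file header followed by 64 KiB chunks. The constant and function names are illustrative and may not match what the PR ends up using.

```rust
/// Assumed EVTX layout: one 4096-byte file header, then 65536-byte chunks.
pub const FILE_HEADER_SIZE: u64 = 4096;
pub const CHUNK_SIZE: u64 = 65536;

/// Number of complete chunks that fit in a stream of the given length.
/// A trailing partial chunk is not counted; a stream shorter than the
/// header yields zero.
pub fn chunk_count_from_len(stream_len: u64) -> u64 {
    stream_len.saturating_sub(FILE_HEADER_SIZE) / CHUNK_SIZE
}
```

A file with 70,000 chunks is about 4.6 GB, so the result has to be carried in a type wider than u16.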
Hi,
I recently came across your evtx parser and was really impressed by its speed. Thank you for your efforts.
In one of my use cases I would like to import the resulting JSON files into Elasticsearch via Logstash and use Logstash filters to make them "ECS" (Elastic Common Schema) compliant. One of the ECS rules is lowercase field names. Logstash has a nice JSON parser, but it is not the best place to lowercase all potential keys in a JSON structure.
Therefore I would like to ask whether there is a chance of getting another CLI argument along the lines of "lowercase all JSON keys".
While trying to import Sysmon Event Logs provided by SANS in the Workshop "Cobalt Strike Detection with Event Log Analysis" (see https://www.sans.org/webcasts/tech-tuesday-workshop-cobalt-strike-detection-log-analysis-119395/) to Kuiper, I faced the following parsing error:
Failed 1: Invalid EVTX record header magic, expected `2a2a0000`, found `[ 0, 0, 0, 0]` - Line No. 21
I was able to reproduce the issue with another Sysmon Event Log file and found that if the chunk header fields last_event_record_id and free_space_offset point beyond the records actually present in the chunk, the parser fails with the aforementioned error. A sample of the debug output from parsing the Sysmon Event Log file provided by SANS is shown below. Please note that I added the output of the chunk header fields for debugging purposes.
14:38:58 [INFO] first_event_record_number - 188705
14:38:58 [INFO] last_event_record_number - 188775
14:38:58 [INFO] first_event_record_id - 188705
14:38:58 [INFO] last_event_record_id - 188775
14:38:58 [INFO] free_space_offset - 64568
14:38:58 [INFO] Initializing string cache
14:38:58 [INFO] Initializing template cache
14:38:58 [INFO] Record id - 188705
14:38:58 [DEBUG] (1) evtx::evtx_chunk: Record header - EvtxRecordHeader { data_size: 3000, event_record_id: 188705, timestamp: 2018-09-07T04:28:25.337132Z }
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 550 was not found in cache
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 2065 was not found in cache
14:38:58 [INFO] Record id - 188706
14:38:58 [DEBUG] (1) evtx::evtx_chunk: Record header - EvtxRecordHeader { data_size: 776, event_record_id: 188706, timestamp: 2018-09-07T04:29:02.596583Z }
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 550 was not found in cache
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 2065 was not found in cache
14:38:58 [INFO] Record id - 188707
14:38:58 [DEBUG] (1) evtx::evtx_chunk: Record header - EvtxRecordHeader { data_size: 744, event_record_id: 188707, timestamp: 2018-09-07T04:29:40.365998Z }
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 550 was not found in cache
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 2065 was not found in cache
14:38:58 [INFO] Record id - 188708
14:38:58 [DEBUG] (1) evtx::evtx_chunk: Record header - EvtxRecordHeader { data_size: 744, event_record_id: 188708, timestamp: 2018-09-07T04:30:58.380798Z }
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 550 was not found in cache
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 2065 was not found in cache
14:38:58 [INFO] Record id - 188709
14:38:58 [DEBUG] (1) evtx::evtx_chunk: Record header - EvtxRecordHeader { data_size: 744, event_record_id: 188709, timestamp: 2018-09-07T04:30:58.380798Z }
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 550 was not found in cache
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 2065 was not found in cache
14:38:58 [INFO] Record id - 188710
14:38:58 [DEBUG] (1) evtx::evtx_chunk: Record header - EvtxRecordHeader { data_size: 744, event_record_id: 188710, timestamp: 2018-09-07T04:30:58.380798Z }
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 550 was not found in cache
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 2065 was not found in cache
14:38:58 [INFO] Record id - 188711
14:38:58 [DEBUG] (1) evtx::evtx_chunk: Record header - EvtxRecordHeader { data_size: 744, event_record_id: 188711, timestamp: 2018-09-07T04:32:12.405200Z }
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 550 was not found in cache
14:38:58 [DEBUG] (1) evtx::binxml::assemble: Template in offset 2065 was not found in cache
Failed to dump the next record.
Caused by:
0: An error occurred while trying to deserialize evtx stream.
1: Invalid EVTX record header magic, expected `2a2a0000`, found `[ 0, 0, 0, 0]`
Within the source file evtx_chunk.rs, these should be the code lines of interest:
Lines 250 to 258 in 0950198
From my point of view, the parser should not fail completely if chunk header fields are not set correctly. Instead, it should at least continue with the next chunk after an erroneous record could not be parsed.
Nevertheless, thank you for your excellent work and for providing this Event Log parser!
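A sketch of the tolerant behavior suggested above, independent of the library's actual code: walk the records of one chunk by their own magic and size fields and stop quietly at the first slot that does not start with the record magic, rather than trusting the chunk header. The 512-byte chunk header and the 24-byte record header (magic, u32 size, u64 id, u64 timestamp) are assumptions about the format; `record_spans` is a hypothetical name.

```rust
/// Assumed layout: records start after a 512-byte chunk header.
pub const CHUNK_HEADER_SIZE: usize = 512;
/// Record magic as reported in the error message: 2a 2a 00 00.
pub const RECORD_MAGIC: [u8; 4] = [0x2a, 0x2a, 0x00, 0x00];
/// Assumed record header length: magic + size + record id + timestamp.
const RECORD_HEADER_SIZE: usize = 24;

/// Returns the (offset, size) of each record found in the chunk buffer.
/// Stops at the first slot without the magic or with an implausible size.
pub fn record_spans(chunk: &[u8]) -> Vec<(usize, usize)> {
    let mut spans = Vec::new();
    let mut offset = CHUNK_HEADER_SIZE;
    while let Some(header) = chunk.get(offset..offset + 8) {
        if header[..4] != RECORD_MAGIC[..] {
            break; // zeroed or foreign data: treat as end of records
        }
        let size = u32::from_le_bytes([header[4], header[5], header[6], header[7]]) as usize;
        if size < RECORD_HEADER_SIZE || offset + size > chunk.len() {
            break; // size field cannot be trusted
        }
        spans.push((offset, size));
        offset += size;
    }
    spans
}
```

In this shape a chunk whose header overstates its record range simply yields fewer records, and the caller can move on to the next chunk.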
fd -e evtx -x evtx_dump -f "{.}.xml"
will create an xml file next to each evtx file, for all files in folder recursively!
Got:
error: The following required arguments were not provided:
<INPUT>
USAGE:
evtx_dump <INPUT> --ansi-codec <ansi-codec> --threads <num-threads> --format <output-format>
For more information try --help
Thx for this stuff, really handy tool :)
Just noticed there was no darwin version for 0.7.2. Figured I'd report it.
Is it possible to pass a file via standard input? It looks like there's some seeking going on that would prevent this at the moment. I tried:
evtx_dump -o jsonl /dev/stdin
This prints:
Error: Failed to open evtx file at: /dev/stdin
Caused by:
0: An error occurred while trying to deserialize evtx stream.
1: An expected I/O error has occurred
2: Offset `0x00000000 (0)` - An error has occurred while trying to deserialize binary stream
failed to seek in file_header
Original message:
`Illegal seek (os error 29)`
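Since a pipe cannot seek, one workaround is to read all of standard input into memory first and wrap it in a `std::io::Cursor`, which implements both Read and Seek. The sketch below is stdlib-only; `into_seekable` and `read_at` are hypothetical helpers. The resulting buffer could then be handed to a buffer-based constructor such as `EvtxParser::from_buffer`, if your version of the crate provides it.

```rust
use std::io::{self, Cursor, Read, Seek, SeekFrom};

/// Drains a non-seekable reader (e.g. stdin) into an in-memory cursor.
pub fn into_seekable<R: Read>(mut input: R) -> io::Result<Cursor<Vec<u8>>> {
    let mut buf = Vec::new();
    input.read_to_end(&mut buf)?;
    Ok(Cursor::new(buf))
}

/// Reads `len` bytes at an absolute offset, which requires Seek.
pub fn read_at<S: Read + Seek>(src: &mut S, offset: u64, len: usize) -> io::Result<Vec<u8>> {
    src.seek(SeekFrom::Start(offset))?;
    let mut out = vec![0u8; len];
    src.read_exact(&mut out)?;
    Ok(out)
}
```

Calling `into_seekable(std::io::stdin().lock())` gives a seekable source at the cost of holding the whole file in memory, which may matter for very large logs.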
It seems that there is some interference with serde-1.0.123
Environment
$ cargo --version
cargo 1.49.0 (d00d64df9 2020-12-05)
$ uname -v
Darwin Kernel Version 19.6.0: Tue Nov 10 00:10:30 PST 2020; root:xnu-6153.141.10~1/RELEASE_X86_64
Command
cargo install evtx
Error Message
Compiling evtx v0.6.8
error[E0603]: module `export` is private
--> /Users/jasa/.cargo/registry/src/github.com-1ecc6299db9ec823/evtx-0.6.8/src/binxml/name.rs:11:12
|
11 | use serde::export::Formatter;
| ^^^^^^ private module
|
note: the module `export` is defined here
--> /Users/jasa/.cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.123/src/lib.rs:275:5
|
275 | use self::__private as export;
| ^^^^^^^^^^^^^^^^^^^^^^^^^
error[E0603]: module `export` is private
--> /Users/jasa/.cargo/registry/src/github.com-1ecc6299db9ec823/evtx-0.6.8/src/model/deserialized.rs:5:12
|
5 | use serde::export::Formatter;
| ^^^^^^ private module
|
note: the module `export` is defined here
--> /Users/jasa/.cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.123/src/lib.rs:275:5
|
275 | use self::__private as export;
| ^^^^^^^^^^^^^^^^^^^^^^^^^
error: aborting due to 2 previous errors
For more information about this error, try `rustc --explain E0603`.
error: failed to compile `evtx v0.6.8`, intermediate artifacts can be found at `/var/folders/0d/tjq7d2vn3nl19k00k4_gpzyr1q_6wx/T/cargo-installdk40ap`
Caused by:
could not compile `evtx`
To learn more, run the command again with --verbose.
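The errors point at `serde::export`, which, judging by the note about `__private` in the compiler output, was never meant to be a public path. As far as I can tell the `Formatter` re-exported there was the standard library's, so importing it from `std::fmt` should be a drop-in change. A minimal sketch with a hypothetical `Name` type:

```rust
// Import Formatter from the standard library instead of serde::export.
use std::fmt::{self, Display, Formatter};

/// Hypothetical stand-in for a type that implements Display.
pub struct Name(pub String);

impl Display for Name {
    fn fmt(&self, f: &mut Formatter) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}
```

Until a release with such a fix is available, `cargo install --locked evtx` may help if the published crate ships a lockfile that pins an older serde; that is an assumption I have not verified for 0.6.8.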
When using separate JSON attributes, elements that have no value should be left out. Currently, these are emitted as empty maps. For example, in one event you may have an entry that has no value for the RevocationResult element. This currently looks like:
"RevocationInfo": {
"RevocationResult": {},
"RevocationResult_attributes": {
"value": "80092013"
}
}
In another entry, the RevocationResult element has a text value:
"RevocationInfo": {
"RevocationResult": "The revocation function was unable to check revocation because the revocation server was offline.",
"RevocationResult_attributes": {
"value": "80092013"
}
}
While a value shouldn't be represented as something it's not, this also causes errors in actions like indexing because of the type differences.
Will make a PR to fix.
Hi, I am new to rust and wonder if you have any examples for reading windows event logs on a live system. And of course thanks for making this fast library!
from evtx import PyEvtxParser
How can I parse records in descending order with this?
I am seeing this error for some evtx files and am not sure what is causing it. Are there evtx logs that can't be handled by this Rust binary?
Failed to dump the next record.
Caused by:
0: Failed to parse chunk number 0
1: Failed to parse chunk header
2: Failed to deserialize next_template_offset of type u32
3: Offset 0x08180000 (135790592) - An error has occurred while trying to deserialize binary stream
Original message:
`failed to fill whole buffer`
Hexdump:
---------------------------------------------------------------------------
Current Value 00
--
00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000060: 00 00 00 00 ....
----------------------------------------------------------------------------
4: failed to fill whole buffer
Failed to dump the next record.
Caused by:
0: Failed to parse chunk number 7
1: Failed to parse chunk header
2: Invalid EVTX chunk header magic, expected ElfChnk0, found [ 0, 0, 1B, 5, 0, 0, 2, E]
Failed to dump the next record.
Caused by:
0: Failed to parse chunk number 8
1: Failed to parse chunk header
2: Invalid EVTX chunk header magic, expected ElfChnk0, found [8A, 14, B3, D8, 1, F, 1, 1]
Failed to dump the next record.
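Errors like these often show up with truncated or partially overwritten files. A quick way to see which chunk slots still look intact is to check the magic at every 64 KiB boundary after the file header. The sketch below is stdlib-only and assumes a 4096-byte file header, 65536-byte chunks, and the magic "ElfChnk" followed by a NUL byte; `valid_chunk_offsets` is a hypothetical helper, not part of evtx_dump.

```rust
/// Assumed EVTX layout constants.
pub const FILE_HEADER_SIZE: usize = 4096;
pub const CHUNK_SIZE: usize = 65536;
pub const CHUNK_MAGIC: &[u8; 8] = b"ElfChnk\0";

/// Returns the byte offsets of all chunk slots that start with the magic.
/// Slots with other content (zeroed, overwritten, truncated) are skipped.
pub fn valid_chunk_offsets(data: &[u8]) -> Vec<usize> {
    let mut offsets = Vec::new();
    let mut offset = FILE_HEADER_SIZE;
    while let Some(magic) = data.get(offset..offset + CHUNK_MAGIC.len()) {
        if magic == &CHUNK_MAGIC[..] {
            offsets.push(offset);
        }
        offset += CHUNK_SIZE;
    }
    offsets
}
```

Comparing the returned offsets with the chunk numbers in the error output shows whether the failures line up with damaged slots or with something else.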
Came up during a quick check:
[...]
= note: this warning originates in the macro `try_read` (in Nightly builds, run with -Z macro-backtrace for more info)
warning: trailing semicolon in macro used in expression position
--> src/macros.rs:49:69
|
49 | .map_err(|e| capture_context!($cursor, e, "u16", $name));
| ^
|
::: src/utils/time.rs:15:24
|
15 | let milliseconds = try_read!(r, u16)?;
| ----------------- in this macro invocation
|
= warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
= note: for more information, see issue #79813 <https://github.com/rust-lang/rust/issues/79813>
= note: this warning originates in the macro `try_read` (in Nightly builds, run with -Z macro-backtrace for more info)
warning: `evtx` (lib) generated 88 warnings
Finished release [optimized] target(s) in 49.87s
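The warning is about a macro body that ends in a semicolon while the macro is invoked where an expression is expected. A minimal stdlib-only sketch of the pattern and its fix follows; `try_read_u16!` is a simplified, hypothetical stand-in for the crate's `try_read!`, not its real definition.

```rust
use std::io::{Cursor, Read};

// The expansion is a single expression with no trailing semicolon,
// so the macro can be followed by `?` in expression position.
macro_rules! try_read_u16 {
    ($reader:expr, $name:expr) => {
        read_u16_le(&mut $reader).map_err(|e| format!("failed to read {}: {}", $name, e))
    };
}

fn read_u16_le<R: Read>(reader: &mut R) -> std::io::Result<u16> {
    let mut buf = [0u8; 2];
    reader.read_exact(&mut buf)?;
    Ok(u16::from_le_bytes(buf))
}

/// Reads a little-endian u16 from the start of `bytes`.
pub fn read_milliseconds(bytes: &[u8]) -> Result<u16, String> {
    let mut cursor = Cursor::new(bytes);
    let milliseconds = try_read_u16!(cursor, "milliseconds")?;
    Ok(milliseconds)
}
```

In the macro from the warning, the equivalent change would be dropping the `;` after the closing parenthesis of `map_err(...)` at src/macros.rs:49.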
Add directory support as an input
Getting this error about evtx:
Failed to build evtx
Installing collected packages: evtx
Running setup.py install for evtx ... error
ERROR: Command errored out with exit status 1:
command: /home/template/LogonTracer/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-k6d_zuyt/evtx_022aa9be838d483e91b20221f327d5e8/setup.py'"'"'; file='"'"'/tmp/pip-install-k6d_zuyt/evtx_022aa9be838d483e91b20221f327d5e8/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(file) if os.path.exists(file) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /tmp/pip-record-naa187u2/install-record.txt --single-version-externally-managed --compile --install-headers /home/template/LogonTracer/include/site/python3.9/evtx
cwd: /tmp/pip-install-k6d_zuyt/evtx_022aa9be838d483e91b20221f327d5e8/
Complete output (44 lines):
running install
running build
running build_ext
running build_rust
error: manifest path Cargo.toml does not exist
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-install-k6d_zuyt/evtx_022aa9be838d483e91b20221f327d5e8/setup.py", line 21, in <module>
setup(
File "/home/template/LogonTracer/lib/python3.9/site-packages/setuptools/__init__.py", line 153, in setup
return distutils.core.setup(**attrs)
File "/usr/lib/python3.9/distutils/core.py", line 148, in setup
dist.run_commands()
File "/usr/lib/python3.9/distutils/dist.py", line 966, in run_commands
self.run_command(cmd)
File "/usr/lib/python3.9/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/home/template/LogonTracer/lib/python3.9/site-packages/setuptools/command/install.py", line 61, in run
return orig.install.run(self)
File "/usr/lib/python3.9/distutils/command/install.py", line 590, in run
self.run_command('build')
File "/usr/lib/python3.9/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/usr/lib/python3.9/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/usr/lib/python3.9/distutils/command/build.py", line 135, in run
self.run_command(cmd_name)
File "/usr/lib/python3.9/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/usr/lib/python3.9/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/home/template/LogonTracer/lib/python3.9/site-packages/setuptools_rust/setuptools_ext.py", line 103, in run
build_rust.run()
File "/home/template/LogonTracer/lib/python3.9/site-packages/setuptools_rust/command.py", line 52, in run
self.run_for_extension(ext)
File "/home/template/LogonTracer/lib/python3.9/site-packages/setuptools_rust/build.py", line 92, in run_for_extension
dylib_paths = self.build_extension(ext)
File "/home/template/LogonTracer/lib/python3.9/site-packages/setuptools_rust/build.py", line 131, in build_extension
metadata = json.loads(check_output(metadata_command))
File "/usr/lib/python3.9/subprocess.py", line 424, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "/usr/lib/python3.9/subprocess.py", line 528, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['cargo', 'metadata', '--manifest-path', 'Cargo.toml', '--format-version', '1']' returned non-zero exit status 101.
----------------------------------------
ERROR: Command errored out with exit status 1: /home/template/LogonTracer/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-k6d_zuyt/evtx_022aa9be838d483e91b20221f327d5e8/setup.py'"'"'; file='"'"'/tmp/pip-install-k6d_zuyt/evtx_022aa9be838d483e91b20221f327d5e8/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(file) if os.path.exists(file) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /tmp/pip-record-naa187u2/install-record.txt --single-version-externally-managed --compile --install-headers /home/template/LogonTracer/include/site/python3.9/evtx Check the logs for full command output.
Using VM with
Distributor ID: Ubuntu
Description: Ubuntu 21.04
Release: 21.04
Codename: hirsute
❯ python3 -V
Python 3.9.5
❯ python -V
Python 3.9.5
❯ pip -V
pip 21.1.2 from /home/template/LogonTracer/lib/python3.9/site-packages/pip (python 3.9)
❯ pip3 -V
pip 21.1.2 from /home/template/LogonTracer/lib/python3.9/site-packages/pip (python 3.9)
❯ rustc -V
rustc 1.52.1 (9bc8c42bb 2021-05-09)
pip3 install evtx fails, but pip3 install python-evtx works fine.
pip3 install python-evtx
Requirement already satisfied: python-evtx in ./lib/python3.9/site-packages (0.7.4)
Requirement already satisfied: pyparsing==2.4.7 in ./lib/python3.9/site-packages (from python-evtx) (2.4.7)
Requirement already satisfied: hexdump==3.3 in ./lib/python3.9/site-packages (from python-evtx) (3.3)
Requirement already satisfied: configparser==4.0.2 in ./lib/python3.9/site-packages (from python-evtx) (4.0.2)
Requirement already satisfied: more-itertools==5.0.0 in ./lib/python3.9/site-packages (from python-evtx) (5.0.0)
Requirement already satisfied: zipp==1.0.0 in ./lib/python3.9/site-packages (from python-evtx) (1.0.0)
Requirement already satisfied: six in ./lib/python3.9/site-packages (from python-evtx) (1.16.0)
This happens even though evtx_dump is installed:
evtx_dump -h
EVTX Parser 0.7.2
Omer B. [email protected]
Utility to parse EVTX files
USAGE:
evtx_dump [FLAGS] [OPTIONS]
FLAGS:
--no-confirm-overwrite When set, will not ask for confirmation before overwriting files, useful for
pip install just fails.
Is there anything I missed or did wrong?
Thanks
Hi
Great tool! Is there an option to exclude the following lines from the output files?
Record ######
<?xml version="1.0" encoding="utf-8"?>
Thanks
Would you be willing to add a feature that allows the user to iterate over JSON Values from the parser?
My use case is to add data to the Value without having to re-serialize it.
EVTX records have a property, InstanceID, which is related to EventID.
InstanceID is not the same as EventID, although the two can be equal:
The InstanceId property uniquely identifies an event entry for a configured event source. The InstanceId for an event log entry represents the full 32-bit resource identifier for the event in the message resource file for the event source. The EventID property equals the InstanceId with the top two bits masked off. Two event log entries from the same source can have matching EventID values, but have different InstanceId values due to differences in the top two bits of the resource identifier. If the application wrote the event entry using one of the WriteEntry methods, the InstanceId property matches the optional eventId parameter. If the application wrote the event using WriteEvent, the InstanceId property matches the resource identifier specified in the InstanceId of the instance parameter. If the application wrote the event using the Win32 API ReportEvent, the InstanceId property matches the resource identifier specified in the dwEventID parameter.
Taken from here: https://evotec.xyz/powershell-everything-you-wanted-to-know-about-event-logs/
I would very much like to have InstanceID read in. It isn't in the XML data; the XML data only contains EventID.
I don't know enough about evtx structure to offer a patch.
Cross post with pyevtx-rs/issues/9
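The relationship quoted above (EventID equals InstanceId with the top two bits masked off) can be sketched in a few lines of Rust. This is only an illustration of the bit arithmetic: the function names are hypothetical and are not part of the evtx crate, and the sketch says nothing about where InstanceId would be stored in an EVTX record.

```rust
/// Mask that clears the top two bits of a 32-bit InstanceId.
const EVENT_ID_MASK: u32 = 0x3FFF_FFFF;

/// Hypothetical helper (not an evtx API): derive the EventID from a full
/// 32-bit InstanceId by masking off the top two bits, as the quoted
/// documentation describes.
pub fn event_id_from_instance_id(instance_id: u32) -> u32 {
    instance_id & EVENT_ID_MASK
}

/// Hypothetical helper: return the two bits that the mask removes. Two
/// entries with the same EventID can differ only in these bits.
pub fn top_two_bits(instance_id: u32) -> u32 {
    instance_id >> 30
}
```

For example, the InstanceId values 0x00000064 and 0xC0000064 both map to EventID 100, which is why the two properties can match but do not have to.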
Hi ! Thank you for your work :)
I noticed that somehow the <Event> tag is never closed:
$ cargo run -- --input samples/new-user-security.evtx
Record 1
<?xml version="1.0" encoding="utf-8"?>
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Microsoft-Windows-Security-Auditing" Guid="54849625-5478-4994-A5BA-3E3B0328C30D">
[...]
<Data Name="SubjectLogonId">0x3e7</Data>
<Data Name="PrivilegeList">-</Data>
</EventData>
Record 2
<?xml version="1.0" encoding="utf-8"?>
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
[...]
<Data Name="LogonHours">%%1797</Data>
</EventData>
Record 3
<?xml version="1.0" encoding="utf-8"?>
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
[...]
</EventData>
Record 4
<?xml version="1.0" encoding="utf-8"?>
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
[...]
</EventData>
Shouldn't there be a </Event> at the end of each record?
I only took a quick look at the code, but it seems that the last call to visit_close_element (in src/xml_output.rs) returns early because eof_reached is already true.
By the way, would you be interested in a JSON output, or is that outside the scope of the project?
Hello,
I am trying to use evtx as a library, because I need to script additional steps (I want to send the events to a Splunk instance). Maybe I missed something, but multithreading does not seem to be enabled. See the following two pictures showing a test with a 20 MB Application evtx file, first with the binary and then with the library:
Any idea? Maybe I need to add something to the Cargo configuration file.
Note that in library mode the additional threads are created but not used.
Best regards,
ekt0
Can I request jsonl output formatting? JSON is nice, but in its current form it does not ingest easily. Current JSON output example:
Record 1
{
"Event": {
"#attributes": {
"xmlns": "http://schemas.microsoft.com/win/2004/08/events/event"
},
"EventData": {
"Data": "caller=sppsvc.exe"
},
"System": {
"Channel": "Microsoft-Client-Licensing-Platform/Admin",
"Computer": "DESKTOP-1N4R894",
"Correlation": null,
"EventID": 100,
"EventRecordID": 1,
"Execution": {
"#attributes": {
"ProcessID": 1348,
"ThreadID": 1368
}
},
"Keywords": "0x2000000000000001",
"Level": 4,
"Opcode": 0,
"Provider": {
"#attributes": {
"Guid": "B6CC0D55-9ECC-49A8-B929-2B9022426F2A",
"Name": "Microsoft-Client-Licensing-Platform"
}
},
"Security": {
"#attributes": {
"UserID": "S-1-5-18"
}
},
"Task": 0,
"TimeCreated": {
"#attributes": {
"SystemTime": "2018-07-06T18:38:20.815807Z"
}
},
"Version": 0
}
}
}
Record 2
{
"Event": {
"#attributes": {
"xmlns": "http://schemas.microsoft.com/win/2004/08/events/event"
},
"EventData": {
"Data": "10.0.17134.1"
},
"System": {
"Channel": "Microsoft-Client-Licensing-Platform/Admin",
"Computer": "DESKTOP-1N4R894",
"Correlation": null,
"EventID": 101,
"EventRecordID": 2,
"Execution": {
"#attributes": {
"ProcessID": 1348,
"ThreadID": 1372
}
},
"Keywords": "0x2000000000000001",
"Level": 4,
"Opcode": 0,
"Provider": {
"#attributes": {
"Guid": "B6CC0D55-9ECC-49A8-B929-2B9022426F2A",
"Name": "Microsoft-Client-Licensing-Platform"
}
},
"Security": {
"#attributes": {
"UserID": "S-1-5-18"
}
},
"Task": 0,
"TimeCreated": {
"#attributes": {
"SystemTime": "2018-07-06T18:38:20.819393Z"
}
},
"Version": 0
}
}
}
Proposed JSONL output:
JSONL is easier to work with because each line is the record. Example:
{"Event":{"#attributes":{"xmlns":"http://schemas.microsoft.com/win/2004/08/events/event"},"EventData":{"Data":"caller=sppsvc.exe"},"System":{"Channel":"Microsoft-Client-Licensing-Platform/Admin","Computer":"DESKTOP-1N4R894","Correlation":null,"EventID":100,"EventRecordID":1,"Execution":{"#attributes":{"ProcessID":1348,"ThreadID":1368}},"Keywords":"0x2000000000000001","Level":4,"Opcode":0,"Provider":{"#attributes":{"Guid":"B6CC0D55-9ECC-49A8-B929-2B9022426F2A","Name":"Microsoft-Client-Licensing-Platform"}},"Security":{"#attributes":{"UserID":"S-1-5-18"}},"Task":0,"TimeCreated":{"#attributes":{"SystemTime":"2018-07-06T18:38:20.815807Z"}},"Version":0}}}
{"Event":{"#attributes":{"xmlns":"http://schemas.microsoft.com/win/2004/08/events/event"},"EventData":{"Data":"10.0.17134.1"},"System":{"Channel":"Microsoft-Client-Licensing-Platform/Admin","Computer":"DESKTOP-1N4R894","Correlation":null,"EventID":101,"EventRecordID":2,"Execution":{"#attributes":{"ProcessID":1348,"ThreadID":1372}},"Keywords":"0x2000000000000001","Level":4,"Opcode":0,"Provider":{"#attributes":{"Guid":"B6CC0D55-9ECC-49A8-B929-2B9022426F2A","Name":"Microsoft-Client-Licensing-Platform"}},"Security":{"#attributes":{"UserID":"S-1-5-18"}},"Task":0,"TimeCreated":{"#attributes":{"SystemTime":"2018-07-06T18:38:20.819393Z"}},"Version":0}}}
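As a rough illustration of the proposal, the sketch below collapses pretty-printed JSON records into one line each using only the Rust standard library. It is not how evtx_dump produces its output; `compact_json` and `to_jsonl` are hypothetical helper names, and the sketch assumes each input string is already a valid JSON document.

```rust
/// Hypothetical helper: collapse a pretty-printed JSON document onto a single
/// line by dropping JSON whitespace that appears outside of string literals.
pub fn compact_json(pretty: &str) -> String {
    let mut out = String::with_capacity(pretty.len());
    let mut in_string = false;
    let mut escaped = false;
    for ch in pretty.chars() {
        if in_string {
            // Inside a string every character is significant.
            out.push(ch);
            if escaped {
                escaped = false;
            } else if ch == '\\' {
                escaped = true;
            } else if ch == '"' {
                in_string = false;
            }
        } else if ch == '"' {
            in_string = true;
            out.push(ch);
        } else if !matches!(ch, ' ' | '\t' | '\n' | '\r') {
            out.push(ch);
        }
    }
    out
}

/// Hypothetical helper: join already-serialized records into JSONL,
/// one compacted record per line.
pub fn to_jsonl<'a, I: IntoIterator<Item = &'a str>>(records: I) -> String {
    let mut out = String::new();
    for record in records {
        out.push_str(&compact_json(record));
        out.push('\n');
    }
    out
}
```

Because every record ends with exactly one newline, the result can be consumed line by line or filtered with grep-style tools.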
Hi,
We are trying to use your library to parse Windows logs, but we encountered a strange error when parsing EVTX files coming from a Windows Event Collector server.
In the event viewer, the XML is the following:
- <EventData>
<Data>Set-Mailbox</Data>
<Data>-Identity "Administrateur" -DeliverToMailboxAndForward "False" -ForwardingSmtpAddress "smtp:[email protected]"</Data>
<Data>ave.local/Users/Administrateur</Data>
<Data>S-1-5-21-186559946-3925841745-111227986-500</Data>
<Data>S-1-5-21-186559946-3925841745-111227986-500</Data>
<Data>Remote-ManagementShell-Unknown</Data>
<Data>5668 w3wp#MSExchangePowerShellAppPool</Data>
<Data />
<Data>5</Data>
<Data>00:00:26.0389557</Data>
<Data>Afficher la forêt entière : 'False', Portée par défaut : « ave.local », Configuration du contrôleur de domaine : « DC.ave.local », Catalogue global préféré : « DC.ave.local », Contrôleurs de domaine préférés : « { DC.ave.local } »</Data>
<Data />
<Data />
<Data />
<Data />
<Data />
<Data />
<Data>False</Data>
<Data />
<Data>0 objects execution has been proxied to remote server.</Data>
<Data />
<Data />
<Data>0</Data>
<Data>ActivityId: a3591746-a27b-447a-b8be-ff54ae3a46f1</Data>
<Data>ServicePlan:;IsAdmin:True;</Data>
<Data />
<Data>fr-FR</Data>
</EventData>
If we convert the original EVTX, we obtain the following JSON:
"Data": {
"#text": [
"Set-Mailbox",
"-Identity \"Administrateur\" -DeliverToMailboxAndForward \"False\" -ForwardingSmtpAddress \"smtp:[email protected]\"",
"ave.local/Users/Administrateur",
"S-1-5-21-186559946-3925841745-111227986-500",
"S-1-5-21-186559946-3925841745-111227986-500",
"Remote-ManagementShell-Unknown",
"5668 w3wp#MSExchangePowerShellAppPool",
"",
"5",
"00:00:26.0389557",
"Afficher la forêt entière : 'False', Portée par défaut : « ave.local », Configuration du contrôleur de domaine : « DC.ave.local », Catalogue global préféré : « DC.ave.local », Contrôleurs de domaine préférés : « { DC.ave.local } »",
"",
"",
"",
"",
"",
"",
"False",
"",
"0 objects execution has been proxied to remote server.",
"",
"",
"0",
"ActivityId: a3591746-a27b-447a-b8be-ff54ae3a46f1",
"ServicePlan:;IsAdmin:True;",
"",
"fr-FR"
]
}
},
But when the log has been forwarded using WEF and the EVTX is parsed, we obtain the following JSON:
"EventData": {
"Data": {
"#text": "fr-FR"
}
},
As you can see, almost all of the information is lost. If you want to run some tests, the EVTX files are here: [MSExchange_Management.zip](https://github.com/omerbenamram/evtx/files/7571802/MSExchange_Management.zip)
Thank you!
The cause is that samples_dir() assumes that the project dir is the current working dir, which is not the case when a workspace is being used.
Solution:
diff --git a/tests/fixtures.rs b/tests/fixtures.rs
index 5ff166e..4d760ce 100644
--- a/tests/fixtures.rs
+++ b/tests/fixtures.rs
@@ -20,11 +20,7 @@ pub fn ensure_env_logger_initialized() {
}
pub fn samples_dir() -> PathBuf {
- PathBuf::from(file!())
- .parent()
- .unwrap()
- .parent()
- .unwrap()
+ PathBuf::from(env!("CARGO_MANIFEST_DIR"))
.join("samples")
.canonicalize()
.unwrap()
Getting an error on this file.
Here is the link: https://www.dropbox.com/s/1tugvc0gy0icv59/VSS1_Windows_system32_winevt_logs_HardwareEvents.evtx?dl=0
D:\Tools\evtx_dump>evtx_dump.exe D:\Images\CTF_DEFCON_2018\Image3-Desktop\Extracts\EVTX\VSS1_Windows_system32_winevt_logs_HardwareEvents.evtx
thread 'main' panicked at 'Failed to load evtx file located at D:\Images\CTF_DEFCON_2018\Image3-Desktop\Extracts\EVTX\VSS1_Windows_system32_winevt_logs_HardwareEvents.evtx', src\bin\evtx_dump.rs:201:29
stack backtrace:
0: std::sys::windows::backtrace::set_frames
at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\sys\windows\backtrace\mod.rs:94
1: std::sys::windows::backtrace::unwind_backtrace
at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\sys\windows\backtrace\mod.rs:81
2: std::sys_common::backtrace::_print
at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\sys_common\backtrace.rs:70
3: std::sys_common::backtrace::print
at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\sys_common\backtrace.rs:58
4: std::panicking::default_hook::{{closure}}
at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\panicking.rs:200
5: std::panicking::default_hook
at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\panicking.rs:215
6: std::panicking::rust_panic_with_hook
at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\panicking.rs:478
7: std::panicking::continue_panic_fmt
at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\panicking.rs:385
8: std::panicking::begin_panic_fmt
at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\panicking.rs:340
9: evtx_dump::is_a_non_negative_number
10: <evtx::xml_output::XmlOutput<W> as evtx::xml_output::BinXmlOutput<W>>::visit_open_start_element
11: std::rt::lang_start_internal::{{closure}}
at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\rt.rs:49
12: std::panicking::try::do_call<closure,i32>
at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\panicking.rs:297
13: panic_unwind::__rust_maybe_catch_panic
at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libpanic_unwind\lib.rs:87
14: std::panicking::try
at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\panicking.rs:276
15: std::panic::catch_unwind
at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\panic.rs:388
16: std::rt::lang_start_internal
at /rustc/fc50f328b0353b285421b8ff5d4100966387a997\/src\libstd\rt.rs:48
17: main
18: invoke_main
at d:\agent\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:78
19: __scrt_common_main_seh
at d:\agent\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
20: BaseThreadInitThunk
21: RtlUserThreadStart
Those (or similar) messages are created when evtx reads a boolean value (type code 0x0d) with a length of 4 which has a value different from 0x00 or 0x01. According to Microsoft's definition, a BoolType is "An 8-bit integer that MUST be 0x00 or 0x01 (mapping to true or false, respectively)." (https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-even6/8aa98312-f199-4e37-a51f-d3a2ccb50d60)
There seems to be a bug somewhere either in the creator of evtx files or in the parser.
Microsoft defines the following (https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-even6/c73573ae-1c90-43a2-a65f-ad7501155956):
TemplateInstanceData = ValueSpec *Value; Emit using TemplateInstanceDataRule
ValueSpec = NumValues *ValueSpecEntry
ValueSpecEntry = ValueByteLength ValueType %x00
ValueByteLength = WORD
ValueType =
NullType / StringType / AnsiStringType / Int8Type / UInt8Type /
Int16Type / UInt16Type / Int32Type / UInt32Type / Int64Type /
Int64Type / Real32Type / Real64Type / BoolType / BinaryType /
GuidType / SizeTType / FileTimeType / SysTimeType / SidType /
HexInt32Type / HexInt64Type / BinXmlType / StringArrayType /
AnsiStringArrayType / Int8ArrayType / UInt8ArrayType /
Int16ArrayType / UInt16ArrayType / Int32ArrayType / UInt32ArrayType/
Int64ArrayType / UInt64ArrayType / Real32ArrayType /
Real64ArrayType / BoolArrayType / GuidArrayType / SizeTArrayType /
FileTimeArrayType / SysTimeArrayType / SidArrayType /
HexInt32ArrayType / HexInt64ArrayType
BoolType = %x0D
Value =
StringValue / AnsiStringValue / Int8Value / UInt8Value /
Int16Value / UInt16Value / Int32Value / UInt32Value / Int64Value /
UInt64Value / Real32Value / Real64Value / BoolValue / BinaryValue /
GuidValue / SizeTValue / FileTimeValue / SysTimeValue / SidValue /
HexInt32Value / HexInt64Value / BinXmlValue / StringArrayValue /
AnsiStringArrayValue / Int8ArrayValue / UInt8ArrayValue /
Int16ArrayValue / UInt16ArrayValue / Int32ArrayValue /
UInt32ArrayValue / Int64ArrayValue / UInt64ArrayValue /
Real32ArrayValue / Real64ArrayValue / BoolArrayValue /
GuidArrayValue / SizeTArrayValue / FileTimeArrayValue /
SysTimeArrayValue / SidArrayValue / HexInt32ArrayValue /
HexInt64ArrayValue
So, a boolean should look like the following:
0x00000001 0x01 0x0d 0x00 0x00
|          |    |    |    |
|          |    |    |    +-> Value
|          |    |    +------> %x00
|          |    +-----------> ValueType
|          +----------------> ValueByteLength
+---------------------------> NumValues
But obviously, there are (sometimes) BoolTypes with a ValueByteLength of 4, which violate the specification.
You've added special handling for boolean values which do not match 0x00 or 0x01. Do you know why there are such values?
I'm not sure if this is really a bug in your code, but reading 4 bytes for a boolean value also violates the specification, and I was interested in what the reason for this is.
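To make the discrepancy concrete, here is a minimal sketch of a tolerant decoder that accepts both the 1-byte encoding from the specification and the 4-byte encoding observed in some files. It is not the evtx crate's implementation: `DecodedBool` and `decode_bool_lenient` are hypothetical names, and treating any non-zero value as true and reading 4-byte values as little-endian are assumptions made for this illustration.

```rust
/// Hypothetical result type: the decoded value plus a flag that records
/// whether the raw bytes matched the specification (one byte, 0x00 or 0x01).
#[derive(Debug, PartialEq, Eq)]
pub struct DecodedBool {
    pub value: bool,
    pub spec_conformant: bool,
}

/// Hypothetical helper: decode a BoolType value from its raw bytes.
/// Lengths other than 1 or 4 are rejected with `None`.
pub fn decode_bool_lenient(raw: &[u8]) -> Option<DecodedBool> {
    let as_u32 = match raw.len() {
        1 => u32::from(raw[0]),
        // Assumption: 4-byte booleans are little-endian, like other EVTX integers.
        4 => u32::from_le_bytes([raw[0], raw[1], raw[2], raw[3]]),
        _ => return None,
    };
    Some(DecodedBool {
        // Assumption: any non-zero value counts as true.
        value: as_u32 != 0,
        spec_conformant: raw.len() == 1 && as_u32 <= 1,
    })
}
```

Keeping the conformance flag separate from the value lets a caller log a warning for out-of-spec records without dropping them.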
I get the following error in version 0.6.0 but not in 0.5.1.
thread 'main' panicked at 'It can only be an object or null, and null was covered', src\libcore\option.rs:1188:5
stack backtrace:
0: backtrace::backtrace::trace_unsynchronized
at C:\Users\...\.cargo\registry\src\github.com-1ecc6299db9ec823\backtrace-0.3.40\src\backtrace\mod.rs:66
1: std::sys_common::backtrace::_print_fmt
at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\sys_common\backtrace.rs:84
2: std::sys_common::backtrace::_print::{{impl}}::fmt
at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\sys_common\backtrace.rs:61
3: core::fmt::write
at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libcore\fmt\mod.rs:1024
4: std::io::Write::write_fmt<std::sys::windows::stdio::Stderr>
at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\io\mod.rs:1428
5: std::sys_common::backtrace::_print
at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\sys_common\backtrace.rs:65
6: std::sys_common::backtrace::print
at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\sys_common\backtrace.rs:50
7: std::panicking::default_hook::{{closure}}
at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\panicking.rs:193
8: std::panicking::default_hook
at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\panicking.rs:210
9: std::panicking::rust_panic_with_hook
at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\panicking.rs:471
10: std::panicking::begin_panic_handler
at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\panicking.rs:375
11: core::panicking::panic_fmt
at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libcore\panicking.rs:82
12: core::option::expect_failed
at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libcore\option.rs:1188
13: evtx::model::raw::BinXMLRawToken::from_u8
14: <evtx::json_output::JsonOutput as evtx::xml_output::BinXmlOutput>::visit_open_start_element
15: evtx::binxml::assemble::parse_tokens
16: evtx::evtx_record::EvtxRecord::into_json_value
17: <alloc::vec::Vec<T> as alloc::vec::SpecExtend<T,I>>::from_iter
18: core::ops::function::impls::<impl core::ops::function::FnMut<A> for &F>::call_mut
19: rayon::iter::plumbing::Folder::consume_iter
20: rayon::iter::plumbing::bridge_producer_consumer::helper
21: <rayon::vec::IntoIter<T> as rayon::iter::IndexedParallelIterator>::with_producer
22: rayon::iter::collect::special_extend
23: rayon::iter::collect::<impl rayon::iter::ParallelExtend<T> for alloc::vec::Vec<T>>::par_extend
24: evtxtools::evtxhandler::EvtxHandler<T>::get_attribute_mapping
25: alloc::alloc::box_free
26: alloc::alloc::box_free
27: crossbeam_epoch::deferred::Deferred::new::call
28: std::rt::lang_start_internal::{{closure}}
at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\rt.rs:52
29: std::panicking::try::do_call<closure-0,i32>
at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\panicking.rs:292
30: panic_unwind::__rust_maybe_catch_panic
at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libpanic_unwind\lib.rs:78
31: std::panicking::try
at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\panicking.rs:270
32: std::panic::catch_unwind
at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\panic.rs:394
33: std::rt::lang_start_internal
at /rustc/7afe6d9d1f48b998cc88fe6f01ba0082788ba4b9\/src\libstd\rt.rs:51
34: main
35: invoke_main
at d:\agent\_work\3\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:78
36: __scrt_common_main_seh
at d:\agent\_work\3\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
37: BaseThreadInitThunk
38: RtlUserThreadStart
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Here is the logfile that caused the issue.
E_ShadowCopy6_windows_system32_winevt_logs_Microsoft-Windows-CAPI2%4Operational.zip
Dear Omer,
Awesome work on this library, it is really blazing fast.
I hope you can help me with the following question about the JSON serializer. I would like to alter the JSON data that is outputted by the parser and I am looking for the best way to do it.
By default it outputs something like this:
{
"Event": {
"EventData": {
"Binary": null,
...
"Event_attributes": {
"xmlns": "http://schemas.microsoft.com/win/2004/08/events/event"
}
}
Which I would like to append a few properties to, e.g.:
{
"Event": {
"EventData": {
"Binary": null,
...
"Event_attributes": {
"xmlns": "http://schemas.microsoft.com/win/2004/08/events/event"
},
"fields": {
"host": "WIN-TEST",
"source": "Setup.evtx",
"time": 1623066248.0
}
}
This should happen somewhere around this snippet of code, which returns a record containing the data object, which is already a string (from the into_json function):
EvtxOutputFormat::JSON => {
for record in parser.records_json() {
self.dump_record(record)?;
}
}
The following solutions were the ones I could think of:
1. [...] fields part.
2. [...] record.data string to object with serde_json, alter it, and convert it to string again.
3. [...] records_json function
4. insert even better solution here
I'm asking for your advice on this because I wasn't able to figure out how to do it properly in Rust. Performance is also important for me, so I want to find a very efficient solution.
For solution (3) I already tried to implement something but that doesn't work. Maybe you can provide some guidance or you might even have a much better solution in mind.
// Stable shim until https://github.com/rust-lang/rust/issues/59359 is merged.
// Taken from proposed std code.
pub trait ReadSeek: Read + Seek {
fn tell(&mut self) -> io::Result<u64> {
self.seek(SeekFrom::Current(0))
}
fn stream_len(&mut self) -> io::Result<u64> {
let old_pos = self.tell()?;
let len = self.seek(SeekFrom::End(0))?;
// Avoid seeking a third time when we were already at the end of the
// stream. The branch is usually way cheaper than a seek operation.
if old_pos != len {
self.seek(SeekFrom::Start(old_pos))?;
}
Ok(len)
}
}
impl<T: Read + Seek> ReadSeek for T {}
pub struct JsonSerialize<'a, T: ReadSeek> {
settings: ParserSettings,
parser: &'a mut EvtxParser<T>,
}
impl<T: ReadSeek> JsonSerialize<'_, T> {
/// Return an iterator over all the records.
/// Records will be JSON-formatted.
pub fn records_json(
&mut self,
) -> impl Iterator<Item = Result<SerializedEvtxRecord<String>, EvtxError>> + '_ {
EvtxParser::serialized_records(self.parser, |record| record.and_then(|record| self.into_json(record)))
}
/// Consumes the record and parse it, producing a JSON serialized record.
fn into_json(self, record: EvtxRecord) -> Result<SerializedEvtxRecord<String>, EvtxError> {
let indent = self.settings.should_indent();
let mut record_with_json_value = EvtxRecord::into_json_value(record)?;
let data = if indent {
serde_json::to_string_pretty(&record_with_json_value.data)
.map_err(SerializationError::from)?
} else {
serde_json::to_string(&record_with_json_value.data).map_err(SerializationError::from)?
};
Ok(SerializedEvtxRecord {
event_record_id: record_with_json_value.event_record_id,
timestamp: record_with_json_value.timestamp,
data,
})
}
}
Great tool! Can you please create a l2t output option?
Here is the spec:
Just a small feature request. Could you exclude the null chars from the output? They break a lot of processing of the output.
Also, there seems to be a formatting issue with integer rendering when the hex value is 1 char: there is a space between 0x and the integer. I know these are small things, but fixing them would help a lot when trying to serialize for ingestion or post-processing.
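Both requests can be illustrated with the Rust standard library alone. The sketch below is a post-processing illustration, not a description of how evtx renders values; `strip_nul_chars` and `format_hex` are hypothetical helper names.

```rust
/// Hypothetical helper: remove embedded NUL characters from a rendered value.
pub fn strip_nul_chars(text: &str) -> String {
    text.chars().filter(|&c| c != '\0').collect()
}

/// Hypothetical helper: render an integer as lowercase hex with a `0x`
/// prefix. The `#` flag adds the prefix, and because no width is given the
/// digits are never padded, so single-digit values get no extra space.
pub fn format_hex(value: u64) -> String {
    format!("{:#x}", value)
}
```

With these helpers, a value such as 0x3e7 keeps its usual form and a single-digit value is rendered as 0x1 rather than with a gap after the prefix.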
Proposal: support JSONL (https://jsonlines.org/) as an output format.
JSONL files are JSON dicts/types separated by a newline.
{"event": "foo"}
{"event": "baa"}
...
This makes it extremely easy to use the output with every language that supports iterating through lines
and parsing JSON (and it would also be greppable).
Are there any plans to extend this tool to support output to evtx files?
And feed it with a former XML file?
With that both directions would be possible which I think would be extremely great for certain scenarios.
There is a format definition out there as well:
https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-even6/7cdd0c95-2181-4794-a094-55c78b389358?redirectedfrom=MSDN
This happens when I try to parse some carved evtx file
Thanks for the effort to build this great tool. We're throwing forwarded log files at it and really appreciate the performance boost!
There's one minor step which requires preprocessing for our use case, as we are loading data into Google's BigQuery.
I unfortunately don't have a build environment set up for Rust at the moment, but it seems the responsible code is here, impacting both #attributes and #text:
Line 239 in 0950198
Line 321 in 0950198
Is there a reason I'm missing for using a special character in these two field names? It's a rather minor issue and we can run sed, but changing it would save some steps.
Thanks in advance!
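Until the field names change, the preprocessing step could also be a small key-renaming function. The sketch below assumes the downstream system only accepts letters, digits, and underscores in field names and does not allow a leading digit; that rule and the helper name `sanitize_field_name` are assumptions for illustration, not part of evtx.

```rust
/// Hypothetical helper: map a JSON key such as `#attributes` to a name made
/// only of ASCII letters, digits, and underscores.
pub fn sanitize_field_name(name: &str) -> String {
    // Replace every disallowed character with an underscore.
    let mut out: String = name
        .chars()
        .map(|c| if c.is_ascii_alphanumeric() || c == '_' { c } else { '_' })
        .collect();
    // Assumption: names must not be empty or start with a digit.
    if out.chars().next().map_or(true, |c| c.is_ascii_digit()) {
        out.insert(0, '_');
    }
    out
}
```

Applied to the two keys mentioned above, this yields _attributes and _text, while ordinary keys such as EventID pass through unchanged.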
Hi,
I'm trying to use this library with pyspark, since it is super fast and easy to use.
Basically, I am trying to load some evtx files and convert them into JSON files for further processing.
It works great without pyspark; however, when I run my parse function through pyspark I get the following error:
AttributeError: type object 'PyEvtxParser' has no attribute '__iter__'
I think it might have to do with pickle serialization and pyspark, but I don't know for sure. Maybe it will work if you make the parser iterable.
Many Thanks in Advance!
---------------------------------------------------------------------------
Py4JJavaError Traceback (most recent call last)
<ipython-input-5-d168f16142e1> in <module>
1 json_str = evtx_files.map(lambda bdata: parseEvents(bdata[1]))
----> 2 json_str.top(1)
~/workspace/lib/python3.8/site-packages/pyspark/rdd.py in top(self, num, key)
1369 return heapq.nlargest(num, a + b, key=key)
1370
-> 1371 return self.mapPartitions(topIterator).reduce(merge)
1372
1373 def takeOrdered(self, num, key=None):
~/workspace/lib/python3.8/site-packages/pyspark/rdd.py in reduce(self, f)
928 yield reduce(f, iterator, initial)
929
--> 930 vals = self.mapPartitions(func).collect()
931 if vals:
932 return reduce(f, vals)
~/workspace/lib/python3.8/site-packages/pyspark/rdd.py in collect(self)
887 """
888 with SCCallSiteSync(self.context) as css:
--> 889 sock_info = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
890 return list(_load_from_socket(sock_info, self._jrdd_deserializer))
891
~/workspace/lib/python3.8/site-packages/py4j/java_gateway.py in __call__(self, *args)
1302
1303 answer = self.gateway_client.send_command(command)
-> 1304 return_value = get_return_value(
1305 answer, self.gateway_client, self.target_id, self.name)
1306
~/workspace/lib/python3.8/site-packages/pyspark/sql/utils.py in deco(*a, **kw)
126 def deco(*a, **kw):
127 try:
--> 128 return f(*a, **kw)
129 except py4j.protocol.Py4JJavaError as e:
130 converted = convert_exception(e.java_exception)
~/workspace/lib/python3.8/site-packages/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
324 value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
325 if answer[1] == REFERENCE_TYPE:
--> 326 raise Py4JJavaError(
327 "An error occurred while calling {0}{1}{2}.\n".
328 format(target_id, ".", name), value)
Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, 192.168.1.123, executor driver): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
File "/home/workspace/lib/python3.8/site-packages/pyspark/python/lib/pyspark.zip/pyspark/worker.py", line 587, in main
func, profiler, deserializer, serializer = read_command(pickleSer, infile)
File "/home/workspace/lib/python3.8/site-packages/pyspark/python/lib/pyspark.zip/pyspark/worker.py", line 74, in read_command
command = serializer._read_with_length(file)
File "/home/workspace/lib/python3.8/site-packages/pyspark/python/lib/pyspark.zip/pyspark/serializers.py", line 172, in _read_with_length
return self.loads(obj)
File "/home/workspace/lib/python3.8/site-packages/pyspark/python/lib/pyspark.zip/pyspark/serializers.py", line 458, in loads
return pickle.loads(obj, encoding=encoding)
AttributeError: type object 'PyEvtxParser' has no attribute '__iter__'
at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:503)
at org.apache.spark.api.python.PythonRunner$$anon$3.read(PythonRunner.scala:638)
at org.apache.spark.api.python.PythonRunner$$anon$3.read(PythonRunner.scala:621)
at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:456)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at scala.collection.Iterator.foreach(Iterator.scala:941)
at scala.collection.Iterator.foreach$(Iterator.scala:941)
at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
at scala.collection.generic.Growable.$plus$plus$eq(Growable.scala:62)
at scala.collection.generic.Growable.$plus$plus$eq$(Growable.scala:53)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:105)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:49)
at scala.collection.TraversableOnce.to(TraversableOnce.scala:315)
at scala.collection.TraversableOnce.to$(TraversableOnce.scala:313)
at org.apache.spark.InterruptibleIterator.to(InterruptibleIterator.scala:28)
at scala.collection.TraversableOnce.toBuffer(TraversableOnce.scala:307)
at scala.collection.TraversableOnce.toBuffer$(TraversableOnce.scala:307)
at org.apache.spark.InterruptibleIterator.toBuffer(InterruptibleIterator.scala:28)
at scala.collection.TraversableOnce.toArray(TraversableOnce.scala:294)
at scala.collection.TraversableOnce.toArray$(TraversableOnce.scala:288)
at org.apache.spark.InterruptibleIterator.toArray(InterruptibleIterator.scala:28)
at org.apache.spark.rdd.RDD.$anonfun$collect$2(RDD.scala:1004)
at org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2139)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:127)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2059)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2008)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2007)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2007)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:973)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:973)
at scala.Option.foreach(Option.scala:407)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:973)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2239)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2188)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2177)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:775)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2099)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2120)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2139)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2164)
at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:1004)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:388)
at org.apache.spark.rdd.RDD.collect(RDD.scala:1003)
at org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:168)
at org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.apache.spark.api.python.PythonException: Traceback (most recent call last):
File "/home/workspace/lib/python3.8/site-packages/pyspark/python/lib/pyspark.zip/pyspark/worker.py", line 587, in main
func, profiler, deserializer, serializer = read_command(pickleSer, infile)
File "/home/workspace/lib/python3.8/site-packages/pyspark/python/lib/pyspark.zip/pyspark/worker.py", line 74, in read_command
command = serializer._read_with_length(file)
File "/home/workspace/lib/python3.8/site-packages/pyspark/python/lib/pyspark.zip/pyspark/serializers.py", line 172, in _read_with_length
return self.loads(obj)
File "/home/workspace/lib/python3.8/site-packages/pyspark/python/lib/pyspark.zip/pyspark/serializers.py", line 458, in loads
return pickle.loads(obj, encoding=encoding)
AttributeError: type object 'PyEvtxParser' has no attribute '__iter__'
at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:503)
at org.apache.spark.api.python.PythonRunner$$anon$3.read(PythonRunner.scala:638)
at org.apache.spark.api.python.PythonRunner$$anon$3.read(PythonRunner.scala:621)
at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:456)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at scala.collection.Iterator.foreach(Iterator.scala:941)
at scala.collection.Iterator.foreach$(Iterator.scala:941)
at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
at scala.collection.generic.Growable.$plus$plus$eq(Growable.scala:62)
at scala.collection.generic.Growable.$plus$plus$eq$(Growable.scala:53)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:105)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:49)
at scala.collection.TraversableOnce.to(TraversableOnce.scala:315)
at scala.collection.TraversableOnce.to$(TraversableOnce.scala:313)
at org.apache.spark.InterruptibleIterator.to(InterruptibleIterator.scala:28)
at scala.collection.TraversableOnce.toBuffer(TraversableOnce.scala:307)
at scala.collection.TraversableOnce.toBuffer$(TraversableOnce.scala:307)
at org.apache.spark.InterruptibleIterator.toBuffer(InterruptibleIterator.scala:28)
at scala.collection.TraversableOnce.toArray(TraversableOnce.scala:294)
at scala.collection.TraversableOnce.toArray$(TraversableOnce.scala:288)
at org.apache.spark.InterruptibleIterator.toArray(InterruptibleIterator.scala:28)
at org.apache.spark.rdd.RDD.$anonfun$collect$2(RDD.scala:1004)
at org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2139)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:127)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
Hi, I think there is a bug in event parsing regarding ordering.
The records() iterator returns records in this order:
chunk0: record10,record9,record8,record7,record6,record5,record4,record3,record2,record1
chunk1: record20,record19,record18,record17,record16,record15,record14,record13,record12,record11
and so on ...
Basically, the records within each chunk come out in descending order, so they are not in their original order when pulled from the iterator. This may break uses of your lib where the original order needs to be preserved.
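As a caller-side workaround, the order described above can be restored by sorting on the record ID. The following is only a minimal sketch: `Record` and `in_record_order` are hypothetical names made up for illustration, not types or functions from the crate, and the sketch assumes each record carries a numeric record ID.

```rust
/// Hypothetical stand-in for a parsed record (NOT the crate's own type).
/// Assumption: every record exposes a numeric, increasing record ID.
#[derive(Debug, Clone, PartialEq)]
pub struct Record {
    pub event_record_id: u64,
    pub data: String,
}

/// Collects records from any iterator and returns them sorted by
/// ascending record ID, regardless of the order they were yielded in.
pub fn in_record_order<I>(records: I) -> Vec<Record>
where
    I: IntoIterator<Item = Record>,
{
    let mut sorted: Vec<Record> = records.into_iter().collect();
    // sort_by_key is a stable sort, so records with equal IDs keep their relative order.
    sorted.sort_by_key(|r| r.event_record_id);
    sorted
}
```

Note that this buffers every record in memory before returning. If the chunks themselves arrive in order, as in the listing above, sorting (or reversing) one chunk at a time would be enough and would keep memory use bounded.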
The JSON output contains "#attributes" which alters the true nature of the log and makes querying data a challenge.
A simple command line flag that skips the "#attributes" wrapper and prints attributes as plain child keys of their parent element would make life easier for anybody who has to load and query the output of this project.
JSON formed by parsing EVTX using rust_evtx:
{
"Event": {
"#attributes": {
"xmlns": "http://schemas.microsoft.com/win/2004/08/events/event"
}
.
.
}
}
Desired JSON:
{
"Event": {
"xmlns": "http://schemas.microsoft.com/win/2004/08/events/event"
.
.
}
}
Thank you for considering my sincere request.
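Until such a flag exists, the requested shape can be produced by post-processing the output. The sketch below is an assumption-laden illustration, not the project's implementation: `Json` is a tiny std-only stand-in for a real JSON value type, and `hoist_attributes` is a hypothetical helper that moves every key found under "#attributes" up into the enclosing object.

```rust
use std::collections::BTreeMap;

/// Minimal std-only stand-in for a JSON value (strings and objects only).
/// A real tool would use a full JSON library instead.
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Str(String),
    Object(BTreeMap<String, Json>),
}

/// Recursively moves the keys of every "#attributes" object into its parent.
/// Assumed collision policy: if an attribute and a child element share a
/// name, the child element's value is kept.
pub fn hoist_attributes(value: Json) -> Json {
    match value {
        Json::Object(map) => {
            let mut out = BTreeMap::new();
            for (key, child) in map {
                match hoist_attributes(child) {
                    // Attribute wrapper: merge its entries into the parent.
                    Json::Object(attrs) if key == "#attributes" => {
                        for (k, v) in attrs {
                            out.entry(k).or_insert(v);
                        }
                    }
                    // Ordinary child: keep it under its own key.
                    other => {
                        out.insert(key, other);
                    }
                }
            }
            Json::Object(out)
        }
        other => other,
    }
}
```

Flattening loses the distinction between XML attributes and child elements, which is presumably why the output keeps them apart by default, so a flag like this would have to be opt-in.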
Hi,
I tried to parse these files: https://github.com/fox-it/danderspritz-evtx/tree/master/examples, and pre-Security.evtx was parsed successfully while post-Security.evtx wasn't. I got the following error:
Caused by:
0: An error occurred while trying to deserialize evtx stream.
1: Unknown EVTX record header flags value: 462880768
I don't know what these flags are, but they can potentially appear in some other evtx-files. Maybe the problem can be solved by replacing from_bits with from_bits_truncate in evtx_file_header.rs and evtx_chunk.rs. For example:
let raw_flags = try_read!(stream, u32, "file_header_flags")?;
let flags = match HeaderFlags::from_bits(raw_flags) {
Some(val) => val,
None => return Err(DeserializationError::UnknownEvtxHeaderFlagValue { value: raw_flags }),
};
should become
let raw_flags = try_read!(stream, u32, "file_header_flags")?;
let flags = HeaderFlags::from_bits_truncate(raw_flags);
And
let raw_flags = try_read!(input, u32)?;
let flags = match ChunkFlags::from_bits(raw_flags) {
Some(val) => val,
None => {
return Err(DeserializationError::UnknownEvtxHeaderFlagValue { value: raw_flags })
}
};
should become
let raw_flags = try_read!(input, u32)?;
let flags = ChunkFlags::from_bits_truncate(raw_flags);
or something like that.
Thank you!
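To make the difference between the two calls concrete, here is a std-only sketch of the strict and lenient semantics that `from_bits` and `from_bits_truncate` have in the bitflags crate. `DemoFlags` and its bit values are assumptions made up for this illustration; they are not the crate's `HeaderFlags` or `ChunkFlags` definitions.

```rust
/// Hypothetical flag set; the known bits below are illustrative assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DemoFlags(u32);

impl DemoFlags {
    pub const DIRTY: u32 = 0x1;
    pub const FULL: u32 = 0x2;
    /// Mask of every bit this type knows about.
    const KNOWN: u32 = Self::DIRTY | Self::FULL;

    /// Strict conversion: returns None if any unknown bit is set.
    pub fn from_bits(raw: u32) -> Option<Self> {
        if raw & !Self::KNOWN == 0 {
            Some(Self(raw))
        } else {
            None
        }
    }

    /// Lenient conversion: silently clears unknown bits and keeps the rest.
    pub fn from_bits_truncate(raw: u32) -> Self {
        Self(raw & Self::KNOWN)
    }

    /// Returns the raw bits that were retained.
    pub fn bits(self) -> u32 {
        self.0
    }
}
```

With these definitions, the value from the error message (462880768, i.e. 0x1B970000) is rejected by the strict conversion and accepted by the lenient one with all unknown bits cleared. The trade-off is that truncation hides unexpected values, so keeping the raw value around for logging may be worth considering.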