Comments (7)
It didn't used to do that. Hm...
from tcpflow.
Yes, it seems that the newly committed code doesn't unzip.
from tcpflow.
I didn't see unzipping in the code I stripped out. The entire HTTP scanner as of scan_http.cpp @ 2852a1 is:
    if(sp.sbuf.memcmp(reinterpret_cast<const uint8_t *>("HTTP/1.1 "),0,9)==0){
        /* Looks like a HTTP response. Split it at the \r\n\r\n into two sbufs and save each */
        ssize_t body_start = sp.sbuf.find("\r\n\r\n",0);
        if(body_start==-1) return; // no body to be found
        tcpdemux *d = tcpdemux::getInstance();
        std::stringstream xml_head;
        d->write_to_file(xml_head,sp.sbuf.pos0.path+"-HTTP",sbuf_t(sp.sbuf,0,body_start-2));
        sbuf_t sbuf_body(sp.sbuf,body_start,sp.sbuf.bufsize - body_start);
        std::stringstream xml_body;
        d->write_to_file(xml_body,sp.sbuf.pos0.path+"-HTTPBODY",sbuf_body);
        /* Need to do something with the XML */
        /* Need to handle the gzip */
    }
Where did that logic live before?
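For readers following along, the header/body split that the scanner above performs can be sketched standalone. This is only an illustration using std::string rather than tcpflow's sbuf_t; the function name split_http_response is hypothetical, and the choice to skip the four separator bytes so that the body starts at the payload is an assumption of this sketch, not something the quoted code does.

```cpp
#include <string>

// Hypothetical helper, not part of tcpflow: split a raw HTTP response at the
// first blank line. On success, `headers` holds everything before the
// "\r\n\r\n" separator and `body` holds everything after it.
bool split_http_response(const std::string &msg,
                         std::string &headers,
                         std::string &body) {
    const std::string sep = "\r\n\r\n";
    const std::size_t pos = msg.find(sep);
    if (pos == std::string::npos) {
        return false;  // no body to be found
    }
    headers = msg.substr(0, pos);
    body = msg.substr(pos + sep.size());  // skip the separator itself
    return true;
}
```

With this shape, a response with no blank line is reported as having no body, mirroring the early return in the scanner.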
from tcpflow.
Oh, it's in the #if 0 later in the file. I see.
Edit: it looks like process_gzip() is #if 0'd in tcpdemux.cpp as of the current master.
from tcpflow.
It was there before I did the upgrade to the bulk_extractor plug-in system. I probably forgot to re-enable it. The code works, though. We should put it back in. Do you want to handle that?
Overall, do you like the plug-in system?
from tcpflow.
I'll take a stab at handling gzip/deflate. I think I'll do it a little differently -- write an HTTPBODY file in all cases, but if it's a compressed Content-Encoding that we have the ability to decompress, then decompress while writing to the output file instead of writing two output files. That seems most appropriate to me; the output should correspond to HTTP payloads, regardless of how they're compressed or transmitted over the wire.
Additionally, we can at least write out to .html.gz or similar if it's gzip and decompression is disabled/unavailable.
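A rough sketch of the decision described above, for illustration only: look at Content-Encoding, then decide whether to inflate while writing and what to call the output file. Everything here is hypothetical (the BodyPlan struct, header_value, plan_body, and the "-HTTPBODY.gz" suffix are assumptions, not tcpflow code), and the actual decompression step is left out.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical result type, not part of tcpflow.
struct BodyPlan {
    bool decompress;     // inflate the body while writing it out
    std::string suffix;  // appended to the flow's base path
};

// Lowercase an ASCII string; header names and codings are case-insensitive.
static std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Return the trimmed, lowercased value of header `name` from a raw header
// block (status line followed by CRLF-separated headers), or "" if absent.
std::string header_value(const std::string &headers, const std::string &name) {
    const std::string lower = to_lower(headers);
    const std::string key = "\r\n" + to_lower(name) + ":";
    std::size_t pos = lower.find(key);
    if (pos == std::string::npos) return "";
    pos += key.size();
    std::size_t end = lower.find("\r\n", pos);
    if (end == std::string::npos) end = lower.size();
    const std::string value = lower.substr(pos, end - pos);
    const std::size_t first = value.find_first_not_of(" \t");
    if (first == std::string::npos) return "";
    const std::size_t last = value.find_last_not_of(" \t");
    return value.substr(first, last - first + 1);
}

// Always produce one HTTPBODY output. Decompress on the fly when the coding
// is one we can handle; otherwise keep the raw bytes and, for gzip, mark the
// file name so the user can decompress it later.
BodyPlan plan_body(const std::string &headers, bool can_inflate) {
    const std::string enc = header_value(headers, "Content-Encoding");
    const bool compressed = (enc == "gzip" || enc == "deflate");
    if (!compressed) return {false, "-HTTPBODY"};
    if (can_inflate) return {true, "-HTTPBODY"};
    return {false, enc == "gzip" ? "-HTTPBODY.gz" : "-HTTPBODY"};
}
```

The caller would pass can_inflate as false when decompression support is compiled out or disabled at runtime, which yields the fallback of writing the compressed bytes under a .gz name.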
from tcpflow.
Sounds great. I'm most appreciative!
On Nov 27, 2012, at 11:34 PM, Will Glynn wrote:
I'll take a stab at handling gzip/deflate. I think I'll do it a little differently -- write an HTTPBODY file in all cases, but if it's a compressed Content-Encoding that we have the ability to decompress, then decompress while writing to the output file instead of writing two output files. That seems most appropriate to me; the output should correspond to HTTP payloads, regardless of how they're compressed or transmitted over the wire.
Additionally, we can at least write out to .html.gz or similar if it's gzip and decompression is disabled/unavailable.
from tcpflow.