Wireshark and libpcap meet Python! A developer preview is now available
Wireshark is an excellent tool for analyzing network traces, mainly thanks to the rich set of protocol dissectors it supports: nearly 1500 protocols. Utilities like tshark exploit this feature set. While Wireshark is great at the dissection part, using these dissectors programmatically outside Wireshark's codebase is a fairly involved task. There are a few issues -
- The APIs are provided in C only, so they are available only in the C world.
- The APIs are hardly documented and are quite tightly coupled to the way they are used in Wireshark.
- To overcome this, tshark provides a number of command line switches, but even there the functionality is available only if it is available in
tshark has two switches for json one
-T ek, one for getting the json output and another for getting
It would help a lot if this dissection functionality were available -
- Outside Wireshark's code base
- Through a set of programmer friendly APIs that can be leveraged to build other solutions.
- In a higher-level, popular language, say Python
Enter 'wishpy'! The name also stands for: I wish such functionality were available in the Python world!
There are other projects, notably scapy, but their support for protocol dissectors is considerably more limited. So clearly, there is merit in bringing Wireshark's dissectors to the Python world.
Wishpy - The Basics
Actually, 'wishpy' is not the first attempt at providing Python bindings for Wireshark. There is a project called wirepy, but it is no longer maintained, and even the APIs it supports are not very Pythonic. 'Wishpy' is not meant for writing protocol dissectors in Python; instead, it lets you use the protocol dissectors written in C from Python, through more Pythonic APIs. We'll come to that in a bit.
'Wishpy' provides extensions to the Wireshark library using the cffi module for Python. One advantage of this approach is that it is easier to track the development of upstream
Wireshark and make the features supported in the upstream easily available to
'Wishpy' also wraps the libpcap library for packet capture.
'Wishpy' works with Python queues (and, for that matter, any queue that offers a simple put abstraction), which makes packet processing a composable pipeline.
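The queue-based pipeline idea can be sketched with nothing but the standard library. This is a minimal illustration and not wishpy's actual API: `fake_dissector` is a hypothetical stand-in for a dissector that puts one json string per packet on a queue, and the packet contents are made up for the example.

```python
import json
import queue
import threading

SENTINEL = None  # marks the end of the packet stream


def fake_dissector(out_queue, count):
    """Hypothetical stand-in for a dissector: puts one JSON string per packet."""
    for number in range(1, count + 1):
        packet = {"frame": {"frame.number": number, "frame.len": 60 + number}}
        out_queue.put(json.dumps(packet))
    out_queue.put(SENTINEL)


def consume(in_queue):
    """Downstream stage: decodes each packet and collects the frame lengths."""
    lengths = []
    while True:
        item = in_queue.get()
        if item is SENTINEL:
            break
        lengths.append(json.loads(item)["frame"]["frame.len"])
    return lengths


packets = queue.Queue()
producer = threading.Thread(target=fake_dissector, args=(packets, 3))
producer.start()
frame_lengths = consume(packets)
producer.join()
print(frame_lengths)
```

Because the producer only needs `put`, the `queue.Queue` here could be swapped for any object with a compatible `put` method without touching the producer code.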
Wishpy is in active development. Right now it primarily works on Linux, but we have a proof of concept working on Windows for dissecting pcap files. This is an early release and supports the following features -
- Dissected packets are available as json. The output is compatible with what tshark provides. 'Wishpy' also treats the field types semantically: fields that are integers, floats or booleans are not just wrapped into a json string, the way it is done by tshark.
- Supports pretty printing of
- Supports versions 2.6 and 3.2 of Wireshark. The actual version used is determined on the target machine where wishpy is installed.
- Supports the common subset of the APIs provided by libpcap versions 1.7, 1.8 and 1.9.
- Wishpy also provides pcapy compatible APIs, so it is a drop-in replacement for pcapy, at least for the main APIs.
- Wishpy works with pypy3 as well. See below for more details.
- Supports simple Pythonic
- Clean separation between application and library. This example demonstrates how json outputs of packets can be dumped to a Redis queue.
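To make the point about semantic field types concrete, here is a small sketch of the difference between string-wrapped and typed json values. It is not wishpy's implementation: the `FIELD_TYPES` table and the sample values are assumptions made for illustration, whereas a real dissector knows each field's type by itself.

```python
import json

# Hypothetical type table for a few fields; the sample values below are made up.
FIELD_TYPES = {
    "frame.number": int,
    "frame.time_relative": float,
    "tcp.flags.syn": lambda text: text == "1",
}


def typed_fields(string_fields):
    """Convert string-valued fields to native values where a type is known."""
    return {
        name: FIELD_TYPES.get(name, str)(value)
        for name, value in string_fields.items()
    }


as_strings = {
    "frame.number": "42",
    "frame.time_relative": "0.125",
    "tcp.flags.syn": "1",
    "ip.src": "10.0.0.1",
}
as_typed = typed_fields(as_strings)

print(json.dumps(as_strings))  # every value is a json string
print(json.dumps(as_typed))    # numbers and booleans keep their json types
```

With typed values, a consumer can compare or aggregate fields directly instead of parsing strings first.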
We have conducted some profiling and performance experiments. In the runs shown below, Wishpy with CPython is roughly 2 times slower than tshark, which is built in C (and about 4 times slower than Wishpy on PyPy3). This does not look very promising, prima facie. However, when we switch to pypy3 and use its jit capabilities, we often do as well as, if not better than, the tshark that ships with Wireshark! This makes 'wishpy' very promising, because one can get performance on par with (or better than) tshark while keeping all the flexibility and the rich ecosystem of Python's data processing capabilities.
See below for performance results from a PCAP file with large packets (data download), but not a very deep dissection tree per packet.
Summary: PyPy3: fastest --> 'tshark': a little more than twice as slow --> CPython: a little more than 4 times as slow.
    # 'wishpy' with CPython 3.5
    $ venv/bin/python examples/tshark_timed.py ~/pcaps/Capture1.pcap
    Running dissector for 50000 Packets.
    Total time taken in seconds: 25.690130949020386

    # 'wishpy' with PyPy3 (Python 3.6)
    $ venv-pypy/bin/python examples/tshark_timed.py ~/pcaps/Capture1.pcap
    Running dissector for 50000 Packets.
    Total time taken in seconds: 6.10959005355835

    # 'tshark'
    $ echo "Running 'tshark' for 50000 packets." && \
        date +%H:%M:%S.%N && tshark -T json -r ~/pcaps/Capture1.pcap -c 50000 > /dev/null \
        && date +%H:%M:%S.%N
    Running 'tshark' for 50000 packets.
    11:03:36.033859533
    11:03:49.346423480
See below for performance results from a PCAP file with small packets (protocol data), but with a deep dissection tree per packet.
Summary: PyPy3: fastest --> 'tshark': nearly 90% slower --> CPython: a little more than 4 times as slow.
    # 'wishpy' with CPython 3.5
    $ venv/bin/python examples/tshark_timed.py ~/pcaps/Capture2.pcap
    Running dissector for 50000 Packets.
    Total time taken in seconds: 42.661524534225464

    # 'wishpy' with PyPy3 (Python 3.6)
    $ venv-pypy/bin/python examples/tshark_timed.py ~/pcaps/Capture2.pcap
    Running dissector for 50000 Packets.
    Total time taken in seconds: 9.754892110824585

    # 'tshark'
    $ echo "Running 'tshark' for 50000 packets." && \
        date +%H:%M:%S.%N && tshark -T json -r ~/pcaps/Capture2.pcap -c 50000 > /dev/null \
        && date +%H:%M:%S.%N
    Running 'tshark' for 50000 packets.
    11:14:32.583997783
    11:14:50.733474335
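The tshark runs only print a start and an end timestamp, so the elapsed time has to be worked out by hand. The short script below does that arithmetic for both captures and derives the ratios relative to PyPy3. All numbers are copied from the console output above; the helper `to_seconds` is just an illustrative name.

```python
def to_seconds(stamp):
    """Convert an HH:MM:SS.NNNNNNNNN timestamp into seconds since midnight."""
    hms, fraction = stamp.split(".")
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds + int(fraction) / 10 ** len(fraction)


# Timings copied from the two benchmark runs shown above.
runs = {
    "Capture1": {
        "cpython": 25.690130949020386,
        "pypy3": 6.10959005355835,
        "tshark": ("11:03:36.033859533", "11:03:49.346423480"),
    },
    "Capture2": {
        "cpython": 42.661524534225464,
        "pypy3": 9.754892110824585,
        "tshark": ("11:14:32.583997783", "11:14:50.733474335"),
    },
}

ratios = {}
for name, run in runs.items():
    start, end = run["tshark"]
    tshark_seconds = to_seconds(end) - to_seconds(start)
    ratios[name] = {
        "tshark_seconds": tshark_seconds,
        "tshark_vs_pypy3": tshark_seconds / run["pypy3"],
        "cpython_vs_pypy3": run["cpython"] / run["pypy3"],
    }
    print(
        f"{name}: tshark {tshark_seconds:.2f}s, "
        f"tshark/pypy3 {ratios[name]['tshark_vs_pypy3']:.2f}x, "
        f"cpython/pypy3 {ratios[name]['cpython_vs_pypy3']:.2f}x"
    )
```

This gives about 13.3 seconds for tshark on the first capture and about 18.1 seconds on the second, which is where the "twice as slow" and "nearly 90% slower" figures in the summaries come from.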
This is just a quick experiment to understand the performance of 'wishpy'.
Honestly, it started as a check of how bad the performance is compared with tshark. However, thinking more about it, I realized that this particular problem might indeed be a 'sweet spot' for jit capabilities. I won't go on and claim that it runs faster than the native tshark as yet, because the comparison above is not entirely fair: tshark's output is redirected to /dev/null, so it still has to make the effort of generating output that is then ignored. However, I didn't come across an option to run tshark in something like a --silent mode. This again suggests why having something like wishpy makes sense: one is not limited by the CLI switches that are offered by tshark.
In future, we'll talk about a detailed comparison of all the three approaches.
'Wishpy' is still an early project, and the focus so far has been on getting an early preview version out with a basic set of capture and dissection features. Help is surely needed in trying it out on different Linux distributions. macOS support might also work, but it is not tested at the moment.
Also, if you would like to use 'Wishpy' for a use case where it might be well suited but is missing some features, please file an issue on GitHub.
Wishpy is available at -
- Documentation - (early!)
Happy protocol dissecting!! :-)