Detection of Malware Presence using Wireshark Traffic Analysis (Emotet PCAP Study)
Introduction
In today’s rapidly evolving digital landscape, malware attacks have become increasingly sophisticated, often leveraging network communication channels to remain undetected. Traditional endpoint detection methods are no longer sufficient to identify advanced threats such as botnets and trojans.
This project focuses on analyzing network traffic using Wireshark to identify the presence of malware within a real-world packet capture (PCAP) dataset. The dataset used in this study is based on Emotet malware traffic, a well-known and highly evasive banking trojan.
The dataset comes from Palo Alto Networks Unit 42 threat research.
Objectives
To analyze network traffic using Wireshark for identifying malware activity
To detect Indicators of Compromise (IoCs) such as suspicious domains and abnormal communication patterns
To understand malware behavior through DNS and HTTP analysis
To apply AI-based anomaly detection concepts to network traffic
Dataset Description
The dataset used is a PCAP file containing real-world Emotet infection traffic. The capture includes DNS queries, HTTP communications, and TCP sessions generated by an infected host communicating with external servers.
This dataset is widely used in cybersecurity research and training for identifying command-and-control (C2) behavior and malware communication patterns.
Architecture of the Work
[PCAP File]
↓
[Wireshark Analysis]
↓
[DNS Analysis] —— [HTTP Analysis]
↓
[Feature Extraction]
↓
[AI-based Analysis]
↓
[Malware Detection Output]
This architecture represents the workflow from raw packet capture to malware detection using structured traffic analysis and AI-assisted insights.
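The feature extraction stage of this workflow could be sketched roughly as follows. This is a minimal illustration, not the exact method used in the study: the record layout (dicts with `src`, `proto`, `qname`, and `method` keys) and the function name `extract_features` are assumptions made for the example, standing in for whatever fields are exported from Wireshark.

```python
from collections import defaultdict


def extract_features(records):
    """Aggregate simple per-host counts from parsed packet records.

    Each record is a hypothetical dict with the keys 'src' and 'proto',
    plus 'qname' for DNS records and 'method' for HTTP records.
    """
    raw = defaultdict(lambda: {"dns_queries": 0, "domains": set(), "http_posts": 0})
    for rec in records:
        host = raw[rec["src"]]
        if rec["proto"] == "dns":
            host["dns_queries"] += 1
            host["domains"].add(rec["qname"])
        elif rec["proto"] == "http" and rec.get("method") == "POST":
            host["http_posts"] += 1

    # Convert the set of domains into a count so the result is plain numbers.
    return {
        src: {
            "dns_queries": f["dns_queries"],
            "distinct_domains": len(f["domains"]),
            "http_posts": f["http_posts"],
        }
        for src, f in raw.items()
    }
```

Per-host counts like these are the kind of input a later anomaly-detection step can compare against a threshold.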
⚙️ Procedure
The PCAP file was loaded into Wireshark
DNS traffic was filtered using:
dns
Suspicious domains were isolated using:
dns && !(dns.qry.name contains "microsoft" || dns.qry.name contains "windows" || dns.qry.name contains "office" || dns.qry.name contains "live")
External communication was analyzed using:
ip.src == 10.1.6.206 && !(ip.dst == 10.1.6.0/24)
HTTP traffic was analyzed to identify payload transfer and abnormal requests
HTTP objects were extracted to examine transferred data
User-Agent and request patterns were analyzed
AI-based anomaly detection concepts were applied
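The domain-isolation filter from the procedure can also be reproduced outside Wireshark once the query names have been exported. The sketch below assumes the names are already available as a list of strings. The function name `isolate_suspicious` is hypothetical, and unlike the display filter it lowercases names before matching.

```python
# Substrings excluded by the display filter used in the procedure.
BENIGN_KEYWORDS = ("microsoft", "windows", "office", "live")


def isolate_suspicious(query_names):
    """Return unique query names containing none of the excluded substrings.

    Order of first appearance is preserved.
    """
    seen = set()
    result = []
    for name in query_names:
        lowered = name.lower()
        if any(keyword in lowered for keyword in BENIGN_KEYWORDS):
            continue
        if lowered not in seen:
            seen.add(lowered)
            result.append(lowered)
    return result


if __name__ == "__main__":
    names = [
        "v10.events.data.microsoft.com",
        "dns.msftncsi.com",
        "hangarlastik.com",
        "seo.udaipurkart.com",
    ]
    print(isolate_suspicious(names))
```

Substring matching is coarse in both directions: dns.msftncsi.com passes through because it contains none of the four keywords, while any unrelated domain that happens to contain "live" or "office" would be hidden. The remaining names still need manual review.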
Inferences (Proof of Malware Presence)
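Inference 1: DNS Retransmission
The capture shows repeated DNS queries to legitimate domains such as v10.events.data.microsoft.com. Because this is a Windows telemetry endpoint, the repetition is weak evidence on its own, but it is consistent with the retry and persistence behavior often associated with malware.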
Inference 2: Blended Legitimate Traffic
Repeated queries to dns.msftncsi.com, which Windows normally uses for network connectivity checks, may reflect malware blending its malicious traffic with normal system activity.
Inference 3: Suspicious Domain Communication
Queries to unknown domains such as hangarlastik.com indicate possible command-and-control communication.
Inference 4: Malicious Subdomain Usage
The domain seo.udaipurkart.com resolves to an external IP, indicating potential malicious infrastructure usage.
Inference 5: Beaconing Behavior
Repeated DNS queries to the same domain demonstrate automated communication patterns typical of C2 beaconing.
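One common way to test for beaconing is to look at how regular the gaps between queries are. The sketch below is an illustration under assumptions, not the check performed in this study: it takes a list of timestamps (in seconds) for queries to one domain and flags the series when the spread of the intervals is small relative to their mean. The function name and the default values for `min_events` and `max_jitter_ratio` are hypothetical.

```python
from statistics import mean, pstdev


def looks_like_beaconing(timestamps, min_events=4, max_jitter_ratio=0.1):
    """Return True when events are spaced at nearly constant intervals.

    timestamps: query times in seconds for a single domain.
    max_jitter_ratio: allowed standard deviation of the intervals,
    expressed as a fraction of the mean interval (illustrative default).
    """
    if len(timestamps) < min_events:
        return False  # too few events to judge regularity
    ordered = sorted(timestamps)
    intervals = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average = mean(intervals)
    if average <= 0:
        return False  # all events share one timestamp
    return pstdev(intervals) / average <= max_jitter_ratio
```

Malware often adds random jitter to its check-in interval, so the ratio may need to be loosened, and legitimate software such as update checkers also polls on a timer. A positive result is a reason to look closer, not proof of C2.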
Inference 6: External Communication
The infected host communicates with external IPs such as 89.252.164.58 over HTTP, confirming outbound malware activity.
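The same internal-versus-external test that the display filter performs can be expressed with Python's standard `ipaddress` module. This is a minimal sketch assuming flows have been exported as `(source, destination)` pairs; the helper names are hypothetical, while the host address and subnet are taken from the filter used in the procedure.

```python
import ipaddress

# Internal subnet taken from the display filter in the procedure.
INTERNAL_NET = ipaddress.ip_network("10.1.6.0/24")


def is_external(dst_ip, internal_net=INTERNAL_NET):
    """Return True when dst_ip lies outside the internal subnet."""
    return ipaddress.ip_address(dst_ip) not in internal_net


def external_destinations(flows, host="10.1.6.206"):
    """Return the sorted, unique external IPs contacted by the given host.

    flows: iterable of (src_ip, dst_ip) string pairs.
    """
    return sorted({dst for src, dst in flows if src == host and is_external(dst)})
```

A short list of external destinations for a single host makes it easier to check each address against threat intelligence sources.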
Inference 7: Binary Payload Transfer
Presence of application/octet-stream content indicates possible malware payload delivery.
Inference 8: Large HTTP Responses
Unusually large HTTP responses suggest obfuscated or encoded malicious data transfer.
Inference 9: Random Filenames
Non-human-readable filenames indicate attempts to evade detection.
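"Non-human-readable" can be made measurable with Shannon entropy, which is higher for strings whose characters are spread evenly. The sketch below is one possible heuristic, not the method used in this analysis. The function names and the 3.5-bit threshold are assumptions for illustration.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_random(filename, threshold=3.5):
    """Flag filenames whose stem has entropy at or above the threshold.

    The threshold is an illustrative value and would need tuning.
    """
    stem = filename.rsplit(".", 1)[0]  # drop the extension, if any
    return shannon_entropy(stem) >= threshold
```

Entropy cannot exceed log2 of the string length, so short names can never cross a 3.5-bit threshold. The check is only meaningful for stems of roughly a dozen characters or more.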
Inference 10: Abnormal HTTP Behavior
Multiple POST requests with random URI patterns indicate automated malware communication.
Inference 11–20: Advanced Behavioral Indicators
Excessive POST request activity
Obfuscated URI patterns
Use of non-standard ports
Concentrated external IP communication
Repeated TCP handshakes
HTTP 200 OK responses confirming active sessions
Use of /cgi-bin/scripts
Mixed legitimate and malicious traffic
DNS resolution to external IPs
Fully automated traffic behavior
⚠️ Effects of Malware
Data theft
Unauthorized system access
Network congestion
System performance degradation
Financial and privacy risks
New Findings
Detection of beaconing behavior
Identification of suspicious DNS domains
Evidence of external C2 communication
Presence of obfuscated HTTP traffic
Indicators of possible payload delivery
Use of AI in this DA
Artificial Intelligence techniques can significantly enhance network traffic analysis by identifying anomalies and classifying malicious behavior.
In this work, AI-based concepts such as threshold-based anomaly detection were applied. For instance, an unusually high number of DNS and HTTP requests from a single host can be flagged as suspicious.
Example logic:
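request_count = 150  # illustrative value: requests observed from one host
threshold = 100      # illustrative cutoff, not taken from the capture
if request_count > threshold:
    print("Suspicious traffic detected")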
This demonstrates how AI can automate detection and reduce manual effort in identifying malware activity.
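Rather than fixing the threshold by hand, it can be derived from the traffic itself. The sketch below is an assumption-laden illustration, not the approach used in this project: it counts requests per source host and flags any host whose count exceeds the mean by more than `k` standard deviations. The function name `flag_hosts` and the default `k=2.0` are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def flag_hosts(request_sources, k=2.0):
    """Return hosts whose request count exceeds mean + k * std across hosts.

    request_sources: iterable of source IP strings, one entry per request.
    k: number of standard deviations above the mean (illustrative default).
    """
    counts = Counter(request_sources)
    if not counts:
        return []
    values = list(counts.values())
    threshold = mean(values) + k * pstdev(values)
    return sorted(host for host, count in counts.items() if count > threshold)
```

With only a handful of hosts, a single heavy talker inflates the standard deviation, so a larger `k` may never trigger. A baseline taken from known-clean traffic would likely give a more stable threshold.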
Conclusion
The analysis of the PCAP file using Wireshark successfully confirmed the presence of malware activity. Various indicators such as suspicious DNS queries, repeated communication patterns, abnormal HTTP requests, and external server connections were identified.
The integration of structured analysis techniques and AI-based concepts further strengthened the detection process. This study highlights the effectiveness of network traffic analysis in identifying malware and understanding its behavior in real-world scenarios.
References
Malware-Traffic-Analysis.net
Palo Alto Networks Unit 42
Wireshark Documentation
Acknowledgements
I would like to express my sincere gratitude to my parents, university, VIT SCOPE, and the course instructors for their support and guidance throughout this project. I also acknowledge the resources and datasets provided by the cybersecurity research community.