Detection of Malware Presence using Wireshark Traffic Analysis (Emotet PCAP Study)

🔍 Introduction

Malware attacks have become increasingly sophisticated, often using ordinary network communication channels to remain undetected. Endpoint detection on its own often misses advanced threats such as botnets and trojans, which makes network-level evidence a useful complement.

This project focuses on analyzing network traffic using Wireshark to identify the presence of malware within a real-world packet capture (PCAP) dataset. The dataset used in this study is based on Emotet malware traffic, a well-known and highly evasive banking trojan.

The capture comes from Palo Alto Networks Unit 42 threat research.

🎯 Objectives

To analyze network traffic using Wireshark for identifying malware activity
To detect Indicators of Compromise (IoCs) such as suspicious domains and abnormal communication patterns
To understand malware behavior through DNS and HTTP analysis
To apply AI-based anomaly detection concepts to network traffic

📁 Dataset Description

The dataset used is a PCAP file containing real-world Emotet infection traffic. The capture includes DNS queries, HTTP communications, and TCP sessions generated by an infected host communicating with external servers.

This dataset is widely used in cybersecurity research and training for identifying command-and-control (C2) behavior and malware communication patterns.

🏗️ Architecture of the Work

[PCAP File]
      ↓
[Wireshark Analysis]
      ↓
[DNS Analysis] —— [HTTP Analysis]
      ↓
[Feature Extraction]
      ↓
[AI-based Analysis]
      ↓
[Malware Detection Output]

This architecture represents the workflow from raw packet capture to malware detection using structured traffic analysis and AI-assisted insights.

⚙️ Procedure

The PCAP file was loaded into Wireshark
DNS traffic was filtered using:
```
dns
```

Suspicious domains were isolated using:

dns && !(dns.qry.name contains "microsoft" || dns.qry.name contains "windows" || dns.qry.name contains "office" || dns.qry.name contains "live")

External communication was analyzed using:

ip.src == 10.1.6.206 && !(ip.dst == 10.1.6.0/24)

HTTP traffic was analyzed to identify payload transfer and abnormal requests
HTTP objects were extracted to examine transferred data
User-Agent and request patterns were analyzed
AI-based anomaly detection concepts were applied
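
The two display filters above can be mirrored in Python when DNS query names and destination IPs have been exported from the capture for post-processing. The following is a minimal sketch under that assumption: the function names and the sample lists are illustrative, the keyword list and subnet are copied from the filters in this post, and lower-casing the name before matching is a choice of this sketch.

```python
import ipaddress

# Keywords copied from the DNS display filter used in this post.
BENIGN_KEYWORDS = ("microsoft", "windows", "office", "live")
# Internal subnet copied from the IP display filter used in this post.
INTERNAL_NET = ipaddress.ip_network("10.1.6.0/24")


def is_unlisted_domain(name: str) -> bool:
    """True if the query name contains none of the benign keywords."""
    lowered = name.lower()
    return not any(keyword in lowered for keyword in BENIGN_KEYWORDS)


def is_external(ip: str) -> bool:
    """True if the address lies outside the internal subnet."""
    return ipaddress.ip_address(ip) not in INTERNAL_NET


# The domains and the external IP appear in this post;
# 10.1.6.1 is a hypothetical internal address.
queries = ["v10.events.data.microsoft.com", "hangarlastik.com", "seo.udaipurkart.com"]
destinations = ["10.1.6.1", "89.252.164.58"]

suspicious = [q for q in queries if is_unlisted_domain(q)]
external = [d for d in destinations if is_external(d)]
print(suspicious)
print(external)
```

A substring allow-list is coarse. For example, dns.msftncsi.com contains none of the four keywords, so it passes this check as "unlisted" and the output still needs manual review.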

📊 Inferences (Proof of Malware Presence)

🔹 Inference 1: DNS Retransmission

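Repeated DNS queries to legitimate domains such as v10.events.data.microsoft.com were observed. Because this is a Windows telemetry endpoint, the repetition is weak evidence on its own and is best read as background traffic alongside the stronger indicators that follow.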

🔹 Inference 2: Blended Legitimate Traffic

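Repeated queries to dns.msftncsi.com, which Windows uses for its network connectivity check, show how much legitimate system traffic surrounds the infection. Malicious requests interleaved with this traffic are harder to spot by volume alone.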

🔹 Inference 3: Suspicious Domain Communication

Queries to unknown domains such as hangarlastik.com indicate possible command-and-control communication.

🔹 Inference 4: Malicious Subdomain Usage

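The infected host resolves seo.udaipurkart.com to an external IP. Resolution to a public address is expected for any internet domain, so the indicator here is the lookup itself: a subdomain unrelated to normal system activity, pointing to potentially malicious infrastructure.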

🔹 Inference 5: Beaconing Behavior

Repeated DNS queries to the same domain suggest automated communication, a pattern typical of C2 beaconing.
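
One common way to put a number on "automated" is to measure how regular the gaps between repeated queries are. The sketch below computes the coefficient of variation of the inter-arrival times; a value near zero means the events are almost evenly spaced, as a timer-driven beacon would be. The timestamps and the 0.1 cutoff are hypothetical and were not derived from this capture.

```python
from statistics import mean, pstdev


def interval_cv(timestamps):
    """Coefficient of variation of the gaps between consecutive events."""
    ordered = sorted(timestamps)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        raise ValueError("need at least three events with distinct times")
    return pstdev(gaps) / mean(gaps)


def looks_like_beacon(timestamps, max_cv=0.1):
    """Flag event series whose spacing is nearly constant."""
    return interval_cv(timestamps) <= max_cv


# Hypothetical query times in seconds, not taken from the capture.
regular = [0.0, 60.1, 119.9, 180.0, 240.2]
irregular = [0.0, 3.0, 95.0, 101.0, 400.0]
print(looks_like_beacon(regular))
print(looks_like_beacon(irregular))
```

Malware that adds random jitter to its check-in interval would raise the coefficient of variation, so the cutoff has to be tuned against real traffic.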

🔹 Inference 6: External Communication

The infected host communicates with external IPs such as 89.252.164.58 over HTTP, which is consistent with outbound malware activity.

🔹 Inference 7: Binary Payload Transfer

Presence of application/octet-stream content indicates possible malware payload delivery.
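
The Content-Type header is chosen by the server, so it is worth checking the first bytes of each exported object as well. Windows executables and DLLs begin with the two bytes "MZ". The sketch below combines both checks; the function name, the labels and the sample bytes are hypothetical, and real bodies would come from the HTTP objects exported in Wireshark.

```python
def classify_http_object(content_type: str, body: bytes) -> str:
    """Label an exported HTTP object by its declared type and leading bytes."""
    media_type = content_type.split(";")[0].strip().lower()
    declared_binary = media_type == "application/octet-stream"
    is_pe = body[:2] == b"MZ"  # Windows executables and DLLs start with "MZ"
    if is_pe:
        return "windows-executable"
    if declared_binary:
        return "opaque-binary"
    return "other"


# Hypothetical samples, not taken from the capture.
samples = [
    ("application/octet-stream", b"MZ\x90\x00\x03\x00"),
    ("application/octet-stream", b"\x8f\x12\xa0\x44"),
    ("text/html; charset=UTF-8", b"<html>"),
]
labels = [classify_http_object(ct, body) for ct, body in samples]
print(labels)
```

An object labeled "opaque-binary" may still be an encoded or encrypted payload, so it deserves a closer look rather than dismissal.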

🔹 Inference 8: Large HTTP Responses

Unusually large HTTP responses are more consistent with payload or encoded data transfer than with ordinary web page content.

🔹 Inference 9: Random Filenames

Non-human-readable filenames indicate attempts to evade detection.
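
"Non-human-readable" can be approximated with Shannon entropy over the characters of a filename: names built from many distinct, evenly used characters score higher than dictionary words. This is a rough heuristic sketch, and the filenames below are hypothetical rather than taken from the capture.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical filenames, not taken from the capture.
names = ["invoice", "report", "q7xk2vbn9zj4"]
for name in names:
    print(name, round(shannon_entropy(name), 2))
```

Entropy is bounded by the logarithm of the string length, so scores are only comparable between names of similar length.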

🔹 Inference 10: Abnormal HTTP Behavior

Multiple POST requests with random URI patterns indicate automated malware communication.

🔹 Inference 11–20: Advanced Behavioral Indicators

Excessive POST request activity
Obfuscated URI patterns
Use of non-standard ports
Concentrated external IP communication
Repeated TCP handshakes
HTTP 200 OK responses confirming active sessions
Use of /cgi-bin/ scripts
Mixed legitimate and malicious traffic
DNS resolution to external IPs
Fully automated traffic behavior
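
Several of these indicators reduce to simple counts over HTTP request records. The sketch below tallies POST requests per destination and collects ports outside a "standard" set. The record layout, the choice of ports 80 and 443 as standard, and all sample values are assumptions made for illustration; the addresses come from a range reserved for documentation.

```python
from collections import Counter
from typing import NamedTuple


class HttpRequest(NamedTuple):
    method: str
    dst_ip: str
    dst_port: int
    uri: str


STANDARD_HTTP_PORTS = {80, 443}  # assumption: anything else counts as non-standard


def summarize(requests):
    """Return POST counts per destination and the set of non-standard ports seen."""
    posts = Counter(r.dst_ip for r in requests if r.method == "POST")
    odd_ports = {r.dst_port for r in requests if r.dst_port not in STANDARD_HTTP_PORTS}
    return posts, odd_ports


# Hypothetical records; 203.0.113.0/24 is a documentation-only address range.
records = [
    HttpRequest("GET", "203.0.113.10", 80, "/index.html"),
    HttpRequest("POST", "203.0.113.20", 8080, "/a1B2c3/"),
    HttpRequest("POST", "203.0.113.20", 8080, "/Zx9Qw/"),
    HttpRequest("POST", "203.0.113.30", 80, "/cgi-bin/x"),
]
posts, odd_ports = summarize(records)
print(posts.most_common(1), odd_ports)
```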

⚠️ Effects of Malware

Data theft
Unauthorized system access
Network congestion
System performance degradation
Financial and privacy risks

🧠 New Findings

Detection of beaconing behavior
Identification of suspicious DNS domains
Evidence of external C2 communication
Presence of obfuscated HTTP traffic
Indicators of possible payload delivery

🤖 Use of AI in this DA

Artificial Intelligence techniques can significantly enhance network traffic analysis by identifying anomalies and classifying malicious behavior.

In this work, a simple threshold-based form of anomaly detection was applied. For instance, an unusually high number of DNS and HTTP requests from a single host can be flagged as suspicious.

Example logic:

if request_count > threshold:
    print("Suspicious traffic detected")

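Rules like this are the starting point for automated detection. Once the threshold is derived from baseline traffic rather than set by hand, the same check reduces the manual effort of identifying malware activity.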
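
A small step beyond a hand-picked threshold is to derive it from the traffic itself. The sketch below counts requests per source host and flags any host whose count exceeds a multiple of the median count. The factor of 3, the function name and the sample data are hypothetical; only 10.1.6.206 appears in this post, and this is a statistical rule rather than a trained model.

```python
from collections import Counter
from statistics import median


def flag_noisy_hosts(source_ips, factor=3.0):
    """Flag hosts whose request count exceeds factor x the median count."""
    counts = Counter(source_ips)
    threshold = factor * median(counts.values())
    return sorted(ip for ip, n in counts.items() if n > threshold)


# Hypothetical per-request source IPs; only 10.1.6.206 appears in this post.
sources = (
    ["10.1.6.206"] * 40
    + ["10.1.6.10"] * 5
    + ["10.1.6.11"] * 6
    + ["10.1.6.12"] * 4
)
print(flag_noisy_hosts(sources))
```

The median is used instead of the mean so that one very noisy host does not raise the threshold enough to hide itself.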

🎬 Conclusion

The analysis of the PCAP file using Wireshark successfully confirmed the presence of malware activity. Various indicators such as suspicious DNS queries, repeated communication patterns, abnormal HTTP requests, and external server connections were identified.

The integration of structured analysis techniques and AI-based concepts further strengthened the detection process. This study highlights the effectiveness of network traffic analysis in identifying malware and understanding its behavior in real-world scenarios.

🔗 References

Malware-Traffic-Analysis.net
Palo Alto Networks Unit 42
Wireshark Documentation

🙏 Acknowledgements

I would like to express my sincere gratitude to my parents, university, VIT SCOPE, and the course instructors for their support and guidance throughout this project. I also acknowledge the resources and datasets provided by the cybersecurity research community.
