Real-time Threat Detection Using Machine Learning and Apache Kafka

Streaming Audio: Apache Kafka® & Real-Time Data

Nov 29 2022 • 29 mins

Can we use machine learning to detect security threats in real time? As organizations increasingly rely on distributed systems, it is becoming more important to quickly analyze the traffic that passes through those systems. Confluent Hackathon ’22 finalist Géraud Dugé de Bernonville (Data Consultant, Zenika Bordeaux) shares how his team used TensorFlow (machine learning) and Neo4j (graph database) to analyze network traffic data and detect threats in real time. What started as a research and development exercise turned into ZIEM, a full-blown internal project using ksqlDB to manipulate, export, and visualize data from Apache Kafka®.

Géraud and his team noticed that large amounts of data passed through their network, and they were curious to see if they could detect threats as they happened. As a hackathon project, they built ZIEM, a network mapping and intrusion detection platform that quickly generates network diagrams. Using Kafka, the system captures network packets, processes the data in ksqlDB, and uses a Neo4j Sink Connector to send it to a Neo4j instance. Using the Neo4j browser, users can see instant network diagrams showing who's on the network, allowing them to detect anomalies quickly in real time.
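
As a rough illustration of the graph idea described above, the following minimal Python sketch turns packet records into "who talks to whom" edges and flags hosts that are not in a known inventory. It is not ZIEM's implementation: the record fields (`src_ip`, `dst_ip`), the helper names, and the inventory check are all assumptions made for illustration, and the real pipeline does this work with Kafka, ksqlDB, and Neo4j rather than in-memory Python.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical packet record shape; the field names are assumptions,
# not the schema used by ZIEM.
Packet = Dict[str, str]
Edge = Tuple[str, str]


def build_edges(packets: Iterable[Packet]) -> Dict[Edge, int]:
    """Count packets per (source, destination) pair, i.e. graph edges."""
    edges: Dict[Edge, int] = defaultdict(int)
    for packet in packets:
        edges[(packet["src_ip"], packet["dst_ip"])] += 1
    return dict(edges)


def unknown_hosts(edges: Dict[Edge, int], known_hosts: Set[str]) -> Set[str]:
    """Return hosts seen in traffic that are missing from the inventory."""
    seen = {host for pair in edges for host in pair}
    return seen - known_hosts


# Sample traffic: two packets between known hosts, one from an unknown host.
packets = [
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2"},
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2"},
    {"src_ip": "10.0.0.9", "dst_ip": "10.0.0.2"},
]

edges = build_edges(packets)
suspects = unknown_hosts(edges, known_hosts={"10.0.0.1", "10.0.0.2"})
```

In a graph database the edges would become relationships between host nodes, so an unexpected host shows up visually as a new node in the diagram.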

The ZIEM project was initially conceived as an experiment to explore the potential of using Kafka for data processing and manipulation. However, it soon became apparent that there was great potential for broader applications (banking, security, etc.). As a result, the focus shifted to developing a tool for exporting data from Kafka, which is helpful for transforming data for deeper analysis, moving it from one database to another, or creating powerful visualizations.

Géraud goes on to talk about how the success of this project has helped them better understand the potential of using Kafka for data processing. Zenika plans to continue working to build a pipeline that can handle more robust visualizations, expose more learning opportunities, and detect patterns.
