A New Strategy for DDoS Protection: Log Analysis on Steroids

Wednesday, August 26, 2020

Dave Armlin

Ab1a40d752a0a40d2ecbec2965acfe3d

Anyone whose business depends on online traffic knows how critical it is to protect your business against Distributed Denial of Service (DDoS) attacks. And with cyber attackers more persistent than ever – Q1 2020 DDoS attacks surged by 80% year over year and their average duration rose by 25%—you also know how challenging this can be.

Now imagine you’re responsible for blocking, mitigating, and neutralizing DDoS attacks where the attack surface is tens of thousands of websites. That’s exactly what HubSpot, a top marketing and sales SaaS provider, was up against. How they overcame the challenges they faced makes for an interesting case study in DDoS response and mitigation.

Drinking from a Firehouse

HubSpot’s CMS Hub powers thousands of websites across the globe. Like many organizations, HubSpot uses a Content Delivery Network (CDN) solution to help bolster security and performance.

CDNs, which are typically associated with improving web performance, are built to make content available at edges of the network, providing both performance and data about access patterns across the network. To handle the CDN log data spikes inherent with DDoS attacks, organizations often guesstimate how much compute they may need and maintain that higher level of resource (and expenditure) for their logging solution. Or if budgets don’t allow, they dial back the amount of log data they retain and analyze.

In HubSpot’s case, they use Cloudflare CDN as the first layer of protection for all incoming traffic on the websites they host. This equates to about 136,000 requests/second, or roughly 10TB/day, of Cloudflare log data that HubSpot has at its disposal to help triage and neutralize DDoS attacks. Talk about drinking from a firehouse!

HubSpot makes use of Cloudflare’s Logpushservice to push Cloudflare logs that contain headers and cache statuses for each request directly to HubSpot’s Amazon S3 cloud object storage. In order to process that data and make it searchable, HubSpot’s dedicated security team deployed and managed their own open-source ELK Stack consisting of Elasticsearch (a search database), Logstash (a log ingestion and processing pipeline), and Kibana (a visualization tool for log search analytics). They also used open source Kafka to queue logs into the self-managed ELK cluster.

To prepare the Cloudflare logs for ingestion into the ELK cluster, HubSpot had created a pipeline that would download the Cloudflare logs from S3 into a Kafka pipeline, apply some transformations on the data, insert into a second Kafka queue whereby Logstash would then process the data, and output it into the Elasticsearch cluster. The security team would then use Kibana to interact with the Cloudflare log data to triage DDoS attacks as they occur.

Managing an Elasticsearch cluster dedicated to this Cloudflare/DDoS mitigation use case presented a number of continuing challenges. It required constant maintenance by members of the HubSpot Elasticsearch team. The growth in log data from HubSpot’s rapid customer base expansion was compounded by the fact that DDoS attacks themselves inherently generate a massive spike in log data while they are occurring. Unfortunately, these spikes often triggered instability in the Elastic cluster when they were needed most, during the firefighting and mitigation process. 

Cost was also a concern. Although Elasticsearch, Logstash, and Kibana open source applications can be acquired at no cost, the sheer volume of existing and incoming log data from Cloudflare required HubSpot to manage a very large and increasingly expensive ELK cluster. Infrastructure costs for storage, compute, and networking to support the growing cluster grew faster than the data. And certainly, the human capital in time spent monitoring, maintaining, and keeping the cluster stable and secure was significant. The team constantly had discussions about whether to add more compute to the cluster or reduce data retention time. To accommodate their Cloudflare volume, which was exceeding 10TB/day and growing, HubSpot was forced to limit retention to just five days. 

The Data Lake Way

Like many companies whose business solely or significantly relies on online commerce, HubSpot wanted a simple, scalable, and cost-effective way to handle the continued growth of their security log data volume.

They were wary of solutions that might ultimately force them to reduce data retention to a point where the data wasn’t useful. They also needed to be able to keep up with huge data throughput at a low latency so that when it hit Amazon S3, HubSpot could quickly and efficiently firefight DDoS attacks.

HubSpot decided to rethink its approach to security log analysis and management. They embraced a new approach that consisted primarily of these elements:

- Using a fully managed log analysis serviceso internal teams wouldn’thave to manage the scaling of ingestion or query side components and could eliminate compute resources

- Leveraging the Kibana UIthat the security team is already proficient with

- Turning their S3 cloud object storage into a searchable analytic data lakeso Cloudflare CDN and other security-related log data could be easily cleaned, prepared, and analyzed in place, without data movement or schema management

By doing this, HubSpot can effectively tackle DDoS challenges. They significantly cut their costs and can easily handle the 10TB+/day flow of Cloudflare log data, without impacting performance.

HubSpot no longer has to sacrifice data retention time. They can retain Cloudflare log data for much longer than 5 days, without worrying about costs, and can dynamically scale resources so there is no need to invest in compute that’s not warranted. This is critical for long-tail DDoS protection planning and execution, and enables HubSpot to easily meet SLAs for DDoS attack response time.

Data lake-based approaches also enable IT organizations to unify all their security data sources in one place for better and more efficient overall protection. Products that empower data lake thinking allow  new workloads to be added on the fly with no provisioning or configuration required, helping organizations gain even greater value from log data for security use cases. For instance, in addition to storing and analyzing externally generated log data within their S3 cloud object storage, HubSpot will be storing and monitoring internal security log data to enhance insider threat detection and prevention.

Incorporating a data lake philosophy into your security strategy is like putting log analysis on steroids. You can store and process exponentially more data volume and types, protect better, and spend much less.

About the author: Dave Armlin is VP of Customer Success and Solutions Architecture at ChaosSearch. Dave has spent his 25+ year career building, deploying, and evangelizing secure enterprise and cloud-based architectures.

Possibly Related Articles:
60181
Cloud Security Enterprise Security
DDoS DDoS Protection Data Lake log data
Post Rating I Like this!
The views expressed in this post are the opinions of the Infosec Island member that posted this content. Infosec Island is not responsible for the content or messaging of this post.

Unauthorized reproduction of this article (in part or in whole) is prohibited without the express written permission of Infosec Island and the Infosec Island member that posted this content--this includes using our RSS feed for any purpose other than personal use.