
A novel multi-scale CNN and bi-LSTM arbitration dense network model for low-rate DDoS attack detection
ABSTRACT

Low-rate distributed denial of service attacks, also known as LDDoS attacks, pose notorious security risks in cloud computing networks. They overload cloud servers and degrade
network service quality with a stealthy strategy. Furthermore, this kind of small-ratio, pulse-like abnormal traffic leads to a serious data scale problem. As a result, the existing
models for detecting minority and adversarial LDDoS attacks are insufficient in both detection accuracy and time consumption. This paper proposes a novel multi-scale Convolutional Neural
Network (CNN) and bidirectional Long Short-Term Memory (bi-LSTM) arbitration dense network model (called MSCBL-ADN) for learning and detecting LDDoS attack behaviors under the conditions of a
limited dataset and limited time consumption. MSCBL-ADN incorporates a CNN for preliminary spatial feature extraction and an embedding-based bi-LSTM for time relationship extraction. It then
employs an arbitration network to re-weigh feature importance for higher accuracy. Finally, it uses a 2-block dense connection network to perform the final classification. Experiments
on the popular ISCX-2016-SlowDos dataset demonstrate that the proposed MSCBL-ADN model achieves a significant improvement, with high detection accuracy and superior time performance
over the state-of-the-art models.

INTRODUCTION

With the widespread deployment of cloud computing network, LDDoS attacks have been reported as the
notorious network security issues. These attacks take the form of a periodic multi-variate time series pulse flow in which a malicious user stealthily floods the cloud service. Researchers
have confirmed that they can severely degrade the quality of service1. Yet their behavior resembles that of normal flows in terms of speed and volume, making them difficult to distinguish from normal flows.
For example, the BlackNurse attack reported by TDC Security Operations Center2 could degrade operators’ cloud services with a traffic speed of 5 packets per second, and take operators’
firewalls offline if the traffic volume rose to 15–18 Mbit/s. Thus, LDDoS attacks attract more and more attention, and many LDDoS variants with smarter behaviors have been designed. Hivenets3 used a bee
colony algorithm to guide a group of intelligent bots to launch autonomous LDDoS attacks with minimal supervision. A multi-target LDDoS attack model4 tried to utilize and orchestrate bots’
spare time slots between bursts to launch new LDDoS attacks. The bad news is that these smarter LDDoS attacks are harder to capture. Even worse, all the traffic in the network is
heavily impacted at the same time, which leads to some packet loss. This is why large-scale public LDDoS datasets are scarce. As a result, designing and training a powerful and
satisfactory LDDoS attack detection model based on limited, poor-quality data is a challenging task. Since the performance of models is positively correlated with the dataset, researchers tried to
make joint use of datasets that are scattered across different clients to train a powerful model with high performance. Federated Learning (FL) is exactly this kind of paradigm, offering joint learning
through multiple datasets. Wang et al.5 proposed an intrusion detection method based on FL and CNNs to solve the problem of training a deep model with high accuracy under the limited
labelled data generated by a single mechanism. Li et al.6 proposed a novel federated deep learning framework, DeepFed. The framework employed CNN and gated recurrent unit (GRU) layers as the basic
layers of the intrusion detection model, and allowed multiple clients to collectively train the model parameters. The experiments showed that the proposed DeepFed scheme was highly effective
in detecting network intrusions. Mothukuri et al.7 proposed a GRU-based intrusion detection model for IoT networks. The weights of the central model were updated from multiple clients over
federated training rounds to optimize its accuracy. This method kept the data intact on local IoT devices by sharing only the learned weights, which kept the data safe. Idrissi et al.8 proposed an
autoencoder-based method for distributed network intrusion detection systems by leveraging FL and anomaly detection. The method computed an intrusion score based on the reconstruction error
while preserving the privacy of local data. Wu et al.9 proposed a novel FL framework for LDDoS attack detection using transfer learning and a support vector machine (SVM). The framework
aggregated data distributed across various clients through FL, then utilized transfer learning and SVM to build personalized models. Bertoli et al.10 introduced a stacked-unsupervised FL
approach for flow-based network intrusion detection. The method comprised a deep autoencoder in conjunction with an energy flow classifier in an ensemble learning task. These methods
can make use of the high-quality LDDoS attack examples held at different data nodes. However, their deep learning models are insufficient for learning time sequence features. Thus, Tang et
al.11 used the Gramian angular summation field transformation to analyze time-based features for identifying attacks, and the results were effective. With the emergence of LSTM, researchers
integrated it into the FL framework. Taking advantage of LSTM, Zhao et al.12 proposed an effective intelligent intrusion detection method based on an FL-aided LSTM framework, which addresses the
difficulty of training a powerful deep learning model as well as the intrusion risks at the central server and the violation of user privacy that arise when the dataset is collected from all the user servers. Huong et al.13
proposed a hybrid variational autoencoder-based LSTM model built on FL for time series anomaly detection. The architecture achieves high detection performance on time series data anomalies.
Zhang et al.14 proposed a vertical federated learning framework based on an LSTM fault classification network. The framework could optimize model parameters across the entire firefighting IoT
platform and obtain effective prediction results. Wang et al.15 designed an attention-based bi-LSTM model within a multi-domain FL framework for detecting coordinated network
attacks. Liu et al.16 proposed an asynchronous FL arbitration framework named AsyncFL-bLAM based on bi-LSTM and an attention model. The novel arbitration detection model took on the
responsibility of LDDoS attack detection locally. The above methods can use well-labelled training data across many organizations. However, when researchers examined and compared the
performance of LDDoS attack detection methods deployed in centralized mode and in FL mode17, the accuracy of models in FL mode could only approach, but remained lower than, that of models in
centralized mode. Thus, researchers turned to LDDoS data enhancement and novel model design. At an early stage, Random Forest (RF)18, SVM19, K-Nearest Neighbor (KNN)20 and their enhanced shallow
variants with well-designed feature engineering were adopted for LDDoS attack detection. Tang et al.21 proposed a performance and features framework for LDDoS attacks based on SVM, KNN and so on.
The framework analyzed the performance of normal traffic and made full use of flow features, which yielded a high detection accuracy. Tang et al.22 also enhanced the fine-grained detection
model with an adaptive Kohonen network cluster analysis algorithm. As a result, the proposal was capable of accurate fine-grained detection and could detect every attack burst. Muhammad et al.23
proposed a multivariate chart integrating support vector data description and kernel density estimation. The chart can monitor network anomalies with high performance. Later,
researchers realized that deep learning models were superior to shallow ones in detection accuracy and scalability. Liu et al.4 introduced a locality-sensitive feature extraction method for
classifying the different flows into small buckets in advance, and then proposed a simple 3-layer CNN-based model to extract key feature representations. Zhou et al.24 proposed a few-shot
learning model with a siamese CNN (FSL-SCNN) to alleviate the over-fitting issue and enhance the performance of intelligent anomaly detection. The FSL-SCNN encoding network helped to
optimize feature representations and enhance the efficiency of the training process. Asgharzadeh et al.25 devised the CNN-BMECapSA-RF model based on CNN and hybrid layers to enhance classification
accuracy, and developed a binary multi-objective enhanced capuchin search algorithm to perform feature extraction automatically. Ren et al.26 proposed a novel network, CANET, in which a CNN
and an attention mechanism are combined for local spatio-temporal feature extraction of LDDoS attacks. Experiments demonstrated that CANET performs well in accuracy, detection rate, and false
positive rate. Venkateshwarlu et al.27 proposed a framework named LRDADF, which used a CNN and a deep sparse autoencoder to detect LDDoS attacks. The sparse autoencoder was used to learn from
traffic patterns and the CNN was used for classification. Considering the time series properties of LDDoS, Salahuddin et al.28 proposed an anomaly detection system based on an autoencoder that leverages
time-based features over multiple time windows to efficiently detect anomalous LDDoS traffic. Romany29 employed an adaptive harmony search algorithm for feature selection, and an attention-based
bi-directional gated recurrent neural network model for time-based feature classification. Researchers also introduced the LSTM network into LDDoS attack detection because of its good performance on
time sequence data. Zhou et al.30 designed a variational LSTM (VLSTM) deep model for IoT attack detection. It used an encoder-decoder network to reconstruct the feature representation, a
variational reparameterization scheme to learn the key feature representation, and three loss functions to control weight learning. Mushtaq et al.31 proposed a hybrid framework comprising
a deep auto-encoder with LSTM and bi-LSTM for attack detection. In this framework, the auto-encoder was used to obtain optimal features and the LSTMs were used to complete the classification task.
Liu et al.32 proposed a novel, practical and fast ensemble model, FastCBLA-EM. It used a competitive network composed of a 1-D CNN and a bi-LSTM network to learn feature representations of
samples, and then used an arbitration mechanism to weigh the representations for classifying LDDoS attacks. Du et al.33 proposed the NIDS-CNNLSTM model for network security. It combined the
powerful learning ability of the LSTM network on time series data to extract key features, and used a CNN to classify those filtered features. To summarize, the state-of-the-art centralized
methods make use of both CNN and LSTM for feature extraction. But these existing methods use CNN and LSTM sequentially: the CNN is used to extract internal features, and then the LSTM is
used to extract associations based on the output of the CNN. This means important spatio-temporal features may be lost, which is especially unacceptable on a limited, poor-quality dataset.
Thus, in this paper, we propose a novel multi-scale CNN and bi-LSTM arbitration dense network model, MSCBL-ADN, to alleviate the impact of the insufficiency of high-quality attack examples.
The contributions of this paper are as follows:

* The paper introduces a novel fixed _T_-length sliding segmentation data augmentation method that splits raw multi-variate data into multiple equal-length detection blocks in real time. This not only enlarges the set of high-quality attack examples, but also speeds up LDDoS attack detection.
* The paper proposes the MSCBL-ADN model, which incorporates a multi-scale CNN for preliminary spatial feature extraction and an embedding-based bi-LSTM for time relationship extraction, employs an arbitration network that redistributes the weights of key representations for high accuracy, and uses a 2-block dense connection network to perform the final classification.
* The paper compares the MSCBL-ADN model with state-of-the-art models in terms of classification accuracy and time consumption on the ISCX-2016-SlowDos dataset. The experimental results demonstrate the effectiveness of the proposed method.

The rest of the paper is organized as follows. Section “Methods” describes the data augmentation method and the architecture of the MSCBL-ADN model in detail. Section “Results” introduces the
augmentation of the ISCX-2016-SlowDos dataset and the experimental environment, and evaluates performance in terms of detection accuracy and time consumption.
Section “Discussion” covers the analysis of the ablation and comparison experiments. Finally, Section “Conclusions” concludes the paper and briefly suggests future directions.

METHODS

A novel multi-scale CNN and bi-LSTM arbitration dense network model is proposed and explained in detail in this section. As illustrated in Fig. 1, the block diagram includes
data pre-processing and the MSCBL-ADN network model. In the data pre-processing part, the captured packets are organized into different flows according to their flow 5-tuple (source IP, source port,
protocol, destination IP, destination port). Each flow is then segmented into multiple detection blocks of default length _T_, named \(F_x\)-\(SubF_{kx}\). The number of features in
\(F_x\)-\(SubF_{kx}\) is _n_. Since the raw multi-variate data contains spatio-temporal features, \(F_x\)-\(SubF_{kx}\) is split into \(F_x\)-\(SubF_{kx}\)-_S_ for the spatial feature set
and \(F_x\)-\(SubF_{kx}\)-_T_ for the time relationship feature set, which contain \(n_1\) and \(n_2\) features respectively. In the MSCBL-ADN network model part, the spatial features are handled by
a multi-scale CNN with \(1 \times 1\), \(3 \times 3\) and \(5 \times 5\) convolutional kernels. The extracted features are concatenated and flattened into a vector. In parallel, the
bi-LSTM network learns the valid information from the time delta embedding of the \(F_x\)-\(SubF_{kx}\)-_T_ data through forward and backward propagation. After that, the learnt temporal
representation vector is concatenated with the spatial feature vector. An arbitration network then redistributes the weights through a multi-head attention mechanism. As shown, the critical
representations are given heavy weights. Finally, a 1-D dense network with two dense blocks is employed for classification. Overall, several important techniques, such as batch normalization,
the _softmax_ function, cross-entropy loss and dropout, are used in the MSCBL-ADN model to mitigate the over-fitting problem.

FIXED _T_-LENGTH SLIDING SEGMENTATION METHOD

The MSCBL-ADN
model needs to be trained on a limited, poor-quality LDDoS dataset and to run in a real-time network. To that end, the paper introduces a novel fixed _T_-length sliding segmentation method for both
data augmentation and faster detection. The method is illustrated in Fig. 2. It consists of four main tasks. The first task is to sort packets and extract features to form flows
based on the communication quintuple and arrival time. The second task is to segment flows into fixed _T_-length sub-flows. The third task is to pad the sub-flow with flow-agnostic
packets whenever the last sub-flow is shorter than the required length during the training phase, or the expected packets have not arrived yet. Finally, the method splits the multi-variate
sub-flows into a spatial part \(F_x\)-\(SubF_{kx}\)-_S_ and a time relationship part \(F_x\)-\(SubF_{kx}\)-_T_. As a result, the high-quality LDDoS training examples are augmented and LDDoS
attack detection is sped up. In task one, we observe that the packets collected in the dataset and in real networks are interleaved and ordered by arrival time, because the
physical links are shared and multiplexed by flows. However, network attacks are normally aimed at fixed victims, which requires us to distinguish flows. Therefore, we sort packets based on
the communication quintuple and arrival time. Let the number of packets be _m_ and the number of flows be _z_; then \(m=\sum _{i=1}^zl_i\), where \(l_i\) is the length of the \(i^{th}\) flow. Next,
in task two, a sliding window of fixed length _T_ is used to cut the flows into LDDoS attack detection blocks. By default, the stride of the sliding window is set to 1 for data
augmentation purposes. Thus, a new sub-flow is available for detection each time a new packet arrives. Take sub-flow \(F_1\)-\(SubF_1\) as an example: when the new packet “Pkg #1,2,_T_+1” arrives, the
new _T_-length detection block \(F_1\)-\(SubF_2\) is formed, and \(T-1\) of its packets are shared with \(F_1\)-\(SubF_1\). With this method, the MSCBL-ADN model can check whether an LDDoS attack is happening
whenever a new packet arrives in a real network, so detection is clearly accelerated. In task three, the method extends a sub-flow with zero-padding
packets when the number of packets in the sub-flow is less than _T_. Take \(F_2\)-\(SubF_1\) as an example: because it is a new sub-flow, only a few packets have been collected when LDDoS attack detection is triggered,
so the count of collected packets is less than _T_. To meet the input requirement of the MSCBL-ADN model, we must extend the sub-flow. Take
\(F_z\)-\(SubF_{kz}\) as another example: its last packet is agnostic, that is, the count of packets in the last segment may be less than _T_, so we must pad the sub-flow. The
MSCBL-ADN model contains a bi-LSTM network, which is time-consuming, especially when a large feature set is fed into it. However, time-independent features have almost no impact on
detection accuracy32. Therefore, in task four, we split the original multi-variate sub-flows into spatial feature sub-flows \(F_x\)-\(SubF_{kx}\)-_S_ and time relationship sub-flows
\(F_x\)-\(SubF_{kx}\)-_T_. After the above operations, the raw data are reshaped into a tensor of size \(\sum _{i=1}^zk_i \times (n_1+n_2) \times T\), where \(k_i\) is given by Eq. (1), $$\begin{aligned} k_i= {\left\{
\begin{array}{ll} l_i-T+1, &{} l_i \ge T \\ 1, &{} l_i < T \end{array}\right. } \end{aligned}$$ (1) where \(k_i\) is the number of segments of the \(i^{th}\) flow to be trained on or detected, and \(n_1\) and
\(n_2\) are the numbers of spatial and temporal features respectively. In the training set, each sub-flow is labelled with the class of its original flow. The method has
several advantages. First, it increases the frequency of detection: once any new packet arrives, detection can start without waiting for all the packets of the flow. Second,
the multi-variate data is split and fed into the CNN and bi-LSTM paths in parallel, which further reduces time consumption.

MULTI-SCALE CNN NETWORK

LDDoS attacks have the characteristic of periodicity.
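
Looking back at the fixed _T_-length sliding segmentation method, tasks two to four can be sketched in Python as below. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names `segment_flow` and `split_block` are hypothetical, packets are assumed to be already grouped into one flow and encoded as lists of numeric features, and zero-padding packets are appended at the end of a short sub-flow (the text does not say where the padding goes).

```python
from typing import List, Sequence, Tuple

Packet = List[float]  # one packet = one row of n numeric features


def segment_flow(flow: Sequence[Packet], T: int, stride: int = 1) -> List[List[Packet]]:
    """Cut one flow into fixed T-length detection blocks (sub-flows).

    A flow shorter than T yields a single block padded with all-zero packets.
    With stride 1 the block count matches Eq. (1): l - T + 1 if l >= T, else 1.
    """
    if T <= 0 or stride <= 0:
        raise ValueError("T and stride must be positive")
    if not flow:
        return []
    n = len(flow[0])
    if len(flow) < T:
        # Assumption: padding packets are appended after the real packets.
        padding = [[0.0] * n for _ in range(T - len(flow))]
        return [[list(p) for p in flow] + padding]
    return [
        [list(p) for p in flow[start:start + T]]
        for start in range(0, len(flow) - T + 1, stride)
    ]


def split_block(
    block: Sequence[Packet],
    spatial_idx: Sequence[int],
    temporal_idx: Sequence[int],
) -> Tuple[List[Packet], List[Packet]]:
    """Split a block into its spatial part (-S) and time relationship part (-T)."""
    spatial = [[pkt[j] for j in spatial_idx] for pkt in block]
    temporal = [[pkt[j] for j in temporal_idx] for pkt in block]
    return spatial, temporal
```

For a flow of 5 packets and _T_ = 3, `segment_flow` returns 3 blocks, in line with Eq. (1), and consecutive blocks share _T_ - 1 = 2 packets.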
As shown in Fig. 3, the multi-scale CNN is designed to address the insufficient utilization of LDDoS attack packets. It adopts a multi-scale feature fusion extraction channel that
combines skip connections and feature fusion operations. Usually, this kind of periodicity can be analyzed by observing at least 3–5 packets. Take slow-body attacks as an example: once the
connections are established, they keep sending small data packets to the server at an ultra-slow speed. The first packets in these attacks normally have larger content-length values in their HTTP
headers, and the subsequent data packets have almost the same HTTP header but a different HTTP body. Thus, we need to analyze 3–5 packets before making a decision. The \(5 \times 5\)
kernel is used for exactly this kind of analysis. Other LDDoS attacks, such as slow-headers, aim to deplete CPU resources or network connection pools. They attack servers with
endless HTTP headers. These packets have the same HTTP headers and can be analyzed through 1–3 packets. The \(3 \times 3\) kernel is good at handling these cases. The \(1 \times 1\) kernel helps
to add non-linearity to the original data and to increase its dimensionality, which improves the representation ability of the multi-scale CNN. After the above convolutional operations, the extracted
representations are fused through long skip connections and flattened for prediction. Firstly, three convolutional kernels of different sizes, \(1 \times 1\), \(3 \times 3\) and \(5 \times 5\), are used to
extract features from detection blocks. The \(1 \times 1\) convolutional kernel in the first channel is used to obtain detailed features of network packets. In this channel, we use 56
\(1 \times 1\) convolutional kernels. The formula is shown in Eq. (2), $$\begin{aligned} F_1(Y)=W_1 \times F_x\text {-}SubF_{kx}\text {-}S + B_1 \end{aligned}$$ (2) where \(F_1(Y)\) is the
feature map output by the first channel, \(W_1\) denotes the weights of the 56 \(1 \times 1\) convolutional kernels and \(B_1\) represents the bias. Normally, LDDoS attack flows are periodic
multi-variate time series pulses; that is, nearby packets have a strong positive correlation. Thus, we use 28 \(3 \times 3\) convolutional kernels in the second channel and 28 \(5 \times 5\)
convolutional kernels in the third channel to capture the spatial relationships among packets. The formula is shown in Eq. (3), $$\begin{aligned}
F_2(Y)&=W_2 \times F_x\text {-}SubF_{kx}\text {-}S + B_2, \\ F_3(Y)&=W_3 \times F_x\text {-}SubF_{kx}\text {-}S + B_3 \end{aligned}$$ (3) where
\(F_2(Y)\) and \(F_3(Y)\) are the feature maps output by the second and third channels respectively, \(W_2\) and \(W_3\) are the weights of the \(3 \times 3\) and \(5 \times 5\)
convolutional kernels, and \(B_2\) and \(B_3\) represent the biases. Then, a _Concat_ operation combines the feature information at different scales from the second and third
channels. At the same time, we use long skip connections to fuse the multi-scale feature information. This ensures the completeness of the LDDoS attack features extracted from detection blocks,
thereby improving the ability to extract useful high-frequency spatial information. The formula is shown in Eq. (4), $$\begin{aligned}
F_{23}(Y)&=Concat(F_2(Y), F_3(Y)), \\ F(Y)&=W \times (F_1(Y) + F_{23}(Y)) + B \end{aligned}$$ (4) where \(F_{23}(Y)\) is the concatenated feature map
of the second and third channels, whose size is \((56, T, n_1)\). The size of \(F_1(Y)\) is also \((56, T, n_1)\). Thus, we add \(F_{1}(Y)\) and \(F_{23}(Y)\) to obtain _F_(_Y_).
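
As a quick consistency check on the sizes quoted for Eq. (4), the channel counts can be written out explicitly. This is a sketch under two assumptions that the text does not state: each convolution uses stride 1 with zero-padding that preserves the \(T \times n_1\) extent, and _Concat_ stacks feature maps along the channel axis.

$$\begin{aligned} F_2(Y) \in \mathbb {R}^{28 \times T \times n_1}, \quad F_3(Y) \in \mathbb {R}^{28 \times T \times n_1} \quad \Rightarrow \quad F_{23}(Y) \in \mathbb {R}^{(28+28) \times T \times n_1} = \mathbb {R}^{56 \times T \times n_1}. \end{aligned}$$

Under these assumptions \(F_{23}(Y)\) has the same shape as the output of the 56 \(1 \times 1\) kernels of the first channel, which is what makes the element-wise sum \(F_1(Y) + F_{23}(Y)\) well defined.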
In Eq. (4), _W_ denotes the weights and _B_ the bias of this fusion layer. Finally, a _Flatten_ layer transforms the multi-dimensional feature maps into one-dimensional arrays for the subsequent arbitration
network.

TIME DELTA EMBEDDING-BASED BI-LSTM NETWORK

The collected LDDoS attack sub-flows are composed of a sequence of attack packets and can be regarded as time series data. Thus, they can be
processed with an LSTM to obtain time relationship information. However, in our case some packets may be missing from detection blocks under network congestion, so we need to infer whether LDDoS
attacks are happening from the forward and backward directions simultaneously. Naturally, a bi-LSTM network, which stacks two LSTM layers, is an effective means of doing so, as shown in Fig. 4. It can learn
long-term and short-term time relationship information of network packets from both directions. Nevertheless, LSTM-like networks are time-consuming, and the consumed time mainly comes from two
aspects. The first aspect is the recurrent structure. In an LSTM-like network, the calculation at the current time step must wait for the hidden variables from the previous time step. As
a result, the calculations at different time steps are hard to execute in parallel. Thus, we turn to the second aspect, the amount of computation. According to our research, more inputs mean more
computation. We therefore propose a time-delta embedding method that encodes and fuses a limited number of key time-related features as the input from which the hidden variables are computed. These features include ‘arrival time’,
‘begin time’, ‘time delta’ and so on. The ‘arrival time’ is the key feature, as it carries the chronological order of the LDDoS attack. The ‘time to live’ indicates how many networks the packet has
passed through over time, and can be used to infer the ‘begin time’ of attacks. Most importantly, we introduce the ‘time delta’ feature between any two packets in the same sub-flow. It is an
important indicator of LDDoS attacks, as it can indicate whether flows are similar. As the number of input features decreases, the computational workload is greatly reduced. The important components of bi-LSTM
are the LSTM memory cell and the bidirectional self-cycling mechanism. The LSTM memory cell adopts a forget gate \(f_t\), an input gate \(i_t\) and an output gate \(o_t\). \(f_t\) determines what is
essential to retain from the previous state, \(i_t\) determines how to add new information to the cell state, and \(o_t\) defines what the next hidden state should be. In addition,
\(\tilde{C}_t\) represents the candidate update value. Thus, the outputs of the current LSTM cell in the two directions, \(h_t\) and \(h'_t\), can be calculated through Eq. (5), $$\begin{aligned}
h_t&=\sigma (W_ox_t+U_oh_{t-1}+b_o) \times tanh(\sigma (W_fx_t+U_fh_{t-1}+b_f)C_{t-1}+\sigma (W_ix_t+U_ih_{t-1}+b_i)\tilde{C}_{t}),\\
h'_t&=\sigma (W'_ox_t+U'_oh'_{t-1}+b'_o) \times tanh(\sigma (W'_fx_t+U'_fh'_{t-1}+b'_f)C'_{t-1}+\sigma (W'_ix_t+U'_ih'_{t-1}+b'_i)\tilde{C'}_{t})
\end{aligned}$$ (5) where \(x_t\) is the input at time _t_, the \(W\), \(U\) and
\(b\) terms denote the weights and biases of \(f_t\), \(i_t\) and \(o_t\), \(\sigma\) and _tanh_ are activation functions, and \(\times\) means element-wise multiplication. The output \(og_t\) is then
obtained by Eq. (6), $$\begin{aligned} og_t=W_4h_t+W_5h'_t, \end{aligned}$$ (6) and it carries bi-directional feature representations.

MULTI-HEAD ATTENTION ARBITRATION NETWORK

The extracted
features by the multi-scale CNN and the bi-LSTM are concatenated into a vector, and this feature vector is fed into the arbitration network to redistribute its weights. The arbitration network
relies mainly on a multi-head attention mechanism to efficiently extract information from the input feature vector, as depicted in Fig. 5. Firstly, the arbitration network projects the
_Qs_, _Ks_, and _Vs_ _N_ times with different linear models. They are self-attended, that is, the projected _Qs_, _Ks_, and _Vs_ all come from the same extracted feature vector. Then, the _Qs_,
_Ks_, and _Vs_ are fed into the scaled dot-product attention module in parallel and yield the result values by Eq. (7), $$\begin{aligned} Head_i=Softmax\left( \frac{QK^T}{\sqrt{d}}\right) V,
\end{aligned}$$ (7) where \(Head_i\) is the result value of head _i_ (\(i \le N\)) and _d_ represents the dimension of the _Qs_ or _Ks_. Finally, the result values are concatenated and projected
once again, resulting in the arbitration values by Eq. (8), $$\begin{aligned} arbitration=FC(Concat(Head_1, \cdots , Head_N)). \end{aligned}$$ (8) After this arbitration network, the
features extracted upstream by the different network models are weighted, which helps to pick out the important features. The network also has strong parallelism, which speeds up LDDoS detection.

1-D DENSE CLASSIFICATION NETWORK

To further preserve the key LDDoS detection feature representation between layers, direct connections from any layer to its subsequent layers are adopted in a
feed-forward fashion. That is, the output vectors produced in layers 0 to \(l-1\) are concatenated into a single tensor as described in Eq. (9), $$\begin{aligned} x_l=
Concat([x_0, x_1, \cdots , x_{l-1}]), \end{aligned}$$ (9) where \(x_i\) refers to the output vector of layer _i_. There are two dense blocks in our proposal, and we add a \(2\times 2\) average
pooling operation between the blocks to down-sample and shrink the feature vectors. In this 1-D dense network, there are many different paths between layers, which effectively alleviates
the vanishing-gradient problem. At the same time, the number of convolutional kernels can be reduced since features are propagated well between layers. This means the number of
model parameters is substantially reduced, which further speeds up LDDoS detection. To obtain the LDDoS classification, _softmax_ activations are used in the last layer. When compiling the
MSCBL-ADN model, the \(categorical\_crossentropy\) loss function is used to calculate the cost on the training and validation sets, and the _Adam_ optimizer is used to adjust weights and biases
through back propagation.

RESULTS

In this section, the dataset, experimental environment and results are presented. Firstly, the public ISCX-2016-SlowDos dataset is described and revised for
the subsequent experiments. Then, the experimental environment is deployed, the hyperparameters are listed, and the evaluation indicators are described. Finally, the experiments and comparison
experiments are carried out in terms of accuracy and computational time complexity.

DATASET DESCRIPTION, REVISION AND PRE-PROCESSING

To verify the detection accuracy and time performance of
the MSCBL-ADN model, a reliable, multi-class, small-scale and balanced raw-format dataset is needed. The popular low-rate DDoS detection dataset ISCX-2016-SlowDos34, provided by the Canadian
Institute for Cybersecurity, was collected from a dynamic, real and complex network testbed and is stored as raw-format data, which supports the reliability of the collected packets.
In terms of multi-class labelling, the dataset includes different kinds of DDoS packets, such as low-rate and high-volume ones, but the packets are all
labelled as a single malicious type. Even worse, no benign packets were captured in the dataset. For this reason, the dataset without revision is not suitable for verifying the multi-class
detection ability of the MSCBL-ADN model. The other reason for revising the ISCX-2016-SlowDos dataset is to validate the model’s learning ability on a small-scale dataset. As described in the
sections above, novel LDDoS attacks are increasingly hard to capture, so the size of the available training dataset is decreasing accordingly. To simulate this situation, we need to
down-sample the original dataset. Based on the above considerations, we first extract the RUDY, slow-body, slow-headers, and slow-read LDDoS attacks. Taking RUDY as
an example, its target servers are “208.113.162.153” and some other IP addresses listed in the official documents34. Thus, we apply the display filter “ip.src==208.113.162.153 or
ip.dst==208.113.162.153” in the Wireshark tool to extract the raw RUDY network packets. The same method is applied to the other IP addresses and the other LDDoS attack types. Then, we keep the hulk attack in this
dataset as the high-volume DDoS attack and choose packets from the 1999 DARPA dataset35 as benign ones for the purpose of multi-class detection. The first-week training data of the 1999 DARPA
dataset does not contain any attacks35. Its packets are stored as raw-format data, consistent with those in the ISCX-2016-SlowDos dataset, so the two can easily be intermixed
later. Next, we sort the packets by communication quintuple and segment them into flows by the TCP SYN, FIN, and RESET flags. Finally, a random counter and a packet counter are used to
generate the revised dataset. The random counter selects a certain flow, and the packet counter performs the down-sampling and keeps the size of the dataset under control. By
setting the packet counter for every attack, a small-scale revised dataset is obtained. The details of the revised dataset are shown in Table 1. After obtaining the revised dataset, the packets in
flows are pre-processed. The one-hot encoding is applied to transform the categorical features into numeric features, such as protocol type, and the max-min normalization is performed for
the purpose of preventing gradient explosion during model training. To learn patterns of flows, the flows are further split into sub-flows (samples) and labeled by the fixed _T_-length
sliding segmentation method. The detailed operation can refer to section Methods. In this way, the conventional packet-based features are formed to many fixed length two-dimensional
sub-flows, by which the model can learn LDDoS attack patterns based on current and previous \((T-1)\) packets. The size of sub-flows is with 85.71% overlap, since the default stride is set
to 1. The experimental results of the MSCBL-ADN model are discussed for both binary-class and multi-class scenarios. In the binary-class scenario, we label both hulk and benign
samples as the normal class and all others as the LDDoS class. In the multi-class scenario, the attacks keep their original labels. Finally, the dataset is randomly partitioned
into two subsets at a ratio of 80%:20% using the train_test_split() function of the sklearn library. The 80% subset is used for training, and the 20% subset for testing. After these data
operations, we analyze whether the data is balanced. As shown in Fig. 6, the numbers of LDDoS and normal sub-flows are almost equal in the binary-class scenario. The numbers of
samples for the different LDDoS attacks are between 3600 and 5650 in the multi-class scenario, which is suitable for training and testing. Accordingly, we also down-sample the benign samples to 5000 for the multi-class
experiments. EXPERIMENTAL SETUP AND EVALUATION INDICATORS The experiments are conducted on an Ubuntu deep learning server with an Intel(R) Xeon(R) Silver 4216 CPU, 128 GB RAM and two Nvidia
GeForce RTX 3080 graphics cards. The proposed model is implemented in Python 3.6.2 using the Keras library with TensorFlow as the backend. The key tuning experiment regarding the
selection of the fixed length _T_ is shown in Fig. 7. Normally, some kinds of LDDoS attacks based on the TCP three-way handshake finish within 3 or 4 packets. Other kinds of LDDoS attacks,
such as slow-read attacks, may need about 7 packets. Thus, the tuning range of _T_ should theoretically be from 3 to 8. However, the MSCBL-ADN model has a multi-scale CNN
network that can extract feature representations among the 5 most recently arrived packets. As a result, the range of _T_ shrinks to [6, 7, 8]. The tuning result shows that the accuracy of
the model is best when _T_ is set to 7. The accuracy is as high as 96.74%, and its curve lies above the others. At the same time, we know that the average flow size is 7 after the packets are sorted
per communication quintuple in Table 1. That is, a sliding window of size 7 can segment most flows into sub-flows without padding flow-agnostic packets. This reduces the impact of
additional operations. Once _T_ is greater than or equal to 8, some segments could contain too many flow-agnostic packets for the attack to be detected. In other words, even if the largest
5\(\times\)5 convolutional kernel is used, it cannot extract any useful information. This may cause the multi-scale CNN network in the MSCBL-ADN model to fail. Additionally, in a real
network, the packet collection time increases, since the model has to wait for the 8th packet before detection. Overall, the chosen and recommended _T_ is 7. The MSCBL-ADN model
parameters and settings are shown in Table 2. To evaluate the performance of the model, three evaluation metrics (accuracy, precision, and recall rate) based on the confusion matrix and one CPU/GPU
elapsed-time indicator (time complexity) are adopted. The confusion matrix contains the elements _TP_, _FN_, _FP_, and _TN_. With the normal class taken as positive, _TP_ is the number of normal sub-flows classified as normal traffic, and _TN_ is
the number of abnormal sub-flows classified as abnormal traffic. _FN_ is the number of normal sub-flows classified as abnormal ones, while _FP_ is the number of abnormal
sub-flows classified as normal ones. From the confusion matrix, accuracy, precision, and recall rate are calculated by Eq. (10). Each experiment is run 10 times with StratifiedKFold, and the
average values of above estimated metrics are used as the results. $$\begin{aligned} \text{accuracy}&=\frac{TP+TN}{TP+TN+FP+FN},\\
\text{precision}&=\frac{TP}{TP+FP},\\ \text{recall rate}&=\frac{TP}{TP+FN}. \end{aligned}$$ (10) EXPERIMENTS AND COMPARABLE EXPERIMENTS EXPERIMENTAL RESULTS:
CLASSIFICATION AND TIME INDICATORS Figure 8 shows the accuracy and loss curves on the revised ISCX-2016-SlowDos dataset. In the binary-class scenario, the accuracy jumps to 98.53% during
the first several epochs and then increases smoothly and steadily to 98.99%. The loss also behaves well, decreasing gradually to 0.0366. In the multi-class scenario, the accuracy grows quickly
from 89.16% to 95.05% in its first four epochs and then increases slowly to 96.90%. At the same time, its loss curve is smooth, and the final loss is 0.1108. These results indicate that the MSCBL-ADN
model performs well in classifying network LDDoS attacks. Benefiting from its multi-channel structure, it can be trained for many epochs. Figure 9 shows the confusion
matrices of the MSCBL-ADN model on the 20% testing set. Figure 9a covers the binary-class scenario, where the accuracy, precision, and recall rate are 98.78%, 98.82%, and 98.93%, respectively. The
_FP_ and _FN_ are both small, which indicates that the model has good classification ability. Figure 9b covers the multi-class scenario. Here the MSCBL-ADN model has a higher but tolerable error rate
between RUDY and slow-body. The reason is that RUDY is a kind of slow-body attack launched by the RUDY tool, so its features are similar to those of slow-body. The overall accuracy, precision, and recall rate
are 96.74%, 96.77%, and 96.74%, respectively. This indicates that the MSCBL-ADN model is well suited to LDDoS attack detection. The time consumption experiments are run on the Ubuntu server described above. The GPU
elapsed time is 5.831 ms per batch of 32 sub-flows, and the time consumption increases linearly with the number of sub-flows to be detected. Thus, the MSCBL-ADN model can be deployed in a real
network for the detection task. PERFORMANCE COMPARISON EXPERIMENTS In the comparison experiments, we compare the proposed MSCBL-ADN model with the shallow classifiers RF18, SVM19 and KNN20 on
the testing set. We also compare it with CNN-based models (CNN-BMECapSA-RF25, LRDADF27), LSTM-based models (VLSTM30), and CNN-LSTM-based models (AsyncFL-bLAM16, NIDS-CNNLSTM33) during the training
and testing phases. Figure 10 displays the validation accuracy and training loss in the multi-class scenario over 100 epochs. The loss curve of the CNN-based
CNN-BMECapSA-RF25 model is clearly unstable. This suggests that LDDoS attacks have strong temporal dependencies, whose features are hard for a CNN-based model to extract. The LRDADF27 model
is enhanced by a deep sparse autoencoder. As a result, its loss curve is smoother than those of the other CNN-based models. The LSTM-based models look better: their accuracies all exceed 96%, and
their losses are lower than 0.15. The CNN-LSTM-based models improve the accuracies and decrease the losses further. As Fig. 10 shows, our proposed model has the best performance. Figure 11 shows
the binary-class training scenario. As noted above, we merge the hulk and benign samples into normal sub-flows. As a result, the VLSTM30 model spends many epochs learning temporal
representations. Its accuracies and losses only become normal around the 80th epoch. Its training accuracies and losses for the first 80 epochs are not shown in Fig. 11, since the values are outside the display range.
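The label merging used for the binary-class scenario can be illustrated with a minimal sketch. It is not the authors' code: the label strings and the helper name `to_binary_label` are hypothetical, chosen only to show how hulk and benign samples fall into the normal class while the four slow attacks fall into the LDDoS class.

```python
# Hypothetical label names; the paper does not give the exact strings.
NORMAL_SOURCES = {"benign", "hulk"}


def to_binary_label(multi_label: str) -> str:
    """Map a multi-class label to the binary-class scenario.

    Hulk (high-volume DDoS) and benign samples are merged into "normal";
    every other attack type is treated as "lddos".
    """
    return "normal" if multi_label.lower() in NORMAL_SOURCES else "lddos"


multi_labels = ["benign", "hulk", "rudy", "slow-body", "slow-headers", "slow-read"]
binary_labels = [to_binary_label(label) for label in multi_labels]
print(list(zip(multi_labels, binary_labels)))
```

Under this mapping the normal class mixes two very different traffic patterns, which is one plausible reading of why a purely temporal model needs more epochs to converge in this scenario.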
In contrast, the accuracy curves of the CNN-based models are smooth, which suggests that sparse features are also important for detecting LDDoS attacks. The proposed model is slightly better than
the other compared models, with an accuracy of 99.04% and a loss of 0.0057 on the training set. The shallow machine learning classifiers have no ability to analyze time steps. Thus, we treat a
single sub-flow as a training/testing example. As listed in Table 3, we compare the proposed model with classic machine learning classifiers and deep learning models on the testing set in the
multi-class scenario. The experimental results show that the key indicators of RF18, SVM19 and KNN20 are significantly lower than those of the deep learning models. However, their detection is faster
than that of the deep learning models, since they have only a few parameters for classification. Among the deep learning models, the CNN-based models have faster detection speed but
lower detection accuracy. In contrast, the LSTM-based models are slower. In the CNN-LSTM-based models, such as AsyncFL-bLAM16 and NIDS-CNNLSTM33, the researchers enhanced the LSTM
part, so their accuracy and detection time improve slightly. The testing results demonstrate that the MSCBL-ADN model performs slightly better than the other state-of-the-art
models, with an accuracy as high as 96.74% and a detection time as low as 5.831 ms per batch. DISCUSSION According to the results of the experiments and comparison experiments, the proposed MSCBL-ADN
model has high accuracy, precision, and recall rate, and an acceptable detection time. This benefits from the data augmentation method, the multi-scale CNN network, the time-delta embedding-based bi-LSTM
network, and the multi-head attention arbitration network. We discuss them through ablation studies. The first experiment verifies the effectiveness of the fixed _T_-length sliding
segmentation data augmentation method. It is conducted with all modules on both the original dataset and the augmented dataset. As shown in Fig. 12, on the original scarce data, the accuracy and loss
of the MSCBL-ADN model are both degraded: its final accuracy is lower than 95%, and its loss is as high as 0.26. When we augment the original dataset, the training data is expanded almost 10
times. The augmented dataset is better suited for training the model, since its accuracy and loss curves both improve steadily. This experiment shows that the proposed data augmentation method is
effective. The experimental results in Fig. 13 concern the effectiveness of the multi-scale CNN network. We conduct the experiments without a CNN network, with a CNN network, and with the
multi-scale CNN network. Without a CNN network, the proposed model can learn to detect LDDoS attacks, but the accuracy is low. With a CNN network, sparse representations can be
learned for detection. Some LDDoS attacks, such as slow-read attacks, need more than one packet to launch an attack. That is, we need CNN kernels of different scales to learn
representations among packets. As a result, the multi-scale CNN network model has the best accuracy and the smallest loss. The third group of experiments verifies the time-delta
embedding-based bi-LSTM network. We compare the accuracies and losses in the experiments without an LSTM network, with an LSTM network, with a bi-LSTM network, and with the embedding-based bi-LSTM
network. The packets of LDDoS attacks have a strong positive temporal correlation. As shown in Fig. 14, without an LSTM network, the accuracy curve is volatile, and the loss curve grows after
dozens of epochs. With an LSTM network, the curves look better. With a bi-LSTM network, the model can learn from both the preceding and the following packets. This alleviates the impact of missed
packets in sub-flows and improves the accuracy further. When we embed the time-delta information between packets into the bi-LSTM network, the model is enhanced further. According to the experimental results, our
proposed network is more accurate and effective. The arbitration network also helps considerably. The representations from the multi-scale CNN network and the embedding-based bi-LSTM network are extracted
in parallel and independently. Thus, an arbitration network is needed to decide which representations are key. As illustrated in Fig. 15, the performance is better with it. To save
detection time, the fixed _T_-length sliding segmentation method and the padding technique allow LDDoS detection to start as soon as any packet arrives. This increases the frequency
of detection in a real network. In the bi-LSTM network part, we only allow time-related features as inputs. Thus, the number of bi-LSTM neurons can be small, which reduces the number of
training parameters and speeds up detection. The MHA structure handles the input representations in parallel, which makes use of multiple GPUs and saves time. With the above techniques
and tricks, the MSCBL-ADN model obtains LDDoS detection results in an acceptable amount of computational time. CONCLUSIONS LDDoS attacks are a new kind of worldwide network security issue.
Currently, only a few small public datasets are available for training AI models. In this paper, we have designed and verified an effective LDDoS attack detection model under the constraint of
scarce data, called MSCBL-ADN. It augments the dataset by a novel fixed _T_-length sliding segmentation data augmentation method. It incorporates a multi-scale CNN and an embedding-based bi-LSTM
for key feature extraction, and employs an arbitration network to redistribute the weights of key representations. The experimental results demonstrate that the MSCBL-ADN model performs well
in both the binary-class and multi-class scenarios. In the near future, we wish to enhance the MSCBL-ADN model through two tasks. The first task is to try replacing the LSTM-like network with an attention
network to reduce the detection time further. At the same time, the experimental results show a high error rate between hulk and slow-body attacks. Thus,
the other task is to analyze the detailed reason and retrain the enhanced MSCBL-ADN model. DATA AVAILABILITY The datasets analysed during the current study are available on the Canadian
Institute for Cybersecurity webpage https://www.unb.ca/cic/datasets/dos-dataset.html, and on the MIT webpage
http://www.ll.mit.edu/r-d/datasets/1999-darpa-intrusion-detection-evaluation-dataset.
REFERENCES
* Tang, D., Zhang, S., Yan, Y., Chen, J. & Qin, Z. Real-time detection and mitigation of ldos attacks in the SDN using the HGB-FP algorithm. _IEEE Trans. Serv. Comput._ 15, 3471–3484. https://doi.org/10.1109/TSC.2021.3102046 (2022).
* BlackNurse. Blacknurse-it can bring you down. BlackNurse (2018).
* Fortinet. Fortinet predicts highly destructive and self-learning “swarm” cyberattacks in 2018. Fortinet (2018).
* Liu, Z., Yin, X. & Hu, Y. CPSS lr-ddos detection and defense in edge computing utilizing DCNN q-learning. _IEEE Access_ 8, 42120–42130. https://doi.org/10.1109/ACCESS.2020.2976706 (2020).
* Wang, R., Ma, C. & Wu, P. An intrusion detection method based on federated learning and convolutional neural network. _Netinfo Secur._ https://doi.org/10.3969/j.issn.1671-1122.2020.04.006 (2020).
* Li, B. _et al._ Deepfed: Federated deep learning for intrusion detection in industrial cyber-physical systems. _IEEE Trans. Industr. Inf._ 17, 5615–5624. https://doi.org/10.1109/TII.2020.3023430 (2021).
* Mothukuri, V. _et al._ Federated-learning-based anomaly detection for iot security attacks. _IEEE Internet Things J._ 9, 2545–2554. https://doi.org/10.1109/JIOT.2021.3077803 (2022).
* Idrissi, M. J. _et al._ Fed-anids: Federated learning for anomaly-based network intrusion detection systems. _Expert Syst. Appl._ 234, 121000. https://doi.org/10.1016/j.eswa.2023.121000 (2023).
* Wu, W. & Zhang, Y. An efficient intrusion detection method using federated transfer learning and support vector machine with privacy-preserving. _Intell. Data Anal._ 27, 1121–1141. https://doi.org/10.3233/IDA-226617 (2023).
* de Carvalho Bertoli, G., Júnior, L. A. P., Saotome, O. & dos Santos, A. L. Generalizing intrusion detection for heterogeneous networks: A stacked-unsupervised federated learning approach. _Comput. Secur._ 127, 103106. https://doi.org/10.1016/j.cose.2023.103106 (2023).
* Tang, D., Wang, S., Liu, B., Jin, W. & Zhang, J. GASF-IPP: Detection and mitigation of ldos attack in SDN. _IEEE Trans. Serv. Comput._ 16, 3373–3384. https://doi.org/10.1109/TSC.2023.3266757 (2023).
* Zhao, R., Yin, Y., Shi, Y. & Xue, Z. Intelligent intrusion detection based on federated learning aided long short-term memory. _Phys. Commun._ 42, 101157. https://doi.org/10.1016/j.phycom.2020.101157 (2020).
* Huong, T. T. _et al._ Detecting cyberattacks using anomaly detection in industrial control systems: A federated learning approach. _Comput. Ind._ 132, 1–16. https://doi.org/10.1016/j.compind.2021.103509 (2021).
* Zhang, X., Ma, Z., Wang, A., Mi, H. & Hang, J. Lstfcfedlear: A LSTM-FC with vertical federated learning network for fault prediction. _Wirel. Commun. Mob. Comput._ 1–10, 2021. https://doi.org/10.1155/2021/2668761 (2021).
* Wang, X., Liu, J. & Zhang, C. Network intrusion detection based on multi-domain data and ensemble-bidirectional LSTM. _EURASIP J. Inf. Secur._ 2023, 5. https://doi.org/10.1186/s13635-023-00139-y (2023).
* Liu, Z., Guo, C., Liu, D. & Yin, X. An asynchronous federated learning arbitration model for low-rate ddos attack detection. _IEEE Access_ 11, 18448–18460. https://doi.org/10.1109/ACCESS.2023.3247512 (2023).
* Rahman, S. A., Tout, H., Talhi, C. & Mourad, A. Internet of things intrusion detection: Centralized, on-device, or federated learning?. _IEEE Network_ 34, 310–317. https://doi.org/10.1109/MNET.011.2000286 (2020).
* Jiang, J., Wang, Q., Shi, Z., Lv, B. & Qi, B. RST-RF: A hybrid model based on rough set theory and random forest for network intrusion detection. In Proceedings of the 2nd International Conference on Cryptography, Security and Privacy, ICCSP 2018, Guiyang, China, March 16-19, 2018, 77–81, https://doi.org/10.1145/3199478.3199489 (ACM, 2018).
* Kaushik, R., Singh, V. & Kumari, R. Multi-class svm based network intrusion detection with attribute selection using infinite feature selection technique. _J. Discrete Math. Sci. Cryptogr._ 24, 2137–2153. https://doi.org/10.1080/09720529.2021.2009189 (2021).
* de Miranda Rios, V., Inácio, P. R. M., Magoni, D. & Freire, M. M. Detection of reduction-of-quality ddos attacks using fuzzy logic and machine learning algorithms. _Comput. Netw._ 186, 107792. https://doi.org/10.1016/j.comnet.2020.107792 (2021).
* Tang, D., Yan, Y., Zhang, S., Chen, J. & Qin, Z. Performance and features: Mitigating the low-rate tcp-targeted dos attack via SDN. _IEEE J. Sel. Areas Commun._ 40, 428–444. https://doi.org/10.1109/JSAC.2021.3126053 (2022).
* Tang, D., Wang, X., Li, X., Vijayakumar, P. & Kumar, N. AKN-FGD: adaptive kohonen network based fine-grained detection of ldos attacks. _IEEE Trans. Dependable Secur. Comput._ 20, 273–287. https://doi.org/10.1109/TDSC.2021.3131531 (2023).
* Muhammad, A., Hidayatul, K., Wibawati & Lee, M. H. Support vector data description with kernel density estimation (svdd-kde) control chart for network intrusion monitoring. Sci. Rep. 13, 1–12, https://doi.org/10.1038/s41598-023-46719-3 (2023).
* Zhou, X., Liang, W., Shimizu, S., Ma, J. & Jin, Q. Siamese neural network based few-shot learning for anomaly detection in industrial cyber-physical systems. _IEEE Trans. Industr. Inf._ 17, 5790–5798. https://doi.org/10.1109/TII.2020.3047675 (2021).
* Asgharzadeh, H., Ghaffari, A., Masdari, M. & Gharehchopogh, F. S. Anomaly-based intrusion detection system in the internet of things using a convolutional neural network and multi-objective enhanced capuchin search algorithm. _J. Parallel Distrib. Comput._ 175, 1–21. https://doi.org/10.1016/j.jpdc.2022.12.009 (2023).
* Ren, K., Yuan, S., Zhang, C., Shi, Y. & Huang, Z. CANET: A hierarchical cnn-attention model for network intrusion detection. _Comput. Commun._ 205, 170–181. https://doi.org/10.1016/j.comcom.2023.04.018 (2023).
* Venkateshwarlu, V., Ranjith, D. & Raju, A. Lrdadf: An ai enabled framework for detecting low-rate ddos attacks in cloud computing environments. In 2023 Fifth International Conference on Electrical, Computer and Communication Technologies (ICECCT), 1–8, https://doi.org/10.1109/ICECCT56650.2023.10179834 (2023).
* Salahuddin, M. A., Pourahmadi, V., Alameddine, H. A., Bari, M. F. & Boutaba, R. Chronos: Ddos attack detection using time-based autoencoder. IEEE Transactions on Network and Service Management 1–1, https://doi.org/10.1109/TNSM.2021.3088326 (2021).
* Mansour, R. F. Artificial intelligence based optimization with deep learning model for blockchain enabled intrusion detection in cps environment. _Sci. Rep._ 12, 1–14. https://doi.org/10.1038/s41598-022-17043-z (2022).
* Zhou, X., Hu, Y., Liang, W., Ma, J. & Jin, Q. Variational LSTM enhanced anomaly detection for industrial big data. _IEEE Trans. Ind. Inform._ 17, 3469–3477. https://doi.org/10.1109/TII.2020.3022432 (2021).
* Mushtaq, E., Zameer, A., Umer, M. & Abbasi, A. A. A two-stage intrusion detection system with auto-encoder and lstms. _Appl. Soft Comput._ 121, 108768. https://doi.org/10.1016/j.asoc.2022.108768 (2022).
* Liu, Z., Yu, J., Yan, B. & Wang, G. A deep 1-d CNN and bidirectional LSTM ensemble model with arbitration mechanism for lddos attack detection. _IEEE Transact. Emerg. Top. Comput. Intell._ 6, 1396–1410. https://doi.org/10.1109/TETCI.2022.3170515 (2022).
* Du, J., Yang, K., Hu, Y. & Jiang, L. NIDS-CNNLSTM: Network intrusion detection classification model based on deep learning. _IEEE Access_ 11, 24808–24821. https://doi.org/10.1109/ACCESS.2023.3254915 (2023).
* Jazi, H. H., Gonzalez, H., Stakhanova, N. & Ghorbani, A. A. Detecting http-based application layer dos attacks on web servers in the presence of sampling. _Comput. Netw._ 121, 25–36. https://doi.org/10.1016/j.comnet.2017.03.018 (2017).
* MIT. Darpa intrusion detection evaluation dataset. MIT (1999).
ACKNOWLEDGEMENTS This work was supported in part by the Key Technologies R&D Program of Weifang under Grant 2023GX063 and
2021GX056, in part by the Foundation for the Talents by the Shandong Vocational College of Science and Technology, in part by the Foundation for the Talents by the Weifang University of
Science and Technology under Grant KJRC2021002, in part by the Natural Science Foundation of Shandong Province under Grant ZR2021MF086, and in part by the Key R&D Program of Shandong
Province under Grant 2019GNC106034. AUTHOR INFORMATION AUTHORS AND AFFILIATIONS * Shandong Provincial University Laboratory for Protected Horticulture, Weifang Key Laboratory of Blockchain
on Agricultural Vegetables, Weifang University of Science and Technology, Weifang, 262700, China Xiaochun Yin * Computer Science and Engineering College, Weifang University of Science and
Technology, Weifang, 262700, China Wei Fang & Deyong Liu * School of Information Engineering, Shandong Vocational College of Science and Technology, Weifang, 261053, China Zengguang Liu
CONTRIBUTIONS X.Y. conceived the methodology and wrote the manuscript, Z.L. conceived the experiments. W.F., Z.L. and D.L. analyzed the results. All authors reviewed
the manuscript. CORRESPONDING AUTHOR Correspondence to Zengguang Liu. ETHICS DECLARATIONS COMPETING INTERESTS The authors declare no competing interests. ADDITIONAL INFORMATION
PUBLISHER'S NOTE Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. RIGHTS AND PERMISSIONS OPEN ACCESS This article
is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give
appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in
this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative
Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a
copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. ABOUT THIS ARTICLE CITE THIS ARTICLE Yin, X., Fang, W., Liu, Z. _et al._ A novel multi-scale
CNN and Bi-LSTM arbitration dense network model for low-rate DDoS attack detection. _Sci Rep_ 14, 5111 (2024). https://doi.org/10.1038/s41598-024-55814-y * Received: 05
October 2023 * Accepted: 28 February 2024 * Published: 01 March 2024 * DOI: https://doi.org/10.1038/s41598-024-55814-y KEYWORDS * Arbitration mechanism * Dense connection * Embedding-based bi-LSTM * LDDoS attacks * Multi-scale CNN * Network security