The rows represent attack sources and the column represent
the targeted networks (attack victims). An ``X'' in the table cell
indicates that the corresponding source has reportedly attacked the
corresponding network. Consider *s*_{2} and *s*_{7}.
Although they have attacked the same number of victims, from the
viewpoint of *v*_{1}, one may say that *s*_{2}
is more likely to attack than *s*_{7}, because *s*_{2}
has attacked *v*_{2}, which shares more common attackers
with *v*_{1}. Now compare the source *s*_{5}
to *s*_{7}. Both sources attacked only one network. None
of these networks share common attacks with *v*_{1}.
However, for *v*_{1}, *s*_{5} and *s*_{7}
are not equal. Notice that *v*_{2} shares common attacks
with *v*_{1}, and *v*_{3} shares common
attacks with *v*_{2}. A path *v*_{3} – *v*_{2} – *v*_{1}
connects *s*_{5}
to *v*_{1}. One may say that *s*_{5} is
more likely to attack for *v*_{1}.

**Figure 1:**Correlation Graph

We model the attack correlation as a graph.
The correlation graph is a weighted
directed graph *G* = (*V*, *E*). The nodes in the
graph are the victims, i.e. *V* = {*v*_{1}, *v*_{2}, ...}
There is an edge from node *v*_{i}
to node *v*_{j} if *v*_{i} is correlated
with *v*_{j}. The weight on the edge is proportional to
the strength of this correlation. Figure 1
shows the correlation graph for the victims in Table 1.

We rank an attack source, with respect to a victim, using the source's
probability to attack the victim. We estimate this probabilities in the
following way: suppose we have an estimation on source *s*'s
probability of attacking victim *v*_{i}. Following the
outgoing edges of *v*_{i}, A fraction of this probability
can be distributed to the neighbors of *v*_{i} in the
graph. Each neighbor receives a share of this probability that is
proportional to its strength of correlation with *v*_{i}
(i.e., proportional to the weight of the edge from *v*_{i}
to that neighbor.) Suppose *v*_{j} is one of these
neighbors in the correlation graph. A fraction of the probability
received by *v*_{j} is then further distributed, in the
similar fashion, to its neighbors. The propagation of probability
continues until the estimations for each victim reach a stable state.

Such a probability-propagation process can be simulated by a random
walk on the correlation graph.
Let *P*^{s}(*v*) be the estimate of the total
probability that *s* attacks *v*. Let *W*_{ij}
be the correlation strength from victim *v*_{j} to *v*_{i}
and *B*^{s}(*v*_{i}) be an initial
estimation based on whether *s* attacks *v*_{i} in
the attack table. The stable distribution of the random walk is the following:

The full HPB system implements many other practical considerations for
blacklisting. However the source ranking and the probability estimation
by random walk are the heart of the system. Our experiments show that
the HPB exhibits a higher hit count than traditional blacklists for
most of the contributors. The experiments also show that HPB's performance is
consistent over time,
and these advantages remain stable across various list lengths and
predict windows.