Efficient <italic>(k, &#x03B1;)</italic>-Maximal-Cliques Enumeration Over Uncertain Graphs

A maximal clique (<inline-formula> <tex-math notation="LaTeX">${MC}$ </tex-math></inline-formula>) is a complete subgraph satisfying that no other cliques can take it as their proper subgraph. Given an uncertain graph, the <inline-formula> <tex-math notatio...

Descripción completa

Guardado en:
Detalles Bibliográficos
Autores principales: Jing Bai, Junfeng Zhou, Ming Du, Peng Zhong
Formato: article
Lenguaje:EN
Publicado: IEEE 2021
Materias:
Acceso en línea:https://doaj.org/article/b554445da06c4e41a07fa51d01a42232
Etiquetas: Agregar Etiqueta
Sin Etiquetas, Sea el primero en etiquetar este registro!
Descripción
Sumario:A maximal clique (<inline-formula> <tex-math notation="LaTeX">${MC}$ </tex-math></inline-formula>) is a complete subgraph satisfying that no other cliques can take it as their proper subgraph. Given an uncertain graph, the <inline-formula> <tex-math notation="LaTeX">$top\text{-}K\,\,{MC}\text{s}$ </tex-math></inline-formula> enumeration problem studies how to return <inline-formula> <tex-math notation="LaTeX">$k {MC}\text{s}$ </tex-math></inline-formula> with the highest rank value. Existing algorithms rank <inline-formula> <tex-math notation="LaTeX">${MC}\text{s}$ </tex-math></inline-formula> according to their probabilities, thus usually return <inline-formula> <tex-math notation="LaTeX">${MC}\text{s}$ </tex-math></inline-formula> with higher probabilities but less number of vertices, and fail to return large <inline-formula> <tex-math notation="LaTeX">${MC}\text{s}$ </tex-math></inline-formula> that convey more useful information. Considering this problem, this paper studies the problem of enumerating <inline-formula> <tex-math notation="LaTeX">$top\text{-}K\,\,{MC}\text{s}$ </tex-math></inline-formula>. Our approach returns <inline-formula> <tex-math notation="LaTeX">$k {MC}\text{s}$ </tex-math></inline-formula> with the most number of vertices satisfying that their probabilities <inline-formula> <tex-math notation="LaTeX">$\geq \alpha $ </tex-math></inline-formula>, where each <inline-formula> <tex-math notation="LaTeX">${MC}$ </tex-math></inline-formula> is called an <inline-formula> <tex-math notation="LaTeX">$\alpha $ </tex-math></inline-formula>- <inline-formula> <tex-math notation="LaTeX">${MC}$ </tex-math></inline-formula>, and computing <inline-formula> <tex-math notation="LaTeX">$k$ </tex-math></inline-formula> largest <inline-formula> <tex-math notation="LaTeX">$\alpha $ </tex-math></inline-formula>- <inline-formula> <tex-math notation="LaTeX">${MC}\text{s}$ </tex-math></inline-formula> is called as <inline-formula> <tex-math notation="LaTeX">$(k,\alpha)$ </tex-math></inline-formula>- <inline-formula> <tex-math notation="LaTeX">${MC}\text{s}$ </tex-math></inline-formula>. We propose an efficient <inline-formula> <tex-math notation="LaTeX">$(k,\alpha)$ </tex-math></inline-formula>- <inline-formula> <tex-math notation="LaTeX">${MC}\text{s}$ </tex-math></inline-formula> enumeration algorithm, <inline-formula> <tex-math notation="LaTeX">$Top\text{-}KMC$ </tex-math></inline-formula>, which works in three steps, including partition, enumeration and verification. Here, partition means that we compute the set <inline-formula> <tex-math notation="LaTeX">$\mathcal {M}$ </tex-math></inline-formula> of all <inline-formula> <tex-math notation="LaTeX">${MC}\text{s}$ </tex-math></inline-formula> without considering the probability information, as if the graph is partitioned into a set of subgraphs. Enumeration means that we compute <inline-formula> <tex-math notation="LaTeX">$\alpha $ </tex-math></inline-formula>- <inline-formula> <tex-math notation="LaTeX">${MC}\text{s}$ </tex-math></inline-formula> from each <inline-formula> <tex-math notation="LaTeX">${MC}$ </tex-math></inline-formula> of <inline-formula> <tex-math notation="LaTeX">$\mathcal {M}$ </tex-math></inline-formula>. As each such subgraph is an <inline-formula> <tex-math notation="LaTeX">${MC}$ </tex-math></inline-formula>, the cost of computing common neighbors for finding <inline-formula> <tex-math notation="LaTeX">$\alpha $ </tex-math></inline-formula>- <inline-formula> <tex-math notation="LaTeX">${MC}\text{s}$ </tex-math></inline-formula> can be reduced. Verification means that we need to verify whether an <inline-formula> <tex-math notation="LaTeX">$\alpha $ </tex-math></inline-formula>- <inline-formula> <tex-math notation="LaTeX">${MC}$ </tex-math></inline-formula> is a subgraph of another <inline-formula> <tex-math notation="LaTeX">$\alpha $ </tex-math></inline-formula>- <inline-formula> <tex-math notation="LaTeX">${MC}$ </tex-math></inline-formula>. If not, it is an <inline-formula> <tex-math notation="LaTeX">$\alpha $ </tex-math></inline-formula>- <inline-formula> <tex-math notation="LaTeX">${MC}$ </tex-math></inline-formula>; otherwise, it is a useless <inline-formula> <tex-math notation="LaTeX">$\alpha $ </tex-math></inline-formula>- <inline-formula> <tex-math notation="LaTeX">${MC}$ </tex-math></inline-formula> and should be removed. We further propose an optimized algorithm <inline-formula> <tex-math notation="LaTeX">$Top\text{-}KMC+$ </tex-math></inline-formula> to reduce both time and space by merging the above three steps into a whole step. The experimental results on real datasets show that both <inline-formula> <tex-math notation="LaTeX">$Top\text{-}KMC$ </tex-math></inline-formula> and <inline-formula> <tex-math notation="LaTeX">$Top\text{-}KMC+$ </tex-math></inline-formula> can return <inline-formula> <tex-math notation="LaTeX">$k$ </tex-math></inline-formula> largest <inline-formula> <tex-math notation="LaTeX">$\alpha $ </tex-math></inline-formula>- <inline-formula> <tex-math notation="LaTeX">${MC}\text{s}$ </tex-math></inline-formula> efficiently.