Group list of lists by shared elements

Thanks for the suggestions, everyone, I guess I just needed the right vocabulary! Since under the linked answers, a couple of people asked for code to implement all of this, I thought I’d post an answer for future reference. Apparently, the concept of strongly connected components is not defined for non-directed graphs, so the solution is to look for connected components.

For my answer, I adjusted the code found here: https://www.geeksforgeeks.org/connected-components-in-an-undirected-graph/

It just requires reformulating l has integers, rather than strings:

class Graph:
    # init function to declare class variables
    def __init__(self, V):
        self.V = V
        self.adj = [[] for i in range(V)]
 
    def DFSUtil(self, temp, v, visited):
 
        # Mark the current vertex as visited
        visited[v] = True
 
        # Store the vertex to list
        temp.append(v)
 
        # Repeat for all vertices adjacent
        # to this vertex v
        for i in self.adj[v]:
            if visited[i] == False:
 
                # Update the list
                temp = self.DFSUtil(temp, i, visited)
        return temp
 
    # method to add an undirected edge
    def addEdge(self, v, w):
        self.adj[v].append(w)
        self.adj[w].append(v)
 
    # Method to retrieve connected components
    # in an undirected graph
    def connectedComponents(self):
        visited = []
        cc = []
        for i in range(self.V):
            visited.append(False)
        for v in range(self.V):
            if visited[v] == False:
                temp = []
                cc.append(self.DFSUtil(temp, v, visited))
        return cc

Now we can run

l = [[0, 1], 
 [0, 2], 
 [1, 2],
 [2, 3],  
 [4, 5], 
 [5, 6], 
 [7, 8]]


g = Graph(
    max([item for sublist in l for item in sublist])+1
)

for sl in l:
    g.addEdge(sl[0], sl[1])
cc = g.connectedComponents()
print("Following are connected components")
print(cc)

And we get:

Following are connected components
[[0, 1, 2, 3], [4, 5, 6], [7, 8]]

We can then go back and group the original list:

result = []
for sublist in cc:
    bucket = [x for x in l if any(y in x for y in sublist)]
    result.append(bucket)

Output:

[[[0, 1], [0, 2], [1, 2], [2, 3]], [[4, 5], [5, 6]], [[7, 8]]]

CLICK HERE to find out more related problems solutions.

Leave a Comment

Your email address will not be published.

Scroll to Top