Get where an element appears inside each sublist in Python

This can be done cleanly in a single pass over the original data by building an intermediate structure: a dict keyed by base, whose values are dicts keyed by sublist index, whose values in turn are the lists of positions at which that base occurs in that sublist (see the sample in the comment below). We then transform it into the final list, structured the way you want:

from collections import defaultdict

db = [['C', 'A', 'G', 'A', 'A', 'G', 'T'], ['T', 'G', 'A', 'C', 'A', 'G'], ['G', 'A', 'A', 'G', 'T']]

bases = ['C', 'A', 'G', 'T']
by_base = {base: defaultdict(list) for base in bases}

for sublist_idx, sublist in enumerate(db):
    for base_idx, base in enumerate(sublist):
        by_base[base][sublist_idx].append(base_idx)
        
# by_base looks like {'C': defaultdict(<class 'list'>, {0: [0], 1: [3]}), ...}


final_list = [(base, list(occurrences.items())) for base, occurrences in by_base.items()]

which gives us:

print(final_list)
# [('C', [(0, [0]), (1, [3])]), ('A', [(0, [1, 3, 4]), (1, [2, 4]), (2, [1, 2])]), ('G', [(0, [2, 5]), (1, [1, 5]), (2, [0, 3])]), ('T', [(0, [6]), (1, [0]), (2, [4])])]
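
If the set of bases is not known in advance, the same single-pass idea can be wrapped in a function that uses a nested defaultdict, so keys are created as items are encountered. This is only a sketch: the function name `positions_by_item` is made up for illustration and is not part of the code above. Note that here items appear in order of first occurrence rather than in the order of a predefined `bases` list.

```python
from collections import defaultdict


def positions_by_item(sublists):
    """Return [(item, [(sublist_index, [positions, ...]), ...]), ...].

    Hypothetical helper: same approach as above, but without a
    predefined list of bases.
    """
    # Outer key: item; inner key: sublist index; value: positions in that sublist.
    by_item = defaultdict(lambda: defaultdict(list))
    for sublist_idx, sublist in enumerate(sublists):
        for item_idx, item in enumerate(sublist):
            by_item[item][sublist_idx].append(item_idx)
    # Convert the nested dicts into the list-of-tuples structure.
    return [(item, list(occurrences.items())) for item, occurrences in by_item.items()]


db = [['C', 'A', 'G', 'A', 'A', 'G', 'T'],
      ['T', 'G', 'A', 'C', 'A', 'G'],
      ['G', 'A', 'A', 'G', 'T']]

result = positions_by_item(db)
print(result[0])
# ('C', [(0, [0]), (1, [3])])
```

One difference from the version with an explicit `bases` list: a base that never occurs in the data does not appear in the result at all, whereas the original code would include it with an empty list of occurrences.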
