# Python: Removing lower versions from a 2D list

You’re using a second loop to check whether the current “key” already exists in the list, and that’s what is slowing down your code. Why? Because checking for membership in a list is slow. Really slow: Python has to iterate over the entire list, which makes it an O(N) operation, so the time grows linearly with the size of the list.
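You can see this cost directly with a quick micro-benchmark (the sizes and repeat counts here are arbitrary, just picked to make the difference visible): a membership test on a list scans every element, while the same test on a set (or dict) is a hash lookup.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)

# Worst case for the list: the element we look for is at the very end.
list_time = timeit.timeit(lambda: (n - 1) in as_list, number=100)
set_time = timeit.timeit(lambda: (n - 1) in as_set, number=100)

print(list_time > set_time)  # The hash-based lookup wins by a wide margin
```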

Instead, you could simply change the list to a dictionary. Lookup in a dictionary is an O(1) operation, so the lookup happens in constant (or nearly constant) time regardless of the size of the dictionary. When you do this, there’s no longer a need for two loops. Here’s an idea:

```python
def remove_duplicates_new(duplicate):
    final_dict = {}
    case_sensitive_keys = {}
    for k, v in duplicate:
        klower = k.lower()
        vint = int(v)
        old_val = final_dict.get(klower, 0)  # Default to zero if the key doesn't exist yet
        if vint > old_val:
            # Replace if the current value is greater than the old value
            final_dict[klower] = vint
            case_sensitive_keys[klower] = k

    # Now we're done looping, so build the list
    final_list = [[case_sensitive_keys[k], str(v)] for k, v in final_dict.items()]
    return final_list
```
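To make the behavior concrete, here’s what the function returns on a small hand-made input (the input pairs are made up for illustration; the function definition is repeated so the snippet runs on its own, and the output order relies on dicts preserving insertion order, which is guaranteed in Python 3.7+):

```python
def remove_duplicates_new(duplicate):
    final_dict = {}
    case_sensitive_keys = {}
    for k, v in duplicate:
        klower = k.lower()
        vint = int(v)
        old_val = final_dict.get(klower, 0)  # Default to zero if the key doesn't exist yet
        if vint > old_val:
            final_dict[klower] = vint
            case_sensitive_keys[klower] = k

    return [[case_sensitive_keys[k], str(v)] for k, v in final_dict.items()]

pairs = [['Foo', '1'], ['foo', '3'], ['Bar', '2'], ['BAR', '1']]
print(remove_duplicates_new(pairs))  # [['foo', '3'], ['Bar', '2']]
```

Note that the key kept for each group is the spelling that came with the highest version, so `'foo'` (version 3) wins over `'Foo'` (version 1).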

To compare, let’s make a test list with 10000 elements. The “keys” are random numbers between 1 and 100, so we’re bound to get a whole bunch of duplicates:

```python
import random
import timeit

testList = [[str(random.randint(1, 100)), str(random.randint(1, 10))] for i in range(10000)]

timeit.timeit('remove_duplicates(testList)', setup='from __main__ import testList, remove_duplicates', number=10)
# Output: 1.1064800999999989

timeit.timeit('remove_duplicates_new(testList)', setup='from __main__ import testList, remove_duplicates_new', number=10)
# Output: 0.03743689999998878
```

Hot damn! That’s a ~30x speedup!