Forum Archive

Natural sorting of values in nested dictionary

Niklas

Hi!

I have a nested dictionary with string values containing both letters and numbers. I want to sort them using natural sorting (”M1, M2, M10, Sö202” instead of ”M1, M10, M2, Sö202”).

Despite reading a lot on StackOverflow and other places, I still haven’t managed to get it right. I find parts of answers here and there, but some are for Python 2, some Python 3, some use modules I don’t have, and so on.

This is a very small part of the dictionary:


dict_nested = {'Tynderö 163': {'lamnings_id_raa': 'L1934:1839', 'fid': 184092, 'sv_runinskr_fnrtdb': 'Sö 9', 'sv_runinskr_trim': 'Sö9', 'obj_id_fnrtdb': '', 'signum_fnrtdb': 'Sö 9 $'}, 'Härnösand 148': {'lamnings_id_raa': 'L1934:2591', 'fid': 150041, 'sv_runinskr_fnrtdb': 'Sö 11', 'sv_runinskr_trim': 'Sö11', 'obj_id_fnrtdb': '', 'signum_fnrtdb': 'Sö 11 +'}, 'Njurunda 807': {'lamnings_id_raa': 'L1934:374', 'fid': 149317, 'sv_runinskr_fnrtdb': 'U 215', 'sv_runinskr_trim': 'U215', 'obj_id_fnrtdb': '', 'signum_fnrtdb': 'U 215 $'}, 'Skön 70:1': {'sv_runinskr_fnrtdb': 'M 5', 'sv_runinskr_trim': 'M5', 'signum_fnrtdb': 'M 5', 'lamnings_id_raa': {}, 'fid': {}, 'obj_id_fnrtdb': '10249200700001'}, 'Skön 70:2': {'sv_runinskr_fnrtdb': 'M 16', 'sv_runinskr_trim': 'M16', 'signum_fnrtdb': 'M 16', 'lamnings_id_raa': {}, 'fid': {}, 'obj_id_fnrtdb': '10249200700002'}, 'Skog 7:4': {'lamnings_id_raa': 'L1935:1941', 'fid': 42683, 'sv_runinskr_fnrtdb': '', 'sv_runinskr_trim': '', 'obj_id_fnrtdb': '', 'signum_fnrtdb': ''}, 'Njurunda 173:1': {'sv_runinskr_fnrtdb': 'M 10', 'sv_runinskr_trim': 'M10', 'signum_fnrtdb': 'M 10 $', 'lamnings_id_raa': {}, 'fid': {}, 'obj_id_fnrtdb': '10248101730001'}, 'Njurunda 116:1': {'sv_runinskr_fnrtdb': 'M 1', 'sv_runinskr_trim': 'M1', 'signum_fnrtdb': 'M 1 $', 'lamnings_id_raa': {}, 'fid': {}, 'obj_id_fnrtdb': '10248101160001'}}


I want to sort the values for the ”sv_runinskr_trim” key. This is something I need to do all the time. Can anyone help me make a breakthrough here? Ideally I would also like to understand what I’m doing. 🙂

Best regards, Niklas

(The dictionary is for a website about Swedish rune stones.)

mikael

@Niklas, this is easier if we can sort on sv_runinskr_fnrtdb. If so:

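def sort_key(value):
    # Each item is a (key, inner dict) tuple from dict_nested.items()
    key, dct = value
    raw_sort = dct['sv_runinskr_fnrtdb']
    if not raw_sort:
        # Entries without a signum sort first
        return ('', 0)
    # 'M 10 $' -> letter 'M', number '10', extras ['$']
    letter, number, *extras = raw_sort.split()
    try:
        number = int(number)
    except ValueError:
        # Second part was not a plain integer
        number = 0
    return (letter, number)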


# Turn the dict into tuples
result = [
    (key, value)
    for key, value
    in dict_nested.items()
]

# Sort them
result.sort(key=sort_key)

# Print keys to check that we got it right
for key, value in result:
    print(key, ' - ', value['sv_runinskr_fnrtdb'])

mikael

@Niklas, let me know what explanation is needed. Mainly this takes advantage of the fact that Python can sort tuples, so once we turn ”M 2 $” into (”M”, 2), it takes care of the rest.
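
To illustrate the tuple point above, here is a minimal sketch with a few made-up signum strings (the list `signa` and the helper `as_tuple` are only for this example, not part of the code above). It compares plain string sorting with sorting on a (letters, number) tuple:

```python
# Plain string sorting compares character by character,
# so 'M 10' lands before 'M 2'.
signa = ['M 2', 'M 10', 'Sö 9', 'M 1']
plain = sorted(signa)

def as_tuple(signum):
    # 'M 10' -> ('M', 10); tuples compare element by element,
    # so the letters decide first and the integer breaks ties.
    letters, number = signum.split()
    return (letters, int(number))

natural = sorted(signa, key=as_tuple)

print(plain)    # ['M 1', 'M 10', 'M 2', 'Sö 9']
print(natural)  # ['M 1', 'M 2', 'M 10', 'Sö 9']
```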

7upser

I had a similar problem. I needed a German-sorted list, and locale didn't work in Pythonista, so I had to write my own sort function. It should solve your problem.
But @mikael was faster than light.


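import re


def myOwnSort(vInput):
    # vInput is a top-level key; look up the value to sort on
    vInput = dict_nested[vInput]['sv_runinskr_trim']
    # First run of digits, zero-padded to 3 characters (0 if there is none)
    vInNr = re.findall(r'\d+', vInput)
    vInNr = int(vInNr[0]) if len(vInNr) > 0 else 0
    vInNr = '{0:0>3}'.format(vInNr)

    # First run of non-digits ('' if there is none)
    vInChar = re.findall(r'\D+', vInput)
    vInChar = vInChar[0] if len(vInChar) > 0 else ''

    # Number first, letters second, e.g. 'Sö9' -> '009Sö'
    vNewSortKey = vInNr + vInChar

    return vNewSortKey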


TheOneAndOnlyNewAndPrivateSortedDictionaryWithMyOwnSortKey = sorted(dict_nested, key = myOwnSort)


for i in TheOneAndOnlyNewAndPrivateSortedDictionaryWithMyOwnSortKey:
    print(dict_nested[i]['sv_runinskr_trim'])

Niklas

Thanks to both of you! 🙂

@mikael, I think I may be able to use sorting on the “sv_runinskr_fnrtdb” values. As far as I can see, it will work for the remaining part of the dictionary as well. I also think I understand what is going on in your code. 🙂

@7upser, I like that your solution uses the “sv_runinskr_trim” values. However, it seems that it sorts on numbers only. Since all of my values start with letters (and not just M, Sö, and U, as in the example), I also want them sorted on the letters. Therefore your solution will not work in my particular case. I like your variable naming skills, though. 😉

mikael

@Niklas, if you need or want to use the trim values instead, here’s a regex alternative you can apply in the sort key function:

import re

# 1+ non-digits followed by 1+ digits
expr = re.compile(r'([^\d]+)(\d+)')
match = expr.search('Sö1')
letters, number = match.group(1), int(match.group(2))
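
As a sketch of how the regex above might be plugged into a sort key, assuming empty trim values should sort first (as in the earlier sort_key). The names `sample`, `SIGNUM` and `trim_sort_key` are made up for this example, and `sample` is a cut-down stand-in for dict_nested:

```python
import re

# Hypothetical miniature of dict_nested with only the key being sorted on
sample = {
    'Tynderö 163': {'sv_runinskr_trim': 'Sö9'},
    'Härnösand 148': {'sv_runinskr_trim': 'Sö11'},
    'Skog 7:4': {'sv_runinskr_trim': ''},
    'Njurunda 173:1': {'sv_runinskr_trim': 'M10'},
    'Njurunda 116:1': {'sv_runinskr_trim': 'M1'},
}

# 1+ non-digits followed by 1+ digits
SIGNUM = re.compile(r'(\D+)(\d+)')

def trim_sort_key(item):
    name, record = item
    match = SIGNUM.search(record['sv_runinskr_trim'])
    if match is None:
        # Empty or unparseable values sort first (an assumption)
        return ('', 0)
    return (match.group(1), int(match.group(2)))

ordered = sorted(sample.items(), key=trim_sort_key)
trims = [record['sv_runinskr_trim'] for name, record in ordered]
print(trims)  # ['', 'M1', 'M10', 'Sö9', 'Sö11']
```

Note that the letters are compared as ordinary Python strings here, not by Swedish alphabetical rules.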

7upser

@mikael, as I wrote: FTL :)

@Niklas, my solution sorts on:
First: 3-digit numbers with leading zeros
Second: characters

These are the sort keys (not sorted):

009Sö
011Sö
215U
005M
016M
000
010M
001M

They all have different numbers, so the characters never get to decide the order; they would only break ties between equal numbers.

You can change the regex to whatever you need.
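
A rough sketch of what such a change could look like: the same zero-padded string key, with an optional switch that puts the letters in front of the number. The function `padded_key` and its `letters_first` parameter are invented for this illustration, and the list holds the trim values from the example dictionary:

```python
import re

def padded_key(trim, letters_first=False):
    digits = re.findall(r'\d+', trim)
    letters = re.findall(r'\D+', trim)
    # Zero-pad to 3 characters, e.g. 9 -> '009' (0 if no digits)
    number = '{0:0>3}'.format(int(digits[0]) if digits else 0)
    prefix = letters[0] if letters else ''
    # 'Sö9' -> '009Sö' (number first) or 'Sö009' (letters first)
    return prefix + number if letters_first else number + prefix

trims = ['Sö9', 'Sö11', 'U215', 'M5', 'M16', '', 'M10', 'M1']

by_number = sorted(trims, key=padded_key)
by_letters = sorted(trims, key=lambda t: padded_key(t, letters_first=True))

print(by_number)   # ['', 'M1', 'M5', 'Sö9', 'M10', 'Sö11', 'M16', 'U215']
print(by_letters)  # ['', 'M1', 'M5', 'M10', 'M16', 'Sö9', 'Sö11', 'U215']
```

Padding to 3 characters only orders numbers correctly up to 999; larger numbers would need a wider pad, or a tuple key with a real integer.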
