Sorting with two digits in string - Python

Sorting with two digits in string - Python - python

I am new to Python and I have a hard time solving this.
I am trying to sort a list to be able to human sort it 1) by the first number and 2) the second number. I would like to have something like this:
'1-1bird'
'1-1mouse'
'1-1nmouses'
'1-2mouse'
'1-2nmouses'
'1-3bird'
'10-1birds'
(...)
Those numbers can be from 1 to 99 ex: 99-99bird is possible.
This is the code I have after a couple of headaches. Being able to then sort by the following first letter would be a bonus.
Here is what I've tried:
#!/usr/bin/python
myList = list()
myList = ['1-10bird', '1-10mouse', '1-10nmouses', '1-10person', '1-10cat', '1-11bird', '1-11mouse', '1-11nmouses', '1-11person', '1-11cat', '1-12bird', '1-12mouse', '1-12nmouses', '1-12person', '1-13mouse', '1-13nmouses', '1-13person', '1-14bird', '1-14mouse', '1-14nmouses', '1-14person', '1-14cat', '1-15cat', '1-1bird', '1-1mouse', '1-1nmouses', '1-1person', '1-1cat', '1-2bird', '1-2mouse', '1-2nmouses', '1-2person', '1-2cat', '1-3bird', '1-3mouse', '1-3nmouses', '1-3person', '1-3cat', '2-14cat', '2-15cat', '2-16cat', '2-1bird', '2-1mouse', '2-1nmouses', '2-1person', '2-1cat', '2-2bird', '2-2mouse', '2-2nmouses', '2-2person']
def mysort(x,y):
x1=""
y1=""
for myletter in x :
if myletter.isdigit() or "-" in myletter:
x1=x1+myletter
x1 = x1.split("-")
for myletter in y :
if myletter.isdigit() or "-" in myletter:
y1=y1+myletter
y1 = y1.split("-")
if x1[0]>y1[0]:
return 1
elif x1[0]==y1[0]:
if x1[1]>y1[1]:
return 1
elif x1==y1:
return 0
else :
return -1
else :
return -1
myList.sort(mysort)
print myList
Thanks !
Martin

You have some good ideas with splitting on '-' and using isalpha() and isdigit(), but then we'll use those to create a function that takes in an item and returns a "clean" version of the item, which can be easily sorted. It will create a three-digit, zero-padded representation of the first number, then a similar thing with the second number, then the "word" portion (instead of just the first character). The result looks something like "001001bird" (that won't display - it'll just be used internally). The built-in function sorted() will use this callback function as a key, taking each element, passing it to the callback, and basing the sort order on the returned value. In the test, I use the * operator and the sep argument to print it without needing to construct a loop, but looping is perfectly fine as well.
def callback(item):
phrase = item.split('-')
first = phrase[0].rjust(3, '0')
second = ''.join(filter(str.isdigit, phrase[1])).rjust(3, '0')
word = ''.join(filter(str.isalpha, phrase[1]))
return first + second + word
Test:
>>> myList = ['1-10bird', '1-10mouse', '1-10nmouses', '1-10person', '1-10cat', '1-11bird', '1-11mouse', '1-11nmouses', '1-11person', '1-11cat', '1-12bird', '1-12mouse', '1-12nmouses', '1-12person', '1-13mouse', '1-13nmouses', '1-13person', '1-14bird', '1-14mouse', '1-14nmouses', '1-14person', '1-14cat', '1-15cat', '1-1bird', '1-1mouse', '1-1nmouses', '1-1person', '1-1cat', '1-2bird', '1-2mouse', '1-2nmouses', '1-2person', '1-2cat', '1-3bird', '1-3mouse', '1-3nmouses', '1-3person', '1-3cat', '2-14cat', '2-15cat', '2-16cat', '2-1bird', '2-1mouse', '2-1nmouses', '2-1person', '2-1cat', '2-2bird', '2-2mouse', '2-2nmouses', '2-2person']
>>> print(*sorted(myList, key=callback), sep='\n')
1-1bird
1-1cat
1-1mouse
1-1nmouses
1-1person
1-2bird
1-2cat
1-2mouse
1-2nmouses
1-2person
1-3bird
1-3cat
1-3mouse
1-3nmouses
1-3person
1-10bird
1-10cat
1-10mouse
1-10nmouses
1-10person
1-11bird
1-11cat
1-11mouse
1-11nmouses
1-11person
1-12bird
1-12mouse
1-12nmouses
1-12person
1-13mouse
1-13nmouses
1-13person
1-14bird
1-14cat
1-14mouse
1-14nmouses
1-14person
1-15cat
2-1bird
2-1cat
2-1mouse
2-1nmouses
2-1person
2-2bird
2-2mouse
2-2nmouses
2-2person
2-14cat
2-15cat
2-16cat

You need leading zeros. Strings are sorted alphabetically with the order different from the one for digits. It should be
'01-1bird'
'01-1mouse'
'01-1nmouses'
'01-2mouse'
'01-2nmouses'
'01-3bird'
'10-1birds'
As you you see 1 goes after 0.

The other answers here are very respectable, I'm sure, but for full credit you should ensure that your answer fits on a single line and uses as many list comprehensions as possible:
import itertools
[''.join(r) for r in sorted([[''.join(x) for _, x in
itertools.groupby(v, key=str.isdigit)]
for v in myList], key=lambda v: (int(v[0]), int(v[2]), v[3]))]
That should do nicely:
['1-1bird',
'1-1cat',
'1-1mouse',
'1-1nmouses',
'1-1person',
'1-2bird',
'1-2cat',
'1-2mouse',
...
'2-2person',
'2-14cat',
'2-15cat',
'2-16cat']

Related

How to sort a list of strings delimited by '.' with also numbers in the middle?

I have a list of strings that contain commands separated by a dot . like this:
DeviceA.CommandA.1.Hello,
DeviceA.CommandA.2.Hello,
DeviceA.CommandA.11.Hello,
DeviceA.CommandA.3.Hello,
DeviceA.CommandB.1.Hello,
DeviceA.CommandB.1.Bye,
DeviceB.CommandB.What,
DeviceA.SubdeviceA.CommandB.1.Hello,
DeviceA.SubdeviceA.CommandB.2.Hello,
DeviceA.SubdeviceB.CommandA.1.What
And I would want to order them in natural order:
The order must prioritize by field index (e.g The commands that start with DeviceA will always go before DeviceB etc)
Order alphabetically the strings
When it finds a number sort numerically in ascending order
Therefore, the sorted output should be:
DeviceA.CommandA.1.Hello,
DeviceA.CommandA.2.Hello,
DeviceA.CommandA.3.Hello,
DeviceA.CommandA.11.Hello,
DeviceA.CommandB.1.Bye,
DeviceA.CommandB.1.Hello,
DeviceA.SubdeviceA.CommandB.1.Hello,
DeviceA.SubdeviceA.CommandB.2.Hello,
DeviceA.SubdeviceB.CommandA.What,
DeviceB.CommandB.What
Also note that the length of the command fields is dynamic, the number of fields separated by dot can be any size.
So far I tried this without luck (the numbers are order alphabetically, for example 11 goes before 5):
list = [
"DeviceA.CommandA.1.Hello",
"DeviceA.CommandA.2.Hello",
"DeviceA.CommandA.11.Hello",
"DeviceA.CommandA.3.Hello",
"DeviceA.CommandB.1.Hello",
"DeviceA.CommandB.1.Bye",
"DeviceB.CommandB.What",
"DeviceA.SubdeviceA.CommandB.1.Hello",
"DeviceA.SubdeviceA.CommandB.2.Hello",
"DeviceA.SubdeviceB.CommandA.1.What"
]
sorted_list = sorted(list, key=lambda x: x.split('.'))
EDIT: Corrected typo error.

Something like this should get you going.
from pprint import pprint
data_list = [
"DeviceA.CommandA.1.Hello",
"DeviceA.CommandA.2.Hello",
"DeviceA.CommandA.3.Hello",
"DeviceA.CommandB.1.Hello",
"DeviceA.CommandB.1.Bye",
"DeviceB.CommandB.What",
"DeviceA.SubdeviceA.CommandB.1.Hello",
"DeviceA.SubdeviceA.CommandB.15.Hello", # added test case to ensure numbers are sorted numerically
"DeviceA.SubdeviceA.CommandB.2.Hello",
"DeviceA.SubdeviceB.CommandA.1.What",
]
def get_sort_key(s):
# Turning the pieces to integers would fail some comparisons (1 vs "What")
# so instead pad them on the left to a suitably long string
return [
bit.rjust(30, "0") if bit.isdigit() else bit
for bit in s.split(".")
]
# Note the key function must be passed as a kwarg.
sorted_list = sorted(data_list, key=get_sort_key)
pprint(sorted_list)
The output is
['DeviceA.CommandA.1.Hello',
'DeviceA.CommandA.2.Hello',
'DeviceA.CommandA.3.Hello',
'DeviceA.CommandB.1.Bye',
'DeviceA.CommandB.1.Hello',
'DeviceA.SubdeviceA.CommandB.1.Hello',
'DeviceA.SubdeviceA.CommandB.2.Hello',
'DeviceA.SubdeviceA.CommandB.15.Hello',
'DeviceA.SubdeviceB.CommandA.1.What',
'DeviceB.CommandB.What']

Specifying a key in sorted seems to achieve what you want:
import re
def my_key(s):
n = re.search("\d+",s)
return (s[:n.span()[0]], int(n[0])) if n else (s,)
print(sorted(l, key = my_key))
Output:
['DeviceA.CommandA.1.Hello', 'DeviceA.CommandA.2.Hello', 'DeviceA.CommandA.3.Hello', 'DeviceA.CommandA.11.Hello', 'DeviceA.CommandB.1.Hello', 'DeviceA.CommandB.1.Bye', 'DeviceA.SubdeviceA.CommandB.1.Hello', 'DeviceA.SubdeviceA.CommandB.2.Hello', 'DeviceA.SubdeviceB.CommandA.1.What', 'DeviceB.CommandB.What']

There are many ways to achieve this. Here's one that doesn't rely on importation of any additional modules:
LOS = ['DeviceA.CommandA.1.Hello',
'DeviceA.CommandA.2.Hello',
'DeviceA.CommandA.11.Hello',
'DeviceA.CommandA.3.Hello',
'DeviceA.CommandB.1.Hello',
'DeviceA.CommandB.1.Bye',
'DeviceB.CommandB.What',
'DeviceA.SubdeviceA.CommandB.1.Hello',
'DeviceA.SubdeviceA.CommandB.2.Hello',
'DeviceA.SubdeviceB.CommandA.1.What']
def func(s):
tokens = s.split('.')
for i, token in enumerate(tokens):
try:
v = int(token)
return ('.'.join(tokens[0:i]), v)
except ValueError:
pass
return (s, 0)
print(sorted(LOS, key=func))

selecting a value using a conditional statement based on a list of tuples

I have a list of tuples converted from a dictionary. I am looking to compare a conditional value against the list of tuples(values) whether it is higher or lower starting from the beginning on the list. When this conditional value is lower than a tuple's(value) I want to use that specific tuple for further coding.
Please can somebody give me an insight into how this is achieved?
I am relatively new to coding, self-learning and I am not 100% sure the example would run but for the sake of demonstrating I have tried my best.
`tuple_list = [(12:00:00, £55.50), (13:00:00, £65.50), (14:00:00, £75.50), (15:00:00, £45.50), (16:00:00, £55.50)]
conditional_value = £50
if conditional_value != for x in tuple_list.values()
y = 0
if conditional_value < tuple_list(y)
y++1
else
///"return the relevant value from the tuple_list to use for further coding. I would be
looking to work with £45.50"///`
Thank you.

Just form a new list with a condition:
tuple_list = [("12:00:00", 55.50), ("13:00:00", 65.50), ("14:00:00", 75.50), ("15:00:00", 45.50), ("16:00:00", 55.50)]
threshold = 50
below = [tpl for tpl in tuple_list if tpl[1] < threshold]
print(below)
Which yields
[('15:00:00', 45.5)]
Note that I added quotation marks and removed the currency sign to be able to compare the values. If you happen to have the £ in your actual values, you'll have to preprocess (stripping) them before.

If I'm understanding your question correctly, this should be what you're looking for:
for key, value in tuple_list:
if conditional_value < value:
continue # Skips to next in the list.
else:
# Do further coding.

You can use
tuple_list = [("12:00:00", 55.50), ("13:00:00", 65.50), ("14:00:00", 75.50), ("15:00:00", 45.50), ("16:00:00", 55.50)]
conditional_value = 50
new_tuple_list = list(filter(lambda x: x[1] > conditional_value, tuple_list))
This code will return a new_tuple_list with all items that there value us greater then the conditional_value.

How to break protein sequence into equal size of 20 in python

I want to break the structure of protein into the chunks of 20 equal size
the structure of the protein is something like this
MASTEGANNMPKQVEVRMHDSHLGSEEPKHRHLGLRLCDKLGKNLLLTLTVFGVILGAVCGGLLRLASPI
HPDVVMLIAFPGDILMRMLKMLILPLIISSLITGLSGLDAKASGRLGTRAMVYYMSTTIIAAVLGVILVL
AIHPGNPKLKKQLGPGKKNDEVSSLDAFLDLIRNLFPENLVQACFQQIQTVTKKVLVAPPPDEEANATSA
VVSLLNETVTEVPEETKMVIKKGLEFKDGMNVLGLIGFFIAFGIAMGKMGDQAKLMVDFFNILNEIVMKL
VIMIMWYSPLGIACLICGKIIAIKDLEVVARQLGMYMVTVIIGLIIHGGIFLPLIYFVVTRKNPFSFFAG
IFQAWITALGTASSAGTLPVTFRCLEENLGIDKRVTRFVLPVGATINMDGTALYEAVAAIFIAQMNGVVL
DGGQIVTVSLTATLASVGAASIPSAGLVTMLLILTAVGLPTEDISLLVAVDWLLDRMRTSVNVVGDSFGA
GIVYHLSKSELDTIDSQHRVHEDIEMTKTQSIYDDMKNHRESNSNQCVYAAHNSVIVDECKVTLAANGKS
ADCSVEEEPWKREK
I have tried the by iterating loop
x="abfgjjhuyuryitfvbkjuhhgyuumnabcdfrfhghhoiutgfctrdgfvijnk"
length=len(x)
values= [length/20+1]
a=1
for i in length(a,x)
print(i)
but this is not working

Try this by importing the textwrap
import textwrap
myArray="MASTEGANNMPKQVEVRMHDSHLGSEEPKHRHLGLRLCDKLGKNLLLTLTVFGVILGAVCGGLLRLASPIHPDVVMLIAFPGDILMRMLKMLILPLIISSLITGLSGLDAKASGRLGTRAMVYYMSTTIIAAVLGVILVLAIHPGNPKLKKQLGPGKKNDEVSSLDAFLDLIRNLFPENLVQACFQQIQTVTKKVLVAPPPDEEANATSAVVSLLNETVTEVPEETKMVIKKGLEFKDGMNVLGLIGFFIAFGIAMGKMGDQAKLMVDFFNILNEIVMKLVIMIMWYSPLGIACLICGKIIAIKDLEVVARQLGMYMVTVIIGLIIHGGIFLPLIYFVVTRKNPFSFFAGIFQAWITALGTASSAGTLPVTFRCLEENLGIDKRVTRFVLPVGATINMDGTALYEAVAAIFIAQMNGVVLDGGQIVTVSLTATLASVGAASIPSAGLVTMLLILTAVGLPTEDISLLVAVDWLLDRMRTSVNVVGDSFGAGIVYHLSKSELDTIDSQHRVHEDIEMTKTQSIYDDMKNHRESNSNQCVYAAHNSVIVDECKVTLAANGKSADCSVEEEPWKREK"
list_string = str(myArray)
textwrap.wrap(list_string, 20)
the output is something like this!
['MASTEGANNMPKQVEVRMHD',
'SHLGSEEPKHRHLGLRLCDK',
'LGKNLLLTLTVFGVILGAVC',
'GGLLRLASPIHPDVVMLIAF',
'PGDILMRMLKMLILPLIISS',
'LITGLSGLDAKASGRLGTRA',
'MVYYMSTTIIAAVLGVILVL',
'AIHPGNPKLKKQLGPGKKND',
'EVSSLDAFLDLIRNLFPENL',
'VQACFQQIQTVTKKVLVAPP',
'PDEEANATSAVVSLLNETVT',
'EVPEETKMVIKKGLEFKDGM',
'NVLGLIGFFIAFGIAMGKMG',
'DQAKLMVDFFNILNEIVMKL',
'VIMIMWYSPLGIACLICGKI',
'IAIKDLEVVARQLGMYMVTV',
'IIGLIIHGGIFLPLIYFVVT',
'RKNPFSFFAGIFQAWITALG',
'TASSAGTLPVTFRCLEENLG',
'IDKRVTRFVLPVGATINMDG',
'TALYEAVAAIFIAQMNGVVL',
'DGGQIVTVSLTATLASVGAA',
'SIPSAGLVTMLLILTAVGLP',
'TEDISLLVAVDWLLDRMRTS',
'VNVVGDSFGAGIVYHLSKSE',
'LDTIDSQHRVHEDIEMTKTQ',
'SIYDDMKNHRESNSNQCVYA',
'AHNSVIVDECKVTLAANGKS',
'ADCSVEEEPWKREK']

Something like this would do the trick:
values = [x[i:i+20] for i in range(0, len(x), 20)]
Just as a reference:
x[a:b] takes a slice of the string x from the index a up to (but not including) index b, therefore x[i:i+20] takes a slice of size twenty starting from index i.
range(a, b, step) would generate a sequence of numbers from a up to (but not including) b with increments of step.

You could use re.findall, after first removing all whitespace, e.g.:
inp = """MASTEGANNMPKQVEVRMHDSHLGSEEPKHRHLGLRLCDKLGKNLLLTLTVFGVILGAVCGGLLRLASPI
HPDVVMLIAFPGDILMRMLKMLILPLIISSLITGLSGLDAKASGRLGTRAMVYYMSTTIIAAVLGVILVL
AIHPGNPKLKKQLGPGKKNDEVSSLDAFLDLIRNLFPENLVQACFQQIQTVTKKVLVAPPPDEEANATSA"""
inp = re.sub(r'\s+', '', inp)
chunks = re.findall(r'.{1,20}', inp)
This prints:
['MASTEGANNMPKQVEVRMHD',
'SHLGSEEPKHRHLGLRLCDK',
'LGKNLLLTLTVFGVILGAVC',
'GGLLRLASPIHPDVVMLIAF',
'PGDILMRMLKMLILPLIISS',
'LITGLSGLDAKASGRLGTRA',
'MVYYMSTTIIAAVLGVILVL',
'AIHPGNPKLKKQLGPGKKND',
'EVSSLDAFLDLIRNLFPENL',
'VQACFQQIQTVTKKVLVAPP',
'PDEEANATSA']

You could use something like this :
## With protein your string containing the data
size_of_split = 20
splited_protein = list(map(''.join, zip(*[iter(protein)]*size_of_split)))
It uses zip() in a way that is described in its documentation.
To quote it :
[...] clustering a data series into n-length groups using zip(*[iter(s)]*n).
This repeats the same iterator n times so that each output tuple has
the result of n calls to the iterator. This has the effect of dividing
the input into n-length chunks.

How can I take the lowest value in this code?

how are you?
I'm trying to take the lowest value of the following code, my idea is that for example the result will be like. country,price,date
im using python for the code
valores= ["al[8075]['2019-05-27']", "de[2177]['2019-05-27']", "at[3946]['2019-05-27']", "be[3019]['2019-05-26']", "by[5741]['2019-05-27']", "ba[0]['2019-05-26', '2019-05-27']", "bg[3223]['2019-05-26']", "hr[4358]['2019-05-26']", "dk[5006]['2019-05-27']", "sk[4964]['2019-05-27']", "si[5253]['2019-05-26']", "es[3813]['2019-05-27']", "ee[4699]['2019-05-27']", "ru[4889]['2019-05-27']", "fi[5410]['2019-05-26']", "fr[2506]['2019-05-26']", "gi[0]['2019-05-26', '2019-05-27']", "gr[1468]['2019-05-26']", "hu[3475]['2019-05-27']", "ie[5360]['2019-05-26']", "is[0]['2019-05-26']", "it[2970]['2019-05-26']", "lv[2482]['2019-05-27']", "lt[1276]['2019-05-27']", "lu[0]['2019-05-26']", "mk[5417]['2019-05-26']", "mt[3532]['2019-05-26']", "md[6158]['2019-05-27']", "me[11080]['2019-05-26']", "no[2967]['2019-05-27']", "nl[3640]['2019-05-27']", "pl[2596]['2019-05-27']", "pt[5409]['2019-05-27']", "uk[5010]['2019-05-27']", "cz[5493]['2019-05-26']", "ro[1017]['2019-05-27']", "rs[6535]['2019-05-27']", "se[3971]['2019-05-26']", "ch[5112]['2019-05-26']", "tr[3761]['2019-05-26']", "ua[5187]['2019-05-26']"]
the idea in this example will be like
as you see country(ro) price(1017) date('2019-05-27') is the lowest
valores= "ro[1017]['2019-05-27']"

Python's max() and min() functions take a key argument. So, whenever you need a minimum or maximum you can often leverage these built-ins. The only code you have to write something to convert a value to the corresponding representation for max/min purposes.
def f(s):
return int(s.split('[')[1].split(']')[0]) or float('inf')
lowest = min(valores, key = f) # ro[1017]['2019-05-27']

There are more than one way of coding this. The following will do this:
lowest = 1000000
target = " "
for i in valores:
ix = i.find("[") + 1
iy = i.find("]")
value = int(i[ix:iy])
if value < lowest and value != 0:
lowest = value
target = i
print(target)
It will output
"ro[1017]['2019-05-27]"
However, here I am assuming you do not want 0 values, otherwise the answer would be
"ba[0]['2019-05-26', '2019-05-27']"
If you want to include 0, just modify the if block.

This should work for you. I assume you want the lowest non-zero price.
I split every string in the lists into sublists via square brackets [ and strip away the extra brackets [ and ] for each item, hence each sublist will have [state, price, dates] .
I then sort on the price, which is the second item of each sublist, and filter out the 0 prices,
The result will then be the first element of the filtered list
import re
import re
valores= ["al[8075]['2019-05-27']", "de[2177]['2019-05-27']", "at[3946]['2019-05-27']", "be[3019]['2019-05-26']", "by[5741]['2019-05-27']", "ba[0]['2019-05-26', '2019-05-27']", "bg[3223]['2019-05-26']", "hr[4358]['2019-05-26']", "dk[5006]['2019-05-27']", "sk[4964]['2019-05-27']", "si[5253]['2019-05-26']", "es[3813]['2019-05-27']", "ee[4699]['2019-05-27']", "ru[4889]['2019-05-27']", "fi[5410]['2019-05-26']", "fr[2506]['2019-05-26']", "gi[0]['2019-05-26', '2019-05-27']", "gr[1468]['2019-05-26']", "hu[3475]['2019-05-27']", "ie[5360]['2019-05-26']", "is[0]['2019-05-26']", "it[2970]['2019-05-26']", "lv[2482]['2019-05-27']", "lt[1276]['2019-05-27']", "lu[0]['2019-05-26']", "mk[5417]['2019-05-26']", "mt[3532]['2019-05-26']", "md[6158]['2019-05-27']", "me[11080]['2019-05-26']", "no[2967]['2019-05-27']", "nl[3640]['2019-05-27']", "pl[2596]['2019-05-27']", "pt[5409]['2019-05-27']", "uk[5010]['2019-05-27']", "cz[5493]['2019-05-26']", "ro[1017]['2019-05-27']", "rs[6535]['2019-05-27']", "se[3971]['2019-05-26']", "ch[5112]['2019-05-26']", "tr[3761]['2019-05-26']", "ua[5187]['2019-05-26']"]
results = []
#Iterate through valores
for item in valores:
#Extract elements from each string by splitting on [ and then stripping extra square brackets
items = [it.strip('][') for it in item.split('[')]
results.append(items)
#Sort on the second element which is price, and filter prices with are 0
res = list(
filter(lambda x: int(x[1]) > 0,
sorted(results, key=lambda x:int(x[1])))
)
#This is your lowest non-zero price
print(res[0])
The output will be
['ro', '1017', "'2019-05-27'"]

Split string every nth character from the right?

I have different very large sets of files which I'd like to put in different subfolders. I already have an consecutive ID for every folder I want to use.
I want to split the ID from the right to always have 1000 folders in the deeper levels.
Example:
id: 100243 => resulting_path: './100/243'
id: 1234567890 => resulting path: '1/234/567/890'
I found Split string every nth character?, but all solutions are from left to right and I also did not want to import another module for one line of code.
My current (working) solution looks like this:
import os
base_path = '/home/made'
n=3 # take every 'n'th from the right
max_id = 12345678900
test_id = 24102442
# current algorithm
str_id = str(test_id).zfill(len(str(max_id)))
ext_path = list(reversed([str_id[max(i-n,0):i] for i in range(len(str_id),0,-n)]))
print(os.path.join(base_path, *ext_path))
Output is: /home/made/00/024/102/442
The current algorithm looks awkward and complicated for the simple thing I want to do.
I wonder if there is a better solution. If not it might help others, anyway.
Update:
I really like Joe Iddons solution. Using .join and mod makes it faster and more readable.
In the end I decided that I never want to have a /in front. To get rid of the preceeding /in case len(s)%3is zero, I changed the line to
'/'.join([s[max(0,i):i+3] for i in range(len(s)%3-3*(len(s)%3 != 0), len(s), 3)])
Thank you for your great help!
Update 2:
If you are going to use os.path.join (like in my previous code) its even simpler since os.path.jointakes care of the format of the args itself:
ext_path = [s[0:len(s)%3]] + [s[i:i+3] for i in range(len(s)%3, len(s), 3)]
print(os.path.join('/home', *ext_path))

You can adapt the answer you linked, and use the beauty of mod to create a nice little one-liner:
>>> s = '1234567890'
>>> '/'.join([s[0:len(s)%3]] + [s[i:i+3] for i in range(len(s)%3, len(s), 3)])
'1/234/567/890'
and if you want this to auto-add the dot for the cases like your first example of:
s = '100243'
then you can just add a mini ternary use or as suggested by #MosesKoledoye:
>>> '/'.join(([s[0:len(s)%3] or '.']) + [s[i:i+3] for i in range(len(s)%3, len(s), 3)])
'./100/243'
This method will also be faster than reversing the string before hand or reversing a list.

Then if you got a solution for the direction left to right, why not simply reverse the input and output ?
str = '1234567890'
str[::-1]
Output:
'0987654321'
You can use the solution you found for left to right and then, you simply need to reverse it again.

You could use regex and modulo to split the strings into groups of three. This solution should get you started:
import re
s = [100243, 1234567890]
final_s = ['./'+'/'.join(re.findall('.{2}.', str(i))) if len(str(i))%3 == 0 else str(i)[:len(str(i))%3]+'/'+'/'.join(re.findall('.{2}.', str(i)[len(str(i))%3:])) for i in s]
Output:
['./100/243', '1/234/567/890']

Try this:
>>> line = '1234567890'
>>> n = 3
>>> rev_line = line[::-1]
>>> out = [rev_line[i:i+n][::-1] for i in range(0, len(line), n)]
>>> ['890', '567', '234', '1']
>>> "/".join(reversed(out))
>>> '1/234/567/890'

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Sorting with two digits in string - Python - python

You need leading zeros. Strings are sorted alphabetically with the order different from the one for digits. It should be '01-1bird' '01-1mouse' '01-1nmouses' '01-2mouse' '01-2nmouses' '01-3bird' '10-1birds' As you you see 1 goes after 0.

Related

How to sort a list of strings delimited by '.' with also numbers in the middle?

selecting a value using a conditional statement based on a list of tuples

How to break protein sequence into equal size of 20 in python

How can I take the lowest value in this code?

Split string every nth character from the right?

Categories

Resources