python String Formatting Operations - python

Faulty code:
pos_1 = 234
pos_n = 12890
min_width = len(str(pos_n)) # is there a better way for this?
# How can I use min_width as the minimal width of the two conversion specifiers?
# I don't understand the Python documentation on this :(
raw_str = '... from %(pos1)0*d to %(posn)0*d ...' % {'pos1':pos_1, 'posn': pos_n}
Required output:
... from 00234 to 12890 ...
______________________EDIT______________________
New code:
# I changed my code according to the second answer
pos_1 = 10234 # can be any value between 1 and pos_n
pos_n = 12890
min_width = len(str(pos_n))
raw_str = '... from % *d to % *d ...' % (min_width, pos_1, min_width, pos_n)
New Problem:
There is one extra whitespace (I marked it _) in front of the integer values, for integers with min_width digits:
print raw_str
... from _10234 to _12890 ...
Also, I wonder if there is a way to add Mapping keys?

pos_1 = 234
pos_n = 12890
min_width = len(str(pos_n))
raw_str = '... from %0*d to %0*d ...' % (min_width, pos_1, min_width, pos_n)

Concerning using a mapping type as second argument to '%':
I presume you mean something like '%(mykey)d' % {'mykey': 3}, right? I don't think you can combine this with the "%*d" syntax, since there is no way to provide the necessary width arguments with a dict.
But why don't you generate your format string dynamically:
fmt = '... from %%%dd to %%%dd ...' % (min_width, min_width)
# assuming min_width is e.g. 7 fmt would be: '... from %7d to %7d ...'
raw_string = fmt % pos_values_as_tuple_or_dict
This way you decouple the width issue from the formatting of the actual values, and you can use a tuple or a dict for the latter, as it suits you.
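To make the options above concrete, here is a minimal sketch that produces the required output three ways. The variable names via_star, via_dict and via_format are made up for illustration, and the third variant (str.format with a nested width field) is an addition that assumes Python 2.6 or later.

```python
pos_1 = 234
pos_n = 12890
min_width = len(str(pos_n))

# Option 1: '*' pulls the width from the argument tuple (positional only).
via_star = '... from %0*d to %0*d ...' % (min_width, pos_1, min_width, pos_n)

# Option 2: build the format string first, then fill it from a dict.
# '%%' yields a literal '%', so fmt becomes '... from %(pos1)05d to %(posn)05d ...'
fmt = '... from %%(pos1)0%dd to %%(posn)0%dd ...' % (min_width, min_width)
via_dict = fmt % {'pos1': pos_1, 'posn': pos_n}

# Option 3: str.format accepts a nested replacement field for the width.
via_format = '... from {pos1:0{w}d} to {posn:0{w}d} ...'.format(
    pos1=pos_1, posn=pos_n, w=min_width)

print(via_star)    # ... from 00234 to 12890 ...
print(via_dict)    # ... from 00234 to 12890 ...
print(via_format)  # ... from 00234 to 12890 ...
```

The extra whitespace in the edited question most likely comes from the space flag in '% *d', which reserves a column for the sign; using '0' as the flag, as above, avoids it.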

"1234".rjust(13,"0")
Should do what you need
addition:
a = ["123", "12"]
max_width = sorted([len(i) for i in a])[-1]
put max_width instead of 13 above and put all your strings in a single array a (which seems to me much more usable than having a stack of variables).
additional nastiness:
(Using array of numbers to get closer to your question.)
a = [123, 33, 0 ,223]
[str(x).rjust(sorted([len(str(i)) for i in a])[-1],"0") for x in a]
Who said Perl is the only language to easily produce braindumps in? If regexps are the godfather of complex code, then list comprehension is the godmother.
(I am relatively new to python and rather convinced that there must be a max-function on arrays somewhere, which would reduce above complexity. .... OK, checked, there is. Pity, have to reduce the example.)
[str(x).rjust(max([len(str(i)) for i in a]),"0") for x in a]
And please observe below comments on "not putting calculation of an invariant (the max value) inside the outer list comprehension".
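A minimal sketch of what that remark about the invariant amounts to: compute the width once, outside the comprehension, so it is not recomputed for every element. The names width and padded are just illustrative.

```python
a = [123, 33, 0, 223]

# Compute the maximum width once instead of once per element.
width = max(len(str(x)) for x in a)

# Right-justify every number to that width, padding with zeros.
padded = [str(x).rjust(width, "0") for x in a]

print(padded)  # ['123', '033', '000', '223']
```

For non-negative integers, str(x).zfill(width) should give the same result as rjust(width, "0").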

Related

PySpark / Python Slicing and Indexing Issue

Can someone let me know how to pull out certain values from a Python output.
I would like to retrieve the value 'ocweeklyreports' from the following output using either indexing or slicing:
'config': '{"hiveView":"ocweeklycur.ocweeklyreports"}'
This should be relatively easy; however, I'm having a problem defining the slicing/indexing configuration.
The following will successfully give me 'ocweeklyreports'
myslice = config['hiveView'][12:30]
However, I need the indexing or slicing modified so that I will get any value after 'ocweeklycur'.
I'm not sure what output you're dealing with and how robust you're wanting it but if it's just a string you can do something similar to this (for a quick and dirty solution).
input = "Your input"
indexStart = input.index('.') + 1 # Get the index of the input at the . which is where you would like to start collecting it
finalResponse = input[indexStart:-2] # Drop the trailing "} of the raw string
print(finalResponse) # Prints ocweeklyreports
Again, not the most elegant solution but hopefully it helps or at least offers a starting point. Another more robust solution would be to use regex but I'm not that skilled in regex at the moment.
You could do almost all of it using regex.
See if this helps:
import re
def search_word(di):
    st = di["config"]["hiveView"]
    # Escape the dot so it matches a literal '.' rather than any character
    p = re.compile(r'^ocweeklycur\.(?P<word>\w+)')
    m = p.search(st)
    return m.group('word')
if __name__=="__main__":
    d = {'config': {"hiveView":"ocweeklycur.ocweeklyreports"}}
    print(search_word(d))
The following worked best for me:
# Extract the value of the "hiveView" key
hive_view = config['hiveView']
# Split the string on the '.' character
parts = hive_view.split('.')
# The value you want is the second part of the split string
desired_value = parts[1]
print(desired_value) # Output: "ocweeklyreports"
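If the value under 'config' is still the raw JSON string shown in the question (an assumption, since the question does not say how config was built), a sketch along these lines parses it first and then takes everything after the first dot. The names raw, prefix and table are made up for illustration.

```python
import json

# Assumed input shape: 'config' holds a JSON-encoded string, as in the question.
raw = {'config': '{"hiveView":"ocweeklycur.ocweeklyreports"}'}

# Decode the JSON string into a dict.
config = json.loads(raw['config'])

# partition splits on the first '.', so nothing depends on the lengths of the parts.
prefix, _, table = config['hiveView'].partition('.')

print(prefix)  # ocweeklycur
print(table)   # ocweeklyreports
```

Unlike a fixed slice such as [12:30], this keeps working when the part before or after the dot changes length.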

Split string every nth character from the right?

I have several very large sets of files which I'd like to put in different subfolders. I already have a consecutive ID for every folder I want to use.
I want to split the ID from the right to always have 1000 folders in the deeper levels.
Example:
id: 100243 => resulting_path: './100/243'
id: 1234567890 => resulting path: '1/234/567/890'
I found Split string every nth character?, but all solutions are from left to right and I also did not want to import another module for one line of code.
My current (working) solution looks like this:
import os
base_path = '/home/made'
n=3 # take every 'n'th from the right
max_id = 12345678900
test_id = 24102442
# current algorithm
str_id = str(test_id).zfill(len(str(max_id)))
ext_path = list(reversed([str_id[max(i-n,0):i] for i in range(len(str_id),0,-n)]))
print(os.path.join(base_path, *ext_path))
Output is: /home/made/00/024/102/442
The current algorithm looks awkward and complicated for the simple thing I want to do.
I wonder if there is a better solution. If not it might help others, anyway.
Update:
I really like Joe Iddon's solution. Using .join and mod makes it faster and more readable.
In the end I decided that I never want to have a / in front. To get rid of the preceding / in case len(s)%3 is zero, I changed the line to
'/'.join([s[max(0,i):i+3] for i in range(len(s)%3-3*(len(s)%3 != 0), len(s), 3)])
Thank you for your great help!
Update 2:
If you are going to use os.path.join (like in my previous code) it's even simpler, since os.path.join takes care of the format of the args itself:
ext_path = [s[0:len(s)%3]] + [s[i:i+3] for i in range(len(s)%3, len(s), 3)]
print(os.path.join('/home', *ext_path))
You can adapt the answer you linked, and use the beauty of mod to create a nice little one-liner:
>>> s = '1234567890'
>>> '/'.join([s[0:len(s)%3]] + [s[i:i+3] for i in range(len(s)%3, len(s), 3)])
'1/234/567/890'
and if you want this to auto-add the dot for the cases like your first example of:
s = '100243'
then you can just add a mini ternary, or use or as suggested by @MosesKoledoye:
>>> '/'.join(([s[0:len(s)%3] or '.']) + [s[i:i+3] for i in range(len(s)%3, len(s), 3)])
'./100/243'
This method will also be faster than reversing the string beforehand or reversing a list.
Then if you got a solution for the direction left to right, why not simply reverse the input and output ?
str = '1234567890'
str[::-1]
Output:
'0987654321'
You can use the solution you found for left to right and then, you simply need to reverse it again.
You could use regex and modulo to split the strings into groups of three. This solution should get you started:
import re
s = [100243, 1234567890]
final_s = ['./'+'/'.join(re.findall('.{2}.', str(i))) if len(str(i))%3 == 0 else str(i)[:len(str(i))%3]+'/'+'/'.join(re.findall('.{2}.', str(i)[len(str(i))%3:])) for i in s]
Output:
['./100/243', '1/234/567/890']
Try this:
>>> line = '1234567890'
>>> n = 3
>>> rev_line = line[::-1]
>>> out = [rev_line[i:i+n][::-1] for i in range(0, len(line), n)]
>>> out
['890', '567', '234', '1']
>>> "/".join(reversed(out))
'1/234/567/890'
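The mod-based idea from the answers above can be wrapped in a small helper. This is only a sketch; the function name split_from_right is hypothetical and not taken from any of the answers.

```python
import os

def split_from_right(s, n=3):
    """Split s into chunks of n characters, counted from the right."""
    head = len(s) % n                       # length of the short leading chunk
    chunks = [s[i:i + n] for i in range(head, len(s), n)]
    # Only prepend the leading chunk when it is non-empty.
    return ([s[:head]] if head else []) + chunks

print(split_from_right('1234567890'))       # ['1', '234', '567', '890']
print(split_from_right('100243'))           # ['100', '243']
print(os.path.join('.', *split_from_right('100243')))
```

Returning a list rather than a joined string leaves the choice of separator to the caller, which fits the os.path.join usage in the question.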

Sorting with two digits in string - Python

I am new to Python and I have a hard time solving this.
I am trying to sort a list to be able to human sort it 1) by the first number and 2) the second number. I would like to have something like this:
'1-1bird'
'1-1mouse'
'1-1nmouses'
'1-2mouse'
'1-2nmouses'
'1-3bird'
'10-1birds'
(...)
Those numbers can be from 1 to 99 ex: 99-99bird is possible.
This is the code I have after a couple of headaches. Being able to then sort by the following first letter would be a bonus.
Here is what I've tried:
#!/usr/bin/python
myList = list()
myList = ['1-10bird', '1-10mouse', '1-10nmouses', '1-10person', '1-10cat', '1-11bird', '1-11mouse', '1-11nmouses', '1-11person', '1-11cat', '1-12bird', '1-12mouse', '1-12nmouses', '1-12person', '1-13mouse', '1-13nmouses', '1-13person', '1-14bird', '1-14mouse', '1-14nmouses', '1-14person', '1-14cat', '1-15cat', '1-1bird', '1-1mouse', '1-1nmouses', '1-1person', '1-1cat', '1-2bird', '1-2mouse', '1-2nmouses', '1-2person', '1-2cat', '1-3bird', '1-3mouse', '1-3nmouses', '1-3person', '1-3cat', '2-14cat', '2-15cat', '2-16cat', '2-1bird', '2-1mouse', '2-1nmouses', '2-1person', '2-1cat', '2-2bird', '2-2mouse', '2-2nmouses', '2-2person']
def mysort(x,y):
    x1=""
    y1=""
    for myletter in x :
        if myletter.isdigit() or "-" in myletter:
            x1=x1+myletter
    x1 = x1.split("-")
    for myletter in y :
        if myletter.isdigit() or "-" in myletter:
            y1=y1+myletter
    y1 = y1.split("-")
    if x1[0]>y1[0]:
        return 1
    elif x1[0]==y1[0]:
        if x1[1]>y1[1]:
            return 1
        elif x1==y1:
            return 0
        else :
            return -1
    else :
        return -1
myList.sort(mysort)
print myList
Thanks !
Martin
You have some good ideas with splitting on '-' and using isalpha() and isdigit(), but then we'll use those to create a function that takes in an item and returns a "clean" version of the item, which can be easily sorted. It will create a three-digit, zero-padded representation of the first number, then a similar thing with the second number, then the "word" portion (instead of just the first character). The result looks something like "001001bird" (that won't display - it'll just be used internally). The built-in function sorted() will use this callback function as a key, taking each element, passing it to the callback, and basing the sort order on the returned value. In the test, I use the * operator and the sep argument to print it without needing to construct a loop, but looping is perfectly fine as well.
def callback(item):
    phrase = item.split('-')
    first = phrase[0].rjust(3, '0')
    second = ''.join(filter(str.isdigit, phrase[1])).rjust(3, '0')
    word = ''.join(filter(str.isalpha, phrase[1]))
    return first + second + word
Test:
>>> myList = ['1-10bird', '1-10mouse', '1-10nmouses', '1-10person', '1-10cat', '1-11bird', '1-11mouse', '1-11nmouses', '1-11person', '1-11cat', '1-12bird', '1-12mouse', '1-12nmouses', '1-12person', '1-13mouse', '1-13nmouses', '1-13person', '1-14bird', '1-14mouse', '1-14nmouses', '1-14person', '1-14cat', '1-15cat', '1-1bird', '1-1mouse', '1-1nmouses', '1-1person', '1-1cat', '1-2bird', '1-2mouse', '1-2nmouses', '1-2person', '1-2cat', '1-3bird', '1-3mouse', '1-3nmouses', '1-3person', '1-3cat', '2-14cat', '2-15cat', '2-16cat', '2-1bird', '2-1mouse', '2-1nmouses', '2-1person', '2-1cat', '2-2bird', '2-2mouse', '2-2nmouses', '2-2person']
>>> print(*sorted(myList, key=callback), sep='\n')
1-1bird
1-1cat
1-1mouse
1-1nmouses
1-1person
1-2bird
1-2cat
1-2mouse
1-2nmouses
1-2person
1-3bird
1-3cat
1-3mouse
1-3nmouses
1-3person
1-10bird
1-10cat
1-10mouse
1-10nmouses
1-10person
1-11bird
1-11cat
1-11mouse
1-11nmouses
1-11person
1-12bird
1-12mouse
1-12nmouses
1-12person
1-13mouse
1-13nmouses
1-13person
1-14bird
1-14cat
1-14mouse
1-14nmouses
1-14person
1-15cat
2-1bird
2-1cat
2-1mouse
2-1nmouses
2-1person
2-2bird
2-2mouse
2-2nmouses
2-2person
2-14cat
2-15cat
2-16cat
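As a variation on the zero-padded key above, the key function can also return a tuple of two integers and the word, so no padding width has to be chosen. This is a sketch under the assumption that every item looks like 'number-number' followed by a word; the names PATTERN and natural_key are made up for illustration.

```python
import re

# Assumed item shape: digits, a dash, digits, then the word.
PATTERN = re.compile(r'(\d+)-(\d+)(.*)')

def natural_key(item):
    """Return (first number, second number, word) for tuple comparison."""
    first, second, word = PATTERN.match(item).groups()
    return int(first), int(second), word

items = ['1-10bird', '10-1birds', '1-2mouse', '1-1mouse', '1-1bird', '2-14cat']
print(sorted(items, key=natural_key))
# ['1-1bird', '1-1mouse', '1-2mouse', '1-10bird', '2-14cat', '10-1birds']
```

Tuples compare element by element, so the integers are compared numerically first and the word only breaks ties.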
You need leading zeros. Strings are sorted character by character rather than by numeric value, so the order differs from the numeric one. It should be
'01-1bird'
'01-1mouse'
'01-1nmouses'
'01-2mouse'
'01-2nmouses'
'01-3bird'
'10-1birds'
As you see, 1 goes after 0.
The other answers here are very respectable, I'm sure, but for full credit you should ensure that your answer fits on a single line and uses as many list comprehensions as possible:
import itertools
[''.join(r) for r in sorted([[''.join(x) for _, x in
itertools.groupby(v, key=str.isdigit)]
for v in myList], key=lambda v: (int(v[0]), int(v[2]), v[3]))]
That should do nicely:
['1-1bird',
'1-1cat',
'1-1mouse',
'1-1nmouses',
'1-1person',
'1-2bird',
'1-2cat',
'1-2mouse',
...
'2-2person',
'2-14cat',
'2-15cat',
'2-16cat']

Aligning integers(Basic python)

Hey guys I need some help aligning my integers. I will show you what my code is, what my output is, and what I want my output to be. Thanks!
Code:
test_sign='#'
test_numbers=[100000,5000000,7000000]
test_calc_list=[]
test_sum=sum(test_numbers)
test_list=['Testcase1','Testcase2','Testcase3']
test_sign_list=[]
for x in test_numbers:
    test_calc=round((x/float(test_sum)*10))
    test_calc_list.append(test_calc)
for y in test_calc_list:
    y=int(y)
    signs=y*test_sign
    test_sign_list.append(signs)
for z in range(len(test_list)):
    print "%8s"%test_list[z]+":",test_sign_list[z],test_numbers[z]
Output:
Testcase1: 100000
Testcase2: #### 5000000
Testcase3: ###### 7000000
Desired output:
Testcase1:         100000
Testcase2: ####   5000000
Testcase3: ###### 7000000
This might be a good time to learn {}-formatting, instead of learning more in-depth about the (not-quite-deprecated, but discouraged) %-formatting.
Especially since the only %-formatting you're using seems to be incorrect. (There's no good reason to use %8s for a string you know is going to be 9 characters long…)
So:
print '{}: {:<6} {:>7}'.format(test_list[z], test_sign_list[z], test_numbers[z])
See String Formatting for details on all the options.
As a side note, I think your loop would be more readable this way:
for test, sign, number in zip(test_list, test_sign_list, test_numbers):
    print '{}: {:<6} {:>7}'.format(test, sign, number)
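For reference, here is a self-contained sketch of the whole script with {}-formatting, written for Python 3 (an assumption; the question uses Python 2 print statements). The names total, lines and bar are illustrative, and the column widths 6 and 7 are simply the ones from the format string above.

```python
test_sign = '#'
test_numbers = [100000, 5000000, 7000000]
test_list = ['Testcase1', 'Testcase2', 'Testcase3']
total = sum(test_numbers)

lines = []
for name, number in zip(test_list, test_numbers):
    # Scale each number to a bar of 0..10 signs, as in the question.
    bar = test_sign * int(round(number / total * 10))
    # Left-align the bar in 6 columns, right-align the number in 7.
    lines.append('{}: {:<6} {:>7}'.format(name, bar, number))

print('\n'.join(lines))
# Testcase1:         100000
# Testcase2: ####   5000000
# Testcase3: ###### 7000000
```

If a bar can grow longer than 6 signs, the width would need to be raised or computed from the longest bar.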
Option one, specify length in format:
http://docs.python.org/2/library/string.html#format-specification-mini-language
"width is a decimal integer defining the minimum field width. If not specified, then the field width will be determined by the content."
Option two, pre-pad strings using ljust, rjust and center:
http://docs.python.org/2/library/string.html#string.ljust
Change
print "%8s"%test_list[z]+":",test_sign_list[z],test_numbers[z]
to
print "%8s: %-6s %7i" % (test_list[z], test_sign_list[z], test_numbers[z])
strings = ["abc", "sakjfslkdfnds", "7"]
maxlength = max(map(len, strings))
for index, string in enumerate(strings):
    print("Testcase%d: %s" % (index, string.rjust(maxlength, ".")))
Leave out the "." argument if you just want spaces.

I need to change a zip code into a series of dots and dashes (a barcode), but I can't figure out how

Here's what I've got so far:
def encodeFive(zip):
    zero = "||:::"
    one = ":::||"
    two = "::|:|"
    three = "::||:"
    four = ":|::|"
    five = ":|:|:"
    six = ":||::"
    seven = "|:::|"
    eight = "|::|:"
    nine = "|:|::"
    codeList = [zero,one,two,three,four,five,six,seven,eight,nine]
    allCodes = zero+one+two+three+four+five+six+seven+eight+nine
    code = ""
    digits = str(zip)
    for i in digits:
        code = code + i
    return code
With this I'll get the original zip code in a string, but none of the numbers are encoded into the barcode. I've figured out how to encode one number, but it won't work the same way with five numbers.
codeList = ["||:::", ":::||", "::|:|", "::||:", ":|::|",
":|:|:", ":||::", "|:::|", "|::|:", "|:|::" ]
barcode = "".join(codeList[int(digit)] for digit in str(zipcode))
Perhaps use a dictionary:
barcode = {'0':"||:::",
'1':":::||",
'2':"::|:|",
'3':"::||:",
'4':":|::|",
'5':":|:|:",
'6':":||::",
'7':"|:::|",
'8':"|::|:",
'9':"|:|::",
}
def encodeFive(zipcode):
    return ''.join(barcode[n] for n in str(zipcode))
print(encodeFive(72353))
# |:::|::|:|::||::|:|:::||:
PS. It is better not to name a variable zip, since doing so overrides the builtin function zip. And similarly, it is better to avoid naming a variable code, since code is a module in the standard library.
You're just adding i (the character in digits) to the string where I think you want to be adding codeList[int(i)].
The code would probably be much simpler by just using a dict for lookups.
I find it easier to use split() to create lists of strings:
codes = "||::: :::|| ::|:| ::||: :|::| :|:|: :||:: |:::| |::|: |:|::".split()
def zipencode(numstr):
    return ''.join(codes[int(x)] for x in str(numstr))
print zipencode("32345")
This is made in python.
number = ["||:::",
":::||",
"::|:|",
"::||:",
":|::|",
":|:|:",
":||::",
"|:::|",
"|::|:",
"|:|::"
]
def encode(num):
    return ''.join(map(lambda x: number[int(x)], str(num)))
print encode(32345)
I don't know what language you are using, so I made an example in C#:
int zip = 72353;
string[] codeList = {
"||:::", ":::||", "::|:|", "::||:", ":|::|",
":|:|:", ":||::", "|:::|", "|::|:", "|:|::"
};
string code = String.Empty;
while (zip > 0) {
    code = codeList[zip % 10] + code;
    zip /= 10;
}
return code;
Note: Instead of converting the zip code to a string and then converting each character back to a number, I calculated the digits numerically.
Just for fun, here's a one-liner:
return String.Concat(zip.ToString().Select(c => "||::::::||::|:|::||::|::|:|:|::||::|:::||::|:|:|::".Substring(((c-'0') % 10) * 5, 5)).ToArray());
It appears you're trying to generate a "postnet" barcode. Note that the five-digit ZIP postnet barcodes were obsoleted by ZIP+4 postnet barcodes, which were obsoleted by ZIP+4+2 delivery point postnet barcodes, all of which are supposed to include a checksum digit and leading and ending framing bars. In any case, all of those forms are being obsoleted by the new "intelligent mail" 4-state barcodes, which require a lot of computational code to generate and no longer rely on straight digit-to-bars mappings. Search USPS.COM for more details.
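A rough sketch of what the checksum digit and framing bars could look like in Python. It assumes the usual POSTNET rule as I understand it: the check digit brings the digit sum up to a multiple of 10, and a single full bar is placed at each end. The names CODES and postnet are made up for illustration; check the USPS specification before relying on this.

```python
# Digit-to-bars table from the question ('|' is a full bar, ':' a half bar).
CODES = ["||:::", ":::||", "::|:|", "::||:", ":|::|",
         ":|:|:", ":||::", "|:::|", "|::|:", "|:|::"]

def postnet(zipcode):
    """Encode a ZIP code string with a check digit and framing bars (assumed rules)."""
    # Keep only digits, so '12345-6789' style input also works.
    digits = [int(ch) for ch in str(zipcode) if ch.isdigit()]
    # Assumed rule: the check digit makes the total digit sum a multiple of 10.
    check = (10 - sum(digits) % 10) % 10
    body = ''.join(CODES[d] for d in digits + [check])
    # Assumed rule: one full bar frames the code on each side.
    return '|' + body + '|'

print(postnet('72353'))
```

Passing the ZIP code as a string keeps leading zeros, which an integer would lose.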
