def convert(time):
    pos = ["s","m","h","d"]
    time_dict = {"s": 1,"m": 60,"h": 3600,"d": 24*3600 }
    unit = time[-1]
    if unit not in pos:
        return -1
    try:
        timeVal = int(time[:-1])
    except:
        return -2
    return timeVal*time_dict[unit]
Currently, this is my code, and I'm using it to translate strings like 5d or 30m to seconds. That works, but if I try to combine them (like 5d 30m), it gives me the output -2. I don't really see what's wrong here.
Your problem is that you're only checking the last character. You need to parse the string to find each individual group and then work from those:
import re
def convert(time):
    time_dict = {"s": 1,"m": 60,"h": 3600,"d": 24*3600 }
    # Raw string so that \d reaches the regex engine unchanged
    regex_groups = re.findall(r"(\d+)([smhd])", time)
    return sum(int(x) * time_dict[y] for x,y in regex_groups)
I don't really see what's wrong here.
Let's say you provide 5d 30m as input. [:-1] drops the last character, which leaves 5d 30. You then try to convert that to int, which fails because 5d 30 is not a valid integer (d is not a digit).
You first need to split the input into tokens, then convert each token to seconds, and then sum the results. A simplified example with h and m only:
def to_seconds(token):
    q = {"h":3600,"m":60}
    return int(token[:-1])*q[token[-1]]
def convert(time):
    return sum(to_seconds(i) for i in time.split())
print(convert("5h 30m"))
output
19800
Disclaimer: this solution assumes that the elements are separated by whitespace.
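If the groups may or may not be separated by whitespace, a stricter variant of the regex approach can validate the whole string before summing. This is only a sketch: the name parse_duration and the choice to raise ValueError on malformed input are assumptions, not part of the answers above.

```python
import re

# Seconds per unit, as in the answers above.
SECONDS_PER_UNIT = {"s": 1, "m": 60, "h": 3600, "d": 24 * 3600}

# One group: digits followed by a single unit letter.
TOKEN = re.compile(r"(\d+)([smhd])")


def parse_duration(text):
    """Convert a string such as '5d 30m' or '5d30m' to seconds.

    Raises ValueError if anything other than number+unit groups
    (optionally separated by whitespace) is present.
    """
    # Validate the whole string first so stray characters are not silently ignored.
    if not re.fullmatch(r"\s*(?:\d+[smhd]\s*)+", text):
        raise ValueError(f"invalid duration: {text!r}")
    return sum(int(n) * SECONDS_PER_UNIT[u] for n, u in TOKEN.findall(text))


print(parse_duration("5d 30m"))  # 5*86400 + 30*60 = 433800
```

Raising an exception instead of returning -1 or -2 keeps error values from being mistaken for a number of seconds.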
I'm not a programmer, so go easy on me please! I have a system of 4 linear equations in 4 unknowns, which I think I could use Python to solve relatively easily. However, my equations are not of the form "5x+2y+z-w=0". Instead, I have algebraic constants c_i whose explicit numerical values I don't know; for example, "c_1 x + c_2 y + c_3 z + c_4 w = c_5" would be one of my four equations. So does a solver exist which gives answers for x, y, z, w in terms of the c_i?
NumPy has a function for solving linear systems with numeric coefficients: numpy.linalg.solve
To construct the matrix, we first need to digest the string, turning it into an array of coefficients and solutions.
Finding Numbers
First we need to write a function that takes a string like "c_1 3" and returns the number 3.0. Depending on the format you want for your input string, you can either iterate over the characters of the string and stop at the first non-digit character, or you can simply split on the space and parse the second part. Here are both solutions:
def find_number(sub_expr):
    """
    Finds the number from the format
    number*string or numberstring.
    Example:
    3x -> 3
    4*x -> 4
    """
    num_str = str()
    for char in sub_expr:
        if char.isdigit():
            num_str += char
        else:
            break
    return float(num_str)
or the simpler solution
def find_number(sub_expr):
    """
    Returns the number from the format "string number"
    """
    return float(sub_expr.split()[1])
Note: See edits
Get matrices
Now we can use that to split each expression into two parts, the equation and the solution, at the "=". The equation is then split into sub-expressions at each "+". This way we would turn the string "3x+4y = 3" into
sub_expressions = ["3x", "4y"]
solution_string = "3"
Each sub-expression then needs to be fed into our find_number function. The end result can be appended to the coefficient and solution matrices:
def get_matrices(expressions):
    """
    Returns coefficient_matrix and solutions from array of string-expressions.
    """
    coefficient_matrix = list()
    solutions = list()
    last_len = -1
    for expression in expressions:
        # Note: In this solution all coefficients must be explicitly noted and must always be in the same order.
        # Could be solved with dicts but is probably overengineered.
        if not "=" in expression:
            print(f"Invalid expression {expression}. Missing \"=\"")
            return False
        try:
            c_string, s_string = expression.split("=")
            c_strings = c_string.split("+")
            solutions.append(float(s_string))
            current_len = len(c_strings)
            if last_len != -1 and current_len != last_len:
                print(f"The expression {expression} has a mismatching number of coefficients")
                return False
            last_len = current_len
            coefficients = list()
            for c_string in c_strings:
                coefficients.append(find_number(c_string))
            coefficient_matrix.append(coefficients)
        except Exception as e:
            # Report which expression could not be parsed
            print(f"An unexpected Runtime Error occurred at {expression}")
            print(e)
            exit()
    return coefficient_matrix, solutions
Now let's write a simple main function to test this code:
# This is not the code you want to copy-paste
# Look further down.
from sys import argv as args
def main():
    expressions = args[1:]
    matrix, solutions = get_matrices(expressions)
    for row in matrix:
        print(row)
    print("")
    print(solutions)
if __name__ == "__main__":
    main()
Let's run the program in the console!
user:$ python3 solve.py 2x+3y=4 3x+3y=2
[2.0, 3.0]
[3.0, 3.0]
[4.0, 2.0]
You can see that the program identified all our numbers correctly
AGAIN: use the find_number function appropriate for your format
Put The Pieces Together
These Matrices now just need to be pumped directly into the numpy function:
# This is the main you want
from sys import argv as args
from numpy.linalg import solve as solve_linalg
def main():
    expressions = args[1:]
    matrix, solutions = get_matrices(expressions)
    coefficients = solve_linalg(matrix, solutions)
    print(coefficients)
# This bit needs to be at the very bottom of your code to load all functions first.
# You could just paste the main-code here, but this is considered best-practice
if __name__ == '__main__':
    main()
Now let's test that:
$ python3 solve.py x*2+y*4+z*0=20 x*1+y*1+z*-1=3 x*2+y*2+z*-3=3
[2. 4. 3.]
As you can see, the program now solves the equations for us.
Out of curiosity: Math homework? This feels like math homework.
Edit: Had a typo, "c_string" instead of "c_strings". It worked in all tests out of pure and utter luck.
Edit 2: Upon further inspection, I would recommend splitting the sub-expressions on a "*":
def find_number(sub_expr):
    """
    Returns the number from the format "string*number"
    """
    return float(sub_expr.split("*")[1])
This results in fairly readable input strings
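The comment in get_matrices mentions that dicts could lift the restriction that every coefficient must be present and in the same order. A rough sketch of that idea follows, assuming the "variable*number" input format from Edit 2. The names parse_equation and build_system are hypothetical; the resulting matrix and solutions can be passed to numpy.linalg.solve in the same way as before.

```python
def parse_equation(expression):
    """Parse 'x*2+y*4=20' into ({'x': 2.0, 'y': 4.0}, 20.0)."""
    left, right = expression.split("=")
    coefficients = {}
    for term in left.split("+"):
        name, value = term.split("*")
        name = name.strip()
        # A variable that appears twice has its coefficients summed.
        coefficients[name] = coefficients.get(name, 0.0) + float(value)
    return coefficients, float(right)


def build_system(expressions):
    """Return (variable_names, coefficient_matrix, solutions).

    Variables missing from an equation get a coefficient of 0.
    """
    parsed = [parse_equation(e) for e in expressions]
    names = sorted({name for coeffs, _ in parsed for name in coeffs})
    matrix = [[coeffs.get(name, 0.0) for name in names] for coeffs, _ in parsed]
    solutions = [rhs for _, rhs in parsed]
    return names, matrix, solutions


names, matrix, solutions = build_system(
    ["y*4+x*2=20", "x*1+y*1+z*-1=3", "x*2+y*2+z*-3=3"]
)
print(names)      # ['x', 'y', 'z']
print(matrix)     # [[2.0, 4.0, 0.0], [1.0, 1.0, -1.0], [2.0, 2.0, -3.0]]
print(solutions)  # [20.0, 3.0, 3.0]
```

Note that the first equation lists y before x and omits z entirely; the dict lookup puts each coefficient in the right column.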
Introduction to the problem
I have inputs in a .txt file and I want to 'extract' the values when a velocity is given.
Inputs have the form: velocity\tval1\tval2...\tvaln
[...]
16\t1\t0\n
1.0000\t9.3465\t8.9406\t35.9604\n
2.0000\t10.4654\t9.9456\t36.9107\n
3.0000\t11.1235\t10.9378\t37.1578\n
[...]
What have I done
I have written a piece of code to return values when a velocity is requested:
def values(input,velocity):
    return re.findall("\n"+str(velocity)+".*",input)[-1][1:]
It works "backwards" because I want to ignore the first row of the inputs (16\t1\t0\n). This way, if I call the function, I get:
>>> values('inputs.txt',16)
16.0000\t0.5646\t14.3658\t1.4782\n
But it has a big problem: if I call the function for 1, it returns the values for 19.0000.
Since I thought all inputs would be in the same format, I made a little fix:
def values(input,velocity):
    if velocity <= 5: #Because velocity goes to 50
        velocity = str(velocity)+'.0'
    return re.findall("\n"+velocity+".*",input)[-1][1:]
And it works pretty well. Maybe it is not the most beautiful (or efficient) way of doing it, but I'm a beginner.
The problem
But with this code I have a problem: sometimes the inputs have this form:
[...]
16\t1\t0\n
1\t9.3465\t8.9406\t35.9604\n
2\t10.4654\t9.9456\t36.9107\n
3\t11.1235\t10.9378\t37.1578\n
[...]
And, of course, my solution doesn't work.
So, is there any pattern that fits both kinds of inputs?
Thank you for your help.
P.S. I have a solution using split('\n') and indexes, but I would like to solve it with the re library:
def values(input,velocity):
    return input.split('\n')[velocity+1] #+1 to avoid first row
You could use a positive lookahead to check that your velocity is followed by either a period or a tab. That stops the pattern from picking up longer numbers without hardcoding that .0 must follow. This means that velocity 1 will match 1 or 1.xxxxx.
import re
from typing import List
def find_by_velocity(velocity: int, data: str) -> List[str]:
    return re.findall(r"\n" + str(velocity) + r"(?=\.|\t).*", data)
data = """16\t1\t0\n1\t9.3465\t8.9406\t35.9604\n2\t10.4654\t9.9456\t36.9107\n3\t11.1235\t10.9378\t37.1578\n16\t1\t0\n1.0000\t9.3465\t8.9406\t35.9604\n2.0000\t10.4654\t9.9456\t36.9107\n3.0000\t11.1235\t10.9378\t37.1578\n"""
print(find_by_velocity(1, data))
OUTPUT
['\n1\t9.3465\t8.9406\t35.9604', '\n1.0000\t9.3465\t8.9406\t35.9604']
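A variation on the same idea, offered as a sketch rather than a drop-in replacement: anchoring with ^ under re.MULTILINE instead of a leading \n keeps the newline out of each match, and skipping the first line keeps the header row out of the results. The function name find_row, the stricter rule that only zeros may follow the decimal point, and returning None when nothing matches are all assumptions.

```python
import re


def find_row(velocity, data):
    """Return the first data row whose first field equals `velocity`.

    Accepts the velocity written as an integer or with trailing zeros
    (for example 1 or 1.0000). The first line of `data` is treated as
    a header and skipped.
    """
    # Drop the header row so that it can never be returned as a match.
    _, _, body = data.partition("\n")
    pattern = re.compile(
        r"^" + re.escape(str(velocity)) + r"(?:\.0*)?\t.*$",
        re.MULTILINE,
    )
    match = pattern.search(body)
    return match.group(0) if match else None


data = "16\t1\t0\n1.0000\t9.3465\t8.9406\t35.9604\n16.0000\t0.5646\t14.3658\t1.4782\n"
print(repr(find_row(1, data)))   # '1.0000\t9.3465\t8.9406\t35.9604'
print(repr(find_row(16, data)))  # '16.0000\t0.5646\t14.3658\t1.4782'
```

Because the pattern requires a tab right after the optional ".000" part, a request for 1 cannot match 16 or 19.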
I'm trying to use Python to call an API and clean a bunch of strings that represent a movie budget.
So far, I have the following 6 variants of data that come up.
"$1.2 million"
"$1,433,333"
"US$ 2 million"
"US$1,644,736 (est.)"
"$6-7 million"
"£3 million"
So far, I've only gotten variants 1 and 2 parsed without a problem, using the code below. What is the best way to handle all of the other cases, or a general case that may not be listed above?
def clean_budget_string(input_string):
    number_to_integer = {'million' : 1000000, 'thousand' : 1000}
    budget_parts = input_string.split(' ')
    #Currently, only indices 0 and 1 are necessary for computation
    text_part = budget_parts[1]
    if text_part in number_to_integer:
        number = budget_parts[0].lstrip('$')
        int_representation = number_to_integer[text_part]
        return int(float(number) * int_representation)
    else:
        number = budget_parts[0]
        idx_dollar = 0
        for idx in xrange(len(number)):
            if number[idx] == '$':
                idx_dollar = idx
        return int(number[idx_dollar+1:].replace(',', ''))
The way I would approach a parsing task like this (and I'm happy to hear other opinions) would be to break up your function into several parts, each of which identifies a single piece of information in the input string.
For instance, I'd start by identifying the float that can be parsed from the string, ignoring currency and order of magnitude (a million, a thousand) for now:
f = float(''.join([c for c in input_str if c in '0123456789.']))
(you might want to add error handling for when you end up with a trailing dot, because of additions like 'est.')
Then, in a second step, you determine whether the float needs to be multiplied to adjust for the correct order of magnitude. One way of doing this would be with multiple if-statements :
import math
if 'million' in input_str :
    oom = 6
elif 'thousand' in input_str :
    oom = 3
else :
    oom = 0  # no multiplier word: 10**0 == 1 leaves the number unchanged
# adjust number for order of magnitude
f = f*math.pow(10, oom)
Those checks could of course be improved to account for small differences in formatting by using regular expressions.
Finally, you separately determine the currency mentioned in your input string, again using one or more if-statements :
if '£' in input_str :
    currency = 'GBP'
else :
    currency = 'USD'
Now the one case that this doesn't yet handle is the dash one where lower and upper estimates are given. One way of making the function work with these inputs is to split the initial input string on the dash and use the first (or second) of the substrings as input for the initial float parsing. So we would replace our first line of code with something like this:
if '-' in input_str :
    lower = input_str.split('-')[0]
    f = float(''.join([c for c in lower if c in '0123456789.']))
else :
    f = float(''.join([c for c in input_str if c in '0123456789.']))
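Putting the steps above together might look roughly like the following. This is a sketch under a few assumptions: the name parse_budget is hypothetical, a regular expression stands in for the character filter (so a trailing dot from "est." is never picked up), and for a range the lower bound is used.

```python
import re

# Order-of-magnitude words and their factors.
MULTIPLIERS = {"thousand": 10**3, "million": 10**6}


def parse_budget(text):
    """Return (amount, currency) for strings like '$1.2 million'.

    For a range such as '$6-7 million' the lower bound is used.
    """
    # First number in the string: digits with optional thousands commas
    # and an optional decimal part.
    number = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if number is None:
        raise ValueError(f"no number found in {text!r}")
    amount = float(number.group(0).replace(",", ""))

    # Scale by the first order-of-magnitude word that appears, if any.
    for word, factor in MULTIPLIERS.items():
        if word in text:
            amount *= factor
            break

    currency = "GBP" if "£" in text else "USD"
    return round(amount), currency


print(parse_budget("$1.2 million"))         # (1200000, 'USD')
print(parse_budget("US$1,644,736 (est.)"))  # (1644736, 'USD')
print(parse_budget("$6-7 million"))         # (6000000, 'USD')
```

Keeping the multipliers in a dict means a word such as "billion" can be added without touching the function body.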
Using a regex and the string replace method. I added the currency to the return value as well, in case it is needed.
Modify accordingly to handle more inputs or multipliers such as billion.
import re
# take in string and return integer amount and currency
def clean_budget_string(s):
    mult_dict = {'million':1000000,'thousand':1000}
    # Raw string so that \D, \s and \d reach the regex engine unchanged
    tmp = re.search(r'(^\D*?)\s*((?:\d+\.?,?)+)(?:-\d+)?\s*((?:million|thousand)?)', s).groups()
    currency = tmp[0]
    mult = tmp[-1]
    tmp_int = ''.join(tmp[1:-1]).replace(',', '') # join digits and multiplier, remove comma
    tmp_int = int(float(tmp_int) * mult_dict.get(mult, 1))
    return tmp_int, currency
>>> clean_budget_string("$1.2 million")
(1200000, '$')
>>> clean_budget_string("$1,433,333")
(1433333, '$')
>>> clean_budget_string("US$ 2 million")
(2000000, 'US$')
>>> clean_budget_string("US$1,644,736 (est.)")
(1644736, 'US$')
>>> clean_budget_string("$6-7 million")
(6000000, '$')
>>> clean_budget_string("£3 million")
(3000000, '£') # my script doesn't recognize the £ char, might need to set the encoding properly
I am new to Python and I have a hard time solving this.
I am trying to sort a list in human (natural) order: 1) by the first number and 2) by the second number. I would like to have something like this:
'1-1bird'
'1-1mouse'
'1-1nmouses'
'1-2mouse'
'1-2nmouses'
'1-3bird'
'10-1birds'
(...)
Those numbers can be from 1 to 99 ex: 99-99bird is possible.
This is the code I have after a couple of headaches. Being able to then sort by the following first letter would be a bonus.
Here is what I've tried:
#!/usr/bin/python
myList = list()
myList = ['1-10bird', '1-10mouse', '1-10nmouses', '1-10person', '1-10cat', '1-11bird', '1-11mouse', '1-11nmouses', '1-11person', '1-11cat', '1-12bird', '1-12mouse', '1-12nmouses', '1-12person', '1-13mouse', '1-13nmouses', '1-13person', '1-14bird', '1-14mouse', '1-14nmouses', '1-14person', '1-14cat', '1-15cat', '1-1bird', '1-1mouse', '1-1nmouses', '1-1person', '1-1cat', '1-2bird', '1-2mouse', '1-2nmouses', '1-2person', '1-2cat', '1-3bird', '1-3mouse', '1-3nmouses', '1-3person', '1-3cat', '2-14cat', '2-15cat', '2-16cat', '2-1bird', '2-1mouse', '2-1nmouses', '2-1person', '2-1cat', '2-2bird', '2-2mouse', '2-2nmouses', '2-2person']
def mysort(x,y):
    x1=""
    y1=""
    for myletter in x :
        if myletter.isdigit() or "-" in myletter:
            x1=x1+myletter
    x1 = x1.split("-")
    for myletter in y :
        if myletter.isdigit() or "-" in myletter:
            y1=y1+myletter
    y1 = y1.split("-")
    if x1[0]>y1[0]:
        return 1
    elif x1[0]==y1[0]:
        if x1[1]>y1[1]:
            return 1
        elif x1==y1:
            return 0
        else :
            return -1
    else :
        return -1
myList.sort(mysort)
print myList
Thanks !
Martin
You have some good ideas with splitting on '-' and using isalpha() and isdigit(), but then we'll use those to create a function that takes in an item and returns a "clean" version of the item, which can be easily sorted. It will create a three-digit, zero-padded representation of the first number, then a similar thing with the second number, then the "word" portion (instead of just the first character). The result looks something like "001001bird" (that won't display - it'll just be used internally). The built-in function sorted() will use this callback function as a key, taking each element, passing it to the callback, and basing the sort order on the returned value. In the test, I use the * operator and the sep argument to print it without needing to construct a loop, but looping is perfectly fine as well.
def callback(item):
    phrase = item.split('-')
    first = phrase[0].rjust(3, '0')
    second = ''.join(filter(str.isdigit, phrase[1])).rjust(3, '0')
    word = ''.join(filter(str.isalpha, phrase[1]))
    return first + second + word
Test:
>>> myList = ['1-10bird', '1-10mouse', '1-10nmouses', '1-10person', '1-10cat', '1-11bird', '1-11mouse', '1-11nmouses', '1-11person', '1-11cat', '1-12bird', '1-12mouse', '1-12nmouses', '1-12person', '1-13mouse', '1-13nmouses', '1-13person', '1-14bird', '1-14mouse', '1-14nmouses', '1-14person', '1-14cat', '1-15cat', '1-1bird', '1-1mouse', '1-1nmouses', '1-1person', '1-1cat', '1-2bird', '1-2mouse', '1-2nmouses', '1-2person', '1-2cat', '1-3bird', '1-3mouse', '1-3nmouses', '1-3person', '1-3cat', '2-14cat', '2-15cat', '2-16cat', '2-1bird', '2-1mouse', '2-1nmouses', '2-1person', '2-1cat', '2-2bird', '2-2mouse', '2-2nmouses', '2-2person']
>>> print(*sorted(myList, key=callback), sep='\n')
1-1bird
1-1cat
1-1mouse
1-1nmouses
1-1person
1-2bird
1-2cat
1-2mouse
1-2nmouses
1-2person
1-3bird
1-3cat
1-3mouse
1-3nmouses
1-3person
1-10bird
1-10cat
1-10mouse
1-10nmouses
1-10person
1-11bird
1-11cat
1-11mouse
1-11nmouses
1-11person
1-12bird
1-12mouse
1-12nmouses
1-12person
1-13mouse
1-13nmouses
1-13person
1-14bird
1-14cat
1-14mouse
1-14nmouses
1-14person
1-15cat
2-1bird
2-1cat
2-1mouse
2-1nmouses
2-1person
2-2bird
2-2mouse
2-2nmouses
2-2person
2-14cat
2-15cat
2-16cat
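An alternative to building a zero-padded string is to return a tuple from the key function, so that the two numbers are compared as integers and the word as text. The following is a minimal sketch of that variant; the name natural_key and the strict "number-number-word" pattern are assumptions about the data.

```python
import re


def natural_key(item):
    """Split '1-10bird' into (1, 10, 'bird') for sorting."""
    match = re.fullmatch(r"(\d+)-(\d+)(.*)", item)
    if match is None:
        raise ValueError(f"unexpected item format: {item!r}")
    first, second, word = match.groups()
    # Tuples compare element by element: first number, second number, word.
    return int(first), int(second), word


items = ["10-1birds", "1-2mouse", "1-10bird", "1-1mouse", "1-1bird"]
print(sorted(items, key=natural_key))
# ['1-1bird', '1-1mouse', '1-2mouse', '1-10bird', '10-1birds']
```

With integers in the key there is no fixed padding width to choose, so numbers of any length sort correctly.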
You need leading zeros. Strings are sorted alphabetically, which gives a different order from numeric sorting. It should be
'01-1bird'
'01-1mouse'
'01-1nmouses'
'01-2mouse'
'01-2nmouses'
'01-3bird'
'10-1birds'
As you can see, 1 goes after 0.
The other answers here are very respectable, I'm sure, but for full credit you should ensure that your answer fits on a single line and uses as many list comprehensions as possible:
import itertools
[''.join(r) for r in sorted([[''.join(x) for _, x in
                              itertools.groupby(v, key=str.isdigit)]
                             for v in myList], key=lambda v: (int(v[0]), int(v[2]), v[3]))]
That should do nicely:
['1-1bird',
'1-1cat',
'1-1mouse',
'1-1nmouses',
'1-1person',
'1-2bird',
'1-2cat',
'1-2mouse',
...
'2-2person',
'2-14cat',
'2-15cat',
'2-16cat']
Here's what I've got so far:
def encodeFive(zip):
    zero = "||:::"
    one = ":::||"
    two = "::|:|"
    three = "::||:"
    four = ":|::|"
    five = ":|:|:"
    six = ":||::"
    seven = "|:::|"
    eight = "|::|:"
    nine = "|:|::"
    codeList = [zero,one,two,three,four,five,six,seven,eight,nine]
    allCodes = zero+one+two+three+four+five+six+seven+eight+nine
    code = ""
    digits = str(zip)
    for i in digits:
        code = code + i
    return code
With this I get the original zip code back as a string, but none of the digits are encoded into the barcode. I've figured out how to encode one digit, but it won't work the same way with five digits.
codeList = ["||:::", ":::||", "::|:|", "::||:", ":|::|",
":|:|:", ":||::", "|:::|", "|::|:", "|:|::" ]
barcode = "".join(codeList[int(digit)] for digit in str(zipcode))
Perhaps use a dictionary:
barcode = {'0':"||:::",
'1':":::||",
'2':"::|:|",
'3':"::||:",
'4':":|::|",
'5':":|:|:",
'6':":||::",
'7':"|:::|",
'8':"|::|:",
'9':"|:|::",
}
def encodeFive(zipcode):
    return ''.join(barcode[n] for n in str(zipcode))
print(encodeFive(72353))
# |:::|::|:|::||::|:|:::||:
PS. It is better not to name a variable zip, since doing so shadows the builtin function zip. Similarly, it is better to avoid naming a variable code, since code is a module in the standard library.
You're just adding i (the character in digits) to the string where I think you want to be adding codeList[int(i)].
The code would probably be much simpler by just using a dict for lookups.
I find it easier to use split() to create lists of strings:
codes = "||::: :::|| ::|:| ::||: :|::| :|:|: :||:: |:::| |::|: |:|::".split()
def zipencode(numstr):
    return ''.join(codes[int(x)] for x in str(numstr))
print(zipencode("32345"))
This is written in Python.
number = ["||:::",
":::||",
"::|:|",
"::||:",
":|::|",
":|:|:",
":||::",
"|:::|",
"|::|:",
"|:|::"
]
def encode(num):
    return ''.join(map(lambda x: number[int(x)], str(num)))
print(encode(32345))
I don't know what language you are using, so I made an example in C#:
int zip = 72353;
string[] codeList = {
    "||:::", ":::||", "::|:|", "::||:", ":|::|",
    ":|:|:", ":||::", "|:::|", "|::|:", "|:|::"
};
string code = String.Empty;
while (zip > 0) {
    code = codeList[zip % 10] + code;
    zip /= 10;
}
return code;
Note: Instead of converting the zip code to a string and then converting each character back to a number, I calculated the digits numerically.
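For comparison with the Python answers, the same numeric approach can be sketched in Python. One caveat of working on the integer: a ZIP code with leading zeros (for example 02134 stored as 2134) loses those digits if the loop stops when the value reaches zero, so this sketch always emits a fixed number of digits. The function name encode_numeric and the width parameter are assumptions.

```python
CODES = ["||:::", ":::||", "::|:|", "::||:", ":|::|",
         ":|:|:", ":||::", "|:::|", "|::|:", "|:|::"]


def encode_numeric(zipcode, width=5):
    """Encode an integer ZIP code digit by digit, least significant first."""
    code = ""
    for _ in range(width):
        zipcode, digit = divmod(zipcode, 10)
        code = CODES[digit] + code  # prepend, since digits come out reversed
    return code


print(encode_numeric(72353))  # |:::|::|:|::||::|:|:::||:
```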
Just for fun, here's a one-liner:
return String.Concat(zip.ToString().Select(c => "||::::::||::|:|::||::|::|:|:|::||::|:::||::|:|:|::".Substring(((c-'0') % 10) * 5, 5)).ToArray());
It appears you're trying to generate a "postnet" barcode. Note that the five-digit ZIP postnet barcodes were obsoleted by ZIP+4 postnet barcodes, which were obsoleted by ZIP+4+2 delivery point postnet barcodes, all of which are supposed to include a checksum digit and leading and ending framing bars. In any case, all of those forms are being obsoleted by the new "intelligent mail" 4-state barcodes, which require a lot of computational code to generate and no longer rely on straight digit-to-bars mappings. Search USPS.COM for more details.
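As a rough illustration of the checksum digit and framing bars mentioned above, and assuming the usual POSTNET rules (the check digit brings the digit sum up to a multiple of 10, and a single full bar frames each end), the five-digit form might be built like this. The function name postnet is hypothetical, and the USPS specification should be consulted before relying on it.

```python
CODES = ["||:::", ":::||", "::|:|", "::||:", ":|::|",
         ":|:|:", ":||::", "|:::|", "|::|:", "|:|::"]


def postnet(zipcode):
    """Return frame bar + five digits + check digit + frame bar."""
    digits = str(zipcode).zfill(5)  # restore leading zeros lost by an int
    if len(digits) != 5 or not digits.isdigit():
        raise ValueError(f"expected a five-digit ZIP code, got {zipcode!r}")
    # Assumed rule: the sum of all six digits must be a multiple of 10.
    check = (10 - sum(int(d) for d in digits) % 10) % 10
    body = "".join(CODES[int(d)] for d in digits + str(check))
    return "|" + body + "|"


print(postnet(12345))  # digit sum 15, so the check digit is 5
```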