I am trying to split a string (the values are numbers, but they are currently stored as strings in a df column) and am struggling to find an answer anywhere. I think regular expressions might be the way forward, but I haven't quite got my head around them.
example 1) 12.540%
example 2) 4.555.6%
I would like to take everything to the left of the first '.' and only one digit to the right of that same '.'
I need to apply it to numbers of all different lengths; the rule above is the only constant.
example 1 ) 12.5 and 40%
example 2) 4.5 and 55.6%
Thank you
The following function should do what you want:
def split_string(num):
    s = num.split('.', 1)
    s1 = s[0] + '.' + s[1][0]
    s2 = s[1][1:]
    return (s1, s2)
This is a straightforward problem in string manipulation. Any string tutorial will teach you the basic operations.
Find the location of the period.
Add 1.
Split the string at that point: grab one slice through that index; a second slice from there to the end.
For instance, once you have found the location loc and adjusted it 1 or 2 spots to the right (s is your string):
num, pct = s[:loc], s[loc:]
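The steps above can be sketched as follows. This is a minimal sketch, assuming every value contains at least one '.' followed by at least one character; the function name split_after_first_decimal is illustrative and not part of the original answer.

```python
def split_after_first_decimal(text):
    """Split text just after the first character that follows the first '.'."""
    # str.index raises ValueError if there is no '.' in text.
    # +1 steps past the '.', and another +1 keeps one digit after it.
    loc = text.index('.') + 2
    return text[:loc], text[loc:]


for value in ['12.540%', '4.555.6%']:
    print(split_after_first_decimal(value))
```

For the two examples in the question this gives ('12.5', '40%') and ('4.5', '55.6%').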
If you want regular expressions, catch the groups using this.
^(\d+\..)(.*)$
Use it with re.search, for example:
b = re.search(r'^(\d+\..)(.*)$', string)
b.group(1)
b.group(2)
Example:
val = '12.445.6'
b = re.search(r'^(\d+\..)(.*)$', val)
b.group(1)
Out[24]: '12.4'
b.group(2)
Out[25]: '45.6'
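If some values might not match the pattern, re.search returns None and calling .group on it would fail. A minimal sketch that guards against this, assuming you want None back for non-matching input; the names PATTERN and split_with_regex are illustrative:

```python
import re

# Digits, a '.', exactly one more character, then the rest of the string.
PATTERN = re.compile(r'^(\d+\..)(.*)$')


def split_with_regex(text):
    """Return (number, remainder), or None if text does not match."""
    match = PATTERN.search(text)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(split_with_regex('12.540%'))
print(split_with_regex('no decimal point'))
```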
Related
I am reading an input 'S1,10'. First component is a string and second is an integer. They are separated by comma.
I have tried x = input().split(','). This creates a list ['S1','10']. How can I create a list ['S1', 10] where the second element is an integer?
I have solved this in a two step process.
bp = input().split(',')
bp[1] = int(bp[1])
Can it be done in a single step? How can we split with different datatypes?
I mean, if you really want to force it, it can be done in a single line
x = [s if i == 0 else int(s) for i,s in enumerate(input().split(','))]
But at the end of the day, code is for humans to understand. If I were you I would keep what you have.
Here is a solution using the zip() function. With this approach it is easy to vary which functions to apply to the parts of the split string.
x = [f(s) for f, s in zip([str, int], input().split(','))]
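One thing to keep in mind with the zip() approach is that zip silently stops at the shorter sequence, so a line with too few or too many fields would not raise an error. A minimal sketch that adds a length check, using a fixed string instead of input(); the name parse_fields and its parameters are illustrative:

```python
def parse_fields(line, converters, sep=','):
    """Split line on sep and apply one converter per field."""
    parts = line.split(sep)
    if len(parts) != len(converters):
        raise ValueError(
            f"expected {len(converters)} fields, got {len(parts)}: {line!r}"
        )
    return [convert(part) for convert, part in zip(converters, parts)]


print(parse_fields('S1,10', [str, int]))
```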
When I use the replace function, I can pass an additional third argument that specifies how many occurrences of the particular character I want to change.
For example:
input_string = input()
first_char = input_string[0]
modified_string = input_string.replace(first_char, "$", input_string.count(first_char)-1)
print(modified_string)
The above code gives the following output:
Input: heyhhdh
Output: $ey$$dh
It replaced the h starting from the first occurrence, but is there a way to specify where to start?
For instance, in the problem I'm working on I need to leave the first character untouched, so is there a way to specify that in Python?
Edit:
The following line of code commented by Tarique performs my task
modified_string = first_char + input_string[1:].replace(first_char, "$", input_string.count(first_char)-1)
However, is there a way to do this using only string functions, for example by modifying the arguments to the replace function?
You could do what you already got, except without the pointless counting:
>>> first_char + input_string[1:].replace(first_char, '$')
'hey$$d$'
A single replace without anything else can't do it, but two can:
>>> input_string.replace(first_char, '$').replace('$', first_char, 1)
'hey$$d$'
That's only two linear-time operations instead of three, and for longer strings it's faster. For input_string = 'hey$$d$' * 10**6 the first way takes me 12.1 ms and the second way takes me 9.4 ms.
A third but silly and slow (30.9 ms) way, simulating backwards-replacing by reversing the string before and after:
>>> input_string[::-1].replace(first_char, '$', input_string.count(first_char) - 1)[::-1]
'hey$$d$'
Tarique's method is going to be the only way involving the replace method. You can specify the maximum number of characters to replace (see the bottom of this python documentation page), but that is the opposite of what you want. This is the same for Python 3, as seen here.
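The slicing approach can be wrapped in a small helper. This is a minimal sketch, assuming the goal is to keep the first character and replace every later occurrence of it; the name mask_repeats_of_first_char and the mark parameter are illustrative:

```python
def mask_repeats_of_first_char(text, mark='$'):
    """Keep text[0] and replace every later occurrence of it with mark."""
    if not text:
        # Nothing to replace in an empty string.
        return text
    first_char = text[0]
    # Only the part after the first character is passed to replace().
    return first_char + text[1:].replace(first_char, mark)


print(mask_repeats_of_first_char('heyhhdh'))
```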
I'm working on Advent of Code: Day 2, and I'm having trouble working with lists. My code takes a string, for example 2x3x4, and splits it into a list. Then it checks for an 'x' in the list, removes each one, and feeds the values to a function that calculates the area needed. The problem is that before it removes the 'x's I need to find out whether there are two digits before the 'x' and combine them, to account for double-digit numbers. I've looked into regular expressions, but I don't think I've been using them right. Any ideas?
def CalcAreaBox(l, w, h):
    totalArea = (2*(l*w)) + (2*(w*h)) + (2*(h*l))
    extra = l * w
    toOrder = totalArea + extra
    print(toOrder)
def ProcessString(dimStr):
    #separate chars into a list
    dimStrList = list(dimStr)
    #How to deal with double digit nums?
    #remove any x
    for i in dimStrList:
        if i == 'x':
            dimStrList.remove(i)
    #Feed the list to CalcAreaBox
    CalcAreaBox(int(dimStrList[0]), int(dimStrList[1]), int(dimStrList[2]))
dimStr = "2x3x4"
ProcessString(dimStr)
You could use split on your string (call it on the original string dimStr, not on a list):
#remove any x and put in list of ints
dims = [int(dim) for dim in dimStr.split('x')]
#Feed the list to CalcAreaBox
CalcAreaBox(dims[0], dims[1], dims[2])
Of course, you will want to handle the cases where there are not exactly two x's in the string.
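A minimal sketch of the split-based parsing with a check for malformed input. The function name parse_dimensions and the error message are illustrative, and raising ValueError is just one reasonable way to handle bad input:

```python
def parse_dimensions(dim_str):
    """Parse a string such as '10x12x3' into a tuple of three ints."""
    parts = dim_str.split('x')
    if len(parts) != 3:
        raise ValueError(f"expected 3 dimensions, got {len(parts)}: {dim_str!r}")
    # int() itself raises ValueError for non-numeric parts such as ''.
    length, width, height = (int(part) for part in parts)
    return length, width, height


print(parse_dimensions('2x3x4'))
print(parse_dimensions('10x12x3'))
```

Because split works on the whole string, multi-digit numbers need no special handling.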
Your question is more likely to fit on Code Review and not Stack Overflow.
As your task is a little challenge, I would not tell you an exact solution, but give you a hint towards the split method of Python strings (see the documentation).
Additionally, you should check the style of your code against the recommendation in PEP8, e.g. Python usually has function/variable names in all lowercase letters, words separated by underscores (like calc_area_box).
I have a binary string, say '01110000', and I want to return the number of leading zeros without writing a for loop. Does anyone have any idea how to do that? Preferably in a way that also returns 0 if the string immediately starts with a '1'.
If you're really sure it's a "binary string":
input = '01110000'
zeroes = input.index('1')
Update: it breaks when there's nothing but "leading" zeroes
An alternate form that handles the all-zeroes case.
zeroes = (input+'1').index('1')
Here is another way:
In [36]: s = '01110000'
In [37]: len(s) - len(s.lstrip('0'))
Out[37]: 1
It differs from the other solutions in that it actually counts the leading zeroes instead of finding the first 1. This makes it a little bit more general, although for your specific problem that doesn't matter.
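A minimal sketch of the lstrip approach wrapped in a function, with a few edge cases; the name count_leading_zeros is illustrative:

```python
def count_leading_zeros(text):
    """Count the '0' characters at the start of text."""
    # lstrip('0') removes only the leading zeros, so the length
    # difference is exactly the number of zeros removed.
    return len(text) - len(text.lstrip('0'))


for sample in ['01110000', '1000', '0000', '']:
    print(repr(sample), count_leading_zeros(sample))
```

Unlike index('1'), this does not raise an error for all-zero or empty strings.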
A simple one-liner:
x = '01110000'
leading_zeros = len(x.split('1', 1)[0])
This partitions the string into everything up to the first '1' and the rest after it, then counts the length of the prefix. The second argument to split is just an optimization: it is the maximum number of splits to perform, so the function stops after it finds the first '1' instead of splitting on all occurrences. You could just use x.split('1')[0] if performance doesn't matter.
I'd use:
import itertools
s = '00001010'
sum(1 for _ in itertools.takewhile('0'.__eq__, s))
Rather pythonic, works in the general case, for example on the empty string and non-binary strings, and can handle strings of any length (or even iterators).
If you know it's only 0 or 1:
x.find('1')
(will return -1 if all zeros; you may or may not want that behavior)
If you don't know which number would be next to zeros i.e. "1" in this case, and you just want to check if there are leading zeros, you can convert to int and back and compare the two.
"0012300" == str(int("0012300"))
How about re module?
a = re.search('(?!0)', data)
then a.start() is the position.
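A minimal sketch of how the lookahead behaves on a few inputs; the function name first_non_zero_position is illustrative. Because the lookahead also succeeds at the end of the string, the search always finds a match, so no None check is needed here:

```python
import re


def first_non_zero_position(text):
    """Return the index of the first position not followed by '0'."""
    # (?!0) is a zero-width negative lookahead: it matches at the first
    # position whose next character is not '0', including end of string.
    return re.search(r'(?!0)', text).start()


for sample in ['01110000', '1000', '0000', '']:
    print(repr(sample), first_non_zero_position(sample))
```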
I'm using has_leading_zero = re.match(r'0\d+', str(data)) as a solution that accepts any number and treats 0 as a valid number without a leading zero
When we need to slice a string at a particular location, we need to know the index at which to start.
For example, in the string:
>>> s = 'Your ID number is: 41233'
I want to slice the string starting from : and get the number.
Sure I can count at what index : is and then slice, but is that really a good approach?
Of course I can do a s.index(':'). But that would be an extra step, so I came up with something like:
>>> print(s[s.index(':') + 2:])
41233
But somehow I don't like the looks of it.
So my question is, given a long string which you want to slice, how do you find the index from where to begin the slicing in the easiest and most readable way? If there is a trick to do it orally, I would love to know that.
Perhaps you could use split():
>>> s = 'Your ID number is: 41233'
>>> print(s.split(":")[1].strip())
41233
text, sep, number = 'Your ID number is: 41233'.partition(':')
print(number)
works too, and unlike indexing the result of split, it won't raise an error if the separator is not in the string.
That unpacking works for split too:
text, number = 'Your ID number is: 41233'.split(':',1)
Another approach is 'Your ID number is: 41233'.split(':')[1].strip().
So my question is, given a long string which you want to slice, how do you find the index from where to begin the slicing in the easiest and most readable way?
When "where to begin the slicing" is a specific symbol, you don't; instead you just ask Python to split the string up with that symbol as a delimiter, or partition it into the bits before/within/after the symbol, as in the other answers. (split can split the string into several pieces if several delimiters are found; partition will always give three pieces even if the symbol is not there at all.)
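The difference between split and partition when the separator is missing can be seen in a short sketch. The variable names and the helper number_after_colon are illustrative:

```python
with_sep = 'Your ID number is: 41233'
without_sep = 'no separator here'

# partition always returns a 3-tuple: (before, separator, after).
print(with_sep.partition(':'))
print(without_sep.partition(':'))

# split returns a list whose length depends on how many separators were found.
print(with_sep.split(':', 1))
print(without_sep.split(':', 1))


def number_after_colon(text):
    """Return the stripped text after the first ':', or None if there is none."""
    _, sep, rest = text.partition(':')
    # An empty sep means the separator was not found.
    return rest.strip() if sep else None


print(number_after_colon(with_sep))
print(number_after_colon(without_sep))
```

Unpacking the result of split into two names would raise ValueError for the second string, while the partition unpacking always succeeds.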
If there is a trick to do it orally, I would love to know that.
I really don't think you mean "orally". :)
I wouldn't use slicing at all unless there's some other compelling reason you want to do so. Instead, this sounds like a perfect job for re, the regular expression module in the standard library. Here's an example of using it to solve your problem:
import re
compile_obj = re.compile(r'Your ID number is:\s(?P<ID>\d+)')
s = 'Your ID number is: 41233'
match_obj = compile_obj.search(s)
if match_obj:
    print(match_obj.group('ID'))
# 41233
Recently came across partition
string = "Your ID number is: 41233"
string = string.partition(':')
print(string[2])