How to change the python string substring information - python

I have 1 bitsInfo string:
bitsInfo="0100001111110001"
and 1 array bitReplace which includes subarray:
bitReplace=[["1","5","00000"],["8","11","0000"]]
The first element of the subarray is startbit location and the second element is the endbit location.
The goal of the script is to replace parts of the bitsInfo string (with the third element of each subarray) based on the startbit and endbit information.
The expected result should be
bitsFinal="0000001100000001"
I have tried this method:
for bits in bitReplace:
    bitsFinal = bits[:int(bits[0])+bits[2]+ bits[int(bits[1]+1:]
This method doesn't really work. May I know what went wrong?

You are close but you are not using the original string anywhere. Try this:
bitsFinal = bitsInfo
for bits in bitReplace:
    bitsFinal = bitsFinal[:int(bits[0])] + bits[2] + bitsFinal[int(bits[1])+1:]
the result is:
>>> bitsFinal
'0000001100000001'
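The loop above can also be wrapped in a small helper. This is only a sketch: the name `replace_bit_ranges` and the length check are assumptions added for illustration, not part of the original answer. The check guards against a replacement whose length differs from the inclusive start..end span, which would shift every later index.

```python
def replace_bit_ranges(bits_info, bit_replace):
    """Return bits_info with each inclusive [start, end] range replaced.

    Hypothetical helper: each entry of bit_replace is [start, end, new_bits],
    with start and end given as strings, as in the question.
    """
    result = bits_info
    for start_s, end_s, new_bits in bit_replace:
        start, end = int(start_s), int(end_s)
        # Assumption: a replacement must keep the string length unchanged.
        if len(new_bits) != end - start + 1:
            raise ValueError(f"replacement {new_bits!r} does not fit bits {start}..{end}")
        result = result[:start] + new_bits + result[end + 1:]
    return result


bits_final = replace_bit_ranges(
    "0100001111110001",
    [["1", "5", "00000"], ["8", "11", "0000"]],
)
print(bits_final)  # 0000001100000001
```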

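The parentheses in your line are unbalanced. With them fixed it reads:
for bits in bitReplace:
    bitsFinal = bits[:int(bits[0])] + bits[2] + bits[int(bits[1])+1:]
Note that this still slices bits rather than bitsInfo, so it needs the change from the answer above as well.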

Related

PySpark / Python Slicing and Indexing Issue

Can someone let me know how to pull out certain values from a Python output?
I would like to retrieve the value 'ocweeklyreports' from the following output using either indexing or slicing:
'config': '{"hiveView":"ocweeklycur.ocweeklyreports"}
This should be relatively easy; however, I'm having a problem defining the slicing/indexing configuration.
The following will successfully give me 'ocweeklyreports'
myslice = config['hiveView'][12:30]
However, I need the indexing or slicing modified so that I get whatever value comes after 'ocweeklycur'.
I'm not sure what output you're dealing with and how robust you're wanting it but if it's just a string you can do something similar to this (for a quick and dirty solution).
input = "Your input"
indexStart = input.index('.') + 1  # Index just after the '.', which is where the wanted value starts
finalResponse = input[indexStart:-2]  # Drop the last two characters, assuming the string ends with "}
print(finalResponse)  # Prints ocweeklyreports
Again, not the most elegant solution but hopefully it helps or at least offers a starting point. Another more robust solution would be to use regex but I'm not that skilled in regex at the moment.
You could do almost all of it using regex.
See if this helps:
import re
def search_word(di):
    st = di["config"]["hiveView"]
    # The dot is escaped so it matches a literal '.' rather than any character
    p = re.compile(r'^ocweeklycur\.(?P<word>\w+)')
    m = p.search(st)
    return m.group('word')
if __name__ == "__main__":
    d = {'config': {"hiveView": "ocweeklycur.ocweeklyreports"}}
    print(search_word(d))
The following worked best for me:
# Extract the value of the "hiveView" key
hive_view = config['hiveView']
# Split the string on the '.' character
parts = hive_view.split('.')
# The value you want is the second part of the split string
desired_value = parts[1]
print(desired_value) # Output: "ocweeklyreports"
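If the `config` value arrives as a JSON string rather than a dictionary (the quoted output in the question suggests it might, but that is an assumption), it could be parsed first and then split at the first dot. A minimal sketch, with the sample value hard-coded for illustration:

```python
import json

# Assumption: the raw value is a JSON string like the one in the question.
raw_config = '{"hiveView":"ocweeklycur.ocweeklyreports"}'

config = json.loads(raw_config)

# partition() splits at the first '.' only and never raises;
# table_name is an empty string if no '.' is present.
database, _, table_name = config["hiveView"].partition(".")
print(table_name)  # ocweeklyreports
```

Unlike `parts[1]`, `partition` does not raise an IndexError when the dot is missing, so the empty-string case can be checked explicitly.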

Python: Find and increment a number in a string

I can't find a solution to this, so I'm asking here. I have a string that consists of several lines and in the string I want to increase exactly one number by one.
For example:
[CENTER]
[FONT=Courier New][COLOR=#00ffff][B][U][SIZE=4]{title}[/SIZE][/U][/B][/COLOR][/FONT]
[IMG]{cover}[/IMG]
[IMG]IMAGE[/IMG][/CENTER]
[QUOTE]
{description_de}
[/QUOTE]
[CENTER]
[IMG]IMAGE[/IMG]
[B]Duration: [/B]~5 min
[B]Genre: [/B]Action
[B]Subgenre: [/B]Mystery, Scifi
[B]Language: [/B]English
[B]Subtitles: [/B]German
[B]Episodes: [/B]01/5
[IMG]IMAGE[/IMG]
[spoiler]
[spoiler=720p]
[CODE=rich][color=Turquoise]
{mediaInfo1}
[/color][/code]
[/spoiler]
[spoiler=1080p]
[CODE=rich][color=Turquoise]
{mediaInfo2}
[/color][/code]
[/spoiler]
[/spoiler]
[hide]
[IMG]IMAGE[/IMG]
[/hide]
[/CENTER]
I'm getting this string from a request and I want to increment the episode by 1. So from 01/5 to 02/5.
What is the best way to make this possible?
I tried to solve this via regex but failed miserably.
Assuming the number you want to change is always after a given pattern, e.g. "Episodes: [/B]", you can use this code:
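def increment_episode_num(request_string, episode_pattern="Episodes: [/B]"):
    # Index of the first character after the pattern, i.e. the start of the episode number
    idx = request_string.find(episode_pattern) + len(episode_pattern)
    # Assumes the episode number is exactly two digits wide
    episode_count = int(request_string[idx:idx+2])
    return request_string[:idx] + f"{(episode_count+1):0>2}" + request_string[idx+2:]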
For example, given your string:
req_str = """[B]Duration: [/B]~5 min
[B]Genre: [/B]Action
[B]Subgenre: [/B]Mystery, Scifi
[B]Language: [/B]English
[B]Subtitles: [/B]German
[B]Episodes: [/B]01/5
"""
res = increment_episode_num(req_str)
print(res)
which gives you the desired output:
[B]Duration: [/B]~5 min
[B]Genre: [/B]Action
[B]Subgenre: [/B]Mystery, Scifi
[B]Language: [/B]English
[B]Subtitles: [/B]German
[B]Episodes: [/B]02/5
As @Barmar suggested in the comments, and following the example from the documentation of re, with formatting to keep the right amount of zero padding:
import re
pattern = r"(?<=Episodes: \[/B\])[\d]+?(?=/\d)"
def add_one(matchobj):
    number = str(int(matchobj.group(0)) + 1)
    return "{0:0>2}".format(number)
# request is the string received from the request; re.sub returns the updated copy
re.sub(pattern, add_one, request)
The pattern uses look-ahead and look-behind to capture only the number that corresponds to Episodes, and should work whether it's in the format 01/5 or 1/5, but always returns in the format 01/5. Of course, you can expand the function so it recognizes the format, or even so it can add different numbers instead of only 1.
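One way to make the replacement keep whatever width the original number had is to pad to the length of the matched text. This is a sketch under that assumption; `increment_episode`, `EPISODE_PATTERN` and the sample string are illustrative names, not taken from the answers above.

```python
import re

# Same look-behind / look-ahead idea as above: match only the digits
# that follow "Episodes: [/B]" and precede "/<digit>".
EPISODE_PATTERN = re.compile(r"(?<=Episodes: \[/B\])\d+(?=/\d)")


def increment_episode(text, step=1):
    """Add step to the episode number, keeping its zero-padded width."""
    def bump(match):
        old = match.group(0)
        # zfill pads to the original width; longer results are left as-is.
        return str(int(old) + step).zfill(len(old))
    return EPISODE_PATTERN.sub(bump, text, count=1)


sample = "[B]Subtitles: [/B]German\n[B]Episodes: [/B]01/5\n"
print(increment_episode(sample))
```

With this version `01/5` becomes `02/5` and `1/5` becomes `2/5`, and passing a different `step` adds a number other than 1.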

How to use regex to dynamically find a value in a file in Python

I have a long string like this for example:
V:"production",PUBLIC_URL:"",WDS_SOCKET_HOST:void 0,WDS_SOCKET_PATH:void 0,WDS_SOCKET_PORT:void 0,FAST_REFRESH:!0,REACT_APP_CANDY_MACHINE_ID:"9mn5duMPUeNW5AJfbZWQgs5ivtiuYvQymqsCrZAenEdW",REACT_APP_SOLANA_NETWORK:"mainnet-beta
and I need to get the value of REACT_APP_CANDY_MACHINE_ID with regex. The value is always 44 characters long, which I hope helps. Also, the file/string I'm pulling it from is much, much longer, and REACT_APP_CANDY_MACHINE_ID appears multiple times, but its value doesn't change.
You don't need regex for that, just use index() to get the location of REACT_APP_CANDY_MACHINE_ID.
data = 'V:"production",PUBLIC_URL:"",WDS_SOCKET_HOST:void 0,WDS_SOCKET_PATH:void 0,WDS_SOCKET_PORT:void 0,FAST_REFRESH:!0,REACT_APP_CANDY_MACHINE_ID:"9mn5duMPUeNW5AJfbZWQgs5ivtiuYvQymqsCrZAenEdW",REACT_APP_SOLANA_NETWORK:"mainnet-beta'
key = "REACT_APP_CANDY_MACHINE_ID"
start = data.index(key) + len(key) + 2
print(data[start: start + 44])
# 9mn5duMPUeNW5AJfbZWQgs5ivtiuYvQymqsCrZAenEdW
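Since the question asked for a regex, here is a sketch of that route as well. It assumes the value is always wrapped in double quotes directly after the key and a colon, as in the sample; the shortened `data` string is for illustration only.

```python
import re

data = 'FAST_REFRESH:!0,REACT_APP_CANDY_MACHINE_ID:"9mn5duMPUeNW5AJfbZWQgs5ivtiuYvQymqsCrZAenEdW",REACT_APP_SOLANA_NETWORK:"mainnet-beta'

# Capture everything between the quotes that follow the key.
match = re.search(r'REACT_APP_CANDY_MACHINE_ID:"([^"]+)"', data)
# re.search returns None when the key is absent, so guard before using it.
machine_id = match.group(1) if match else None
print(machine_id)
```

Because `re.search` stops at the first occurrence, the repeated appearances of the key elsewhere in the file do not matter as long as the value is the same each time.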

Split string every nth character from the right?

I have several very large sets of files which I'd like to put in different subfolders. I already have a consecutive ID for every folder I want to use.
I want to split the ID from the right to always have 1000 folders in the deeper levels.
Example:
id: 100243 => resulting_path: './100/243'
id: 1234567890 => resulting path: '1/234/567/890'
I found Split string every nth character?, but all solutions are from left to right and I also did not want to import another module for one line of code.
My current (working) solution looks like this:
import os
base_path = '/home/made'
n=3 # take every 'n'th from the right
max_id = 12345678900
test_id = 24102442
# current algorithm
str_id = str(test_id).zfill(len(str(max_id)))
ext_path = list(reversed([str_id[max(i-n,0):i] for i in range(len(str_id),0,-n)]))
print(os.path.join(base_path, *ext_path))
Output is: /home/made/00/024/102/442
The current algorithm looks awkward and complicated for the simple thing I want to do.
I wonder if there is a better solution. If not it might help others, anyway.
Update:
I really like Joe Iddon's solution. Using .join and mod makes it faster and more readable.
In the end I decided that I never want to have a / in front. To get rid of the preceding / in case len(s)%3 is zero, I changed the line to
'/'.join([s[max(0,i):i+3] for i in range(len(s)%3-3*(len(s)%3 != 0), len(s), 3)])
Thank you for your great help!
Update 2:
If you are going to use os.path.join (like in my previous code) it's even simpler, since os.path.join takes care of the format of the args itself:
ext_path = [s[0:len(s)%3]] + [s[i:i+3] for i in range(len(s)%3, len(s), 3)]
print(os.path.join('/home', *ext_path))
You can adapt the answer you linked, and use the beauty of mod to create a nice little one-liner:
>>> s = '1234567890'
>>> '/'.join([s[0:len(s)%3]] + [s[i:i+3] for i in range(len(s)%3, len(s), 3)])
'1/234/567/890'
and if you want this to automatically add the dot for cases like your first example:
s = '100243'
then you can add a mini ternary, or use or as suggested by @MosesKoledoye:
>>> '/'.join(([s[0:len(s)%3] or '.']) + [s[i:i+3] for i in range(len(s)%3, len(s), 3)])
'./100/243'
This method will also be faster than reversing the string beforehand or reversing a list.
If you already have a solution for the left-to-right direction, why not simply reverse the input and the output?
s = '1234567890'  # named s rather than str, to avoid shadowing the built-in
s[::-1]
Output:
'0987654321'
You can use the solution you found for left to right and then, you simply need to reverse it again.
You could use regex and modulo to split the strings into groups of three. This solution should get you started:
import re
s = [100243, 1234567890]
final_s = ['./'+'/'.join(re.findall('.{2}.', str(i))) if len(str(i))%3 == 0 else str(i)[:len(str(i))%3]+'/'+'/'.join(re.findall('.{2}.', str(i)[len(str(i))%3:])) for i in s]
Output:
['./100/243', '1/234/567/890']
Try this:
>>> line = '1234567890'
>>> n = 3
>>> rev_line = line[::-1]
>>> out = [rev_line[i:i+n][::-1] for i in range(0, len(line), n)]
>>> out
['890', '567', '234', '1']
>>> "/".join(reversed(out))
'1/234/567/890'
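The approaches above could be wrapped in a small reusable function. The sketch below works from the right end directly, without reversing; `split_from_right` is a hypothetical name, and it assumes a positive `n`.

```python
def split_from_right(s, n=3):
    """Split s into chunks of n characters, counting from the right."""
    # The first chunk holds the leftover len(s) % n characters, if any.
    head = len(s) % n
    chunks = [s[:head]] if head else []
    chunks += [s[i:i + n] for i in range(head, len(s), n)]
    return chunks


print("/".join(split_from_right("1234567890")))  # 1/234/567/890
print("/".join(split_from_right("100243")))      # 100/243
```

Returning a list rather than a joined string leaves the choice of separator to the caller, so the result can be passed to `"/".join(...)` or unpacked into `os.path.join(...)`.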

How to add characters to a variable/integer name - Python

There might be a question like this but I can't find it.
I want to be able to add the value of a variable/integer to a code point, e.g.
num = 5
chr(0x2075)
Now the 2nd line would return 5 in superscript but I want to put the word num into the Unicode instead so something like chr(0x207+num) would return 5 in superscript.
Any ideas? Thanks in advance
chr(0x2070 + num)
As given in the comment, if you want to get the character at U+207x, this is correct.
But this is not the proper way to find the superscript of a number, because U+2071 is ⁱ (superscript "i") while U+2072 and U+2073 are not yet assigned.
>>> chr(0x2070 + 1)
'ⁱ'
The real superscripts ¹ (U+00B9), ² (U+00B2), ³ (U+00B3) are out of place.
>>> chr(0xb9), chr(0xb2), chr(0xb3)
('¹', '²', '³')
Unfortunately, like most things Unicode, the only sane solution here is to hard code it:
def superscript_single_digit_number(x):
    return u'⁰¹²³⁴⁵⁶⁷⁸⁹'[x]
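For numbers with more than one digit, the same hard-coded table can be applied with `str.translate`. A minimal sketch, assuming a non-negative integer; `to_superscript` and `SUPERSCRIPT_DIGITS` are illustrative names.

```python
# Map each ASCII digit to its superscript counterpart.
SUPERSCRIPT_DIGITS = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")


def to_superscript(number):
    """Return the digits of a non-negative integer as superscript characters."""
    return str(number).translate(SUPERSCRIPT_DIGITS)


print(to_superscript(5))     # ⁵
print(to_superscript(2024))  # ²⁰²⁴
```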
