Python - Split in one iteration - python

Why are functions such as split often not available, as in the final image?
How can I use split in this iteration?
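from bs4 import BeautifulSoup

soup = BeautifulSoup(teste, "html.parser")
Vagas = soup.find_all(title="Vaga disponível.")
temp=[]
for i in Vagas:
    on_click = i.get('onclick')
    temp.append(on_click)
# The next line raises AttributeError: 'list' object has no attribute 'split'
achado2 = temp.split('\'')[1::6]
print(temp)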

Here you defined temp as a list. It is already a list of elements, so .split() cannot be applied to it.
split() is a string method: it breaks a string into a list of substrings.

I solved it by converting my list to a string and then calling split on that string.
temp=[]
for i in Vagas:
    on_click = i.get('onclick')
    temp.append(on_click)
texto = str(temp)
achado2 = texto.split('\'')[1::6]
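An alternative that avoids converting the whole list to a string is to split each onclick value on its own. The sketch below assumes each value looks like a function call with quoted arguments; the sample strings and the name reservar are hypothetical stand-ins for the scraped attributes.

```python
# Hypothetical onclick values standing in for what i.get('onclick') returns.
onclick_values = [
    "reservar('A12', 'x', 'y')",
    "reservar('B07', 'x', 'y')",
]

# Split each string on the single quote and keep the first quoted argument.
# The "if value" guard skips tags that have no onclick attribute (None).
first_args = [value.split("'")[1] for value in onclick_values if value]

print(first_args)  # ['A12', 'B07']
```

Because each string is split separately, the result does not depend on how many quoted arguments each call has, unlike the [1::6] step over the combined string.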

Related

Turn each word into a variable with python

I have a text from which I need to delete the first two words and store the numbers in variables.
I am trying to split the words and then create a loop to store each word in a variable.
My text is: "ABA BLLO 70000000 12-2022"
So I am trying to store the numbers, which can vary depending on the data set, and create a variable for each of them.
text = "ABA BLLO 70000000 12-2022"
a = text.strip().strip("")
for a in text:
    print(a)
So I would have three variables:
number = 70000000
month = 12
year = 2022
You can use the split function to split the string on whitespace, which gives you a list of the substrings. Then you can slice the list to remove the first two elements and unpack the remaining elements into variables.
text = "ABA BLLO 70000000 12-2022"
x,y = text.split()[2:]
print(x,y)
Note: This works only if the input string has a fixed format.
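To get the three variables the question asks for, the same unpacking idea can be taken one step further. This is a minimal sketch that assumes the input always has exactly four whitespace-separated fields and a month-year date joined by a hyphen.

```python
text = "ABA BLLO 70000000 12-2022"

# Discard the first two words and keep the two numeric fields.
_, _, number_text, date_text = text.split()

# Split the date on the hyphen and convert everything to int.
month_text, year_text = date_text.split("-")
number, month, year = int(number_text), int(month_text), int(year_text)

print(number, month, year)  # 70000000 12 2022
```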
If I understand you correctly, try this code:
text = "ABA BLLO 70000000 12-2022"
counter = 0
word = []
number = []
for a in text.split(" "):
    if counter <= 1:
        word.append(a)
    else:
        number.append(a)
    counter += 1
print(word)
print(number)
The output will be
['ABA', 'BLLO']
['70000000', '12-2022']
I don't know if I'm catching your drift but here's my answer:
text = "ABA BLLO 70000000 12-2022"
new_text = text.split(" ")
for i in range(2, len(new_text)):
    if i == 2:
        number = new_text[i]
    else:
        month = new_text[i].split("-")[0]
        year = new_text[i].split("-")[1]
print(f"Number: {number}\nMonth: {month}\nYear: {year}")
After using something like
tempSplit = text.split()
you get a list of strings.
result = [s for s in tempSplit if s.isdigit()]
This keeps only the strings made up entirely of digits (here, '70000000'), which you can then convert with int(). The fourth element, '12-2022', is a date, so you have to handle it separately.
As @roganjosh suggested in the comments, you should check other tutorials to learn about the different functions available. For this case, you could try the split function and then learn how to pick only the numbers out of a list.
To get dates
month = tempSplit[3].split("-")[0]
year = tempSplit[3].split("-")[1]
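If you would rather treat the last field as a real date, the standard library's datetime.strptime can parse it. This is only a sketch, assuming the field is always a two-digit month and a four-digit year separated by a hyphen.

```python
from datetime import datetime

text = "ABA BLLO 70000000 12-2022"
parts = text.split()

number = int(parts[2])

# "%m-%Y" matches a month number, a hyphen, and a four-digit year.
parsed = datetime.strptime(parts[3], "%m-%Y")
month, year = parsed.month, parsed.year

print(number, month, year)  # 70000000 12 2022
```

A malformed date raises ValueError here, which can be easier to notice than a silently wrong split.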

How can I replace an item in a list with a string that has a space in it?

I am trying to simply replace a list item with another item, except the new item has a space in it. When it replaces, it creates two list items when I only want one. How can I make it just one item in the list please?
Here is a minimal reproducible example:
import re
cont = "BECMG 2622/2700 32010KT CAVOK"
actual = "BECMG 2622"
sorted_fm_becmg = ['BECMG 262200', '272100']
line_to_print = 'BECMG 262200'
becmg = re.search(r'%s[/]\d\d\d\d' % re.escape(actual), cont).group()
new_becmg = "BECMG " + becmg[-4:] + "00" # i need to make this one list item when it replaces 'line_to_print'
sorted_fm_becmg = (' '.join(sorted_fm_becmg).replace(line_to_print, new_becmg)).split()
print(sorted_fm_becmg)
I need sorted_fm_becmg to look like this : ['BECMG 270000', '272100'].
I've tried making new_becmg a list item, I have tried removing the space in the string in new_becmg but I need the list item to have a space in it.
It is probably something simple but I can't get it. Thank you.
You can iterate through sorted_fm_becmg to replace each string individually instead:
sorted_fm_becmg = [b.replace(line_to_print, new_becmg) for b in sorted_fm_becmg]
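For reference, here is the question's example with that change applied, as a runnable sketch. The original version joined the list into one string and then called split(), which splits on every space, including the one inside 'BECMG 270000'.

```python
import re

cont = "BECMG 2622/2700 32010KT CAVOK"
actual = "BECMG 2622"
sorted_fm_becmg = ['BECMG 262200', '272100']
line_to_print = 'BECMG 262200'

# Find "BECMG 2622/2700" and build the replacement from its last four digits.
becmg = re.search(r'%s[/]\d\d\d\d' % re.escape(actual), cont).group()
new_becmg = "BECMG " + becmg[-4:] + "00"

# Replace inside each item separately so the list structure is never flattened.
sorted_fm_becmg = [b.replace(line_to_print, new_becmg) for b in sorted_fm_becmg]

print(sorted_fm_becmg)  # ['BECMG 270000', '272100']
```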

Remove Prefixes From a String

What's a cute way to do this in python?
Say we have a list of strings:
clean_be
clean_be_al
clean_fish_po
clean_po
and we want the output to be:
be
be_al
fish_po
po
Another approach which will work for all scenarios:
import re
data = ['clean_be',
        'clean_be_al',
        'clean_fish_po',
        'clean_po', 'clean_a', 'clean_clean', 'clean_clean_1']
for item in data:
    item = re.sub('^clean_', '', item)
    print(item)
Output:
be
be_al
fish_po
po
a
clean
clean_1
Here is a possible solution that works with any prefix:
prefix = 'clean_'
result = [s[len(prefix):] if s.startswith(prefix) else s for s in lst]
You've provided only minimal information on what you're trying to achieve, but the desired output for the 4 given inputs can be produced with the following function:
def func(string):
    return "_".join(string.split("_")[1:])
You can do this:
strlist = ['clean_be','clean_be_al','clean_fish_po','clean_po']
def func(myList: list, start: str):
    ret = []
    for element in myList:
        # Caution: lstrip removes any leading characters found in start,
        # not the exact prefix, so 'clean_a' would become ''.
        ret.append(element.lstrip(start))
    return ret
print(func(strlist, 'clean_'))
I hope it was useful. Nohab
There are many ways to do this, based on what you have provided.
Besides the answers above, you can also do it this way:
string = 'clean_be_al'
string = string.replace('clean_','',1)
This removes the first occurrence of clean_ in the string.
If the string is guaranteed to start with 'clean_', you can also slice it off:
string = 'clean_be_al'
print(string[6:])
You can use lstrip to remove leading characters and rstrip to remove trailing ones:
line = "clean_be"
print(line.lstrip("clean_"))
Drawback: according to the documentation for lstrip([chars]), the chars argument is not a prefix; rather, all combinations of its values are stripped.
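The difference is easy to see side by side. The sketch below assumes Python 3.9 or newer, where str.removeprefix is available; the sample list is borrowed from the regex answer above.

```python
data = ["clean_be", "clean_a", "clean_clean_1"]
prefix = "clean_"

# lstrip treats its argument as a set of characters, so it can strip too much.
stripped = [s.lstrip(prefix) for s in data]

# removeprefix removes the exact prefix once, and only if it is present.
removed = [s.removeprefix(prefix) for s in data]

print(stripped)  # ['be', '', '1']
print(removed)   # ['be', 'a', 'clean_1']
```

On older Python versions, the startswith-and-slice approach shown earlier gives the same result as removeprefix.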

Question about String operations in Python

I am new to Python, and was practising File Operations. I have written this program:
myfile = open('test3.txt', 'w+')
myfile.writelines(['Doctor', 'Subramanian', 'Swamy', 'Virat', 'Hindustan', 'Sangam'])
which outputs the following:
DoctorSubramanianSwamyViratHindustanSangam.
How do I add spaces in between items of the list in the final output such that the final output is Doctor Subramanian Swamy Virat Hindustan Sangam?
Based on what I understood from your question, you want to add spaces between the elements of the list in the final output. One possible solution is:
words = ['Doctor', 'Subramanian', 'Swamy', 'Virat', 'Hindustan', 'Sangam']
with open('test3.txt', 'w+') as myfile:
    for word in words:
        myfile.write(word + ' ')
In particular, the line myfile.write(word + ' ') adds a space after every element, including the last one.
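If the trailing space after the last word is unwanted, joining the list first is a common alternative. This is a minimal sketch; the temporary directory is an assumption made only to keep the example self-contained, and you would normally open 'test3.txt' directly.

```python
import os
import tempfile

words = ['Doctor', 'Subramanian', 'Swamy', 'Virat', 'Hindustan', 'Sangam']

# A temporary directory is used here only so the example cleans up after itself.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'test3.txt')

    # ' '.join puts a single space between items and none at the end.
    with open(path, 'w') as myfile:
        myfile.write(' '.join(words))

    with open(path) as myfile:
        content = myfile.read()

print(content)  # Doctor Subramanian Swamy Virat Hindustan Sangam
```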
You could try stripping it using the same .strip() method?
value = "'"
list1 = []
# list2 is assumed to be your existing list of strings.
for item in list2:
    list1.append(item.strip(value))
Try it and let me know.

Remove unwanted substring from a list of strings at specified indexes

I'm new to Python and I want to remove the prefix from two strings, keeping everything from the J onward and removing the .json.
I tried using [:1], but it removes the entire first string.
name = ['190523-105238-J105150.json',
'190152-105568-J616293.json']
I want to output this
name = ['J105150',
'J616293']
You can use split() in a list-comprehension:
name = ['190523-105238-J105150.json',
'190152-105568-J616293.json']
print([x.rsplit('-', 1)[1].split('.')[0] for x in name])
# ['J105150', 'J616293']
You could use the find() function and string slicing.
name = ['190523-105238-J105150.json' ,'190152-105568-J616293.json']
for i in range(len(name)):
    start_of_json = name[i].find('.json')
    start_of_name = name[i].find('J')
    name[i] = name[i][start_of_name:start_of_json]
Doing [:1] will slice your current list to take only elements that are before index 1, so only element at index 0 will be present.
This is not what you want.
A regex can help you reach your goal.
import re
output = [re.search(r'-(\w+)\.json$', x).group(1) for x in your_list]
Firstly, it is not a data frame; it is a list.
You could use something as simple as the line below, assuming the strings have a fixed structure.
name = [x[x.index("J"):x.index(".")] for x in name]
Here are two possible approaches:
One is more verbose. The other does essentially the same thing but condenses it into a one-liner, if you will.
Approach 1:
In approach 1, we create an empty list to store the results temporarily.
From there we parse each item of name and .split() each item on the hyphens.
For each item, this will yield a list composed of three elements: ['190523', '105238', 'J105150.json'] for example.
We use the index [-1] to select just the last element and then .replace() the text .json with the empty string '' effectively removing the .json.
We then append the item to the new_names list.
Lastly, we overwrite the variable label name, so that it points at the new list we generated.
name = ['190523-105238-J105150.json', '190152-105568-J616293.json']
new_names = []
for item in name:
    item = item.split('-')[-1]
    new_names.append(item.replace('.json', ''))
name = new_names
Approach 2:
name = ['190523-105238-J105150.json', '190152-105568-J616293.json']
name = [item.split('-')[-1].replace('.json', '') for item in name]
Originally, the list is name = ['190523-105238-J105150.json', '190152-105568-J616293.json'].
List comprehensions in Python are extremely useful and powerful.
In eq = [name[i][name[i].find("J"):name[i].rfind(".json")] for i in range(len(name))], a list comprehension creates a new list from name by slicing each string from the position of J up to, but not including, .json. Both find() and rfind() return an integer index.
The complete code can be seen below.
def main():
    name = ['190523-105238-J105150.json', '190152-105568-J616293.json']
    eq = [name[i][name[i].find("J"):name[i].rfind(".json")] for i in range(len(name))]
    print(eq)

if __name__ == "__main__":
    main()
output: ['J105150', 'J616293']
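Since the strings look like file names, another option is to let pathlib drop the extension. This is only a sketch of a different technique from the answers above, and it assumes the wanted part always follows the last hyphen.

```python
from pathlib import Path

name = ['190523-105238-J105150.json', '190152-105568-J616293.json']

# Path(...).stem drops the final extension; rsplit keeps the part after the last hyphen.
result = [Path(item).stem.rsplit('-', 1)[-1] for item in name]

print(result)  # ['J105150', 'J616293']
```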
