I need some help with getting numbers after a certain text
For example, I have a list:
['Martin 9', ' Leo 2 10', ' Elisabeth 3']
Now I need to change the list into variables like this:
Martin = 9
Leo2 =10
Elisabeth = 3
I haven't tried many things, because I'm new to Python.
Thanks for reading my question and probably helping me
I suppose you got the list using selenium. You can extract the integer from each item without using re, like this:
names = ["Martin 9", "Leo 10", "Elisabeth 3"]
for a in names:
    print(''.join(filter(str.isdigit, a)))
Output :
9
10
3
Let's assume you loaded it from a file and read it line by line:
numbers = []
for line in lines:
    parts = line.split()
    if parts[0] == "Martin":
        numbers.append(int(parts[1]))
I don't think you really want a bunch of variables, especially if the list grows large. Instead I think you want a dictionary where the names are the key and the number is the value.
I would start by splitting your string into a list as described in this question/answer.
s = """Martin 9
Leo 10
Elisabeth 3"""
listData = s.splitlines()
Then I would turn that list into a dictionary as described in this question/answer.
myDictionary = {}
for listItem in listData:
    i = listItem.split(' ')
    myDictionary[i[0]] = int(i[1])
Then you can access the number, in Leo's case 10, via:
myDictionary["Leo"]
I have not tested this syntax and I'm not used to python, so I'm sure a little debugging will be involved. Let me know if I need to make some corrections.
I hope that helps :)
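The dictionary approach above can be sketched end to end against the list from the question. This is only a minimal sketch: it assumes the number is always the last whitespace-separated token and that a multi-word name such as "Leo 2" should be collapsed to "Leo2". The names parse_item and values are illustrative and do not come from the answers.

```python
items = ['Martin 9', ' Leo 2 10', ' Elisabeth 3']

def parse_item(item):
    # rsplit with maxsplit=1 separates the trailing number from the name,
    # even when the name itself contains spaces or digits.
    name, number = item.strip().rsplit(None, 1)
    # Assumption: spaces inside the name are dropped ("Leo 2" -> "Leo2").
    return name.replace(' ', ''), int(number)

values = dict(parse_item(item) for item in items)
print(values["Leo2"])
```

Splitting from the right avoids the problem of the digit filter, which would merge "2" and "10" into "210" for the "Leo 2 10" entry.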
s="""Leo 1 9
Leo 2 2"""
I shortened the list here.
Look for the word in the string:
if "Leo 1" in s:
Then get its index:
i = s.index("Leo 1")
Then get the number of characters in the word:
length = len("Leo 1")
Then add the two together (+1 because of the space):
b = i + length + 1
And then read the digit at position b:
x = int(s[b])
So in total:
if "Leo 1" in s:
    i = s.index("Leo 1")
    length = len("Leo 1")
    b = i + length + 1
    x = int(s[b])  # reads a single digit only
I have a text from which I need to delete the first two words and store the numbers in variables.
I am trying to split the text into words and then loop over them to store each one in a variable.
My text is: "ABA BLLO 70000000 12-2022"
The numbers can vary depending on the data set, and I want to create a variable for each of them.
text = "ABA BLLO 70000000 12-2022"
a = text.strip().strip("")
for a in text:
    print(a)
So I would have three variables:
number = 70000000
month = 12
year = 2022
You can use the split function to split the string on whitespace, which gives you a list of substrings. Then you can slice the list to drop the first two elements and unpack the rest into variables.
text = "ABA BLLO 70000000 12-2022"
x,y = text.split()[2:]
print(x,y)
NOTE : This would work only if there's a fixed format for the input string.
If I understand you correctly, try this code:
text = "ABA BLLO 70000000 12-2022"
counter = 0
word = []
number = []
for a in text.split(" "):
    if counter <= 1:
        word.append(a)
    else:
        number.append(a)
    counter += 1
print(word)
print(number)
The output will be
['ABA', 'BLLO']
['70000000', '12-2022']
I don't know if I'm catching your drift but here's my answer:
text = "ABA BLLO 70000000 12-2022"
new_text = text.split(" ")
for i in range(2, len(new_text)):
    if i == 2:
        number = new_text[i]
    else:
        month = new_text[i].split("-")[0]
        year = new_text[i].split("-")[1]
print(f"Number: {number}\nMonth: {month}\nYear: {year}")
After using something like
tempSplit = text.split()
you get a list. You can then keep only the purely numeric strings:
result = [s for s in tempSplit if s.isdigit()]
You can convert those to int objects. The problem is the fourth element, which is a date: it fails the isdigit check because of the hyphen, so you have to handle it separately.
As @roganjosh suggested in the comments, you should work through other tutorials to learn about the different functions. For this case, for instance, you can try the split function and then learn how to pick only the numbers out of a list.
To get the month and year from the date:
month = tempSplit[3].split("-")[0]
year = tempSplit[3].split("-")[1]
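Putting the pieces from these answers together, here is a minimal sketch that ends with three integer variables. It assumes the input always has exactly four whitespace-separated fields and that the last one has the form MM-YYYY. The intermediate names (number_str, date_str and so on) are illustrative.

```python
text = "ABA BLLO 70000000 12-2022"

# Unpack the four fields; the first two words are discarded.
_, _, number_str, date_str = text.split()

# The date field is split on the hyphen into month and year.
month_str, year_str = date_str.split("-")

number = int(number_str)
month = int(month_str)
year = int(year_str)

print(number, month, year)
```

If the number of fields can vary, the unpacking raises a ValueError, which is a reasonable signal that the line does not match the expected format.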
I am fairly new to Python and am having difficulties with this (most likely simple) problem. I'm accepting a file with the format:
year_they_won_championship name_of_sports_team
e.g.,
1991 Minnesota
1992 Toronto
1993 Toronto
They are already separated into a nested list [year][name]. I am tasked to add up all the repetitions from the list and display them as such in a new file.
Toronto 2
Minnesota 1
My code is as follows-
def write_tab_seperated(n):
    '''
    N is the filename
    '''
    file = open(n, "w")
    # names are always in the second position?
    data[2] = names
    countnames = ()
    # counting the names
    for x in names:
        # make sure they are all the same
        x = str(name).lower()
        # add one if it shows.
        if x in countnames:
            countnames[x] += 1
        else:
            countnames[x] = 1
    # finish writing the file
    file.close
This is so wrong it's funny, but I planned out where to go from here:
Take the file
separate into the names list
add 1 for each repetition
display in name(tab)number format
close the file.
Any help is appreciated and thank you in advance!
There's a built-in datatype that's perfect for your use case called collections.Counter.
I'm assuming from the sample I/O formatting that your data file columns are tab separated. In the question text it looks like 4-spaces — if that's the case, just change '\t' to ' ' or ' '*4 below.
with open('data.tsv') as f:
    lines = (l.strip().split('\t') for l in f.readlines())
Once you've read the data in, it really is as simple as passing it to a Counter and specifying that it should create counts on the values in the second column.
from collections import Counter
c = Counter(x[1] for x in lines)
And printing them back out for reference:
for k, v in c.items():
    print('{}\t{}'.format(k, v))
Output:
Minnesota 1
Toronto 2
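To get the order shown in the question (most wins first), Counter.most_common can be used instead of iterating over items. The following is a minimal sketch that works on an in-memory nested list rather than a file; the rows value is an assumed stand-in for the already-parsed [year, name] data, and report is an illustrative name.

```python
from collections import Counter

# Assumed shape of the nested list from the question: [year, name] pairs.
rows = [["1991", "Minnesota"], ["1992", "Toronto"], ["1993", "Toronto"]]

# Count how often each team name appears in the second column.
counts = Counter(name for _year, name in rows)

# most_common() yields (name, count) pairs, highest count first.
report = "\n".join("{}\t{}".format(name, n) for name, n in counts.most_common())
print(report)
```

The report string can then be written to the output file in a single write call.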
From what I understand through your explanation, the following is my piece of code:
# input.txt is the input file with <year><tab><city> data
with open('input.txt', 'r') as f:
    input_list = [x.strip().split('\t') for x in f]
output_dict = {}
for per_item in input_list:
    if per_item[1] in output_dict:
        output_dict[per_item[1]] += 1
    else:
        output_dict[per_item[1]] = 1
# output file has <city><tab><number of occurrences>
with open("output.txt", "w") as file_output:
    for per_val in output_dict:
        file_output.write(per_val + "\t" + str(output_dict[per_val]) + "\n")
Let me know if it helps.
One of the great things about python is the huge number of packages. For handling tabular data, I'd recommend using pandas and the csv format:
import pandas as pd
years = list(range(1990, 1994))
names = ['Toronto', 'Minnesota', 'Boston', 'Toronto']
dataframe = pd.DataFrame(data={'years': years, 'names': names})
dataframe.to_csv('path/to/file.csv')
That being said, I would still highly recommend to go through your code and learn how these things are done from scratch.
Assume I have A1 as the only cell in a workbook, and it's blank.
I want my code to add "1" "2" and "3" to it so it says "1 2 3"
As of now I have:
NUMBERS = [1, 2, 3, 4, 5]
ThisSheet.Cells(1,1).Value = NUMBERS
this just writes the first value to the cell. I tried
ThisSheet.Cells(1,1).Value = Numbers[0-2]
but that just puts the LAST value in there. Is there a way for me to just add all of the data in there? This information will always be in String format, and I need to use Win32Com.
update:
I did
stringVar = ', '.join(str(v) for v in LIST)
UPDATE: this .join works perfectly for the NUMBERS list. Now I tried applying it to another list that looks like this
LIST=[Description Good\nBad, Description Valid\nInvalid]
If I print LIST[0] The outcome is
Description Good
Bad
Which is what I want. But if I use .join on this one, it prints
('Description Good\nBad, Description Valid\nInvalid')
so for this one I need it to print as though I did LIST[0] and LIST[1]
So if you want to put each number in a different cell, you would do something like:
it = 1
for num in NUMBERS:
    ThisSheet.Cells(1,it).Value = num
    it += 1
Or if you want the first 3 numbers in the same cell:
ThisSheet.Cells(1,1).Value = ' '.join([str(num) for num in NUMBERS[:3]])
Or all of the elements in NUMBERS:
ThisSheet.Cells(1,1).Value = ' '.join([str(num) for num in NUMBERS])
EDIT
Based on your question edit, for string types containing \n and assuming every time you find a newline character, you want to jump to the next row:
# Split LIST[0] on the \n character
splitted_lst0 = LIST[0].split('\n')
# Write each piece to its own row (Cells takes the row first, then the column)
it = 1
for line in splitted_lst0:
    ThisSheet.Cells(it,1).Value = line
    it += 1
If you want to do this for the whole LIST and not only for LIST[0], first merge it with the join method, using '\n' as the separator so that neighbouring items stay apart, and then split the result:
joined_list = '\n'.join(LIST).split('\n')
And then, iterate through it the same way as we did before.
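The string handling in this answer can be checked without Excel. The sketch below only builds the values that would be assigned to the cells; it makes no win32com calls, and the LIST contents are an assumed reading of the question's example with the quotes restored. The names cell_text and rows are illustrative.

```python
NUMBERS = [1, 2, 3, 4, 5]
# Assumption: the question's LIST holds two strings with embedded newlines.
LIST = ["Description Good\nBad", "Description Valid\nInvalid"]

# One cell holding the first three numbers separated by spaces.
cell_text = ' '.join(str(num) for num in NUMBERS[:3])

# One entry per line across the whole list; joining on '\n' keeps
# "Bad" and "Description Valid" from running together.
rows = '\n'.join(LIST).split('\n')

print(cell_text)
print(rows)
```

Each entry of rows can then be written to its own row with the same loop as above.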
I have a .txt File and I want to get the values in a list.
The format of the txt file should be:
value0,timestamp0
value1,timestamp1
...
...
...
In the end I want to get a list with
[[value0,timestamp0],[value1,timestamp1],.....]
I know it's easy to get these values by
direction = []
for line in open(filename):
    value, t = line.strip().split(',')
    value = float(value)
    t = long(t)
    direction.append([value, t])
But I have a big problem: when creating the data I forgot to insert a "\n" after each row.
That's why I have this format:
value0, timestamp0value1,timestamp1value2,timestamp2value3.....
Every timestamp has exactly 13 characters.
Is there a way to get this data into a list in the form I want? It would be a lot of work to collect the data again.
Thanks
Max
import re
data = "value0,0123456789012value1,0123456789012value2,0123456789012value3"
# each record is a value, a comma, and a 13-character timestamp
for value, timestamp in re.findall(r"([^,]+),(.{13})", data):
    print(value, timestamp)
You will have to strip the trailing comma, but you can insert a comma after every 13 characters that follow a comma:
import re
s = "-0.1351197,1466615025472-0.25672746,1466615025501-0.3661744,1466615025531-0.46467665,1466615025561-0.5533287,1466615025591-0.63311553,1466615025621-0.7049236,1466615025652-0.7695509,1466615025681-1.7158673,1466615025711-1.6896278,1466615025741-1.65375,1466615025772-1.6092329,1466615025801"
print(re.sub("(?<=,)(.{13})",r"\1"+",", s))
Which will give you:
-0.1351197,1466615025472,-0.25672746,1466615025501,-0.3661744,1466615025531,-0.46467665,1466615025561,-0.5533287,1466615025591,-0.63311553,1466615025621,-0.7049236,1466615025652,-0.7695509,1466615025681,-1.7158673,1466615025711,-1.6896278,1466615025741,-1.65375,1466615025772,-1.6092329,1466615025801,
I coded a quickie using your example, and not using 13 but len("timestamp") so you can adapt
instr = "value,timestampvalue2,timestampvalue3,timestampvalue4,timestamp"
previous_i = 0
for i,c in enumerate(instr):
    if c==",":
        next_i = i+len("timestamp")+1
        print(instr[previous_i:next_i])
        previous_i = next_i
output is descrambled:
value,timestamp
value2,timestamp
value3,timestamp
value4,timestamp
I think you could do something like this:
direction = []
for line in open(filename):
    parts = line.split(',')
    v = parts[0]
    for s in parts[1:]:
        t = s[:13]
        direction.append([float(v), long(t)])
        v = s[13:]
If you're using python 3.X, then the long function no longer exists -- use int.
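For Python 3, the regular-expression idea from the earlier answer can produce the nested list directly. This is a minimal sketch, assuming every timestamp is exactly 13 digits and that values never contain a comma. The sample string is a shortened copy of the data shown above, and raw and pairs are illustrative names.

```python
import re

# Shortened sample in the run-together format from the question.
raw = "-0.1351197,1466615025472-0.25672746,1466615025501-0.3661744,1466615025531"

# Each record is: anything up to a comma, the comma, then 13 digits.
pairs = [[float(value), int(timestamp)]
         for value, timestamp in re.findall(r"([^,]+),(\d{13})", raw)]

print(pairs)
```

Because the timestamp length is fixed, the pattern consumes exactly 13 digits and the next value (including its minus sign) starts the following match.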
I am in the process of understanding how to compare data from two text files and print the data that does not match into a new document or output.
The Program Goal:
Allow the user to compare the data in a file that contains many lines of data with a default file that has the correct values of the data.
Compare multiple lines of different data with the same parameters against a default list of the data with the same parameters
Example:
Let's say I have the following text document that has these parameters and data:
Let's call it Config.txt:
<231931844151>
Bird = 3
Cat = 4
Dog = 5
Bat = 10
Tiger = 11
Fish = 16
<92103884812>
Bird = 4
Cat = 40
Dog = 10
Bat = Null
Tiger = 19
Fish = 24
etc. etc.
Let's call this my configuration data. Now I need to make sure that the values of these parameters inside my config data file are correct.
So I have a default data file that has the correct values for these parameters/variables. Let's call it Default.txt:
<Correct Parameters>
Bird = 3
Cat = 40
Dog = 10
Bat = 10
Tiger = 19
Fish = 234
This text file is the default configuration or the correct configuration for the data.
Now I want to compare these two files and print out the data that is incorrect.
So, in theory, if I were to compare these two text documents I should get the following output. Let's call this Output.txt:
<231931844151>
Cat = 4
Dog = 5
Tiger = 11
Fish = 16
<92103884812>
Bird = 4
Bat = Null
Fish = 24
etc. etc.
Since these are the parameters that are incorrect or do not match. So in this case we see that for <231931844151> the parameters Cat, Dog, Tiger, and Fish did not match the default text file so those get printed. In the case of <92103884812> Bird, Bat, and Fish do not match the default parameters so those get printed.
So that's the gist of it for now.
Code:
This is my current approach. However, I'm not sure how to compare a data file that has several sets of lines with the same parameters against a default data file.
configFile = "Config.txt"
defaultFile = "Default.txt"
with open(configFile) as f:
    dataConfig = f.read().splitlines()
with open(defaultFile) as d:
    dataDefault = d.read().splitlines()
def make_dict(data):
    return dict((line.split(None, 1)[0], line) for line in data)
defdict = make_dict(dataDefault)
outdict = make_dict(dataConfig)
# Create a sorted list containing all the keys
allkeys = sorted(set(defdict) | set(outdict))
# print(allkeys)
difflines = []
for key in allkeys:
    indef = key in defdict
    inout = key in outdict
    if indef and not inout:
        difflines.append(defdict[key])
    elif inout and not indef:
        difflines.append(outdict[key])
    else:
        # key must be in both dicts
        defval = defdict[key]
        outval = outdict[key]
        if outval != defval:
            difflines.append(outval)
for line in difflines:
    print(line)
Summary:
I want to compare two text documents that have data/parameters in them. One text document will have a series of data sets with the same parameters, while the other will have just one set of data with those parameters. I need to compare those parameters and print out the ones that do not match the default. How can I go about doing this in Python?
EDIT:
Okay, so thanks to @Maria's code I think I am almost there. Now I just need to figure out how to compare the dictionary to the list and print out the differences. Here's an example of what I am trying to do:
for i in range(len(setNames)):
    print(setNames[i])
    for k in setData[i]:
        if k in dataDefault:
            print(dataDefault)
Obviously the print line is just there to see whether it worked, but I'm not sure if this is the proper way of going about this.
Sample code for parsing the file into separate dictionaries. This works by finding the group separators (blank lines). setNames[i] is the name of the set of parameters in the dictionary at setData[i]. Alternatively you can create an object which has a string name member and a dictionary data member and keep a list of those. Doing the comparisons and outputting it how you want is up to you, this just regurgitates the input file to the command line in a slightly different format.
# The function you wrote
def make_dict(data):
    return dict((line.split(None, 1)[0], line) for line in data)
# open the file and read the lines into a list of strings
with open("Config.txt", "r") as f:
    dataConfig = f.read().splitlines()
# get rid of trailing '', as they cause problems and are unnecessary
while dataConfig and dataConfig[-1] == '':
    dataConfig.pop()
# find the indexes of all the ''. They amount to one index past the end of each set of parameters
setEnds = []
index = 0
while '' in dataConfig[index:]:
    setEnds.append(dataConfig[index:].index('') + index)
    index = setEnds[-1] + 1
# separate out your input into separate dictionaries, and keep track of the name of each dictionary
setNames = []
setData = []
i = 0
j = 0
while j < len(setEnds):
    setNames.append(dataConfig[i])
    setData.append(make_dict(dataConfig[i+1:setEnds[j]]))
    i = setEnds[j] + 1
    j += 1
# handle the last index to the end of the list. Alternatively you could add len(dataConfig) to the end of setEnds and you wouldn't need this
# (checking i also covers a file that holds a single set and no blank lines)
if i < len(dataConfig):
    setNames.append(dataConfig[i])
    setData.append(make_dict(dataConfig[i+1:]))
# regurgitate the input to prove it worked the way you wanted.
for i in range(len(setNames)):
    print(setNames[i])
    for k in setData[i]:
        print("\t" + k + ": " + setData[i][k])
    print("")
Why not just use those dicts and loop through them to compare?
for key in outdict:
    if defdict.get(key) != outdict[key]:
        print(outdict[key])
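The comparison asked for in the question can be sketched in one pass over the config text. This is only a sketch under stated assumptions: set headers are lines starting with "<", parameters are lines of the form "name = value", and the files are passed in as strings so the example runs without Config.txt or Default.txt. The function name diff_config is hypothetical.

```python
def diff_config(config_text, default_text):
    """Return each set header plus the 'name = value' lines that differ from the defaults."""
    # Build the reference mapping; lines without '=' (such as the header) are skipped.
    defaults = {}
    for line in default_text.splitlines():
        name, sep, value = line.partition("=")
        if sep:
            defaults[name.strip()] = value.strip()

    report = []
    for line in config_text.splitlines():
        line = line.strip()
        name, sep, value = line.partition("=")
        if line.startswith("<"):
            report.append(line)  # keep every set header
        elif sep and defaults.get(name.strip()) != value.strip():
            report.append(line)  # value differs from (or is missing in) the defaults
    return report


default_text = """<Correct Parameters>
Bird = 3
Cat = 40
Dog = 10
Bat = 10
Tiger = 19
Fish = 234"""

config_text = """<231931844151>
Bird = 3
Cat = 4
Dog = 5
Bat = 10
Tiger = 11
Fish = 16

<92103884812>
Bird = 4
Cat = 40
Dog = 10
Bat = Null
Tiger = 19
Fish = 24"""

print("\n".join(diff_config(config_text, default_text)))
```

Comparing the stripped values rather than whole lines means differences in spacing around "=" are not reported as mismatches.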