My text file looks like so: comp_339,9.93/10
I want to extract just the rating as a float value. I couldn't solve this with an 'if' statement.
Is there any simple solution for this?
d2 = {}
with open('ra.txt', 'r') as r:
    for line in r:
        s = line.strip().split(',')
        d2[s[0]] = "".join(s[1:])
print(d2)
You can do it like this:
with open('ra.txt', 'r') as r:
    for line in r:
        s = line.strip().split(',')
        rating, _ = s[1].split("/", 1)
        print(float(rating))
First you split the line into "comp_339" and "9.93/10", then you keep the latter, split it again on "/", keep the first part, and convert it to a float.
line = "comp_339,9.93/10"
s = float(line.split(',')[1].split('/')[0])
# s is now 9.93
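The two-step split described above could be wrapped in a small helper, roughly as sketched here. The function name parse_rating and the second sample line are made up for illustration; only the "name,score/max" layout comes from the question.

```python
def parse_rating(line):
    """Return (name, rating) from a line such as 'comp_339,9.93/10'."""
    name, score = line.strip().split(",", 1)   # 'comp_339', '9.93/10'
    rating, _, _ = score.partition("/")        # keep the part before '/'
    return name, float(rating)


# The first line is from the question; the second is a made-up example.
sample_lines = ["comp_339,9.93/10\n", "comp_340,7.5/10\n"]
ratings = dict(parse_rating(line) for line in sample_lines)
print(ratings)
```

With a real file, the list of sample lines would simply be replaced by the open file object.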
You may use a simple regular expression:
import re
line = "comp_339,9.93/10"
before, rating, highscore = re.split(r'[/,]+', line)
print(rating)
Which yields
9.93
You could try this code:
ratings = {}
with open('ra.txt', 'r') as f:
    for l in f:
        if l.strip():  # skip blank lines
            c, r = l.split(',')
            ratings[c.strip()] = float(r.split('/')[0].strip())
I'm working on code that sends mail to the people listed in a text file.
This is the text file:
X,y#gmail.com
Z,v#gmail.com
This is my code:
with open("mail_list.txt","r",encoding ="utf-8") as file:
    a = file.read()
b = a.split("\n")
d = []
for i in b:
    c = i.split(",")
    d.append(c)
for x in d:
    for y in x:
        print(x[0])
        print(x[1])
The output should be:
X
y#gmail.com
Z
v#gmail.com
But instead it is:
X
y#gmail.com
X
y#gmail.com
Z
v#gmail.com
Z
v#gmail.com
Why is that?
How can I fix it?
You're iterating over the columns in every row, but not using the column value:
for x in d:
    for y in x:
        print(y)
Please have a look at this solution, which I believe is simpler than the current one. Instead of reading the whole file and splitting on \n (line break) yourself, let readlines() return the lines already split at the line breaks, then use each line as you need.
lines = []
with open('mail_list.txt') as f:
    lines = f.readlines()
for line in lines:
    info = line.strip().split(',')  # strip() drops the trailing line break
    print(info[0])
    print(info[1])
You only need to iterate over the list d.
with open("mail_list.txt", "r", encoding ="utf-8") as file:
    a = file.read()
b = a.split("\n")
d = []
for i in b:
    c = i.split(",")
    d.append(c)
for x in d:
    print(x[0])
    print(x[1])
To make it simpler, you can read your file line by line and process it at the same time. The strip() method removes leading and trailing whitespace, including the end-of-line character.
with open("mail_list.txt", "r", encoding ="utf-8") as file:
    for line in file:
        line_s = line.split(",")
        print(line_s[0])
        print(line_s[1].strip())
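Another option, not mentioned in the answers above, is the standard csv module, which does the comma splitting and line-break handling for you. This is only a sketch: the io.StringIO object stands in for the opened mail_list.txt, and the names contacts, name and address are chosen for illustration.

```python
import csv
import io

# Stand-in for open("mail_list.txt", encoding="utf-8"); contents copied from the question.
sample = io.StringIO("X,y#gmail.com\nZ,v#gmail.com\n")

contacts = []
for row in csv.reader(sample):
    if len(row) != 2:          # skip blank or malformed lines
        continue
    name, address = (field.strip() for field in row)
    contacts.append((name, address))

for name, address in contacts:
    print(name)
    print(address)
```

Unpacking each row into two names also avoids the nested loop that caused the duplicated output in the question.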
I am looking for words in my chosen text files.
def cash_sum(self):
    with open(self.infile.name, "r") as myfile:
        lines = myfile.readlines()
        for line in lines:
            if re.search("AMOUNT", line):
                x = []
                x.append(line[26:29])
                print(x)
I have this output:
['100']
['100']
['100']
And I want to add them up so I get the sum of all amounts in this file.
Any advice?
Initialize x outside the line-iterating loop.
Also, you don't need to call readlines() separately; you can iterate over the file directly.
def cash_sum(self):
    x = []
    with open(self.infile.name, "r") as myfile:
        for line in myfile:
            if re.search("AMOUNT", line):
                x.append(line[26:29])
                # or cast to int before appending:
                # x.append(int(line[26:29]))
    return x
This will give you the sum:
def cash_sum(self):
    with open(self.infile.name, 'r') as myfile:
        lines = myfile.readlines()
    totalAmount = 0
    for line in lines:
        if re.search("AMOUNT", line):
            totalAmount += float(line[26:29])
    return totalAmount
First thing I'd recommend is casting that value to a float. I say float instead of int just in case you have a decimal point. For example, float('100') == 100.0. Once you have that, you should be able to add them up as you would normal numbers.
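The cast-then-add idea can also be written with the built-in sum(), as in this rough sketch. The function name total_amount and the sample records are hypothetical; the records are padded only so that the amount sits at positions 26 to 28, matching the line[26:29] slice from the question.

```python
import re


def total_amount(lines, start=26, end=29):
    """Sum the numeric field at line[start:end] for every line containing AMOUNT."""
    return sum(
        float(line[start:end])
        for line in lines
        if re.search("AMOUNT", line)
    )


# Made-up records: the text is padded to 26 characters, then the amount follows.
records = [
    "AMOUNT".ljust(26) + "100\n",
    "HEADER".ljust(26) + "999\n",   # ignored, no AMOUNT marker
    "AMOUNT".ljust(26) + "250\n",
]
total = total_amount(records)
print(total)
```

Because an open file is itself an iterable of lines, the same function should work if the file object is passed in place of the list.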
You have redefined x inside your loop, so every time the loop iterates, it resets x to an empty list. Try the following:
def cash_sum(self):
    with open(self.infile.name, "r") as myfile:
        lines = myfile.readlines()
    x = []  # That way x is defined outside the loop
    for line in lines:
        if re.search("AMOUNT", line):
            x.append(line[26:29])
    print(x)
    # And if you want to add each item in the list...
    num = float(0)
    for number in x:
        num = num + float(number)
    return num
I have a text file which contains 800 words, each with a number next to it. (Each word and its number are on their own line, so the file has 800 lines.) I have to find the numbers and then multiply them together. Because multiplying that many small floats underflows to zero, I have to use logarithms to prevent the underflow, but I don't know how.
This is the formula:
$c_{NB} = \arg\max_c \left[ \log P(c) + \log P(x \mid c) \right]$
This code doesn't print anything:
output = []
with open('c:/python34/probEjtema.txt', encoding="utf-8") as f:
    w, h = map(int, f.readline().split())
    tmp = []
    for i, line in enumerate(f):
        if i == h:
            break
        tmp.append(map(int, line.split()[:w]))
    output.append(tmp)
print(output)
The file's language is Persian.
A snippet of the file:
فعالان 0.0019398642095053346
محترم 0.03200775945683802
اعتباري 0.002909796314258002
مجموع 0.0038797284190106693
حل 0.016488845780795344
مشابه 0.004849660523763337
مشاوران 0.027158098933074686
مواد 0.005819592628516004
معادل 0.002909796314258002
ولي 0.005819592628516004
ميزان 0.026188166828322017
دبير 0.0019398642095053346
دعوت 0.007759456838021339
اميد 0.002909796314258002
You can use regular expressions to find the first number in each line, e.g.
import re
output = []
with open('c:/python34/probEjtema.txt', encoding="utf-8") as f:
    for line in f:
        match = re.search(r'\d+\.?\d*', line)
        if match:
            output.append(float(match.group()))
print(output)
re.search(r'\d+\.?\d*', line) looks for the first number (an integer, or a float with a decimal point) in each line. The dot is escaped so that it matches only a literal ".".
Here is a nice online regex tester: https://regex101.com/ (for debugging / testing).
/Edit: changed regex to \d+\.?\d* to catch integers and float numbers.
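To see why escaping the dot matters, here is a small sketch that compares a pattern with an unescaped dot against one with an escaped dot. The sample strings are made up, apart from the one line taken from the file snippet in the question.

```python
import re

LOOSE = re.compile(r"\d+.?\d*")     # unescaped dot: matches ANY single character
STRICT = re.compile(r"\d+\.?\d*")   # escaped dot: matches only a literal "."

# On "7a3" the loose pattern swallows the letter, so float() would fail on it.
loose_hit = LOOSE.search("7a3").group()
strict_hit = STRICT.search("7a3").group()
print(loose_hit, strict_hit)

samples = ["حل 0.016488845780795344", "count 42", "no digits here"]
found = []
for text in samples:
    match = STRICT.search(text)
    if match:                        # lines without a number are skipped
        found.append(float(match.group()))
print(found)
```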
If I understood you correctly, you could do something along the lines of:
result = 1
with open('c:/python34/probEjtema.txt', encoding="utf-8") as f:
    for line in f:
        word, number = line.split()  # line.split("\t") if numbers are separated by tab
        result = result * float(number)
This will create an output list with all the numbers. result will hold the final product, and eres will hold the sum of the base-10 logarithms, which avoids the underflow.
import math
output = []
result = 1
eres = 0
with open('c:/python34/probEjtema.txt', encoding="utf-8") as f:
    for line in f:
        number = line.split()[1]
        output.append(number)
        result *= float(number)
        eres += math.log10(float(number))  # running sum of log base 10
print(output)
print(result)
print(eres)
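The reason the log sum helps can be checked on a small scale. The sketch below uses three probabilities copied from the question's snippet, plus a made-up list of 800 identical values to stand in for the full file.

```python
import math

# Three probabilities copied from the file snippet in the question.
probs = [0.0019398642095053346, 0.03200775945683802, 0.002909796314258002]

product = 1.0
for p in probs:
    product *= p                               # fine for a handful of values

log_sum = sum(math.log(p) for p in probs)      # equals log(product)
print(math.log(product), log_sum)

# Made-up stand-in for 800 small probabilities: the product underflows to 0.0,
# while the sum of logs stays an ordinary negative number.
many = [0.002] * 800
many_product = 1.0
for p in many:
    many_product *= p
many_log_sum = sum(math.log(p) for p in many)
print(many_product, many_log_sum)
```

Since the logarithm is monotonic, comparing log sums picks the same argmax as comparing the products would.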
How do I count the number of times a certain two-character string (e.g. 'hi') appears in a text file?
And how do I print the total as an int?
I tried doing this:
for line in open('test.txt'):
    ly = line.split()
    for i in ly:
        a = i.count('ly')
print(sum(a))
But it failed, thanks in advance!
Your program fails because your variable a is an integer and you cannot apply the sum function to an integer.
Several examples have already been presented. Here is mine:
with open("test.txt") as fp:
    a = fp.read().count('ly')
print(a)
You can simply count 'ly' on each line and add up the counts:
sum(line.count('ly') for line in open('test.txt'))
Different approach (note that this counts whole words equal to 'hi', not substrings):
from collections import Counter
text = open('test.txt').read()
word_count = Counter(text.split())
print(word_count['hi'])
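The Counter approach and the str.count approach answer slightly different questions, which the following sketch tries to make visible. The sample text is made up.

```python
from collections import Counter

text = "hi there, this ship is high\nhi again"   # made-up sample text

# str.count finds non-overlapping occurrences anywhere, including inside words.
substring_hits = text.count("hi")

# Counter over split() only counts tokens that are exactly "hi".
word_hits = Counter(text.split())["hi"]

print(substring_hits, word_hits)
```

Which one is wanted depends on whether "this" and "ship" should count as containing 'hi'.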
total = 0
for line in open('test.txt'):
    ly = line.split()
    total += sum(i.count('hi') for i in ly)
print(total)
You can try something like this:
a = 0
for line in open('test.txt'):
    ly = line.split()
    for i in ly:
        if 'word' in i:
            a = a + 1
print(a)
So I have a text file consisting of one column; each line contains two numbers:
190..255
337..2799
2801..3733
3734..5020
5234..5530
5683..6459
8238..9191
9306..9893
I would like to discard the very first and the very last number (in this case, 190 and 9893) and basically move the rest of the numbers one spot forward, like this.
My desired output:
255..337
2799..2801
3733..3734
5020..5234
5530..5683
6459..8238
9191..9306
I hope that makes sense. I'm not sure how to approach this.
lines = """190..255
337..2799
2801..3733"""
values = [int(v) for line in lines.split() for v in line.split('..')]
# values = [190, 255, 337, 2799, 2801, 3733]
pairs = list(zip(values[1:-1:2], values[2:-1:2]))
# pairs = [(255, 337), (2799, 2801)]
out = '\n'.join('%d..%d' % pair for pair in pairs)
# out = "255..337\n2799..2801"
Try this:
with open(filename, 'r') as f:
    lines = f.readlines()
numbers = []
for row in lines:
    numbers.extend(row.strip().split('..'))  # strip() removes the line break
numbers = numbers[1:len(numbers)-1]
newLines = ['..'.join(numbers[idx:idx+2]) for idx in range(0, len(numbers), 2)]
with open(filename, 'w') as f:
    for line in newLines:
        f.write(line)
        f.write('\n')
Try this:
Read all of them into one list, split each line into two numbers, so you have one list of all your numbers.
Remove the first and last item from your list
Write out your list, two items at a time, with dots in between them.
Here's an example:
a = """190..255
337..2799
2801..3733
3734..5020
5234..5530
5683..6459
8238..9191
9306..9893"""
a_list = a.replace('..','\n').split()
b_list = a_list[1:-1]
b = ''
for i in range(len(b_list) // 2):
    b += '..'.join(b_list[2*i:2*i+2]) + '\n'
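The three steps listed above (flatten, drop the two ends, re-pair) can also be sketched as one function. The name shift_ranges is made up, and the sample is a shortened copy of the data from the question.

```python
def shift_ranges(text):
    """Drop the first and last number, then re-pair the rest as 'a..b' lines."""
    # Step 1: flatten every 'a..b' line into one list of number strings.
    numbers = [n for line in text.split() for n in line.split("..")]
    # Step 2: remove the first and last item.
    inner = numbers[1:-1]
    # Step 3: take the items two at a time and join them with dots.
    pairs = zip(inner[0::2], inner[1::2])
    return "\n".join(f"{a}..{b}" for a, b in pairs)


sample = "190..255\n337..2799\n2801..3733\n3734..5020"
result = shift_ranges(sample)
print(result)
```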
temp = []
with open('temp.txt') as ofile:
    for x in ofile:
        temp.append(x.rstrip("\n"))
for x in range(0, len(temp) - 1):
    print(temp[x].split("..")[1] + ".." + temp[x+1].split("..")[0])
Maybe this will help:
def makeColumns(listOfNumbers):
    n = 0
    while n < len(listOfNumbers):
        print(listOfNumbers[n], '..', listOfNumbers[n+1], sep='')
        n += 2
def trim(listOfNumbers):
    listOfNumbers.pop(0)
    listOfNumbers.pop()
listOfNumbers = [190, 255, 337, 2799, 2801, 3733, 3734, 5020, 5234, 5530, 5683, 6459, 8238, 9191, 9306, 9893]
makeColumns(listOfNumbers)
print()
trim(listOfNumbers)
makeColumns(listOfNumbers)
I think this might be useful too. I am reading data from a file named list.
data = open("list", "r")
temp = []
value = []
print(data)
for line in data:
    temp = line.split("..")
    value.append(temp[0])
    value.append(temp[1])
for i in range(1, len(value) - 1, 2):
    print(value[i].strip() + ".." + value[i+1])
print(value)
After reading the data, I split each line and store the result in the temporary list temp. Then I copy the data to the main list value, which holds all of the numbers. Finally, I iterate from the second element to the second-to-last element to get the output of interest. The strip function is used to remove the '\n' character from the value.
You can later write these values to a file instead of printing them.
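Writing to a file instead of printing might look roughly like this. The pairs are copied from the desired output in the question; the temporary directory and the file name shifted.txt are placeholders for wherever the real output should go.

```python
import os
import tempfile

pairs = ["255..337", "2799..2801", "3733..3734"]   # taken from the desired output

# A temporary directory stands in for the real destination folder.
with tempfile.TemporaryDirectory() as folder:
    out_path = os.path.join(folder, "shifted.txt")  # hypothetical file name
    with open(out_path, "w") as out:
        out.write("\n".join(pairs) + "\n")          # one pair per line
    with open(out_path) as check:
        written = check.read()                      # read back to confirm

print(written)
```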