Removing first character from string - python

I am working on a CodeEval challenge and have a solution to a problem which takes a list of numbers as an input and then outputs the sum of the digits of each line. Here is my code to make certain you understand what I mean:
import sys
test_cases = open(sys.argv[1], 'r')
for test in test_cases:
    if test:
        num = int(test)
        total = 0
        while num != 0:
            total += num % 10
            num /= 10
        print total
test_cases.close()
I am attempting to rewrite this where it takes the number as a string, slices each 0-index, and then adds those together (curious to see what the time and memory differences are - totally new to coding and trying to find multiple ways to do things as well)
However, I am stuck on getting this to execute and have the following:
import sys
test_cases = open(sys.argv[1], 'r')
for test in test_cases:
    sums = 0
    while test:
        sums = sums + int(str(test)[0])
        test = test[1:]
    print sums
test_cases.close()
I am receiving a "ValueError: invalid literal for int() with base 10: ''"
The sample input is a text file which looks like this:
3011
6890
8778
1844
42
8849
3847
8985
5048
7350
8121
5421
7026
4246
4439
6993
4761
3658
6049
1177
Thanks for any help you can offer!

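Your issue is the newline characters (e.g. \n or \r\n) at the end of each line. Once the digits have been sliced off, only the newline is left, and int() cannot convert it.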
Change this line:
for test in test_cases:
into this to split out the newlines:
for test in test_cases.read().splitlines():
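A minimal sketch of the string-based approach once the line ending is stripped, written for Python 3. The function name digit_sum is hypothetical and not part of the original code; the inputs are taken from the sample file, with the line endings that file iteration would leave on them.

```python
def digit_sum(line):
    """Sum the decimal digits in one line of input, ignoring the line ending."""
    digits = line.strip()  # drops '\n' or '\r\n' and other surrounding whitespace
    return sum(int(ch) for ch in digits)


# A few sample lines, each still carrying its line ending.
for raw in ["3011\n", "6890\r\n", "42\n"]:
    print(digit_sum(raw))
```

An empty line gives 0 here instead of raising ValueError, because there are no characters left to convert.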

Try this code:
import sys
tot = 0
with open(sys.argv[1], 'r') as f:
    for line in f:
        try:
            tot += int(line)
        except ValueError:
            print "Not a number"
print tot
Using the context manager (with ...), the file is closed automatically.
Converting with int() filters out any empty or invalid value.
You can replace print with whatever statement suits you (raise or pass, depending on your goals).

Related

How to insert random spaces in txt file?

I have a file with lines of DNA in a file called 'DNASeq.txt'. I need a code to read each line and split each line at random places (inserting spaces) throughout the line. Each line needs to be split at different places.
EX: I have:
AAACCCHTHTHDAFHDSAFJANFAJDSNFADKFAFJ
And I need something like this:
AAA ADSF DFAFDDSAF ADF ADSF AFD AFAD
I have tried (!!!very new to python!!):
import random
for x in range(10):
    print(random.randint(50,250))
but that prints me random numbers. Is there some way to get a random number generated as like a variable?
You can read the file line by line, write each line character by character to a new file, and insert spaces at random:
Create demo file without spaces:
with open("t.txt","w") as f:
    f.write("""ASDFSFDGHJEQWRJIJG
ASDFJSDGFIJ
SADFJSDFJJDSFJIDFJGIJSRGJSDJFIDJFG
SDFJGIKDSFGOROHPTLPASDMKFGDOKRAMGO""")
Read and rewrite demo file:
import random
max_no_space = 9  # maximum run length without a space
no_space = 0
with open("t.txt","r") as f, open("n.txt","w") as w:
    for line in f:
        for c in line:
            w.write(c)
            if random.randint(1,6) == 1 or no_space >= max_no_space:
                w.write(" ")
                no_space = 0
            else:
                no_space += 1
with open("n.txt") as k:
    print(k.read())
Output:
ASDF SFD GHJEQWRJIJG
A SDFJ SDG FIJ
SADFJSD FJ JDSFJIDFJG I JSRGJSDJ FIDJFG
The pattern of spaces is random. You can influence it by changing max_no_space, or remove the randomness and split after max_no_space characters every time.
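The same idea can be sketched as a function that works on one string in memory, which makes it easier to try out without files. This is a Python 3 variation under stated assumptions, not the answer's exact code: the name space_out and the parameters max_run, chance and rng are hypothetical, and the run counter here is reset for every call and counts the character just written.

```python
import random


def space_out(line, max_run=9, chance=6, rng=random):
    """Return `line` with spaces inserted at random, never leaving more than
    `max_run` characters in a row without a space."""
    pieces = []
    run = 0  # characters written since the last inserted space
    for ch in line:
        pieces.append(ch)
        run += 1
        # Roughly one chance in `chance`, or forced once the run is full.
        if rng.randint(1, chance) == 1 or run >= max_run:
            pieces.append(" ")
            run = 0
    return "".join(pieces).rstrip()


rng = random.Random(0)  # seeded only so the demo is repeatable
print(space_out("SADFJSDFJJDSFJIDFJGIJSRGJSDJFIDJFG", rng=rng))
```

Removing the spaces from the result gives back the original line, which is a quick way to check that no characters were lost.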
Edit:
Inserting a space after as little as one character is not very useful if you need to read blocks of 200+ characters. You can enforce a minimum run length with almost the same code, like so:
with open("t.txt","w") as f:
    f.write("""ASDFSFDGHJEQWRJIJSADFJSDFJJDSFJIDFJGIJSRGJSDJFIDJFGG
ASDFJSDGFIJSADFJSDFJJDSFJIDFJGIJSRGJSDJFIDJFGSADFJSDFJJDSFJIDFJGIJK
SADFJSDFJJDSFJIDFJGIJSRGJSDJFIDJFGSADFJSDFJJDSFJIDFJGIJSRGJSDJFIDJF
SDFJGIKDSFGOROHPTLPASDMKFGDOKRAMGSADFJSDFJJDSFJIDFJGIJSRGJSDJFIDJFG""")
import random
min_no_space = 10
max_no_space = 20  # maximum run length without a space
no_space = 0
with open("t.txt","r") as f, open("n.txt","w") as w:
    for line in f:
        for c in line:
            w.write(c)
            if no_space > min_no_space:
                if random.randint(1,6) == 1 or no_space >= max_no_space:
                    w.write(" ")
                    no_space = 0
            else:
                no_space += 1
with open("n.txt") as k:
    print(k.read())
Output:
ASDFSFDGHJEQ WRJIJSADFJSDF JJDSFJIDFJGIJ SRGJSDJFIDJFGG
ASDFJSDGFIJSA DFJSDFJJDSFJIDF JGIJSRGJSDJFIDJ FGSADFJSDFJJ DSFJIDFJGIJK
SADFJ SDFJJDSFJIDFJG IJSRGJSDJFIDJ FGSADFJSDFJJDS FJIDFJGIJSRG JSDJFIDJF
SDFJG IKDSFGOROHPTLPASDMKFGD OKRAMGSADFJSDF JJDSFJIDFJGI JSRGJSDJFIDJFG
If you want to split your DNA a fixed number of times (10 in my example), here's what you could try:
import random
DNA = 'AAACCCHTHTHDAFHDSAFJANFAJDSNFADKFAFJ'
splitted_DNA = ''
for split_idx in sorted(random.sample(range(len(DNA)), 10)):
    splitted_DNA += DNA[len(splitted_DNA)-splitted_DNA.count(' ') :split_idx] + ' '
splitted_DNA += DNA[split_idx:]
print(splitted_DNA) # -> AAACCCHT HTH D AF HD SA F JANFAJDSNFA DK FAFJ
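The fixed-number-of-splits idea can also be sketched with explicit slice boundaries, which avoids the bookkeeping with len() and count(). This is an assumption-laden Python 3 sketch: split_at_random, n_cuts and rng are hypothetical names, and cut positions are drawn from 1 to len(seq)-1 so that no piece is empty.

```python
import random


def split_at_random(seq, n_cuts, rng=random):
    """Insert `n_cuts` spaces at distinct random positions inside `seq`."""
    # Cut positions 1..len(seq)-1 keep every piece non-empty.
    cuts = sorted(rng.sample(range(1, len(seq)), n_cuts))
    bounds = [0] + cuts + [len(seq)]
    # Pair each boundary with the next one to get the slices.
    return " ".join(seq[a:b] for a, b in zip(bounds, bounds[1:]))


dna = "AAACCCHTHTHDAFHDSAFJANFAJDSNFADKFAFJ"
print(split_at_random(dna, 10))
```

random.sample raises ValueError if n_cuts is larger than the number of available positions, so a very short line needs a smaller n_cuts.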
import random
with open('source', 'r') as in_file:
    with open('dest', 'w') as out_file:
        for line in in_file:
            newLine = ''.join(map(lambda x:x+' '*random.randint(0,1), line)).strip() + '\n'
            out_file.write(newLine)
Since you mentioned being new, I'll try to explain:
I'm writing the new sequences to another file as a precaution. It's not safe to write to the file you are reading from.
The with statement is there so that you don't need to explicitly close the file you opened.
Files can be read line by line using a for loop.
''.join() converts a list of strings to a single string.
map() applies a function to every element of a list and returns the results (a new list in Python 2, an iterator in Python 3).
lambda is how you define a function without naming it. lambda x: 2*x doubles the number you feed it.
x + ' ' * 3 adds 3 spaces after x. random.randint(0, 1) returns either 1 or 0, so I'm randomly selecting whether to add a space after each character. If random.randint() returns 0, 0 spaces are added.
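The building blocks listed above can be tried one at a time. This is a small Python 3 sketch with the randomness left out so the results are predictable; the sample strings and the variable name doubled are made up for illustration.

```python
# ''.join() glues an iterable of strings together.
print("".join(["A", "C", "G"]))

# map() applies a function to each character. In Python 3 it returns a lazy
# iterator rather than a list, which join() accepts just as well.
doubled = "".join(map(lambda x: x * 2, "ACG"))
print(doubled)

# Multiplying a string by 0 gives '', multiplying by 1 gives the string itself.
print(repr("A" + " " * 0), repr("A" + " " * 1))
```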
You can toss a coin after each character to decide whether to add a space there or not.
This function takes a string as input and returns it with spaces inserted at random places.
def insert_random_spaces(str):
    from random import randint
    output_string = "".join([x+randint(0,1)*" " for x in str])
    return output_string

Looking at a list of numbers and getting that number from another file

I don't really know how to word the question, but I have a file in which each line holds a number with a decimal next to it, like so (the file name is num.txt):
33 0.239
78 0.298
85 1.993
96 0.985
107 1.323
108 1.000
I have a list of numbers that I want to look up in the file. For each one found, I want to take the decimal next to it and append it to a list:
['78','85','108']
Here is my code so far:
chosen_number = ['78','85','108']
memory_list = []
for line in open('path/to/num.txt'):
    checker = line[0:2]
    if not checker in chosen_number: continue
    dec = line.split()[-1]
    memory_list.append(float(dec))
The error they give to me is that it is not in a list and they only account for the 3 digit numbers. I don't really understand why this is happening and would like some tips to know how to fix it. Thanks.
As for the error, there is no actual error. The only problem is that they ignore the two digit numbers and only get the three digit numbers. I want them to get both the 2 and 3 digit numbers. For example, the script would pass 78 and 85, going to the line with '108'.
Your checker is undefined. The below code works.
N.B. I have used startswith because the number might also appear elsewhere in the line.
chosen_number = ['78','85','108']
memory_list = []
with open('path/to/num.txt') as f:
    for line in f:
        if any(line.startswith(i) for i in chosen_number):
            memory_list.append(float(line.split()[1]))
print(memory_list)
Output:
[0.298, 1.993, 1.0]
The following should work:
chosen_number = ['78','85','108']
memory_list = []
with open('num.txt') as f_input:
    for line in f_input:
        v1, v2 = line.split()
        if v1 in chosen_number:
            memory_list.append(float(v2))
print memory_list
Giving you:
[0.298, 1.993, 1.0]
Also, it is better to use a with statement when dealing with files so that the file is automatically closed afterwards.
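Another way to sketch the exact-match lookup is to load the file into a dictionary first, so each chosen number is found by its whole first field. This is a Python 3 sketch under assumptions: lookup_values is a hypothetical name, and the file contents are given as a list of lines (copied from num.txt above) so the example runs without the file.

```python
def lookup_values(lines, wanted):
    """Return the decimal for each wanted key, in the order of `wanted`."""
    table = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            table[parts[0]] = float(parts[1])
    return [table[key] for key in wanted if key in table]


sample = ["33 0.239\n", "78 0.298\n", "85 1.993\n",
          "96 0.985\n", "107 1.323\n", "108 1.000\n"]
print(lookup_values(sample, ["78", "85", "108"]))
```

With a real file, passing the open file object as lines works the same way, since iterating over a file yields its lines.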
Try this code:
chosen_number = ['78 ', '85 ', '108 ']
memory_list = []
for line in open("num.txt"):
    for num in chosen_number:
        if num in line:
            dec = line.split()[-1]
            memory_list.append(float(dec))
In chosen_number, I declared each number with a space after it, e.g. '85 '. Otherwise the condition would also be true for the line containing 0.985, because the values are compared as strings and '85' is a substring of '0.985'.
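The substring pitfall described here is easy to reproduce on a single line from num.txt. A short Python 3 sketch; the variable name line is only for illustration.

```python
line = "96 0.985"

print("85" in line)             # substring test: matches inside 0.985
print("85 " in line)            # trailing space: no match on this line
print(line.split()[0] == "85")  # comparing the whole first field
```

The trailing-space trick would still match a line whose first field merely ends in 85 (for example one starting with 185), so comparing the first field exactly is the stricter check.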

Iterating through characters in a python string and summing their values

So I have a very simple task. The Project Euler problem Names Scores gives us a file with a set of strings (which are names). You have to sort these names into alphabetical order, compute what is known as a name score for each of them, and sum the scores up. The name score calculation is pretty simple: take a name, sum the alphabetical values of its letters, and multiply this sum by the position the name has in the list.
Being a Python beginner, I wanted to try this out in Python, and this is the code I wrote. I also tried a list comprehension along with sum, but that gives me the same answer. Here is my code:
def name_score(s):
    # print sum((ord(c)-96) for c in s)
    s1 = 0;
    for c in s:
        s1 = s1 + (ord(c) - 96)
        print s1
    return s1
    # print ord(c) - 96
myList = []
f = open('p022_names.txt')
for line in f:
    myList.append(line.lower())
count = 0;
totalSum = 0;
for line in sorted(myList):
    count = count + 1;
    totalSum += (name_score(line) * count)
print totalSum
For now the file p022_names.txt contains only one line, "colin", so the function name_score("colin") should return 53. Whatever I try, I always end up getting the value -33. I am using PyDev on Eclipse. Here is a curious anomaly: if I populate the list directly in the code with myList = ["colin"], I get the correct answer. Honestly, I don't know what is happening. Can anybody shed some light on this? There is a similar loop in the program to calculate totalSum, but that doesn't seem to have an issue.
[EDIT] After the issue was pointed out, I am posting an updated revision of the code which works.
def name_score(s):
    return sum((ord(c)-96) for c in s)
with open('p022_names.txt') as f:
    myList = f.read().splitlines()
print sum((name_score(line.lower()) * (ind+1)) for ind,line in enumerate(sorted(myList)))
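96 - 53 - 33 = 10, and 10 is the character code of the newline.
That happens because you have a newline character ("\n") in your file, so your line is not "colin" but "colin\n". The newline contributes 10 - 96 = -86 to the score, and 53 - 86 = -33.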
To get rid of the newline character, multiple approaches could work. Here is an example:
Replace your line:
for line in f:
with:
for line in f.read().splitlines():
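The arithmetic can be checked directly. A minimal Python 3 sketch that reuses name_score from the question's updated code and feeds it the name with and without the trailing newline:

```python
def name_score(s):
    return sum(ord(c) - 96 for c in s)


print(ord("\n"))              # code of the newline character
print(name_score("colin"))    # the name alone
print(name_score("colin\n"))  # the name as read from the file
```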
Could it be because you didn't close the file? As in f.close()?

Python: Reading a file and calculating sum and average

The question is to read the file line by line and calculate and display the sum and average of all of the valid numbers in the file.
The text file is
contains text
79.3
56.15
67
6
text again
57.86
6
37.863
text again
456.675
That's all I have so far.
numbers = open('fileofnumbers.txt', 'r')
line = file_contents.readline()
numbers.close()
try:
    sum = line + line
    line = file_contents.readline()
    print "The sum of the numbers is", sum
except ValueError:
    print line
Using the with statement can make dealing with files a lot more intuitive.
For instance, change the opening and closing to this:
summation = 0
# Within the with block you now have access to the source variable
with open('fileofnumbers.txt', 'r') as source:
    for line in source: # iterate through all the lines of the file
        try:
            # Since files are read in as strings, you have to convert each line to a float
            summation += float(line)
        except ValueError:
            pass
This might get you started.
If you want to be a little more clever, there's a convenient string method called isdigit, which checks whether a string consists only of digits. It lets you write something like this (note that this simple check rejects signs and exponents):
is_number = lambda number: all(part.isdigit() for part in number.strip().split('.'))
answer = [float(line) for line in open('fileofnumbers.txt') if is_number(line)]
Which then makes sum and average trivial:
print sum(answer) # Sum
print sum(answer)/len(answer) #Average
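A minimal Python 3 sketch of the try/except variant, collecting only the lines that parse as numbers before computing the sum and average. The helper name valid_numbers is hypothetical, and the sample list copies the lines of the question's text file so the example runs without it.

```python
def valid_numbers(lines):
    """Yield each line that parses as a float, skipping everything else."""
    for line in lines:
        try:
            yield float(line)
        except ValueError:
            continue


sample = ["contains text", "79.3", "56.15", "67", "6", "text again",
          "57.86", "6", "37.863", "text again", "456.675"]
numbers = list(valid_numbers(sample))
total = sum(numbers)
print("sum:", total)
# Guard against a file with no valid numbers before dividing.
print("average:", total / len(numbers) if numbers else "no valid numbers")
```

Note that float() also accepts strings such as "nan" and "inf", so a stricter check would be needed if those could appear in the file.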
Let's try a list comprehension with try-except. This might be overkill, but it is a good tool to keep in your pocket. First you write a function that silences the errors, as in http://code.activestate.com/recipes/576872-exception-handling-in-a-single-line/.
Then you can use it inside a list comprehension, with the wrapper forwarding whatever arguments it receives:
intxt = """contains text
29.3423
23.1544913425
4
36.5
text again
79.5074638
3
76.451
text again
84.52"""
with open('in.txt','w') as fout:
    fout.write(intxt)
def safecall(f, default=None, exception=Exception):
    '''Returns modified f. When the modified f is called and throws an
    exception, the default value is returned'''
    def _safecall(*args,**argv):
        try:
            return f(*args,**argv)
        except exception:
            return default
    return _safecall
with open('in.txt','r') as fin:
    numbers = [safecall(float, 0, exception=ValueError)(i) for i in fin]
print "sum:", sum(numbers)
print "avg:", sum(numbers)/float(len(numbers))
[out]:
sum: 336.475255142
avg: 30.5886595584
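One thing to be aware of: with a default of 0, the three text lines are counted as zeros, so the average above is the sum divided by all 11 lines. If the average should cover only the valid numbers, one possible variation is to return None for bad lines and filter them out. A Python 3 sketch; to_float_or_none, parsed and valid are hypothetical names, and the sample list copies the lines of intxt.

```python
def to_float_or_none(text):
    """Return float(text), or None when the text is not a number."""
    try:
        return float(text)
    except ValueError:
        return None


sample = ["contains text", "29.3423", "23.1544913425", "4", "36.5",
          "text again", "79.5074638", "3", "76.451", "text again", "84.52"]
parsed = [to_float_or_none(s) for s in sample]
valid = [x for x in parsed if x is not None]
print("avg over all lines:", sum(valid) / len(parsed))
print("avg over numeric lines:", sum(valid) / len(valid))
```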

Python: Calculating the averages of values in a text file

When I run my code below I get a ValueError: invalid literal for int() with base 10: '0.977759164126', but I don't know why.
file_open = open("A1_B1_1000.txt", "r")
file_write = open ("average.txt", "w")
line = file_open.readlines()
list_of_lines = []
length = len(list_of_lines[0])
total = 0
for i in line:
    values = i.split('\t')
    list_of_lines.append(values)
count = 0
for j in list_of_lines:
    count +=1
for k in range(0,count):
    print k
    list_of_lines[k].remove('\n')
for o in range(0,count):
    for p in range(0,length):
        print list_of_lines[p][o]
        number = int(list_of_lines[p][o])
        total + number
average = total/count
print average
My text file looks like:
0.977759164126 0.977759164126 0.977759164126 0.977759164126 0.977759164126
0.981717034466 0.981717034466 0.981717034466 0.981717034466 0.98171703446
The data series is in rows and the values are tab delimited in the text file. All the rows in the file are the same length.
The aim of the script is to calculate the average of each column and write the output to a text file.
int() is for integers (numbers like 7, 12, 7965, 0, -21233). You probably need float().
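With float() in place, the per-column average described in the question can be sketched as follows. This is a Python 3 sketch under assumptions, not a fix of the original script line by line: column_averages is a hypothetical name, and the two sample rows are rebuilt from the values shown in the question, joined with tabs.

```python
def column_averages(lines):
    """Average each column of tab-delimited rows of numbers."""
    # float() tolerates the trailing newline on the last value of a row.
    rows = [[float(v) for v in line.split("\t")] for line in lines if line.strip()]
    # zip(*rows) regroups the values column by column.
    return [sum(col) / len(col) for col in zip(*rows)]


sample = [
    "\t".join(["0.977759164126"] * 5) + "\n",
    "\t".join(["0.981717034466"] * 4 + ["0.98171703446"]) + "\n",
]
print(column_averages(sample))
```

Writing the result to average.txt would then be a matter of joining the averages with tabs and writing one line.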
Python's float is binary floating point, so most decimal fractions are stored only approximately. These values all work fine here, but for more digits, or for arithmetic that must be exact in decimal, you will want the decimal module.
import decimal
result = decimal.Decimal(1)/decimal.Decimal(5)
print result
Link to the documentation
http://docs.python.org/2/library/decimal.html
Try typing in 1.1 into IDLE and see what your result is.
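The difference between float and Decimal shows up in a simple comparison. A short Python 3 sketch; note that the Decimal values are built from strings, since building them from floats would carry the float's rounding error along.

```python
from decimal import Decimal

print(0.1 + 0.2 == 0.3)                                   # False
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True
print(Decimal(1) / Decimal(5))                            # 0.2
```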
