The problem is called Calculator Language, and I have to code it in Python. The coding part is done, but I am having trouble reading the input file.
The input file looks like this:
A = B = 4
C = (D = 2)*_2
#
What I would like to do is read the file line by line (each line is an expression that has to be calculated) and then character by character, keeping letters as characters and numbers as integers, since I push them onto stacks. There are two stacks: one for the characters and numbers, and the other for the operators.
Anyway, this is what I have done with the input so far:
#!/usr/bin/python
with open("testinput1.txt", "r") as a:
    wordList = [line.strip() for line in a]
print(wordList[1])
And what I get is:
C = (D = 2)*_2
Also, the end of the input is reached when the file reader hits #.
Any sort of help or suggestion is welcome.
wordList is a list of lines; each element is one line, stripped of its trailing '\n'.
You should split each line to get its tokens.
Then, for each token, check whether it is a string or an integer (using isdigit, for example).
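As a rough sketch of that idea: a plain split() would leave pieces such as "(D" glued together, so this example swaps in re.findall to cut each line into tokens. The names tokenize and read_expressions are made up for illustration, and the pattern assumes single-letter variable names and single-character operators, as in the sample input. Digit runs become int, and reading stops at the # line.

```python
import re

# Assumption: a token is a run of digits, a single letter, or any other
# single non-space character (an operator or a parenthesis).
TOKEN = re.compile(r"\d+|[A-Za-z]|\S")

def tokenize(line):
    """Split one expression into tokens, converting digit runs to int."""
    return [int(tok) if tok.isdigit() else tok for tok in TOKEN.findall(line)]

def read_expressions(lines):
    """Tokenize every line up to (not including) the terminating '#' line."""
    expressions = []
    for line in lines:
        line = line.strip()
        if line == "#":
            break
        if line:
            expressions.append(tokenize(line))
    return expressions

# The sample input from the question, as a list of lines.
sample = ["A = B = 4\n", "C = (D = 2)*_2\n", "#\n"]
for tokens in read_expressions(sample):
    print(tokens)
```

Passing an open file object instead of the sample list should work the same way, since iterating over a file yields its lines.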
Now that wordList[0] contains your first statement, note that in Python any string can be indexed directly, without creating a separate list for it.
For example, if wordList[0] contains "c=a+b", wordList[0][0] will directly give you 'c'.
I was hoping to get some guidance through my assignment, I'm completely new to programming and have no experience whatsoever.
Assignment: We are given a .txt file containing a number of actual words, and my assignment is to write a program that asks the user to input a set of 4 to 9 random letters. The program should run these letters through the .txt file and find all the words that can be created from the letters the user put in. It then has to write out all the words it can find containing all the letters the user typed in, printed on one line in alphabetical order. Afterwards it has to print, as a list, how many words contain all 9 letters the user has put in.
If my explanation is messy, please tell me and I will try to explain it better. I'm not looking for anyone to solve the assignment for me; I just want some proper guidance on what I can do to proceed to the next step, because right now it feels like a dead end.
This is my code so far:
import os
chars = []
nine = input("Nians bokstäver: ")  # prompt: "The nine's letters: "
lenght = len(nine)
#print(lenght) # This should be removed later
while (lenght < 4) or (lenght > 9):
    lenght = len(nine)
    print("Fel antal, försök igen!")  # "Wrong number of letters, try again!"
    nine = input("Nians bokstäver: ")
if (lenght >= 4) and (lenght <= 9):
    lenght = len(nine)
    chars.append(nine)
file = open("/Users/************/Desktop/Programmering 1/Inlämningsuppgift/ny_svenskaord.txt", "r")
while file:
    line = file.readline()
    print(line)
    if line == "":
        break
Thank you,
Artin
It sounds like you are building an Anagram solver. There are a few ways you can go about this. Due to it being a homework assignment I won't give any actual answers but some routes to look into.
A Pythonic way to do it is to use the built-in method string.rfind(), where string is one word from your target file (see the Python docs).
The most robust way would be to use regex to match the characters and their sequence, but that involves learning an entirely new language (see the Python docs).
The easiest way for a programming beginner is also the ugliest, but it works: use string slices and loops to iterate over each letter in the input, grab a single line from the file using file.readline(), then split it into its characters and compare the two. It might look something like this:
...
def Anagram(input, filename):
    hit_list = [0]  # unfortunate naming: a counter at index 0, followed by the words that matched
    with open(filename, 'r') as target:  # opens the file at the path given in filename
        word = target.readline()  # reads one line of the file
        while word:  # loop until readline() returns an empty string (end of file)
            word = word.strip()  # drop the trailing newline
            hits = 0  # if a letter is the same as one in the word, add one
            for i in range(len(input)):  # for each letter position in input
                '''
                Comparison code goes here
                '''
            if hits == len(word):  # same number of hits as letters in the word: it's an anagram
                hit_list[0] = hit_list[0] + 1  # adds one to the counter
                hit_list.append(word)  # adds the word at the end of the list
            word = target.readline()  # reads a new line from the file
    return hit_list  # returns the list you made
The above example has huge holes and problems but would work in a very narrow set of circumstances. Some of the issues: it never checks that the input actually consists of letters, that the file contains letters, that the file has one word per line, that there are no stray line-end characters, etc. These are things to think about when writing or reading other people's code.
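For the comparison step itself, one possible approach, which is different from the letter-by-letter loop sketched above, is to count letters with collections.Counter from the standard library. The helper names can_build and find_words are hypothetical, and the sketch assumes each letter may be used at most once per word:

```python
from collections import Counter

def can_build(word, letters):
    """Return True if word can be spelled using each given letter at most once."""
    # Counter subtraction keeps only positive counts, so an empty result
    # means no letter of word is missing from letters.
    return not (Counter(word) - Counter(letters))

def find_words(letters, words):
    """Return the buildable words in alphabetical order."""
    cleaned = (w.strip() for w in words)
    return sorted(w for w in cleaned if w and can_build(w, letters))

# Hypothetical word list standing in for the lines of the word file.
print(find_words("rats", ["star\n", "tart\n", "art\n", "zoo\n"]))
```

Here "tart" is rejected because it needs two t's while only one was supplied.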
Have fun with the project Artin!
Hello, I am quite new to Python and have started taking classes for biologists, but I have a problem with an assignment and just can't figure it out. In a .txt file I should find two restriction enzyme sites (basically just letters): "gatc" with a g or an a in front and a c or a t at the back, so "[ga]gatc[ct]". This occurs twice in the text file, and I should find the length between the two occurrences (xxxx[ga]gatc[ct] xxxxxxx [ga]gatc[ct]xxxx), i.e. how many x are between them. I tried to put it in groups, but I am doing something wrong.
xxxx is an unknown number of letters made up of "g", "a", "t" and "c", for example ctactatctcatcttaaccttaa.
My current code is:
import regex
file = "enzyme.txt"
f=open(file, "r")
content = f.read()
print(content)
pattern = regex.compile("[ga]gatc[ct]")
for line in open("enzyme.txt"):
    for match in regex.finditer (pattern, line):
        print(match.group())
        print(line)
for lines in f:
    m=regex.search("[ga]gatc[ct] {*} [ga]gatc[ct]", lines)
    if m:
        print(len(str(m.start(1)) + str(m.end(2))))
It shows me the correct sequence and prints the line it is in, but I don't know how to find the length between the two matches. The second part of the code doesn't do anything, but it also shows no error message.
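In my view, this is a naive solution.
import re
pattern = re.compile("[ga]gatc[ct]")
with open("enzyme.txt") as file:
    for line in file:
        pieces = pattern.split(line)
        if len(pieces) > 2:  # the line contains at least two sites
            print(len(pieces[1]))
re.split divides the line into pieces wherever the given pattern matches. (str.split would not work here, because it treats its argument as literal text rather than as a regular expression.)
In your case the pattern is [ga]gatc[ct].
You need index 1 for the xxxxxxx between the two sites, because index 0 holds whatever comes before the first site (an empty string if the line starts with one).
You wanted the length of the text between the two matches, hence print(len(pieces[1])).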
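An alternative sketch uses the match positions directly: the gap is the start of the second match minus the end of the first. The function name gap_lengths and the example sequence are made up for illustration, and the sketch assumes the whole sequence is available as one string of lowercase letters:

```python
import re

SITE = re.compile(r"[ga]gatc[ct]")

def gap_lengths(sequence):
    """Return the number of letters between each pair of consecutive sites."""
    matches = list(SITE.finditer(sequence))
    return [nxt.start() - cur.end() for cur, nxt in zip(matches, matches[1:])]

# Hypothetical sequence: 4 letters, a site, 5 letters, a site, 2 letters.
example = "tttt" + "agatcc" + "ttttt" + "ggatct" + "tt"
print(gap_lengths(example))
```

For a real file, the content could first be read with f.read() and have its whitespace removed, for example with "".join(content.split()), so that line breaks or spaces inside the sequence are not counted.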
I want to read a text file and copy text that is in between '~~~~~~~~~~~~~' into an array. However, I'm new in Python and this is as far as I got:
with open("textfile.txt", "r",encoding='utf8') as f:
    searchlines = f.readlines()
a=[0]
b=0
for i,line in enumerate(searchlines):
    if '~~~~~~~~~~~~~' in line:
        b=b+1
    if '~~~~~~~~~~~~~' not in line:
        if 's1mb4d' in line:
            break
        a.insert(b,line)
This is what I envisioned:
First I read all the lines of the text file,
then I declare 'a' as an array in which text should be added,
then I declare 'b' because I need it as an index. The number of lines in between the '~~~~~~~~~~~~~' is not even, that's why I use 'b' so I can put lines of text into one array index until a new '~~~~~~~~~~~~~' was found.
I check for '~~~~~~~~~~~~~', if found I increase 'b' so I can start adding lines of text into a new array index.
The text file ends with 's1mb4d', so once it's found, the program ends.
And if '~~~~~~~~~~~~~' is not found in the line, I add text to the array.
But things didn't go well. Only one line of the text between those '~~~~~~~~~~~~~' is copied to each array index.
Here is an example of the text file:
~~~~~~~~~~~~~
Text123asdasd
asdasdjfjfjf
~~~~~~~~~~~~~
123abc
321bca
gjjgfkk
~~~~~~~~~~~~~
You could use a regular expression; give this a try:
import re
input_text = ['Text123asdasd asdasdjfjfjf','~~~~~~~~~~~~~','123abc 321bca gjjgfkk','~~~~~~~~~~~~~']
a = []
for line in input_text:
    my_text = re.findall(r'[^\~]+', line)
    if len(my_text) != 0:
        a.append(my_text)
It reads line by line and looks for all characters except '~'. If a line consists only of '~', it is ignored; every line with text is appended to your list a.
And just because we can, here is a one-liner (excluding the import and the source, of course):
import re
lines = ['Text123asdasd asdasdjfjfjf','~~~~~~~~~~~~~','123abc 321bca gjjgfkk','~~~~~~~~~~~~~']
a = [re.findall(r'[^\~]+', line) for line in lines if len(re.findall(r'[^\~]+', line)) != 0]
In Python, the solution to a large share of problems is to find the right function in the standard library. Here you should try using split instead; it should be much easier.
If I understand your goal correctly, you can do it like this:
joined_lines = ''.join(searchlines)
result = joined_lines.split('~~~~~~~~~~~~~')
The first line joins your list of lines into a single string, and the second one cuts that big string every time it encounters the '~~~~~~~~~~~~~' sequence (13 characters, matching the separator in your file).
I tried to clean it up to the best of my knowledge, try this and let me know if it works. We can work together on this!:)
with open("textfile.txt", "r",encoding='utf8') as f:
    searchlines = f.readlines()
a = []
currentline = ''
for i,line in enumerate(searchlines):
    if '~~~~~~~~~~~~~' in line:
        if currentline:  # store the finished block, if any
            a.append(currentline)
        currentline = ''  # start collecting the next block
    elif 's1mb4d' in line:
        break
    else:
        currentline += line
Some notes:
You can use elif for your break statement
append automatically adds the new item to the end of the array
currentline keeps accumulating text from each line as long as the line contains neither 's1mb4d' nor the ~~~ separator, which I think is what you want
import re
s = ['']
with open('path\\to\\sample.txt') as f:
    for l in f:
        a = l.strip().split("\n")
        s += a
a = []
for line in s:
    my_text = re.findall(r'[^\~]+', line)
    if len(my_text) != 0:
        a.append(my_text)
print(a)
>>> [['Text123asdasd asdasdjfjfjf'], ['123abc 321bca gjjgfkk']]
If you're willing to impose/accept the constraint that the separator should be exactly 13 ~ characters (actually '\n%s\n' % ( '~' * 13) to be specific) ...
then you could accomplish this for relatively normal sized files using just
#!/usr/bin/python
## (Should be #!/usr/bin/env python; but StackOverflow's syntax highlighter?)
separator = '\n%s\n' % ('~' * 13)
with open('somefile.txt') as f:
    results = f.read().split(separator)
# Use your results, a list of the strings separated by these separators.
Note that '~' * 13 is a way, in Python, of constructing a string by repeating some smaller string thirteen times. 'xx%sxx' % 'YY' is a way to "interpolate" one string into another. Of course you could just paste the thirteen ~ characters into your source code ... but I would consider constructing the string as shown to make it clear that the length is part of the string's specification --- that this is part of your file format requirements ... and that any other number of ~ characters won't be sufficient.
If you really want any line of any number of ~ characters to serve as a separator, then you'll want to use the split() function from the regular expressions module (re.split) rather than the .split() method provided by the built-in string objects.
Note that this snippet of code will return all of the text between your separator lines, including any newlines they include. There are other snippets of code which can filter those out. For example given our previous results:
# ... refine results by filtering out newlines (replacing them with spaces)
results = [' '.join(each.split('\n')) for each in results]
(You could also use the .replace() string method, but I prefer the join/split combination.) In this case we're using a list comprehension (a feature of Python) to iterate over each item in our results (which we're arbitrarily naming each), perform our transformation on it, and bind the resulting list back to the name results. I highly recommend learning and getting comfortable with list comprehensions if you're going to learn Python. They're commonly used and can be a bit exotic compared to the syntax of many other programming and scripting languages.
This should work on MS Windows as well as Unix (and Unix-like) systems because of how Python handles "universal newlines." To use these examples under Python 3 you might have to work a little on the encodings and string types. (I didn't need to for my Python3.6 installed under MacOS X using Homebrew ... but just be forewarned).
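For the looser case mentioned above, where any line made up of ~ characters counts as a separator, a minimal sketch with re.split might look like this. The name split_blocks is hypothetical, and the pattern assumes a separator line holds nothing but ~ characters and optional trailing spaces or tabs:

```python
import re

# A separator is a whole line of one or more '~', optionally followed by
# spaces or tabs, together with its line ending.
SEPARATOR = re.compile(r"^~+[ \t]*\n?", re.MULTILINE)

def split_blocks(text):
    """Return the non-empty chunks of text found between separator lines."""
    return [part for part in SEPARATOR.split(text) if part.strip()]

sample = (
    "~~~~~~~~~~~~~\n"
    "Text123asdasd\n"
    "asdasdjfjfjf\n"
    "~~~~~\n"
    "123abc\n"
    "321bca\n"
    "gjjgfkk\n"
    "~~~~~~~~~~~~~\n"
)
for block in split_blocks(sample):
    print(repr(block))
```

Each block keeps its internal newlines, so the same join/split refinement shown earlier can be applied to the result.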
I've been stuck on this Python homework problem for awhile now: "Write a complete python program that reads 20 real numbers from a file inner.txt and outputs them in sorted order to a file outter.txt."
Alright, so what I do is:
f=open('inner.txt','r')
n=f.readlines()
n.replace('\n',' ')
n.sort()
x=open('outter.txt','w')
x.write(print(n))
So my thought process is: Open the text file, n is the list of read lines in it, I replace all the newline prompts in it so it can be properly sorted, then I open the text file I want to write to and print the list to it. First problem is it won't let me replace the new line functions, and the second problem is I can't write a list to a file.
I just tried this:
>>> x= "34\n"
>>> print(int(x))
34
So, you shouldn't have to filter out the "\n" like that, but can just put it into int() to convert it into an integer. This is assuming you have one number per line and they're all integers.
You then need to store each value into a list. A list has a .sort() method you can use to then sort the list.
EDIT:
I forgot to mention that, as others have already said, you need to iterate over the values in n, as it's a list, not a single item.
Here's a step by step solution that fixes the issues you have :)
Opening the file, nothing wrong here.
f=open('inner.txt','r')
Don't forget to close the file once you have finished reading from it:
f.close()
n is now a list of each line:
n=f.readlines()
Lists have no replace method, so I suggest changing the above line to n = f.read(). Then this will work (don't forget to reassign n, as strings are immutable):
n = n.replace('\n','')
You still only have a string full of numbers. However, instead of replacing the newline character, I suggest splitting the string using the newline as a delimiter:
n = n.split('\n')
Then, convert these strings to integers (the if x skips the empty string that a trailing newline leaves behind):
n = [int(x) for x in n if x]
Now, these two will work:
n.sort()
x=open('outter.txt','w')
You want to write the numbers themselves, so use this:
x.write('\n'.join(str(i) for i in n))
Finally, close the file:
x.close()
Using a context manager (the with statement) is good practice as well, when handling files:
with open('inner.txt', 'r') as f:
    n = f.read()  # do stuff with f
# f is automatically closed at the end of the block
I guess real means float. So you have to convert your results to float to sort properly.
raw_lines = f.readlines()
floats = map(float, raw_lines)
Then you have to sort it:
sorted_floats = sorted(floats)
To write the result back, you have to convert to strings and join with line endings:
sorted_as_string = map(str, sorted_floats)
result = '\n'.join(sorted_as_string)
Finally you have to write result to the destination.
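Put together, the steps above might be sketched as follows. The function names sorted_numbers and sort_file are made up for illustration, and the sketch assumes one number per line, with blank lines ignored:

```python
def sorted_numbers(lines):
    """Parse one real number per line (blank lines skipped) and sort them."""
    return sorted(float(line) for line in lines if line.strip())

def sort_file(src="inner.txt", dst="outter.txt"):
    """Read numbers from src and write them to dst in ascending order."""
    with open(src) as f:
        numbers = sorted_numbers(f)
    with open(dst, "w") as out:
        out.write("\n".join(str(x) for x in numbers) + "\n")

# Hypothetical lines standing in for the contents of inner.txt.
print(sorted_numbers(["3.5\n", "-1\n", "\n", "2e1\n"]))
```

float() tolerates the trailing newline on each line, so no explicit replace or strip is needed before converting.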
OK, let's go through what you want to do step by step.
First: Read some integers out of a textfile.
Pythonic Version:
fileNumbers = [int(line) for line in open(r'inner.txt', 'r').readlines()]
Easy to get version:
fileNumbers = list()
with open(r'inner.txt', 'r') as fh:
    for singleLine in fh.readlines():
        fileNumbers.append(int(singleLine))
What it does:
Open the file
Read each line, convert it to int (because readlines returns string values), and append it to the list fileNumbers
Second: Sort the list
fileNumbers.sort()
What it does:
The sort method sorts the list by value, e.g. [5,3,2,4,1] -> [1,2,3,4,5]
Third: Write it to a new textfile
with open(r'outter.txt', 'w') as fh:  # 'w' overwrites; 'a' would append to earlier runs
    for entry in fileNumbers:
        fh.write('{0}\n'.format(entry))
I'm somewhat new to python. I'm trying to sort through a list of strings and integers. The lists contains some symbols that need to be filtered out (i.e. ro!ad should end up road). Also, they are all on one line separated by a space. So I need to use 2 arguments; one for the input file and then the output file. It should be sorted with numbers first and then the words without the special characters each on a different line. I've been looking at loads of list functions but am having some trouble putting this together as I've never had to do anything like this. Any takers?
So far I have the basic stuff
#!/usr/bin/python
import sys
try:
    infilename = sys.argv[1] #outfilename = sys.argv[2]
except:
    print "Usage: ",sys.argv[0], "infile outfile"; sys.exit(1)
ifile = open(infilename, 'r')
#ofile = open(outfilename, 'w')
data = ifile.readlines()
r = sorted(data, key=lambda item: (int(item.partition(' ')[0])
           if item[0].isdigit() else float('inf'), item))
ifile.close()
print '\n'.join(r)
#ofile.writelines(r)
#ofile.close()
The output shows exactly what was in the file but exactly as the file is written and not sorted at all. The goal is to take a file (arg1.txt) and sort it and make a new file (arg2.txt) which will be cmd line variables. I used print in this case to speed up the editing but need to have it write to a file. That's why the output file areas are commented but feel free to tell me I'm stupid if I screwed that up, too! Thanks for any help!
When you have an issue like this, it's usually a good idea to check your data at various points throughout the program to make sure it looks the way you want it to. The issue here seems to be in the way you're reading in the file.
data = ifile.readlines()
is going to read in the entire file as a list of lines. But since all the entries you want to sort are on one line, this list will only have one entry. When you try to sort the list, you're passing a list of length 1, which is going to just return the same list regardless of what your key function is. Try changing the line to
data = ifile.readlines()[0].split()
You may not even need the key function any more since numbers are placed before letters by default. I don't see anything in your code to remove special characters though.
Since they are all on the same line, you don't really need readlines:
with open('some.txt') as f:
    data = f.read() # now data = "item 1 item2 etc..."
You can use re to filter out unwanted characters:
import re
data = "ro!ad"
fixed_data = re.sub("[!?#$]","",data)
partition may be overkill:
data = "hello 23frank sam wilbur"
my_list = data.split() # ["hello","23frank","sam","wilbur"]
print(sorted(my_list))
However, you will need to do more to force the numbers to sort numerically, maybe something like:
numbers = [x for x in my_list if x[0].isdigit()]
strings = [x for x in my_list if not x[0].isdigit()]
sorted_list = sorted(numbers,key=lambda x:int(re.sub("[^0-9]","",x))) + sorted(strings)
Also, they are all on one line separated by a space.
So your file contains a single line?
data = ifile.readlines()
This makes data into a list of the lines in your file. All 1 of them.
r = sorted(...)
This makes r the sorted version of that list.
To get the words from the line, you can .read() the entire file as a single string, and .split() it (by default, it splits on whitespace).
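As a small sketch of the whole task under some assumptions: "special characters" is taken to mean anything that is not an ASCII letter or digit, a "number" is a token made only of digits after cleaning, and the name clean_and_sort is made up for illustration.

```python
import re

def clean_and_sort(text):
    """Strip special characters from each token; numbers first, then words."""
    tokens = [re.sub(r"[^A-Za-z0-9]", "", tok) for tok in text.split()]
    tokens = [tok for tok in tokens if tok]  # drop tokens that became empty
    numbers = sorted((t for t in tokens if t.isdigit()), key=int)
    words = sorted(t for t in tokens if not t.isdigit())
    return numbers + words

# Hypothetical one-line input, as it might be read with ifile.read().
line = "ro!ad 10 app#le 9 ze?bra"
print("\n".join(clean_and_sort(line)))
```

Writing the result to the output file would then be a matter of passing the same joined string to the file's write method instead of print.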