EDIT: I will try to clarify this question. I want to make two csv files: one with the text "Greetings", the other with the text "Greetings earth". The problem is that I can't find a way to ask Python to write to multiple files with one write command. I am trying to find a way to make things more efficient.
This question was identified as a possible duplicate of "write multiple files at a time", but there are a lot more parts to that question that I don't understand. I am trying to isolate this problem in as simple a question as I can.
hello = open("hello.csv","w")
world = open("world.csv","w")
everything = ['hello','world']
half = ['world']
everything.write("Greetings")
half.write("Earth")
hello.close()
world.close()
It is not entirely clear what you are trying to achieve, or why.
If you need a function which manages 'all your file creation needs' you should probably approach this by creating the setup (file names -> contents) of your files then just write them. Alternatively you can label the files and generate their contents based on the preset 'flags'.
Approach 1 is something like this:
file_dict = {'hello.csv': 'Greetings', 'world.csv': 'Greetings earth'}
for f in file_dict:
    # open in write mode; the default mode is read-only
    with open(f, 'w') as working:
        working.write(file_dict[f])
Approach 2 is something like this:
files = {'common': 'hello', 'custom': 'world'}
common_text = 'Greetings'
custom_text = ' earth'
for f in files.keys():
    with open(files[f]+'.csv', 'w') as working_file:
        text = common_text
        if f == 'custom':  # compare strings with ==, not is
            text += custom_text
        working_file.write(text)
If you are happy with your implementation, you can move the 'writing' part into a separate function (something like this):
def write_my_stuffs():
    for f in file_dict:
        # open in write mode; the default mode is read-only
        with open(f, 'w') as working:
            working.write(file_dict[f])
file_dict = {'animal.csv': 'I like dogs',
             'candy.csv': 'I like chocolate cake'}
write_my_stuffs()
hello = open("hello.csv","w")
world = open("world.csv","w")
everything = [hello,world]
half = [world]
for x in everything:
    x.write("Greetings")
for x in half:
    x.write(" Earth")
hello.close()
world.close()
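If what you are after is literally a single write call that lands in several files, one option is a small wrapper object that forwards write() to every file it holds. This is only a sketch: MultiWriter is a made-up name, not something from the standard library, and the files go into a scratch directory created with tempfile so nothing real gets overwritten.

```python
import tempfile
from pathlib import Path


class MultiWriter:
    """Hypothetical helper: forwards one write() call to several open files."""

    def __init__(self, *files):
        self.files = files

    def write(self, text):
        for f in self.files:
            f.write(text)


# Scratch directory (an assumption for the demo) so no real files are clobbered.
out_dir = Path(tempfile.mkdtemp())
hello_path = out_dir / "hello.csv"
world_path = out_dir / "world.csv"

with open(hello_path, "w") as hello, open(world_path, "w") as world:
    everything = MultiWriter(hello, world)
    half = MultiWriter(world)
    everything.write("Greetings")  # goes to both files
    half.write(" Earth")           # goes to world.csv only

print(hello_path.read_text())  # Greetings
print(world_path.read_text())  # Greetings Earth
```

Under the hood this is still a loop over the files; the wrapper just hides it so the calling code reads the way the question wanted.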
I'm trying to loop through some unstructured text data in python. End goal is to structure it in a dataframe. For now I'm just trying to get the relevant data in an array and understand the line, readline() functionality in python.
This is what the text looks like:
Title: title of an article
Full text: unfortunately the full text of each article,
is on numerous lines. Each article has a differing number
of lines. In this example, there are three..
Subject: Python
Title: title of another article
Full text: again unfortunately the full text of each article,
is on numerous lines.
Subject: Python
This same format is repeated for lots of text articles in the same file. So far I've figured out how to pull out lines that include certain text. For example, I can loop through it and put all of the article titles in a list like this:
a = "Title:"
titleList = []
sample = 'sample.txt'
with open(sample,encoding="utf8") as unstr:
    for line in unstr:
        if a in line:
            titleList.append(line)
Now I want to do the below:
a = "Title:"
b = "Full text:"
d = "Subject:"
list = []
sample = 'sample.txt'
with open(sample,encoding="utf8") as unstr:
    for line in unstr:
        if a in line:
            list.append(line)
        if b in line:
            1. Concatenate this line with each line after it, until I reach the line that includes "Subject:". Ignore the "Subject:" line, stop the "Full text:" subloop, and add the concatenated full text to the list array.
            2. Continue the for loop within which all of this sits
As a Python beginner, I'm spinning my wheels searching google on this topic. Any pointers would be much appreciated.
If you want to stick with your for-loop, you're probably going to need something like this:
titles = []
texts = []
subjects = []
with open('sample.txt', encoding="utf8") as f:
    inside_fulltext = False
    for line in f:
        if line.startswith("Title:"):
            inside_fulltext = False
            titles.append(line)
        elif line.startswith("Full text:"):
            inside_fulltext = True
            full_text = line
        elif line.startswith("Subject:"):
            inside_fulltext = False
            texts.append(full_text)
            subjects.append(line)
        elif inside_fulltext:
            full_text += line
        else:
            # Possibly throw a format error here?
            pass
(A couple of things: when you write list = [], you are rebinding the name list, which shadows the built-in list type and can cause you problems later. You should really treat list, set, and so on like keywords - even though Python technically doesn't - just to save yourself the headache. Also, the startswith method is a little more precise here, given your description of the data.)
Alternatively, you could wrap the file object in an iterator (i = iter(f), and then next(i)), but that's going to cause some headaches with catching StopIteration exceptions - but it would let you use a more classic while-loop for the whole thing. For myself, I would stick with the state-machine approach above, and just make it sufficiently robust to deal with all your reasonably expected edge-cases.
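For comparison, here is roughly what the iterator version could look like. It is a sketch, not a drop-in replacement: parse_articles and the dictionary keys are names I made up, the sample text is the one from the question, and it assumes every "Full text:" block is eventually closed by a "Subject:" line. Passing a default to next() (here None) sidesteps the StopIteration handling mentioned above.

```python
import io


def parse_articles(lines):
    """Hypothetical helper: walk the lines with next() and a while-loop."""
    it = iter(lines)
    articles = []
    current = {}
    line = next(it, None)  # None signals the end of the input
    while line is not None:
        if line.startswith("Title:"):
            current = {"title": line[len("Title:"):].strip()}
            line = next(it, None)
        elif line.startswith("Full text:"):
            parts = [line[len("Full text:"):].strip()]
            line = next(it, None)
            # Consume continuation lines until the "Subject:" line shows up.
            while line is not None and not line.startswith("Subject:"):
                parts.append(line.strip())
                line = next(it, None)
            current["full_text"] = " ".join(parts)
        elif line.startswith("Subject:"):
            current["subject"] = line[len("Subject:"):].strip()
            articles.append(current)
            current = {}
            line = next(it, None)
        else:
            line = next(it, None)  # skip blank or unexpected lines


    return articles


sample = """Title: title of an article
Full text: unfortunately the full text of each article,
is on numerous lines. Each article has a differing number
of lines. In this example, there are three..
Subject: Python
Title: title of another article
Full text: again unfortunately the full text of each article,
is on numerous lines.
Subject: Python
"""

articles = parse_articles(io.StringIO(sample))
for article in articles:
    print(article["title"], "|", article["subject"])
```

A list of dictionaries like this can be handed straight to pandas.DataFrame if that is where you are headed.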
As your goal is to construct a DataFrame, here is a re+numpy+pandas solution:
import re
import pandas as pd
import numpy as np
# read the whole file
with open('sample.txt', encoding="utf8") as f:
    text = f.read()
keys = ['Subject', 'Title', 'Full text']
regex = '(?:^|\n)(%s): ' % '|'.join(keys)
# split text on keys
chunks = re.split(regex, text)[1:]
# reshape the flat key/value list to (articles, keys, 2) so each article's pairs are grouped
# (this assumes every article has all three keys)
df = pd.DataFrame([dict(e) for e in np.array(chunks).reshape(-1, len(keys), 2)])
Output:
Title Full text Subject
0 title of an article unfortunately the full text of each article,\nis on numerous lines. Each article has a differing number \nof lines. In this example, there are three.. Python
1 title of another article again unfortunately the full text of each article,\nis on numerous lines. Python
I might have explained it weirdly in the title, but here is the issue I have. I am making a small sentence-generating program and to choose the sentence, it chooses a random sentence structure.
I want to have a file with the different structure codes on separate lines, like this:
random.choice(blankThat)+" "+sentenceSubject+" "+random.choice(description)+"."
random.choice(questionBegin)+" "+sentenceSubject+" "+random.choice(pastDescription)+"?"
random.choice(pastBegin)+" "+sentenceSubject+" "+random.choice(pastDescription)+"."
random.choice(subjectBegin)+" "+sentenceSubject+"."
random.choice(subjectQuestion)+" "+sentenceSubject+"?"
sentenceSubject+" "+random.choice(description)+"."
sentenceSubject+" "+random.choice(pastDescription)+"."
random.choice(subjectBeginExclaim)+" "+sentenceSubject+"!"
random.choice(songList)+" "+random.choice(["is","was"])+" "+random.choice(adverbs)+" "+random.choice(adjectives)+"."
sentenceSubject+" "+"is"+" "+random.choice(describers)+"."
How would I then randomly choose to execute one of the above lines of code? I tried using this simple code to randomly choose one...
templateFile = open("structures.txt","a+")
templates = templateFile.readlines()
templates = [y.strip() for y in templates]
finalSentence = random.choice(templates)
But when I print(finalSentence), it just spits out one of the lines instead of executing it:
random.choice(pastBegin)+" "+sentenceSubject+" "+random.choice(pastDescription)+"."
How can I just randomly choose and execute one of the lines? I'd prefer it if I can read in the structures from a file, as I will regularly be adding new sentence structures.
Here's a sketch of what you can do. It looks like each line of your file uses three types of expressions:
string literals like " is ",
references to constant strings, like sentenceSubject, and
random choices from string collections, like random.choice(blankThat).
Create a mini-language that can recognize these expressions. E.g.:
?blankThat " " !sentenceSubject " " ?description "."
Create a dictionary of all constant strings, e.g.:
strings = {"sentenceSubject" : "Hello, world", ...}
Create a dictionary of all string collections, e.g.:
collections = {"blankThat" : ["foo", "bar", ...],
"description" : ["dog", "cat", ...], ...}
Write a mini-parser that takes a string written in your mini-language, breaks it into expressions, determines the type of each expression by the first character of the token, and converts it to the proper string:
?X -> random lookup, find X in collections, call random.choice(collections[X])
!X -> constant string, find strings[X]
"X" -> string literal, just use X
Finally, combine all translated pieces. Hope it helps.
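The steps above can be sketched in a few lines. Treat this as one possible reading of the idea rather than a finished parser: render and TOKEN are names I invented, the dictionaries hold placeholder values, and the tokenizer assumes string literals never contain a double quote.

```python
import random
import re

# Hypothetical lookup tables; the names and values are placeholders.
strings = {"sentenceSubject": "Hello, world"}
collections = {
    "blankThat": ["foo", "bar"],
    "description": ["dog", "cat"],
}

# A token is either a quoted literal, or ? / ! followed by a name.
TOKEN = re.compile(r'"[^"]*"|[?!]\w+')


def render(template):
    """Translate one line of the mini-language into a finished string."""
    pieces = []
    for token in TOKEN.findall(template):
        kind, name = token[0], token[1:]
        if kind == "?":
            # random lookup in a string collection
            pieces.append(random.choice(collections[name]))
        elif kind == "!":
            # constant string
            pieces.append(strings[name])
        else:
            # string literal: drop the surrounding quotes
            pieces.append(token[1:-1])
    return "".join(pieces)


print(render('?blankThat " " !sentenceSubject " " ?description "."'))
```

With the templates stored one per line in a file, picking a sentence becomes render(random.choice(templates)), and nothing from the file is ever executed as Python code.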
This is not a direct answer to your question, but I think it could prove helpful. Consider different ways of constructing and generating this information. Here is a simple and imperfect example, but I think it could be a good place to start:
import random
subject = "Jeremy"
descriptions = ["cool", "tall", "strong"]
hobbies = ["running", "coding"]
def sentence_maker3000():
    sentence_vals = {"subject": subject, "descriptions": random.choice(descriptions), "hobbies": random.choice(hobbies)}
    valid_sentences = ["{subject} is {descriptions}", "{subject} likes {hobbies}"]
    sentence = random.choice(valid_sentences).format(**sentence_vals)
    return sentence
print(sentence_maker3000()) # Might print "Jeremy is cool" or "Jeremy likes coding"
You can construct all your valid sentences using Python's formatting brackets. Very easy to read and much shorter to write.
You can write these valid sentences in a separate text file like so:
{subject} is {descriptions}
{subject} likes {hobbies}
and then replace the valid_sentences assignment with:
with open('input.txt', 'r') as f:
    valid_sentences = f.read().splitlines()
Use
print(eval(finalSentence))
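instead of print(finalSentence). eval(str) takes a string and evaluates it as a Python expression, as if it had been written directly in your code. Since it will run whatever the file contains, only use it on files you trust.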
Similar questions have been asked but none quite like this.
I need to save 2 pieces of information in a text file, the username and their associated health integer. Now I need to be able to look into the file and see the user and then see what value is connected with it. Writing it the first time I plan to use open('text.txt', 'a') to append the new user and integers to the end of the txt file.
My main problem is this: how do I figure out which value is connected to a user string? If they're on the same line, can I do something like read only the number in that line?
What are your guys' suggestions? If none of this works, I guess I'll need to move over to json.
This may be what you're looking for. I'd suggest reading one line at a time to parse through the text file.
Another method would be to read the entire txt and separate strings using something like text_data.split("\n"), which should work if the data is separated by line (denoted by '\n').
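A line-at-a-time lookup might look something like the sketch below. The file layout is an assumption on my part (one "username health" pair per line, separated by whitespace), find_health is a made-up name, and io.StringIO stands in for the real file so the example runs on its own.

```python
import io


def find_health(lines, username):
    """Hypothetical helper: return the latest health for username, or None."""
    health = None
    for line in lines:
        parts = line.split()
        # Only accept well-formed "name number" lines for this user.
        if len(parts) == 2 and parts[0] == username:
            health = int(parts[1])
    return health


# Stand-in for open('text.txt'); the names and numbers are invented.
fake_file = io.StringIO("alice 100\nbob 150\nalice 200\n")
print(find_health(fake_file, "alice"))  # 200
```

Because the loop keeps going after a match, a user who was appended more than once ends up with the most recent value.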
You're probably looking for configparser which is designed for just that!
Construct a new configuration
>>> import configparser
>>> config = configparser.ConfigParser()
>>> config.sections()
[]
>>> config['Players'] = {
...     "ti7": 999,
...     "example": 50
... }
>>> with open('example.cfg', 'w') as fh:
...     config.write(fh)  # write directly to file handle
...
Now read it back
>>> import configparser
>>> config = configparser.ConfigParser()
>>> config.read("example.cfg")
['example.cfg']
>>> print(dict(config["Players"]))
{'ti7': '999', 'example': '50'}
Inspecting the written file
% cat example.cfg
[Players]
ti7 = 999
example = 50
If you already have a text config written with one "key value" pair per line, you can probably parse it as follows:
user_healths = {}  # start with an empty dictionary
with open("text.txt", 'r') as fh:  # open file for reading
    for line in fh.read().strip().split('\n'):  # list lines, ignore last empty
        user, health = line.split(maxsplit=1)  # "a b c" -> ["a", "b c"]
        user_healths[user] = int(health)  # ValueError if not number
Note that if a user appears multiple times in text.txt, the last value listed wins, which may be what you want if you always append to the file.
% cat text.txt
user1 100
user2 150
user1 200
Parsing text.txt above:
>>> print(user_healths)
{'user1': 200, 'user2': 150}
I've just started to write little programs in Python, so my experience level is very low. At the moment I'm trying to read a file into a data structure in Python3, but I have no idea how to do it fast & easy to understand.
First, I have to explain the content of the file. There are headings, and the lines following each heading are additional information belonging to that heading.
Booklist.txt
Programming----------------
Python Cookbook
Python in a nutshell
Recipes--------------------
Slow Cooking
Clean Eating
Low Carb
Sports---------------------
Mastering Mountain Bike Skills
My idea is to have a structure like this:
{'Programming': ['Python Cookbook', 'Python in a nutshell'],
'Recipes': ['Slow Cooking', 'Clean Eating', 'Low Carb'], ... }
So far, I did something that seems to work:
import re
f = open('Booklist.txt')
myDict = dict()
for ind, line in enumerate(f):
    match = re.search(r"(^[\w ]+)([-]+)$", line)
    if match is not None:
        category = match.group(1)
        myDict[category] = []
    else:
        myDict[category].append(line)
f.close()
But what could I do with the index? Can I use it to sort the keys in any way? Dictionaries are unsorted, right?
It may be overkill, but you can use a python PEG parser like parsimonious to parse the booklist.txt. It will take you some time to learn the PEG syntax, but it is much easier to write robust code with an established library than doing everything yourself.
Basic usage:
from parsimonious.grammar import Grammar
grammar = Grammar(
    """
    body = ( category '\n' name+ '\n' ) +
    category = name '-'+
    name = ~"[a-zA-Z]*"i
    """)
with open('booklist.txt','r') as f:
    ast = grammar.parse(f.read())
print( ast )
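After the OP updated the question:
Yes, dict is unordered in older Python versions (since Python 3.7 a plain dict preserves insertion order). If you want to keep the original order regardless of version, use collections.OrderedDict. Also, if match is not None: can be simplified to if match: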
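Putting those two remarks together, a standard-library version could look roughly like this. It is a sketch under a few assumptions: parse_booklist and HEADING are names I picked, a heading is any line that ends in one or more dashes, and the sample text is the Booklist.txt content from the question, inlined so the example runs without the file.

```python
import re
from collections import OrderedDict

# A heading is some text followed by one or more dashes, e.g. "Sports----".
HEADING = re.compile(r"^([\w ]+?)-+$")


def parse_booklist(lines):
    """Hypothetical helper: map each heading to the titles listed under it."""
    books = OrderedDict()
    category = None
    for line in lines:
        line = line.strip()  # drop the trailing newline
        if not line:
            continue
        match = HEADING.match(line)
        if match:
            category = match.group(1)
            books[category] = []
        elif category is not None:
            books[category].append(line)
    return books


sample = """Programming----------------
Python Cookbook
Python in a nutshell
Recipes--------------------
Slow Cooking
Clean Eating
Low Carb
Sports---------------------
Mastering Mountain Bike Skills
"""

books = parse_booklist(sample.splitlines())
for category, titles in books.items():
    print(category, titles)
```

The keys come back in file order, so the enumerate index is not needed for sorting; with a real file you would pass the open file object instead of sample.splitlines().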
I have a list similar to this:
m=[['qw','wew','23','C:/xyz/s.wav'],['qw','wew','23','C:/xyz/s2.wav'],['qw','wew','23','C:/xyz/s1.wav']]
Now I want to open these files:
win=wave.open(m[0][3],'rb')
It gives an error. How can I use the list in this way?
I want to take the file names from the list.
Please suggest.
Do this:
import wave
m = [['qw','wew','23','C:/xyz/s.wav'],['qw','wew','23','C:/xyz/s2.wav'],['qw','wew','23','C:/xyz/s1.wav']]
fname = m[0][3]
print('fname is', repr(fname))
win = wave.open(fname, 'rb')
and show us (using copy/paste into an edit of your question) everything that is printed, especially
(1) the result of print('fname is', repr(fname))
(2) the ERROR MESSAGE