Python3: Use Dateutil to make multiple vars from a long string - python

I have a program where I would like to randomly pull a line from a song and string it together with lines from other songs. Looking into it, I saw that the dateutil library might be able to help me parse multiple variables from a string, but it doesn't do quite what I want.
I have multiple strings like this, only much longer.
"This is the day\nOf the expanding man\nThat shape is my shade\nThere where I used to stand\nIt seems like only yesterday\nI gazed through the glass\n..."
I want to randomly pull one line from this string (up to the line break), save it as a variable, and repeat this over multiple strings. Any help would be much appreciated.

Assuming you want to pull one line at random from the string, you can use choice from the random module.
random.choice(seq): Return a random element from the non-empty sequence seq. If seq is empty, raises IndexError.
from random import choice
data = "This is the day\nOf the expanding man\nThat shape is my shade\nThere where I used to stand\nIt seems like only yesterday\nI gazed through the glass\n..."
my_lines = data.splitlines()
my_line = choice(my_lines)
print(my_line)
OUTPUT
That shape is my shade
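Since the question also asks about repeating this over several strings, here is a minimal sketch of one way to do it. The names `songs` and `pick_lines` and the placeholder second lyric are made up for illustration; blank lines are skipped so that an empty string is never chosen.

```python
from random import choice


def pick_lines(songs):
    """Return one randomly chosen non-blank line from each string in songs."""
    picked = []
    for song in songs:
        lines = [line for line in song.splitlines() if line.strip()]
        if lines:  # choice() raises IndexError on an empty sequence
            picked.append(choice(lines))
    return picked


# Hypothetical input: each string holds one song, lines separated by "\n".
songs = [
    "This is the day\nOf the expanding man\nThat shape is my shade",
    "Placeholder line one\nPlaceholder line two",
]

# Join the picked lines into one new "verse".
verse = "\n".join(pick_lines(songs))
print(verse)
```

Each call picks again, so running it twice will usually give a different combination.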

Related

Adding strings together adds brackets and quotation marks

I'm new to programming and trying to learn it by doing small projects. Currently I'm working on a random string generator and I have it 99% done, but I cant get the output to be the way I want it to be.
First, here is the code:
import random
def pwgenerator():
    print("This is a randomm password generator.")
    print("Enter the lenght of your password and press enter to generate a password.")
    lenght = int(input())
    template = "abcdefghijklmnopqrstuvwxyz01234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ!" # this is the sample used for choosing random characters
    generator = ""
    for x in range(lenght): # join function goes through str chronologically and I want it fully randomized, so I made this loop
        add_on = str(random.sample(template, 1))
        generator = generator + add_on
        #print(add_on) added this and next one to test if these are already like list or still strings.
        #print(generator)
    print(generator) # I wanted this to work, but...
    for x in range(lenght): #...created this, because I thought that I created list with "generator" and tried to print out a normal string with this
        print(generator[x], end="")
pwgenerator()
The original code was supposed to be this:
import random
def pwgenerator():
    print("This is a randomm password generator.")
    print("Enter the lenght of your password and press enter to generate a password.")
    lenght = int(input())
    template = "abcdefghijklmnopqrstuvwxyz01234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ!"
    generator = ""
    for x in range(lenght):
        generator = generator + str(random.sample(template, 1))
    print(generator)
pwgenerator()
The problem is that with this original code and, for example, an input of 10, I get this result:
['e']['3']['i']['E']['L']['I']['3']['r']['l']['2']
What I would want as an output here would be "e3iELI3rl2".
As you can see in the first code, I tried a few things, because it looked to me like I was somehow creating a list with lists as items, each with 1 entry. So I thought I would just print out each item, but the result was (for a user input/lenght of 10):
['e']['3']
So it just printed out each character in that list as a string (including the brackets and quotation marks), which I interpret as whatever I created not being a list, but actually still a string.
Doing some research, and assuming I still created a string, I found this from W3Schools. If I understand it correctly, though, I'm doing everything right when trying to add strings together.
Can you please tell me what's going on here, specifically why I get output that looks like a list of lists?
And if you can spare some more time, I'd also like to hear about a better way to do this, but I mainly want to understand what's going on rather than be given a solution. I'd like to find a solution myself. :D
Cheers
PS:
Just in case you are wondering: I'm trying to learn by doing and currently follow the suggested mini projects from HERE. But in this case I read on W3Schools that the "join" method produces results in chronological order, so I added the additional complication of making it really random.
Okay, so the problem is that random.sample returns a list of strings instead of a string, as you can see below:
template = "abcdefghijklmnopqrstuvwxyz01234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ!"
random.sample(template, 1)
Out[5]: ['H']
What actually happened is that you were concatenating the string form of each list (e.g. ['H'] was converted to "['H']" and then printed on the screen as ['H']).
After changing the function to random.choice, it worked fine:
random.choice(template)
Out[6]: 'j'
Switch random.sample to random.choice in your function and it should work as you expected.
The random.sample() function returns a list chosen from the given string.
That's why you're getting a bunch of lists stacked together.
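To make the difference concrete, here is a small sketch, not the asker's exact program. The names `ALPHABET` and `make_password` are made up for illustration. It shows that the string form of a one-element list is five characters long, and builds the password from `random.choice` results instead.

```python
import random
import string

# Illustrative character set: letters, digits and "!" as in the question.
ALPHABET = string.ascii_letters + string.digits + "!"


def make_password(length, alphabet=ALPHABET):
    """Build a string of `length` characters, each picked independently."""
    # random.choice returns a single character (a str), so join works directly.
    return "".join(random.choice(alphabet) for _ in range(length))


# random.sample returns a list, and str() of that list keeps the
# brackets and quotes, e.g. "['H']" (five characters instead of one).
wrapped = str(random.sample(ALPHABET, 1))
print(wrapped, len(wrapped))

print(make_password(10))
```

This also explains the second symptom in the question: printing the first 10 characters of the concatenated string covers exactly two of those five-character chunks. For real passwords, the standard library's `secrets.choice` is generally a better fit than `random.choice`.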

python parsing file into data structure

So I started looking into it, and I haven't found a good way to parse a file following the format I will show you below. I have taken a data structures course, but it doesn't really help me with what I want to do. Any help will be greatly appreciated!
Goal: Create a tool that can read, create, and manipulate a custom file type
File Format: I'm sure there is a name for this type of format, but I couldn't find it. Anyways, the format is subject to some change since the variable names can be added, removed, or changed. Also, after each variable name the data could be one of several different types. Right now the files do not use sub groups, but I want to be prepared in case they decide to change that. The only things I can think of that will remain constant are the GROUP = groupName, END_GROUP = groupName, and the varName = data.
GROUP = myGroup
name1 = String, datenum, number, list, array
name2 = String, datenum, number, list, array
// . . .
name# = String, datenum, number, list, array
GROUP = mySubGroup
name1 = String, datenum, number, list, array
END_GROUP = mySubGroup
// More names could go here
END_GROUP = myGroup
GROUP = myGroup2
// etc.
END_GROUP = myGroup2
Strings and dates are enclosed in " (e.g. "myString")
Numbers are written as raw ASCII-encoded numbers. They also use the E format if they are large or small (e.g. 5.023E-6)
Lists are comma separated and enclosed in parentheses (e.g. (1,2,3,4) )
Additional Info:
I want to be able to easily read a file and manipulate it as needed. For example, if I read the file and I want to change an attribute of a specific variable within a group I should be able to do something along the lines of dataStructure.groupName.varName = newData.
It should be easy to create my own file (using a default template that I will make myself or a custom template that has been passed in).
I want it to treat numbers as numbers and not strings. I should be able to add, subtract, multiply, etc. values within the data structure that are numbers
The big kicker, I'd like to have this written in vanilla python since our systems have only the most basic modules. It is a huge pain for someone to download another module since they have to create their own virtual environment and import the module to it. This tool should be as system independent as possible
Initial Attempt: I was thinking of using a dictionary to organize the data in levels. I do, however, like the idea of using dot structures (like what one would see using MATLAB structures). I wrote a function that will read all the lines of the file and remove the newline characters from each line. From there I want to check for every GROUP = I can find. I would start adding data to that group until I hit an END_GROUP line. Using regular expressions I should be able to parse out the line to determine whether it is a date, number, string, etc.
I am asking this question because I hope to have some insight on things I may be missing. I'd like for this tool to be used long after I've left the dev team which is why I'm trying to do my best to make it as intuitive and easy to use as possible. Thank you all for your help, I really appreciate it! Let me know if you need any more information to help you help me.
EDIT: To clarify what help I need, here are my two main questions I am hoping to answer:
How should I build a data structure to hold grouped data?
Is there an accepted algorithm for parsing data like this?
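The approach described under "Initial Attempt" (nested dictionaries, filled line by line between GROUP and END_GROUP) can be sketched in vanilla Python roughly as follows. This is only one possible reading of the format: the function names `parse_value` and `parse_groups` are hypothetical, dates are left as plain strings, and list items that themselves contain commas are not handled.

```python
def parse_value(text):
    """Convert the right-hand side of `name = value` into a Python object."""
    text = text.strip()
    if len(text) >= 2 and text.startswith('"') and text.endswith('"'):
        return text[1:-1]                      # quoted string or date
    if text.startswith("(") and text.endswith(")"):
        inner = text[1:-1].strip()
        return [parse_value(part) for part in inner.split(",")] if inner else []
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)                     # also accepts 5.023E-6
    except ValueError:
        return text                            # unknown type: keep raw text


def parse_groups(lines):
    """Build nested dicts from GROUP / END_GROUP / name = value lines."""
    root = {}
    stack = [root]                             # stack[-1] is the open group
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("//"):
            continue                           # skip blanks and comments
        name, sep, value = line.partition("=")
        if not sep:
            continue                           # not a `name = value` line
        name, value = name.strip(), value.strip()
        if name == "GROUP":
            group = {}
            stack[-1][value] = group
            stack.append(group)
        elif name == "END_GROUP":
            stack.pop()
        else:
            stack[-1][name] = parse_value(value)
    return root


sample = """GROUP = myGroup
name1 = "myString"
name2 = 5.023E-6
GROUP = mySubGroup
name3 = (1,2,3,4)
END_GROUP = mySubGroup
END_GROUP = myGroup""".splitlines()

tree = parse_groups(sample)
print(tree["myGroup"]["mySubGroup"]["name3"])
```

Numbers come back as `int` or `float`, so arithmetic works on them directly. Dot access such as `dataStructure.groupName.varName` could be layered on top of the dictionaries later, for example with a small wrapper class.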

Trying to fill an array with data opened from files

The following is code I have written that tries to open individual files, which are long strips of data, and read them into an array. Essentially I have files that run over 15 times (24 hours to 360 hours), and each file has an iteration of 50, hence the two loops. I then try to open the files into an array. When I try to print a specific element in the array, I get the error "'file' object has no attribute '__getitem__'". Any ideas what the problem is? Thanks.
#!/usr/bin/python
############################################
#
import csv
import sys
import numpy as np
import scipy as sp
#
#############################################
level = input("Enter a level: ");
LEVEL = str(level);
MODEL = raw_input("Enter a model: ");
NX = 360;
NY = 181;
date = 201409060000;
DATE = str(date);
#############################################
FileList = [];
data = [];
for j in range(1,51,1):
    J = str(j);
    for i in range(24,384,24):
        I = str(i);
        fileName = '/Users/alexg/ECMWF_DATA/DAT_FILES/'+MODEL+'_'+LEVEL+'_v_'+J+'_FT0'+I+'_'+DATE+'.dat';
        FileList.append(fileName);
        fo = open(fileName,"rb");
        data.append(fo);
        fo.close();
print data[1][1];
print FileList;
EDITED TO ADD:
Below, find the CORRECT array that the Python script should be producing (sorry, it won't let me post this inline yet):
http://i.stack.imgur.com/ItSxd.png
The problem I now run into, is that the first three values in the first row of the output matrix are:
-7.090874
-7.004936
-6.920952
These values are actually the first three values of the 11th row in the array below, which is how it should look (performed in MATLAB). The next three values the Python script outputs (as what it believes to be the second row) are:
-5.255577
-5.159874
-5.064171
These values should be found in the 22nd row. In other words, Python is placing the 11th row of values in the first position, the 22nd in the second, and so on. I don't have a clue as to why, or where in the code I'm telling it to do this.
You're appending the file objects themselves to data, not their contents:
fo = open(fileName,"rb");
data.append(fo);
So, when you try to print data[1][1], data[1] is a file object (a closed file object, to boot, but it would be just as broken if still open), so data[1][1] tries to treat that file object as if it were a sequence, and file objects aren't sequences.
It's not clear what format your data are in, or how you want to split it up.
If "long strips of data" just means "a bunch of lines", then you probably wanted this:
data.append(list(fo))
A file object is an iterable of lines, it's just not a sequence. You can copy any iterable into a sequence with the list function. So now, data[1][1] will be the second line in the second file.
(The difference between "iterable" and "sequence" probably isn't obvious to a newcomer to Python. The tutorial section on Iterators explains it briefly, the Glossary gives some more information, and the ABCs in the collections module define exactly what you can do with each kind of thing. But briefly: An iterable is anything you can loop over. Some iterables are sequences, like list, which means they're indexable collections that you can access like spam[0]. Others are not, like file, which just reads one line at a time into memory as you loop over it.)
If, on the other hand, you actually imported csv for a reason, you more likely wanted something like this:
reader = csv.reader(fo)
data.append(list(reader))
Now, data[1][1] will be a list of the columns from the second row of the second file.
Or maybe you just wanted to treat it as a sequence of characters:
data.append(fo.read())
Now, data[1][1] will be the second character of the second file.
There are plenty of other things you could just as easily mean, and easy ways to write each one of them… but until you know which one you want, you can't write it.
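The three readings above can be compared side by side in a short Python 3 sketch (the question's code is Python 2, so this is an adaptation, not a drop-in patch). The helper name `load_files`, its `mode` parameter, and the tiny demo files are assumptions made for illustration.

```python
import csv
import os
import tempfile


def load_files(paths, mode="lines"):
    """Read each file's contents (not the file object) into a list."""
    data = []
    for path in paths:
        # The with-block closes the file after its contents are copied out.
        with open(path, newline="") as fo:
            if mode == "lines":
                data.append(list(fo))              # list of lines
            elif mode == "csv":
                data.append(list(csv.reader(fo)))  # list of rows of columns
            elif mode == "chars":
                data.append(fo.read())             # one long string
            else:
                raise ValueError("unknown mode: %r" % mode)
    return data


# Demo with two small throwaway files (hypothetical contents).
tmpdir = tempfile.mkdtemp()
paths = []
for name, text in [("a.dat", "1,2\n3,4\n"), ("b.dat", "5,6\n7,8\n")]:
    path = os.path.join(tmpdir, name)
    with open(path, "w", newline="") as fo:
        fo.write(text)
    paths.append(path)

print(repr(load_files(paths, "lines")[1][1]))  # second line of second file
print(load_files(paths, "csv")[1][1])          # columns of that row
print(repr(load_files(paths, "chars")[1][1]))  # second character of second file
```

In every mode `data[1][1]` is now indexable, because `data[1]` holds a list or string rather than a closed file object.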

Python-How can i change part of a row in a CSV file?

I have a CSV file with some words in it, each followed by a number, and I need a way to update the number: either adding 1 to it or setting it back to 1.
Say for instance I have these words:
variant,1
sixty,2
game,3
library,1
If the user inputs the word sixty, how could I use that to add one to its number, and how would I reset it back to 1?
I've been all over Google+Stackoverflow trying to find an answer, but I expect me not being able to find an answer was due more to my inexperience than anything.
Thanks.
This is a quick throw-together using fileinput. Since I don't know the conditions under which you would decrease or reset your value, I added them as keyword arguments you can pass at will, such as:
updateFileName(filename, "sixty", reset=True)
updateFileName(filename, "sixty", decrease=True)
updateFileName(filename, "sixty")
The results of each should be self-explanatory. Good luck! I wrapped it in a try since I had no idea how your file is structured; it will still fail on unexpected input either way. If you have spaces, you will need to .strip() the key and value.
import fileinput
import sys
def updateFileName(filename, input_value, decrease=False, reset=False):
    try:
        # With inplace=True, whatever is written to stdout replaces the file's contents.
        for line in fileinput.input(filename, inplace=True):
            key, value = line.split(",")
            if key == input_value:
                if decrease:
                    sys.stdout.write("%s,%s\n" % (key, int(value) - 1))
                elif reset:
                    sys.stdout.write("%s,%s\n" % (key, 1))
                else:
                    sys.stdout.write("%s,%s\n" % (key, int(value) + 1))
                continue
            # Every other line is written back unchanged.
            sys.stdout.write(line)
    finally:
        fileinput.close()
Without knowing when you want to switch a number to 1 and when you want to add 1, I can't give a full answer, but I can set you on the right track.
First you want to import the csv file, so you can change it around.
The csv module is very helpful in this. Read about it here: http://docs.python.org/2/library/csv.html
You will likely want to read it into a dictionary structure, because this will link each word to its corresponding number. Something like the approach in "make dictionary from csv file columns".
Then you'll want to use raw_input (or input if you are using Python 3)
to get the word you are looking for and use that as the key to find and change the number you want to change. http://anh.cs.luc.edu/python/hands-on/handsonHtml/handson.html#x1-490001.12
or http://www.sthurlow.com/python/lesson06/
will show you how to get one part of a dictionary by giving it the other part and how to save info back into a dictionary.
Sorry it's not a direct answer. Maybe someone else will write one up, but it should get you started.
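A minimal Python 3 sketch of the dictionary approach described above might look like this. The function name `update_count`, its `reset` flag, and the temporary file are assumptions for illustration; when to reset instead of add is still left to the caller.

```python
import csv
import os
import tempfile


def update_count(path, word, reset=False):
    """Add 1 to the number stored for `word`, or set it back to 1."""
    # Read the whole file into a dict that maps each word to its number.
    with open(path, newline="") as f:
        counts = {row[0]: int(row[1]) for row in csv.reader(f) if row}
    if word not in counts:
        raise KeyError(word)
    counts[word] = 1 if reset else counts[word] + 1
    # Write every row back; dicts keep insertion order in Python 3.7+.
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(counts.items())
    return counts[word]


# Demo on a throwaway copy of the data from the question.
path = os.path.join(tempfile.mkdtemp(), "words.csv")
with open(path, "w", newline="") as f:
    f.write("variant,1\nsixty,2\ngame,3\nlibrary,1\n")

print(update_count(path, "sixty"))             # adds one to sixty's number
print(update_count(path, "game", reset=True))  # sets game's number back to 1
```

Because the whole file is rewritten on each call, this suits small files; for very large ones a line-by-line approach would use less memory.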
This is very probably not the best way to do it, but you can load your CSV file into an array object using numpy with the help of loadtxt().
The following code is going to give you a 2-dimensional array with names in the first column and your numbers in the second one. Because the first column holds words, the array has to be loaded as strings (dtype=str), so convert the numbers with int() before changing them.
import numpy as np
a = np.loadtxt('YourFile', delimiter=',', dtype=str)
Perform your changes on the numbers the way you want and use numpy's savetxt() (with fmt='%s' and delimiter=',') to save your file.
If your file is very heavy, this solution is going to be a pain, as loading a huge array takes a lot of memory. So consider it just a workaround. The dictionary solution is actually better (I think).

Generate N amount of outputs depending on input. Python

import os,random
def randomizer(input,sample_size,output='NM_controls.xls'):
    #Input file with query
    query=open(input,'r').read().split('\n')
    dir,file=os.path.split(input)
    temp = os.path.join(dir,output)
    out_file=open(temp,'w')
    name_list = 833 #### got this from input
    output_amount = (name_list)/(sample_size) #### = 3.6 but I want the floor value so its fine
I'm writing a function right now, and the number of outputs depends on the input. The function takes in a file, scans it, and partitions the names and other data.
The next part samples randomly from these lists, with the samples being mutually exclusive. I want it to generate a certain number of files, but that number depends on how many names are in the list.
Is there any way to use the 'os' module to create a certain number of lists that I did not specify in variables?
In this case it would be 3 outputs: ['output_1.txt','output_2.txt','output_3.txt']
**Sorry for the confusion! I'm using the os module because I want to create new files in the same directory as the input file. That is the only way I know how to do it.
If you want to "create a certain number of lists" (where that number isn't known until runtime), that's easy:
my_lists = []
for _ in range(certain_number):
my_lists.append(make_another_list())
Or, even more simply:
my_lists = [make_another_list() for _ in range(certain_number)]
Or, if you have a dynamic number of input lists, and want to create one output list for each, you don't even need to calculate the number:
my_lists = [transform_a_list(input_list) for input_list in input_lists]
And obviously, there is nothing list-specific here. As you can tell, other than the word list appearing in the names of the make_another_list and transform_a_list functions, I'm not making any use of the fact that the things you're creating a certain number of are lists. You can create a certain number of strings, numbers, arrays, threads, files, databases, etc. the same way.
If you want to write to output_amount files (as indicated by the out_file variable), then the following should work.
for n in range(output_amount):
    outfile = open('%s_%d.txt'%(temp, n), 'w')
    # Now write the data to output file n
If you want to create a list with n items, see abarnert's answer.
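Putting the two answers together, here is a hedged Python 3 sketch that computes the number of outputs with floor division and builds the file names next to the input file. The function name `output_paths`, the `prefix` parameter, the example path, and the sample size of 231 (which gives the 3.6 ratio mentioned in the question) are assumptions for illustration.

```python
import os


def output_paths(input_path, name_count, sample_size, prefix="output"):
    """Return one output path per full sample, in the input file's directory."""
    directory = os.path.dirname(input_path)
    # Floor division: in Python 3, "/" would give a float that range() rejects.
    output_amount = name_count // sample_size
    return [
        os.path.join(directory, "%s_%d.txt" % (prefix, n))
        for n in range(1, output_amount + 1)
    ]


# Hypothetical input location; 833 names at 231 per sample gives 3 files.
paths = output_paths(os.path.join("data", "names.txt"), 833, 231)
for path in paths:
    print(path)
```

Each path can then be passed to `open(path, 'w')` inside the loop to write that sample's data.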
