Create complex file name from user input - python

I’m trying to adapt a script that currently contains the following segment:
# Initialize the output files
working_dir = os.getcwd()
output_path = "{}/{}".format(working_dir, "output_Prelim")
if not os.path.exists(output_path):
    os.mkdir(output_path)
data_file = "{0}/RWA_2010_BUFFER_by_1.csv".format(output_path)
error_file = "{0}/failed_queries.txt".format(output_path)
In the statement that begins “data_file,” the parts of the file name “RWA” and “2010” refer to the country and year in which a particular survey was conducted.
I’m trying to adapt that segment so that the file name preserves the same general format, but allows the user to enter a different country code and year.
I can generate a string called “file_name” that looks right, using the following code:
print('Enter the country code')
cCode =input()
print('The country code is '+cCode)
print('Enter the survey year')
srvyYear =input()
print('The survey year is '+srvyYear)
file_name = r'"{0}/'+cCode+'_'+srvyYear+'_'+'BUFFER300_by_1.csv"'\
When I print “file_name,” I get
"{0}/BDI_2009_BUFFER300_by_1.csv"
That looks right, but am not sure what to do with it - in particular, how to get it understood as a file name rather than as a string. When I try to concatenate that string with the remainder of the statement that begins “data_file,” I get a syntax error.
Obviously I need to do a tutorial, but am not sure what to look for.
Many thanks, and apologies for the newbie question.

Not sure what your problem is exactly, but if you want to replace the {0} part with something else (e.g. the value in output_path), you can just do file_name.format(output_path).

Why not use the string join() method for the name and os.path.join() for the path? For example:
file_name = os.path.join(output_path, '_'.join([cCode, srvyYear, 'BUFFER300_by_1.csv']))
See the documentation for os.path.join(); I believe that is what you need. It is better not to build paths or file names by concatenating strings yourself. Also, os.path.join() builds the path with the correct separator for the platform, which makes it a smart choice (especially on Windows).
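Putting the two answers together, a minimal sketch might look like the following. The helper name build_data_file is made up for illustration, and the country code and year are hard-coded here; in the real script they would come from input() as in the question.

```python
import os


def build_data_file(output_path, country_code, survey_year):
    """Return the CSV path for one survey (hypothetical helper, not from the original script)."""
    # Build the file name first, then let os.path.join add the separator.
    file_name = "{}_{}_BUFFER300_by_1.csv".format(country_code, survey_year)
    return os.path.join(output_path, file_name)


output_path = os.path.join(os.getcwd(), "output_Prelim")
# "BDI" and "2009" stand in for the values read with input().
data_file = build_data_file(output_path, "BDI", "2009")
print(data_file)
```

A file name is just a string, so data_file can be passed straight to open() or to whatever function writes the CSV.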

Related

How would I be able to remove this part of the variable?

So I am making a code like a guessing game. The data for the guessing game is in the CSV file so I decided to use pandas. I have tried to use pandas to import my csv file, pick a random row and put the data into variables so I can use it in the rest of the code but, I can't figure out how to format the data in the variable correctly.
I've tried to split the string with split() but I am quite lost.
import pandas
ar = pandas.read_csv('names.csv')
ar.columns = ["Song Name","Artist","Intials"]
randomsong = ar.sample(1)
songartist = randomsong["Artist"]
songname = (randomsong["Song Name"])
songintials = randomsong["Intials"]
print(songname)
My CSV file looks like this.
Song Name,Artist,Intials
Someone you loved,Lewis Capaldi,SYL
Bad Guy,Billie Eilish,BG
Ransom,Lil Tecca,R
Wow,Post Malone, W
I expect the output to be the name of the song from the csv file. For Example
Bad Guy
Instead the output is
1 Bad Guy
Name: Song Name, dtype:object
If anyone knows the solution please let me know. Thanks
You're getting a Series object as output. You can try
randomsong["Song Name"].to_string(index=False)
Use df['column'].values to get the values of a column.
In your case, songartist = randomsong["Artist"].values[0], because you want only the first element of the returned array.

Extract information in a line of text with a format from user input

I am trying to make a program that takes song files and a format as input and writes meta tags to the files. Here are a few examples of the call:
./parser '%n_-_%t.mp3' 01_-_Respect.mp3 gives me track=01; title=Respect
./parser '%b._%n.%t.mp3' The_Queen_of_Soul._01.Respect.mp3 gives me album=The_Queen_of_Soul; track=01; title=Respect
./parser '%a-%b._%n.%t.mp3' Aretha_Franklin-The_Queen_of_Soul._01.Respect.mp3 gives me artist=Aretha_Franklin; track=01; title=Respect
./parser '%a_-_%b_-_%n_-_%t.mp3' Aretha_Franklin_-_The_Queen_of_Soul_-_01_-_Respect.mp3 gives me artist=Aretha_Franklin; track=01; title=Respect
For a call on the file 01_-_Respect.mp3, I'd like to have a variable containing 01, and the other Respect.
Here %n and %t represent the number and the title of the song, respectively. The problem is that I don't know how to extract this information in bash (or possibly in Python).
My biggest problem is that I don't know the format in advance!
Note: There is more information than this, for example %b for the album, %a for the artist etc.
Well, you can use the string method split to split the string by _-_.
To take the input from the command line, you can use sys.argv.
Here's an example:
import sys
number,title = sys.argv[1].split("_-_")
Update:
You can also pass the separator as the first argument and the file name as the second argument, like this:
import sys
separator = sys.argv[1]  # e.g. _-_
number, title = sys.argv[2].split(separator)
Now if you need more complex and dynamic processing, then regex is your winning card!
And in order to write a good regex, you have to understand your data and your problem, or you'll end up writing a glitchy regex.
You can elaborate on this. It is a very simple example, though.
import re
p = re.compile(r'(\d+)_-_(.*)\.mp3')
title = '01_-_Respect.mp3'
p.findall(title)
Output
[('01', 'Respect')]
I use this page to play with regex.
Update
Since the format is given, go with string slicing. OK, this is pretty limited to the specific case.
>>> number = title[:title.find('_')]
>>> number
'01'
>>> track = title[len(number) + 3:len(title)-4]
>>> track
'Respect'
Try This code:
(considering argument is given in runtime)
tmp=$1
num=${tmp%%_*}
title=$(echo "${tmp##*_}" | cut -d. -f1)
Variables num and title will store the parts from the argument
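Since the format is not known in advance, one possible approach in Python is to translate the format string itself into a regular expression with named groups. The sketch below is only an illustration: the FIELDS mapping and the function names are assumptions, and it presumes that track numbers are digits and that each placeholder appears at most once in a format.

```python
import re

# Assumed mapping from placeholders to tag names (taken from the examples in the question).
FIELDS = {"a": "artist", "b": "album", "n": "track", "t": "title"}


def format_to_regex(fmt):
    """Translate a format such as '%n_-_%t.mp3' into a compiled regex with named groups."""
    parts = []
    # The capturing group makes re.split keep the placeholders in the result.
    for token in re.split(r"(%[abnt])", fmt):
        if len(token) == 2 and token[0] == "%" and token[1] in FIELDS:
            name = FIELDS[token[1]]
            if name == "track":
                parts.append(rf"(?P<{name}>\d+)")   # track numbers: digits only
            else:
                parts.append(rf"(?P<{name}>.+?)")   # other fields: lazy match
        else:
            parts.append(re.escape(token))          # literal text between placeholders
    return re.compile("".join(parts))


def parse(fmt, filename):
    """Return a dict of tags, or None if the file name does not fit the format."""
    match = format_to_regex(fmt).fullmatch(filename)
    return match.groupdict() if match else None


print(parse("%n_-_%t.mp3", "01_-_Respect.mp3"))
print(parse("%a-%b._%n.%t.mp3", "Aretha_Franklin-The_Queen_of_Soul._01.Respect.mp3"))
```

The lazy groups stop at the first place where the following literal text matches, so a format whose separator also occurs inside a field value can still be split in the wrong place.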

Python - Searching a dictionary for strings

Basically, I have a troubleshooting program, which, I want the user to enter their input. Then, I take this input and split the words into separate strings. After that, I want to create a dictionary from the contents of a .CSV file, with the key as recognisable keywords and the second column as solutions. Finally, I want to check if any of the strings from the split users input are in the dictionary key, print the solution.
However, the problem I am facing is that I can do what I have stated above, however, it loops through and if my input was 'My phone is wet', and 'wet' was a recognisable keyword, it would go through and say 'Not recognised', 'Not recognised', 'Not recognised', then finally it would print the solution. It says not recognised so many times because the strings 'My', 'phone' and 'is' are not recognised.
So how do I test whether any word of the user's split input is in my dictionary without it printing 'Not recognised' for every word that isn't a keyword?
Sorry if this was unclear, I'm quite confused by the whole matter.
Code:
import csv, easygui as eg
KeywordsCSV = dict(csv.reader(open('Keywords and Solutions.csv')))
Problem = eg.enterbox('Please enter your problem: ', 'Troubleshooting').lower().split()
for Problems, Solutions in (KeywordsCSV.items()):
    pass
Note, I have the pass there, because this is the part I need help on.
My CSV file consists of:
problemKeyword | solution
For example;
wet Put the phone in a bowl of rice.
Your code reads like some ugly code golf. Let's clean it up before we look at how to solve the problem
import easygui as eg
import csv
# # KeywordsCSV = dict(csv.reader(open('Keywords and Solutions.csv')))
# why are you nesting THREE function calls? That's awful. Don't do that.
# KeywordsCSV should be named something different, too. `problems` is probably fine.
with open("Keywords and Solutions.csv") as f:
    reader = csv.reader(f)
    problems = dict(reader)
problem = eg.enterbox('Please enter your problem: ', 'Troubleshooting').lower().split()
# this one's not bad, but I lowercased your `Problem` because capital-case
# words are idiomatically class names. Chaining this many functions together isn't
# ideal, but for this one-shot case it's not awful.
Let's break a second here and notice that I changed something on literally every line of your code. Take time to familiarize yourself with PEP8 when you can! It will drastically improve any code you write in Python.
Anyway, once you've got a problems dict and a word that should be a KEY in that dict (note that problem above is a list of words, so look up each word rather than the whole list), you can do:
if word in problems:
    solution = problems[word]
or even using the default return of dict.get:
solution = problems.get(word)
# solution is None if the key is missing
If you wanted to loop this, you could do something like:
while True:
    problem = eg.enterbox(...)  # as above
    # first solution whose keyword appears in the input, or None
    solution = next((problems[word] for word in problem if word in problems), None)
    if solution is None:
        # invalid problem, warn the user
        continue
    else:
        # display the solution? Do whatever it is you're doing with it and...
        break
Just have a boolean and an if after the loop that only runs if none of the words in the sentence were recognized.
I think you might be able to use something like:
for word in Problem:
    if word in KeywordsCSV:  # dict.has_key() no longer exists in Python 3
        print(KeywordsCSV[word])
or the list comprehension:
[KeywordsCSV[word] for word in Problem if word in KeywordsCSV]
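The "check after the loop" idea can be sketched as follows: collect every solution first, and print 'Not recognised' only once, if nothing matched. To keep the sketch self-contained it uses a small in-memory dict and a fixed sentence instead of the CSV file and easygui; the function name find_solutions is made up.

```python
def find_solutions(words, solutions_by_keyword):
    """Return the solutions for every word that is a known keyword."""
    return [solutions_by_keyword[word] for word in words if word in solutions_by_keyword]


# Stand-in for dict(csv.reader(...)) from the question.
keywords = {"wet": "Put the phone in a bowl of rice."}
# Stand-in for the easygui input.
words = "My phone is wet".lower().split()

found = find_solutions(words, keywords)
if found:
    for solution in found:
        print(solution)
else:
    # Reached only when no word at all was recognised.
    print("Not recognised")
```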

match hex string with list indice

I'm building a de-identify tool. It replaces all names by other names.
We got a report that <name>Peter</name> met <name>Jane</name> yesterday. <name>Peter</name> is suspicious.
Output:
We got a report that <name>Billy</name> met <name>Elsa</name> yesterday. <name>Billy</name> is suspicious.
It can be done on multiple documents, and one name is always replaced by the same counterpart, so you can still understand who the text is talking about. BUT, all documents have an ID, referring to the person this file is about (I'm working with files in a public service) and only documents with the same people ID will be de-identified the same way, with the same names. (the goal is to watch evolution and people's history) This is a security measure, such as when I hand over the tool to a third party, I don't hand over the key to my own documents with it.
So the same input, with a different ID, produces :
We got a report that <name>Henry</name> met <name>Alicia</name> yesterday. <name>Henry</name> is suspicious.
Right now, I'm hashing each name with the document ID as a salt, I convert the hash to an integer, then subtract the length of the name list until I can request a name with that integer as an index. But I feel like there should be a quicker/more straightforward approach?
It's really more of an algorithmic question, but if it's of any relevance, I'm working with Python 2.7. Please request more explanation if needed. Thank you!
I hope it's clearer this way ô_o Sorry when you are neck-deep in your code you forget others need a bigger picture to understand how you got there.
As @LutzHorn pointed out, you could just use a dict to map real names to false ones.
You could also just do something like:
existing_names = []
# first pass: give every distinct name a stable id
for nameoccurrence in original_text:
    if nameoccurrence.name not in existing_names:
        nameoccurrence.id = len(existing_names)
        existing_names.append(nameoccurrence.name)
    else:
        nameoccurrence.id = existing_names.index(nameoccurrence.name)
# second pass: swap each stored name for a random replacement
for idx, _ in enumerate(existing_names):
    existing_names[idx] = gimme_random_name()
Try using a dictionary of names.
import re
s = ("We got a report that <name>Peter</name> met <name>Jane</name> yesterday. "
     "<name>Peter</name> is suspicious.")
names = {"Peter": "Billy", "Jane": "Elsa"}
for name in re.findall("<name>([a-zA-Z]+)</name>", s):
    s = re.sub("<name>" + name + "</name>", "<name>" + names[name] + "</name>", s)
print(s)
Output:
We got a report that <name>Billy</name> met <name>Elsa</name> yesterday. <name>Billy</name> is suspicious.
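If the salted-hash approach from the question is kept, the repeated subtraction can be replaced by a single modulo operation, which gives the same index in one step. This is only a sketch: the choice of SHA-256, the function name pick_replacement and the sample names are assumptions, not the asker's actual code.

```python
import hashlib


def pick_replacement(name, document_id, replacement_names):
    """Map (document_id, name) to a fixed entry of replacement_names."""
    # Salt the name with the document ID so each ID gets its own mapping.
    salted = "{}:{}".format(document_id, name).encode("utf-8")
    digest = hashlib.sha256(salted).hexdigest()
    # The modulo replaces "subtract the list length until the index fits".
    index = int(digest, 16) % len(replacement_names)
    return replacement_names[index]


replacements = ["Billy", "Elsa", "Henry", "Alicia"]  # illustrative list
print(pick_replacement("Peter", "doc-1", replacements))
```

Note that two different real names can land on the same index, so a short replacement list may give two people the same pseudonym; the dict-based answers avoid that at the cost of storing the mapping.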

Writing and Editing Files (Python)

First of all, I would like to apologize since I am a beginner to Python. Anyway, I have a Python program where I can create text files with the general form:
Recipe Name:
Item
Weight
Number of people recipe serves
What I'm trying to do is to allow the program to retrieve the recipe and have the ingredients recalculated for a different number of people. The program should output the recipe name, the new number of people and the revised quantities for the new number of people. I am able to retrieve and output the recipe; however, I am not sure how to have the ingredients recalculated for a different number of people. This is part of my code:
def modify_recipe():
    Choice_Exist = input("\nOkay, it looks like you want to modify a recipe. Please enter the name of this recipe ")
    Exist_Recipe = open(Choice_Exist, "r+")
    time.sleep(2)
    ServRequire = int(input("Please enter how many servings you would like "))
I would recommend splitting your effort into multiple steps, and working on each step (doing research, trying to write the code, asking specific questions) in succession.
1) Look up Python's file I/O. 1.a) Try to recreate the examples you find to make sure you understand what each piece of the code does. 1.b) Write your own script that accomplishes just this piece of your desired program, i.e. opens an existing recipe text file or creates a new one.
2) Write your own functions in Python, particularly ones that take your own arguments. What you're trying to make is a perfect example of good "modular programming", where you would write one function that reads an input file, another that writes an output file, another that prompts users for the number they'd like to multiply by, and so on.
3) Add a try/except block for user input. If a user enters a non-numeric value, this will allow you to catch that and prompt the user again for a corrected value. Something like:
while True:
    # input() as in the question (Python 3); on Python 2 use raw_input() instead
    servings = input('Please enter the number of servings desired: ')
    try:
        svgs = int(servings)
        break
    except ValueError:
        print('Please check to make sure you entered a numeric value, with no'
              + ' letters or words, and a whole integer (no decimals or fractions).')
Or if you want to allow decimals, you could use float() instead of int().
4) [Semi-Advanced] Basic regular expressions (aka "regex") will be very helpful in building out what you're making. It sounds like your input files will have a strict, predictable format, so regex probably isn't necessary. But if you're looking to accept non-standard recipe input files, regex would be a great tool. It can be a hard or confusing skill to learn, but there are a lot of good tutorials and guides. A few I bookmarked in the past are Python Course, Google Developers, and Dive Into Python. A fantastic tool I strongly recommend while learning to build your own regular expression patterns is RegExr (or one of many similar ones, like PythonRegex), which shows you what parts of your pattern are working or not working and why.
Here's an outline to help get you started:
def read_recipe_file(filename):
    # open the file that contains the default ingredients
    pass
def parse_recipe(recipe):
    # Use your regex to search for amounts here. Some useful basics include
    # '\d' for numbers, and looking for keywords like 'cups', 'tbsp', etc.
    pass
def get_multiplier():
    # prompt user for their multiplier here
    pass
def main():
    # Call these functions as needed here. This will be the first part
    # of your program that runs.
    filename = ...
    file_contents = read_recipe_file(filename)
    # ...
# This last piece is what tells Python to start with the main() function above.
if __name__ == '__main__':
    main()
Starting out can be tough, but it's very worth it in the end! Good luck!
I had to edit it a couple times because I use Python 2.7.5, but this should work:
import time
def modify_recipe():
    Choice_Exist = input("\nOkay it looks like you want to modify a recipe. Please enter the name of this recipe: ")
    with open(Choice_Exist + ".txt", "r+") as f:
        content = f.readlines()
    data_list = [word.replace("\n","") for word in content]
    time.sleep(2)
    ServRequire = int(input("Please enter how many servings you would like: "))
    print(data_list[0])
    print(data_list[1])
    # scale the weight by new servings / original servings
    # (provided the Weight is on line 3 and the number of servings on line 4 of the txt file)
    print(float(data_list[2]) * ServRequire / float(data_list[3]))
    print(ServRequire)
modify_recipe()
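The recalculation itself is a proportional scaling: new weight = original weight × new servings ÷ original servings. A minimal sketch of that step as a function without any file or keyboard I/O, assuming the four-line layout from the question (name, item, weight, number of people); the function name and the sample recipe are made up for illustration:

```python
def scale_recipe(lines, new_servings):
    """Scale a recipe given as [name, item, weight, servings] lines.

    Assumes the four-line file layout described in the question.
    """
    name, item, weight, servings = [line.strip() for line in lines[:4]]
    # Ratio between the requested and the original number of servings.
    factor = float(new_servings) / float(servings)
    return name, item, float(weight) * factor, new_servings


# Made-up sample data standing in for f.readlines().
recipe_lines = ["Pancakes\n", "Flour\n", "200\n", "4\n"]
name, item, weight, servings = scale_recipe(recipe_lines, 6)
print(name, item, weight, servings)
```

Keeping the calculation separate from the prompts makes it easy to check with fixed values before wiring it to input() and open().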
