Python: insert array into string

I'm trying to create new files based on my store_array list. If the name doesn't exist yet in the directory, then create a new one, then another, then another. I have 300 files I need to create.

store_array = ["1234567", "987654", "1919103039"]

if store_number == "1":
    continue
print(store_number, file=open(r'C:\Users\hank\Desktop\project\json_' + [store_number] + '.json', 'w'))

This raises:

TypeError: must be str, not list

I can get output with a simple print(store_number), but I need to concatenate the text from the array into my filename.
Thanks for the help!

If the error is in the lack of an existing file, the following example may demonstrate a solution. It writes a blank JSON file for each number string in a list.

import json, os

placeholder_data = {}
store_array = ["1", "2", "3"]
for store_number in store_array:
    # raw string so sequences like \u in the path are not treated as escapes
    filename = os.path.join(r'C:\users\csind\documents\pscripts', 'test{}.json'.format(store_number))
    with open(filename, 'w') as file:
        json.dump(placeholder_data, file)
    print(store_number, filename)
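As a side note on the original error: [store_number] wraps the string in a one-element list, and a str cannot be concatenated with a list. A minimal sketch (with the desktop path left out) showing that plain concatenation builds the names correctly:

```python
store_array = ["1234567", "987654", "1919103039"]

# store_number is already a str, so plain concatenation works;
# wrapping it as [store_number] is what raised the TypeError
filenames = ['json_' + store_number + '.json' for store_number in store_array]
print(filenames)
```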


Is there a function to save multiple csv files with different names in python?

I am trying to write a function in python that essentially splits the original CSV file into multiple csv files and saves their names. The code that I have written so far looks like this:
def split_data(a, b, name_of_the_file):
    part = df.loc[a:b-1]
    filename = str(name_of_the_file)
    part.to_csv(r'C:\path\to\file\filename.csv', index=False)
The main intention of the code is to name each file a different name, which is the input (name_of_the_file). The code seems to work, but only saves the file as filename.csv.
Your function is saving the file(s) with the name filename.csv because you only specify that literal name in the following line:

part.to_csv(r'C:\path\to\file\filename.csv', index=False)

To change the name, you need the string to take the filename variable:

part.to_csv(rf'C:\path\to\file\{filename}.csv', index=False)

Notice the f at the beginning of the string -- this makes it an f-string, which allows you to add Python variables directly into the string by using curly brackets ({filename}). The r prefix is kept so the backslashes in the path are not treated as escape sequences.
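Backslash paths and f-strings interact badly unless the string is also raw (\t in C:\path\to would become a tab). An os.path.join variant avoids hand-written separators entirely; the folder and file name below are placeholders, not from the question:

```python
import os

filename = 'part1'  # placeholder for name_of_the_file
out = os.path.join(r'C:\path\to\file', filename + '.csv')
# part.to_csv(out, index=False)  # would save under the interpolated name
```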
Welcome to Stackoverflow.
I think what you need to do is to use string interpolation.
https://www.programiz.com/python-programming/string-interpolation
Example from the linked page:
name = 'World'
program = 'Python'
print(f'Hello {name}! This is {program}')
In your case, something like:

def split_data(a, b, name_of_the_file):
    part = df.loc[a:b-1]
    filename = str(name_of_the_file)
    part.to_csv(rf'C:\path\to\file\{filename}.csv', index=False)

I have added the f prefix and the curly braces to your code, as in my initial example.
I hope that helps.
Another approach to saving multiple csv files:

# create a list of dataframes (example below assumes a dictionary of dataframes)
lst_of_dfs = [x for x in dfs.values()]

# path to save files
path = 'c:/location_to_save_files/'

# create a list of filenames you would like to use
fnames = ['A', 'B', 'C', 'D', 'E']

# output the data
# this outputs the files using the index, i, + fnames
for i, j in enumerate(fnames):
    lst_of_dfs[i].to_csv(path + str(i) + str(j) + '.csv')
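If the dataframes already live in a dictionary, iterating over its items pairs each frame with its key directly, with no separate fnames list to keep in sync. A sketch (the dict contents here are stand-ins for real dataframes):

```python
# stand-ins for a name -> dataframe mapping
dfs = {'A': 'df_a', 'B': 'df_b'}
path = 'c:/location_to_save_files/'

out_names = []
for name, df in dfs.items():
    # real code would call df.to_csv(path + name + '.csv')
    out_names.append(path + name + '.csv')
```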

How to increment output filename in Python

I have a script that works, but when I run it a second time it doesn't, because it keeps saving the output with the same filename. I'm very new to Python and programming in general, so dumb your answers down...and then dumb them down some more. :)
arcpy.gp.Spline_sa("Observation_RegionalClip_Clip", "observatio", "C:/Users/moshell/Documents/ArcGIS/Default.gdb/Spline_shp16", "514.404", "REGULARIZED", "0.1", "12")
Where Spline_shp16 is the output filename, I would like it to save as Spline_shp17 the next time I run the script, and then Spline_shp18 the time after that, etc.
If you want to use numbers in the file names, you can check what files with similar names already exist in that directory, take the largest one, and increment it by one. Then pass this new number as a variable in the string for the filename.
For example:
import glob
import re

# get the numeric suffixes of the appropriate files
file_suffixes = []
for file in glob.glob("./Spline_shp*"):
    regex_match = re.match(r".*Spline_shp(\d+)", file)
    if regex_match:
        file_suffix = regex_match.groups()[0]
        file_suffix_int = int(file_suffix)
        file_suffixes.append(file_suffix_int)

new_suffix = max(file_suffixes, default=0) + 1  # get max and increment by one
new_file = f"C:/Users/moshell/Documents/ArcGIS/Default.gdb/Spline_shp{new_suffix}"  # format new file name

arcpy.gp.Spline_sa(
    "Observation_RegionalClip_Clip",
    "observatio",
    new_file,
    "514.404",
    "REGULARIZED",
    "0.1",
    "12",
)
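The suffix logic is easy to pull into a small helper so it can be tested without arcpy. This restructuring is mine, not part of the original answer; max(..., default=0) makes the first run produce 1 when no files match yet:

```python
import re

def next_suffix(filenames, prefix="Spline_shp"):
    """Return one more than the largest numeric suffix among matching names."""
    suffixes = []
    for name in filenames:
        match = re.match(r".*{}(\d+)$".format(prefix), name)
        if match:
            suffixes.append(int(match.group(1)))
    # default=0 covers the first run, when no files exist yet
    return max(suffixes, default=0) + 1
```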
Alternatively, if you are just interested in creating unique filenames so that nothing gets overwritten, you can append a timestamp to the end of the filename. So you would have files with names like "Spline_shp-1551375142," for example:
import time

timestamp = str(int(time.time()))  # whole seconds, e.g. 1551375142
filename = "C:/Users/moshell/Documents/ArcGIS/Default.gdb/Spline_shp-" + timestamp

arcpy.gp.Spline_sa(
    "Observation_RegionalClip_Clip",
    "observatio",
    filename,
    "514.404",
    "REGULARIZED",
    "0.1",
    "12",
)
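A variation on the same idea, assuming a human-readable name is preferred: datetime.strftime gives a sortable stamp such as Spline_shp-20190228-153012 instead of a float of seconds:

```python
from datetime import datetime

# e.g. '20190228-153012': sortable, readable, unique per second
stamp = datetime.now().strftime('%Y%m%d-%H%M%S')
filename = "C:/Users/moshell/Documents/ArcGIS/Default.gdb/Spline_shp-" + stamp
```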

python openpyxl.load_workbook(r"mypath")

I want to use this piece of code, openpyxl.load_workbook(r"mypath"), but mypath is a variable path that I change every time, depending on a loop over different folders.
PathsList = []
for folderName, subFolders, fileNames in os.walk
    fileNamesList.append(os.path.basename(fileName))
    PathsList.append(os.path.abspath(fileName))

for i in range(len(fileNamesList)):
    j = 1
    while j < len(fileNamesList):
        if (first3isdigit(fileNamesList[i])) == (first3isdigit(fileNamesList[j])):
            if (in_fileName_DOORS in str(fileNamesList[i]) and in_fileName_TAF in str(fileNamesList[j])):
                mypath = PathsList[i]
                File = openpyxl.load_workbook(r'mypath ')
                wsFile = File.active
mypath is not being read as a variable; is there any solution?

Edit 1: I also thought about

File = openpyxl.load_workbook(exec(r'%s' % (mypath)))

but couldn't, since exec can't be used inside the brackets.
This code:

File = openpyxl.load_workbook(r'mypath ')

tries to pass the raw string 'mypath ' as an argument to the load_workbook method. If you want to pass the contents of the mypath variable to the method, you should remove the quotes and the r prefix:

File = openpyxl.load_workbook(mypath)

This is basic Python syntax. You can read more about it in the documentation. Please let me know if this is what you needed.

Edit: If the slashes are a concern, you can do the following:

File = openpyxl.load_workbook(mypath.replace('\\', '/'))
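As a further option (my suggestion, not from the answer above), building paths with pathlib sidesteps slash and escape issues entirely; recent openpyxl versions accept path-like objects directly:

```python
from pathlib import Path

folder = Path(r'C:\reports')        # placeholder folder
mypath = folder / 'workbook1.xlsx'  # placeholder file name
# File = openpyxl.load_workbook(mypath)  # recent openpyxl accepts Path objects
```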

Can't get unique word/phrase counter to work - Python

I'm having trouble getting anything to write to my output file (word_count.txt). I expect the script to review all 500 phrases in my phrases.txt document and output a list of all the words and how many times they appear.
from re import findall,sub
from os import listdir
from collections import Counter

# path to folder containing all the files
str_dir_folder = '../data'

# name and location of output file
str_output_file = '../data/word_count.txt'

# the list where all the words will be placed
list_file_data = '../data/phrases.txt'

# loop through all the files in the directory
for str_each_file in listdir(str_dir_folder):
    if str_each_file.endswith('data'):

        # open file and read
        with open(str_dir_folder+str_each_file,'r') as file_r_data:
            str_file_data = file_r_data.read()

        # add data to list
        list_file_data.append(str_file_data)

# clean all the data so that we don't have all the nasty bits in it
str_full_data = ' '.join(list_file_data)
str_clean1 = sub('t','',str_full_data)
str_clean_data = sub('n',' ',str_clean1)

# find all the words and put them into a list
list_all_words = findall('w+',str_clean_data)

# dictionary with all the times a word has been used
dict_word_count = Counter(list_all_words)

# put data in a list, ready for output file
list_output_data = []
for str_each_item in dict_word_count:
    str_word = str_each_item
    int_freq = dict_word_count[str_each_item]
    str_out_line = '"%s",%d' % (str_word,int_freq)
    # populates output list
    list_output_data.append(str_out_line)

# create output file, write data, close it
file_w_output = open(str_output_file,'w')
file_w_output.write('n'.join(list_output_data))
file_w_output.close()
Any help would be great (especially if I'm able to actually output 'single' words within the output list). Thanks very much.
It would be helpful if we got more information, such as what you've tried and what sorts of error messages you received. As kaveh commented above, this code has some major indentation issues. Once I got around those, there were a number of other logic errors to work through. I've made some assumptions:

- list_file_data is assigned '../data/phrases.txt', but there is then a loop through all files in a directory. Since you don't have any handling for multiple files elsewhere, I've removed that logic and referenced the file listed in list_file_data (and added a small bit of error handling). If you do want to walk through a directory, I'd suggest using os.walk() (http://www.tutorialspoint.com/python/os_walk.htm)
- You named your file 'phrases.txt' but then check for files that endswith 'data'. I've removed this logic.
- You've placed the data set into a list, when findall works just fine with strings and ignores the special characters that you've manually removed. Test at https://regex101.com/ to make sure.
- Changed 'w+' to '\w+' - check out the above link
- Converting to a list outside of the output loop isn't necessary - your dict_word_count is a Counter object, which can be iterated over key by key. Also changed the variable name to 'counter_word_count' to be slightly more accurate.
- Instead of manually generating csv's, I've imported csv and utilized the writerow method (and quoting options)
Code below, hope this helps:
import csv
import os
from collections import Counter
from re import findall

# name and location of output file
str_output_file = '../data/word_count.txt'

# the input file with the phrases
list_file_data = '../data/phrases.txt'

if not os.path.exists(list_file_data):
    raise OSError('File {} does not exist.'.format(list_file_data))

with open(list_file_data, 'r') as file_r_data:
    str_file_data = file_r_data.read()

# find all the words and put them into a list
list_all_words = findall(r'\w+', str_file_data)

# dictionary with all the times a word has been used
counter_word_count = Counter(list_all_words)

with open(str_output_file, 'w') as output_file:
    fieldnames = ['word', 'freq']
    writer = csv.writer(output_file, quoting=csv.QUOTE_ALL)
    writer.writerow(fieldnames)
    for key, value in counter_word_count.items():
        writer.writerow([key, value])
Something like this?
from collections import Counter
from glob import glob

def extract_words_from_line(s):
    # make this as complicated as you want for extracting words from a line
    return s.strip().split()

tally = sum(
    (Counter(extract_words_from_line(line))
     for infile in glob('../data/*.data')
     for line in open(infile)),
    Counter())

for k in sorted(tally, key=tally.get, reverse=True):
    print(k, tally[k])
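Counter also ships a most_common() method that returns (word, count) pairs already sorted from most to least frequent, which can replace the sorted() loop above:

```python
from collections import Counter

tally = Counter("the quick fox the lazy the".split())
# most_common() yields pairs ordered from most to least frequent
pairs = tally.most_common()
print(pairs[0])
```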

Need some help in deleting the data from list and again append it in same list

I have developed a Django app where users can upload multiple files. I store the paths of all the uploaded files as a comma-separated list in a MySQL database. For example, I have uploaded three files:

1. Logging a Defect.docx,
2. Mocks (1).pptx, and
3. Mocksv2.pptx

and they get stored in the database as follows (converting each file path into a list element and joining all the paths):

FileStore/client/Logging a Defect.docx,FileStore/client/Mocks (1).pptx,FileStore/client/Mocksv2.pptx,

Now I need help deleting a particular file. For example, when I'm deleting Logging a Defect.docx, I should delete only the first element of the list and retain the other two paths. I'll be sending only the name of the document. I'm retrieving the path as a list, and then I have to check whether the name of the doc being passed is in each element of the list; if it matches, I should delete that element, keeping the other elements intact. How do I approach this? It sounds like more of a Python question than a Django question.
Use a list comprehension to filter the split text, and rebuild the string using the join function:
>>> db_path = 'FileStore/client/Logging a Defect.docx,FileStore/client/Mocks (1).pptx,FileStore/client/Mocksv2.pptx'
>>> file_to_delete = 'Logging a Defect.docx'
>>> file_separator = ","
>>> new_db_path = [
... path.strip()
... for path in db_path.split(file_separator)
... if path.strip() and file_to_delete not in path
... ]
>>> string_to_save = file_separator.join(new_db_path)
>>> string_to_save
'FileStore/client/Mocks (1).pptx,FileStore/client/Mocksv2.pptx'
You can read the text from your database, use the list's remove method, and then write the new value back to the database:

text = "FileStore/client/Logging a Defect.docx,FileStore/client/Mocks (1).pptx,FileStore/client/Mocksv2.pptx,"
splitted = text.split(',')

# filename is the one you want to delete
entry = "FileStore/client/{filename}".format(filename="Mocks (1).pptx")
if entry in splitted:
    splitted.remove(entry)

newtext = ""
for s in splitted:
    if s:  # skip the empty string left by the trailing comma
        newtext += s
        newtext += ','

Now write newtext back to the database.
Not boasting or anything, but I came up with my own logic for my question. It looks far less complicated, but it works fine.

db_path = 'FileStore/client/Logging a Defect.docx,FileStore/client/Mocks (1).pptx,FileStore/client/Mocksv2.pptx'
path_list = db_path.split(",")
doc = 'Logging a Defect.docx'
for i in path_list:
    if doc in i:
        path_list.remove(i)
        break
new_path = ",".join(path_list)
print(new_path)
