Python - Write each item from a dict's key to a new line - python

I think this should be fairly simple, but I can't find the right solution for my situation. I have a dict where each key maps to a list. For each key in the dict, I want to open a csv file and write each item in that key's list to a new line. As the program loops through the dict, a new csv file is created for each key, one line is written per list item, and the csv is closed so the next one can be created for the next key:
for k, v in Dict.items():
    name = "coQ" + str(k) + ".csv"
    cPath = r"C:\Path"
    coQ = os.path.join(cPath, name)
    company_file = open(coQ, 'w')
    for i in v:
        company_file.write(str(i))
    company_file.close()
This writes the list to a csv, but all the list items end up on the same line in the output files. I've tried opening with append mode 'a' but I get the same result, and I don't think newline will work because what needs to be written is not a new line but an item in a list.

Try:
for k, v in Dict.items():
    name = "coQ" + str(k) + ".csv"
    cPath = r"C:\Path"
    coQ = os.path.join(cPath, name)
    company_file = open(coQ, 'w')
    for i in v:
        company_file.write("\n")
        company_file.write(str(i))
    company_file.close()
Or as patrick said:
for k, v in Dict.items():
    name = "coQ" + str(k) + ".csv"
    cPath = r"C:\Path"
    coQ = os.path.join(cPath, name)
    company_file = open(coQ, 'w')
    for i in v:
        company_file.write(str(i)+'\n')
    company_file.close()
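The same loop can also be written with a with block, so each file is closed before the next one is opened even if a write fails. This is only a minimal sketch: the function name write_lists and the out_dir parameter are hypothetical and not part of the original posts.

```python
import os


def write_lists(data, out_dir):
    """Write one file per key, with one list item per line.

    `write_lists` and `out_dir` are illustrative names, not from the post.
    Returns the list of paths that were written.
    """
    paths = []
    for key, items in data.items():
        path = os.path.join(out_dir, "coQ" + str(key) + ".csv")
        # The with block closes the file before the next one is opened.
        with open(path, "w") as company_file:
            for item in items:
                # Terminate every item with a newline so each gets its own line.
                company_file.write(str(item) + "\n")
        paths.append(path)
    return paths
```

Called as write_lists(Dict, r"C:\Path"), this should produce the same files as the loops above, assuming the directory already exists.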

Related

Extracting strings using a list of words

I am trying to solve the following problem. For example, I have a names file:
VOG00001
VOG00002
VOG00004
and a database file:
VOG00001!962!834!Xu!sp|O31936|YOPB_BACSU_Putative_antitoxin_YopB
VOG00002!206!17!Xh!sp|Q5UPJ9|YL122_MIMIV_Putative_ankyrin_repeat_protein_L12
VOG00003!1284!960!Xr!sp|O22001|VXIS_BPMD2_Excisionase
VOG00004!353!304!Xu!sp|P03795|Y28_BPT7_Protein_2.8
VOG00005!253!60!Xu!REFSEQ_hypothetical_protein
I need to extract the rows from the database whose IDs appear in the names file.
Expected result:
VOG00001!962!834!Xu!sp|O31936|YOPB_BACSU_Putative_antitoxin_YopB
VOG00002!206!17!Xh!sp|Q5UPJ9|YL122_MIMIV_Putative_ankyrin_repeat_protein_L12
VOG00004!353!304!Xu!sp|P03795|Y28_BPT7_Protein_2.8
def log(x, y):
    output = open('output.txt', 'a')
    output.write(x + y)
    output.close
def main():
    i = 0
    nfile = 'input/' + input('Enter file with names: ')
    dfile = 'input/' + input('Enter file with data: ')
    names = list(open(nfile, 'r'))
    data = list(open(dfile, 'r'))
    while i != len(data):
        line = data[i]
        if 'VOG' in line:
            line1 = line.replace("!*" , "")
            if line1 in names:
                log(line, data[i + 1])
        i += 1
    return(0)
main()
I want to trim the unnecessary part of each line and compare what is left with the list of names:
line1 = line.replace("!*" , "")
The best way to do this is to build a dictionary from the database file. The database file uses '!' as its delimiter, so we split each line to get the key (the first field) and associate the entire line with that key (the dictionary value). Then iterate over the names file and look each name up in the dictionary.
names = input('Enter path to names file: ')
db = input('Enter path to database file: ')
with open(names) as n, open(db) as d:
    dict_ = {}
    for line in map(str.strip, d):
        key, *_ = line.split('!')
        dict_[key] = line
    for name in map(str.strip, n):
        if (v := dict_.get(name)):
            print(v)
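The dictionary lookup described above can also be wrapped in a function that takes lines rather than file paths, which makes it easy to try on the sample data from the question. This is a sketch only; extract_rows is a hypothetical helper name.

```python
def extract_rows(name_lines, db_lines):
    """Return database rows whose first '!'-separated field is a listed name.

    `extract_rows` is an illustrative helper, not part of the original answer.
    """
    # Index each database row by its ID (the text before the first '!').
    rows_by_id = {}
    for line in map(str.strip, db_lines):
        if line:
            rows_by_id[line.split('!', 1)[0]] = line
    # Look up each requested name; names missing from the database are skipped.
    return [rows_by_id[name]
            for name in map(str.strip, name_lines)
            if name in rows_by_id]


names = ["VOG00001\n", "VOG00002\n", "VOG00004\n"]
database = [
    "VOG00001!962!834!Xu!sp|O31936|YOPB_BACSU_Putative_antitoxin_YopB\n",
    "VOG00002!206!17!Xh!sp|Q5UPJ9|YL122_MIMIV_Putative_ankyrin_repeat_protein_L12\n",
    "VOG00003!1284!960!Xr!sp|O22001|VXIS_BPMD2_Excisionase\n",
    "VOG00004!353!304!Xu!sp|P03795|Y28_BPT7_Protein_2.8\n",
    "VOG00005!253!60!Xu!REFSEQ_hypothetical_protein\n",
]
for row in extract_rows(names, database):
    print(row)
```

Stripping each line before the lookup matters here: lines read from a file keep their trailing newline, so an unstripped name would never equal a key.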

what does this mean: started your execution timer at the wrong place?

This is the code for a tool that converts a JSON-based log to columnar format, i.e., it creates one file per column. For reference, see the related question: converting json based log into column format, i.e., one file per column
When I submitted this, I was told that my execution timer was starting at the wrong place, and I was asked to fix that before improving the code. Can someone tell me what that means?
Any help will be appreciated. If you want, you can also look at the GitHub repo: https://github.com/sparsh459/File-content-convertor-tool
import json
import os
# --- functions ---
def flatten_dict(data, prefix=""):
    result = {}
    for key, value in data.items():
        if prefix:
            key = prefix + "." + key
        # -------checks if the key has nested dictionary as value or not ---------
        if isinstance(value, dict):
            result.update( flatten_dict(value, key) )
        else:
            # -------edge case for null value--------
            if value is None:
                result[key] = "\n"
            # -------edge case for empty value--------
            elif value == "":
                result[key] = "\n"
            else:
                result[key] = value
    return result
# --- main ---
path = input("Enter the path for the log file: ")
# --------Checking if path exists to the selected log-------------
assert os.path.exists(path), "I did not find the file at, "+str(path)
file_obj = open(path) # open the log file for line-by-line reading
for line in file_obj:
    # --------handled the undefined string edge case ---------
    data = json.loads(line.replace(': undefined', ': null'))
    # ----converting nested dictionary to non-nested ----------
    data = flatten_dict(data)
    for key, value in data.items():
        with open(key + '.column', "a") as f:
            f.write(str(value) + "\n")
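The post does not say what the reviewer meant, and the code shown contains no timer. One plausible reading, offered purely as an assumption, is that a timer started before the input() prompt also measures the time the user spends typing the path, so the measurement should begin only once processing starts. A minimal sketch of that idea, where timed_call and count_lines are hypothetical helpers and time.perf_counter is the standard-library clock:

```python
import time


def timed_call(func, *args):
    """Run func(*args) and return (result, elapsed_seconds).

    Hypothetical helper: the timer starts immediately before the work,
    so earlier steps such as waiting on input() are not measured.
    """
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed


def count_lines(lines):
    # Stand-in for the real per-line conversion work.
    return sum(1 for _ in lines)


# In the real tool the path would be read with input() *before* this call.
result, elapsed = timed_call(count_lines, ['{"a": 1}', '{"a": 2}'])
print(result, "lines processed in", round(elapsed, 6), "seconds")
```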

Python and Comparing File Changes

import feedparser
d = feedparser.parse('somerssfeed/rss.xml')
message = {}
smessage = {}
for post in d.entries:
    message[post.link] = post.title
fwrite = open("db.txt", "a")
for k, v in message.items():
    if k in open("db.txt", "r"):
        print("already exists")
    else:
        fwrite.write("\n" + "{0}".format(k) + "\n")
        smessage[k] = v
What I want to achieve is to parse RSS feeds and write their links to a text file. The problem is that when I run the script again it shouldn't return old RSS items, so I compare the links against the text file, but the comparison is failing. On the first run it writes all the links; on the second run it should write nothing because all the links are the same, but it writes the same links again.
EDIT:
After a whole day of trial and error, this worked:
for k, v in message.items():
    if k in open('db.txt').read():
        print('already exists')
    else:
        smessage[k] = v
        fwrite = open("db.txt", "a")
        fwrite.write('\n{0}\n'.format(k))
        fwrite.close()
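The membership test fails because iterating over the open file yields lines that still end in a newline, so a link never equals any of them. Read the lines without their line endings first:
g = open("db.txt", "r")
lines = g.read().splitlines()
g.close()
if k in lines:
    print("already exists")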
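Another way to do the comparison is to load the stored links into a set once, then append only the links that are not in it. The sketch below assumes one link per line in the database file; record_new_links and db_path are hypothetical names, and the feed parsing itself is left out.

```python
import os


def record_new_links(links, db_path):
    """Append unseen links to db_path and return them in order.

    `record_new_links` and `db_path` are illustrative names.
    """
    seen = set()
    if os.path.exists(db_path):
        with open(db_path) as db:
            # Strip line endings so stored links compare equal to fresh ones.
            seen = {line.strip() for line in db if line.strip()}
    new_links = []
    for link in links:
        if link not in seen:
            seen.add(link)
            new_links.append(link)
    with open(db_path, "a") as db:
        for link in new_links:
            db.write(link + "\n")
    return new_links
```

On a second run with the same links, the function should return an empty list and leave the file unchanged. Unlike the substring test k in open('db.txt').read(), an exact set lookup does not treat a link as seen just because it is a prefix of a longer stored link.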

how to read a specific line which starts in "#" from file in python

How can I read a specific line that starts with "#" from a file in Python, set that line as a key in a dictionary (without the "#"), and set all the lines after it, up to the next "#", as the value in the dictionary?
Please help me.
Here is the file:
from collections import defaultdict
key = 'NOKEY'
d = defaultdict(list)
with open('thefile.txt', 'r') as f:
    for line in f:
        if line.startswith('#'):
            key = line.replace('#', '')
            continue
        d[key].append(line)
Your dictionary will have a list of lines under each key. All lines that come before the first line starting with '#' would be stored under the key 'NOKEY'.
You could make use of Python's groupby function as follows:
from itertools import groupby
d = {}
key = ''
with open('input.txt', 'r') as f_input:
    for k, g in groupby(f_input, key=lambda x: x[0] == '#'):
        if k:
            key = next(g).strip(' #\n')
        else:
            d[key] = ''.join(g)
print d
This would give you the following kind of output:
{'The Piper at the gates of dawn': '*Lucifer sam....\nsksdlkdfslkj\ndkdkfjoiupoeri\nlkdsjforinewonre\n', 'A Saucerful of Secrets': '*Let there be\nPeople heard him say'}
Tested using Python 2.7.9
A pretty simple version
filename = 'test'
results = {}
with open(filename, 'r') as f:
    while (1):
        text = f.readline()
        if (text == ''):
            break
        elif (text[0] == "#"):
            key = text
            results[key] = ''
        else:
            results[key] += text
From (ignoring additional blank lines, a by-product of the answer formatting):
#The Piper at the gates of dawn
*Lucifer sam....
sksdlkdfslkj
dkdkfjoiupoeri
lkdsjforinewonre
# A Saucerful of Secrets
*Let there be
People heard him say
Produces:
{'#The Piper at the gates of dawn\n': '*Lucifer sam....\nsksdlkdfslkj\ndkdkfjoiupoeri\nlkdsjforinewonre\n', '# A Saucerful of Secrets \n': '*Let there be\nPeople heard him say\n'}
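For comparison, here is a Python 3 sketch of the same grouping that also strips the "#" and surrounding whitespace from the keys, as the question asks. The helper name parse_sections is hypothetical, and it works on any iterable of lines, so an open file object can be passed directly.

```python
def parse_sections(lines):
    """Group lines under the most recent '#' header line.

    Illustrative helper: keys are header lines without '#' and surrounding
    whitespace; values are lists of the following lines, without newlines.
    """
    sections = {}
    key = None
    for line in lines:
        if line.startswith('#'):
            key = line.lstrip('#').strip()
            sections[key] = []
        elif key is not None:
            # Lines before the first header are ignored in this sketch.
            sections[key].append(line.rstrip('\n'))
    return sections


sample = [
    "#The Piper at the gates of dawn\n",
    "*Lucifer sam....\n",
    "sksdlkdfslkj\n",
    "# A Saucerful of Secrets \n",
    "*Let there be\n",
    "People heard him say\n",
]
print(parse_sections(sample))
```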

Manipulating Python dictionaries to remove empty values

I'm trying to remove a key/value pair if the key contains 'empty' values.
I have tried the following dictionary comprehension and tried doing it in long form, but it doesn't seem to actually do anything and I get no errors.
def get_Otherfiles():
    regs = ["(.*)((U|u)ser(.*))(\s=\s\W\w+\W)", "(.*)((U|u)ser(.*))(\s=\s\w+)", "(.*)((P|p)ass(.*))\s=\s(\W(.*)\W)", "(.*)((P|p)ass(.*))(\s=\s\W\w+\W)"]
    combined = "(" + ")|(".join(regs) + ")"
    cred_results = []
    creds = []
    un_matched = []
    filesfound = []
    d = {}
    for root, dirs, files in os.walk(dir):
        for filename in files:
            if filename.endswith(('.bat', '.vbs', '.ps', '.txt')):
                readfile = open(os.path.join(root, filename), "r")
                d.setdefault(filename, [])
                for line in readfile:
                    m = re.match(combined, line)
                    if m:
                        d[filename].append(m.group(0).rstrip())
                    else:
                        pass
    result = d.copy()
    result.update((k, v) for k, v in d.iteritems() if v is not None)
    print result
Current output:
{'debug.txt': [], 'logonscript1.vbs': ['strUser = "guytom"', 'strPassword = "P#ssw0rd1"'], 'logonscript2.bat': ['strUsername = "guytom2"', 'strPass = "SECRETPASSWORD"']}
As you can see I have entries with empty values. I'd like to remove these before printing the data.
In this part of your code:
d.setdefault(filename, [])
for line in readfile:
    m = re.match(combined, line)
    if m:
        d[filename].append(m.group(0).rstrip())
    else:
        pass
You always add filename as a key to the dictionary, even if you don't subsequently add anything to the resulting list. Try
for line in readfile:
    m = re.match(combined, line)
    if m:
        d.setdefault(filename, []).append(m.group(0).rstrip())
which will only initialize d[filename] to an empty list if it is actually necessary to have something on which to call append.
result = dict((k, v) for k, v in d.iteritems() if v)
update won't remove entries; it only adds or changes them:
a = {"1":2}
a.update({"2":7})
print a # contains both "1" and "2" keys
Looking at the first matching group in your regex, (.*), if the regex matches but there are no characters to match, group(0) is "", not None. So, you can filter there.
result = {k: v for k, v in d.iteritems() if v}
But you can also have your regex do that part for you. Change that first group to (.+) and you won't have empty values to filter out.
EDIT
Instead of removing empty values at the end, you can avoid adding them to the dict altogether.
def get_Otherfiles():
    # fixes: make it a raw string so that \s works right and
    # tighten up filtering, ... (U|u) should probably be [Uu] ...
    regs = [r"(.+)\s*((U|u)ser(.*))(\s=\s\W\w+\W)", r"(.*)((U|u)ser(.*))(\s=\s\w+)", r"(.*)((P|p)ass(.*))\s=\s(\W(.*)\W)", r"(.*)((P|p)ass(.*))(\s=\s\W\w+\W)"]
    combined = "(" + ")|(".join(regs) + ")"
    cred_results = []
    creds = []
    un_matched = []
    filesfound = []
    d = {}
    for root, dirs, files in os.walk(dir):
        for filename in files:
            if filename.endswith(('.bat', '.vbs', '.ps', '.txt')):
                readfile = open(os.path.join(root, filename), "r")
                # assuming you want to aggregate matching file names...
                content_list = d.get(filename, [])
                content_orig_len = len(content_list)
                for line in readfile:
                    m = re.match(combined, line)
                    if m:
                        content_list.append(m.group(0))
                # only store the list if this file added at least one match
                if len(content_list) > content_orig_len:
                    d[filename] = content_list
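If the dictionary has already been built with empty lists in it, the filtering step on its own can be sketched as below. Note this is Python 3 syntax, whereas the code in this thread targets Python 2 (iteritems, print statement), and drop_empty is a hypothetical helper name.

```python
def drop_empty(mapping):
    """Return a new dict without keys whose value is empty or None.

    `drop_empty` is an illustrative name. The truthiness test also drops
    values such as 0 or '', which is fine when every value is a list.
    """
    return {key: value for key, value in mapping.items() if value}


found = {
    'debug.txt': [],
    'logonscript1.vbs': ['strUser = "guytom"', 'strPassword = "P#ssw0rd1"'],
}
print(drop_empty(found))
```

The test is on truthiness rather than `is not None` because an empty list is not None, which is why the original update call left the 'debug.txt' entry in place.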
