Add input to an existing row in csv in python 3.6

I am working in python 3.6 on the following structure:
import csv

aircraft = input("Please insert the aircraft type : ")
characteristics = input("Please insert the respective aircraft characteristics: ")

with open("aircraft_list.csv","a",newline="") as output:
    if aircraft not in open("aircraft_list.csv").read():
        wr = csv.writer(output)
        wr.writerow([aircraft + "," + characteristics])
        # with square brackets, since otherwise each letter is written to the csv as a separate string, separated by commas
    else:
        for row in enumerate(output):
            data = row.split(",")
            if data[0] == aircraft:
                wr = csv.writer(output)
                wr.writerow([characteristics],1)
I want to write the inputs to a csv in the following format:
B737,Boeing,1970, etc
A320,Airbus,EU, etc
As long as the aircraft entry, e.g. B737, does not yet exist, it is easy to write it to the CSV. However, once the B737 entry already exists in the CSV, I want to add the characteristics (not the aircraft) to the row already made for B737. The order of the characteristics does not matter.
I want the additional input characteristics to be added to the correct row in my CSV. How would I do that?
Since I'm new to coding I tried the basics and combined them with code I found on Stack Overflow, but unfortunately I cannot get it working.
Your help would be great, thank you!
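One possible approach (a sketch, not a tested answer): since a row in the middle of a text file cannot easily be extended in place, read the whole file into memory, append the new characteristics to the matching row, and rewrite the file. The filename comes from the question; everything else is illustrative.
import csv

aircraft = input("Please insert the aircraft type: ")
characteristics = input("Please insert the respective aircraft characteristics: ")

# Read the existing rows (start empty if the file does not exist yet).
try:
    with open("aircraft_list.csv", newline="") as f:
        rows = list(csv.reader(f))
except FileNotFoundError:
    rows = []

# Append the characteristics to the matching row, or add a new row.
for row in rows:
    if row and row[0] == aircraft:
        row.append(characteristics)
        break
else:
    rows.append([aircraft, characteristics])

# Rewrite the whole file with the updated data.
with open("aircraft_list.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)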


Python parsing large CSV file for usernames [closed]

I have a very large csv file (+50k lines).
This file contains IRC logs and here's the data format:
1st column: Message type (1 for message, 2 for system)
2nd column: Timestamps (numbers of seconds since a precise date)
3rd column: Username of the one writing the message
4th column: Message
Here's an example of the data:
1,1382445487956,"bob","i don't know how to do such a task"
1,1382025765196,"alice","bro ask stackoverflow"
1,1382454875476,"_XxCoder_killerxX_","I'm pretty sure it can be done with python, bob"
2,1380631520410,"helloman","helloman_ join the chan."
For example, _XxCoder_killerxX_ mentioned bob.
So, knowing all of this, I want to know which pair of usernames mentioned each other the most.
Only messages should be counted, so I only need to work on lines starting with the number "1" (there are a bunch of lines starting with "2" and other irrelevant numbers).
I know it can be done with the csv Python module, but I've never worked with such large files, so I really don't know how to start.
You should perform two passes of the CSV: one to capture all sender usernames, the second to find sender usernames mentioned in messages.
import csv

users = set()
with open("test.csv", "r") as file:
    reader = csv.reader(file)
    for line in reader:
        users.add(line[2])

mentions = {}
with open("test.csv", "r") as file:
    reader = csv.reader(file)
    for line in reader:
        sender, message = line[2], line[3]
        for recipient in users:
            if recipient == sender:
                continue  # can't mention yourself
            if recipient in message:
                key = (sender, recipient)
                mentions[key] = mentions.get(key, 0) + 1

for mention, times in mentions.items():
    print(f"{mention[0]} mentioned {mention[1]} {times} time(s)")

totals = {}
for mention, times in mentions.items():
    key = tuple(sorted(mention))
    totals[key] = totals.get(key, 0) + times

for names, times in totals.items():
    print(f"{names[0]} and {names[1]} mentioned each other {times} time(s)")
totals = {}
for mention, times in mentions.items():
key = tuple(sorted(mention))
totals[key] = totals.get(key, 0) + times
for names, times in totals.items():
print(f"{names[0]} and {names[1]} mentioned each other {times} time(s)")
This example is naive, as it's performing simple substring matches. So, if there's someone named "foo" and someone mentions "food" in a message, it will indicate a match.
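If that precision matters, one hedged refinement (not part of the original answer) is to require word boundaries around the username with the re module:
import re

def is_mentioned(username, message):
    # Match the username only as a whole word, so "foo" does not match "food";
    # re.escape guards against regex metacharacters in usernames.
    return re.search(r"\b" + re.escape(username) + r"\b", message) is not None
The check "if recipient in message" above would then become "if is_mentioned(recipient, message)".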
Here is a solution using pandas and sets. The use of pandas significantly simplifies the import and manipulation of csv data, and the use of sets allows one to count {'alice', 'bob'} and {'bob', 'alice'} as two occurrences of the same combination.
import pandas as pd

df = pd.read_csv('sample.csv', header=None)
df.columns = ['id', 'timestamp', 'username', 'message']

lst = []
for name in df.username:
    for i, m in enumerate(df.message):
        if name in m:
            author = df.iloc[i, 2]
            lst.append({author, name})

most_freq = max(lst, key=lst.count)
print(most_freq)
# {'bob', '_XxCoder_killerxX_'}

Python - Dictionary - If loop variable not changing

The project is about converting short forms into long descriptions read from a CSV file.
Example: the user enters LOL and the program should respond with 'Laugh of Laughter'.
Expectation: until the user enters a wrong keyword, the computer keeps asking for a short form and answers with its long description from the CSV file.
I treated each row of the CSV file as a dictionary and broke it down into keys and values.
Logic used: a while loop, so that it keeps asking until the short column hits a blank/empty cell. But the issue is that after the first successful attempt, the comparison in the if statement stops working, because readitems['short'] is not updated on each cycle.
AlisonList.csv Values are:
short,long
lol,laugh of laughter
u, you
wid, with
import csv
from lib2to3.fixer_util import Newline
from pip._vendor.distlib.util import CSVReader
from _overlapped import NULL

READ = "r"
WRITE = 'w'
APPEND = 'a'

# Reading the CSV file and converted into Dictionary
with open("AlisonList.csv", READ) as csv_file:
    readlist = csv.DictReader(csv_file)

    # Reading the short description and showing results
    for readitems in readlist:
        readitems['short'] == ' '
        while readitems['short'] != '':
            # Taking input of short description
            smsLang = str(input("Enter SMS Language : "))
            if smsLang == readitems['short']:
                print(readitems['short'], ("---Means---"), readitems['long'])
            else:
                break
Try this:
import csv

READ = "r"
WRITE = 'w'
APPEND = 'a'

# Reading the CSV file and converted into Dictionary
with open("AlisonList.csv", READ) as csv_file:
    readlist = csv.DictReader(csv_file)
    word_lookup = {x['short'].strip(): x['long'].strip() for x in readlist}

while True:
    # Taking input of short description
    smsLang = str(input("Enter SMS Language : ")).lower()
    normalWord = word_lookup.get(smsLang.lower())
    if normalWord is not None:
        print(f"{smsLang} ---Means--- {normalWord}")
    else:
        print(f"Sorry, '{smsLang}' is not in my dictionary.")
Sample output:
Enter SMS Language : lol
lol ---Means--- laugh of laughter
Enter SMS Language : u
u ---Means--- you
Enter SMS Language : wid
wid ---Means--- with
Enter SMS Language : something that won't be in the dictionary
Sorry, 'something that won't be in the dictionary' is not in my dictionary.
Basically, we compile a dictionary from the csv file, using the short words as the keys and the long words as the values. This allows us in the loop to just call word_lookup.get(smsLang) to find the longer version. If such a key does not exist, we get None, so a simple if statement can handle the case where there is no longer version.
Hope this helps.

Python: is there a maximum number of values the write() function can process?

I'm new to Python, so I would be thankful for any help...
My problem is the following:
I wrote a program in Python analysing gene sequences from a huge database (more than 600 genes). With the help of the write() function, the program should insert the results into a text file, one result per gene. When I open my output file, there are only the first few genes, followed by "...", followed by the last gene.
Is there a maximum this function can process? How do I make Python write all results?
relevant part of code:
fasta_df3 = pd.read_table(fasta_out3, delim_whitespace=True,
                          names=('qseqid', 'sseqid', 'evalue', 'pident'))
fasta_df3_sorted = fasta_df3.sort_values(by='qseqid', ascending=True)
fasta_df3_grouped = fasta_df3_sorted.groupby('qseqid')

for qseqid, fasta_df3_sorted in fasta_df3_grouped:
    subj3_pident_max = str(fasta_df3_grouped['pident'].max())
    subj3_pident_min = str(fasta_df3_grouped['pident'].min())
    current_gene = str(qseqid)
    with open(dir_output+outputall_file+".txt", "a") as gene_list:
        gene_list.write("\n"+"subj3: {} \t {} \t {}".format(current_gene,
                        subj3_pident_max, subj3_pident_min))
        gene_list.close()
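No answer is included here, but a likely cause (an assumption based on the code shown, not a confirmed fix) is that str() is being called on the GroupBy object, whose string representation pandas truncates with "...", rather than on the per-group DataFrame; write() itself has no such limit. A minimal sketch of the loop using the per-group DataFrame, reusing the variables from the question:
# Use the loop variable (the per-group DataFrame), not the GroupBy object,
# so str() yields a single value per gene instead of a truncated Series repr.
for qseqid, group in fasta_df3_grouped:
    subj3_pident_max = str(group['pident'].max())
    subj3_pident_min = str(group['pident'].min())
    with open(dir_output+outputall_file+".txt", "a") as gene_list:
        gene_list.write("\n"+"subj3: {} \t {} \t {}".format(str(qseqid),
                        subj3_pident_max, subj3_pident_min))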

How do you get rid of the gaps in the CSV file? [duplicate]

This question already has answers here: CSV file written with Python has blank lines between each row (11 answers)
When I run my programme, the CSV file it writes has a one-line gap between each product. How do you get rid of that gap?
My original CSV file contains all the product information: column A is the GTIN-8 code, column B is the product name, C is the current stock level, D is the re-order stock level, and E is the target stock level (just to clarify).
This is my code:
import csv

redo='yes'
receipt=open('receipt.txt', 'wt')
stock=open('Stock_.csv', 'rt')
stock_read=csv.reader(stock)
blank_csv=open('Blank_csv_.csv', 'wt')
blank_csv_write=csv.writer(blank_csv)

clothes=(input('\nPlease enter the GTIN-8 code of what you want to purchase: '))
quantity=int(input('\nPlease enter the amount of this product you want to buy: '))

for row in stock_read:
    GTIN=row[0]
    product=row[1]
    current=row[2]
    re_order=row[3]
    target=row[4]
    if clothes==GTIN:
        current=int(current)-quantity
    blank_csv_write.writerows([[GTIN,product,current,re_order,target]])

stock.close()
blank_csv.close()

reorder_receipt=open('receipt.txt', 'wt')
blank_csv2=open('Blank_csv_.csv', 'rt')
blank_csv_read2=csv.reader(blank_csv2)

stock_check=input('Press \"ENTER\" if you want to check the current stock leavels: ')
if stock_check=='':
    for row in blank_csv_read2:
        for field in row:
            GTIN=row[0]
            product=row[1]
            current=int(row[2])
            re_order=int(row[3])
            target=int(row[4])
            if current<=re_order:
                re_stock=target-current
                reorder_receipt.write(GTIN+' '+product+' '+str(current)+' '+str(re_order)+' '+str(target)+' '+str(re_stock)+'\n')

blank_csv2.close()
reorder_receipt.close()
Thank you in advance!!!!
It seems this is a duplicate of "CSV file written with Python has blank lines between each row". I believe the answer there will help you solve your problem. Hint: newline=''
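For reference, a minimal sketch of that hint applied to the code above (assuming Python 3, where open() accepts a newline argument):
# Opening the output file with newline='' stops the csv module from
# writing an extra blank row between records on Windows.
blank_csv = open('Blank_csv_.csv', 'w', newline='')
blank_csv_write = csv.writer(blank_csv)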
I didn't dig into your code in much detail, but at first glance it looks like, if you want to remove the extra line between each row of data, you should remove the newline character '\n' from the end of the reorder_receipt.write() call.
In other words, modify:
reorder_receipt.write(GTIN+' '+product+ ... +str(re_stock)+'\n')
to be:
reorder_receipt.write(GTIN+' '+product+ ... +str(re_stock))
I guess the problem is with target=row[4].
Maybe the "target" string is getting an extra \n, so stripping it might solve the problem (note that strip() returns a new string, so it needs to be assigned back):
target = target.strip()

Python 2 - parse tab-delimited text file data into a list

Please help me with what I think is my fundamental misunderstanding of lists and csv.
I have a text file of tab-delimited data (this is why I am using csv). It has a header row with 4 headings, and then 40 rows of data. I am trying to create a programme that will search the content of the text file and when a match is found will then print that row of data.
My first step is to create a list from the text file.
import csv

list = []  # create a new empty list
with open('data.txt','rb') as f:
    next(f)  # skip heading row in text file (I cannot get csv.DictReader to work as an alternative to this step)
    data = csv.reader(f, delimiter='\t')  # read text file with csv
    for row in data:
        list.append(row)  # add the data from the text file to the list
When I run this programme as it is, I can type print list and it prints the contents of the text file, each row enclosed in a []. When I type print row it prints the LAST row entry in the text file. When I type print row[0] it prints the first column of this last row, and so on for row[1], row[2] and row[3]. When I type print len(list) it returns '40' which is the number of rows excluding the header.
I cannot print any of the other rows from the text file. Have I done something wrong in creating my list? How can I access other rows and check that I have created my list correctly?
I am having problems with what I think are the next steps, and I want to make sure that I have got this first step correct! I have read all the documentation I can find and all vaguely-related stack overflow queries and I just do not seem to understand this. I would really appreciate some help!
Edit: I have been asked to explain what I am trying to use this for.
I have a text file (data.txt). It has rows of tab-delimited data under four columns.
I want to make a search function so that:
The user inputs which column to search by
The user inputs a search term
The programme searches the list to find a match
The programme then prints the whole row containing the matching data.
E.g.
Name Age Address Job
Marks 49 Manchester Teacher
Smith 52 Somerset Banker
Williams 83 Kent Student
To do this I think I need to make the text file into a list that has been parsed with csv (because the data is tab-delimited). Then I think I should use name = row[0], age = row[1], and so on to complete my search function.
I am having trouble understanding how the list works in terms of row[0], etc.
Why are you using csv anyway? You're only splitting the lines.
I made myself a test file like that:
header1 header2 header3 header4
row10 row11 row12 row13
row20 row21 row22 row23
row30 row32 row32 row33
row40 row42 row42 row43
row50 row52 row52 row53
row60 row62 row62 row63
And some simple lines to access each element:
with open('data.txt','r') as f:
    lines = f.readlines()[1:]
    for line in lines:
        elements = line.strip().split("\t")
        print elements, len(elements)
The result output is:
['row10', 'row11', 'row12', 'row13'] 4
['row20', 'row21', 'row22', 'row23'] 4
['row30', 'row32', 'row32', 'row33'] 4
['row40', 'row42', 'row42', 'row43'] 4
['row50', 'row52', 'row52', 'row53'] 4
['row60', 'row62', 'row62', 'row63'] 4
That way you can add each entry of elements to a new array (like your list) and continue working with that.
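Building on that, a hedged Python 2 sketch of the search the question describes, using the header names from the question's example (the prompts and the column-to-index mapping are illustrative):
# Collect the rows, then search by column name and term and print matching rows.
columns = {'Name': 0, 'Age': 1, 'Address': 2, 'Job': 3}  # header names mapped to indexes

rows = []
with open('data.txt', 'r') as f:
    for line in f.readlines()[1:]:  # skip the header row
        rows.append(line.strip().split('\t'))

column = raw_input('Which column do you want to search? ')
term = raw_input('Enter a search term: ')
for row in rows:
    if row[columns[column]] == term:
        print row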
There could be simpler ways to do this; however, I am comfortable with pandas, so I am using it here. This program is just a sketch, and you will need to modify it to optimize it. For example, if you need to search through columns and records, you will need to extend the function with some kind of regex (re package) logic. Let me know if you need more help.
I created a .txt file:
name state game
john CA soccer
peter CA soccer
kate CA basketball
ed CA football
import pandas as pd

df = pd.read_csv("C:/Amrita/test.txt", header=None, delim_whitespace=True,
                 names=['name','state','game'])

def myfunc(data):
    prompt1 = "Enter column name: \n"
    prompt2 = "Enter search term: \n"
    user_input1 = raw_input(prompt1)
    user_input2 = raw_input(prompt2)
    print df[(df[user_input1] == user_input2)]

myfunc(df)
Enter column name:
game
Enter search term:
soccer
name state game
john CA soccer
peter CA soccer
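As a hedged illustration of the regex suggestion above (not from the original answer), pandas' str.contains allows substring or regex matching on a column instead of exact equality:
# Variation: case-insensitive substring/regex match on the chosen column.
def myfunc_regex(data):
    col = raw_input("Enter column name: \n")
    pattern = raw_input("Enter search pattern: \n")
    print data[data[col].str.contains(pattern, case=False, na=False)]

myfunc_regex(df)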
