I am currently trying to write a program which runs through a CSV file of academic papers. The CSV is tab-delimited and has four columns (Author, Date, Title, Journal).
The idea is to ask the user whether they want to search the papers by Author, Paper Title or Journal Title (or press Q to quit), and display the results of the query back to the user in this order: Author/s. Year. Title. Journal.
My code runs, but it only retrieves data from the search option I selected. That is, if I choose to search by Author, it displays the authors whose names match the query, but none of the other information (the year, title or journal). The same happens with the other search options (e.g. if I select Journal, it returns any matching journals but not the Author, Date or Title of the paper).
Any help here is greatly appreciated! Below is my code.
import csv
def AuthorSearch():
    authorSearch = input("Please type Author name. \n")
    for item in Author:
        if item.find(authorSearch) != -1:
            print (item)
def TitleSearch():
    titleSearch = input("Please type in Title, \n")
    for item in Title:
        if item.find(titleSearch) != -1:
            print (item)
def JournalSearch():
    journalSearch = input("Please type in a Journal, \n")
    for item in Journal:
        if item.find(journalSearch) != -1:
            print (item)
data = csv.reader (open('List.txt', 'rt'), delimiter='\t')
Author, Year, Title, Journal = [], [], [], []
for row in data:
    Author.append(row[0])
    Year.append(row[1])
    Title.append(row[2])
    Journal.append(row[3])
print ("Please type in capitals.")
searchOption = input("Press A to search for Author, T to search titles or J to search Journals or press Q to quit. \n" )
if searchOption == 'A':
    AuthorSearch()
elif searchOption == 'T':
    TitleSearch()
elif searchOption == 'J':
    JournalSearch()
elif searchOption == 'Q':
    exit()
Thank you very much to anybody who helps, it's really appreciated!
I have googled and read the CSV reference page, but I can't seem to get my head around it. Again, all help is appreciated!
The list Author doesn't contain anything but the authors. When you do for item in Author, you are only looking through the authors. When you then print the found item, it is of course only the author. You have the same problem with each field. You have four separate lists of fields that are not linked in any way.
I would suggest you take a look at the pandas library, which has nice facilities for reading CSV files into a tabular data structure. It also does a lot more than that, but it should easily handle what you want to do here.
Your issue is that you put all your info into separate lists. You did that because you know the column numbers, so just keep each row intact and index into it by column number!
everything = []
for row in data:
    everything.append(row)  # append is a method call, so it needs parentheses
Here is an example for your title search function:
def TitleSearch():
    titleSearch = input("Please type in Title, \n")
    for row in everything:
        title = row[2]  # third column holds the title
        if title.find(titleSearch) != -1:
            print(row)
Now you take the entire row and run find() on just the third column (the one you said was the title). If that column contains your titleSearch text, the entire row is printed with all of its information. Problem solved!
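To cover all three search options with the same idea, one function can take the column index as a parameter. This is only a minimal sketch: the names search_rows and format_row and the column constants are made up for illustration, and the sample rows are invented data standing in for List.txt.

```python
import csv
import io

# Column positions as described in the question (Author, Date, Title, Journal).
AUTHOR, YEAR, TITLE, JOURNAL = 0, 1, 2, 3

def search_rows(rows, column, query):
    """Return every full row whose given column contains the query text."""
    return [row for row in rows if query in row[column]]

def format_row(row):
    """Format one row as 'Author/s. Year. Title. Journal.'"""
    return "{}. {}. {}. {}.".format(row[AUTHOR], row[YEAR], row[TITLE], row[JOURNAL])

# Invented sample data standing in for the tab-delimited List.txt.
sample = ("Smith J\t2001\tA Study of Things\tJournal of Things\n"
          "Jones A\t2005\tMore Things\tThings Quarterly\n")
rows = list(csv.reader(io.StringIO(sample), delimiter="\t"))

for match in search_rows(rows, AUTHOR, "Smith"):
    print(format_row(match))
```

The menu code then only has to map 'A', 'T' and 'J' to AUTHOR, TITLE and JOURNAL and call search_rows once.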
I am fairly new to file handling and csv files in Python.
Code
import csv
file1=open("Employee.csv","w",newline='\r\n')
empwriter=csv.writer(file1)
empwriter.writerow(["EmpID.","Name","MobNo."])
n=int(input("How many employee records do you want to add?:"))
for i in range(n):
    print("\nEmployee Record",(i+1))
    empid=input("Enter empID:")
    name=input("Enter Name:")
    mob=input("Enter Mobile No.:")
    emprec=[empid,name,mob]
    empwriter.writerow(emprec)
file1.close()
file2=open("Employee.csv","r+",newline='\r\n')
empreader=csv.reader(file2)
newwriter=csv.writer(file2)
check=input("\nEnter the empID to check:\n")
opt=int(input('''Which of the following fields do you want to update?:
1.Update the Name
2.Update the Mobile No.
3.Update the whole record\n\n'''))
if opt==1:
    for x in empreader:
        if x[0]==check:
            newname=input("Enter new name:")
            x[1]=newname
            print("Record Updated Successfully!\n Updated record:\n",x)
            newwriter.writerow(x)
        print(x)
elif opt==2:
    for x in empreader:
        if x[0]==check:
            newmob=input("Enter new Mobile No.:")
            x[2]=newmob
            print("Record Updated Successfully!\n Updated record:\n",x)
            newwriter.writerow(x)
        print(x)
elif opt==3:
    for x in empreader:
        if x[0]==check:
            newname=input("Enter new name:")
            newmob=input("Enter new Mobile No.:")
            x[1]=newname
            x[2]=newmob
            print("Record Updated Successfully!\n Updated record:\n",x)
            newwriter.writerow(x)
        print(x)
file2.close()
I have in this code tried to
enter records (empID, Name, MobNo.) by user and make a csv file.
using empID find the desired record.
update that record.
display all records.
When I execute the code, the print statement inside the for loop gets executed before the if statement and gives me this output.
Output
How many employee records do you want to add?:3
Employee Record 1
Enter empID:001
Enter Name:John
Enter Mobile No.:1234567890
Employee Record 2
Enter empID:002
Enter Name:Jane
Enter Mobile No.:2345678901
Employee Record 3
Enter empID:003
Enter Name:Judy
Enter Mobile No.:4567890123
Enter the empID to check:
002
Which of the following fields do you want to update?:
1.Update the Name
2.Update the Mobile No.
3.Update the whole record
2
['EmpID.', 'Name', 'MobNo.']
['001', 'John', '1234567890']
Enter new Mobile No.:1111222233
Record Updated Successfully!
Updated record:
['002', 'Jane', '1111222233']
['002', 'Jane', '1111222233']
As you can see in the last few lines of the output, the print statement got executed before the if statement (or something similar). I actually want the output to be displayed as shown below.
Desired output
2
Enter new Mobile No.:1111222233
Record Updated Successfully!
Updated record:
['002', 'Jane', '1111222233']
['EmpID.', 'Name', 'MobNo.']
['001', 'John', '1234567890']
['002', 'Jane', '1111222233']
['003', 'Judy', '4567890123']
I can see what's happening, and why it's not what you expect.
Take a look at this block:
elif opt==2:
    for x in empreader:
        if x[0]==check:
            newmob=input("Enter new Mobile No.:")
            x[2]=newmob
            print("Record Updated Successfully!\n Updated record:\n",x)
            newwriter.writerow(x)
        print(x)
What's actually happening is that, for every row x in empreader:
the ID col, x[0], is being evaluated against the ID you entered, check.
If it matches:
you interrupt the loop, and step in to modify x
write out the new value, newwriter.writerow(x)
then you step out and...
print your update, print(x)
Unless x[0]==check is False, in which case x is just printed.
Think about (and test) what happens when you provide a non-existent ID for check: it will never get into your "modify interface" and will instead just print every (unmodified) row in empreader.
What happens then, do you display an error message letting the user know they picked a bad ID?
This is a pretty big change, but I think it keeps the logic and your intent more clear... you're not making decisions in loops, which is very difficult. Take input and make decisions in a "flat" sequence:
import csv
import sys
# Get the employee ID
check=input("\nEnter the empID to check:\n")
# and check the ID in the employee records
emp_row = None
with open("Employee.csv","r",newline='\r\n') as f:
    empreader=csv.reader(f)
    for x in empreader:
        if x[0] == check: # the employee ID matches
            emp_row = x[:]
if emp_row is None:
    print(f'Could not find employee ID {check}')
    sys.exit(0) # or sys.exit(1) to show it's an "error"
# Now you know you have a valid ID, because emp_row has data
opt=int(input('''Which of the following fields do you want to update?:
1.Update the Name
2.Update the Mobile No.
3.Update the whole record\n\n'''))
if opt == 1:
    newname = input("Enter new name:")
    emp_row[1] = newname
# elif opt == 2:
#     ...
# elif opt == 3:
#     ...
# Now how do you overwrite the old row in Employee.csv with emp_row?
# Don't try to "modify" the original file as you're reading it, instead...
# Read from the old file, copying all the old rows, except for emp_row, to a new list
new_rows = []
with open('Employee.csv', 'r', newline='\r\n') as f_in:
    reader = csv.reader(f_in)
    header_row = next(reader)
    new_rows.append(header_row)
    for row in reader:
        if row[0] == check:
            new_rows.append(emp_row)
        else:
            new_rows.append(row)
# Then write your new_rows out
with open('Employee.csv', 'w', newline='\r\n') as f_out:
    writer = csv.writer(f_out)
    writer.writerows(new_rows)
I started with Employee.csv looking like this:
EmpID.,Name,MobNo.
001,John,1234567890
002,Jane,2345678901
003,Judy,4567890123
I ran that, and it looked like this:
% python3 main.py
Enter the empID to check:
002
Which of the following fields do you want to update?:
1.Update the Name
2.Update the Mobile No.
3.Update the whole record
1
Enter new name:Alice
and Employee.csv now looks like:
EmpID.,Name,MobNo.
001,John,1234567890
002,Alice,2345678901
003,Judy,4567890123
(I'm on a Mac, where newline='\r\n' adds extra line-ending characters to the file.)
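The read, modify, rewrite sequence above can also be packed into one function that handles all three menu options. This is a sketch under assumptions, not the answer's own code: update_employee and its parameters are hypothetical names, and it assumes the EmpID, Name, MobNo. column order from the question.

```python
import csv

def update_employee(path, emp_id, name=None, mob=None):
    """Rewrite the CSV at path, changing the row whose first column is emp_id.

    Returns True if a matching row was found, False otherwise.
    """
    # newline='' is what the csv module documentation recommends for CSV files.
    with open(path, "r", newline="") as f_in:
        rows = list(csv.reader(f_in))

    found = False
    for row in rows[1:]:  # skip the header row
        if row and row[0] == emp_id:
            if name is not None:
                row[1] = name
            if mob is not None:
                row[2] = mob
            found = True

    # Only touch the file when something actually changed.
    if found:
        with open(path, "w", newline="") as f_out:
            csv.writer(f_out).writerows(rows)
    return found
```

For example, update_employee("Employee.csv", "002", mob="1111222233") would cover option 2, and passing both name and mob would cover option 3. A False return value is the cue to tell the user the ID was not found.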
Background: I am creating a tweet scraper, using snscrape, to scrape tweets from sitting government representatives in the House and Senate. The tweets that I'm scraping I am scanning for keywords related to "cybersecurity" and "privacy". I'm using a dictionary of words to scan for. Usually, I would have many more members in the username list but I am just trying to test with a low number to get it working first.
The problem: I have set up nested for loops to run through each username to check and the dictionary of words to scan for. The output is only showing the last person in my username list. I can't find out why. It's like every time the for loop restarts it erases the last person it just checked.
The code:
import os
import pandas as pd
tweet_count = 500
username = ["SenShelby", "Ttuberville", "SenDanSullivan"]
text_query = ["cybersecurity", "cyber security", "internet privacy", "online privacy", "computer security", "health privacy", "privacy", "security breach", "firewall", "data"]
since_date = "2016-01-01"
until_date = "2021-10-14"
for person in username:
    for word in text_query:
        os.system("snscrape --jsonl --progress --max-results {} --since {} twitter-search '{} from:{} until:{}'> user-tweets.json".format(tweet_count, since_date, word, person, until_date))
tweets_framework = pd.read_json('user-tweets.json', lines=True)
tweets_framework.to_csv('user-tweets.csv', sep=',', index=False)
Any help would be greatly appreciated!
First, you should use a unique file name for each JSON file (here, one per user and search term); otherwise every run overwrites user-tweets.json.
Second, you need to run the JSON-to-CSV conversion for each file (if that is what you are trying to do):
for person in username:
    for word in text_query:
        # replace spaces so the shell redirect gets a single file name
        filename = '{}-{}-tweets'.format(person, word.replace(' ', '_'))
        os.system("snscrape --jsonl --progress --max-results {} --since {} twitter-search '{} from:{} until:{}'> {}.json".format(tweet_count, since_date, word, person, until_date, filename))
        tweets_framework = pd.read_json('{}.json'.format(filename), lines=True)
        tweets_framework.to_csv('{}.csv'.format(filename), sep=',', index=False)
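As a possible variation, the command can be built as an argument list and run with subprocess instead of os.system, which sidesteps shell quoting for multi-word search terms. This is a sketch only: output_name, build_command and scrape_to_file are hypothetical helper names, and the snscrape options are copied from the question rather than checked against the tool.

```python
import subprocess

def output_name(person, word):
    """Build a per-user, per-keyword file stem with no spaces in it."""
    return "{}-{}-tweets".format(person, word.replace(" ", "_"))

def build_command(person, word, tweet_count, since_date, until_date):
    """Return the snscrape call as an argument list (same options as the question)."""
    query = "{} from:{} until:{}".format(word, person, until_date)
    return ["snscrape", "--jsonl", "--progress",
            "--max-results", str(tweet_count),
            "--since", since_date,
            "twitter-search", query]

def scrape_to_file(person, word, tweet_count, since_date, until_date):
    """Run snscrape and send its standard output to a unique JSON file."""
    path = output_name(person, word) + ".json"
    with open(path, "w") as out:
        subprocess.run(
            build_command(person, word, tweet_count, since_date, until_date),
            stdout=out, check=True)
    return path
```

Calling scrape_to_file(person, word, tweet_count, since_date, until_date) inside the two loops would then produce one JSON file per user and keyword.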
I am attempting to add authors and their book titles to a list inside of a dictionary so that each author can have multiple book titles. In the code, I have 3 authors already and each has 1 book title, but they need to be able to hold at least 1 more book title.
I already have the values (book titles) of the keys (authors) stored in lists inside the dictionary, but I don't know how to append more values to the existing lists for keys that are already there.
readings = {'George Orwell': ['1984'], 'Harper Lee': ['To Kill a Mockingbird'], 'Paul Tremblay': ['The Cabin at the End of the World']} # list inside of dict.
I need to use the following code to append the new book titles to the list
def add(readings): # appending to list will go here
    author = input('\nEnter an author: ')
    if author in readings: # check if input already inside dict.
        bookTitle = readings[author]
        print(f'{bookTitle} is already added for this author.\n')
    else:
        bookTitle = input('Enter book title: ')
        bookTitle = bookTitle.title()
        readings[author] = bookTitle
        print(f'{bookTitle} was added.\n')
The program should not allow the same book title to be added twice for an author, or the same author to be added twice. I should be able to input book titles for an existing author (or for a new author) while the program is running, and then view all of the authors and their book title(s) via a 'command menu' (not shown).
Your workflow is a little off. After checking for the author, check for the book in the list of books by that author. You can add a title to the list of books using .append. Try this:
def add(readings): # appending to list will go here
    author = input('\nEnter an author: ')
    if author in readings: # check if input already inside dict.
        books = readings[author]
        print(f'Found {len(books)} books by {author}:')
        for b in books:
            print(f' - {b}')
    else:
        readings[author] = []  # new author starts with an empty list
    bookTitle = input('Enter book title: ')
    bookTitle = bookTitle.title()
    if bookTitle in readings[author]:
        print(f'{bookTitle} is already added for this author.')
    else:
        readings[author].append(bookTitle)
        print(f'Added "{bookTitle}"')
So you are trying to add multiple books to an author, is that correct? Since the values in your dictionary are already stored as lists, you can try:
readings[author].append(bookTitle)
instead of
readings[author] = bookTitle
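A short sketch of how that append might be wrapped so that new authors and duplicate titles are both handled. The function name add_book is made up for illustration; dict.setdefault returns the existing list for a known author or stores and returns a new empty list otherwise.

```python
def add_book(readings, author, book_title):
    """Add book_title to the author's list; return True if it was added."""
    books = readings.setdefault(author, [])  # creates an empty list for a new author
    if book_title in books:
        return False  # title already recorded for this author
    books.append(book_title)
    return True

readings = {'George Orwell': ['1984']}
add_book(readings, 'George Orwell', 'Animal Farm')
add_book(readings, 'Harper Lee', 'To Kill A Mockingbird')
print(readings)
```

Keeping input() calls outside the function makes it easy to call from a command menu and to check its return value.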
So I have a CSV file that looks something like this:
Username,Password,Name,DOB,Fav Artist,Fav Genre
Den1994,Denis1994,Denis,01/02/1994,Eminem,Pop
Joh1997,John1997,John,03/04/1997,Daft Punk,House
What I need to be able to do is let the user edit their Fav Artist and Fav Genre so that the new values are saved to the file in place of the old ones. I'm not very advanced when it comes to CSV, so I'm not sure where to begin. Any help and pointers will be greatly appreciated.
Thanks guys.
EDIT:
Adding the code I have so far so it doesn't seem like I'm just trying to get some easy way out of this, generally not sure what to do after this bit:
def editProfile():
    username = globalUsername
    file = open("users.csv", "r")
    for line in file:
        field = line.split(",")
        storedUsername = field[0]
        favArtist = field[4]
        favGenre = field[5]
        if username == storedUsername:
            print("Your current favourite artist is:", favArtist,"\n" +
                  "Your current favourite genre is:",favGenre,"\n")
            wantNewFavArtist = input("If you want to change your favourite artist type in Y, if not N: ")
            wantNewFavGenre = input("If you want to change your favourite genre type in Y, if not N: ")
            if wantNewFavArtist == "Y":
                newFavArtist = input("Type in your new favourite artist: ")
            if wantNewFavGenre == "Y":
                newFavGenre = input("Type in your new favourite genre: ")
This is how it would look using pandas:
import pandas as pd
from io import StringIO
# Things you'll get from a user
globalUsername = "Den1994"
field = 'Fav Artist'
new_value = 'Linkin Park'
# Things you'll probably get from a data file
data = """
Username,Password,Name,DOB,Fav Artist,Fav Genre
Den1994,Denis1994,Denis,01/02/1994,Eminem,Pop
Joh1997,John1997,John,03/04/1997,Daft Punk,House
"""
# Load your data (e.g. from a CSV file)
df = pd.read_csv(StringIO(data)).set_index('Username')
print(df)
# Now change something
df.loc[globalUsername, field] = new_value
print(df)
Here df.loc[] allows you to access a row by its index label. In this case Username is set as the index, and field selects the column in that row. Passing both to a single .loc call avoids chained indexing (df.loc[...][...]), which may assign to a temporary copy instead of df.
Also, consider this:
df.loc[globalUsername, ['Fav Artist', 'Fav Genre']] = ['Linkin Park', 'Nu Metal']
In case you have a my-data.csv file you can load it with:
df = pd.read_csv('my-data.csv')
The code above will return
           Password   Name         DOB  Fav Artist  Fav Genre
Username
Den1994   Denis1994  Denis  01/02/1994      Eminem        Pop
Joh1997    John1997   John  03/04/1997   Daft Punk      House
and
           Password   Name         DOB   Fav Artist  Fav Genre
Username
Den1994   Denis1994  Denis  01/02/1994  Linkin Park        Pop
Joh1997    John1997   John  03/04/1997    Daft Punk      House
Try this
import pandas as pd
data = pd.read_csv("old_file.csv")
data.loc[data.Username=='Den1994',['Fav Artist','Fav Genre']] = ['Beyonce','Hard rock']
data.to_csv('new_file.csv',index=False)
Python has a built-in csv module for dealing with CSV files, and there are examples in the docs that will guide you.
One way to do it is to use the csv module to read the file into a list of lists. You can then edit the individual lists (rows) and rewrite to disk what you have in memory.
Good luck.
PS: in the code that you have posted, nothing is assigned to the "csv in memory" based on the user input.
A minimal example without the file handling could be:
fake = 'abcdefghijkl'
csv = [list(fake[i:i+3]) for i in range(0, len(fake), 3)]
print(csv)
for row in csv:
    if row[0] == 'd':
        row[0] = 'changed'
print(csv)
The file handling is easy to get from the docs, and the pandas dependency is avoided if that is on the wishlist.
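For completeness, here is one way the list-of-lists idea might look with the file handling filled in. It is a sketch under assumptions: update_favourites is a hypothetical name, the file is assumed to have the header row shown in the question, and the username is assumed to be in the first column.

```python
import csv

def update_favourites(path, username, artist=None, genre=None):
    """Load the CSV into a list of lists, edit one user's row, and write it back."""
    with open(path, newline="") as f:
        rows = list(csv.reader(f))

    # Look the columns up by header name instead of hard-coding positions.
    header = rows[0]
    artist_col = header.index("Fav Artist")
    genre_col = header.index("Fav Genre")

    changed = False
    for row in rows[1:]:
        if row and row[0] == username:
            if artist is not None:
                row[artist_col] = artist
            if genre is not None:
                row[genre_col] = genre
            changed = True

    # Rewrite the whole file from memory only if a row was edited.
    if changed:
        with open(path, "w", newline="") as f:
            csv.writer(f).writerows(rows)
    return changed
```

In editProfile(), the two input() answers could be passed straight in, for example update_favourites("users.csv", globalUsername, artist=newFavArtist).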
I am trying to create a button that adds a row to a table (QTableWidget) and uses a dialog box to ask for the row name, and I have hit a big problem (seemingly a flaw within PyQt).
After adding a row with insertRow(), the new row's header item is None, which means you cannot call verticalHeaderItem(rowPosition).setText(...) on it, because there is no item to set the text on.
The relevant code is here:
def RenameRow(self, i, name):
    self.tab1table.verticalHeaderItem(i).setText(name)
def DatabaseAddRow(self):
    text, ok = QInputDialog.getText(self, "Row Entry", 'Please Enter A Row Name:', QLineEdit.Normal, 'e.g. ECN 776')
    if ok and text != '':
        rowPosition = self.tab1table.rowCount()
        self.tab1table.insertRow(rowPosition)
        self.RenameRow(rowPosition, text)
Any Ideas how to get around this or maybe methods I do not know about?
So I managed to solve this myself just after asking, after wasting half a day on the problem; such is life. The solution is to assign an empty item to the header and then rename it. The implementation is here:
def RenameRow(self, i, name, table):
    item = QTableWidgetItem()
    table.setVerticalHeaderItem(i, item)
    item = table.verticalHeaderItem(i)
    item.setText(QCoreApplication.translate("MainWindow", name))