Read lines from .txt file into sql query - python

I want to run a .txt file line by line through an SQL query. The .txt file consists of song titles that may or may not exist in the database. If more than one track in the database fits the song title, a selection menu should appear. If there is only one option, no further action is needed. If the line in the .txt file is not in the database, a print statement should say the song was not found.
To test this I made a .txt file with each of the three options described above:
Your (this gives 7 hits)
Bohemian (this gives 1 hit)
Thriller (this gives 0 hits)
I created the .txt file in another .py file, like this:
with open('MijnMuziek.txt', 'w') as f:
    f.writelines("""
your
bohemian
thriller""")
f.close()
But if I run the code below in a separate .py file, it only prints 'Choose from the following options: ' and then gives an error message saying the index is out of range.
import sqlite3

music_database = sqlite3.connect("C:\\Users\marlo\Downloads\chinook_LOI.db")
cursor = music_database.cursor()

def read_file(filename):
    with open(filename) as f:
        for track in f:
            cursor.execute(f"""SELECT DISTINCT t.TrackId, t.Name, art.Name
                               FROM tracks t
                               JOIN albums alb ON t.AlbumId = alb.AlbumId
                               JOIN artists art ON alb.ArtistId = art.ArtistId
                               WHERE t.Name LIKE '{track}%'""")

def selection_menu():
    for position, song in enumerate(tracks_available):
        print(str(position + 1), *song[1:3], sep='\t')
    choice = int(input('Choose from the following options: '))
    print('You chose:', *tracks_available[choice - 1], sep='\t')

read_file('MijnMuziek.txt')
tracks_available = cursor.fetchall()
selection_menu()
music_database.close()
When I put only one option in the .txt file (f.writelines('your')) the code does work and I get a selection menu. But with more than one line in the .txt file it does not work.
How do I solve this?

I don't have your database to test this, but here is one way to do it.
The root cause of your error is that each cursor.execute() starts a new result set, so calling fetchall() once after the loop only returns the hits for the last line of the file. For 'thriller' that is zero rows, which leaves tracks_available empty and makes the menu index out of range. The version below collects the results of each query as it runs.
It also makes sense to open and close the database in the read function, and it's a good idea to avoid global variables and instead pass values into functions.
I included protection against blank lines in your text file; note the strip() call, which also removes the trailing newline that would otherwise end up inside your LIKE pattern.
I didn't fix the SQL injection for you because I'd need to look up how parameterization works with the LIKE % you use...
import sqlite3

DATABASE_FILE = r"C:\Users\marlo\Downloads\chinook_LOI.db"

def read_tracks_from_file(filename, database_file):
    music_database = sqlite3.connect(database_file)
    cursor = music_database.cursor()
    tracks_available = []
    with open(filename) as f:
        for track in f:
            track = track.strip()  # drop the trailing newline; skips blank lines too
            if track:
                cursor.execute(f"""SELECT DISTINCT t.TrackId, t.Name, art.Name
                                   FROM tracks t
                                   JOIN albums alb ON t.AlbumId = alb.AlbumId
                                   JOIN artists art ON alb.ArtistId = art.ArtistId
                                   WHERE t.Name LIKE '{track}%'""")
                for row in cursor.fetchall():
                    tracks_available.append(row)
    music_database.close()
    return tracks_available

def selection_menu(track_selection):
    for position, song in enumerate(track_selection, start=1):
        print(str(position), *song[1:3], sep='\t')
    choice = int(input('Choose from the following options: '))
    print('You chose:', *track_selection[choice - 1], sep='\t')

tracks_available = read_tracks_from_file(filename='MijnMuziek.txt',
                                         database_file=DATABASE_FILE)
selection_menu(track_selection=tracks_available)
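As an aside (my addition, not part of the answer above): the injection issue is easy to avoid by building the LIKE pattern in Python and passing it as a bound parameter. A minimal sketch against a throwaway in-memory database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE tracks (TrackId INTEGER, Name TEXT)")
cur.executemany("INSERT INTO tracks VALUES (?, ?)",
                [(1, "Your Song"), (2, "Bohemian Rhapsody")])

track = "your"  # a line read from the text file, already stripped
# The % wildcard goes into the Python string, not the SQL text,
# so the user input is bound safely as a parameter.
cur.execute("SELECT TrackId, Name FROM tracks WHERE Name LIKE ?",
            (track + "%",))
print(cur.fetchall())  # [(1, 'Your Song')] — LIKE is case-insensitive for ASCII by default
conn.close()
```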


Python chunks write to excel

I am new to Python and I'm learning by doing.
At the moment, my code runs quite slowly, and it seems to take longer each time I run it.
The idea is to download an employee list as CSV, then check the location of each Employee ID by running it through a specific page, then write the results to an Excel file.
We have around 600 associates on site each day, and I need to find their location and keep refreshing it every 2-4 minutes.
EDIT:
For everyone to have a better understanding: I have a CSV file (TOT.csv) that contains the Employee IDs, names, and other information of the associates I have on site.
In order to get their location, I need to run each employee ID from that CSV file through https://guided-coaching-dub.corp.amazon.com/api/employee-location-svc/GetLastSeenLocationOfEmployee?employeeId= one by one, and at the same time write it to another CSV file (Location.csv). Right now it takes about 10 minutes, and I want to understand whether the way I did it is the best possible way, or if there is something else I could try.
My code looks like this:
# GET EMPLOYEE ID FROM THE CSV
data = read_csv("Z:\\_Tracker\\Dump\\attendance\\TOT.csv")

# converting column data to list
TOT_employeeID = data['Employee ID'].tolist()

# Clean the Location Sheet
with open("Z:\\_Tracker\\Dump\\attendance\\Location.csv", "w") as f:
    pass

print("Previous Location data cleared ... ")

# go through EACH employee ID to find out location
for x in TOT_employeeID:
    driver.get(
        "https://guided-coaching-dub.corp.amazon.com/api/employee-location-svc/GetLastSeenLocationOfEmployee?employeeId=" + x)
    print("Getting Location data for EmployeeID: " + x)
    locData = driver.find_element(By.TAG_NAME, 'body').text
    aaData = str(locData)
    realLoc = aaData.split('"')

    # write to excel
    with open("Z:\\_Tracker\\Dump\\attendance\\Location.csv", "a") as f:
        writer = csv.writer(f)
        writer.writerow(realLoc)
    time.sleep(5)

print("Employee Location data downloaded...")
Is there a way I can do this faster?
Thank you in advance!
Regards,
Alex
Something like this.
import concurrent.futures

def process_data(data: pd.DataFrame) -> None:
    associates = data['Employee ID'].unique()
    with concurrent.futures.ProcessPoolExecutor() as executor:
        executor.map(get_location, associates)

def get_location(associate: str) -> None:
    driver.get(
        "https://guided-coaching-dub.corp.amazon.com/api/employee-location-svc/GetLastSeenLocationOfEmployee?"
        f"employeeId={associate}")
    print(f"Getting Location data for EmployeeID: {associate}")
    realLoc = str(driver.find_element(By.TAG_NAME, 'body').text).split('"')
    with open("Z:\\_Tracker\\Dump\\attendance\\Location.csv", "a") as f:
        writer = csv.writer(f)
        writer.writerow(realLoc)

if __name__ == "__main__":
    data = read_csv("Z:\\_Tracker\\Dump\\attendance\\TOT.csv")
    process_data(data)
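A caveat worth noting (my addition): a single Selenium driver generally can't be shared across worker processes, so each worker would need its own driver, or you can use threads instead. Here is a sketch of the same fan-out pattern with a thread pool and a lock-protected writer; fetch_location is a placeholder for whatever actually retrieves the page:

```python
import concurrent.futures
import csv
import threading

write_lock = threading.Lock()

def fetch_location(associate):
    # Placeholder for the real lookup: in the actual script this would
    # load the employee-location page and split out the fields.
    return [associate, "SomeBuilding", "SomeFloor"]

def save_location(associate, writer):
    row = fetch_location(associate)
    with write_lock:  # only one thread writes to the shared file at a time
        writer.writerow(row)

def process_all(associates, out_path):
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
            for associate in associates:
                pool.submit(save_location, associate, writer)
            # leaving the "with" block waits for all submitted tasks

process_all(["100001", "100002", "100003"], "Location.csv")
```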
You could try separating the step of reading the information and writing the information to your CSV file, like below:
# Get Employee Location Information
# Create list for employee information, to be used below
employee_Locations = []

for x in TOT_employeeID:
    driver.get("https://guided-coaching-dub.corp.amazon.com/api/employee-location-svc/GetLastSeenLocationOfEmployee?employeeId=" + x)
    print("Getting Location data for EmployeeID: " + x)
    locData = driver.find_element(By.TAG_NAME, 'body').text
    aaData = str(locData)
    realLoc = [aaData.split('"')]
    employee_Locations.extend(realLoc)

# Write to excel - Try this as a separate step
with open("Z:\\_Tracker\\Dump\\attendance\\Location.csv", "a", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(employee_Locations)  # one row per employee

print("Employee Location data downloaded...")
You may see some performance gains by collecting all the information first and then writing it to your CSV file in a single step.

How to create a for loop from a input dependent function in Python?

I am finally getting the hang of Python and have started using it on a daily basis at work. However, the learning curve is still steep, and I have hit a roadblock trying something new with code I found here for scraping members from Telegram channels.
Currently, in lines 38-44, we can select a group from the list and it will scrape the user data into members.csv.
EDIT: Resolved the CSV naming issue:
print('Saving In file...')
print(target_group.title)
filename = target_group.title
with open("{}.csv".format(filename), "w", encoding='UTF-8') as f:
Instead of relying on input, I would like to create a for loop which would iterate through every group in the list.
print('Choose a group to scrape members from:')
i = 0
for g in groups:
    print(str(i) + '- ' + g.title)
    i += 1
g_index = input("Enter a Number: ")
target_group = groups[int(g_index)]
The problem is that I am not sure exactly how to replace this part of the code with a for loop.
Just changing it into a for loop would make it overwrite the same members.csv file on each iteration, so I plan on changing that so it outputs unique files.
So, circling back to my question: how do I make a single run of the program loop through all of the groups, or just select all of them?
Thanks for the help !
Couldn't test this, but something like this maybe? This creates a new .csv file for each group.
for chat in chats:
    try:
        if chat.megagroup == True:
            groups.append(chat)
    except:
        continue

for current_group in groups:
    print(f"Fetching members for group \"{current_group.title}\"...")
    all_participants = client.get_participants(current_group, aggressive=True)

    current_file_name = f"members_{current_group.title}.csv"
    print(f"Saving in file \"{current_file_name}\"...")
    with open(current_file_name, "w+", encoding="UTF-8") as file:
        writer = csv.writer(file, delimiter=",", lineterminator="\n")
        writer.writerow(["username", "user id", "access hash", "name", "group", "group id"])
        for user in all_participants:
            username = user.username if user.username else ""
            first_name = user.first_name.strip() if user.first_name else ""
            last_name = user.last_name.strip() if user.last_name else ""
            name = f"{first_name} {last_name}"
            row = [username, user.id, user.access_hash, name, current_group.title, current_group.id]
            writer.writerow(row)
    print(f"Finished writing to file \"{current_file_name}\".")

print("Members scraped successfully.")
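One caveat (my addition, not part of the answer above): group titles can contain characters that are invalid in filenames, such as / or :. A small helper to sanitize the title before using it, assuming we simply replace the offending characters with underscores:

```python
import re

def safe_filename(title):
    # Replace characters that are invalid in Windows/Unix filenames
    # with underscores, and trim surrounding whitespace.
    return re.sub(r'[\\/:*?"<>|]', "_", title).strip()

print(safe_filename('My Group: News/Updates'))  # My Group_ News_Updates
```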
Ended up figuring out the issue:
On naming the CSV file: Used the title attribute to name the file and replacement within the string.
g_index = chat_num
target_group = groups[int(g_index)]
filename = target_group.title

print('Fetching Members from {} ...'.format(filename))
all_participants = []
all_participants = client.get_participants(target_group, aggressive=True)

print('Saving In file...')
with open("{}.csv".format(filename), "w", encoding='UTF-8') as f:
On creating a for loop for the sequence: the original code (posted in the question) did not include a for loop. My workaround was to wrap everything in a function and then iterate through an indexed list as long as the number of chats detected. In the end it looks like this:
chat_list_index = list(range(len(chats)))

for x in chat_list_index:
    try:
        get(x)
    except:
        print("No more groups.", end=" ")

print("Done")
Overall, this might not be the best way to accomplish what I set out to do, but it's good enough for me for now, and I have learned a lot. Maybe someone finds this beneficial in the future. Full code available here: (https://github.com/ivanstruk/telegram-member-scraper/).
Cheers!

Error whilst trying to delete string from a 'txt' file - Contacts list program

I'm creating a contact list/book program which can create new contacts, save them in a 'txt' file, list all contacts, and delete existing contacts. Well, sort of. In my delete function there is an error and I can't quite tell why. There is no error prompted on the shell when running. It's meant to ask the user which contact they want to delete, find what the user said in the 'txt' file, then delete it. It can find the contact easily; however, it just doesn't delete the string at all.
I have tried other methods, including if/else statements and other code copied from online - nothing works.
import os, time, random, sys, pyautogui

# function for creating a new contact.
def new_contact():
    name = str(input("Clients name?\n:"))
    name = name + " -"
    info = str(input("Info about the client?\n:"))

    # starts formatting clients name and info for injection into file.
    total = "\n\n"
    total = total + name
    total = total + " "
    total = total + info
    total = total + "\n"

    # Injects info into file.
    with open("DATA.txt", "a") as file:
        file.write(str(total))
        file.close
    main()

# function for listing ALL contacts made.
def list():
    file = open("DATA.txt", "r")
    read = file.read()
    file.close

    # detects whether there are any contacts at all. If there are none the only str in the file is "Clients:"
    if read == "Clients:":
        op = str(input("You havn't made any contacts yet..\nDo you wish to make one?\n:"))
        if op == "y":
            new_contact()
        else:
            main()
    else:
        print(read)
        os.system('pause')
        main()

# Function for deleting contact
def delete_contact():
    file = open("DATA.txt", "r")
    read = file.read()
    file.close

    # detects whether there are any contacts at all. If there are none the only str in the file is "Clients:"
    if read == "Clients:":
        op = str(input("You havn't made any contacts yet..\nDo you wish to make one?\n:"))
        if op == "y":
            new_contact()
        else:
            main()
    else:
        # tries to delete whatever was inputted by the user.
        file = open("DATA.txt", "r")
        read = file.read()
        file.close
        print(read, "\n")
        op = input("copy the Clinets name and information you wish to delete\n:")
        with open("DATA.txt") as f:
            reptext = f.read().replace(op, '')
        with open("FileName", "w") as f:
            f.write(reptext)
        main()

# Main Menu Basically.
def main():
    list_contacts = str(input("List contacts? - L\n\n\nDo you want to make a new contact - N\n\n\nDo you want to delete a contact? - D\n:"))
    if list_contacts in ("L", "l"):
        list()
    elif list_contacts in ("N", "n"):
        new_contact()
    elif list_contacts in ("D", "d"):
        delete_contact()
    else:
        main()

main()
It is expected to delete whatever the user inputs from the txt file. No errors show up on the shell/console; it's as if the program thinks it has done it, but it hasn't. The txt file contains:
Clients:
Erich - Developer
Bob - Test subject
In your delete function, instead of writing back to DATA.txt, you open "FileName".
When using "with", a file handle doesn't need to be closed explicitly. Also, file.close() is a method: you didn't call it, you only referenced it (file.close without parentheses does nothing).
In addition, in the delete function, you wrote to "FileName" instead of "DATA.txt".
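Putting those fixes together, a minimal corrected delete step might look like this (my sketch, using the same read-replace-rewrite approach as the original):

```python
def remove_entry(text, entry):
    """Return the contact text with one entry removed."""
    return text.replace(entry, "")

def delete_contact(path="DATA.txt"):
    # Read the whole file, remove the chosen entry, and write the
    # result back to the SAME file (DATA.txt, not "FileName").
    with open(path) as f:        # "with" closes the file for us
        text = f.read()
    entry = input("Copy the client's name and information you wish to delete\n:")
    with open(path, "w") as f:
        f.write(remove_entry(text, entry))
```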

How do I create new JSON data after every script run

I have JSON data stored in the variable data.
I want it to write to a new text file each time it runs, so I know which JSON data is new instead of rewriting the same JSON.
Currently, I am trying this:
Saving = firstname + ' ' + lastname + ' - ' + email

with open('data.json', 'a') as f:
    json.dump(Saving, f)
    f.write("\n")
which just appends to the JSON file. At the beginning of the script, where the first code starts, I clear it with:
Infotext = "First name : Last name : Email"

with open('data.json', 'w') as f:
    json.dump(Infotext, f)
    f.write("\n")
How can I make it create a new file with the Infotext header and then add the Saving lines, instead of rewriting the same JSON?
Output in Json:
"First name : Last name : Email"
Hello World - helloworld#test.com
Hello2 World - helloworld2#test.com
Hello3 World - helloworld3#test.com
Hello4 World - helloworld4#test.com
That's the output I want. Basically it needs to start with
"First name : Last name : Email"
and then the name, last name, and email lines add up below that until there are no names anymore.
So, simply put: instead of clearing and appending to the same JSON file (data.json), I want it to create a new file called data1.json; then if I rerun the program tomorrow it becomes data2.json, and so on.
Just use a datetime in the file name to create a unique file each time the code is run. In this case, granularity goes down to the second, so if the code is run more than once per second you will overwrite the existing contents of a file. In that case, step down to file names with microseconds in them.
import datetime as dt
import json

time_script_run = dt.datetime.now().strftime('%Y_%m_%d_%H_%M_%S')
with open('{}_data.json'.format(time_script_run), 'w') as outfile:
    json.dump(Infotext, outfile)
This has multiple drawbacks:
You'll have an ever-growing number of files
Even if you load the file with the latest datetime in its name (and finding that file takes longer as files accumulate), you can only see the data as it was in the single run before the last one; the full history is very difficult to look up.
I think you're better using a light-weight database such as sqlite3:
import sqlite3
import random
import time
import datetime as dt

# Create DB
with sqlite3.connect('some_database.db') as conn:
    c = conn.cursor()
    # Just for this example, we'll clear the whole table to make it repeatable
    try:
        c.execute("DROP TABLE user_emails")
    except sqlite3.OperationalError:  # First time you run this code
        pass
    c.execute("""CREATE TABLE IF NOT EXISTS user_emails(
                 datetime TEXT,
                 first_name TEXT,
                 last_name TEXT,
                 email TEXT)
              """)

    # Now let's create some fake user behaviour
    for x in range(5):
        now = dt.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
        c.execute("INSERT INTO user_emails VALUES (?, ?, ?, ?)",
                  (now, 'John', 'Smith', random.randint(0, 1000)))
        time.sleep(1)  # so we get new timestamps

# Later on, doing some work
with sqlite3.connect('some_database.db') as conn:
    c = conn.cursor()

    # Get whole user history
    c.execute("""SELECT * FROM user_emails
                 WHERE first_name = ? AND last_name = ?
              """, ('John', 'Smith'))
    print("All data")
    for row in c.fetchall():
        print(row)

    print('...............................................................')

    # Or, let's get the last email address
    print("Latest data")
    c.execute("""
        SELECT * FROM user_emails
        WHERE first_name = ? AND last_name = ?
        ORDER BY datetime DESC
        LIMIT 1;
    """, ('John', 'Smith'))
    print(c.fetchall())
Note: the data retrieval itself runs very quickly; this code only takes ~5 seconds because I use time.sleep(1) while generating the fake user data.
The JSON file should contain a list of strings. You should read the current contents of the file into a variable, append to the variable, then rewrite the file.
with open("data.json", "r") as f:
    data = json.load(f)

data.append(firstname + ' ' + lastname + ' - ' + email)

with open("data.json", "w") as f:
    json.dump(data, f)
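For the very first run, when data.json doesn't exist yet, the read step can fall back to an empty list. A small sketch of that (my addition to the approach above):

```python
import json
import os

def append_record(path, record):
    # Load the existing list, or start fresh if the file isn't there yet.
    if os.path.exists(path):
        with open(path) as f:
            data = json.load(f)
    else:
        data = []
    data.append(record)
    with open(path, "w") as f:
        json.dump(data, f)

append_record("data.json", "Hello World - helloworld#test.com")
append_record("data.json", "Hello2 World - helloworld2#test.com")
```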
I think what you could do is use seek() and write at the relevant position in the JSON file. For example, if you need to update firstname, you seek to the : after firstname and update the text there.
There are examples here:
https://www.tutorialspoint.com/python/file_seek.htm

searching a csv and printing a line

Trying to create a train booking system.
Having trouble searching my CSV and printing that certain line.
The user already has their ID number, and the CSV is set out like
This is what I have so far:
You are matching the entire line against the ID. You need to split out the first field and check that:
def buySeat():
    id = raw_input("please enter your ID")
    for line in open("customers.csv"):
        if line.split(',')[0] == id:
            print line
            break
    else:  # runs only if the loop never hit break, i.e. no match found
        print "sorry cant find you"
Try using the built-in CSV module. It will make things easier to manage as your requirements change.
import csv

id = raw_input("please enter your ID")

ID_INDEX = 0
found = False
with open('customers.csv', 'rb') as csvfile:
    csvReader = csv.reader(csvfile)
    for row in csvReader:
        # Ignore the column names on the first line.
        if row[ID_INDEX] != 'counter':
            if row[ID_INDEX] == id:
                print ' '.join(row)
                found = True
if not found:
    print 'sorry cant find you'
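In Python 3, csv.DictReader handles the header row automatically and lets you look up fields by column name. A sketch, assuming the ID column is named counter as in the answer above (the demo file name and contents are made up for illustration):

```python
import csv

def find_customer(path, customer_id):
    # DictReader consumes the header line and keys each row by column name.
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            if row["counter"] == customer_id:
                return row
    return None

# Tiny demo file with the assumed 'counter' header column:
with open("customers_demo.csv", "w", newline="") as f:
    f.write("counter,name,seat\n1,Alice,12A\n2,Bob,14C\n")

match = find_customer("customers_demo.csv", "2")
print(match["name"] if match else "sorry cant find you")  # prints Bob
```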
