I'm trying to run this program in PyCharm:
# Chap02-03/twitter_hashtag_frequency.py
import sys
from collections import Counter
import json

def get_hashtags(tweet):
    entities = tweet.get('entities', {})
    hashtags = entities.get('hashtags', [])
    return [tag['text'].lower() for tag in hashtags]

if __name__ == '__main__':
    fname = sys.argv[1]
    with open(fname, 'r') as f:
        hashtags = Counter()
        for line in f:
            tweet = json.loads(line)
            hashtags_in_tweet = get_hashtags(tweet)
            hashtags.update(hashtags_in_tweet)
        for tag, count in hashtags.most_common(20):
            print("{}: {}".format(tag, count))
I want to run the program twitter_hashtag_frequency.py in PyCharm, using the JSON file stream_.jsonl as a parameter. This file is in the same directory as the program. Can you show me how I can edit this code? I tried several times but didn't succeed; I got this error:
fname = sys.argv[1]
IndexError: list index out of range
Thank you for your help.
If you run the file by pressing the green play button (next to Edit Configurations), you'll need to specify the argument in the configurations menu in Parameters. Enter stream_.jsonl in the text box.
Also double-check that the working directory is set to the one containing both of these files.
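The IndexError appears because sys.argv holds only the script name when no parameter is passed. As a minimal sketch (the helper name get_input_path and the fallback to stream_.jsonl are assumptions for illustration, not part of the original program), the script could check the argument list before indexing it:

```python
import sys

def get_input_path(argv, default="stream_.jsonl"):
    """Return the first command-line argument, or a default if none was given.

    Hypothetical helper: the default file name is taken from the question.
    """
    # argv[0] is the script name, so a real argument means len(argv) >= 2.
    if len(argv) > 1:
        return argv[1]
    return default

if __name__ == "__main__":
    fname = get_input_path(sys.argv)
    print("Reading tweets from:", fname)
```

With this in place, fname = get_input_path(sys.argv) could stand in for fname = sys.argv[1]. The fallback only helps if the working directory really contains the file.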
I want my code to save information.
For example, here we take an input asking for my name:
your_name = str(input("What is your name : "))
Then, if I stop the code and run it again, I want it to still know my name.
But the problem is that when you stop the code, everything gets deleted and the program no longer knows your name.
That's how programs work. If you want to persist anything, you can store the information in a file and load it from that file every time the program runs.
For example,
import os
filepath = 'saved_data.txt'
try:
    # try loading from file
    with open(filepath, 'r') as f:
        your_name = f.read()
except FileNotFoundError:
    # if failed, ask user
    your_name = str(input("What is your name : "))
    # store result
    with open(filepath, 'w') as f:
        f.write(your_name)
With more complex data, you will want to use the pickle module.
import os
import pickle
filepath = 'saved_data.pkl'
try:
    # try loading from file (pickle files must be opened in binary mode)
    with open(filepath, 'rb') as f:
        your_name = pickle.load(f)
except FileNotFoundError:
    # if failed, ask user
    your_name = str(input("What is your name : "))
    # store result
    with open(filepath, 'wb') as f:
        pickle.dump(your_name, f)
To be able to dump and load all the data from your session, you can use the dill package:
import dill
# load all the variables from last session
dill.load_session('data.pkl')
# ... do stuff
# dump the current session to file to be used next time.
dill.dump_session('data.pkl')
I've just had a look around and found the sqlitedict package that seems to make this sort of thing easy.
from sqlitedict import SqliteDict

def main() -> None:
    prefs = SqliteDict("prefs.sqlite")
    # try getting the user's name
    if name := prefs.get("name"):
        # got a name, see if they want to change it
        if newname := input(f"Enter your name [{name}]: ").strip():
            name = newname
    else:
        # keep going until we get something
        while not (name := input("Enter your name: ").strip()):
            pass
    print(name)
    # save it for next time
    prefs["name"] = name
    prefs.commit()

if __name__ == "__main__":
    main()
Note that I'm using Python 3.8's new "Walrus operator", :=, to make the code more concise.
You'll need to rely on a database (like SQLite3, MySQL, etc) if you want to save the state of your program.
You could also write to the same file you're running, overwriting a variable assignment at the top of the file. That would raise security concerns in a real program, though, since it is equivalent to eval():
saved_name = None
if not saved_name:
    saved_name = str(input("What is your name : "))
    with open("test.py", "r") as this_file:
        lines = this_file.readlines()
    lines[0] = f'saved_name = "{saved_name}"\n'
    with open("test.py", "w") as updated_file:
        for line in lines:
            updated_file.write(line)
print(f"Hello {saved_name}")
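For the database route mentioned in this answer, here is a minimal sketch using the standard-library sqlite3 module. The table name prefs, the helper names load_name and save_name, and the file name are assumptions chosen for illustration.

```python
import sqlite3
from contextlib import closing

def _connect(db_path):
    """Open the database and make sure the (hypothetical) prefs table exists."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS prefs (key TEXT PRIMARY KEY, value TEXT)"
    )
    return conn

def load_name(db_path):
    """Return the stored name, or None if nothing has been saved yet."""
    with closing(_connect(db_path)) as conn:
        row = conn.execute(
            "SELECT value FROM prefs WHERE key = 'name'"
        ).fetchone()
    return row[0] if row else None

def save_name(db_path, name):
    """Insert the name, or overwrite the previously stored one."""
    with closing(_connect(db_path)) as conn:
        with conn:  # commits the transaction if no exception is raised
            conn.execute(
                "INSERT OR REPLACE INTO prefs (key, value) VALUES ('name', ?)",
                (name,),
            )
```

A program could call load_name("prefs.sqlite") at startup, ask the user only when it returns None, and then call save_name with the answer.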
I am trying to open a file with this Try/Except block but it is going straight to Except and not opening the file.
I've tried opening multiple different files but they are going directly to not being able to open.
import string
fname = input('Enter a file name: ')
try:
    fhand = open(fname)
except:
    print('File cannot be opened:', fname)
    exit()
counts = dict()
L_N=0
for line in fhand:
    line= line.rstrip()
    line = line.translate(line.maketrans(' ', ' ',string.punctuation))
    line = line.lower()
    words = line.split()
    L_N+=1
    for word in words:
        if word not in counts:
            counts[word]= [L_N]
        else:
            if L_N not in counts[word]:
                counts[word].append(L_N)
for h in range(len(counts)):
    print(counts)
out_file = open('word_index.txt', 'w')
out_file.write('Text file being analyzed is: '+str(fname)+ '\n\n')
out_file.close()
I would like the output to read a specific file and count the created dictionary
If you are using Python 2.7, type the file name with quotes ("myfile.txt"), because input() there evaluates what you type as a Python expression. In Python 3, quotes are not required.
Also make sure that you either enter the absolute path to the file or that the file exists in the directory you run the Python program from.
For example, if your program's current working directory is ~/code/ and you enter myfile.txt, then myfile.txt must exist in ~/code/.
However, it is best to provide the absolute path to your input file, such as
/home/user/myfile.txt
Then your script will find the file no matter what directory you call it from.
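The bare except in the question hides the real reason the file cannot be opened. As a sketch assuming Python 3 (the helper name describe_open_failure is made up for this example), the error and the resolved path can be reported instead:

```python
import os

def describe_open_failure(fname):
    """Try to open fname; return None on success, else a diagnostic message."""
    try:
        with open(fname):
            return None
    except OSError as err:
        # abspath shows where a relative name was actually looked up,
        # which depends on the current working directory.
        full_path = os.path.abspath(fname)
        return "Cannot open {!r} (resolved to {!r}): {}".format(
            fname, full_path, err
        )
```

Printing the returned message makes it easier to tell a wrong working directory from a misspelled name or a permissions problem.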
I'm trying to make a custom log watcher for a log folder using Python. The objective is simple: search the logs for a regex and write a line to a text file whenever it is found.
The problem is that the script must run constantly against a folder that may contain multiple log files with unknown names, not a single one, and it should detect the creation of new log files inside the folder on the fly.
I made a kind of tail -f in Python (copying part of the code) which constantly reads a specific log file and writes a line to a txt file if the regex is found in it. But I don't know how to do this with a folder instead of a single log file, or how the script can detect newly created log files inside the folder and read them on the fly.
#!/usr/bin/env python
import time, os, re
from datetime import datetime

# Regex used to match relevant loglines
error_regex = re.compile(r"ERROR:")
start_regex = re.compile(r"INFO: Service started:")

# Output file, where the matched loglines will be copied to
output_filename = os.path.normpath("log/script-log.txt")

# Function that will work as tail -f for python
def follow(thefile):
    thefile.seek(0,2)
    while True:
        line = thefile.readline()
        if not line:
            time.sleep(0.1)
            continue
        yield line

logfile = open("log/service.log")
loglines = follow(logfile)
counter = 0
for line in loglines:
    if (error_regex.search(line)):
        counter += 1
        sttime = datetime.now().strftime('%Y%m%d_%H:%M:%S - ')
        out_file=open(output_filename, "a")
        out_file.write(sttime + line)
        out_file.close()
    if (start_regex.search(line)):
        sttime = datetime.now().strftime('%Y%m%d_%H:%M:%S - ')
        out_file=open(output_filename, "a")
        out_file.write(sttime + "SERVICE STARTED\n" + sttime + "Number of errors detected during the startup = {}\n".format(counter))
        counter = 0
        out_file.close()
You can use watchgod for this purpose. This may be better as a comment; I'm not sure it deserves to be an answer.
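If adding a dependency is not an option, the same idea can be sketched with the standard library alone by polling the folder. This is not how watchgod works internally; it is a plain polling sketch, and the function name scan_folder and the offsets dictionary are assumptions for illustration. Each call reads whatever was appended to every file since the previous call, and files that were not seen before are read from the start.

```python
import os
import re

def scan_folder(folder, offsets, pattern):
    """Read new complete lines from every file in folder; return the matches.

    offsets maps file path -> number of bytes already consumed and is
    updated in place, so the same dict must be passed on every call.
    """
    matches = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue
        # A file not seen before starts at offset 0, so new logs are read whole.
        start = offsets.get(path, 0)
        with open(path, "rb") as f:
            f.seek(start)
            chunk = f.read()
        # Consume only complete lines; a partial last line waits for the next scan.
        end = chunk.rfind(b"\n") + 1
        offsets[path] = start + end
        for raw in chunk[:end].splitlines():
            line = raw.decode("utf-8", errors="replace")
            if pattern.search(line):
                matches.append((name, line))
    return matches
```

A watcher would call scan_folder("log", offsets, error_regex) in a loop with a short time.sleep between calls. The output file should then live outside the watched folder, otherwise the script would re-read its own output. Log rotation and truncated files are not handled in this sketch.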
I have created this search and replace program, but I want to change it so I can do a search and replace on multiple files at once.
Is there a way to get the option to select multiple files at once from any folder or directory that I choose?
The code that lets me select files using a file dialog window is given below, but it is giving errors. Can you help me correct it?
The FULL traceback error is :
Traceback <most recent call last>:
File "replace.py", line 24, in <module>
main()
File "replace.py", line 10, in main
file = tkFileDialog.askopenfiles(parent=root,mode='r',title='Choose a file')
File "d:\Python27\lib\lib-tk\tkFileDialog.py",line 163, in askopenfiles
ofiles.append(open(filename,mode))
IOError: [Errno 2] No such file or directory: u'E'
And here's the code. I finally got it to work: I changed 'file' to 'filez' and 'askopenfiles' to 'askopenfilenames', and I was able to replace the word in my chosen file. The only thing is that it doesn't work when I choose 2 files; maybe I should add a loop so that it works for multiple files. But this was trial and error, and I want to really understand why it worked. Is there a book or something that will help me fully understand tkinter and the file dialog? Anyway, I have changed the code below to show the working code:
#replace.py
import string
def main():
    #import tkFileDialog
    #import re
    #ff = tkFileDialog.askopenfilenames()
    #filez = re.findall('{(.*?)}', ff)
    import Tkinter,tkFileDialog
    root = Tkinter.Tk()
    filez = tkFileDialog.askopenfilenames(parent=root,mode='r',title='Choose a file')
    #filez = raw_input("which files do you want processed?")
    f=open(filez,"r")
    data=f.read()
    w1=raw_input("what do you want to replace?")
    w2= raw_input("what do you want to replace with?")
    print data
    data=data.replace(w1,w2)
    print data
    f=open(filez,"w")
    f.write(data)
    f.close()
main()
EDIT: One of the replies below gave me an idea about file dialog window and now I am able to select multiple files using a tkinter window, but I am not able to go ahead with the replacing. it's giving errors.
I tried out different ways to use file dialog and the different ways are giving different errors. Instead of deleting one of the ways, I have just put a hash sign in front so as to make it a comment, so you guys are able to take a look and see which one would be better.
Maybe you should take a look at the glob module, it can make finding all files matching a simple pattern (such as *.txt) easy.
Or, easier still but less user-friendly, you could of course treat your input filename filez as a list, separating filenames with space:
for fn in filez.split():
    # your code here, replacing filez with fn
You probably want to have a look at glob module.
An example that handles "*" in your input:
#replace.py
import string
import glob
def main():
    filez = raw_input("which files do you want processed?")
    filez_l = filez.split()
    w1=raw_input("what do you want to replace?")
    w2= raw_input("what do you want to replace with?")
    # Handle '*' e.g. /home/username/* or /home/username/mydir/*/filename
    extended_list = []
    for filez in filez_l:
        if '*' in filez:
            extended_list += glob.glob(filez)
        else:
            extended_list.append(filez)
    #print extended_list
    for filez in extended_list:
        print "file:", filez
        f=open(filez,"r")
        data=f.read()
        print data
        data=data.replace(w1,w2)
        print data
        f=open(filez,"w")
        f.write(data)
        f.close()
main()
I would rather use the command line instead of input.
#replace.py
def main():
    import sys
    w1 = sys.argv[1]
    w2 = sys.argv[2]
    filez = sys.argv[3:]
    # ***
    for fname in filez:
        with open(fname, "r") as f:
            data = f.read()
        data = data.replace(w1, w2)
        print data
        with open(fname, "w") as f:
            f.write(data)
if __name__ == '__main__':
    main()
So you can call your program with
replace.py "old text" "new text" *.foo.txt
or
find -name \*.txt -mmin -700 -exec replace.py "old text" "new text" {} +
If you think of a dialog window, you could insert the following at the position with ***:
    if not filez:
        import tkFileDialog
        import re
        ff = tkFileDialog.askopenfilenames()
        filez = re.findall('{(.*?)}', ff)
Why not put the program into a for-loop:
def main():
    files = raw_input("list all the files do you want processed (separated by commas)")
    for filez in files.split(','):
        f=open(filez,"r")
        data=f.read()
        f.close()
        w1=raw_input("what do you want to replace?")
        w2= raw_input("what do you want to replace with?")
        print data
        data=data.replace(w1,w2)
        print data
        f=open(filez,"w")
        f.write(data)
        f.close()
main()
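The answers in this thread are written for Python 2. As a sketch of the same loop in Python 3 (the function name replace_in_files and the UTF-8 encoding are assumptions, not something the thread specifies), the replacement can be kept separate from however the list of file names is obtained:

```python
def replace_in_files(paths, old, new):
    """Replace old with new in each file; return the paths that changed."""
    changed = []
    for path in paths:
        # newline="" keeps the files' original line endings untouched.
        with open(path, "r", encoding="utf-8", newline="") as f:
            data = f.read()
        updated = data.replace(old, new)
        if updated != data:
            with open(path, "w", encoding="utf-8", newline="") as f:
                f.write(updated)
            changed.append(path)
    return changed
```

The paths argument can be any iterable of file names, for example sys.argv[3:], the result of glob.glob, or the names selected in a file dialog.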
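A trick for reading huge files in Python 2 without loading them into memory: assuming FILES is a list of open file objects, this takes the next line from each one and strips its newline: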
contents = map(lambda x: x.next().replace("\n",""),map(iter,FILES))
How can I create file names from a number plus a suffix?
For example, I am using two programs in a Python script that runs on a server. The first creates a file x and the second uses that file. The problem is that this file must not be overwritten.
It doesn't matter what name the first program generates. The second program must pick up exactly the path and file name that was assigned in order to continue the script.
Thanks for your help and attention.
As far as I can understand you, you want to create a file with a unique name in one program and pass the name of that file to another program. I think you should take a look at the tempfile module, http://docs.python.org/library/tempfile.html#module-tempfile.
Here is an example that makes use of NamedTemporaryFile:
import tempfile
import os

def produce(text):
    # mode="w" opens the file in text mode so a str can be written
    # (NamedTemporaryFile defaults to binary mode)
    with tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False) as f:
        f.write(text)
        return f.name

def consume(filename):
    try:
        with open(filename) as f:
            return f.read()
    finally:
        # remove the file once it has been read
        os.remove(filename)

if __name__ == '__main__':
    filename = produce('Hello, world')
    print('Filename is: {0}'.format(filename))
    text = consume(filename)
    print('Text is: {0}'.format(text))
    assert not os.path.exists(filename)
The output is something like this:
Filename is: /tmp/tmpp_iSrw.txt
Text is: Hello, world
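If the name really has to be a number plus a suffix, as the question puts it, one possible sketch (assuming Python 3.3 or later; the function name next_numbered_path is made up for this example) is to count upwards and let exclusive-create mode reject names that are already taken:

```python
import os

def next_numbered_path(directory, suffix=".txt"):
    """Create and return the first '<number><suffix>' file not yet present."""
    number = 0
    while True:
        path = os.path.join(directory, "{}{}".format(number, suffix))
        try:
            # Mode "x" fails if the file already exists, so an existing
            # file is never overwritten.
            with open(path, "x"):
                return path
        except FileExistsError:
            number += 1
```

The first program would write its data to the returned path and hand that exact path to the second program, for instance as a command-line argument.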