Why does PyCharm warn about "Redeclared variable defined above without usage"? - python

Why does PyCharm warn me about Redeclared 'do_once' defined above without usage in the below code? (warning is at line 3)
for filename in glob.glob(os.path.join(path, '*.'+filetype)):
    with open(filename, "r", encoding="utf-8") as file:
        do_once = 0
        for line in file:
            if 'this_text' in line:
                if do_once == 0:
                    # do stuff
                    do_once = 1
                # some other stuff because of 'this text'
            elif 'that_text' in line and do_once == 0:
                # do stuff
                do_once = 1
Since I want this to happen once per file, it seemed appropriate to reset the flag every time a new file is opened, and the code works exactly as I want. However, I have not studied Python formally and have only picked things up by doing and googling, so I would like to know why PyCharm gives me this warning and what I should do differently.
Edit:
Tried with a boolean instead and still got the warning:
Short code that reproduces the warning for me:
import os
import glob
path = 'path'
for filename in glob.glob(os.path.join(path, '*.txt')):
    with open(filename, "r", encoding="utf-8") as ins:
        do_once = False
        for line in ins:
            if "this" in line:
                print("this")
            elif "something_else" in line and do_once == False:
                do_once = True

In order to solve the general case:
What you may be doing
v1 = []
for i in range(n):
    v1.append([randrange(10)])
v2 = []
for i in range(n): # <<< Redeclared i without usage
    v2.append([randrange(10)])
What you can do
v1 = [[randrange(10)] for _ in range(n)] # use dummy variable "_"
v2 = [[randrange(10)] for _ in range(n)]

My guess is that PyCharm is confused by the use of integers as flags. There are several alternatives you could use in your case.
Use a boolean flag instead of an integer:
file_processed = False
for line in file:
    if 'this' in line and not file_processed:
        # do stuff
        file_processed = True
    ...
A better approach would be to simply stop once you have processed something in the file, e.g.:
for filename in [...list...]:
    with open(filename) as f:
        for line in f:
            if 'this_text' in line:
                # Do stuff
                break # Break out of this for loop and go to the next file
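Another way to make the "once per file" intent explicit is to move the per-file work into a function, so the flag is a fresh local on every call. The following is only a minimal sketch: `handle_file` and the `actions` list are hypothetical names standing in for the asker's real processing, and whether PyCharm still reports the inspection may depend on its version.

```python
import io


def handle_file(lines):
    """Process one file's lines; the 'do once' action runs at most once per call."""
    done_once = False
    actions = []  # hypothetical record of what was done, for illustration only
    for line in lines:
        if 'this_text' in line:
            if not done_once:
                actions.append('once')
                done_once = True
            actions.append('this')
        elif 'that_text' in line and not done_once:
            actions.append('once')
            done_once = True
    return actions


# Any iterable of lines works, including an open file object.
sample = io.StringIO("that_text\nthis_text\nthis_text\n")
print(handle_file(sample))
```

Because the flag lives inside the function, there is no need to reset it by hand at the top of each loop iteration.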

Not really an answer, but maybe an explanation:
Apparently, PyCharm is trying to avoid code like
do_once = False
do_once = True
However, it's also flagging normal code like the OP's:
item_found = False
for item in items:
    if item == item_that_i_want:
        item_found = True
if item_found:
    # do something
or, something like
last_message = ''
try:
    # do something
    if success:
        last_message = 'successfully did something'
    else:
        last_message = 'did something without success'
    # do something else
    if success:
        last_message = '2nd something was successful'
    else:
        last_message = '2nd something was not successful'
    # and so on
print(last_message)
The Redeclared 'last_message' defined above without usage warning will appear on every line where last_message is reassigned without being used in between.
So, the workaround would be different for each case where this is happening:
ignore the warning(s)
print or log the value somewhere after setting it
perhaps make a function to call for setting/retrieving the value
determine if there's an alternate way to accomplish the desired outcome
My code was using the last_message example, and I just removed the code reassigning last_message in each case (though printing after each reassignment also removed the warnings). I was using it for testing to locate a problem, so it wasn't critical. Had I wanted to log the completed actions, I might've used a function to do so instead of reassigning the variable each time.
If I find a way to turn it off or avoid the warning in PyCharm, I'll update this answer.
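For the item_found pattern above, one alternative that avoids reassigning a flag at all is the built-in any(). This is a small sketch under the assumption that only a yes/no answer is needed; `contains` is a hypothetical helper name, and this says nothing definitive about how PyCharm's inspection decides what to flag.

```python
def contains(items, wanted):
    """Return True if any element equals `wanted`, without a reassigned flag."""
    return any(item == wanted for item in items)


if contains(["a", "b", "c"], "b"):
    print("found")
```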

Related

Verify data on a file

I'm trying to make my life easier at work by writing down errors and their solutions. The program works fine for adding new errors, but then I added a function to verify whether an error already exists in the file and then do something with it (not added yet).
The function doesn't work and I don't know why. I tried to debug it but still couldn't find the error. Maybe it is a conceptual error?
Anyway, here's my entire code.
import sys
import os
err = {}
PATH = 'C:/users/userdefault/desktop/errordb.txt'
#def open_file(): #Not yet used
    #file_read = open(PATH, 'r')
    #return file_read
def verify_error(error_number, loglist): #Verify if error exists in file
    for error in loglist:
        if error_number in loglist:
            return True
def dict_error(error_number, solution): #Puts input errors in dict
    err = {error_number: solution}
    return err
def verify_file(): #Verify if file exists. Return True if it does
    archive = os.path.isfile(PATH)
    return archive
def new_error():
    file = open(PATH, 'r') #Opens file in read mode
    loglist = file.readlines()
    file.close()
    found = False
    error_number = input("Error number: ")
    if verify_error(error_number, loglist) == True:
        found = True
        # Add new solution, or another solution.
        pass
    solution = str(input("Solution: "))
    file = open(PATH, 'a')
    error = dict_error(error_number, solution)
    #Writes dict on file
    file.write(str(error))
    file.write("\n")
    file.close()
def main():
    verify = verify_file() #Verify if file exists
    if verify == True:
        new = str.lower(input("New job Y/N: "))
        if new == 'n':
            sys.exit()
        while new == 'y':
            new_error()
            new = str.lower(input("New job Y/N: "))
        else:
            sys.exit()
    else:
        file = open(PATH, "x")
        file.close()
        main()
main()
To clarify, the program runs without raising an error. It just doesn't behave the way I intended: it is supposed to verify whether a given error number already exists.
Thanks in advance :)
The issue I believe you're having is the fact that you're not actually creating a dictionary object in the file and modifying it but instead creating additional dictionaries every time an error is added then reading them back as a list of strings by using the .readlines() method.
An easier way of doing it would be to create a dictionary if one doesn't exist and append errors to it. I've made a few modifications to your code which should help.
import sys
import os
import json # Import json and use it as the format to store our data in
err = {}
PATH = 'C:/users/userdefault/desktop/errordb.txt'
# You can achieve this by using a context manager
#def open_file(): #Not yet used
    #file_read = open(PATH, 'r')
    #return file_read
def verify_error(error_number, loglist): #Verify if error exists in file
    # Notice how we're looping over keys of your dictionary to check if
    # an error already exists.
    # To access values use loglist[k]
    for k in loglist.keys():
        if error_number == k:
            return True
    return False
def dict_error(loglist, error_number, solution): #Puts input errors in dict
    # Instead of returning a new dictionary, return the existing one
    # with the new error appended to it
    loglist[error_number] = solution
    return loglist
def verify_file(): #Verify if file exists. Return True if it does
    archive = os.path.isfile(PATH)
    return archive
def new_error():
    # Let's move all the variables to the top, makes it easier to read the function
    # Changes made:
    # 1. Changed the way we open and read files, now using a context manager (aka with open() as f:)
    # 2. Added a json parser to store in and read from file in a json format. If data doesn't exist (new file?) create a new dictionary object instead
    # 3. Added an exception to signify that an error has been found in the database (this can be removed to add additional logic if you'd like to do more stuff to the error, etc)
    # 4. Changed the way we write to file, instead of appending a new line we now overwrite the contents with a new updated dictionary that has been serialized into a json format
    found = False
    loglist = None
    # Open file as read-only using a context manager, now we don't have to worry about closing it manually
    with open(PATH, 'r') as f:
        # Let's read the file and run it through a json parser to get a python dictionary
        try:
            loglist = json.loads(f.read())
        except json.decoder.JSONDecodeError:
            loglist = {}
    error_number = input("Error number: ")
    if verify_error(error_number, loglist) is True:
        found = True
        raise Exception('Error exists in the database') # Raise exception if you want to stop loop execution
    # Add new solution, or another solution.
    solution = str(input("Solution: "))
    # This time open in write only and replace the dictionary
    with open(PATH, 'w') as f:
        loglist = dict_error(loglist, error_number, solution)
        # Writes dict on file in json format
        f.write(json.dumps(loglist))
def main():
    verify = verify_file() #Verify if file exists
    if verify == True:
        new = str.lower(input("New job Y/N: "))
        if new == 'n':
            sys.exit()
        while new == 'y':
            new_error()
            new = str.lower(input("New job Y/N: "))
        else:
            sys.exit()
    else:
        with open(PATH, "x") as f:
            pass
        main()
main()
Note that you will have to create a new errordb file for this snippet to work.
Hope this has helped somehow. If you have any further questions hit me up in the comments!
References:
Reading and Writing files in Python
JSON encoder and decoder in Python
I think that there may be a couple of problems with your code, but the first thing I noticed was that you are saving error numbers and solutions as dictionaries in errordb.txt, and when you read them back in you are reading them as a list of strings:
The line:
loglist = file.readlines()
in new_error returns a list of strings, one per line of the file. The check error_number in loglist compares the number against each whole line, so verify_error will never return True.
So you have a couple of choices:
You could modify verify_error to the following:
def verify_error(error_number, loglist): #Verify if error exists in file
    for error in loglist:
        if error_number in error:
            return True
Although, I think that a better solution would be to load errordb.txt as a JSON file, and then you'll have a dictionary. That would look something like:
import json
errordb = {}
with open(PATH) as handle:
    errordb = json.load(handle)
So here are the full set of changes I would make:
import json
def verify_error(error_number, loglist): #Verify if error exists in file
    for error in loglist:
        if error_number in error:
            return True
def new_error():
    errordb = list()
    existing = list()
    with open(PATH) as handle:
        existing = json.load(handle)
    errordb += existing
    error_number = input("Error number: ")
    if verify_error(error_number, errordb) == True:
        # Add new solution, or another solution.
        print("I might do something here.")
    else:
        solution = str(input("Solution: "))
        # Use a dict (key: value); a set is not JSON serializable
        errordb.append({error_number: solution})
    #Writes dict on file
    with open(PATH, "w") as handle:
        json.dump(errordb, handle)
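Both answers come down to the same idea: keep a single JSON object on disk and look error numbers up by key. Here is a self-contained sketch of that idea, assuming one dictionary mapping error number to solution. The helper names `load_db` and `add_error` and the temporary file are hypothetical and chosen only so the example runs without touching the real errordb.txt.

```python
import json
import os
import tempfile


def load_db(path):
    """Return the stored {error_number: solution} dict, or {} if absent/empty."""
    if not os.path.exists(path) or os.path.getsize(path) == 0:
        return {}
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def add_error(path, error_number, solution):
    """Add an entry unless the number already exists; return True if added."""
    db = load_db(path)
    if error_number in db:
        return False
    db[error_number] = solution
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(db, handle, indent=2)
    return True


# Demonstration against a throwaway file rather than the real errordb.txt.
demo_path = os.path.join(tempfile.mkdtemp(), "errordb.json")
print(add_error(demo_path, "404", "check the URL"))   # first insert
print(add_error(demo_path, "404", "something else"))  # duplicate number
print(load_db(demo_path))
```

Returning a boolean instead of raising leaves it to the caller to decide what "already exists" should mean, for example prompting for an additional solution.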

Try statement not running as I expect

I have three functions: readHeader, which reads the header of a txt file; readFile, which reads the contents of the file; and exceptionNH, which compares the file name with the header and raises an exception if the two are not compatible (e.g. if the date in the name is not the same as the one in the header).
Here are the three functions and a txt example:
def readHeader(fileName):
    fileIn = open(fileName, "r")
    fileIn.readline()
    day = fileIn.readline().replace("\n", "")
    fileIn.readline()
    time = fileIn.readline().replace("\n", "")
    fileIn.readline()
    company = fileIn.readline().replace("\n", "")
    scope = fileIn.readline().replace(":", "").replace("\n", "")
    fileIn.close()
    return (day, time, company, scope)
def readFile(fileName):
    expertsList = []
    expertsList.append(readHeader(fileName))
    fileIn = open(fileName, "r")
    for line_counter in range(LNHEADER):
        fileIn.readline()
    fileIn.close()
    return expertsList
def exceptionNH(fileName):
    try:
        assert fileName[10:17] == readFile(fileName)[3][0].lower().replace(":", "")
    except AssertionError:
        print("Error in input file: inconsistent name and header in file", fileName,".")
        exit()
fileName = "file.txt"
exceptionNH("2018y03m28experts10h30.txt")
2018y03m28experts10h30.txt:
Day:
2018-03-28
Time:
10:30
Company:
XXX
Experts:
...
...
My problem is that I expect the assert in the try statement to evaluate the comparison as True and skip the except clause, but this is not happening.
I suspect that the .lower() is not working but I can't understand why.
If you see other things that could be better, feel free to share, as I'm new to Python and want to improve.
I've found the error. I thought that to get an item from the first tuple inside a list I had to write list[position of item][position of tuple], when it is actually the reverse.
Following mkrieger1's advice, I printed fileName[10:17] and readFile(fileName)[3][0].lower().replace(":", ""). The first was correct, but the second was not returning the item at index 3 of the first tuple (the one from readHeader); it was returning the first item of the list element at index 3.
I've changed readFile(fileName)[3][0].lower().replace(":", "") to readFile(fileName)[0][3].lower().replace(":", "") and it's working now. Thank you for the help.
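The index order can be checked in isolation. This small sketch uses a hand-written list shaped like the one readFile returns for the sample file (a single header tuple as the first element); the variable names are illustrative only.

```python
# One header tuple stored as the first element of a list, as readFile builds it.
experts_list = [("2018-03-28", "10:30", "XXX", "Experts")]

# The outer index selects the tuple, the inner index selects the field.
scope = experts_list[0][3].lower().replace(":", "")
file_name = "2018y03m28experts10h30.txt"

print(scope)
print(file_name[10:17] == scope)
```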

I need a shortcut

I'm trying to write a simple script that filters out emails from certain domains. It works, but I need a shortcut because I don't want to write so many if and elif statements. Can anyone tell me how to rewrite the script with a function so that it becomes shorter and easier? Thanks in advance. The script is below:
f_location = 'C:/Users/Jack The Reaper/Desktop/mix.txt'
text = open(f_location)
good = open('C:/Users/Jack The Reaper/Desktop/good.txt','w')
for line in text:
    if '#yahoo' in line:
        yahoo = None
    elif '#gmail' in line:
        gmail = None
    elif '#yahoo' in line:
        yahoo = None
    elif '#live' in line:
        live = None
    elif '#outlook' in line:
        outlook = None
    elif '#hotmail' in line:
        hotmail = None
    elif '#aol' in line:
        aol = None
    else:
        if ' ' in line:
            good.write(line.strip(' '))
        elif '' in line:
            good.write(line.strip(''))
        else:
            good.write(line)
text.close()
good.close()
I would suggest using a dict for this instead of having separate variables for all the cases.
my_dict = {}
...
if '#yahoo' in line:
    my_dict['yahoo'] = None
But if you want to do it the way you described in the question, you can do it as shown below:
email_domains = ['#yahoo', '#gmail', '#live', '#outlook', '#hotmail', '#aol']
for e in email_domains:
    if e in line:
        locals()[e[1:]] = None
        #if you use dict, use the below line
        #my_dict[e[1:]] = None
locals() returns a dictionary of the current namespace. The keys in this dict are the variable names and the values are the values of those variables.
So, at module level, locals()['gmail'] = None creates a variable named gmail (if it doesn't exist) and assigns it None. Inside a function, writes to the locals() dictionary are not guaranteed to affect the actual local variables, which is another reason to prefer the dict.
Based on the problem as you stated it and the sample file you provided, I have two solutions: a one-line solution and a more detailed one.
First, import the re module and define the regex pattern:
import re
pattern=r'.+#(?!gmail|yahoo|aol|hotmail|live|outlook).+'
Now detailed version code:
emails=[]
with open('emails.txt','r') as f:
    for line in f:
        match=re.finditer(pattern,line)
        for find in match:
            emails.append(find.group())
with open('result.txt','w') as f:
    f.write('\n'.join(emails))
output in result.txt file :
nic-os9#gmx.de
angelique.charuel#sfr.fr
nannik#interia.pl
l.andrioli#freenet.de
kamil_sieminski8#o2.pl
hugo.lebrun.basket#orange.fr
One-line solution, if you want something very short:
with open('result.txt','w') as file:
    file.write('\n'.join([find.group() for line in open('emails.txt','r') for find in re.finditer(pattern,line)]))
output:
nic-os9#gmx.de
angelique.charuel#sfr.fr
nannik#interia.pl
l.andrioli#freenet.de
kamil_sieminski8#o2.pl
hugo.lebrun.basket#orange.fr
P.S.: With the one-line solution, the input file is not closed explicitly. Python normally cleans it up on its own, so it is usually not a big issue (though not always), and you can still use it if you want.
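If a regex feels heavy, the chain of if/elif tests from the question can also be collapsed with any() over a tuple of markers. This is a sketch, not the asker's script: `BLOCKED_DOMAINS`, `keep_line` and `filter_lines` are hypothetical names, and the `#` separator is kept exactly as it appears in the sample data.

```python
BLOCKED_DOMAINS = ("#yahoo", "#gmail", "#live", "#outlook", "#hotmail", "#aol")


def keep_line(line):
    """True when the line mentions none of the blocked domain markers."""
    return not any(domain in line for domain in BLOCKED_DOMAINS)


def filter_lines(lines):
    """Return stripped, non-empty lines that pass keep_line."""
    return [line.strip() for line in lines if line.strip() and keep_line(line)]


sample = ["a#gmail.com\n", "nic-os9#gmx.de\n", "b#yahoo.com\n", "nannik#interia.pl\n"]
print(filter_lines(sample))
```

Adding or removing a domain then means editing the tuple rather than adding another elif branch.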

reading log files in Python and outputing specific text

I have a piece of code that reads the last line of a log file as the log is being written to. I want to print errors which occur in the logs: basically, start printing when line.startswith('Error') and finish printing when line.startswith('End of Error'). My code is below. Could anybody help me with this, please?
import os
import time
log = r'C:\mylog.log'
file = open(log, 'r')
res = os.stat(log)
size = res[6]
file.seek(size)
while 1:
    where = file.tell()
    line = file.readline()
    if not line:
        time.sleep(1)
        file.seek(where)
    else:
        if line.startswith('Error'):
            #print lines until you come to 'End of Error'
Initialize a flag before the loop:
in_error = False
Then switch it on and off as needed:
if line.startswith('Error'):
    in_error = True
elif line.startswith('End of Error'):
    print(line)
    in_error = False
if in_error:
    print(line)
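The flag logic can be tested separately from the file-tailing loop by wrapping it in a generator. This is a sketch with a hypothetical function name, `error_blocks`, and it makes one small design choice that differs from the snippet above: an 'End of Error' line is emitted only if an error block is actually open.

```python
def error_blocks(lines):
    """Yield every line from an 'Error' line through its 'End of Error' line."""
    in_error = False
    for line in lines:
        if line.startswith('Error'):
            in_error = True
        if in_error:
            yield line
        if line.startswith('End of Error'):
            in_error = False


sample = ["ok\n", "Error: disk\n", "detail\n", "End of Error\n", "ok again\n"]
for line in error_blocks(sample):
    print(line, end="")
```

In the tailing loop, the same three checks would run on each line returned by readline().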
It may be easier to use the subprocess module to simply run tail -F (capital F, available on GNU platforms) and process the output.

Problems with variable referenced before assignment when using os.path.walk

OK. I have some background in Matlab and I'm now switching to Python.
I have this bit of code running under Python 2.6.5 on 64-bit Linux. It walks through directories, finds files named 'GeneralData.dat', retrieves some data from them and stitches them into a new data set:
import pylab as p
import os, re
import linecache as ln
def LoadGenomeMeanSize(arg, dirname, files):
    for file in files:
        filepath = os.path.join(dirname, file)
        if filepath == os.path.join(dirname,'GeneralData.dat'):
            data = p.genfromtxt(filepath)
            if data[-1,4] != 0.0: # checking if data set is OK
                data_chopped = data[1000:-1,:] # removing some of data
                Grand_mean = data_chopped[:,2].mean()
                Grand_STD = p.sqrt((sum(data_chopped[:,4]*data_chopped[:,3]**2) + sum((data_chopped[:,2]-Grand_mean)**2))/sum(data_chopped[:,4]))
            else:
                break
        if filepath == os.path.join(dirname,'ModelParams.dat'):
            l = re.split(" ", ln.getline(filepath, 6))
            turb_param = float(l[2])
            arg.append((Grand_mean, Grand_STD, turb_param))
GrandMeansData = []
os.path.walk(os.getcwd(), LoadGenomeMeanSize, GrandMeansData)
GrandMeansData = sorted(GrandMeansData, key=lambda data_sort: data_sort[2])
TheMeans = p.zeros((len(GrandMeansData), 3 ))
i = 0
for item in GrandMeansData:
    TheMeans[i,0] = item[0]
    TheMeans[i,1] = item[1]
    TheMeans[i,2] = item[2]
    i += 1
print TheMeans # just checking...
# later do some computation on TheMeans in NumPy
And it throws this at me (though I would swear it was working a month ago):
Traceback (most recent call last):
File "/home/User/01_PyScripts/TESTtest.py", line 29, in <module>
os.path.walk(os.getcwd(), LoadGenomeMeanSize, GrandMeansData)
File "/usr/lib/python2.6/posixpath.py", line 233, in walk
walk(name, func, arg)
File "/usr/lib/python2.6/posixpath.py", line 225, in walk
func(arg, top, names)
File "/home/User/01_PyScripts/TESTtest.py", line 26, in LoadGenomeMeanSize
arg.append((Grand_mean, Grand_STD, turb_param))
UnboundLocalError: local variable 'Grand_mean' referenced before assignment
All right... so I went and did some reading and came up with this global variable:
import pylab as p
import os, re
import linecache as ln
Grand_mean = p.nan
Grand_STD = p.nan
def LoadGenomeMeanSize(arg, dirname, files):
    for file in files:
        global Grand_mean
        global Grand_STD
        filepath = os.path.join(dirname, file)
        if filepath == os.path.join(dirname,'GeneralData.dat'):
            data = p.genfromtxt(filepath)
            if data[-1,4] != 0.0: # checking if data set is OK
                data_chopped = data[1000:-1,:] # removing some of data
                Grand_mean = data_chopped[:,2].mean()
                Grand_STD = p.sqrt((sum(data_chopped[:,4]*data_chopped[:,3]**2) + sum((data_chopped[:,2]-Grand_mean)**2))/sum(data_chopped[:,4]))
            else:
                break
        if filepath == os.path.join(dirname,'ModelParams.dat'):
            l = re.split(" ", ln.getline(filepath, 6))
            turb_param = float(l[2])
            arg.append((Grand_mean, Grand_STD, turb_param))
GrandMeansData = []
os.path.walk(os.getcwd(), LoadGenomeMeanSize, GrandMeansData)
GrandMeansData = sorted(GrandMeansData, key=lambda data_sort: data_sort[2])
TheMeans = p.zeros((len(GrandMeansData), 3 ))
i = 0
for item in GrandMeansData:
    TheMeans[i,0] = item[0]
    TheMeans[i,1] = item[1]
    TheMeans[i,2] = item[2]
    i += 1
print TheMeans # just checking...
# later do some computation on TheMeans in NumPy
It does not give error messages. It even produces a file with data... but the data are bloody wrong! I checked some of them manually by running these commands:
import pylab as p
data = p.genfromtxt(filepath)
data_chopped = data[1000:-1,:]
Grand_mean = data_chopped[:,2].mean()
Grand_STD = p.sqrt((sum(data_chopped[:,4]*data_chopped[:,3]**2) \
+ sum((data_chopped[:,2]-Grand_mean)**2))/sum(data_chopped[:,4]))
on selected files. They are different :-(
1) Can anyone explain to me what's wrong?
2) Does anyone know a solution to that?
I'll be grateful for help :-)
Cheers,
PTR
I would say this condition is not passing:
if filepath == os.path.join(dirname,'GeneralData.dat'):
which means you are not getting GeneralData.dat before ModelParams.dat. Maybe you need to sort alphabetically or the file is not there.
I see one issue with the code and the solution that you have provided.
Never hide a "variable referenced before assignment" error by simply making the variable visible.
Try to understand why it happened.
Before you created the global variable "Grand_mean", you were getting an error because you accessed Grand_mean before any value had been assigned to it. In such a case, initializing the variable outside the function and marking it as global only serves to hide the issue.
You see erroneous results because you have made the variable visible by making it global, but the issue continues to exist. Your Grand_mean was never assigned the correct data.
This means that the section of code under "if filepath == os.path.join(dirname,..." was never executed.
Using global is not the right solution. That only makes sense if you do in fact want to reference and assign to the global "Grand_mean" name. The need for disambiguation comes from the way the interpreter prescans for assignment operators in function declarations.
You should start by assigning a default value to Grand_mean within the scope of LoadGenomeMeanSize(). You have 1 of 4 branches to actually assign a value to Grand_mean that has correct semantic meaning within one loop iteration. You are likely running into a case where
if filepath == os.path.join(dirname,'ModelParams.dat'): is true, but either
if filepath == os.path.join(dirname,'GeneralData.dat'): or if data[-1,4] != 0.0: is not. It's likely the second condition that is failing for you. Move the
The quick and dirty answer is you probably need to rearrange your code like this:
...
if filepath == os.path.join(dirname,'GeneralData.dat'):
data = p.genfromtxt(filepath)
if data[-1,4] != 0.0: # checking if data set is OK
data_chopped = data[1000:-1,:] # removing some of data
Grand_mean = data_chopped[:,2].mean()
Grand_STD = p.sqrt((sum(data_chopped[:,4]*data_chopped[:,3]**2) + sum((data_chopped[:,2]-Grand_mean)**2))/sum(data_chopped[:,4]))
if filepath == os.path.join(dirname,'ModelParams.dat'):
l = re.split(" ", ln.getline(filepath, 6))
turb_param = float(l[2])
arg.append((Grand_mean, Grand_STD, turb_param))
else:
break
...
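Putting the answers together, one way to avoid both the UnboundLocalError and stale values leaking between directories is to reset the per-directory values on every directory and append only when both files were found. The sketch below is written for Python 3, where os.path.walk no longer exists and os.walk is the replacement. The function name `collect` and the two loader callbacks are hypothetical stand-ins for the genfromtxt computation and the linecache lookup in the question.

```python
import os
import tempfile


def collect(root, load_general, load_params):
    """Walk `root`; return one (general, param) tuple per directory that has both files."""
    results = []
    for dirname, _subdirs, files in os.walk(root):
        # Fresh values for every directory, so nothing leaks from the previous one.
        general = None
        param = None
        if 'GeneralData.dat' in files:
            general = load_general(os.path.join(dirname, 'GeneralData.dat'))
        if 'ModelParams.dat' in files:
            param = load_params(os.path.join(dirname, 'ModelParams.dat'))
        if general is not None and param is not None:
            results.append((general, param))
    return results


def read_float(path):
    """Hypothetical loader used only for this demonstration."""
    with open(path) as handle:
        return float(handle.read())


# Build a throwaway tree: run1 has both files, run2 lacks GeneralData.dat.
root = tempfile.mkdtemp()
layout = {"run1": {"GeneralData.dat": "1.5", "ModelParams.dat": "0.2"},
          "run2": {"ModelParams.dat": "0.7"}}
for name, contents in layout.items():
    os.mkdir(os.path.join(root, name))
    for fname, text in contents.items():
        with open(os.path.join(root, name, fname), "w") as handle:
            handle.write(text)

print(collect(root, read_float, read_float))
```

Because the checks use membership in `files` rather than the order in which names are listed, the result does not depend on whether GeneralData.dat happens to come before ModelParams.dat.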
