Rename a file using variables in the program - Python


I want to rename a file called decon.out using two variables in my program. So far I have
gwf = input ("Enter value: ")
myList = os.listdir('.')
for myFile in myList:
    if re.match("^HHEMQZ", myFile):
        numE = myFile
    elif re.match("^HHNMQZ", myFile):
        numN = myFile
    else:
        den = myFile
os.rename('decon.out', 'RF'+gwf+''+numE+'')
For example, gwf = 2.5 and numE = HHEMQZ20010101
I would then want decon.out to be renamed as RF2.5HHEMQZ20010101 where RF will always be the same.
Currently when I run the script I get an error:
Traceback (most recent call last):
File "RunDeconv.py", line 77, in <module>
os.rename('decon.out', 'RF'+gwf+''+numE+'')
TypeError: cannot concatenate 'str' and 'float' objects
Any suggestions?

Use raw_input() instead. In Python 2, input() evaluates what you type as Python code, which turns your 2.5 input into a float.

About the error: in the string concatenation
'RF'+gwf+''+numE+''
all the members must be strings.
You can use
type(gwf)
type(numE)
to check which is a number.
Then convert whichever one is not a string:
str(gwf)
or
str(numE)
Calling str() on a value that is already a string is harmless, so you can convert both and write your last line of code like this:
os.rename('decon.out', 'RF'+str(gwf)+''+str(numE)+'')
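As an alternative to chaining + and str(), the name can be built with str.format, which converts each argument for you. This is only a sketch: build_output_name and its prefix parameter are hypothetical names, not part of the original script.

```python
def build_output_name(gwf, station_file, prefix="RF"):
    """Build the new file name; format() converts floats and strings alike."""
    return "{}{}{}".format(prefix, gwf, station_file)


# Works whether gwf arrived as a float (Python 2 input()) or as a string.
print(build_output_name(2.5, "HHEMQZ20010101"))    # prints RF2.5HHEMQZ20010101
print(build_output_name("2.5", "HHEMQZ20010101"))  # prints RF2.5HHEMQZ20010101
```

In the script above this would be used as os.rename('decon.out', build_output_name(gwf, numE)).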

Related

Split a numeric string to a list of integers

I have fetched a list using pandas, but the numbers come back as a single comma-separated string. I am trying to convert it to a list of integers.
excel_frame = read_excel(args.path, sheet_name=1, verbose=True, na_filter=False)
data_need = excel_frame['Dependencies'].tolist()
print(data_need)
intStr = data_need.split(',')
map_list = map(int, intStr)
print(map_list)
I am getting the following error.
$python ExcelCellCSVRead.py -p "C:\MyCave\iso\SDG\Integra\Intest\first.xlsx"
Reading sheet 1
['187045, 187046']
Traceback (most recent call last):
File "ExcelCellCSVRead.py", line 31, in <module>
intStr = data_need.split(',')
AttributeError: 'list' object has no attribute 'split'
The target output must look like this: [187045, 187046]. The current output looks like this: ['187045, 187046']
I am pretty sure I have followed the suggested approach, yet it still throws this error.
Regards

The problem is that
data_need = excel_frame['Dependencies'].tolist()
returns a list, and a list has no split() method. Only the strings inside it do.
Change your existing code to this:
intStr = data_need[0].split(',')  # assumes data_need has exactly one element
map_list = list(map(int, intStr))
print(map_list)
Tested on your sample:
In [1000]: data_need = ['187045, 187046']
In [1001]: intStr = data_need[0].split(',')
In [1002]: map_list = list(map(int, intStr))
In [1003]: print(map_list)
[187045, 187046]
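If the column can hold more than one row, the same idea extends to every element of the list. The following is a minimal sketch under that assumption; parse_int_cells is a hypothetical helper name, and the second cell in the example is made up for illustration.

```python
def parse_int_cells(cells):
    """Flatten cells such as '187045, 187046' into one list of integers."""
    numbers = []
    for cell in cells:
        # str() lets this also accept cells that pandas already parsed as numbers.
        for part in str(cell).split(','):
            if part.strip():  # skip empty cells and trailing commas
                numbers.append(int(part))  # int() ignores surrounding spaces
    return numbers


print(parse_int_cells(['187045, 187046']))       # prints [187045, 187046]
print(parse_int_cells(['187045, 187046', '5']))  # prints [187045, 187046, 5]
```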

ValueError in Python 3 code

I have this code that should count the number of missing rows of numbers within a CSV, for a script in Python 3.6. However, the program fails with the following error:
Error:
Traceback (most recent call last):
File "C:\Users\GapReport.py", line 14, in <module>
EndDoc_Padded, EndDoc_Padded = (int(s.strip()[2:]) for s in line)
File "C:\Users\GapReport.py", line 14, in <genexpr>
EndDoc_Padded, EndDoc_Padded = (int(s.strip()[2:]) for s in line)
ValueError: invalid literal for int() with base 10: 'AC-SEC 000000001'
Code:
import csv
def out(*args):
    print('{},{}'.format(*(str(i).rjust(4, "0") for i in args)))
prev = 0
data = csv.reader(open('Padded Numbers_export.csv'))
print(*next(data), sep=', ') # header
for line in data:
    EndDoc_Padded, EndDoc_Padded = (int(s.strip()[2:]) for s in line)
    if start != prev+1:
        out(prev+1, start-1)
    prev = end
out(start, end)
I'm stumped on how to fix these issues. Also, I think the CSV has many lines in it, so if there's a section that limits it to a few numbers, please let me know.
CSV Snippet (Sorry if I wasn't clear before!):

The values you have in your CSV file are not numeric.
For example, FMAC-SEC 000000001 is not a number. Slicing off the first two characters with s.strip()[2:] leaves 'AC-SEC 000000001', which int() cannot convert.
Some more comments on the code:
What is the utility of doing EndDoc_Padded, EndDoc_Padded = (...)? Currently you are assigning values to two variables with the same name, so the first value is lost. Either name one of them something else, or just have one variable there. The rest of the loop uses start and end, so those are probably the names you want.
Are you trying to get a different value from each column? csv.reader already splits every row on commas, so line is a list with one string per column. If your file uses a different separator, pass it as the delimiter argument to csv.reader.
You are running this inside a loop, so the two variables are overwritten on every iteration and end up holding only the values from the last line. If you're trying to obtain two lists of all the values, this won't work.
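One way to get at the numeric part is to pull the trailing digits out with a regular expression instead of slicing a fixed number of characters. The sketch below assumes every value ends in its document number, as in FMAC-SEC 000000001; doc_number, find_gaps, and the sample rows are hypothetical and not taken from the asker's file.

```python
import re


def doc_number(label):
    """Return the trailing run of digits in a label such as 'FMAC-SEC 000000001'."""
    match = re.search(r"(\d+)\s*$", label)
    if match is None:
        raise ValueError("no trailing number in {!r}".format(label))
    return int(match.group(1))


def find_gaps(ranges):
    """Given (start, end) pairs sorted by start, return the missing ranges between them."""
    gaps = []
    prev_end = None
    for start, end in ranges:
        if prev_end is not None and start > prev_end + 1:
            gaps.append((prev_end + 1, start - 1))
        prev_end = end
    return gaps


# Made-up rows standing in for the two columns of the CSV.
rows = [("FMAC-SEC 000000001", "FMAC-SEC 000000004"),
        ("FMAC-SEC 000000007", "FMAC-SEC 000000009")]
ranges = [(doc_number(first), doc_number(last)) for first, last in rows]
print(find_gaps(ranges))  # prints [(5, 6)]
```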

Name not defined error python when reading file line by line

So, I am very new to python and I am not sure if my code is the most effective, but would still be very appreciative if someone could explain to me why my script is returning the "name not defined" error when I run it. I have a list of 300 gene names in a separate file, one name per line, that I want to read, and store each line as a string variable.
Within the script I have a list of 600 variables. 300 variables labeled name_bitscore and 300 labeled name_length for each of the 300 names.
I want to filter through the list based on a condition. My script looks like this:
#!/usr/bin/python
with open("seqnames-test1-iso-legal-temp.txt") as f:
    for line in f:
        exec("b="+line+"_bitscore")
        exec("l="+line+"_length")
        if 0.5*b <= 2*1.05*l and 0.5*b >= 2*0.95*l:
            print line
ham_pb_length=2973
ham_pb_bitscore=2165
g2225_ph_length=3303
cg2225_ph_bitscore=2278
etc. for the length and bitscore variables.
Essentially, what I am trying to do here is read line 1 of the file "seqnames-test1-iso-legal-temp.txt", which is ham_pb. Then I wanted to use the exec function to create the variables b=ham_pb_bitscore and l=ham_pb_length, so that I could test whether half the value of the gene's bitscore is within the range of double its length, with a 5% margin of error. Then, repeat this for every gene, i.e. every line of the file "seqnames-test1-iso-legal-temp.txt".
When I execute the script, I get the error message:
Traceback (most recent call last):
File "duplicatebittest.py", line 4, in <module>
exec("b="+line+"_bitscore")
File "<string>", line 1, in <module>
NameError: name 'ham_pb' is not defined
I made another short script to make sure I was using the exec function correctly that looks like this:
#!/usr/bin/python
name="string"
string_value=4
exec("b="+name+"_value")
print(name)
print(b)
And this returns:
string
4
So, I know that I can use exec to include a string variable in a variable declaration because b returns 4 as expected. So, I am not sure why I get an error in my first script.
I tested to make sure the variable line was a string by entering
#!/usr/bin/python
with open("seqnames-test1-iso-legal-temp.txt") as f:
    for line in f:
        print type(line)
And it returned the line
<type 'str'>
300 times, so I know each variable line is a string, which is why I don't understand why my test script worked, but this one did not.
Any help would be super appreciated!

line is yielded by the text file iterator, which keeps the trailing newline on each line it reads.
So your expression:
exec("b="+line+"_bitscore")
is passed to exec as:
b=ham_pb
_bitscore
Strip the trailing newline from the line and it will work:
exec("b="+line.rstrip()+"_bitscore")
provided that you move the following lines before the loop so variables are declared:
ham_pb_length=2973
ham_pb_bitscore=2165
g2225_ph_length=3303
cg2225_ph_bitscore=2278
Better: quit using exec and use dictionaries to avoid defining variables dynamically.
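The trailing newline is easy to see with repr(). In this small sketch an io.StringIO object stands in for the gene-name file (an assumption made so the example runs on its own), and the two names are the ones from the question.

```python
import io

# io.StringIO behaves like an open text file, so iterating it yields lines.
fake_file = io.StringIO(u"ham_pb\ncg2225_ph\n")

raw_lines = list(fake_file)
names = [line.rstrip() for line in raw_lines]

print(repr(raw_lines[0]))  # the line still ends with the newline character
print(names)               # the stripped names are safe to use as keys
```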

Put #!/usr/bin/env python as the first line. See this question for more explanation.
As Jean pointed out, exec is not the right tool for this job. You should be using dictionaries: they are less dangerous (search for "code injection") and easier to read. Here's an example of how to use dictionaries, taken from the Python documentation:
>>> tel = {'jack': 4098, 'sape': 4139}
>>> tel['guido'] = 4127
>>> tel
{'sape': 4139, 'guido': 4127, 'jack': 4098}
>>> tel['jack']
4098
>>> del tel['sape']
>>> tel['irv'] = 4127
>>> tel
{'guido': 4127, 'irv': 4127, 'jack': 4098}
>>> list(tel.keys())
['irv', 'guido', 'jack']
>>> sorted(tel.keys())
['guido', 'irv', 'jack']
>>> 'guido' in tel
True
>>> 'jack' not in tel
False
Here's a way I can think of to accomplish your goal:
gene_data = {'ham_pb_length': 2973, 'ham_pb_bitscore': 2165,
             'g2225_ph_length': 3303, 'cg2225_ph_bitscore': 2278}
# If you have more gene data, add it to the dictionary literal above.
with open("seqnames-test1-iso-legal-temp.txt") as f:
    for line in f:
        name = line.strip()
        if name:  # skip blank lines
            bitscore = gene_data[name + '_bitscore']
            length = gene_data[name + '_length']
            # 4.0 forces float division on Python 2 as well
            if 0.95*length <= bitscore/4.0 <= 1.05*length:
                print(name)
I take advantage of a few useful Python features here. In Python 3, 5/7 evaluates to 0.7142857142857143, not 0 as in Python 2 and many other programming languages. If you want integer division in Python 3, use 5//7. Additionally, comparisons chain in Python: 1<2<3 evaluates to True and 1<3<2 evaluates to False, whereas in many programming languages 1<2<3 is evaluated as True<3, which might give an error or evaluate to True depending on the language.
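A variation on the same idea is to key the dictionary by gene name and keep length and bitscore together, so no key strings have to be assembled at all. This is a sketch under assumptions: the nested layout, the helper names, and the demo_gene entry are hypothetical, and the cg2225_ph entry assumes that gene's length is the 3303 value from the question.

```python
genes = {
    "ham_pb": {"length": 2973, "bitscore": 2165},
    "cg2225_ph": {"length": 3303, "bitscore": 2278},
    "demo_gene": {"length": 500, "bitscore": 2000},  # made-up entry that passes the test
}


def within_margin(bitscore, length, margin=0.05):
    """True if bitscore/4 lies within +/- margin of length (same test as above)."""
    return (1 - margin) * length <= bitscore / 4.0 <= (1 + margin) * length


def matching_names(lines, genes):
    """Return the names from lines whose bitscore/length pass the margin test."""
    matches = []
    for line in lines:
        name = line.strip()
        if name in genes and within_margin(genes[name]["bitscore"],
                                           genes[name]["length"]):
            matches.append(name)
    return matches


# A list of lines stands in for the open file; blank lines are skipped.
print(matching_names(["ham_pb\n", "cg2225_ph\n", "\n", "demo_gene\n"], genes))
# prints ['demo_gene']
```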

TypeError: '_io.TextIOWrapper' object is not subscriptable

The code should open a file and compute the median of its contents. This is my code:
def medianStrat(lst):
    count = 0
    test = []
    for line in lst:
        test += line.split()
    for i in lst:
        count = count +1
    if count % 2 == 0:
        x = count//2
        y = lst[x]
        z = lst[x-1]
        median = (y + z)/2
        return median
    if count %2 == 1:
        x = (count-1)//2
        return lst[x] # Where the problem persists
def main():
    lst = open(input("Input file name: "), "r")
    print(medianStrat(lst))
Here is the error I get:
Traceback (most recent call last):
File "C:/Users/honte_000/PycharmProjects/Comp Sci/2015/2015/storelocation.py", line 30, in <module>
main()
File "C:/Users/honte_000/PycharmProjects/Comp Sci/2015/2015/storelocation.py", line 28, in main
print(medianStrat(lst))
File "C:/Users/honte_000/PycharmProjects/Comp Sci/2015/2015/storelocation.py", line 24, in medianStrat
return lst[x]
TypeError: '_io.TextIOWrapper' object is not subscriptable
I know lst[x] is causing this problem but not too sure how to solve this one.
So what could be the solution to this problem or what could be done instead to make the code work?

You can't index (__getitem__) a _io.TextIOWrapper object. What you can do is work with a list of lines. Try this in your code:
lst = open(input("Input file name: "), "r").readlines()
Also, you aren't closing the file object. This would be better:
with open(input("Input file name: "), "r") as lst:
    print(medianStrat(lst.readlines()))
with ensures that the file gets closed.
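Note that readlines() still returns strings, so the numbers have to be converted before they can be averaged. A minimal sketch of a median over the lines of a file follows; it assumes the file holds whitespace-separated numbers and that the values should be sorted first (the original function does not sort), and median_from_lines is a hypothetical name.

```python
def median_from_lines(lines):
    """Return the median of all whitespace-separated numbers in lines."""
    values = sorted(float(token) for line in lines for token in line.split())
    if not values:
        raise ValueError("no numbers found")
    mid = len(values) // 2
    if len(values) % 2 == 1:
        return values[mid]  # odd count: the middle value
    return (values[mid - 1] + values[mid]) / 2.0  # even count: mean of the two middle values


# Lists of strings stand in for the result of readlines().
print(median_from_lines(["3 1\n", "2\n"]))              # prints 2.0
print(median_from_lines(["4\n", "1\n", "3\n", "2\n"]))  # prints 2.5
```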

A basic error on my end, but I'm sharing it in case anyone else finds it useful. The difference between data types really matters: just because something looks like JSON doesn't mean it is JSON. I ended up on this answer after learning that the hard way.
The open file stream has to be parsed with json.load before you have a dict; until then it is just a text stream. Once it is a dict, it can be loaded into a DataFrame.
import json
import logging
import pandas as pd

def load_json():  # loads the JSON file and returns it as a DataFrame
    with open("1lumen.com.json", "r") as io_str:
        data = json.load(io_str)
    df = pd.DataFrame.from_dict(data)
    logging.info(df.columns.tolist())
    return df

TypeError: not all arguments converted during string formatting

def main():
    spiral = open('spiral.txt', 'r') # open input text file
    dim = spiral.readline() # read first line of text
    print(dim)
    if (dim % 2 == 0): # check to see if even
        dim += 1 # make odd
I know this is probably very obvious but I can't figure out what is going on. I am reading a file that simply has one number and checking to see if it is even. I know it is being read correctly because it prints out 10 when I call it to print dim. But then it says:
TypeError: not all arguments converted during string formatting
for the line in which I am testing to see if dim is even. I'm sure it's basic but I can't figure it out.

The readline method of file objects always returns a string; it will not convert the number into an integer for you. You need to do this explicitly:
dim = int(spiral.readline())
Otherwise, dim will be a string and doing dim % 2 will cause Python to try to perform string formatting with 2 as an argument:
>>> '10' % 2
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: not all arguments converted during string formatting
>>>
Also, print(dim) output 10 instead of '10' because print writes the string's contents without the quotes that the interactive prompt shows around a string value:
>>> print('10')
10
>>>
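Putting the fix together, here is a small sketch of reading the number and making it odd. An io.StringIO object stands in for spiral.txt so the example is self-contained, and read_odd_dimension is a hypothetical helper name.

```python
import io


def read_odd_dimension(stream):
    """Read the first line as an integer and bump it to the next odd number if even."""
    dim = int(stream.readline())  # int() ignores the trailing newline
    if dim % 2 == 0:
        dim += 1
    return dim


print(read_odd_dimension(io.StringIO(u"10\n")))  # prints 11
print(read_odd_dimension(io.StringIO(u"7\n")))   # prints 7
```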
