Parameters feeding input variables in Python

Beginning Python guy here. Have some code I need help with.
My main question: in this bit of code we have three assignments, for mm, yy, and yyyy.
The 'find_dnb =' statement references 'dnb*{0}*{1}*', with {0} and {1} being the 1st and 2nd arguments passed to format().
mm = hadoop.date_part()['mm']
yy = hadoop.date_part()['yy']
yyyy = hadoop.date_part()['yyyy']
hdfs_folder = '/sandbox/US_MARKETING/COMMON_DATA/BAU/FILES/{0}/{1}'.format(yyyy, mm)
find_dnb = hadoop.file_find(file='dnb*{0}*{1}*'.format(mm, yy), folder = hadoop._xfer_in_hadoop['dnb'])
print('dnb*{0}*{1}*')
I'm assuming {0} and {1} should be what are populated by mm and yy respectively.
But when I try to print out the string for it:
print('dnb*{0}{1}')
I get just the literal 'dnb*{0}{1}' as output.
Shouldn't I get a month and a year?

In your print statement you never called .format() on the string, so the placeholders were not replaced. The .format() call on the find_dnb line produced a new string for that one call only; it did not change the string literal anywhere else.
Therefore, your print should be formatted as well:
print('dnb*{0}*{1}*'.format(mm, yy))
In Python 3.6+, formatted string literals (f-strings) were introduced, letting you do things such as:
print(f'dnb*{mm}*{yy}*')
Notice the f before the opening quote. f-strings let you embed expressions in the string inside curly brackets {}.
You can also use it on your find_dnb line:
find_dnb = hadoop.file_find(file=f'dnb*{mm}*{yy}*', folder = hadoop._xfer_in_hadoop['dnb'])
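To see the difference side by side, here is a minimal sketch. The values of mm and yy are hypothetical stand-ins, since the real ones come from hadoop.date_part(), which is not available here.

```python
# Hypothetical stand-ins for hadoop.date_part()['mm'] and ['yy'].
mm, yy = "07", "21"

template = "dnb*{0}*{1}*"

# Printing the literal as-is leaves the placeholders untouched.
unformatted = template

# Calling .format() returns a NEW string with the placeholders filled in.
with_format = template.format(mm, yy)

# An f-string fills in the values where the literal is written.
with_fstring = f"dnb*{mm}*{yy}*"

print(unformatted)   # dnb*{0}*{1}*
print(with_format)   # dnb*07*21*
print(with_fstring)  # dnb*07*21*
```

Strings are immutable, so .format() never modifies the template; it only returns the filled-in copy.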


Read text file to create list and then convert to dictionary python

I am reading text file using
source= open(curr_file,"r")
lines= source.readlines()
This reads every line of my text file into a list, but some items in my list are shown with double quotes while others are shown with single quotes, like below.
['[INFO] Name: Xxxx, section: yyyy, time: 21.2, status: 0\n', "proof:proof1,table: db.table_name,columns:['column_1'],count:10,status:SUCCESS\n",'run time: 30 seconds\n']
The first item in list is created with single quotes, while the second is created with double quotes.
When trying to convert the above to a dictionary
new_line = dict(x.split(":", 1) for x in line.split(","))
it gives me a ValueError:
ValueError: dictionary update sequence element has length 1; 2 is required
I think the error occurs because the entire string in double quotes is treated as a single value, so it cannot be converted to a dictionary.
Is there a way to convert it to single quotes instead of double? I tried using replace and strip, but nothing helps.
Expected output:
{
Name:Xxxx,
section:yyyy,
time:21.2,
proof:proof1
table:db.table_name
status: success
}
The quotes have nothing to do with the error. The outer quotes of each line are not part of the str object; they are only printed so you know it is a str. The single quotes are switched to double quotes because the content itself contains single quotes, so single quotes cannot be used to delimit the str. But again, that is only a change in what is printed, not in what is stored in memory.
Try to do it in steps and print the intermediate objects you get to debug the program.
for line in lines:  # prints nicer than print(lines)
    print(line)
line = lines[0]  # work on one line (a str) at a time
arg = [x.split(":", 1) for x in line.split(",")]
for x in arg:
    print(x)
new_line = dict(arg)
Each printed item should be a list with two elements; an item with only one element is what raises the ValueError.
To convert one line (a str) to a dict in a single step, you can pass a generator expression to dict():
new_line = dict(x.split(":", 1) for x in line.split(","))
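As a rough sketch of the step-by-step approach above, the helper below (parse_line is a hypothetical name, not from the question) splits one line into key/value pairs, strips whitespace, and skips any piece that has no colon instead of raising ValueError. It assumes values never contain commas, which holds for the sample lines shown.

```python
def parse_line(line):
    """Turn 'key: value, key: value' text into a dict (hypothetical helper)."""
    result = {}
    for part in line.strip().split(","):
        if ":" not in part:
            # A piece without a colon would give a 1-element list and
            # trigger "dictionary update sequence element has length 1".
            continue
        key, value = part.split(":", 1)
        result[key.strip()] = value.strip()
    return result


first = parse_line("[INFO] Name: Xxxx, section: yyyy, time: 21.2, status: 0\n")
second = parse_line(
    "proof:proof1,table: db.table_name,columns:['column_1'],count:10,status:SUCCESS\n"
)
print(first)
print(second)
```

Note that the key of the first pair still carries the "[INFO] " prefix; removing it would need one more step that depends on the log format.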

Does string formatting not work within an input() function?

My code:
new_account = sys.argv[1]
confirm_new = input("Would you like to add {} to the dictionary?" +
                    "\ny or n\n".format(new_account))
This doesn't format the string to place the variable in place of {}. What's up?
This has nothing to do with input. It's just that addition has lower precedence than method calls, so .format() is applied only to the second string literal:
>>> "{}" + "b".format('a')
'{}b'
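The same effect can be reproduced outside input(). This is a small sketch; new_account is given a hypothetical value here in place of sys.argv[1].

```python
new_account = "alice"  # hypothetical value standing in for sys.argv[1]

# .format() binds to the second literal only, so the {} in the first is untouched.
unformatted = "Add {}?" + " y or n".format(new_account)

# Parentheses make .format() apply to the whole concatenated string.
formatted = ("Add {}?" + " y or n").format(new_account)

print(unformatted)  # Add {}? y or n
print(formatted)    # Add alice? y or n
```

Wrapping the concatenation in parentheses is one way to fix it while keeping the + operator.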
Normally I just use implicit string literal concatenation if I have a multi-line string (just omit the +), so that .format() applies to the whole combined string:
confirm_new = input("Would you like to add {} to the dictionary?"
                    "\ny or n\n".format(new_account))

Bell character as Fields separator in Python print output

I am fairly new to Python and need a little help here.
I have a Python script running on Python 2.6 that parses some JSON.
Example Code:
if "prid" in data[p]["prdts"][n]:
    print data[p]["products"][n]["prid"],
if "metrics" in data[p]["prdts"][n]:
    lenmet = len(data[p]["prdts"][n]["metrics"])
    i = 0
    while (i < lenmet):
        if (data[p]["prdts"][n]["metrics"][i]["metricId"] == "price"):
            print data[p]["prdts"][n]["metrics"][i]["metricValue"]["value"]
            break
Now, this prints values in 2 columns:
prid price
123 20
234 40
As you can see, the field separator above is ' '. How can I use a field separator such as the BEL character in the output?
Sample expected output:
prid price
123^G20
234^G40
FWIW, your while loop doesn't increment i, so it will loop forever, but I assume that was just a copy & paste error, and I'll ignore it in the rest of my answer.
If you want to use two separate print statements to print your data on one line, you can't avoid the space produced by the first print statement. Instead, simply save the prid data until you can print it with the price in one go using string concatenation. E.g.:
if "prid" in data[p]["prdts"][n]:
    prid = data[p]["products"][n]["prid"]
if "metrics" in data[p]["prdts"][n]:
    lenmet = len(data[p]["prdts"][n]["metrics"])
    i = 0
    while (i < lenmet):
        if (data[p]["prdts"][n]["metrics"][i]["metricId"] == "price"):
            price = data[p]["prdts"][n]["metrics"][i]["metricValue"]["value"]
            print str(prid) + '\a' + str(price)
            break
Note that I'm explicitly converting the prid and price to string. Obviously, if either of those items is already a string then you don't need to wrap it in str(). Normally, we can let print convert objects to string for us, but we can't do
print prid, '\a', price
here because that will give us an unwanted space between each item.
Another approach is to make use of the new print() function, which we can import using a __future__ import at the top of the script, before other imports:
from __future__ import print_function
# ...
if "prid" in data[p]["prdts"][n]:
    print(data[p]["products"][n]["prid"], end='\a')
if "metrics" in data[p]["prdts"][n]:
    lenmet = len(data[p]["prdts"][n]["metrics"])
    i = 0
    while (i < lenmet):
        if (data[p]["prdts"][n]["metrics"][i]["metricId"] == "price"):
            print(data[p]["prdts"][n]["metrics"][i]["metricValue"]["value"])
            break
I don't understand why you want to use BEL as a separator rather than something more conventional, eg TAB. The BEL char may print as ^G in your terminal, but it's invisible in mine, and if you save this output to a text file it may not display correctly in a text viewer / editor.
BTW, it would have been better if you had posted a Minimal, Complete, Verifiable Example that focuses on your actual printing problem, rather than all that crazy JSON parsing stuff, which just makes your question look more complicated than it really is, and makes it impossible to test your code or our modifications to it.
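For readers on Python 3, the separator can be handled without any concatenation. This is only a sketch of the printing step: the rows list is hypothetical sample data matching the expected output in the question, not the real JSON structure.

```python
BEL = "\a"  # the BEL control character, same as "\x07"

# Hypothetical (prid, price) pairs standing in for the parsed JSON values.
rows = [(123, 20), (234, 40)]

# Option 1: build each output line with str.join.
joined = [BEL.join(str(field) for field in row) for row in rows]

# Option 2: let print() insert the separator between its arguments.
for prid, price in rows:
    print(prid, price, sep=BEL)
```

Both options convert the values to strings for you; swapping BEL for "\t" gives the more conventional tab-separated output.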

replacing text in a file, Python

So this piece of code is meant to take a line from a file and replace that line with a new word/number, but it doesn't seem to work :(
else:
    with open('newfile', 'r+') as myfile:
        x = input("what would you like to change: \nname \ncolour \nnumber \nenter option:")
        if x == "name":
            print("your current name is:")
            test_lines = myfile.readlines()
            print(test_lines[0])
            y = input("change name to:")
            content = (y)
            myfile.write(str.replace((test_lines[0]), str(content)))
I get the error message TypeError: replace() takes at least 2 arguments (1 given). I don't know why (content) is not accepted as an argument. This also happens for the code below:
        if x == "number":
            print("your current fav. number is:")
            test_lines = myfile.readlines()
            print(test_lines[2])
            number = (int(input("times fav number by a number to get your new number \ne.g 5*2 = 10 \nnew number:")))
            result = (int(test_lines[2]) * (number))
            print(result)
            myfile.write(str.replace((test_lines[2]), str(result)))
    f = open('newfile', 'r')
    print("now we will print the file:")
    for line in f:
        print(line)
    f.close()
replace is a method of a str object.
It sounds like you want to do something like this (a guess, since I don't know your inputs):
test_lines[0].replace(test_lines[0], str(content))
I'm not sure what you're attempting to accomplish with the logic in there. It looks like you want to remove that line completely and replace it?
I'm also unsure what you are trying to do with
content = (y)
The output of input is already a str (which is what you want).
EDIT:
In your specific case (replacing a whole line) I would suggest just reassigning that item in the list, e.g.
test_lines[0] = content
To overwrite the file, make your changes in memory, seek back to the beginning, rewrite everything, and then truncate, so that no leftover bytes from the old, longer contents remain at the end.
# Your logic for replacing the line or desired changes
myfile.seek(0)
for l in test_lines:
    # readlines() keeps each line's trailing newline, so strip it before adding one
    myfile.write("%s\n" % l.rstrip("\n"))
myfile.truncate()
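Put together, the read, modify, seek, write, truncate sequence might look like the sketch below. The helper name replace_line and the temporary demo file are assumptions made for illustration; they are not part of the original program.

```python
import os
import tempfile


def replace_line(path, index, new_text):
    """Replace one whole line of a text file in place (hypothetical helper)."""
    with open(path, "r+") as f:
        lines = f.readlines()
        # Make sure the replacement ends with exactly one newline.
        lines[index] = new_text.rstrip("\n") + "\n"
        f.seek(0)            # go back to the start of the file
        f.writelines(lines)  # rewrite everything
        f.truncate()         # drop leftover bytes if the file got shorter


# Demo on a temporary file so nothing real is overwritten.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("alice\nblue\n7\n")

replace_line(demo_path, 0, "bob")

with open(demo_path) as f:
    demo_result = f.read()
os.remove(demo_path)
print(demo_result)
```

Without the truncate() call, the shorter new contents would be followed by the tail of the old contents.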
Try this:
test_lines = myfile.readlines()
print(test_lines[0])
y = input("change name to:")
content = str(y)
myfile.write(test_lines[0].replace(test_lines[0], content))
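str on its own is the type, not one of your strings. Calling str.replace(test_lines[0], str(content)) treats test_lines[0] as the string to operate on and str(content) as old, so the new argument is missing, hence the TypeError. Call replace() on the string object itself, for example test_lines[0].
This should get rid of the error, although you may still need to change your actual program flow.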
You need to call it as test_lines[0].replace(test_lines[0], str(content)).
Calling help(str.replace) at the interpreter gives:
replace(...)
    S.replace(old, new[, count]) -> str
    Return a copy of S with all occurrences of substring
    old replaced by new. If the optional argument count is
    given, only the first count occurrences are replaced.
I couldn't find the online docs.
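A short sketch of the two calling styles, using a made-up name_line value, shows why the original call fails and that replace() returns a new string rather than changing the old one.

```python
name_line = "alice\n"  # hypothetical first line of the file

# Method call on the string object: old and new are the two arguments.
via_method = name_line.replace("alice", "bob")

# Equivalent call through the type: the string itself is passed first.
via_type = str.replace(name_line, "alice", "bob")

# The call from the question passes the string plus only ONE more argument,
# so 'new' is missing and Python raises TypeError.
error_text = None
try:
    str.replace(name_line, "bob")
except TypeError as exc:
    error_text = str(exc)

print(via_method == via_type)  # True
print(repr(name_line))         # 'alice\n' (the original is unchanged)
print(error_text)
```

The exact wording of the TypeError message varies between Python versions.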

difflib python formatting

I am using this code to find the differences between two CSV files and have some formatting questions. This is probably an easy fix, but I am new, trying to learn, and having a lot of problems.
import difflib
diff = difflib.ndiff(open('test1.csv', "rb").readlines(), open('test2.csv', "rb").readlines())
try:
    while 1:
        print diff.next(),
except:
    pass
the code works fine and I get the output I am looking for as:
  Group,Symbol,Total
- Adam,apple,3850
?            ^
+ Adam,apple,2850
?            ^
  bob,orange,-45
  bob,lemon,66
  bob,appl,-56
  bob,,88
My question is: how do I clean the formatting up? Can I make Group, Symbol, Total into separate columns and line up the text below them?
Also, can I change the ? to text I choose, such as test 1 and test 2, to show which sheet each line comes from?
Thanks for any help.
Using difflib.unified_diff gives much cleaner output; see below.
Also, both difflib.ndiff and difflib.unified_diff return a generator, which you can use directly in a for loop and which knows when to stop, so you don't have to handle exceptions yourself. N.B.: the comma after line in the print statement prevents print from adding another newline.
import difflib
s1 = ['Adam,apple,3850\n', 'bob,orange,-45\n', 'bob,lemon,66\n',
      'bob,appl,-56\n', 'bob,,88\n']
s2 = ['Adam,apple,2850\n', 'bob,orange,-45\n', 'bob,lemon,66\n',
      'bob,appl,-56\n', 'bob,,88\n']
for line in difflib.unified_diff(s1, s2, fromfile='test1.csv',
                                 tofile='test2.csv'):
    print line,
This gives:
--- test1.csv
+++ test2.csv
@@ -1,4 +1,4 @@
-Adam,apple,3850
+Adam,apple,2850
 bob,orange,-45
 bob,lemon,66
 bob,appl,-56
So you can clearly see which lines were changed between test1.csv and test2.csv.
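The snippet above uses the Python 2 print statement. A Python 3 sketch of the same comparison, with the same sample rows, could look like this (old_rows, new_rows and diff_lines are names chosen here for illustration):

```python
import difflib

old_rows = ['Adam,apple,3850\n', 'bob,orange,-45\n', 'bob,lemon,66\n',
            'bob,appl,-56\n', 'bob,,88\n']
new_rows = ['Adam,apple,2850\n', 'bob,orange,-45\n', 'bob,lemon,66\n',
            'bob,appl,-56\n', 'bob,,88\n']

# unified_diff returns a generator; list() collects its lines so they
# can be both printed and inspected.
diff_lines = list(difflib.unified_diff(old_rows, new_rows,
                                       fromfile='test1.csv',
                                       tofile='test2.csv'))

for line in diff_lines:
    # Each line already ends with a newline, so suppress print's own.
    print(line, end='')
```

In Python 3, end='' plays the role of the trailing comma in the Python 2 print statement.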
To line up the columns, you must use string formatting.
E.g. print "%-20s %-20s %-20s" % (row[0], row[1], row[2]).
To change the ? into any text you like, you'd use s.replace('?', 'any text I like').
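As a small illustration of both suggestions, the sketch below pads three fields to a fixed width and swaps the ? prefix of a guide line for a label. The row, the width of 10, and the label "test 1" are all made-up values.

```python
row = ["Adam", "apple", "3850"]  # hypothetical parsed CSV row

# Left-align each field in a 10-character column, old and new style.
old_style = "%-10s %-10s %-10s" % tuple(row)
new_style = "{:<10} {:<10} {:<10}".format(*row)

# A guide line as produced by difflib.ndiff: "? " followed by the marker.
guide = "? " + " " * 11 + "^"

# Replace only the first "?" so any later question marks are left alone.
labelled = guide.replace("?", "test 1", 1)

print(old_style)
print(new_style)
print(labelled)
```

Note that a label longer than one character shifts the ^ marker to the right, so the guide no longer lines up with the text above it unless the other lines get the same extra width.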
Your problem has more to do with the CSV format, since difflib has no idea it's looking at columnar fields. What you need is to figure out into which field the guide is pointing, so that you can adjust it when printing the columns.
If your CSV files are simple, i.e. they don't contain any quoted fields with embedded commas or (shudder) newlines, you can just use split(',') to separate them into fields, and figure out where the guide points as follows:
def align(line, guideline):
    """
    Figure out which field the guide (^) points to, and the offset within it.
    E.g., if the guide points 3 chars into field 2, return (2, 3)
    """
    fields = line.split(',')
    guide = guideline.index('^')
    f = p = 0
    while p + len(fields[f]) < guide:
        p += len(fields[f]) + 1  # +1 for the comma
        f += 1
    offset = guide - p
    return f, offset
Now it's easy to show the guide properly. Let's say you want to align your columns by printing everything 12 spaces wide:
diff=difflib.ndiff(...)
for line in diff:
    code = line[0]  # The diff prefix
    print code,
    if code == '?':
        fld, offset = align(lastline, line[2:])
        for f in range(fld):
            print "%-12s" % '',
        print ' '*offset + '^'
    else:
        fields = line[2:].rstrip('\r\n').split(',')
        for f in fields:
            print "%-12s" % f,
        print
    lastline = line[2:]
Be warned that the only reliable way to parse CSV files is to use the csv module (or a robust alternative); but getting it to play well with the diff format (in full generality) would be a bit of a headache. If you're mainly interested in readability and your CSV isn't too gnarly, you can probably live with an occasional mix-up.
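To illustrate that warning, here is a short sketch comparing a naive split(',') with the csv module on a row that contains a quoted field with an embedded comma. The sample text is invented for the demonstration.

```python
import csv
import io

# Hypothetical CSV text; the second row has a quoted field with a comma in it.
sample = 'Group,Symbol,Total\n"Smith, Bob",lemon,66\n'

# Naive splitting breaks the quoted field into two pieces.
naive = sample.splitlines()[1].split(',')

# csv.reader understands the quoting and keeps the field whole.
parsed = list(csv.reader(io.StringIO(sample)))

print(naive)      # 4 pieces, with stray quote characters
print(parsed[1])  # 3 fields
```

This only shows the parsing difference; mapping the ndiff guide positions back onto csv-parsed fields is the part that would take extra work.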
