Date conversion from numbers to words in Python - python

I wrote a program in Python (I'm still a beginner) that converts a date from numbers to words using dictionaries. The program looks like this:
dictionary_1 = { 1:'first', 2:'second'...}
dictionary_2 = { 1:'January', 2:'February',...}
and three other more for tens, hundreds, thousands;
2 functions, one for years <1000, the other for years >1000;
an algorithm that verifies if it's a valid date.
In main I have:
a_random_date = raw_input("Enter a date: ")
(I chose raw_input so that the separator characters between the numbers are kept; only these three forms are accepted: 21/11/2014, 21-11-2014 or 21.11.2014.) After verifying that the date is valid, I do not know, and could not find out, how to use the dictionaries to convert the date into words. For example, if I type 1/1/2015, I want the output to be: first/January/two thousand fifteen.
If possible, I would also like to apply the program to a text document, find the dates in it, and convert them from numbers to words.
Thank you!

You can split the date into a list and then look each part up in its dictionary, like this:
import re

dictionary_1 = {1: 'first', 2: 'second'}
dictionary_2 = {1: 'January', 2: 'February'}
dictionary_3 = {1996: 'asd', 1995: 'asd1'}

input1 = raw_input("Enter date:")
lista = re.split(r'[./-]', input1)  # split on ".", "/" or "-"
print "lista: ", lista
day = lista[0]
month = lista[1]
year = lista[2]

# Start optimistic; any failed lookup below clears the flag.
everything_ok = True

day_print = dictionary_1.get(int(day))
if day_print is None:
    print "There is no such day"
    everything_ok = False

month_print = dictionary_2.get(int(month))
if month_print is None:
    print "There is no such month"
    everything_ok = False

year_print = dictionary_3.get(int(year))
if year_print is None:
    print "There is no such year"
    everything_ok = False

if everything_ok:
    print "Date: ", day_print, "/", month_print, "/", year_print  # or whatever format
This is the output:
Enter date:1/2/1996
Date: first / February / asd
I hope this helps you.
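For the full conversion the question asks for, the lookups can be wrapped in small helper functions. The following is only a sketch in Python 3: the names ONES, TENS, ORDINALS, MONTHS, day_to_words, year_to_words and date_to_words are made up for illustration, and it assumes day/month/year order and years from 1 to 9999. It uses datetime.date to check that the date exists.

```python
import re
from datetime import date

ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
ORDINALS = ["", "first", "second", "third", "fourth", "fifth", "sixth",
            "seventh", "eighth", "ninth", "tenth", "eleventh", "twelfth",
            "thirteenth", "fourteenth", "fifteenth", "sixteenth",
            "seventeenth", "eighteenth", "nineteenth"]
MONTHS = ["", "January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def day_to_words(day):
    """Spell a day of the month (1-31) as an ordinal."""
    if day < 20:
        return ORDINALS[day]
    if day % 10 == 0:
        return "twentieth" if day == 20 else "thirtieth"
    return TENS[day // 10] + "-" + ORDINALS[day % 10]


def year_to_words(year):
    """Spell a year from 1 to 9999 as a cardinal number."""
    parts = []
    if year >= 1000:
        parts.append(ONES[year // 1000] + " thousand")
        year %= 1000
    if year >= 100:
        parts.append(ONES[year // 100] + " hundred")
        year %= 100
    if year >= 20:
        word = TENS[year // 10]
        if year % 10:
            word += "-" + ONES[year % 10]
        parts.append(word)
    elif year > 0:
        parts.append(ONES[year])
    return " ".join(parts)


def date_to_words(text):
    """Convert 'd/m/yyyy', 'd-m-yyyy' or 'd.m.yyyy' to words."""
    day, month, year = (int(part) for part in re.split(r"[./-]", text.strip()))
    date(year, month, day)  # raises ValueError if the date does not exist
    return "/".join([day_to_words(day), MONTHS[month], year_to_words(year)])


print(date_to_words("1/1/2015"))
```

Malformed or impossible dates such as 31/2/2014 raise ValueError here, so the caller can decide how to report them.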

Eventually you will need the re module. Learn to write a regular expression that can search for strings of a particular format. Here's a code example:
import re
with open("mydocument.txt") as f:
    contents = f.read()
fi = re.finditer(r"\d{1,2}-\d{1,2}-\d{4}", contents)
This will find all strings that are made up of 1 or 2 digits followed by a hyphen, followed by another 1 or 2 digits followed by a hyphen, followed by 4 digits. Then, you feed each string into datetime.strptime; it will parse your "date" string and decide if it is valid according to your specified format.
Have fun!
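As a rough sketch of the two steps described above (find candidates with a regular expression, then validate them with datetime.strptime), written for Python 3: the pattern is extended here to accept the three separators from the question, and the names DATE_PATTERN and find_valid_dates are made up for this example. Day-first order is an assumption.

```python
import re
from datetime import datetime

# day, separator, month, the same separator again, four-digit year
DATE_PATTERN = re.compile(r"\b(\d{1,2})([./-])(\d{1,2})\2(\d{4})\b")


def find_valid_dates(text):
    """Return (matched_text, date) pairs for every real date found in text."""
    found = []
    for match in DATE_PATTERN.finditer(text):
        day, _sep, month, year = match.groups()
        try:
            parsed = datetime.strptime("%s-%s-%s" % (day, month, year), "%d-%m-%Y")
        except ValueError:
            continue  # looks like a date but is not one, e.g. 31-02-2014
        found.append((match.group(0), parsed.date()))
    return found


sample = "Due 21/11/2014, not 31-02-2014 or 1.1.2015."
for matched_text, parsed_date in find_valid_dates(sample):
    print(matched_text, parsed_date.isoformat())
```

To process a document, the text read from the file would be passed in place of sample.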

python - if conditions in string is true

Python beginner here. My file contains lines that consist of conditions formatted just like a Python if statement (without the if at the beginning and the colon at the end). Example:
temperature < 40 and weekday == "Thursday" and (country != "Norway" or country != "USA")
(temperature != 30 or temperature != 35) and weekday == "Friday" and country == "Canada"
I want to write code that reads the lines as if they were if statements and prints True or False depending on if the conditions were met. I am thinking something along the lines of:
temperature = 35
country = "Canada"
weekday = "Friday"
file = open('output.txt', r)
lines = file.readlines()
for line in lines:
    if line:
        # if conditions in string are met, print true
        print(True)
    else:
        # else print False
        print(False)
which, when run against the file lines above, should output
False
True
EDIT:
Would it be possible to reliably and consistently parse the file lines and process it, like how I'm assuming a python compiler would read an actual if statement?
Alright, so you've got a few basic issues, mostly forgetting to use "" for strings such as in country = Canada and weekday = Friday. Other than that, you can use the eval() function; however, it executes whatever code the string contains, so it is considered bad practice and you should try to avoid it.
temperature = 35
country = "Canada"
weekday = "Friday"
file = open('output.txt', "r")
lines = file.readlines()
for line in lines:
    if eval(line):
        # if conditions in string are met, print true
        print(True)
    else:
        # else print False
        print(False)
Note: Forgot to mention, you need "" for the read specifier in open().
There are two concepts: selection (if statements), and conditions / boolean logic (<, or, and, etc.). You are trying to use the second. You are correct that you don't need the first.
When we program, we don't write big programs that don't work and then fix them. We start with small programs that work, and then make them bigger. So let me start with a simple example.
temperature < 40
How do we print True if this is true, and False if it is false? The answer may surprise you.
print(temperature < 40)
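Putting the two answers together, each line of the file can be evaluated as a boolean expression and the result printed directly. This is a sketch only: the function name evaluate_conditions is made up, and the lines are given inline rather than read from output.txt. Passing an empty __builtins__ mapping narrows what the expression can reach, but it is not a real security boundary, so this should only be used on files you trust.

```python
def evaluate_conditions(lines, variables):
    """Evaluate each non-empty line as a boolean expression."""
    results = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Names in the expression are looked up in `variables` only.
        results.append(bool(eval(line, {"__builtins__": {}}, dict(variables))))
    return results


lines = [
    'temperature < 40 and weekday == "Thursday" and (country != "Norway" or country != "USA")',
    '(temperature != 30 or temperature != 35) and weekday == "Friday" and country == "Canada"',
]
variables = {"temperature": 35, "country": "Canada", "weekday": "Friday"}
for result in evaluate_conditions(lines, variables):
    print(result)
```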

How to overcome a ValueError when working with multiple if conditions?

I'm trying to make a script that can identify whether a folder name is a project or not. To do this I want to use multiple if conditions, but I struggle with the ValueError that results from checking, for example, whether the first character of a folder name is a number. If it's a letter, I want to skip the folder and check the next one. Thank you all in advance for your help.
Cheers, Benn
I tried While and except ValueError: but haven't been successful with it.
# Correct name to a project "YYMM_ProjectName" = "1908_Sample_Project"
projectnames = ['190511_Waldfee', 'Mountain_Shooting_Test', '1806_Coffe_Prime_Now', '180410_Fotos', '191110', '1901_Rollercoaster_Vision_Ride', 'Musicvideo_LA', '1_Project_Win', '19_Wrong_Project', '1903_di_2', '1907_DL_2', '3401_CAR_Wagon']
# Check conditions
for projectname in projectnames:
    if int(str(projectname[0])) < 3 and int(projectname[1]) > 5 and ((int(projectname[2]) * 10) + int(projectname[3])) <= 12 and str(projectname[4]) == "_" and projectname[5].isupper():
        print('Real Project')
        print('%s is a real Project' % projectname)
# print("Skipped Folders")
ValueError: invalid literal for int() with base 10: 'E'
From what I understand from all the ifs...you may actually be better off using a regex match. You're parsing through each character, and expecting each individual one to be within a very limited character range.
I haven't tested this pattern string, so it may be incorrect or need to be tweaked for your needs.
import re
projectnames = ['1911_Waldfee', "1908_Project_Test", "1912_WinterProject", "1702_Stockfootage", "1805_Branded_Content"]
p = ''.join(["^",  # Start of string being matched
             "[0-2]",  # First character a number 0 through 2 (less than 3)
             "[6-9]",  # Second character a number 6 through 9 (single digit greater than 5)
             "(0(?=[0-9])|1(?=[0-2]))",  # (lookahead) A 0 followed only by any number 0 through 9 **OR** A 1 followed only by any number 0 through 2
             "((?<=0)[1-9]|(?<=1)[0-2])",  # (lookbehind) Match 1-9 if the preceding character was a 0, match 0-2 if the preceding was a 1
             "_",  # Next char is a "_"
             "[A-Z]",  # Next char (only) is an upper A through Z
             ".*$"  # Match anything until end of string
             ])
for projectname in projectnames:
    if re.match(p, projectname):
        #print('Real Project')
        print('%s is a real Project' % projectname)
    # print("Skipped Folders")
EDIT: ========================
You can step-by-step test the pattern using the following...
projectname = "2612_UPPER"
p = "^[0-2].*$" # The first character is between 0 through 2, and anything else afterwards
if re.match(p, projectname): print(projectname)
# If you get a print, the first character match is right.
# Now do the next
p = "^[0-2][6-9].*$" # The first character is between 0 through 2, the second between 6 and 9, and anything else afterwards
if re.match(p, projectname): print(projectname)
# If you get a print, the first and second character match is right.
# continue with the third, fourth, etc.
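For comparison, the same constraints can probably be written without lookarounds, since the two month digits can be matched as one alternation. This is a sketch, not a tested replacement for the pattern above; PROJECT_PATTERN and is_real_project are names invented for the example, and the month is assumed to run from 01 to 12.

```python
import re

# [0-2]   first digit less than 3
# [6-9]   second digit greater than 5
# 0[1-9] or 1[0-2]   month 01 through 12
# _ followed by one upper-case ASCII letter
PROJECT_PATTERN = re.compile(r"[0-2][6-9](0[1-9]|1[0-2])_[A-Z]")


def is_real_project(name):
    """re.match anchors at the start of the string; the rest may be anything."""
    return PROJECT_PATTERN.match(name) is not None


projectnames = ['190511_Waldfee', 'Mountain_Shooting_Test', '1806_Coffe_Prime_Now',
                '180410_Fotos', '191110', '1901_Rollercoaster_Vision_Ride',
                'Musicvideo_LA', '1_Project_Win', '19_Wrong_Project', '1903_di_2',
                '1907_DL_2', '3401_CAR_Wagon']
for projectname in projectnames:
    if is_real_project(projectname):
        print('%s is a real Project' % projectname)
```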
This is something that just gets the work done and may not be the most efficient way to do this.
Given your list of projects,
projectnames = [
'190511_Waldfee',
'Mountain_Shooting_Test',
'1806_Coffe_Prime_Now',
'180410_Fotos',
'191110',
'1901_Rollercoaster_Vision_Ride',
'Musicvideo_LA',
'1_Project_Win',
'19_Wrong_Project',
'1903_di_2',
'1907_DL_2',
'3401_CAR_Wagon'
]
I see that there are a limited number of valid YYMM strings (24 of them to be more precise). So I first create a list of these 24 valid YYMMs.
nineteen = list(range(1901, 1913))  # 1901 .. 1912
eighteen = list(range(1801, 1813))  # 1801 .. 1812
YYMM = nineteen + eighteen # A list of all 24 valid dates
Then I modify your for loop a little bit using a try-except-else block,
for projectname in projectnames:
    try:
        first_4_digits = int(projectname[:4]) # Make sure the first 4 are digits.
    except ValueError:
        pass # Pass silently
    else:
        if (first_4_digits in YYMM
                and projectname[4] == "_"
                and projectname[5].isupper()):
            # if all conditions are true
            print("%s is a real project." % projectname)
One solution would be to make a quick function that catches an error and returns false:
def isInt(elem):
    try:
        int(elem)
        return True
    except ValueError:
        return False
...
if all(isInt(e) for e in projectname[:4]) and int(str(projectname[0])) < 3 and ...:
    ...
Or you could use something like str.isdigit() as a check first, rather than writing your own function, to avoid triggering the ValueError in the first place.
Though if I were you I would reconsider why you need such a long and verbose if statement in the first place. There might be a more efficient way to implement this feature.
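A possible shape for the str.isdigit() variant, with the long condition moved into a function. This is a sketch: looks_like_project is a made-up name, and requiring the month to be at least 1 is an assumption that goes slightly beyond the original condition. Note that str.isdigit() also accepts a few non-ASCII characters (such as superscript digits) that int() rejects, hence the extra isascii() check, which needs Python 3.7 or later.

```python
def looks_like_project(name):
    """Check the YYMM_Name shape without ever calling int() on a non-digit."""
    prefix = name[:4]
    if len(name) < 6 or not (prefix.isascii() and prefix.isdigit()):
        return False  # too short, or the first four characters are not digits
    month = int(name[2:4])
    return (int(name[0]) < 3
            and int(name[1]) > 5
            and 1 <= month <= 12
            and name[4] == "_"
            and name[5].isupper())


for name in ['190511_Waldfee', 'Mountain_Shooting_Test', '1806_Coffe_Prime_Now']:
    print(name, looks_like_project(name))
```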
You could write a small parser (might be a bit over the top, admittedly):
from parsimonious.grammar import Grammar
from parsimonious.nodes import NodeVisitor
from parsimonious.exceptions import ParseError
projectnames = ['190511_Waldfee', 'Mountain_Shooting_Test', '1806_Coffe_Prime_Now', '180410_Fotos', '191110', '1901_Rollercoaster_Vision_Ride', 'Musicvideo_LA', '1_Project_Win', '19_Wrong_Project', '1903_di_2', '1907_DL_2', '3401_CAR_Wagon']
class ProjectVisitor(NodeVisitor):
    grammar = Grammar(
        r"""
        expr = first second third fourth fifth rest
        first = ~"[0-2]"
        second = ~"[6-9]"
        third = ~"\d{2}"
        fourth = "_"
        fifth = ~"[A-Z]"
        rest = ~".*"
        """
    )

    def generic_visit(self, node, visited_children):
        return visited_children or node

    def visit_third(self, node, visited_children):
        x, y = int(node.text[0]), int(node.text[1])
        if not (x * 10 + y) <= 12:
            raise ParseError

# loop over them
pv = ProjectVisitor()
for projectname in projectnames:
    try:
        pv.parse(projectname)
        print("Valid project name: {}".format(projectname))
    except ParseError:
        pass
This yields
Valid project name: 1806_Coffe_Prime_Now
Valid project name: 1901_Rollercoaster_Vision_Ride
Valid project name: 1907_DL_2

Construct the next strings by using the string time format function strftime

Start by setting t to be the local time 1,500,000,000 seconds from the start of January 1, 1970 UTC:
import time
t = time.localtime(1500000000)
Construct the next strings by using the string time format function strftime():
(a) 'Thursday, July 13 2017'
(b) '09:40 PM Central Daylight Time on 07/13/2017'
(c) 'I will meet you on Thu July 13 at 09:40 PM.'
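One way the three strings might be built is sketched below. The helper name format_examples is made up. The time zone name is written as literal text because what %Z prints varies by platform (often an abbreviation such as CDT), and the values shown in the exercise only come out if the machine's local time zone is US Central.

```python
import time


def format_examples(t):
    """Build the three strings from a struct_time using strftime directives."""
    a = time.strftime("%A, %B %d %Y", t)  # weekday, month name, day, year
    b = time.strftime("%I:%M %p Central Daylight Time on %m/%d/%Y", t)
    c = time.strftime("I will meet you on %a %B %d at %I:%M %p.", t)
    return a, b, c


t = time.localtime(1500000000)
for line in format_examples(t):
    print(line)
```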
A couple of things. Stack Overflow is not the place for code reviews; for that, try this.
Regardless, you have indentation problems. Python relies on indentation: the code in your function must be indented one level deeper than the def, like so:
def filesStringSearch():
    infile = open('example.txt')
    a = input('Search for a word: ')
    result = infile.read().find(a)
    #result = a.find:
    #for a in infile:
    if a.find:
        print("True")
    elif a < 3:
        print("-1")
    else:
        print("False")
    return
Second, your function takes no arguments and hard-codes the file to open; this is a simple fix, however:
def filesStringSearch(filename):
    infile = open(filename)
Third, your if statements will not accomplish your goal. If the length of the input is less than 3, you shouldn't even try to search for anything, so you need to reorder and change your boolean expressions a bit, to this:
if len(a) < 3:
    print("-1")
elif a.find:
    print("True")
else:
    print("False")
Finally, a.find will not work as a condition (it is a bound method, which is always truthy); instead, check the value of result. Replace elif a.find: with:
elif result != -1:
    print("True")
Since result will be -1 if it cannot find anything.
Also, the return is useless at the end.
According to your question, the right implementation is:
def filesStringSearch(filename, pattern):
    with open(filename, 'r') as f:
        text = f.read()
    if len(pattern) >= 3:
        return text.find(pattern) > -1 or False
    else:
        return -1
filename = 'example.txt'
pattern_to_find = input('Search for a word: ')
out = filesStringSearch(filename, pattern_to_find)
print(out)
If you are asked to write a function that accepts two arguments, then your function must accept two arguments as here:
def filesStringSearch(filename, pattern):
Then you must read the file; I did it using the with statement. The with statement closes the file for us, so you don't have to do it manually (and yes, you forgot to close the opened file; it is not a big problem for now, but avoid such things in big projects). You can read more about the with statement here: Reading and writing files
Now the find method. It is a string method that returns the index of the first occurrence of a substring in your string; for instance, my_string.find('h') returns the index of the first 'h' in my_string. If find can't find the substring it returns -1, which is why we do this:
return text.find(pattern) > -1 or False
If the pattern is found in text, the index is certainly greater than -1, so the function returns True; otherwise it returns False. If the pattern's length is less than 3, it returns -1, as your question requires.
At the end we take input from the user and pass it to our function together with the file name example.txt. We store the function's return value in the out variable and then print it.
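To see the return values of find in isolation, here is a small sketch that works on a string instead of a file. The name search_text is made up for the example; it uses the in operator, which gives the same True/False answer as comparing find against -1.

```python
text = "hello world"

print(text.find("world"))  # index where the substring starts
print(text.find("xyz"))    # -1 because the substring is absent


def search_text(text, pattern):
    """Return -1 for patterns shorter than 3 characters, else True or False."""
    if len(pattern) < 3:
        return -1
    return pattern in text  # same truth value as text.find(pattern) > -1
```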

Python - line split with spaces?

I'm sure this is a basic question, but I have spent about an hour on it already and can't quite figure it out. I'm parsing smartctl output, and here is a sample of the data I'm working with:
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-2.6.32-39-pve] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Device Model: TOSHIBA MD04ACA500
Serial Number: Y9MYK6M4BS9K
LU WWN Device Id: 5 000039 5ebe01bc8
Firmware Version: FP2A
User Capacity: 5,000,981,078,016 bytes [5.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Thu Jul 2 11:24:08 2015 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
What I'm trying to achieve is pulling out the device model (some devices it's just one string, other devices, such as this one, it's two words), serial number, time, and a couple other fields. I assume it would be easiest to capture all data after the colon, but how to eliminate the variable amounts of spaces?
Here is the relevant code I currently came up with:
deviceModel = ""
serialNumber = ""
lines = infoMessage.split("\n")
for line in lines:
    parts = line.split()
    if str(parts):
        if parts[0] == "Device Model: ":
            deviceModel = parts[1]
        elif parts[0] == "Serial Number: ":
            serialNumber = parts[1]
vprint(3, "Device model: %s" %deviceModel)
vprint(3, "Serial number: %s" %serialNumber)
The error I keep getting is:
File "./tester.py", line 152, in parseOutput
if parts[0] == "Device Model: ":
IndexError: list index out of range
I get what the error is saying (kinda), but I'm not sure what else the range could be, or if I'm even attempting this in the right way. Looking for guidance to get me going in the right direction. Any help is greatly appreciated.
Thanks!
The IndexError occurs when split returns a list of length zero or one and you access an element that is not there. This happens when there is nothing to split (an empty line).
No need for regular expressions:
deviceModel = ""
serialNumber = ""
lines = infoMessage.split("\n")
for line in lines:
    if line.startswith("Device Model:"):
        deviceModel = line.split(":")[1].strip()
    elif line.startswith("Serial Number:"):
        serialNumber = line.split(":")[1].strip()
print("Device model: %s" %deviceModel)
print("Serial number: %s" %serialNumber)
I guess your problem is the empty line in the middle. Because,
>>> '\n'.split()
[]
You can do something like,
>>> f = open('a.txt')
>>> lines = f.readlines()
>>> deviceModel = [line for line in lines if 'Device Model' in line][0].split(':')[1].strip()
# 'TOSHIBA MD04ACA500'
>>> serialNumber = [line for line in lines if 'Serial Number' in line][0].split(':')[1].strip()
# 'Y9MYK6M4BS9K'
Try using regular expressions:
import re
r = re.compile(r"^[^:]*:\s+(.*)$")
m = r.match("Device Model: TOSHIBA MD04ACA500")
print(m.group(1)) # Prints "TOSHIBA MD04ACA500"
Not sure what version you're running, but on 2.7, line.split() splits the line on whitespace, so
>>> line.split()
['Device', 'Model:', 'TOSHIBA', 'MD04ACA500']
You can also try line.startswith() to find the lines you want: https://docs.python.org/2/library/stdtypes.html#str.startswith
The way I would debug this is by printing out parts at every iteration. Try that and show us what the list is when it fails.
Edit: Your problem is most likely what @jonrsharpe said. parts is probably an empty list when it gets to an empty line, and str(parts) will just return '[]', which is truthy. Try to test that.
I think it would be far easier to use regular expressions here.
import re
for line in lines:
    # Splits the string into at most two parts
    # at the first colon which is followed by one or more spaces
    parts = re.split(r':\s+', line, 1)
    if len(parts) == 2:  # skip lines that have no "key: value" shape
        if parts[0] == "Device Model":
            deviceModel = parts[1]
        elif parts[0] == "Serial Number":
            serialNumber = parts[1]
Mind you, if you only care about the two fields, startswith might be better.
When you split the blank line, parts is an empty list.
You try to accommodate that by checking for an empty list, but you turn the empty list into a string, which makes your conditional statement True.
>>> s = []
>>> bool(s)
False
>>> str(s)
'[]'
>>> bool(str(s))
True
>>>
Change if str(parts): to if parts:.
Many would say that using a try/except block is the idiomatic approach in Python:
for line in lines:
    parts = line.split(":", 1)  # split once, at the first colon
    try:
        if parts[0] == "Device Model":
            deviceModel = parts[1].strip()
        elif parts[0] == "Serial Number":
            serialNumber = parts[1].strip()
    except IndexError:
        pass  # the line had no colon, so there is no parts[1]
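Since the question also asks for the time and a couple of other fields, one option is to collect every "name: value" line into a dictionary and pick out what is needed afterwards. This is a sketch: FIELD_PATTERN and parse_fields are names invented here, the sample is abbreviated from the question, and the column spacing in the sample is assumed. Repeated names (such as "SMART support is") keep only the last value.

```python
import re

# Field name up to the first colon, then whitespace, then the value.
# Requiring whitespace after the colon skips URLs such as "http://...".
FIELD_PATTERN = re.compile(r"^([^:]+):\s+(.*)$")


def parse_fields(info_message):
    """Map each field name to its value with surrounding spaces removed."""
    fields = {}
    for line in info_message.splitlines():
        match = FIELD_PATTERN.match(line)
        if match:  # blank lines and lines without "name: value" are skipped
            fields[match.group(1).strip()] = match.group(2).strip()
    return fields


sample = (
    "Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net\n"
    "\n"
    "Device Model:     TOSHIBA MD04ACA500\n"
    "Serial Number:    Y9MYK6M4BS9K\n"
    "Local Time is:    Thu Jul  2 11:24:08 2015 EDT\n"
)
fields = parse_fields(sample)
print(fields["Device Model"])
print(fields["Serial Number"])
```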

How to make strings within strings optional in python

I am trying to write something that takes two variables in datetime format. The date and time the user enters may have the letter "Z" at the end. For example:
"2008-01-01T00:00:01Z"
The user may or may not enter in the "Z" at the end so I want to do something that makes either format acceptable. Here's what I have:
import datetime
b = datetime.datetime.strptime("2008-01-01T00:00:01Z", "%Y-%m-%dT%H:%M:%S")
c = datetime.datetime.strptime("2008-05-01T23:59:00Z", "%Y-%m-%dT%H:%M:%S")
def startTime(b):
    try:
        datetime.datetime.strptime(b, "%Y-%m-%dT%H:%M:%S")
    except:
        print "Error: start time is invalid."
def endTime(c):
    try:
        datetime.datetime.strptime(c, "%Y-%m-%dT%H:%M:%S")
    except:
        print "Error: end time is invalid."
How about just manually removing the Z if it is there?
user_in = raw_input("Please enter a date")
if user_in.endswith('Z'): user_in = user_in[:-1]
rstrip can remove the Z for you if it exists, and leave the string alone otherwise:
>>> "2008-05-01T23:59:00Z".rstrip("Z")
'2008-05-01T23:59:00'
>>> "2008-05-01T23:59:00".rstrip("Z")
'2008-05-01T23:59:00'
So if you have a date s in string format,
date = datetime.datetime.strptime(s.rstrip("Z"), "%Y-%m-%dT%H:%M:%S")
will handle both cases.
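The same idea can be wrapped in one function so that the start and end times share the logic. This is a sketch in Python 3; the names TIMESTAMP_FORMAT and parse_timestamp are made up. It removes at most one trailing "Z", whereas rstrip("Z") would remove several.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S"


def parse_timestamp(text):
    """Parse a timestamp with or without a trailing 'Z'.

    Raises ValueError if the remaining text does not match the format.
    """
    text = text.strip()
    if text.endswith("Z"):
        text = text[:-1]  # drop the single optional suffix
    return datetime.strptime(text, TIMESTAMP_FORMAT)


print(parse_timestamp("2008-01-01T00:00:01Z"))
print(parse_timestamp("2008-05-01T23:59:00"))
```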
