Data check routine isn't working - python

I have a file containing a 5 x 7 table:
I want a validation check that each row contains a 5, a 7, or a 9, with none of them repeated, i.e. each of these numbers may occur only once. 5 and 7 are required, 9 is optional, and the remaining three columns can be 0. I have written this code, but it doesn't work. I also want to store the valid rows in a separate list.
My attempt at the program in Python is as follows:
def validation():
    numlist = open("scores.txt","r")
    invalidnum=0
    for line in numlist:
        x = line.count("0")
        inv1 = line.count("1")
        inv2 = line.count("2")
        inv3 = line.count("3")
        if x > 2 or inv1 > 1 or inv2 > 1 or inv3 > 1 or line not in ("0","5","7","9"):
            invalidnum=invalidnum+1
            print(invalidnum,"Invalid numbers found")
        else:
            print("All numbers are valid in the list")
I will appreciate if someone can help me on this.

Here is an example that uses a set:
lolwat = []
for line in open('scores.txt'):
    # strip the trailing newline so the last value compares correctly
    numbers = set(line.strip().split(','))
    if '5' in numbers and '7' in numbers:
        print('okay 5,7')
    elif '9' in numbers:
        print('okay 9')
    lolwat.append(numbers)
do_stuff_with(lolwat)  # placeholder for your own processing
A set de-duplicates the values, so each number such as 5, 7, or 9 appears in it at most once.

You need to learn how to break a problem like this down into smaller pieces.
For example, you want to check each row:
for row in open('scores.txt'):
    check_row(row)
def check_row(row):
    ...
You want to save the good rows to a list:
good_rows = []
for row in ...:
    if check_row(row):
        good_rows.append(row)  # lists have append(), not add()
A good row contains exactly one '5':
def check_row(row):
    number_of_fives = count_number_of(row, '5')
    if number_of_fives != 1:
        return False
    ...
    return True
def count_number_of(row, digit):
    ...
And so on.
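Putting those pieces together, a minimal sketch might look like the following. It assumes the values in each row are separated by commas or whitespace (the question does not show the file format), and `valid_rows` is a hypothetical helper name that does not appear in the original post.

```python
import re

def check_row(row):
    """Return True if the row has exactly one 5, exactly one 7,
    at most one 9, and nothing else except 0."""
    # Assumption: values are separated by commas and/or whitespace.
    values = [v for v in re.split(r"[,\s]+", row.strip()) if v]
    if values.count("5") != 1 or values.count("7") != 1:
        return False
    if values.count("9") > 1:
        return False
    # Every value must be one of the allowed numbers.
    return all(v in {"0", "5", "7", "9"} for v in values)

def valid_rows(lines):
    """Collect the rows that pass check_row, without trailing newlines."""
    return [line.strip() for line in lines if check_row(line)]

rows = ["5,7,0,0,0\n", "5,5,7,0,0\n", "5,7,9,0,0\n", "5,7,1,0,0\n"]
print(valid_rows(rows))
```

To run it against the real file, pass the open file object instead of the sample list, for example `valid_rows(open('scores.txt'))`.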

Related

Printing numbers that contain only odd digits in a given range

I am having difficulty printing the numbers in a given range that contain only odd digits.
For example, the first number is 2345 and the second number is 6789. There is one more requirement: each digit of the printed numbers must stay within the range given by the digits at the same position in the two inputs: 2 to 6 (3, 5), 3 to 7 (3, 5, 7), 4 to 8 (5, 7), 5 to 9 (5, 7, 9). That means the first numbers should be 3355, 3357, 3359, 3375, 3377, 3379, 3555, 3557, ...
My code does not produce the output in the expected form:
number_one=int(input())
number_two=int(input())
list_one=[]
list_two=[]
number_one=str(number_one)
number_two=str(number_two)
for i in number_one:
    if int(i)==0 or int(i)%2==0:
        i=int(i)+1
    list_one.append(int(i))
for i in number_two:
    list_two.append(int(i))
a=0
b=0
c=0
d=0
for j in range(list_one[0],list_two[0]+1):
    if j%2==1:
        a=j
    for p in range(list_one[1],list_two[1]+1):
        if p%2==1:
            b=p
        for x in range(list_one[2],list_two[2]+1):
            if x%2==1:
                c=x
            for y in range(list_one[3],list_two[3]+1):
                if y%2==1:
                    d=y
                print(f"{a}{b}{c}{d}",end=" ")
There are a lot of repetitions in the output that I would like to avoid.
Thank you in advance!
This may not be an optimal solution, but it works for positive integers of the same length.
number_one=int(input())
number_two=int(input())
if len(str(number_one)) != len(str(number_two)):
    raise Exception("numbers should be of same length")
def print_num(num_one, num_two):
    res = []
    for i,j in zip(num_one, num_two):
        next_odd_for_i = int(i) + (not (int(i)%2))
        prev_odd_for_j = int(j) - (not (int(j)%2))
        temp_str = ""
        for i_next in range(next_odd_for_i, prev_odd_for_j+1, 2):
            temp_str += str(i_next)
        res.append(temp_str)
    return res
def print_perm(li_of_str):
    if len(li_of_str) == 1:
        return [li_of_str[-1]]
    res = []
    first = li_of_str[0]
    for j in first:
        tmp = [j+k for n in print_perm(li_of_str[1:]) for k in n ]
        res.append(tmp)
    return res
print(print_num(str(number_one), str(number_two)))
print(print_perm(print_num(str(number_one), str(number_two))))
One way to solve this problem is with recursion. This function takes in two strings representing numbers and returns all the odd numbers (as strings) that satisfy the conditions you specified:
def odd_digits(num1, num2):
    # split off first digit of string
    msd1, rest1 = int(num1[0]), num1[1:]
    # make the digit odd if required
    msd1 += msd1 % 2 == 0
    # split off first digit of string
    msd2, rest2 = int(num2[0]), num2[1:]
    # make the digit odd if required
    msd2 -= msd2 % 2 == 0
    # if no more digits, just return the values between msd1 and msd2
    if not rest1:
        return [str(i) for i in range(msd1, msd2+1, 2)]
    # otherwise, append the results of a recursive call to each
    # odd digit between msd1 and msd2
    result = []
    for i in range(msd1, msd2+1, 2):
        result += [str(i) + o for o in odd_digits(rest1, rest2)]
    return result
print(odd_digits('2345', '6789'))
Output:
[
'3355', '3357', '3359',
'3375', '3377', '3379',
'3555', '3557', '3559',
'3575', '3577', '3579',
'3755', '3757', '3759',
'3775', '3777', '3779',
'5355', '5357', '5359',
'5375', '5377', '5379',
'5555', '5557', '5559',
'5575', '5577', '5579',
'5755', '5757', '5759',
'5775', '5777', '5779'
]
If you want to use integer values just use (for example)
print(list(map(int, odd_digits(str(2345), str(6789)))))
The output will be as above but all values will be integers rather than strings.
If you can use libraries, you can generate ranges for each digit and then use itertools.product to find all the combinations:
import itertools
def odd_digits(num1, num2):
    ranges = []
    for d1, d2 in zip(num1, num2):
        d1 = int(d1) + (int(d1) % 2 == 0)
        d2 = int(d2) - (int(d2) % 2 == 0)
        ranges.append(list(range(d1, d2+1, 2)))
    return [''.join(map(str, t)) for t in itertools.product(*ranges)]
This function takes string inputs and produces string outputs, which will be the same as the first function above.

How to convert '2.6840000e+01' type like datas to float in Python?

I have an "input.txt" file that contains lines like:
1 66.3548 1011100110110010 25
Then I apply some functions column by column:
1. the first column stays the same,
2. the second column is rounded in a specific way,
3. the third column is converted from binary to decimal,
4. the fourth column is converted from hexadecimal to binary.
Finally I get this:
[1.0000000e+00 6.6340000e+01 4.7538000e+04 1.0010100e+05]
Then I write this to "fall.txt".
All the operations work correctly, but I want the numbers to look like this:
1 66.34 47538 100101
I put the columns of the relevant rows in list_for_1. Then I applied the functions to each index and put the results in another list, list_for_11. Finally I put all the results in a matrix and wrote the matrix to "fall.txt".
Here's what I did:
with open("input.txt", "r") as file:
    # objects needed for type-1 rows
    list_for_1 = list()
    list_for_11 = list()
    #list_final_1 = list()
    for line in file:
        # if the row type is 1
        if line.startswith("1"):
            line = line[:-1]
            list_for_1 = line.split(" ") # collect all elements in one list
            # apply the required operation to each element of a type-1 row
            list_for_11.append(list_for_1[0]) # first column stays 1
            list_for_11.append(float_yuvarla(float(list_for_1[1]))) # float rounding
            list_for_11.append(binary_decimal(list_for_1[2])) # binary to decimal
            list_for_11.append(hexa_binary(list_for_1[3])) # hexadecimal to binary
    m = 0
    n = 0
    array1 = np.zeros((6,4))
    for i in list_for_11: # place the list elements into the matrix
        if(m > 5):
            break
        if(isinstance(i, str)):
            x = int(i, 2)
        array1[m][n] = float(i)
        n += 1
        if(n == 4):
            n = 0
            m += 1
with open("fall.txt","w") as ff:
    ff.write(str(array1))
    ff.write("\n")
This is where I actually put the float values into the matrix, but it's not working:
if(isinstance(i, str)):
    x = int(i, 2)
array1[m][n] = float(i)
I'm fairly new to Python, so my code may be unnecessarily long and complex. If there's a shorter way to do what I did, I would like to hear opinions on that as well.
Here's a function to format your numbers the way you want them:
def formatNumber(num):
    if num % 1 == 0:
        return int(num)
    else:
        return num
Your list of numbers:
l = [1.0000000e+00, 6.6340000e+01, 4.7538000e+04, 1.0010100e+05]
Reformatting your list of numbers:
for x in l:
    print(formatNumber(x))
Output:
1
66.34
47538
100101
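To get the single-line form the question asks for ("1 66.34 47538 100101"), the same idea can be applied per value and the results joined with spaces. This is only a sketch: `format_number` and `format_row` are hypothetical names, and it assumes the row holds plain Python floats rather than a NumPy array.

```python
def format_number(num):
    """Return whole-valued floats as ints and leave other values unchanged."""
    return int(num) if float(num).is_integer() else num

def format_row(values):
    """Join the formatted values of one row with single spaces."""
    return " ".join(str(format_number(v)) for v in values)

row = [1.0000000e+00, 6.6340000e+01, 4.7538000e+04, 1.0010100e+05]
print(format_row(row))  # 1 66.34 47538 100101
```

The resulting string can be written to "fall.txt" with `ff.write(format_row(row) + "\n")` in place of `str(array1)`.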

Changing version number to single digits python

I have a version number in a file like this:
Testing x.x.x.x
So I am grabbing it off like this:
import re
def increment(match):
    # convert the four matches to integers
    a,b,c,d = [int(x) for x in match.groups()]
    # return the replacement string
    return f'{a}.{b}.{c}.{d}'
lines = open('file.txt', 'r').readlines()
lines[3] = re.sub(r"\b(\d+)\.(\d+)\.(\d+)\.(\d+)\b", increment, lines[3])
I want to make it so that if the last digit is a 9, it changes to 0 and the previous digit is incremented by 1. So 1.1.1.9 changes to 1.1.2.0.
I did that as follows:
def increment(match):
    # convert the four matches to integers
    a,b,c,d = [int(x) for x in match.groups()]
    # return the replacement string
    if (d == 9):
        return f'{a}.{b}.{c+1}.{0}'
    elif (c == 9):
        return f'{a}.{b+1}.{0}.{0}'
    elif (b == 9):
        return f'{a+1}.{0}.{0}.{0}'
The issue occurs when the version is 1.1.9.9 or 1.9.9.9, where multiple digits need to roll over. How can I handle this?
Use integer addition?
def increment(match):
    # convert the four matches to integers
    a,b,c,d = [int(x) for x in match.groups()]
    # add 1 to the combined number; zfill(4) keeps at least four digits to unpack
    *a,b,c,d = [int(x) for x in str(a*1000 + b*100 + c*10 + d + 1).zfill(4)]
    a = ''.join(map(str,a)) # fix for 2 digit 'a'
    # return the replacement string
    return f'{a}.{b}.{c}.{d}'
If the individual parts of your version are never going to reach 10, it is simpler to convert the whole version to an integer, increment it, and then convert it back to a string.
This works for as many version parts as you require, so you are not limited to thousands.
def increment(match):
    # re.sub passes a match object, so take the matched text first
    digits = match.group(0).replace('.', '')
    # zfill restores leading zeros, e.g. the 0 in 0.9.9.9
    number = str(int(digits) + 1).zfill(len(digits))
    return '.'.join(number)
Add 1 to the last element. If it's more than 9, set it to 0 and do the same for the previous element. Repeat as necessary:
import re
def increment(match):
    # convert the four matches to integers
    g = [int(x) for x in match.groups()]
    # increment, last one first
    pos = len(g)-1
    g[pos] += 1
    while pos > 0:
        if g[pos] > 9:
            g[pos] = 0
            pos -= 1
            g[pos] += 1
        else:
            break
    # return the replacement string
    return '.'.join(str(x) for x in g)
print (re.sub(r"\b(\d+)\.(\d+)\.(\d+)\.(\d+)\b", increment, '1.8.9.9'))
print (re.sub(r"\b(\d+)\.(\d+)\.(\d+)\.(\d+)\b", increment, '1.9.9.9'))
print (re.sub(r"\b(\d+)\.(\d+)\.(\d+)\.(\d+)\b", increment, '9.9.9.9'))
Result:
1.9.0.0
2.0.0.0
10.0.0.0

python intelligent hexadecimal numbers generator

I want to generate 12-character strings of hexadecimal digits, BUT with no more than 2 identical digits in a row: 00 is allowed, 000 is not.
I know how to generate ALL possibilities, from 000000000000 to FFFFFFFFFFFF, but I won't use all of those values, and the file generated with ALL possibilities is many GB in size, so I want to reduce it by skipping the strings that are not useful.
So my goal is to have results like 00A300BF8911 and not like 000300BF8911.
Could you please help me do so?
Many thanks in advance!
If you picked the same digit twice in a row, remove it from the choices for one round:
import random
hex_digits = '0123456789ABCDEF'
result = ""
pick_from = hex_digits
for _ in range(12):
    # draw from the (possibly restricted) pool, not from all digits
    cur_digit = random.choice(pick_from)
    # if this digit repeats the previous one, exclude it from the next pick
    if result and result[-1] == cur_digit:
        pick_from = hex_digits.replace(cur_digit, '')
    else:
        pick_from = hex_digits
    result += cur_digit
print(result)
Since the title mentions generators, here's the above as a generator:
import random
hex_digits = '0123456789ABCDEF'
def hexGen():
    while True:
        result = ""
        pick_from = hex_digits
        for _ in range(12):
            # draw from the (possibly restricted) pool, not from all digits
            cur_digit = random.choice(pick_from)
            # if this digit repeats the previous one, exclude it from the next pick
            if result and result[-1] == cur_digit:
                pick_from = hex_digits.replace(cur_digit, '')
            else:
                pick_from = hex_digits
            result += cur_digit
        yield result
my_hex_gen = hexGen()
counter = 0
for result in my_hex_gen:
    print(result)
    counter += 1
    if counter > 10:
        break
Results:
1ECC6A83EB14
D0897DE15E81
9C3E9028B0DE
CE74A2674AF0
9ECBD32C003D
0DF2E5DAC0FB
31C48E691C96
F33AAC2C2052
CD4CEDADD54D
40A329FF6E25
5F5D71F823A4
You could also change the while True loop so that it produces only a certain number of these strings, based on a number passed into the function.
I interpret this question as, "I want to construct a rainbow table by iterating through all strings that have the following qualities. The string has a length of 12, contains only the characters 0-9 and A-F, and it never has the same character appearing three times in a row."
def iter_all_strings_without_triplicates(size, last_two_digits = (None, None)):
    a,b = last_two_digits
    if size == 0:
        yield ""
    else:
        for c in "0123456789ABCDEF":
            if a == b == c:
                continue
            else:
                for rest in iter_all_strings_without_triplicates(size-1, (b,c)):
                    yield c + rest
for s in iter_all_strings_without_triplicates(12):
    print(s)
Result:
001001001001
001001001002
001001001003
001001001004
001001001005
001001001006
001001001007
001001001008
001001001009
00100100100A
00100100100B
00100100100C
00100100100D
00100100100E
00100100100F
001001001010
001001001011
...
Note that about 2.7 × 10^14 strings qualify, roughly 96% of all 16^12 possible strings, which amounts to petabytes of output, so you aren't saving much room compared to just saving every single string, triplicates or not.
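To see how little the triplicate rule removes, the qualifying strings can be counted without generating them. The sketch below is not part of the original answer. It tracks two counts, strings ending in a single occurrence of their last character and strings ending in a doubled character, and the function name `count_without_triplicates` is made up for illustration.

```python
def count_without_triplicates(length, alphabet_size=16):
    """Count strings of the given length in which no character
    appears three times in a row."""
    if length == 0:
        return 1  # only the empty string
    # end_single: strings whose last character differs from the one before it
    # end_double: strings whose last two characters are equal (but not three)
    end_single, end_double = alphabet_size, 0
    for _ in range(length - 1):
        end_single, end_double = (
            (end_single + end_double) * (alphabet_size - 1),  # append a different character
            end_single,                                       # repeat the last character once
        )
    return end_single + end_double

total = 16 ** 12
allowed = count_without_triplicates(12)
print(allowed, total, allowed / total)
```

For length 12 the ratio comes out at roughly 0.96, so excluding triplicates removes only a few percent of the strings.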
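import string, random
source = string.hexdigits[:16]
result = ''
while len(result) < 12 :
    # random.choice avoids the off-by-one of randint(0, len(source))
    digit = random.choice(source)
    # accept the digit unless it would be the third identical one in a row
    if len(result) < 2 or result[-1] != result[-2] or result[-1] != digit :
        result += digit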
You could take a random sequence from a list that contains each hexadecimal digit twice:
import random
digits = list('1234567890ABCDEF') * 2
random.shuffle(digits)
hex_number = ''.join(digits[:12])
If you wanted to allow shorter sequences, you could randomize the length too and left-fill the blanks with zeros.
import random
digits = list('1234567890ABCDEF') * 2
random.shuffle(digits)
num_digits = random.randrange(3, 13)
hex_number = ''.join(['0'] * (12-num_digits)) + ''.join(digits[:num_digits])
print(hex_number)
You could use a generator that slides a window over the strings your current implementation yields, something like (hex_str[i:i + window_size] for i in range(len(hex_str) - window_size + 1)) with window_size = 3. Using len and set, you could count the number of different characters in each slice, although in your example it might be easier to just compare all 3 characters.
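The window idea can be sketched as a small filter. The name `has_triplicate` and the default `window_size` of 3 are assumptions made for illustration. The set of a window collapses to a single element exactly when all characters in that window are equal.

```python
def has_triplicate(hex_str, window_size=3):
    """Return True if any window of window_size consecutive characters
    consists of one repeated character."""
    windows = (
        hex_str[i:i + window_size]
        for i in range(len(hex_str) - window_size + 1)
    )
    return any(len(set(window)) == 1 for window in windows)

candidates = ["00A300BF8911", "000300BF8911"]
# keep only the strings without three identical characters in a row
print([s for s in candidates if not has_triplicate(s)])
```

Filtering this way still requires generating every candidate first, so it trades speed for simplicity.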
You can create a list of the values from 0 to 255 and use random.sample on that list to pick the values for your string.

What is the best way to write a code in python to read some sample input from a "txt" files?

I know it's a very basic question, but I am a newbie in the Python environment. I am writing my first program (a data structure problem) where I need to read some input test cases.
Input:
The first line contains the number of test cases T. T test cases follow.
The first line for each case contains N, the number of elements to be sorted.
The next line contains N integers a[1],a[2]...,a[N].
Constraints:
1 <= T <= 5
1 <= N <= 100000
1 <= a[i] <= 1000000
Sample Input:
2
5
1 1 1 2 2
5
2 1 3 1 2
I wrote the following program to read the above input from a file, but I am sure this is not the best way to do it, because it contains a lot of if-else branches and for loops, which will perform badly on large inputs.
sample = open('sample.txt')
first = sample.readline()
if len(first) > 5 or len(first) <1:
    print "Not correct input";
else:
    test = sample.readline
    for x in range(0,len(first)):
        second = sample.readline()
        if len(second) >100000 or len(second) < 1:
            print "wrong input";
        else:
            third = list()
            for y in range(0, len(third)):
                third.append(sample.readline()[:1])
            method_test(third) #calling a method for each sample input
Please suggest the best solution.
This should do it:
with open('sample.txt') as sample:
    num_testcases = int(sample.readline())
    assert 1 <= num_testcases <= 5
    for testcase in range(num_testcases):
        num_elems = int(sample.readline())
        assert 1 <= num_elems <= 100000
        # list() so that len() also works on Python 3, where map is lazy
        elems = list(map(int, sample.readline().split()))
        assert len(elems) == num_elems
        assert all(1 <= elem <= 1000000 for elem in elems)
        method_test(elems)
Edit: Added validity checks.
First, len(x) tells you the length of the input string, so if your input line is "9", len(line) will be 1; if your input line is "999", len(line) will be 3. You need to use int(line) to read a number from the input file correctly.
The logic of the rest of the program doesn't look right - for example, you are reading the first line (the number of tests), and looping over this number (which is fine) - but you are reading the number of values outside this loop, which is the wrong order.
I strongly recommend you print out the various values as you read them, so you can follow what is going on, and debug your program more easily.
Finally, when you do the following:
third = list()
for y in range(0, len(third)):...
you are creating an empty list with list() and then looping from 0 to the length of that list (which is also zero), so the loop won't actually do anything.
Something like this:
Use cycle() to read only alternate lines after the first line; the cycle is limited to twice the value of T.
from itertools import islice,cycle
with open("data1.txt") as f:
    T = int(f.readline())
    if T != 0:
        cyc=islice(cycle((False,True)),T*2)
        for x in cyc:
            # on False, readline() consumes the count line; on True, the data line is parsed
            if x or not f.readline():
                print(list(map(int,f.readline().split())))
output:
[1, 1, 1, 2, 2]
[2, 1, 3, 1, 2]
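For comparison, here is a minimal Python 3 sketch that collects every test case into a list instead of printing it. `read_test_cases` is a hypothetical name, and `io.StringIO` stands in for the real file so the example is self-contained. With a real file you would pass the object returned by `open('sample.txt')`.

```python
import io

def read_test_cases(stream):
    """Parse T test cases, each made of a count line followed by
    a line of whitespace-separated integers."""
    num_cases = int(stream.readline())
    cases = []
    for _ in range(num_cases):
        num_elems = int(stream.readline())
        elems = [int(tok) for tok in stream.readline().split()]
        # reject inputs where the count line and the data line disagree
        if len(elems) != num_elems:
            raise ValueError("expected %d values, got %d" % (num_elems, len(elems)))
        cases.append(elems)
    return cases

sample = io.StringIO("2\n5\n1 1 1 2 2\n5\n2 1 3 1 2\n")
print(read_test_cases(sample))  # [[1, 1, 1, 2, 2], [2, 1, 3, 1, 2]]
```

Reading line by line keeps only one test case in memory at a time while parsing, which matters when N approaches the upper constraint.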
