How to retain leading zeros of int variables? - python

Below is a section of code which is part of a functional decryption and encryption program.
while checkvar < maxvar:  # uses < because maxvar is 1 too high for the index of var
    # output.append("%02d" % number)
    i = ord(var[checkvar]) - 64  # gets positional value of i
    i = "%02d" % i
    if (checkvar + 1) < maxvar:
        j = ord(var[checkvar + 1]) - 64  # gets positional value of j
        j = "%02d" % j
        i = str(i) + str(j)  # 'adds' the strings i and j to create a new i
    li.append(int(i))
    checkvar = checkvar + 2
print li
As you can see, the two variables i and j are first treated as strings so that a 0 can be added in front of any single-digit number. These variables are then combined to make a four-digit number (still as a string). Later in the program the numbers created are used in a pow() function as ints, which removes any leading zeros.
My question: Is it possible to force Python to keep the leading zero for ints? I have searched online and am continuing to do so.
Edit
To help people I have included the encryption part of the program. This is where the problem lies. The variables created in the above code are passed through a pow() function. As this can't handle strings I have to convert the variables to ints where the leading zero is lost.
# a = li[]
b = int(17)  # public = e
c = int(2773)  # = n
lenli = int(len(li))
enchecker = int(0)
# encrypted list
enlist = []
while enchecker < lenli:
    en = pow(li[enchecker], b, c)  # encrypt the message; pow() can only handle ints
    enlist.append(int(en))
    enchecker = enchecker + 1
print enlist

Though the comments above are right that 1, 01, and 001 are all the same as an int, it can be very helpful in temporal modeling or sequential movie making to maintain the leading zeros. I do it often to ensure movie clips are in proper order. The easy way to do that is zfill(), which ensures the str version of the number has at least the number of characters you tell it by filling in the left side of the string "number" with zeros.
>>> x = int(1)
>>> NewStringVariable = str(x).zfill(3)
>>> print NewStringVariable
001
>>> NewStringVariable = str(x).zfill(5)
>>> print NewStringVariable
00001

The concept of leading zeros is a display concept, not a numerical one. You can put an infinite number of leading zeros on a number without changing its value. Since it's not a numeric concept, it's not stored with the number.
You have to decide how many zeros you want when you convert the number to a string. You could keep that number separately if you want.
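Applied to the encryption code in the question, this suggests keeping the blocks as plain ints for pow() and restoring the leading zeros only when converting back to text. The following is a minimal sketch, not the asker's program. The values 17 and 2773 come from the question; the private exponent 157 and the sample blocks are assumptions for illustration (2773 = 47 * 59, and 17 * 157 leaves remainder 1 modulo 46 * 58 = 2668).

```python
E, N = 17, 2773  # public exponent and modulus, taken from the question
D = 157          # assumed private exponent: (17 * 157) % 2668 == 1

blocks = [102, 1001]  # hypothetical blocks; "0102" and "1001" once padded

cipher = [pow(m, E, N) for m in blocks]  # arithmetic works on plain ints
plain = [pow(c, D, N) for c in cipher]   # decrypts back to the original ints

# Leading zeros are reintroduced only at display time.
padded = ["%04d" % m for m in plain]
print(padded)
```

The width 4 is stored nowhere in the ints themselves; it is a decision made at formatting time.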

I was getting date strings in the format of hhmmss coming from the serial line of my Arduino.
Suppose I got s = "122041"; this would be 12:20:41, however 9am would be 090000.
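The statement print "%d" % (s) provokes a run-time error because s is a string and %d requires a number. (Writing the value as a bare literal such as 090000 in Python 2 source fails for a different reason: the leading 0 marks an octal literal, and 9 is not an octal digit.)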
To fix this problem:
print "%06d" % (int(s))
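As a small sketch of that round trip (written for Python 3; the variable names are illustrative and not from the original post), the string is parsed as a base-10 integer and then formatted back to six digits:

```python
s = "090000"               # hhmmss string as it might arrive from the serial line
value = int(s)             # base-10 parse; the leading zero is dropped
restored = "%06d" % value  # zero-padded back to six characters

# Split the fixed-width string into its time fields.
hours, minutes, seconds = restored[0:2], restored[2:4], restored[4:6]
print(value, restored, hours, minutes, seconds)
```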

Try this:
number = 1
print("%02d" % (number,))
or:
print("{:02d}".format(number))
The explanation of "%02d":
% - This marks the start of a format specifier, where a value will be inserted.
02 - This pads the value with leading zeros to a minimum width of 2 characters; longer values are not truncated.
d - This formats the value as a signed decimal integer.
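A few variations may make the roles of the width and the 0 flag clearer. This is only an illustrative sketch; the variable names are made up for the example.

```python
number = 7

padded = "%02d" % number             # zero-padded to a minimum width of 2
wide = "%02d" % 123                  # already wider than 2, so left unchanged
spaced = "%5d" % number              # without the 0 flag, padding uses spaces
formatted = "{:02d}".format(number)  # str.format equivalent of "%02d"

print(padded, wide, repr(spaced), formatted)
```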

Related

How to count cells containing numbers in specific range with cells that contain both text and numbers

I thought I could easily sort this issue out but it took me ages to solve just half of it.
I have a table that contains 100 data cells in a row. Data in each cell are either text-only or text and numbers (see layout at bottom).
I need a function that COUNTs how many cells are present in the table that report the value of N2 OR E to be >=37.
Negative
Positive (N2: 23, E: 23)
Negative
Positive (N2: 37, E: 26)
Positive (N2: 31, E: 38)
Function answer: 2
So far I could only extract each N2 number with a function [=MID(A2,15,FIND(",",A2)-15)] that starts at the 15th character; a second function then counts how many extracted numbers (they have been extracted into column B) are >=37, [=COUNTIF(B2:B100, ">=37")], but I have no clue how to take the E value into account.
In addition, I would like the function to count cells where either the N2 value OR the E value is >=37.
Is there a way to have one big function that does that? Is there a way not to rely on KUTools for Excel?
If you have the newest version of Excel, you can use FILTERXML after making some minor changes. First concatenate the whole range with CONCAT, then eliminate all ","s and replace ")"s with spaces in the concatenated string.
For example, the below gets you all the instances over 36 (if you only want the number of times, wrap it in a COUNT):
=FILTERXML("<t><s>"&SUBSTITUTE(
SUBSTITUTE(SUBSTITUTE(CONCAT($F$2:$F$7), ")", " "), ",", ""), " ",
"</s><s>")&"</s></t>", "//s[number()>=37]")
For more info on dealing with strings, see here.
EDIT: Thanks @MarkBalhoff for catching a missing space in the formula and
@JvdV for giving another way with =IFERROR(COUNT(FILTERXML("<t><s>"&SUBSTITUTE(TEXTJOIN(" ",,F2:F6)," ","</s><s>")&"</s></t>","//s[translate(.,',','')*1>=37 or translate(following::*[2],')','')*1>=37]")),0)
Since you include the python tag and also reference KU-Tools, I assume you have some familiarity with VBA.
You could easily, and flexibly, implement the logic in Excel VBA using regular expressions.
For this function, I allowed three arguments:
The range to search
The threshold for the values
A list of values to look for
In the regex, the pattern looks for the digits that follow either of the strings in "searchFor". Note that, as written, you need to include the colons in the searchFor strings, and that the strings are case-sensitive (easily changed).
Option Explicit
Function CountVals(r As Range, Threshold As Long, ParamArray searchFor() As Variant) As Long
    Dim RE As Object, MC As Object, M As Object
    Dim counter As Long
    Dim vSrc As Variant, v As Variant
    Dim sPat As String
    'read range into variant array for fastest processing
    vSrc = r
    'create Pattern
    sPat = "(?:" & Join(searchFor, "|") & ")\s*(\d+)"
    'initialize regex
    Set RE = CreateObject("vbscript.regexp")
    With RE
        .Global = True
        .ignorecase = False 'or change to true if capitalization not important
        .Pattern = sPat
        counter = 0
        'check each string for the values
        For Each v In vSrc
            Set MC = .Execute(v)
            For Each M In MC
                If CLng(M.submatches(0)) >= Threshold Then counter = counter + 1
            Next M
        Next v
        CountVals = counter
    End With
End Function
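Since the question also carries the python tag, the same regular-expression idea can be sketched in Python. This is a hedged illustration rather than a port of the VBA above: the function name count_cells and its parameters are hypothetical, and it counts each cell at most once (the "N2 OR E" reading of the question), whereas a per-match counter would count a cell twice if both of its values reach the threshold.

```python
import re

def count_cells(cells, threshold, *labels):
    """Count cells where any labelled number is >= threshold."""
    # Same pattern shape as the VBA: one of the labels, optional
    # whitespace, then the digits to compare.
    pattern = re.compile(r"(?:%s)\s*(\d+)" % "|".join(map(re.escape, labels)))
    return sum(
        1
        for cell in cells
        if any(int(m.group(1)) >= threshold for m in pattern.finditer(cell))
    )

cells = [
    "Negative",
    "Positive (N2: 23, E: 23)",
    "Negative",
    "Positive (N2: 37, E: 26)",
    "Positive (N2: 31, E: 38)",
]
print(count_cells(cells, 37, "N2:", "E:"))
```

As in the VBA version, the labels include the colon and are matched case-sensitively.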

Why my code consumes too much memory even after clearing list?

So I'm trying to solve this problem, and the question goes like this:
You probably all know the famous Japanese cartoon characters Nobita and Shizuka. Nobita and Shizuka are very good friends. However, Shizuka loves a special kind of string called Tokushuna.
A string T is called Tokushuna if:
The length of the string is greater than or equal to 3 (|T| ≥ 3)
It starts and ends with the character '1' (one)
It contains (|T|-2) '0' (zero) characters
Here |T| is the length of string T. For example, 10001 and 101 are Tokushuna strings, but 1100, 1111, and 0000 are not.
One day Shizuka gives Nobita a problem and promises to go on a date with him if he can solve it. Shizuka gives a string S and asks him to count the number of Tokushuna strings that can be found among all possible substrings of S. Nobita wants to go on the date with Shizuka, but as you know, he is very weak at math and counting and always gets the lowest marks in math. This time Doraemon is not present to help him, so he needs your help to solve the problem.
Input
The first line of the input contains an integer T, the number of test cases. In each test case, you are given a binary string S consisting only of 0 and 1.
Subtasks
Subtask #1 (50 points)
1 ≤ T ≤ 100
1 ≤ |S| ≤ 100
Subtask #2 (50 points)
1 ≤ T ≤ 100
1 ≤ |S| ≤ 105
Output
For each test case, output a line Case X: Y, where X is the case number and Y is the number of Tokushuna strings that can be found among all possible substrings of S.
Sample
Input
3
10001
10101
1001001001
Output
Case 1: 1
Case 2: 2
Case 3: 3
Look: in the first case, 10001 is itself a Tokushuna string.
In the second case, two substrings, S[1-3] 101 and S[3-5] 101, are Tokushuna strings.
What I've done so far
I've already solved the problem but the problem is it shows my code exceeds memory limit (512mb). I'm guessing it is because of the large input size. To solve that I've tried to clear the list which holds all the substring of one string after completing each operation. But this isn't helping.
My code
num = int(input())
num_list = []
for i in range(num):
    i = input()
    num_list.append(i)
def condition(a_list):
    case = 0
    case_no = 1
    sub = []
    for st in a_list:
        sub.append([st[i:j] for i in range(len(st)) for j in range(i + 1, len(st) + 1)])
        for i in sub:
            for item in i:
                if len(item) >= 3 and (item[0] == '1' and item[-1] == '1') and (len(item) - 2 == item.count('0')):
                    case += 1
        print("Case {}: {}".format(case_no, case))
        case = 0
        case_no += 1
        sub.clear()
condition(num_list)
Is there any better approach to solve the memory consumption problem?
Have you tried taking java heap dump and java thread dump? These will tell the memory leak and also the thread that is consuming memory.
Your method of creating all possible substrings won't scale very well to larger problems. If the input string is length N, the number of substrings is N * (N + 1) / 2 -- in other words, the memory needed will grow roughly like N ** 2. That said, it is a bit puzzling to me why your code would exceed 512MB if the length of the input string is always less than 105.
In any case, there is no need to store all of those substrings in memory, because a Tokushuna string cannot contain other Tokushuna strings nested within it:
1         # Leading one.
0...      # Some zeros. Cannot be or contain a Tokushuna.
1         # Trailing one. Could also be the start of the next Tokushuna.
That means a single scan over the string should be sufficient to find them all.
You could write your own algorithmic code to scan the characters and keep track of whether it finds a Tokushuna string. But that requires some tedious bookkeeping.
A better option is regex, which is very good at character-by-character analysis:
import sys
import re
# Usage: python foo.py 3 10001 10101 1001001001
cases = sys.argv[2:]
# Match a Tokushuna string without consuming the last '1', using a lookahead.
rgx = re.compile(r'10+(?=1)')
# Check the cases.
for i, c in enumerate(cases):
    matches = list(rgx.finditer(c))
    msg = 'Case {}: {}'.format(i + 1, len(matches))
    print(msg)
If you do not want to use regex, my first instinct would be to start the algorithm by finding the indexes of all of the ones: indexes = [j for j, c in enumerate(case) if c == '1']. Then pair those indexes up: zip(indexes, indexes[1:]). Then iterate over the pairs, checking that the part in the middle is non-empty; between two consecutive ones it can only contain zeros.
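The index-pairing idea can be sketched as follows. This is one possible reading of it, with a hypothetical function name; it relies on the observation that the characters between two consecutive ones are necessarily zeros, so a pair counts whenever the two ones are at least two positions apart.

```python
def count_tokushuna(s):
    """Count substrings of the form 1, one or more 0s, 1."""
    # Positions of every '1' in the string.
    ones = [j for j, c in enumerate(s) if c == '1']
    # Consecutive ones enclose only zeros; require at least one of them.
    return sum(1 for a, b in zip(ones, ones[1:]) if b - a >= 2)

for case_no, s in enumerate(["10001", "10101", "1001001001"], start=1):
    print("Case {}: {}".format(case_no, count_tokushuna(s)))
```

Memory use stays linear in the length of the input, since only the positions of the ones are stored.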
A small note regarding your current code:
# Rather than this,
sub = []
for st in a_list:
    sub.append([...])  # Incurs memory cost of the temporary list
                       # and a need to drill down to the inner list.
    ...
    sub.clear()        # Also requires a step that's easy to forget.
# just do this.
for st in a_list:
    sub = [...]
    ...

Save list of numbers to (binary) file with defined bits per number

I have a list/array of numbers, which I want to save to a binary file.
The crucial part is, that each number should not be saved as a pre-defined data type.
The bits per value are constant for all values in the list but do not correspond to the typical data types (e.g. byte or int).
import numpy as np
# create 10 random numbers in range 0-63
values = np.int32(np.round(np.random.random(10)*63));
# each value requires exactly 6 bits
# how to save this to a file?
# just for debug/information: bit string representation
bitstring = "".join(map(lambda x: str(bin(x)[2:]).zfill(6), values));
print(bitstring)
In the real project, there are more than a million values I want to store with a given bit depth.
I already tried the module bitstring, but appending each value to the BitArray costs a lot of time...
There may be some numpy-specific way that makes things easier, but here's a pure Python (2.x) way to do it. It first converts the list of values into a single integer since Python supports int values of any length. Next it converts that int value into a string of bytes and writes it to the file.
Note: If you're sure all the values will fit within the bit-width specified, the array_to_int() function could be sped up slightly by changing the (value & mask) it's using to just value.
import random
def array_to_int(values, bitwidth):
    mask = 2**bitwidth - 1
    shift = bitwidth * (len(values)-1)
    integer = 0
    for value in values:
        integer |= (value & mask) << shift
        shift -= bitwidth
    return integer
# In Python 2.7 int and long don't have the "to_bytes" method found in Python 3.x,
# so here's one way to do the same thing.
def to_bytes(n, length):
    return ('%%0%dx' % (length << 1) % n).decode('hex')[-length:]
BITWIDTH = 6
#values = [random.randint(0, 2**BITWIDTH - 1) for _ in range(10)]
values = [0b000001 for _ in range(10)]  # create fixed pattern for debugging
values[9] = 0b011111  # make last one different so it can be spotted
# just for debug/information: bit string representation
bitstring = "".join(map(lambda x: bin(x)[2:].zfill(BITWIDTH), values))
print(bitstring)
bigint = array_to_int(values, BITWIDTH)
width = BITWIDTH * len(values)
print('{:0{width}b}'.format(bigint, width=width))  # show integer's value in binary
num_bytes = (width + 7) // 8  # round up to a whole number of 8-bit bytes
with open('data.bin', 'wb') as file:
    file.write(to_bytes(bigint, num_bytes))
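For readers on Python 3, where int.to_bytes and int.from_bytes exist, the same big-integer approach might look like the sketch below. The function names pack_values and unpack_values are hypothetical, and the layout (values packed most significant first, with any padding bits at the front) mirrors the code above rather than any standard format. For millions of values, building one huge integer may be slow, so treat this as an illustration of the idea.

```python
def pack_values(values, bitwidth):
    """Pack fixed-width values into bytes, first value most significant."""
    mask = (1 << bitwidth) - 1
    big = 0
    for value in values:
        big = (big << bitwidth) | (value & mask)
    num_bytes = (bitwidth * len(values) + 7) // 8  # round up to whole bytes
    return big.to_bytes(num_bytes, "big")

def unpack_values(data, bitwidth, count):
    """Inverse of pack_values; count is needed because padding is ambiguous."""
    big = int.from_bytes(data, "big")
    mask = (1 << bitwidth) - 1
    return [(big >> (bitwidth * (count - 1 - k))) & mask for k in range(count)]

values = [0b000001] * 9 + [0b011111]
data = pack_values(values, 6)
print(len(data), unpack_values(data, 6, len(values)) == values)
```

The number of values has to be stored or known separately, since the padding bits cannot be told apart from a leading zero value.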
Since you give an example with a string, I'll assume that's how you get the results. This means performance is probably never going to be great. If you can, try creating bytes directly instead of via a string.
Side note: I'm using Python 3 which might require you to make some changes for Python 2. I think this code should work directly in Python 2, but there are some changes around bytearrays and strings between 2 and 3, so make sure to check.
byt = bytearray(len(bitstring)//8 + 1)
for i, b in enumerate(bitstring):
    byt[i//8] += (b=='1') << i%8
and for getting the bits back:
bitret = ''
for b in byt:
    for i in range(8):
        bitret += str((b >> i) & 1)
For millions of bits/bytes you'll want to convert this to a streaming method instead, as you'd need a lot of memory otherwise.

Pad an integer to a given length with one 0 in front and some at the end

I need to manipulate a number as follows,
inputs
1
23456
6674321
outputs
01000000
02345600
06674321
Simply put, it adds a zero in front of the number and, if the result is still shorter than eight characters, adds 0s to the end. It should be a number, not a string. Is there a simple way to get this done without casting from string to int or int to string?
The Sagemath code I tried is as follows. It only adds zeros to the front to pad the number to 8 characters. I need to modify it as described above.
for num in range(1,25):
    s=randrange(2^16)
    r=mod((s-1)*503,randrange(2^32-1))
    print "%08d" % (r)
To pad on the right means to multiply the number by an appropriate power of 10. The power 6-floor(log(x,10)) does the job here, since you want 1000000 to not be padded.
for x in range(1, 101):
    print '%08d' % (x*10^(6-floor(log(x,10))))
This assumes that x is in a range where such padding is possible at all: that is, an integer between 1 and 9999999.
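In plain Python (where ^ is not exponentiation), the same right-padding can be sketched with integer arithmetic only, which also sidesteps floating-point rounding in the logarithm. The function name pad_number and its width parameter are assumptions made for this example.

```python
def pad_number(x, width=7):
    """Multiply x by 10 until it has exactly `width` digits."""
    if not 1 <= x < 10 ** width:
        raise ValueError("x must have between 1 and %d digits" % width)
    while x < 10 ** (width - 1):
        x *= 10  # appends one trailing zero
    return x

for x in (1, 23456, 6674321):
    print("%08d" % pad_number(x))
```

The result is still an int; the single leading zero appears only when it is formatted with "%08d".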

How do I use string formatting to show BOTH leading zeros and precision of 3?

I'm trying to represent a number with leading and trailing zeros so that the total width is 7 including the decimal point. For example, I want to represent "5" as "005.000". It seems that string formatting will let me do one or the other but not both. Here's the output I get in Ipython illustrating my problem:
In [1]: '%.3f'%5
Out[1]: '5.000'
In [2]: '%03.f'%5
Out[2]: '005'
In [3]: '%03.3f'%5
Out[3]: '5.000'
Line 1 and 2 are doing exactly what I would expect. Line 3 just ignores the fact that I want leading zeros. Any ideas? Thanks!
The first number is the total field width, including the decimal point.
>>> '%07.3f' % 5
'005.000'
Important Note: Both decimal points (.) and minus signs (-) are included in the count.
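A short sketch of how the field width interacts with the sign and with values that are too wide (the variable names are only for illustration):

```python
positive = "%07.3f" % 5      # 7 characters in total, zero-padded on the left
negative = "%07.3f" % -5     # the minus sign uses one of the 7 characters
too_wide = "%07.3f" % 1234.5 # the width is a minimum, so nothing is cut off
same = "{:07.3f}".format(5)  # str.format spelling of the same specifier

print(positive, negative, too_wide, same)
```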
It took me a second to figure out how to do it @nosklo's way but with .format() and nesting.
Since I could not find an example anywhere else at the moment, I am sharing it here.
Example using "{}".format(a)
Python 2
>>> a = 5
>>> print "{}".format('%07.3F' % a)
005.000
>>> print("{}".format('%07.3F' % a))
005.000
Python 3
A more Python 3 way, created from the docs, but both work as intended.
Pay attention to the % vs the :; the placement of the format specifier is also different in Python 3.
>>> a = 5
>>> print("{:07.3F}".format(a))
005.000
>>> a = 5
>>> print("Your Number is formatted: {:07.3F}".format(a))
Your Number is formatted: 005.000
Example using "{}".format(a) Nested
Then expanding that to fit my code, that was nested .format()'s:
print("{}: TimeElapsed: {} Seconds, Clicks: {} x {} "
      "= {} clicks.".format(_now(),
                            "{:07.3F}".format((end - start).total_seconds()),
                            clicks, _ + 1, ((_ + 1) * clicks),
                            )
      )
Which formats everything the way I wanted.
Result
20180912_234006: TimeElapsed: 002.475 Seconds, Clicks: 25 + 50 = 75 clicks.
Important Things To Note:
@babbitt: The first number is the total field width.
@meawoppl: This also counts the minus sign!...
[Edit: Gah, beaten again]
'%07.3F'%5
The first number is the total field width.
