Python "zfill()" equivalent in OCaml - python

I'm a beginner and I have to learn Ocaml for scientific programming. I just have one question:
Is there an equivalent of Python's .zfill() method in Ocaml to make leading zeros appear in a string?

Strings in OCaml are immutable. That means you're not supposed to modify a string but to create a new one.
There is no zfill function in the standard library, but you can easily make one that way:
let zfill s width =
  let to_fill = width - (String.length s) in
  if to_fill <= 0 then s
  else (String.make to_fill '0') ^ s
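For readers coming from Python, here is a small sketch of what the function above does, transcribed to Python. The name zfill_simple is made up for this illustration. It also shows one difference from Python's own str.zfill, which places the zeros after a leading sign:

```python
def zfill_simple(s, width):
    """Hypothetical Python mirror of the OCaml zfill above: prepend zeros only."""
    to_fill = width - len(s)
    if to_fill <= 0:
        return s
    return "0" * to_fill + s

# Same result as str.zfill for unsigned input.
print(zfill_simple("42", 5), "42".zfill(5))
# Differs for signed input: str.zfill keeps the sign in front of the zeros.
print(zfill_simple("-42", 5), "-42".zfill(5))
```

If you need the sign-aware behaviour in OCaml, the simple version would have to be extended to check the first character.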

I don't think there's one.
You can do it easily with built-in functions when you're working with numbers. For instance, to print the number 142857 with leading zeros over 30 characters, use Printf.printf "%030d" 142857.
You can also make it work with strings if you're fine with using leading spaces instead of leading zeros. For instance, Printf.printf "%30s" "abcdefg".
Finally, if you have to, you can define your own function.
The first two options work by using Printf, which is an extremely useful tool you really should learn at some point. Here is its documentation for OCaml, but a lot of programming languages have a similar tool.
In %030d, we started from %d which is a placeholder that will be replaced by an integer (in our case, 142857). We fixed its minimum width to 30 (right-aligned by default) by adding 30 between the two characters: %30d. Finally, we added the option to make the leading characters zeros instead of spaces by adding a 0 after the percent sign.
%30s is just a placeholder for a right-aligned string of at least 30 characters (with leading spaces, because the option for leading zeros only works with numbers).
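Since the question comes from a Python background: the same two directives exist in Python's % formatting, so the behaviour described above can be tried there directly. This is only a comparison sketch in Python, not OCaml code:

```python
number = 142857

# %030d: integer placeholder, minimum width 30, padded with zeros.
padded = "%030d" % number
# %30s: string placeholder, minimum width 30, padded with spaces.
text = "%30s" % "abcdefg"

print(padded)
print(text)
# For a non-negative integer, zero-padding matches str.zfill.
print(padded == str(number).zfill(30))
```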
Now here's a zfill function if for some reason you can't use a well-chosen Printf format in your scenario:
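(* Note: s and the result have type bytes here, not string. *)
let zfill n s =
  let length = Bytes.length s in
  if n <= length then
    s
  else
    let result = Bytes.make n '0' in
    (* copy s into the last [length] positions of result *)
    Bytes.blit s 0 result (n - length) length;
    result
;;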
Note that if performance matters (though it probably doesn't), this should be faster than creating a string of zeros and then concatenating it with s: blit copies s directly into the already allocated result, whereas concatenation first needs a temporary string of zeros and then allocates a second string for the result. In most scenarios it shouldn't matter all that much, and you can use either option.


Indexing the wrong character for an expression

My program seems to be indexing the wrong character or not at all.
I wrote a basic calculator that allows expressions to be used. It works by having the user enter the expression, then turning it into a list, and indexing the first number at position 0 and then using try/except statements to index number2 and the operator. All this is in a while loop that is finished when the user enters done at the prompt.
The program seems to work fine if I type the expression like this "1+1" but if I add spaces "1 + 1" it cannot index it or it ends up indexing the operator if I do "1+1" followed by "1 + 1".
I have asked in a group chat before and someone told me to use tokenization instead of my method, but I want to understand why my program is not running properly before moving on to something else.
Here is my code:
https://hastebin.com/umabukotab.py
Thank you!
Strings are basically lists of characters. 1+1 contains three characters, whereas 1 + 1 contains five, because of the two added spaces. Thus, when you access the third character in this longer string, you're actually accessing the middle element.
Parsing input is often not easy, and parsing arithmetic expressions in particular can get tricky quite quickly. Removing spaces from the input, as suggested by @Sethroph, is a viable solution, but it will only go so far. If you suddenly need to support things like 1+2+3, it will still break.
Another solution would be to split your input on the operator. For example:
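expression = '1 + 2'           # renamed from input, which shadows the built-in
terms = expression.split('+')  # ['1 ', ' 2'] note the spaces
terms = list(map(int, terms))  # [1, 2] since int() can handle leading/trailing whitespace
output = sum(terms)            # also works for inputs such as '1 + 2 + 3'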
Still, although this can handle situations like 1 + 2 + 3, it will still break when there's multiple different operators involved, or there are parentheses (but that might be something you need not worry about, depending on how complex you want your calculator to be).
IMO, a better approach would indeed be to use tokenization. Personally, I'd use parser combinators, but that may be a bit overkill. For reference, here's an example calculator whose input is parsed using parsy, a parser combinator library for Python.
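To give an idea of what tokenization means here, the following is a minimal hand-written sketch. It is not the parsy example mentioned above, and the function name tokenize is made up for this illustration. It only handles non-negative integers and the four basic operators:

```python
def tokenize(expression):
    """Split an expression into integer and operator tokens, skipping spaces."""
    tokens = []
    number = ""
    for ch in expression:
        if ch in "0123456789":
            number += ch                 # keep collecting the digits of one number
        else:
            if number:                   # a non-digit ends the current number
                tokens.append(int(number))
                number = ""
            if ch in "+-*/":
                tokens.append(ch)
            elif not ch.isspace():
                raise ValueError(f"unexpected character: {ch!r}")
    if number:                           # flush a number at the end of the input
        tokens.append(int(number))
    return tokens

print(tokenize("1+1"))
print(tokenize("1 + 1"))
```

Both calls produce the same token list, so the code that does the arithmetic no longer depends on where the spaces are.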
You could remove the spaces before processing the string by using replace().
Try adding in:
clean_input = hold_input.replace(" ", "")
just after you create hold_input.

Finding the end of a contiguous substring of a string without iteration or RegEx

I'm trying to write an iterative LL(k) parser, and I've gotten strings down pretty well, because they have a start and end token, and so you can just "".join(tokenlist[string_start:string_end]).
Numbers, however, do not, and only consist of .0123456789. They can occur at any given point in a program, have any arbitrary length and are delimited purely by non-numerals.
Some examples, because that definition is pretty vague:
56 123.45/! is 56 and 123.45 followed by two other tokens
565.5345.345 % is 565.5345, 0.345 and two other tokens (incl. whitespace)
The problem I'm trying to solve is how the parser should figure out where a numeric literal ends. (Note that this is a context-free, self-modifying interpretive grammar thus there is no separate lexical analysis to be done.)
I could and have solved this with iteration:
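def _next_notinst(self, atindex, subs = DIGITS):
    """return the next index of a char not in subs"""
    # start=atindex makes i an index into self.toklist rather than into the slice
    for i, e in enumerate(self.toklist[atindex:], start=atindex):
        if e not in subs:
            # negative index, counted from the end of self.toklist
            return i - len(self.toklist)
    # no char outside subs was found
    return self.idx.v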
(I don't think I need to clarify the variables, since it's an example and extremely straightforward.)
Great! That works, but there are at least two issues:
It's O(n) for a number with digit-length n. Not ideal.*
The parser class of which this method is a member is already using a while True: to cycle over arbitrary parts of the string, and I would prefer not having remotely nested loops when I don't need to.
From the previous bullet: since the parser uses arbitrary k lookahead and skipahead, parsing each individual token is absolutely not what I want.
I don't want to use RegEx, mostly because I don't know it, and using it for this right now would make my code incomprehensible to me, its creator.
There must be a simple, < O(n) solution to this, that simply collects the contiguous numerals in a string given a starting point, up until a non-numeral.
*Yes, I'm fully aware the parser itself is O(n), but we don't also need the number catenator to be > O(n). If you don't believe me, the string catenator is O(1) because it simply looks for the next unescaped " in the program and then joins all the chars up to that. Can't I do the same thing for numbers?
My other answer was actually erroneous due to lack of testing.
I decided to suck it up and learn a little bit of RegEx just because it's the only other way to solve this.
^([.\d]+[.\d]+|[.\d]) works for what I want, and matches these:
123.43.453""
.234234!/%
but not, for example:
"1233
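Here is a minimal sketch of how such a pattern could be used from Python to find where a numeral ends, starting at a given index. The helper name numeral_end is made up for this illustration. It uses the shorter pattern [.\d]+, which as far as I can tell accepts the same runs as the one above, and it drops the ^ because match() with a start position already anchors the match there:

```python
import re

# One or more characters from the set ".0123456789" (and other Unicode digits).
NUMERAL_RUN = re.compile(r"[.\d]+")

def numeral_end(text, start):
    """Return the index just past the run of numeral characters beginning at start.

    If text[start] is not a numeral character, start itself is returned.
    """
    match = NUMERAL_RUN.match(text, start)
    return match.end() if match else start

program = "56 123.45/!"
end = numeral_end(program, 3)
print(program[3:end])
```

Note that the regex engine still has to look at every character of the run, so this is not faster than O(n) in the length of the number. It only moves the loop out of the parser's own code.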

Adjust my input to the right - python

I'm trying to adjust my text in python, so that the input gets adjusted to the right. I want the last character in the input to be in the 60th position. I therefore used the following script:
def adjust_right(s):
    print(' '*60 - len(s)*' '+s)
adjust_right(input())
This works if I change the - to a +, but that does the reverse.
My question is: Why does this generate an error, when it works perfectly with a +, instead of a -?
Could the answer be, that if len(s) > 60, we get a negative amount of spaces? If this is the case, how should I rewrite my code?
I think you mean:
print(' '*(60 - len(s))+s)
You're trying to subtract a string of spaces from another string of spaces. The reason that + works is that it concatenates the strings. But Python doesn't have string subtraction.
Because * binds more tightly than -, your expression ends up subtracting one string from another. To avoid the problem altogether, use str.rjust:
print(s.rjust(60)) # right justify string with length = 60
Python has no string subtraction. (It is generally considered "unpythonic": operations like these are either syntactic sugar that the programmer can easily write by hand, or are already covered by existing functions and methods.)
To take len(s) spaces off of ' '*60, slice the string instead: (' '*60)[len(s):]. If s is 60 characters or longer, the slice is simply empty and nothing breaks.
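Putting the answers above together, a version of the function could look like the sketch below. The width parameter and the choice to return the string instead of printing it are additions made for this illustration. It also covers the len(s) > 60 case from the question: multiplying a string by a negative number in Python gives an empty string, so the max() is there only to make the intent explicit.

```python
def adjust_right(s, width=60):
    """Return s padded with leading spaces so that it ends at column width."""
    padding = max(0, width - len(s))   # no padding when s is already too long
    return " " * padding + s

print(adjust_right("hello"))
# str.rjust gives the same result.
print(adjust_right("hello") == "hello".rjust(60))
```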

How to make binary to hex converter using for loop - Python

Yes, this is homework.
I have the basic idea. I know that basically I need to introduce a for loop and set ifs saying that if the value is above 9 then it's a, b, c, and so forth. But what I need is to get the for loop to grab the integer and its index number to do the calculation, and then print out the hex. By the way, it's an 8-bit binary number and it has to come out in two-digit hex form.
thanks a lot!!
I'm assuming that you have a string containing the binary data.
In Python, you can iterate over all sorts of things, strings included. It becomes as simple as this:
for char in mystring:
pass
And replace pass with your suite (a term meaning a "block" of code). At this point, char will be a single-character string. Nice and straightforward.
For getting the character ordinal, investigate ord (find help for it yourself, it's not hard and it's good practice).
For converting the number to hex, you could use % string formatting with '%x', which will produce a value like '9f', or you could use the hex function, which will produce a value like '0x9f'; there are other ways, too.
If you can't figure anything out, ask; but try to work it out first. It's your homework. :-)
So assuming that you've got the binary number in a string, you will want to have an index variable that gets incremented with each iteration of the for loop. I'm not going to give you the exact code, but consider this:
Python's for loop is designed to set the index variable (for index in list) to each value of a list of values.
You can use the range function to generate a list of numbers (say, from 0 to 7).
You can get the character at a given index in a string by using e.g. binary[index].
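For reference, the three hints above could be combined roughly as follows. This is only one possible sketch, and the names binary_to_hex and HEX_DIGITS are made up for this illustration. It assumes the input is a string of exactly eight '0'/'1' characters:

```python
HEX_DIGITS = "0123456789abcdef"

def binary_to_hex(binary):
    """Convert an 8-character binary string to a two-digit hex string."""
    value = 0
    for index in range(len(binary)):      # index runs over 0..7
        bit = int(binary[index])          # the character at this position, as 0 or 1
        value = value * 2 + bit           # shift what we have so far and add the new bit
    high, low = divmod(value, 16)         # split into the two hex digits
    return HEX_DIGITS[high] + HEX_DIGITS[low]

print(binary_to_hex("10011111"))
```

Looking the digit up in HEX_DIGITS replaces the chain of ifs for the values above 9.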

How to work with very long strings in Python?

I'm tackling Project Euler's problem 220 (looked easy, in comparison to some of the others - thought I'd try a higher numbered one for a change!)
So far I have:
D = "Fa"
def iterate(D,num):
    for i in range (0,num):
        D = D.replace("a","A")
        D = D.replace("b","B")
        D = D.replace("A","aRbFR")
        D = D.replace("B","LFaLb")
    return D
instructions = iterate("Fa",50)
print(instructions)
Now, this works fine for low values, but when you put it to repeat higher then you just get a "Memory error". Can anyone suggest a way to overcome this? I really want a string/file that contains instructions for the next step.
The trick is in noticing which patterns emerge as you run the string through each iteration. Try evaluating iterate(D,n) for n between 1 and 10 and see if you can spot them. Also feed the string through a function that calculates the end position and the number of steps, and look for patterns there too.
You can then use this knowledge to simplify the algorithm to something that doesn't use these strings at all.
Python strings are not going to be the answer to this one. Strings are stored as immutable arrays, so each one of those replacements creates an entirely new string in memory. Not to mention, the set of instructions after 10^12 steps will be at least 1TB in size if you store them as characters (and that's with some minor compressions).
Ideally, there should be a way to mathematically (hint, there is) generate the answer on the fly, so that you never need to store the sequence.
Just use the string as a guide to determine a method which creates your path.
If you think about how many "a" and "b" characters there are in D(0), D(1), etc, you'll see that the string gets very long very quickly. Calculate how many characters there are in D(50), and then maybe think again about where you would store that much data. I make it 4.5*10^15 characters, which is 4500 TB at one byte per char.
Come to think of it, you don't have to calculate: the problem tells you there are at least 10^12 steps, which is a terabyte of data at one byte per character, or a quarter of that if you use tricks to get down to 2 bits per character. I think this would cause problems with the one-minute time limit on any kind of storage medium I have access to :-)
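The character count can be checked without ever building the string. In the sketch below, instruction_length is a made-up helper name, and iterate is a compact copy of the function from the question, used only to cross-check small cases. The reasoning is that every 'a' or 'b' turns into five characters that again contain exactly one 'a' and one 'b':

```python
def iterate(d, num):
    """The string rewriting from the question, for cross-checking small n."""
    for _ in range(num):
        d = d.replace("a", "A").replace("b", "B")
        d = d.replace("A", "aRbFR").replace("B", "LFaLb")
    return d

def instruction_length(n):
    """Length of D(n), computed without building the string."""
    length, letters = 2, 1        # D(0) = "Fa": two characters, one rewritable letter
    for _ in range(n):
        length += 4 * letters     # each 'a' or 'b' grows from 1 to 5 characters
        letters *= 2              # and leaves behind one 'a' and one 'b'
    return length

print(instruction_length(50))
```

For n = 50 this gives 2^52 - 2, which is about 4.5*10^15 and agrees with the figure above.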
Since you can't materialize the string, you must generate it. If you yield the individual characters instead of returning the whole string, you might get it to work.
def repl220( string ):
    for c in string:
        if c == 'a': yield "aRbFR"
        elif c == 'b': yield "LFaLb"
        else: yield c
Something like that will do replacement without creating a new string.
Now, of course, you need to call it recursively, and to the appropriate depth. So, each yield isn't just a yield, it's something a bit more complex.
Trying not to solve this for you, so I'll leave it at that.
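For readers who want to see the shape of such a recursive generator, here is one possible sketch. The names RULES and expand are made up for this illustration, and it deliberately stops short of solving the problem: it only produces the instruction characters lazily.

```python
from itertools import islice

RULES = {"a": "aRbFR", "b": "LFaLb"}

def expand(string, depth):
    """Lazily yield the characters of string after depth rewrites."""
    for c in string:
        if depth > 0 and c in RULES:
            # Recurse into the replacement instead of building a new string.
            yield from expand(RULES[c], depth - 1)
        else:
            yield c

# Only the characters actually requested are ever generated.
print("".join(islice(expand("Fa", 50), 10)))
```

Memory use stays small because nothing is stored, but the generator still visits every character, so walking through 10^12 steps this way is likely to remain far too slow. The pattern-based approach described in the other answers is still needed for that.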
Just as a word of warning be careful when using the replace() function. If your strings are very large (in my case ~ 5e6 chars) the replace function would return a subset of the string (around ~ 4e6 chars) without throwing any errors.
You could treat D as a byte stream file.
Something like:-
seedfile = open('D1.txt', 'w');
seedfile.write("Fa");
seedfile.close();
n = 0
while (n
warning totally untested
