How to read stdin to a 2d python array of integers?

I would like to read a 2d array of integers from stdin (or from a file) in Python.
Non-working code:
from StringIO import StringIO
from array import array
# fake stdin
stdin = StringIO("""1 2
3 4
5 6""")
a = array('i')
a.fromstring(stdin.read())
This gives me an error: a.fromstring(stdin.read())
ValueError: string length not a multiple of item size

Several approaches to accomplish this are available. Below are a few of the possibilities.
Using an array
From a list
Replace the last line of code in the question with the following.
a.fromlist([int(val) for val in stdin.read().split()])
Now:
>>> a
array('i', [1, 2, 3, 4, 5, 6])
Con: does not preserve 2d structure (see comments).
From a generator
Note: this option is incorporated from comments by eryksun.
A more efficient way to do this is to use a generator instead of the list. Replace the last two lines of the code in the question with:
a = array('i', (int(val) for row in stdin for val in row.split()))
This produces the same result as the option above, but avoids creating the intermediate list.
Using a NumPy array
If you want to preserve the 2d structure, you could use a NumPy array. Here's the whole example:
from io import StringIO  # Python 3; on Python 2 use: from StringIO import StringIO
import numpy as np
# fake stdin
stdin = StringIO("""1 2
3 4
5 6""")
a = np.loadtxt(stdin, dtype=int)  # the np.int alias was removed in NumPy 1.24
Now:
>>> a
array([[1, 2],
       [3, 4],
       [5, 6]])
Using standard lists
It is not clear from the question if a Python list is acceptable. If it is, one way to accomplish the goal is to replace the last two lines of the code in the question with the following.
a = [list(map(int, row.split())) for row in stdin]  # list() is needed on Python 3, where map returns an iterator
After running this, we have:
>>> a
[[1, 2], [3, 4], [5, 6]]
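For Python 3, where StringIO lives in the io module, the list-based approach might look like the sketch below. The helper name read_int_grid is made up for illustration; pass sys.stdin instead of the fake stream to read real input.

```python
import io

def read_int_grid(stream):
    """Parse whitespace-separated integers, one row per line, into a list of lists."""
    # Skip blank lines so that an empty line does not produce an empty row.
    return [[int(tok) for tok in line.split()] for line in stream if line.strip()]

# fake stdin; replace with sys.stdin to read real input
fake_stdin = io.StringIO("1 2\n3 4\n5 6\n")
grid = read_int_grid(fake_stdin)
print(grid)  # [[1, 2], [3, 4], [5, 6]]
```

Because any iterable of lines works, the same helper can also be given an open file object.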

I've never used array.array, so I had to do some digging around.
The answer is in the error message -
ValueError: string length not a multiple of item size
How do you determine the item size? Well it depends on the type you initialized it with. In your case you initialized it with i which is a signed int. Now, how big is an int? Ask your python interpreter..
>>> a.itemsize
4
The value above provides insight into the problem. Your string is only 11 bytes wide. 11 isn't a multiple of 4. But increasing the length of the string will not give you an array of {1,2,3,4,5,6}... I'm not sure what it would give you. Why the uncertainty? Well, read the docstring below... (It's late, so I highlighted the important part, in case you're getting sleepy, like me!)
array.fromfile(f, n)
Read n items (as machine values) from the file object f and append them to the end of the array. If less than n items are available, EOFError is raised, but the items that were available are still inserted into the array. f must be a real built-in file object; something else with a read() method won’t do.
array.fromstring reads data in the same manner as array.fromfile. Note the phrase "as machine values" in the docstring above: it means the data is read as binary. So, to do what you want to do, you need to use the struct module. Check out the code below.
import array
import struct
a = array.array('i')
binary_string = struct.pack('iiii', 1, 2, 3, 4)
a.frombytes(binary_string)  # called fromstring() on Python 2; that name was removed in Python 3.9
The code snippet above loads the array with the values 1, 2, 3, 4, as we expect.
Hope it helps.
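To see concretely why the original call fails, the sketch below (assuming Python 3, where frombytes replaces fromstring) compares the byte length of the text with the item size and then round-trips real machine values. The variable names are illustrative only.

```python
import struct
from array import array

text = b"1 2\n3 4\n5 6"   # the bytes the question tried to load
a = array('i')
# The text length is not a whole number of items, so frombytes(text) would raise ValueError.
print(len(text), a.itemsize, len(text) % a.itemsize)

# Packing integers as machine values gives a length that is a multiple of itemsize.
packed = struct.pack('4i', 1, 2, 3, 4)
a.frombytes(packed)
print(a.tolist())  # [1, 2, 3, 4]
```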

arr = []
arr = raw_input()
If you want to split the input by spaces:
arr = []
arr = raw_input().split()

Related

Appending to Numpy array produces one big array rather than an array of arrays

I want to append arrays to an array in the following way:
np.append([[1, 2, 3], [4, 5, 6]], [[7, 8, 9]], axis=0)
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
Yet, when I don't write the arrays out, but try to do something like this
DataMatrix = np.array([])
dataArray = np.array([])
with open("fakedata.txt", "r") as file:
    for line in file.readlines():
        #f_list = [float(i) for i in line.split(" ") or i in line.split(", ") if i.strip()]
        rr = re.findall("[+-]?\d*[\.]?\d*(?:(?:[eE])[+-]?\d+)?", line)
        dataArray=np.array([])
        for numbers in rr:
            if(numbers!=""):
                dataArray=np.append(dataArray,float(numbers))
        DataMatrix=np.append(DataMatrix,dataArray, axis=0)
print(DataMatrix)
it just will not work. It will produce one big array, rather than an array of arrays. Putting extra []-brackets just about anywhere did not help. Every example I find, uses explicit arrays, as shown above, rather than variables.

Here's a modest tweak to your answer code. Without a txt file I can't test it, but I think it's right :)
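import re
import numpy as np
alist = []
with open("fakedata.txt", "r") as file:
    for line in file.readlines():
        rr = re.findall(r"[+-]?\d*[\.]?\d*(?:(?:[eE])[+-]?\d+)?", line)
        # keep only the non-empty matches; they stay strings until np.array converts them
        innerlist = [numbers for numbers in rr if numbers != ""]
        alist.append(innerlist)
np.array(alist, dtype=float)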
I replaced the for loop with a list comprehension; that's mainly a syntactic cleanup. And deferred the conversion to float, so np.array can do it on all strings 'at once'.
There have been several SO posts recently about list append versus array append. Nearly everyone agrees that list append like this is the right way. Repeated array append/concatenate is inefficient, and hard to get right. np.concatenate with a list is quite useful; np.append should (IMO) be deprecated.
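The difference between the two approaches can be sketched as follows. The row values are made up; the point is that appending 1D arrays along axis 0 simply concatenates them, while collecting rows in a list and converting once keeps the 2D shape.

```python
import numpy as np

rows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # made-up sample rows

# Repeated np.append of 1D arrays produces one long 1D array.
flat = np.array([])
for row in rows:
    flat = np.append(flat, np.array(row), axis=0)
print(flat.shape)    # (6,)

# Appending to a list and converting once preserves the rows.
collected = []
for row in rows:
    collected.append(row)
matrix = np.array(collected, dtype=float)
print(matrix.shape)  # (2, 3)
```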

Assuming your file looks something like this:
1e1 1e2 -1e3
2.4e5 4.5e6 1.8e1
-1.1 -0.6 1.11
You can use np.loadtxt:
>>> import numpy as np
>>> import io
>>> matrix = """\
1e1 1e2 -1e3
2.4e5 4.5e6 1.8e1
-1.1 -0.6 1.11"""
>>> file = io.StringIO(matrix)
>>> np.loadtxt(file)
array([[ 1.00e+01,  1.00e+02, -1.00e+03],
       [ 2.40e+05,  4.50e+06,  1.80e+01],
       [-1.10e+00, -6.00e-01,  1.11e+00]])
In this case the default arguments to np.loadtxt will work, but if this isn't the exact format of your file there are various tweaks that can be made. To pass it a filename directly as in your case you can use np.loadtxt('fakedata.txt') instead.
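As one example of such a tweak, a comma-separated file with a comment line could be read as sketched below. The sample data is invented; delimiter and comments are standard np.loadtxt parameters.

```python
import io
import numpy as np

# Invented sample: comma-separated values preceded by a '#' comment line.
text = """# x,y,z
1e1,1e2,-1e3
2.4e5,4.5e6,1.8e1
"""
data = np.loadtxt(io.StringIO(text), delimiter=",", comments="#")
print(data.shape)  # (2, 3)
```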

Alright, the only way I managed to get this working is to define a normal list (DataMatrix=[], rather than DataMatrix=np.array([])), and then use np.array(DataMatrix) at the end to get it into the form I want:
DataMatrix=[]
with open("fakedata.txt", "r") as file:
    for line in file.readlines():
        rr = re.findall("[+-]?\d*[\.]?\d*(?:(?:[eE])[+-]?\d+)?", line)
        dataArray=[]
        for numbers in rr:
            if(numbers!=""):
                dataArray.append(float(numbers))
        DataMatrix.append(dataArray)
np.array(DataMatrix)
print(np.array(DataMatrix))
Considering that I'm a total programming noob, this is probably not the smartest way to do so. But well...thanks for the downvote...

save different array with different length in 1D array

I have arrays of different lengths and I want to save them inside a 1D array using Python.
A new array is generated after some tests, which is why the arrays have different sizes.
Here is a sample of what I have:
array1=[1,3,5]
array2=[10,12,13,14]
array3=[12,14,14,15,15] #etc
The desired result:
myArray=[[1,3,5],[10,12,13,14],[12,14,14,15,15]]
I tried to use this code
myArray=[]
myArray.append(array1)
myArray.append(array2) #etc
when I print myArray I get:
[[array([1,3,5])], [array([10,12,13,14])], [array([12,14,14,15,15])]]
so when I try to get the second array, for example, I have to use this code
temp = myArray[1]
result = temp[0]
this was working for me but it looks like it has a limitation and it stopped working after a while when I'm retrieving results using some loops.

The currently accepted answer makes little sense, so here's what's actually going on: array1, array2, etc. are not plain Python lists, they're almost certainly NumPy arrays. myArray, however, is just a Python list.
Here is a simple program which should allow you to reproduce and understand the difference, at least in how it relates to your program:
import numpy as np
plain_list = [1, 2, 3]
numpy_array = np.array([1, 2, 3])
result_list = [plain_list, numpy_array]
print(plain_list) # [1, 2, 3]
print(numpy_array) # [1 2 3]
print(result_list) # [[1, 2, 3], array([1, 2, 3])]
Now, it isn't exactly clear what's happening in your program, since you only write: "this was working for me but it looks like it has a limitation and it stopped working after a while when I'm retrieving results using some loops."
Depending on what the rest of the program is doing, numpy arrays may or may not be the appropriate data structure. In any case, please share the entirety of your code as well as an explanation of the program.
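If the goal is the nested plain-list form shown in the question, one option is to convert each NumPy array with tolist() before storing it. This is only a sketch with stand-in arrays, since the code that produces the real ones is not shown.

```python
import numpy as np

# Stand-ins for the arrays produced by the tests in the question.
array1 = np.array([1, 3, 5])
array2 = np.array([10, 12, 13, 14])

my_array = []
for arr in (array1, array2):
    my_array.append(arr.tolist())  # store plain lists, not ndarray objects

print(my_array)     # [[1, 3, 5], [10, 12, 13, 14]]
print(my_array[1])  # [10, 12, 13, 14]
```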

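First things first: Python has no built-in array type in the core language (the standard library's array module and NumPy provide array types separately).
Instead, lists and tuples are used.
In your case the variables array1, array2 and array3 are written as lists.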
array1=[1,3,5]
array2=[10,12,13,14]
array3=[12,14,14,15,15]
# to get the desired result as myArray=[[1,3,5],[10,12,13,14],[12,14,14,15,15]]
myArray = [array1, array2, array3]
Check python documentation to know more about lists

Weird behaviour initializing a numpy array of string data

I am having some seemingly trivial trouble with numpy when the array contains string data. I have the following code:
my_array = numpy.empty([1, 2], dtype = str)
my_array[0, 0] = "Cat"
my_array[0, 1] = "Apple"
Now, when I print it with print my_array[0, :], the response I get is ['C', 'A'], which is clearly not the expected output of Cat and Apple. Why is that, and how can I get the right output?
Thanks!

NumPy requires string arrays to have a fixed maximum length. When you create an empty array with dtype=str, it sets this maximum length to 1 by default. You can see this by inspecting my_array.dtype; it will show "|S1" (or "<U1" on Python 3), meaning "one-character string". Subsequent assignments into the array are truncated to fit this structure.
You can pass an explicit datatype with your maximum length by doing, e.g.:
my_array = numpy.empty([1, 2], dtype="S10")
The "S10" will create an array of length-10 strings. You have to decide how big will be big enough to hold all the data you want to hold.
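The truncation is easy to reproduce. The sketch below assumes Python 3, where dtype=str yields a Unicode dtype rather than "|S1"; the width of 10 is an arbitrary choice.

```python
import numpy as np

narrow = np.empty([1, 2], dtype=str)   # one-character strings
narrow[0, 0] = "Cat"
narrow[0, 1] = "Apple"
print(narrow.dtype, narrow[0, :])      # values are cut to 'C' and 'A'

wide = np.empty([1, 2], dtype="U10")   # room for up to 10 characters
wide[0, 0] = "Cat"
wide[0, 1] = "Apple"
print(wide[0, :])                      # full strings are kept
```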

I got a "codec error" when I tried to use a non-ascii character with dtype="S10"
You also get an array with binary strings, which confused me.
I think it is better to use:
my_array = numpy.empty([1, 2], dtype="<U10")
Here 'U10' translates to "Unicode string of length 10; little endian format"

The numpy string array is limited by its fixed length (length 1 by default). If you're unsure what length you'll need for your strings in advance, you can use dtype=object and get arbitrary length strings for your data elements:
my_array = numpy.empty([1, 2], dtype=object)
I understand there may be efficiency drawbacks to this approach, but I don't have a good reference to support that.

For anyone who's new here, there is another way to do this that needs only a little extra work:
my_array = np.full([1, 2], "", dtype=object)
Use np.full instead of np.empty, and fill the array with an empty string (the dtype is object; the np.object alias was removed in NumPy 1.24, so the builtin object is used here).

Another alternative is to initialize as follows:
my_array = np.array([["CAT","APPLE"],['','']], dtype=str)
In other words, first you write a regular nested list with what you want, then you turn it into a numpy array. However, this will fix your max string length to the length of the longest string at initialization. So if you were to add
my_array[1,0] = 'PINEAPPLE'
then the string stored would be 'PINEA'.

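If you are filling the array in a for loop, it usually works best to build a plain list first, for example with a list comprehension, and convert it once at the end, so that NumPy can size the string dtype to fit the longest element.
data = ['CAT', 'APPLE', 'CARROT']
my_array = np.array([name for name in data])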

How do I declare an array in Python?

How do I declare an array in Python?

variable = []
Now variable refers to an empty list*.
Of course this is an assignment, not a declaration. There's no way to say in Python "this variable should never refer to anything other than a list", since Python is dynamically typed.
*The default built-in Python type is called a list, not an array. It is an ordered container of arbitrary length that can hold a heterogeneous collection of objects (their types do not matter and can be freely mixed). This should not be confused with the array module, which offers a type closer to the C array type; the contents must be homogeneous (all of the same type), but the length is still dynamic.
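The contrast between the two types can be sketched in a few lines. The sample values are arbitrary.

```python
from array import array

mixed = [1, "eels", 2.5]       # a list accepts any mix of types
mixed.append(None)

ints = array('i', [1, 2, 3])   # an array stores one machine type only
ints.append(4)                 # its length is still dynamic
try:
    ints.append("eels")        # a value of the wrong type is rejected
except TypeError as exc:
    print("rejected:", exc)

print(mixed, ints.tolist())
```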

This is a surprisingly complex topic in Python.
Practical answer
Arrays are represented by the class list (see the reference, and do not confuse lists with generators).
Check out usage examples:
# empty array
arr = []
# init with values (can contain mixed types)
arr = [1, "eels"]
# get item by index (can be negative to access end of array)
arr = [1, 2, 3, 4, 5, 6]
arr[0] # 1
arr[-1] # 6
# get length
length = len(arr)
# supports append and insert
arr.append(8)
arr.insert(6, 7)
Theoretical answer
Under the hood, Python's list is a wrapper around a real array that contains references to the items. The underlying array is also created with some extra space.
Consequences of this are:
random access is really cheap (arr[6653] costs the same as arr[0])
the append operation is 'for free' while some extra space remains
the insert operation is expensive
Check this awesome table of operations complexity.
[Picture: the most important differences between an array, an array of references and a linked list]
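The extra space can be observed indirectly on CPython with sys.getsizeof. This is a rough sketch: the exact sizes are an implementation detail and differ between versions, so only the overall pattern matters.

```python
import sys

arr = []
sizes = []
for i in range(64):
    arr.append(i)
    sizes.append(sys.getsizeof(arr))  # size of the list object in bytes

# The size changes only occasionally, because spare slots are allocated ahead of time.
distinct_sizes = len(set(sizes))
print(distinct_sizes, "distinct sizes over", len(arr), "appends")
```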

You don't actually declare things, but this is how you create an array in Python:
from array import array
intarray = array('i')
For more info see the array module: http://docs.python.org/library/array.html
Now, possibly you don't want an array but a list, but others have answered that already. :)

I think you want a list with the first 30 cells already filled.
So
f = []
for i in range(30):
    f.append(0)
An example of where this could be used is the Fibonacci sequence.
See problem 2 in Project Euler

This is how:
my_array = [1, 'rebecca', 'allard', 15]

For calculations, use numpy arrays like this:
import numpy as np
a = np.ones((3,2)) # a 2D array with 3 rows, 2 columns, filled with ones
b = np.array([1,2,3]) # a 1D array initialised using a list [1,2,3]
c = np.linspace(2,3,100) # an array with 100 points between (and including) 2 and 3
print(a*1.5) # all elements of a times 1.5
print(a.T+b) # b added to the transpose of a
these numpy arrays can be saved and loaded from disk (even compressed) and complex calculations with large amounts of elements are C-like fast.
Much used in scientific environments. See here for more.

JohnMachin's comment should be the real answer.
All the other answers are just workarounds in my opinion!
So:
array=[0]*element_count
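One caveat when extending this idiom to two dimensions: multiplying a list of lists repeats references to the same inner list. A short sketch, with sizes chosen arbitrarily:

```python
element_count = 3
flat = [0] * element_count     # three independent zeros
flat[0] = 9
print(flat)                    # [9, 0, 0]

shared = [[0] * 2] * 2         # both rows are the SAME inner list
shared[0][0] = 9
print(shared)                  # [[9, 0], [9, 0]]

independent = [[0] * 2 for _ in range(2)]  # a fresh inner list per row
independent[0][0] = 9
print(independent)             # [[9, 0], [0, 0]]
```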

A couple of contributions suggested that arrays in python are represented by lists. This is incorrect. Python has an independent implementation of array() in the standard library module array "array.array()" hence it is incorrect to confuse the two. Lists are lists in python so be careful with the nomenclature used.
list_01 = [4, 6.2, 7-2j, 'flo', 'cro']
list_01
Out[85]: [4, 6.2, (7-2j), 'flo', 'cro']
There is one very important difference between list and array.array(). While both of these objects are ordered sequences, array.array() is an ordered homogeneous sequence whereas a list is a non-homogeneous sequence.

You don't declare anything in Python. You just use it. I recommend you start out with something like http://diveintopython.net.

I would normally just do a = [1,2,3] which is actually a list but for arrays look at this formal definition

To add to Lennart's answer, an array may be created like this:
from array import array
float_array = array("f",values)
where values can take the form of a tuple, list, or np.array, but not array:
values = [1,2,3]
values = (1,2,3)
values = np.array([1,2,3],'f')
# 'i' will work here too, but if array is 'i' then values have to be int
wrong_values = array('f',[1,2,3])
# TypeError: 'array.array' object is not callable
and the output will still be the same:
print(float_array)
print(float_array[1])
print(isinstance(float_array[1],float))
# array('f', [1.0, 2.0, 3.0])
# 2.0
# True
Most methods for list work with array as well, common
ones being pop(), extend(), and append().
Judging from the answers and comments, it appears that the array
data structure isn't that popular. I like it though, the same
way as one might prefer a tuple over a list.
The array structure has stricter rules than a list or np.array, and this can
reduce errors and make debugging easier, especially when working with numerical
data.
Attempts to insert/append a float to an int array will throw a TypeError:
values = [1,2,3]
int_array = array("i",values)
int_array.append(float(1))
# or int_array.extend([float(1)])
# TypeError: integer argument expected, got float
Keeping values which are meant to be integers (e.g. list of indices) in the array
form may therefore prevent a "TypeError: list indices must be integers, not float", since arrays can be iterated over, similar to np.array and lists:
int_array = array('i',[1,2,3])
data = [11,22,33,44,55]
sample = []
for i in int_array:
    sample.append(data[i])
Annoyingly, appending an int to a float array will cause the int to become a float, without throwing an exception.
np.array retains the same data type for its entries too, but instead of giving an error it will change its data type to fit new entries (usually to double or str):
import numpy as np
numpy_int_array = np.array([1,2,3],'i')
for i in numpy_int_array:
    print(type(i))
# <class 'numpy.int32'>
numpy_int_array_2 = np.append(numpy_int_array,int(1))
# still <class 'numpy.int32'>
numpy_float_array = np.append(numpy_int_array,float(1))
# <class 'numpy.float64'> for all values
numpy_str_array = np.append(numpy_int_array,"1")
# <class 'numpy.str_'> for all values
data = [11,22,33,44,55]
sample = []
for i in numpy_int_array_2:
    sample.append(data[i])
# no problem here, but TypeError for the other two
This is true during assignment as well. If the data type is specified, np.array will, wherever possible, transform the entries to that data type:
int_numpy_array = np.array([1,2,float(3)],'i')
# 3 becomes an int
int_numpy_array_2 = np.array([1,2,3.9],'i')
# 3.9 gets truncated to 3 (same as int(3.9))
invalid_array = np.array([1,2,"string"],'i')
# ValueError: invalid literal for int() with base 10: 'string'
# Same error as int('string')
str_numpy_array = np.array([1,2,3],'str')
print(str_numpy_array)
print([type(i) for i in str_numpy_array])
# ['1' '2' '3']
# <class 'numpy.str_'>
or, in essence:
data = [1.2,3.4,5.6]
list_1 = np.array(data,'i').tolist()
list_2 = [int(i) for i in data]
print(list_1 == list_2)
# True
while array will simply give:
invalid_array = array('i',[1,2,3.9])
# TypeError: integer argument expected, got float
Because of this, it is not a good idea to use np.array for type-specific commands. The array structure is useful here. list preserves the data type of the values.
And for something I find rather pesky: the data type is specified as the first argument in array(), but (usually) the second in np.array(). :|
The relation to C is referred to here:
Python List vs. Array - when to use?
Have fun exploring!
Note: The typed and rather strict nature of array leans more towards C rather than Python, and by design Python does not have many type-specific constraints in its functions. Its unpopularity also creates a positive feedback in collaborative work, and replacing it mostly involves an additional [int(x) for x in file]. It is therefore entirely viable and reasonable to ignore the existence of array. It shouldn't hinder most of us in any way. :D

How about this...
>>> a = list(range(12))
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
>>> a[7]
7

Following on from Lennart, there's also numpy which implements homogeneous multi-dimensional arrays.

Python calls them lists. You can write a list literal with square brackets and commas:
>>> [6,28,496,8128]
[6, 28, 496, 8128]

I had an array of strings and needed an array of the same length of booleans initiated to True. This is what I did
strs = ["Hi","Bye"]
bools = [ True for s in strs ]

You can create lists and convert them into arrays, or you can create arrays using the numpy module. Below are a few examples. NumPy also makes it easier to work with multi-dimensional arrays.
import numpy as np
a = np.array([1, 2, 3, 4])
#For custom inputs
a = np.array([int(x) for x in input().split()])
You can also reshape this array into a 2X2 matrix using reshape function which takes in input as the dimensions of the matrix.
mat = a.reshape(2, 2)

# This creates a list of 5000 zeros
a = [0] * 5000
You can read and write to any element in this list with a[n] notation in the same way as you would with an array.
It does seem to have the same random access performance as an array. I cannot say how it allocates memory because it also supports a mix of different types including strings and objects if you need it to.

2D arrays in Python

What's the best way to create 2D arrays in Python?
What I want is to store values like this:
X , Y , Z
so that I access data like X[2],Y[2],Z[2] or X[n],Y[n],Z[n] where n is variable.
I don't know in the beginning how big n would be so I would like to append values at the end.

>>> a = []
>>> for i in range(3):
...     a.append([])
...     for j in range(3):
...         a[i].append(i+j)
...
>>> a
[[0, 1, 2], [1, 2, 3], [2, 3, 4]]

Depending what you're doing, you may not really have a 2-D array.
80% of the time you have a simple list of "row-like objects", which might be proper sequences.
myArray = [ ('pi',3.14159,'r',2), ('e',2.71828,'theta',.5) ]
myArray[0][1] == 3.14159
myArray[1][1] == 2.71828
More often, they're instances of a class or a dictionary or a set or something more interesting that you didn't have in your previous languages.
myArray = [ {'pi':3.1415925,'r':2}, {'e':2.71828,'theta':.5} ]
20% of the time you have a dictionary, keyed by a pair
myArray = { (2009,'aug'):(some,tuple,of,values), (2009,'sep'):(some,other,tuple) }
Rarely will you actually need a matrix.
You have a large, large number of collection classes in Python. Odds are good that you have something more interesting than a matrix.

In Python one would usually use lists for this purpose. Lists can be nested arbitrarily, thus allowing the creation of a 2D array. Not every sublist needs to be the same size, so that solves your other problem. Have a look at the examples I linked to.
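Applied to the X, Y, Z layout from the question, growing lists might look like the sketch below. The loop and its values are a made-up data source for illustration.

```python
# Start with three empty columns and grow them as values arrive.
X, Y, Z = [], [], []
for n in range(4):       # made-up data source
    X.append(n)
    Y.append(n * n)
    Z.append(-n)

print(X[2], Y[2], Z[2])  # 2 4 -2

# Alternatively, keep one list of rows; rows may differ in length.
rows = []
rows.append([1, 2, 3])
rows.append([4, 5])
print(rows[1][0])        # 4
```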

If you want to do some serious work with arrays then you should use the numpy library. This will allow you for example to do vector addition and matrix multiplication, and for large arrays it is much faster than Python lists.
However, numpy requires that the size is predefined. Of course you can also store numpy arrays in a list, like:
import numpy as np
vec_list = [np.zeros((3,)) for _ in range(10)]
vec_list.append(np.array([1,2,3]))
vec_sum = vec_list[0] + vec_list[1] # possible because we use numpy
print(vec_list[10][2]) # prints 3
But since your numpy arrays are pretty small I guess there is some overhead compared to using a tuple. It all depends on your priorities.
See also this other question, which is pretty similar (apart from the variable size).

I would suggest that you use a dictionary like so:
arr = {}
arr[1] = (1, 2, 4)
arr[18] = (3, 4, 5)
print(arr[1])
# (1, 2, 4)
If you're not sure an entry is defined in the dictionary, you'll need a validation mechanism when calling "arr[x]", e.g. try-except.
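Two common forms of that validation are sketched here: a try/except around the lookup, and dict.get with a default. The missing key 7 is chosen arbitrarily.

```python
arr = {}
arr[1] = (1, 2, 4)
arr[18] = (3, 4, 5)

# Option 1: try/except around the lookup.
try:
    value = arr[7]
except KeyError:
    value = None           # key 7 was never stored

# Option 2: dict.get returns a default instead of raising.
fallback = arr.get(7, ())

print(value, fallback, 18 in arr)  # None () True
```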

If you are concerned about memory footprint, the Python standard library contains the array module; these arrays contain elements of the same type.

Please consider the following code:
from numpy import zeros
scores = zeros((len(chain1),len(chain2)), float)

x=list()
def enter(n):
    y=list()
    for i in range(0,n):
        y.append(int(input("Enter ")))
    return y
for i in range(0,2):
    x.insert(i,enter(2))
print (x)
Here I made a function that creates a 1-D array (a list) and inserted its result into another list as a member. This gives multiple 1-D arrays inside an array; by changing the values of n and i you can create multi-dimensional arrays.
