Parsing indeterminate amount of data into a python tuple - python

I have a config file that contains a list of strings. I need to read these strings in order and store them in memory and I'm going to be iterating over them many times when certain events take place. Since once they're read from the file I don't need to add or modify the list, a tuple seems like the most appropriate data structure.
However, I'm a little confused on the best way to first construct the tuple since it's immutable. Should I parse them into a list then put them in a tuple? Is that wasteful? Is there a way to get them into a tuple first without the overhead of copying/destroying the tuple every time I add a new element.

As you said, you're going to read the data gradually - so a tuple isn't a good idea after all, as it's immutable.
Is there a reason for not using a simple list for holding the strings?

Since your data is changing, I am not sure you need a tuple. A list should do fine.
Look at the following which should provide you further information. Assigning a tuple is much faster than assigning a list. But if you are trying to modify elements every now and then then creating a tuple may not make more sense.
Are tuples more efficient than lists in Python?

I wouldn't worry about the overhead of first creating a list and then a tuple from that list. My guess is that the overhead will turn out to be negligible if you measure it.
On the other hand, I would stick with the list and iterate over that instead of creating a tuple. Tuples should be used for struct like data and list for lists of data, which is what your data sounds like to me.

with open("config") as infile:
config = tuple(infile)

You may want to try using chained generators to create your tuple. You can use the generators to perform multiple filtering and transformation operations on your input without creating intermediate lists. All of the generator processing is delayed until iteration. In the example below the processing/iteration all happens on the last line.
Like so:
f = open('settings.cfg')
step1 = (tuple(i.strip() for i in l.split(':', 1)) for l in f if len(l) > 2 and ':' in l)
step2 = ((l[0], ',' in l[1] and 'Tag' in l[0] and l[1].split(',') or l[1]) for l in step1)
t = tuple(step2)

Related

Fastest way to calculate new list from empty list in python

I want to perform calculations on a list and assign this to a second list, but I want to do this in the most efficient way possible as I'll be using a lot of data. What is the best way to do this? My current version uses append:
f=time_series_data
output=[]
for i, f in enumerate(time_series_data):
if f > x:
output.append(calculation with f)
etc etc
should I use append or declare the output list as a list of zeros at the beginning?
Appending the values is not slower compared to other ways possible to accomplish this.
The code looks fine and creating a list of zeroes would not help any further. Although it can create problems as you might not know how many values will pass the condition f > x.
Since you wrote etc etc I am not sure how long or what operations you need to do there. If possible try using list comprehension. That would be a little faster.
You can have a look at below article which compared the speed for list creation using 3 methods, viz, list comprehension, append, pre-initialization.
https://levelup.gitconnected.com/faster-lists-in-python-4c4287502f0a

Pass list of lists of list to function

I have values in a list of lists.
I would like to send the whole block to a conversion function which then returns all the converted values in the same structure.
my_list = [sensor1...sensor4] = [hum1...hum3] = [value1, value2, value3, value4]
So several nested lists
def conversion(my_list): dictionaries
for sensor in my_list:
for hum in sensor:
for value in hum:
map(function, value)
Is there a way to do a list comprehension as a one liner? I'm not sure how to use the map function in comprehensions especially when you have several nested iterations.
map(function, value)
Since you are just mapping a function on each value, without collecting the return value in a list, using a list comprehension is not a good idea. You could do it, but you would be collecting list items that have no value, for the sole purpose of throwing them away later—just so you can save a few lines that actually serve a much better purpose: Clearly telling what’s going on, without being in a single, long, and complicated line.
So my advice would be to keep it as it is. It makes more sense like that and clearly shows what’s going on.
I am however collecting the values. They all need to be converted and saved in the same structure as they were.
In that case, you still don’t want a list comprehension as that would mean that you created a new list (for no real reason). Instead, just update the most-inner list. To do that, you need to change the way you’re iterating though:
for sensor in my_list:
for hum in sensor:
for i, value in enumerate(hum):
hum[i] = map(function, value)
This will update the inner list.
Alternatively, since value is actually a list of values, you can also replace the value list’s contents using the slicing syntax:
for sensor in my_list:
for hum in sensor:
for value in hum:
value[:] = map(function, value)
Also one final note: If you are using Python 3, remember that map returns a generator, so you need to convert it to a list first using list(map(function, value)); or use a list comprehension for that part with [function(v) for v in value].
This is the right way to do it. You can use list comprehension to do that, but you shouldn't for code readability and because it's probably not faster.

Formatting a variable with another variable in Python

Let's suppose I have a for loop. After each loop, I get a result that I want to store in a list.
I want to write something like this but it's not working:
for i in range(50):
"result_%d" %i = result
Supposing that result is a list containing the results after each loop.
I want to do that in order to have a different list for each result, so I could be able to use them after the loop is finished.
Is there any way to do that?
Note: I thought about storing all the result lists in one big list. But won't that be heavy for the code? Noting that each result list has a size of 60.
I thought about storing all the result arrays in one big array. But won't that be heavy for the code?
In Python, everything is an object and names and list contents are just references to them. Creating new list objects containing references to existing values is quite lightweight really.
Don't try to create dynamic variables, just store your results in another list or a dictionary.
In this case that's as easy as:
results = []
for i in range(50):
# do things
results.append(result)
and results is then a list with 50 references to other list objects. That's no different from having 50 names referencing those 50 list objects, other than that it is much easier to address them now.

A list of lists of lists in Python

I'm working with the Flask framework in Python, and need to hand off a list of lists to a renderer.
I step through a loop and create a list, sort it, append it to another list, then call the render function with the masterlist, like so:
for itemID in itemsArray:
avgQuantity = getJitaQuantity(itemID)
lowestJitaSell = getJitaLowest(itemID)
candidateArray = findLowestPrices(itemID, lowestJitaSell, candidateArray, avgQuantity)
candidateArray.sort()
multiCandidateArray.append(candidateArray)
renderPage(multiCandidateArray)
My problem is that I need to clear the candidateArray and create a new one each time through the loop, but it looks like the candidateArray that I append to the multiCandidateArray is actually a pointer, not the values themselves.
When I do this:
for itemID in itemsArray:
avgQuantity = getJitaQuantity(itemID)
lowestJitaSell = getJitaLowest(itemID)
candidateArray = findLowestPrices(itemID, lowestJitaSell, candidateArray, avgQuantity)
candidateArray.sort()
multiCandidateArray.append(candidateArray)
**del candidateArray[:]**
renderPage(multiCandidateArray)
I end up with no values.
Is there a way to handle this situation that I'm missing?
I would probably go with something like:
for itemID in itemsArray:
avgQuantity = getJitaQuantity(itemID)
lowestJitaSell = getJitaLowest(itemID)
candidateArray = findLowestPrices(itemID, lowestJitaSell, candidateArray, avgQuantity)
multiCandidateArray.append(sorted(candidateArray))
No need to del anything here, and sorted returns a new list, so even if FindLowestPrices is for some reason returning references to the same list (which is unlikely), then you'll still have unique lists in the multiCandidateArray (although your unique lists could hold references to the same objects).
Your code already creates a new one each time through the loop.
candidateArray = findLowestPrices(...)
This assigns a new list to the variable, candidateArray. It should work fine.
When you do this:
del candidateArray[:]
...you're deleting the contents of the same list you just appended to the master list.
Don't think about pointers or variables; just think about objects, and remember nothing in Python is ever implicitly copied. A list is an object. At the end of the loop, candidateArray names the same list object as multiCandidateArray[-1]. They're different names for the same thing. On the next run through the loop, candidateArray becomes a name for a new list as produced by findLowestPrices, and the list at the end of the master list is unaffected.
I've written about this before; the C way of thinking about variables as being predetermined blocks of memory just doesn't apply to Python at all. Names are moved onto values, rather than values being copied into some fixed number of buckets.
(Also, nitpicking, but Python code generally uses under_scores and doesn't bother with types in names unless it's really ambiguous. So you might have candidates and multi_candidates. Definitely don't call anything an "array", since there's an array module in the standard library that does something different and generally not too useful. :))

How to grow a list to fit a given capacity in Python

I'm a Python newbie. I have a series of objects that need to be inserted at specific indices of a list, but they come out of order, so I can't just append them. How can I grow the list whenever necessary to avoid IndexErrors?
def set(index, item):
if len(nodes) <= index:
# Grow list to index+1
nodes[index] = item
I know you can create a list with an initial capacity via nodes = (index+1) * [None] but what's the usual way to grow it in place? The following doesn't seem efficient:
for _ in xrange(len(nodes), index+1):
nodes.append(None)
In addition, I suppose there's probably a class in the Standard Library that I should be using instead of built-in lists?
This is the best way to of doing it.
>>> lst.extend([None]*additional_size)
oops seems like I misunderstood your question at first. If you are asking how to expand the length of a list so you can insert something at an index larger than the current length of the list, then lst.extend([None]*(new_size - len(lst)) would probably be the way to go, as others have suggested. Of course, if you know in advance what the maximum index you will be needing is, it would make sense to create the list in advance and fill it with Nones.
For reference, I leave the original text: to insert something in the middle of the existing list, the usual way is not to worry about growing the list yourself. List objects come with an insert method that will let you insert an object at any point in the list. So instead of your set function, just use
lst.insert(item, index)
or you could do
lst[index:index] = item
which does the same thing. Python will take care of resizing the list for you.
There is not necessarily any class in the standard library that you should be using instead of list, especially if you need this sort of random-access insertion. However, there are some classes in the collections module which you should be aware of, since they can be useful for other situations (e.g. if you're always appending to one end of the list, and you don't know in advance how many items you need, deque would be appropriate).
Perhaps something like:
lst += [None] * additional_size
(you shouldn't call your list variable list, since it is also the name of the list constructor).

Categories