subtracting strings in array of data python


I am trying to do the following:
1. Create an array of random data.
2. Create an array of predefined codes (AW, SS).
3. Remove (subtract) all numbers as well as any instance of a predefined code.
4. If the string "HL" remains after step 3, remove that as well and take the next letter pair. If "HL" is the ONLY string left in the array, keep it.
I do not know how to go about completing steps 3 - 4.
Step 1:
array_data = ['HL22','PG1234-332HL','1334-SF-21HL','HL43--222PG','HL222AW11144RH','HLSSDD','SSDD']
Step 2:
predefined_code = ['AW','SS']
Step 3: ideally, the results for this step will look like this:
result_data = [['HL'],['PG','HL'],['SF','HL'],['HL','PG'],['HL','RH'],['HL','DD'],['DD']]
Step 4: ideally, the results for this step will look like this:
result_data = [['HL'],['PG'],['SF'],['PG'],['RH'],['DD'],['DD']]
For step 3, I have tried the following code:
not_in_predefined = [item for item in array_data if item not in predefined_code]
But this doesn't produce the result I'm looking for, because it checks each whole item against the predefined codes rather than doing a partial string match.

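This is fairly simple using regex.
re.findall(r'[A-Z].', item) should give you the letter pairs from your strings, and then you can do the required processing on them. Note that the . matches any character, so r'[A-Z]{2}' is the stricter pattern if the codes are always two uppercase letters.
You may want to convert each result to a set eventually and use the difference operation, instead of looping and removing the elements defined in the predefined_code list. Keep in mind that a set does not preserve order, which may matter if you rely on the position of the pairs.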
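A minimal sketch of steps 3 and 4 along those lines. It assumes every code is exactly two uppercase letters, and the helper names extract_codes and drop_hl are made up for illustration. It filters with a list comprehension rather than a set so that the order of the pairs is kept.

```python
import re

array_data = ['HL22', 'PG1234-332HL', '1334-SF-21HL', 'HL43--222PG',
              'HL222AW11144RH', 'HLSSDD', 'SSDD']
predefined_code = ['AW', 'SS']


def extract_codes(item, excluded):
    """Step 3: keep the two-letter codes that are not in `excluded`."""
    # Assumption: a code is always exactly two uppercase letters.
    return [code for code in re.findall(r'[A-Z]{2}', item)
            if code not in excluded]


def drop_hl(codes):
    """Step 4: remove 'HL' unless it is the only code left."""
    remaining = [code for code in codes if code != 'HL']
    return remaining if remaining else codes


step3 = [extract_codes(item, predefined_code) for item in array_data]
step4 = [drop_hl(codes) for codes in step3]

print(step3)
# [['HL'], ['PG', 'HL'], ['SF', 'HL'], ['HL', 'PG'], ['HL', 'RH'], ['HL', 'DD'], ['DD']]
print(step4)
# [['HL'], ['PG'], ['SF'], ['PG'], ['RH'], ['DD'], ['DD']]
```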

Related

Reduce a list in a specific way

I have a list of strings which looks like this:
['(num1, num2):1', '(num3, num4):1', '(num5, num6):1', '(num7, num8):1']
What I try to achieve is to reduce this list and combine every two elements and I want to do this until there is only one big string element left.
So the intermediate list would look like this:
['((num1, num2):1,(num3, num4):1)', '((num5, num6):1,(num7, num8):1)']
The complicated thing is (as you can see in the intermediate list) that every two strings need to be wrapped in parentheses. So for the above-mentioned starting point the final result should look like this:
(((num_1,num_2):1,(num_3,num_4):1),((num_5,num_6):1,(num_7,num_8):1))
Of course this should also work in a generic way for 8, 16 or more string elements in the starting list. Or to be more precise, it should work for a_n = 2^(n+1) elements.
Just to be very specific how the result should look with 8 elements:
'((((num_1,num_2):1,(num_3,num_4):1),((num_5,num_6):1,(num_7,num_8):1)),(((num_9,num_10):1,(num_11,num_12):1),((num_13,num_14):1,(num_15,num_16):1)))'
I already solved the problem using nested for loops but I thought there should be a more functional or short-cut solution.
I also found this solution on stackoverflow:
import itertools as it
l = [map( ",".join ,list(it.combinations(my_list, l))) for l in range(1,len(my_list)+1)]
Although the join isn't bad, I still need the parentheses. I tried to use:
"{},{}".format
instead of .join, but this seems to be too easy to work :).
I also thought about using reduce, but apparently this is not the right function. Maybe one can implement a custom reduce function or something similar?
I hope some advanced Pythonistas can help me.

Sounds like a job for the zip clustering idiom: zip(*[iter(x)]*n) where you want to break iterable x into size n chunks. This will discard "leftover" elements that don't make up a full chunk. For x=[1, 2, 3], n=2 this would yield (1, 2)
def reducer(l):
    while len(l) > 1:
        l = ['({},{})'.format(x, y) for x, y in zip(*[iter(l)]*2)]
    return l
reducer(['(num1, num2):1', '(num3, num4):1', '(num5, num6):1', '(num7, num8):1'])
# ['(((num1, num2):1,(num3, num4):1),((num5, num6):1,(num7, num8):1))']
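To make the "leftover" behaviour concrete, here is a small sketch in Python 3. The helper names pairs and pairs_keep_leftover are invented for this example. itertools.zip_longest is shown only as one possible way to keep a trailing element, not as something the answer above relies on.

```python
from itertools import zip_longest


def pairs(items):
    """Group items two at a time; a trailing odd item is dropped."""
    return list(zip(*[iter(items)] * 2))


def pairs_keep_leftover(items, fill=None):
    """Same grouping, but pad the last group instead of dropping it."""
    return list(zip_longest(*[iter(items)] * 2, fillvalue=fill))


print(pairs([1, 2, 3]))                # [(1, 2)]
print(pairs([1, 2, 3, 4]))             # [(1, 2), (3, 4)]
print(pairs_keep_leftover([1, 2, 3]))  # [(1, 2), (3, None)]
```

For input lengths that are powers of two, as in the question, both helpers give the same result, so the plain zip version is enough there.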

This is an explanation of what is happening in zip(*[iter(l)]*2).
[iter(l)]*2 creates a list of length 2 that holds the same iterator twice, or to be more precise, two references to the same iterator object.
zip(*...) does the extracting. Because both references share one iterator, each pass of the loop advances it twice:
Pass 1: the first element comes from the first reference, the second element from the second reference.
Pass 2: the third element comes from the first reference, the fourth element from the second reference.
Pass 3: the fifth element comes from the first reference, the sixth element from the second reference.
And so on.
Therefore the extracted elements are available in pairs in the for loop, and we can use them as x and y for further processing.
This is really handy.
I also want to point to this thread since it helped me to understand the concept.
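The shared-iterator behaviour can be checked by hand with next(). This is only an illustrative sketch in Python 3; the variable names are arbitrary.

```python
letters = ['a', 'b', 'c', 'd']

shared = iter(letters)
refs = [shared] * 2            # two references to ONE iterator object
print(refs[0] is refs[1])      # True

# Each next() call advances the same underlying iterator.
first_pair = (next(refs[0]), next(refs[1]))
second_pair = (next(refs[0]), next(refs[1]))
print(first_pair, second_pair)  # ('a', 'b') ('c', 'd')

# With two independent iterators, zip pairs every element with itself.
independent = list(zip(iter(letters), iter(letters)))
print(independent)  # [('a', 'a'), ('b', 'b'), ('c', 'c'), ('d', 'd')]
```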

how to create a list containing of 100 number of strings whose names are in series

I have a list of string objects like this:
nodes=["#A_CN1","#A_CN2","#A_CN3","#A_CN4","#A_CN5","#A_CN6","#A_CN7","#A_CN8","#A_CN9","#A_CN10"]
The list above has 10 elements, but I need around 100 elements, with the last one being #A_CN100.
Is there any way to express this concisely in Python rather than writing it out 100 times?
Also, suppose there is a list of 100 elements where each element is itself a list (node1, node2 and so on are all lists):
nodes=[node1,node2,node3,node4,node5,node6....node100]
If I express this as
nodes=[node{0}.format(i) for i in range(1,101)]
it throws an error. How can I fix this?

A one liner with list comprehensions
nodes = ["#A_CN{0}".format(i) for i in range(1,101)]
There is also a suggestion in the comments that a generator version be demonstrated. It would look like this:
nodes = ("#A_CN{0}".format(i) for i in range(1,101))
But more commonly this is passed to list
nodes = list("#A_CN{0}".format(i) for i in range(1,101))
So we end up with the same result as the list comprehension. However, the bare generator form is useful if you want to generate about a million items, because it produces them one at a time instead of building the whole list in memory.
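As a rough illustration of that laziness, the sketch below (Python 3) builds a generator for a million names but only consumes the first three. The function name node_names is invented for this example.

```python
from itertools import islice


def node_names(count):
    """Return a generator of '#A_CN1' ... '#A_CN<count>' (built lazily)."""
    return ("#A_CN{0}".format(i) for i in range(1, count + 1))


names = node_names(1000000)

# Nothing has been formatted yet; islice pulls just three items.
first_three = list(islice(names, 3))
print(first_three)   # ['#A_CN1', '#A_CN2', '#A_CN3']

# The generator remembers its position, so the next item is number 4.
print(next(names))   # #A_CN4
```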

You omitted quotes (or apostrophes). Instead of
nodes=[node{0}.format(i) for i in range(1,101)]
use
nodes=["node{0}".format(i) for i in range(1,101)]
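Note that the quoted version produces the strings "node1" to "node100", not the list objects themselves. If the aim is to collect 100 lists, one common approach is to store them in a dict (or directly in a list) from the start instead of creating 100 separate variables. The sketch below assumes that is acceptable; nodes_by_name and the sample data are hypothetical.

```python
# Hypothetical setup: each node is itself a list, stored under its name.
nodes_by_name = {"node{0}".format(i): [] for i in range(1, 101)}

# Fill one node with sample data to show the lists are real objects.
nodes_by_name["node1"].append("example")

# Build the list of node lists, in order node1 ... node100.
nodes = [nodes_by_name["node{0}".format(i)] for i in range(1, 101)]

print(len(nodes))   # 100
print(nodes[0])     # ['example']
```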

How can I make a list that contains specific values from another list?

I am new to python and I need to take a list populated with numerical values and create a sublist containing specific values from that list.
The original list contains 16,419 individual numerical values. I need to make multiple lists from this original that contain 4500 values each.
Sample code (pretend the length of 'Original_List' is 16419):
Original_List = [46325, 54326, 32666, 32453, 54325, 32542, 38573]
First_4500_Numbers = []
First_4500_Numbers.append(List_of_16419_Indivudual_Numbers[0:4499])
The above code is creating a list that looks like this:
print(First_4500_Numbers)
[[46325, 54326, 32666, 32453, 54325, 32542, 38573]]
How can I get rid of this extra bracket on the outside of the list? It is causing downstream issues.
Thank you for any help!

List_of_16419_Indivudual_Numbers[0:4499]
returns a list. You don't need to append it to another one. Try just this:
Original_List = [46325, 54326, 32666, 32453, 54325, 32542, 38573]
First_4500_Numbers = Original_List[0:4500] # the stop index is exclusive, so 4500 gives the first 4500 values
Then output will look like
>>> print(First_4500_Numbers)
[46325, 54326, 32666, 32453, 54325, 32542, 38573]
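Since the question asks for several lists of 4500 values each, here is a hedged sketch of one way to split the whole list. The helper split_into_chunks and the stand-in data are assumptions for illustration. It also shows why append produced the extra bracket, and that a slice's stop index is exclusive.

```python
def split_into_chunks(values, size):
    """Return consecutive slices of `values`, each at most `size` long."""
    return [values[start:start + size]
            for start in range(0, len(values), size)]


# Stand-in data with the same length as in the question.
original_list = list(range(16419))

chunks = split_into_chunks(original_list, 4500)
print([len(chunk) for chunk in chunks])   # [4500, 4500, 4500, 2919]

# The stop index of a slice is exclusive.
print(len(original_list[0:4499]))         # 4499

# append adds the slice as ONE element; extend adds its items.
nested = []
nested.append(original_list[:3])
flat = []
flat.extend(original_list[:3])
print(nested)   # [[0, 1, 2]]
print(flat)     # [0, 1, 2]
```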

Introductory Python task from the edX MIT class

I have recently started learning Python in the MIT class on edX.
However, I have been having some trouble with certain exercises. Here is one of them:
"Write a procedure called oddTuples, which takes a tuple as input, and returns a new tuple as output, where every other element of the input tuple is copied, starting with the first one. So if test is the tuple ('I', 'am', 'a', 'test', 'tuple'), then evaluating oddTuples on this input would return the tuple ('I', 'a', 'tuple'). "
The correct code, according to the lecture, is the following:
def oddTuples(aTup):
    '''
    aTup: a tuple
    returns: tuple, every other element of aTup.
    '''
    # a placeholder to gather our response
    rTup = ()
    index = 0
    # Idea: Iterate over the elements in aTup, counting by 2
    # (every other element) and adding that element to
    # the result
    while index < len(aTup):
        rTup += (aTup[index],)
        index += 2
    return rTup
However, I have tried to solve it myself in a different way with the following code:
def oddTuples(aTup):
    '''
    aTup: a tuple
    returns: tuple, every other element of aTup.
    '''
    # Your Code Here
    bTup=()
    i=0
    for i in (0,len(aTup)-1):
        if i%2==0:
            bTup=bTup+(aTup[i],)
            print(bTup)
            print(i)
        i+=1
    return bTup
However, my solution does not work and I am unable to understand why (I think it should do essentially the same thing as the code the tutors provide).

I'd just like to add that the Pythonic solution for this problem uses a slice with a step:
newTuple = oldTuple[::2]
oldTuple[::2] means: get a copy of oldTuple from the start (value omitted) to the end (value omitted) with a step of 2.
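A quick check of the slice on the tuple from the exercise (Python 3; odd_tuples is just a snake_case stand-in for the exercise's function name):

```python
def odd_tuples(a_tup):
    """Return every other element of a_tup, starting with the first."""
    return a_tup[::2]


test = ('I', 'am', 'a', 'test', 'tuple')

print(odd_tuples(test))   # ('I', 'a', 'tuple')
print(test[1::2])         # ('am', 'test'), starting from index 1 instead
print(odd_tuples(()))     # (), an empty tuple is handled too
```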

I think I see the problem here.
In your for loop you specify two fixed values for i:
0
len(aTup)-1
What you really want is the range of values from 0 to len(aTup)-1:
0
1
2
...
len(aTup)-1
To turn start and end values into every value in between, use Python's range function. Its stop argument is exclusive, so pass len(aTup) to reach the last index:
for i in range(0,len(aTup)):
(Actually, if you take a look at range's documentation, you will find there is a third parameter called step. If you use it, the even-index check in your function becomes unnecessary :))
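range accepts an optional third argument, step. A minimal sketch of the function written that way (Python 3, with odd_tuples as an illustrative name):

```python
def odd_tuples(a_tup):
    """Collect the elements at indexes 0, 2, 4, ... of a_tup."""
    result = ()
    # range(start, stop, step): stop is exclusive, step=2 skips odd indexes.
    for i in range(0, len(a_tup), 2):
        result += (a_tup[i],)
    return result


print(list(range(0, 5, 2)))                            # [0, 2, 4]
print(odd_tuples(('I', 'am', 'a', 'test', 'tuple')))   # ('I', 'a', 'tuple')
```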

Your code should read:
for i in range(0,len(aTup)):
# i=0, 1, 2 ..., len(aTup)-1.
rather than
for i in (0,len(aTup)-1):
# i=0 or i=len(aTup)-1.

The lines for i in (0,len(aTup)-1): and i+=1 aren't quite doing what you want. As in other answers, you probably want for i in range(0,len(aTup)): (insert range, and note that the stop value is exclusive), but you also want to remove i+=1, since the for-in construct sets the value of i to each of the items in the iterable in turn.
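Both points can be seen in a short experiment (Python 3; the variable names are arbitrary):

```python
a_tup = ('I', 'am', 'a', 'test', 'tuple')

# Iterating over a tuple literal visits only the values written in it.
visited_tuple = [i for i in (0, len(a_tup) - 1)]
print(visited_tuple)    # [0, 4]

# range produces every index.
visited_range = list(range(len(a_tup)))
print(visited_range)    # [0, 1, 2, 3, 4]

# Changing i inside the body does not affect the next iteration:
# the for statement rebinds i to the next item each time around.
seen = []
for i in range(3):
    seen.append(i)
    i += 10
print(seen)             # [0, 1, 2]
```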

When running your code, the result is the following:
('I', 'tuple')
The problem in the code you wrote is the way the for loop is set up.
Instead of using:
for i in (0,len(aTup)-1):
you should change it to the following, and your code will work:
for i in range(len(aTup)):
The range function produces the integers from 0 up to the length of your tuple minus 1.
So after editing, your code should look like this:
def oddTuples(aTup):
    bTup=()
    for i in range(len(aTup)):
        if i%2==0:
            bTup=bTup+(aTup[i],)
    return bTup

Python: removing specific lines from an object

I have a bit of a weird question here.
I am using iperf to test performance between a device and a server. I get the results of this test over SSH, which I then want to parse into values using a parser that has already been written. However, there are several lines at the top of the results (which I read into a list of lines) that I don't want to go into the parser. I know exactly how many lines I need to remove from the top each time, though. Is there any way to drop specific entries from a list? Something like this in pseudo-Python:
print list
["line1","line2","line3","line4"]
list = list.drop([0 - 1])
print list
["line3","line4"]
If anyone knows anything I could use, I would really appreciate you helping me out. The only thing I can think of is writing a loop to iterate through the list and build a new one containing only what I need. Anyway, thanks!
Michael

Slices:
l = ["line1","line2","line3","line4"]
print l[2:] # print from index 2 (inclusive) onwards
["line3","line4"]
Slice syntax is [from(included):to(excluded):step]. Each part is optional, so you can write [:] to get a copy of the whole list (or of any sequence for that matter; strings and tuples are examples from the built-ins). You can also use negative indexes, so [:-2] means from the beginning up to, but not including, the second-to-last element. You can also step backwards: [::-1] means get everything, but in reversed order.
Also, don't use list as a variable name. It shadows the built-in list class.
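The slice forms described above, written out as a small Python 3 sketch (the variable name lines is chosen for illustration):

```python
lines = ["line1", "line2", "line3", "line4"]

print(lines[2:])     # ['line3', 'line4'], from index 2 to the end
print(lines[:-2])    # ['line1', 'line2'], everything except the last two
print(lines[::-1])   # ['line4', 'line3', 'line2', 'line1'], reversed
print(lines[::2])    # ['line1', 'line3'], every other element

# [:] makes a shallow copy; the original list is a different object.
copy_of_lines = lines[:]
print(copy_of_lines == lines, copy_of_lines is lines)   # True False
```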

This is what the slice operator is for:
>>> before = [1,2,3,4]
>>> after = before[2:]
>>> print after
[3, 4]
In this instance, before[2:] says 'give me the elements of the list before, starting at element 2 and all the way until the end.'
(Also, don't use built-in names like list or dict as variable names. They are not reserved words, so Python will let you, but shadowing them can lead to confusing bugs.)

You can use slices for that:
>>> l = ["line1","line2","line3","line4"] # don't use "list" as variable name, it's a built-in.
>>> print l[2:] # to discard items up to some point, specify a starting index and no stop point.
['line3', 'line4']
>>> print l[:1] + l[3:] # to drop items "in the middle", join two slices.
['line1', 'line4']

Why not use a basic list slice? Something like:
lines = lines[3:] # everything from index 3 to the end

You want del for that:
del lines[:2]

You can use the "del" statement to remove specific entries:
del lines[0] # remove entry 0
del lines[0:2] # remove entries 0 and 1
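One practical difference between the two approaches: a slice builds a new list, while del changes the existing list in place, so every other reference to that list sees the change. A short Python 3 sketch with made-up variable names:

```python
lines = ["line1", "line2", "line3", "line4"]
alias = lines            # second reference to the same list object

# Slicing returns a new list and leaves the original untouched.
trimmed = lines[2:]
print(trimmed)           # ['line3', 'line4']
print(lines)             # ['line1', 'line2', 'line3', 'line4']

# del removes the entries from the existing list itself.
del lines[:2]
print(lines)             # ['line3', 'line4']
print(alias)             # ['line3', 'line4'], the alias sees the change
```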
