How to extract values from complex list forms

I have a list:
result_list=['hello(1,2)','bye(3,4)']
How can I get each part as a separate string, like this?
>>hello
>>1
>>2

I'm not sure I fully understand your question, but I'll try to help.
I think you want two things from each item:
1- the word in front of the parentheses
2- the numbers inside the parentheses
I think this code will help:
for i in result_list:
    print(i.split("(")[0])
    print(i.split("(")[-1].split(")")[0].split(",")[0])
    print(i.split("(")[-1].split(")")[0].split(",")[1])
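The chained split() calls above can also be expressed with a regular expression. The following is a minimal sketch, not the answerer's method; it assumes every item has the shape name(value,value,...), and the names ITEM_PATTERN and parse_item are made up for illustration.

```python
import re

# Pattern for items shaped like 'name(a,b,...)': a name, then a
# parenthesised, comma-separated list of values.
ITEM_PATTERN = re.compile(r"(\w+)\((.*)\)")


def parse_item(item):
    """Split 'hello(1,2)' into ('hello', ['1', '2'])."""
    match = ITEM_PATTERN.fullmatch(item)
    if match is None:
        raise ValueError(f"unexpected format: {item!r}")
    name, args = match.groups()
    values = [part.strip() for part in args.split(",")]
    return name, values


result_list = ['hello(1,2)', 'bye(3,4)']
for item in result_list:
    name, values = parse_item(item)
    print(name)
    for value in values:
        print(value)
```

Unlike the fixed [0] and [1] indexing, this version prints however many values appear between the parentheses, and it raises an error for items that do not fit the expected shape.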

Related

How to iterate over a list while applying a regex to it in Python?

I'm trying to run this block of code.
castResultType = list(filterResultSet[1:32:5])
cleanList = []
for i in castResultType:
    cleanList.append(re.sub('[^\d\.]', '',castResultType[i]))
print(cleanList)
castResultType holds specific values taken from the list filterResultSet. I'm hoping to insert each item of castResultType into the empty list cleanList, with the regex above applied to each value before it is inserted. I'm sure I'm doing something wrong, and would appreciate any assistance. I'm also very new to Python and programming in general, so I apologize if I'm asking something stupid.

The problem you're (probably) facing is in this line: for i in castResultType.
This line loops over the values in the list castResultType and assigns each element to i. So when you then call castResultType[i], you are using an element of the list (presumably a string) as an index, which is not a valid list index and raises an error.
What you probably meant to do is:
for i in range(len(castResultType)):
    cleanList.append(re.sub(r'[^\d.]', '', castResultType[i]))
# etc
What you could also do is:
for i in castResultType:
    cleanList.append(re.sub(r'[^\d.]', '', i))
# etc
There are many other ways to write this, but these are probably the most familiar for a beginner.
In addition to the bug in your loop, there could also be a problem with your regular expression. I recommend trying it on a single element from castResultType before looping through the whole list.

You can use the built-in map function to map a function to items in an iterable:
clean_list = list(map(lambda elem: re.sub(r'[^\d.]', '', elem), castResultType))
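The same cleaning step can also be written as a list comprehension with a precompiled pattern. This is only a sketch: the sample values are invented to stand in for castResultType, and NON_NUMERIC is a name chosen for illustration.

```python
import re

# Anything that is not a digit or a dot gets removed.
NON_NUMERIC = re.compile(r"[^\d.]")

# Hypothetical sample values standing in for castResultType.
cast_result_type = ["$1,234.50", "42 units", "abc", "7.5%"]

clean_list = [NON_NUMERIC.sub("", value) for value in cast_result_type]
print(clean_list)
```

Note that a value with no digits at all becomes an empty string, so it may be worth filtering those out before converting anything to a number.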

Data inside a JSON has letters and numbers I do not need, how to get data I need in Python

I am trying to extract data from a JSON file, but the values I need have extra letters and numbers before, and sometimes after, the part I want. Is it possible to remove the characters I do not need? Here is an example of the data:
"most_common_aircraft":[{"planned_aircraft":"B738/L","dcount":4592},{"planned_aircraft":"H/B744/L","dcount":3639},{"planned_aircraft":"H/B77L/L","dcount":2579},{"planned_aircraft":"H/B772/L","dcount":1894},{"planned_aircraft":"H/B763/L","dcount":1661},{"planned_aircraft":"H/B748/L","dcount":1303},{"planned_aircraft":"B712/L","dcount":1289},{"planned_aircraft":"B739/L","dcount":1198},{"planned_aircraft":"H/B77W/L","dcount":978},{"planned_aircraft":"B738","dcount":957}]
"H/B77L/L , B752/L, A320/X, B738,"
All I am interested in is the main 4 letters/numbers. For example, instead of "H/B77L/L" I want just "B77L", and instead of "B752/L" I want "B752". The data is very mixed: some values have letters in front, some at the end, some have both, and others are already in the format I want. Is there a way to remove the additional letters while extracting the data from the JSON file using Python? If not, since I am using Pandas, would it be better to extract everything into a dataframe and then compare it with another dataframe that has the correct codes without the additional letters?

I have managed to find the answer and solve my problem. I will put it here to help others who may have a similar problem:
for entry in json_data['results']:
    for value in entry['most_common_aircraft']:
        for splitted_string in value['planned_aircraft'].split('/'):
            if len(splitted_string) == 4:
                value['planned_aircraft'] = splitted_string
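A self-contained variant of the loop above might look like the sketch below. The sample JSON is a shortened, hypothetical copy of the data in the question, wrapped in a 'results' list because the loop assumes that key; aircraft_code is a helper name invented here. Unlike the loop above, which keeps the last 4-character part it finds, this helper returns the first one and leaves the string unchanged if there is none.

```python
import json

# Hypothetical sample shaped like the data in the question, wrapped in a
# 'results' list as the loop above assumes.
raw = '''
{"results": [{"most_common_aircraft": [
    {"planned_aircraft": "B738/L", "dcount": 4592},
    {"planned_aircraft": "H/B744/L", "dcount": 3639},
    {"planned_aircraft": "H/B77L/L", "dcount": 2579},
    {"planned_aircraft": "B738", "dcount": 957}
]}]}
'''


def aircraft_code(planned_aircraft):
    """Return the first 4-character part of a '/'-separated string.

    Falls back to the original string if no part has 4 characters.
    """
    for part in planned_aircraft.split("/"):
        if len(part) == 4:
            return part
    return planned_aircraft


json_data = json.loads(raw)
codes = [
    aircraft_code(value["planned_aircraft"])
    for entry in json_data["results"]
    for value in entry["most_common_aircraft"]
]
print(codes)
```

Keeping the extraction in a small function also makes it easy to reuse on a dataframe column later, if the comparison is done in Pandas.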

How to take multiple inputs from user in a single line & add them together

I'm trying to make a GPA calculator. I don't want anyone to do the whole thing for me; I'm just trying to figure out how to get 8 different values from the user in one line and add them together into one value. Most of the answers I've found online only talk about adding 2 values together, so they are not of much use to me.
I've tried using the ".split" function, but that's about it. I'm new to Python and don't have the background knowledge to try much else.
No code, just need help with this problem.
The expected result is to ask the user to enter 8 different grades between 0 and 100, then add them together into one value to later be divided.

If the GPAs come in this format:
'3.3 3.6 2.7'
then you can read them in like this:
gpas = input('Please enter the GPAs in one line separated by spaces: ').split()
and then you can loop through them (since split() returns a list), convert them to floats, and add them up, like so:
total = 0
for gpa in gpas:
    total += float(gpa)

From what I read, you're getting the user's input as a string and want to extract the numbers the user entered so you can operate on them; your problem is getting each separate number out of the input. I think this other question on SO may help. Once you have each 'word' as an element of a list, convert each element to an int to get the desired result.
Hope this helps!
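The splitting and summing steps described in the answers above can be sketched as one small function. This is an illustration only: total_from_line is a made-up name, and a fixed string is used in place of input() so the example runs without user interaction.

```python
def total_from_line(line):
    """Add up whitespace-separated numbers given on one line."""
    # split() with no argument also copes with repeated spaces and tabs.
    return sum(float(part) for part in line.split())


# A fixed string is used here in place of input() so the sketch runs
# without user interaction.
grades_line = "90 85 77.5 100 64 88 92 71"
total = total_from_line(grades_line)
print(total)
```

In a real program the string would come from input(), and the total could then be divided by the number of grades entered.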

Using Python & NLP, how can I extract certain text strings & corresponding numbers preceding the strings from Excel column having a lot of free text?

I am relatively new to Python and very new to NLP (and nltk). I have searched the net for guidance but have not found a complete solution. Unfortunately the sparse code I have been playing with is on another network, but I am including an example spreadsheet. I would like to get suggested steps in plain English (more detailed than I have below) so I can first try to script it myself in Python 3. Unless it would simply be easier for you to just help with the scripting... in which case, thank you.
Problem: A few columns of an otherwise robust spreadsheet are very unstructured with anywhere from 500-5000 English characters that tell a story. I need to essentially make it a bit more structured by pulling out the quantifiable data. I need to:
1) Search for a string in the user supplied unstructured free text column (The user inputs the column header) (I think I am doing this right)
2) Make that string a NEW column header in Excel (I think I am doing this right)
3) Grab the number before the string (This is where I am getting stuck. And as you will see in the sheet, sometimes there is no space between the number and text and of course, sometimes there are misspellings)
4) Put that number in the NEW column on the same row (Have not gotten to this step yet)
I will have to do this repeatedly for multiple keywords but I can figure that part out, I believe, with a loop or something. Thank you very much for your time and expertise...

If I'm understanding this correctly, first we need to obtain the numbers from the string of text.
cell_val = sheet1wb1.cell(row=rowNum,column=4).value
This will create a list containing every whitespace-separated token in the string that consists only of digits:
new_ = [int(s) for s in cell_val.split() if s.isdigit()]
print(new_)
Note that a number with no space between it and the following text is not picked up by split() and isdigit().
You can use the list to assign the values to the column.
Then assign the 1st number in the list to the 5th column (this assumes the list is not empty):
sheet1wb1.cell(row=rowNum, column=5).value = str(new_[0])
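To grab the number directly in front of a particular keyword, including the case with no space between them, a regular expression is one option. The sketch below is an assumption-laden illustration rather than a complete solution: number_before is a hypothetical helper, the sample text is invented, and misspelled keywords are not handled.

```python
import re


def number_before(text, keyword):
    """Return the number written directly before keyword, or None.

    Matches with or without a space between the number and the keyword,
    ignoring case. Misspelled keywords are not handled.
    """
    pattern = re.compile(
        r"(\d+(?:\.\d+)?)\s*" + re.escape(keyword), re.IGNORECASE
    )
    match = pattern.search(text)
    if match is None:
        return None
    return float(match.group(1))


# Hypothetical free-text cell contents.
cell_val = "Crew found 12 crates and 3pallets near the dock."
print(number_before(cell_val, "crates"))
print(number_before(cell_val, "pallets"))
print(number_before(cell_val, "boxes"))
```

The returned value (or an empty cell when it is None) could then be written to the new column on the same row, as in the answer above.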

I think I have found what I am looking for. https://community.esri.com/thread/86096 has 3 or 4 scripts that seem to do the trick. Thank you..!

Python inserting lists into a list with given length of the list

My problem is that I need a list with a length of 6:
list=[[],[],[],[],[],[]]
Ok, that's not difficult. Next I'm going to insert integers into the list:
list=[[60],[47],[0],[47],[],[]]
Here comes the real problem: how can I now extend the inner lists and keep filling them, so that the result looks something like this:
list=[[60,47,13],[47,13,8],[1,3,1],[13,8,5],[],[]]
I can't find a solution, because at the beginning I do not know the length of each inner list. I know they will all be the same length, but I can't say exactly what length they will have at the end, so I have to add one element at a time to each of these lists, and for some reason I can't.
Btw: This is not homework, it's part of a private project :)

You don't. You use normal list operations to add elements.
L[0].append(47)

Don't use the name list for your variable; it conflicts with the built-in function list().
my_list = [[],[],[],[],[],[]]
my_list[0].append(60)
my_list[1].append(47)
my_list[2].append(0)
my_list[3].append(47)
print(my_list) # prints [[60],[47],[0],[47],[],[]]
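Since the final length of the inner lists is not known in advance, the append() calls above can simply be repeated for each new batch of values. The sketch below assumes the values arrive in rounds, one value per inner list; the numbers in rounds are made up for illustration.

```python
# Build six independent lists. A comprehension is used on purpose:
# [[]] * 6 would create six references to the same list.
my_list = [[] for _ in range(6)]

# Hypothetical rounds of values; each round adds one element to each
# of the first four sublists.
rounds = [
    [60, 47, 0, 47],
    [47, 13, 3, 8],
    [13, 8, 1, 5],
]

for values in rounds:
    # zip() stops at the shorter input, so the last two sublists stay empty.
    for sublist, value in zip(my_list, values):
        sublist.append(value)

print(my_list)
```

Each inner list grows by one element per round, so no length has to be fixed up front.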
