Q6
4;99
3;4;8;9;14;18
2;3;8;12;18
2;3;11;18
2;3;8;18
2;3;4;5;6;7;8;9;11;12;15;16;17;18
2;3;4;8;9;10;11;13;18
1;3;4;5;6;7;13;16;17
2;3;4;5;6;7;8;9;11;12;14;15;18
3;11;18
2;3;5;8;9;11;12;13;15;16;17;18
2;5;11;18
1;2;3;4;5;8;9;11;17;18
3;7;8;11;13;14
2;3;8;18
2;13
2;3;5;8;9;11;12;13;18
2;3;4;9;11;12;18
2;3;5;9;11;18
1;2;3;4;5;6;7;8;9;11;14;15;16;17;18
2;3;8;11;13;18
import pandas as pd
df_1 = pd.read_csv('amazon_final 29082018.csv')
list_6 = list(df_1["Q6"])
list_6 = list(map(str, list_6))
list_7 = list(zip(list_6))
tem_list = []
for x in list_6:
    if ('3' in x[0]):
        tem_list.append('Fire')
    else:
        tem_list.append(None)
df_1.to_csv('final.csv', index=False)
I have many such columns in my data.
I want to extract the value '3' from this column. The code I wrote matches 3, but it also matches 13, 23, 33 and so on. I only want the count of rows that contain the value 3.
You need to break up the rows and convert each value to an integer. At the moment you are looking for the presence of the string "3" which is why strings like "2;13" pass the test. Try something like this:
list_6 = ["4;99", "3;4;8;9;14;18", "2;3;8;12;18", "2;3;11;18", "2;3;8;18",
"2;3;4;5;6;7;8;9;11;12;15;16;17;18", "2;3;4;8;9;10;11;13;18",
"1;3;4;5;6;7;13;16;17", "2;3;4;5;6;7;8;9;11;12;14;15;18", "3;11;18",
"2;3;5;8;9;11;12;13;15;16;17;18", "2;5;11;18", "1;2;3;4;5;8;9;11;17;18",
"3;7;8;11;13;14", "2;3;8;18", "2;13", "2;3;5;8;9;11;12;13;18",
"2;3;4;9;11;12;18", "2;3;5;9;11;18",
"1;2;3;4;5;6;7;8;9;11;14;15;16;17;18", "2;3;8;11;13;18"]
temp_list = []
for x in list_6:
    numbers = [int(num_string) for num_string in x.split(';')]
    if (3 in numbers):
        temp_list.append('Fire')
    else:
        temp_list.append('None')
print(temp_list)
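Since the question asks for a count of rows rather than a list of labels, the same split-and-compare idea can be reduced to a sum. This is only a minimal sketch: the helper name has_value and the short sample list are made up for illustration, and the pandas line in the final comment is an assumed usage, not tested against the real file.

```python
def has_value(cell, target):
    """Return True if target appears as a whole ';'-separated value in cell."""
    # Compare whole tokens so that '13' or '23' never count as '3'.
    return str(target) in [part.strip() for part in str(cell).split(';')]

# Hypothetical sample; the real values would come from df_1["Q6"].
sample = ["4;99", "3;4;8;9;14;18", "2;13", "2;3;8;18", "13;23;33"]

count = sum(has_value(cell, 3) for cell in sample)
print(count)

# With pandas, the same helper could be applied to the column (assumed usage):
# count = df_1["Q6"].astype(str).map(lambda cell: has_value(cell, 3)).sum()
```

Comparing strings instead of converting to int also tolerates cells that are empty or not numeric, which simply never match.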
I have some strings in a column and I want to explode the words out only if they are not within brackets. The column looks like this
pd.DataFrame(data={'a': ['first,string','(second,string)','third,string (another,string,here)']})
and I want the output to look like this
pd.DataFrame(data={'a': ['first','string','(second,string)','third','string','(another,string,here)']})
This sort of works, but I would like to avoid writing out the row number each time
re.split(r',(?![^()]*\))', x['a'][0])
re.split(r',(?![^()]*\))', x['a'][1])
re.split(r',(?![^()]*\))', x['a'][2])
I thought I could do it with a lambda function, but I cannot get it to work. Thanks for checking this out
x['a'].apply(lambda i: re.split(r',(?![^()]*\))', i))
It is not clear to me if the elements in your DataFrame may have multiple groups between brackets. Given that doubt, I have implemented the following:
import pandas as pd
import re
df = pd.DataFrame(data={'a': ['first,string','(second,string)','third,string (another,string,here)']})
pattern = re.compile(r"([^\(]*)([\(]?.*[\)]?)(.*)", re.IGNORECASE)
def findall(ar, res = None):
    if res is None:
        res = []
    m = pattern.findall(ar)[0]
    if len(m[0]) > 0:
        res.extend(m[0].split(","))
    if len(m[1]) > 0:
        res.append(m[1])
    if len(m[2]) > 0:
        # recurse on the unmatched remainder of the string
        return findall(m[2], res = res)
    else:
        return res
res = []
for x in df["a"]:
    res.extend(findall(x))
print(pd.DataFrame(data={"a":res}))
Essentially, you recursively scan the last part of the match until you find no more words between brackets. If order were not an issue, the solution would be easier.
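If brackets are never nested, an alternative is to let one regular expression pick out the tokens directly: either a whole bracketed group, or a run of characters that contains no comma, whitespace, or bracket. This is a sketch under that assumption, not the recursive approach described above; TOKEN and tokenize are illustrative names.

```python
import re

# Either a complete "(...)" group, or a run without commas, spaces, or brackets.
TOKEN = re.compile(r'\([^()]*\)|[^,\s()]+')

def tokenize(text):
    """Split text on commas and spaces, keeping bracketed groups whole."""
    return TOKEN.findall(text)

rows = ['first,string', '(second,string)', 'third,string (another,string,here)']

# Flatten the per-row token lists into one list, preserving order.
flat = [token for row in rows for token in tokenize(row)]
print(flat)
```

On the DataFrame itself, something like df['a'].str.findall(TOKEN.pattern).explode() should give the same rows, assuming a pandas version that provides explode.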
I have a function that returns either an empty list or a Series. I loop through a list of tickers, call the function for each one, and store the results in a single list.
After looping through all of them, I want to drop the empty lists so that only the Series remain in the list.
def get_revenue_growth(ticker) -> pd.DataFrame:
    income_statement_annually = fa.financial_statement_growth(ticker, FA_API_KEY, period="annual")
    if 'revenueGrowth' in income_statement_annually.index:
        revenue_growth = income_statement_annually.loc['revenueGrowth']
        exchange_df = pd.DataFrame({ticker : revenue_growth})
        exchange_df.index = pd.to_datetime(pd.Series(exchange_df.index))
        exchange_df = exchange_df[exchange_df.index.year >= 1998]
        exchange_df = exchange_df.sort_index()
        print('Getting Revenue Growth for ' + ticker + ': Passed')
    else:
        print('Getting Revenue Growth for ' + ticker + ': Failed')
        exchange_df = []
    return exchange_df
This is the function I am calling via this:
revenue_growth = [get_revenue_growth(t) for t in tickers]
Here is what the output looks like...
What I am trying to achieve is to remove all the empty lists. I tried list2 = [x for x in list1 if x != []], but it did not work.
You can simply solve it via:
list2 = [x for x in list1 if len(x)>0]
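The attempt with x != [] probably fails because comparing a pandas object with a list is not a plain True/False test. One way to sidestep the comparison entirely is to check the type first. The sketch below is an illustration only: drop_empty_lists is a made-up helper, and plain strings stand in for the Series results so that it runs without pandas.

```python
def drop_empty_lists(results):
    """Keep everything except empty-list placeholders."""
    # isinstance is checked first, so non-list objects are never compared to [].
    return [r for r in results if not (isinstance(r, list) and len(r) == 0)]

# Strings are stand-ins for the per-ticker Series (hypothetical data).
list1 = ["ticker A data", [], "ticker B data", [], []]
list2 = drop_empty_lists(list1)
print(list2)
```

Unlike the len(x) > 0 test, this keeps a result that is a pandas object with zero rows, which may or may not be what is wanted.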
Look at this example:
mylist = []
if len(mylist) == 0:
    del mylist  # Deletes the empty list
else:
    pass  # Do something else
Modify this piece for your program.
I have a string and I created a JSON array which contains strings and values:
amount = 0
a = "asaf,almog,arnon,elbar"
values_li={'asaf':'1','almog':'6','elbar':'2'}
How can I write a loop that looks up every key of values_li in a and, for each key it finds, does the following:
amount = amount + value(the value that found from value_li in a)
I tried to do this but it doesn't work:
for k,v in values_li.items():
    if k in a:
        amount = amount + v
It's working now.
I figured out my problem.
v is a string and I was trying to do arithmetic with it, so I had to convert v to an int:
amount = amount + int(v)
Now It's working :)
You should be careful using:
if k in a:
a is the string: "asaf,almog,arnon,elbar" not a list. This means that:
"bar" in a # == True
"as" in a # == True
and so on, which is probably not what you want.
You should consider splitting it into a list, so that only complete names match. With that you can simply use:
a = "asaf,almog,arnon,elbar".split(',')
values_li={'asaf':'1','almog':'6','elbar':'2'}
amount = sum([int(values_li[k]) for k in a if k in values_li])
# 9
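The difference between substring matching and whole-name matching shows up as soon as one key is part of another name. In this sketch the entry 'bar' is a made-up addition to values_li, included only to trigger the false positive.

```python
a = "asaf,almog,arnon,elbar"
# 'bar' is hypothetical: it is not a name in a, only a substring of 'elbar'.
values_li = {'asaf': '1', 'almog': '6', 'elbar': '2', 'bar': '100'}

# Substring test against the raw string: 'bar' is counted by mistake.
substring_total = sum(int(v) for k, v in values_li.items() if k in a)

# Membership test against the split names: only complete names match.
names = a.split(',')
exact_total = sum(int(values_li[name]) for name in names if name in values_li)

print(substring_total, exact_total)
```

Here the substring version adds the 100 for 'bar', while the split version sums only the values for asaf, almog and elbar.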
collections.Counter() is your friend:
from collections import Counter
a = "asaf,almog,arnon,elbar"
values_li = Counter({'asaf':1,'almog':6,'elbar':2})
values_li.update(a.split(','))
values_li
That will result in:
Counter({'almog': 7, 'elbar': 3, 'asaf': 2, 'arnon': 1})
And if you want the sum of all values in values_li, you can simply do:
sum(values_li.values())
Which will result in 13, for the key/value pairs in your example.
I've just started studying Python in college and I have a problem with this exercise:
Basically, I have to take a list of integers, for example [10,2,2013,11,2,2014,5,23,2015], turn the elements needed to form a date into a string, like ['1022013',1122014,5232015], and then put a / between the parts so I get ['10/2/2013', '11/22/2014','05/23/2015']. It needs to be a function, and the length of the list is assumed to be a multiple of 3. How do I go about doing this?
I wrote this code to start:
def convert(lst):
    for element in lst:
        result = str(element)
        return result
but for the list [1,2,3] it only returns '1'.
To split your list into size 3 chunks you use a range with a step of 3
for i in range(0, len(l), 3):
    print(l[i:i+3])
And joining the pieces with / is as simple as
'/'.join([str(x) for x in l[i:i+3]])
Throwing it all together into a function:
def make_times(l):
    results = []
    for i in range(0, len(l), 3):
        results.append('/'.join([str(x) for x in l[i:i+3]]))
    return results
testList = [10,2,2013,11,2,2014,5,23,2015]
def convert(inputList):
    tempList = []
    for i in range (0, len(inputList), 3): #Repeats every 3 elements
        newDate = str(inputList[i])+"/"+str(inputList[i+1])+"/"+str(inputList[i+2]) #Joins everything together
        tempList.append(newDate)
    return tempList
print(convert(testList))
Use datetime to build the date and strftime to format it:
from datetime import datetime
dates = [10,2,2013,11,2,2014,5,23,2015]
for i in range(0, len(dates), 3):
    d = datetime(dates[i+2], dates[i], dates[i+1])
    print(d.strftime("%m/%d/%y"))
OUTPUT
10/02/13
11/02/14
05/23/15
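If a four-digit year and a list result are wanted, as the question describes, the same datetime approach can be wrapped in a function. This is a sketch only: convert_dates is an illustrative name, and the month, day, year order is assumed from the example input.

```python
from datetime import datetime

def convert_dates(values):
    """Turn a flat [month, day, year, ...] list into 'MM/DD/YYYY' strings."""
    result = []
    for i in range(0, len(values), 3):
        month, day, year = values[i:i + 3]
        # datetime raises ValueError for impossible dates such as month 13.
        result.append(datetime(year, month, day).strftime("%m/%d/%Y"))
    return result

print(convert_dates([10, 2, 2013, 11, 2, 2014, 5, 23, 2015]))
```

Note that %m and %d zero-pad, so the day 2 comes out as 02; plain string joining is the better fit if unpadded values are required.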
Something like this would work:
def convert(lst):
    string = ''
    new_lst = []
    for x in lst:
        if len(str(x)) < 4:
            string += str(x)+'/'
        else:
            string += str(x)
            new_lst.append(string)
            string = ''
    return(new_lst)
lst = [10,2,2013,11,2,2014,5,23,2015]
lst = convert(lst)
print(lst)
#output
['10/2/2013', '11/2/2014', '5/23/2015']
So create a placeholder string and a new list. Then loop through each element in your list. If the element is not a year, then add it to the string with a '/'. If it is a year, add the string to the new list and clear the string.
Here is my code, which updates several columns based on the split values of the column 'location'. The code works fine, but because it iterates row by row it is not efficient enough. Can anyone help me make it faster, please?
for index, row in df.iterrows():
    print(index)
    location_split = row['location'].split(':')
    after_county = False
    after_province = False
    for l in location_split:
        if l.strip().endswith('ED'):
            df.loc[index, 'electoral_district'] = l
        elif l.strip().startswith('County'):
            df.loc[index, 'county'] = l
            after_county = True
        elif after_province == True:
            if l.strip() != 'Ireland':
                df.loc[index, 'dublin_postal_district'] = l
        elif after_county == True:
            df.loc[index, 'province'] = l.strip()
            after_province = True
'map' was what I needed :)
def fill_county(column):
    res = ''
    location_split = column.split(':')
    for l in location_split:
        if l.strip().startswith('County'):
            res = l.strip()
            break
    return res
# Series.map applies the function to each value and returns a Series
df['county'] = df['location'].map(fill_county)
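The same idea can be extended from the county to all four fields by moving the body of the original row loop into one function that parses a single location string. The sketch below is an assumption-laden illustration: parse_location is a made-up name, the sample string is hypothetical, and every stored value is stripped, which differs slightly from the original loop.

```python
def parse_location(location):
    """Parse one ':'-separated location string into its named parts."""
    fields = {
        "electoral_district": None,
        "county": None,
        "province": None,
        "dublin_postal_district": None,
    }
    after_county = False
    after_province = False
    for part in location.split(":"):
        part = part.strip()
        if part.endswith("ED"):
            fields["electoral_district"] = part
        elif part.startswith("County"):
            fields["county"] = part
            after_county = True
        elif after_province:
            # Anything after the province, except the country, is the postal district.
            if part != "Ireland":
                fields["dublin_postal_district"] = part
        elif after_county:
            # The first part after the county is taken to be the province.
            fields["province"] = part
            after_province = True
    return fields

# Hypothetical input in the layout the loop seems to expect.
example = "Rathmines ED:County Dublin:Leinster:Dublin 6:Ireland"
parsed = parse_location(example)
print(parsed)
```

With pandas, the dictionaries could then be turned into columns in one pass, for example with pd.DataFrame(df['location'].map(parse_location).tolist(), index=df.index), instead of assigning cell by cell.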