Problems with SQL and Python to extract values

I have a problem when I extract values from a DB with sqlite3. Every time I extract a value, for example a number, it has this format:
[(6,)]
That is a list containing a tuple. I want only the value 6, without the comma, the parentheses, and the brackets.
Thanks in advance for your help!

This is common to all SQL adapters: they always return a list of rows, and each row is a tuple. Even fetchone() gives you a tuple, because the adapter doesn't know whether you asked for one value or one row of values, so to stay consistent it always uses a tuple.
x[0] is your first row
x[0][0] is your first item in that row.
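For instance, a minimal sqlite3 sketch (the scores table here is made up for illustration):

import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE scores (value INTEGER)")
cur.execute("INSERT INTO scores VALUES (6)")

rows = cur.execute("SELECT value FROM scores").fetchall()
print(rows)        # [(6,)]  a list of one-element tuples
print(rows[0])     # (6,)    the first row
print(rows[0][0])  # 6       the first item in that row
conn.close()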

Related

Wrong value of row obtained

I am using the Python mysql.connector module to access a MySQL table.
The MySQL database has a table policy_data which has a few columns, one being application_no.
The application_no column has values of the form t000..., eight characters in total including the leading t.
Ideally, the first value of the application_no column is t0000001.
So I pass a command (from Python):
cursor.execute("select * application_no from policy_data where ... (some condition)")
data = cursor.fetchall()
appl = data[0][0]  # this should give me 't0000001'
Here's the problem: I tried the same query directly in MySQL, and it gives me t0000001. But from Python (the above code), the value appl = data[0][0] comes back as just 't'.
I even tried wrapping the received value in str(), but it still doesn't work.
data = cursor.fetchall() returns a list of tuples (one tuple for each row of your table).
appl = data[0][0] returns the first element of the first tuple, namely the value of the first column of the first row in your query result.
Given this, if the column application_no is second in your query result (which it may well be, since you use * in your query), you will get the values of this column with data[i][1].
So if you check appl = data[0][1], it should return your desired output 't0000001'.
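To see the indexing, here is a sketch using a literal list shaped like a fetchall() result (the values are made up):

# fetchall() returns a list of tuples, one tuple per row
data = [("t0000001", "other_col"), ("t0000002", "other_col")]

print(data[0])     # ('t0000001', 'other_col')  the first row
print(data[0][0])  # 't0000001'                 first column of the first row
print(data[0][1])  # 'other_col'                second column of the first row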
If I understand it correctly, your SQL query returns a list of strings. By doing
appl = data[0][0]
you grab the first string and then its first character, 't'. Maybe give
appl = data[0]
a try.

Extract values from array type of column in pandas

I am trying to extract the location codes / product codes from a SQL table using pandas. The field is an array type, i.e. it has multiple values as a list within each row, and I have to extract the product/location codes from those strings.
Here is a sample of the table
df.head()
Target_Type  Constraints
45           ti_8188,to_8188,r_8188,trad_8188_1,to_9258,ti_9258,r_9258,trad_9258_1
45           ti_8188,to_8188,r_8188,trad_8188_1,trad_22420_1
45           ti_8894,trad_8894_0.2
Now I want to extract the numeric values of the codes. I also want to ignore the trailing values after the second underscore in the entries, i.e. ignore the _1, _0.2, etc.
Here is a sample of the output I want to achieve. It should be a unique list / DataFrame column of all the extracted values:
Target_Type_45_df.head()
Constraints
8188
9258
22420
8894
I have never worked with nested/array type of column before. Any help would be appreciated.
You can use explode to bring each value into its own cell, under one column:
df['Constraints'] = df['Constraints'].str.split(',')  # if the column holds comma-separated strings, turn them into lists first
df = df.explode('Constraints')                        # one entry per row
df['newConst'] = df['Constraints'].apply(lambda x: str(x).split('_')[1])
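A fuller sketch of that approach on the sample data from the question, with drop_duplicates added to get the unique codes:

import pandas as pd

df = pd.DataFrame({
    "Target_Type": [45, 45, 45],
    "Constraints": [
        "ti_8188,to_8188,r_8188,trad_8188_1,to_9258,ti_9258,r_9258,trad_9258_1",
        "ti_8188,to_8188,r_8188,trad_8188_1,trad_22420_1",
        "ti_8894,trad_8894_0.2",
    ],
})

df["Constraints"] = df["Constraints"].str.split(",")      # strings -> lists
df = df.explode("Constraints")                            # one entry per row
df["newConst"] = df["Constraints"].str.split("_").str[1]  # part after the first underscore

unique_codes = df["newConst"].drop_duplicates().reset_index(drop=True)
print(unique_codes.tolist())  # ['8188', '9258', '22420', '8894']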
I would think the following overall strategy would work well (you'll need to debug):
Define a function that takes a row as input (the idea being to broadcast this function with the pandas .apply method).
In this function, set my_list = row['Constraints'].
Then do my_list = my_list.split(','). Now you have a list, with no commas.
Next, split with the underscore, take the second element (index 1), and convert to int:
numbers = [int(element.split('_')[1]) for element in my_list]
Finally, convert to set: return set(numbers)
The output for each row will be a set - just union all these sets together to get the final result.
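A sketch of that strategy, with a hypothetical extract_codes function and a small sample frame:

import pandas as pd

df = pd.DataFrame({"Constraints": [
    "ti_8188,to_8188,trad_8188_1,trad_22420_1",
    "ti_8894,trad_8894_0.2",
]})

def extract_codes(row):
    entries = row["Constraints"].split(",")         # split on commas
    return {int(e.split("_")[1]) for e in entries}  # int after the first underscore

row_sets = df.apply(extract_codes, axis=1)  # one set per row
result = set().union(*row_sets)             # union them all together
print(sorted(result))  # [8188, 8894, 22420]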

Best way to fuzzy match values in a data frame and then replace the value?

I'm working with a dataframe containing various datapoints of customer data. I want to replace any junk phone number with a blank value. Right now I'm struggling to find an efficient way to locate potential junk values, such as a phone number like 111-111-1111, and replace each one with a blank entry.
I currently have a fairly ugly solution where I go through three fields (home phone, cell phone, and work phone), locate the index values of the rows and columns in question, and then replace those values.
With regard to actually finding junk values in a dataframe, is there a better approach than what I am currently doing?
row_index = dataset[dataset['phone'].str.contains('11111')].index
column_index = dataset.columns.get_loc('phone')
Afterwards, I would zip these up and cycle through them in a for loop, using dataset.iat[row_index, column_index] = ''. The row and column index variables also have the junk values from the 'cellphone' and 'workphone' columns appended.
Pandas' where function tends to be quick:
dataset['phone'] = dataset['phone'].where(~dataset['phone'].str.contains('11111'), None)
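A sketch extending that to the three fields from the question (the column names home_phone, cell_phone, and work_phone are assumptions, and '' is used since a blank entry was requested):

import pandas as pd

dataset = pd.DataFrame({
    "home_phone": ["111-111-1111", "555-123-4567"],
    "cell_phone": ["555-987-6543", "111-111-1111"],
    "work_phone": ["111-111-1111", "555-000-4321"],
})

for col in ["home_phone", "cell_phone", "work_phone"]:
    # keep values that do NOT contain the junk pattern, blank out the rest
    dataset[col] = dataset[col].where(~dataset[col].str.contains("11111"), "")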

Query external tuples list embedded within a SQL query

I have to run a SQL query that grabs the values only if two conditions are true. For example, I need to grab all the values where asset = x and id_name = 12345. There are about 10k combinations of asset and id_name that I need to be able to query for using SQL. Usually I would just do the following:
select * from database where id_name IN (12345)
But how do I write this query when two conditions have to be true: id_name has to equal 12345 AND asset has to equal x?
I tried turning the list I need into tuples like this:
new_list = list(scams[['asset', 'id_name']].itertuples(index=False, name=None))
which gives me a list like this:
new_list = [(12345, x), (32342, z), ...]
Any suggestions would be great. Thanks!
Based on my understanding, you need to query or fetch records based on a combination of two filters, and you have around 10K combinations. Here is a simple SQL-based solution.
Create a new column in the same table, or build a temp table/view with a new column, say "column_new". Populate it with the concatenated value of id_name and asset. You can use a concatenation function appropriate to your database; for example, in SQL Server use CONCAT(column1, column2).
Now you can write your SQL as select * from database where column_new IN ('12345x', '32342z');.
Note: you can also put a separator such as "-" or "|" between column 1 and column 2 when concatenating, so that distinct pairs cannot produce the same concatenated key.
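A sketch of building that query from the pairs list in Python, concatenating inline with CONCAT rather than materializing column_new (the table name policy_table and the DB-API cursor are assumptions):

# the (id_name, asset) pairs, as produced by itertuples in the question
new_list = [(12345, "x"), (32342, "z")]

# keys matching CONCAT(id_name, asset) on the SQL side
keys = [f"{id_name}{asset}" for id_name, asset in new_list]

placeholders = ", ".join(["%s"] * len(keys))
sql = f"SELECT * FROM policy_table WHERE CONCAT(id_name, asset) IN ({placeholders})"
# cursor.execute(sql, keys)  # run with your database cursor

With ~10k pairs this produces a very long IN list; loading the pairs into a temp table and joining, as the answer suggests, may scale better.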

SQL Alchemy Filter rows based on the values contained in cells of other column

I am new to Python and SQLAlchemy, and I have a doubt: can we filter rows of a table based on the cell values of a column of the same table?
example:
Sbranch = value
result = (Transaction.query.filter(Transaction.branch == Sbranch)
          .order_by(desc(Transaction.id)).limit(50).all())
If the value of Sbranch is 0, I want to read all the rows regardless of the branch value; otherwise I want only the rows where Transaction.branch == Sbranch.
I know that this can be achieved by comparing the values of Sbranch (if-else conditions), but it gets complicated as the number of such columns increases.
Example:
Sbranch=value1
trans_by=value2
trans_to=value3
.
.
result = (Transaction.query.filter(Transaction.branch == Sbranch,
                                   Transaction.trans_by == trans_by,
                                   Transaction.trans_to == trans_to)
          .order_by(desc(Transaction.id)).limit(50).all())
I want to apply a similar filter to all three columns.
I want to know if there is any built-in function in SQLAlchemy for this problem.
You can optionally add the filter based on the value of SBranch:
query = Transaction.query
if SBranch != 0:
    query = query.filter(Transaction.branch == SBranch)
result = query.order_by(Transaction.id.desc()).limit(50).all()
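The same pattern extends to the other columns; a sketch assuming trans_by and trans_to follow the same 0-means-no-filter convention:

# hypothetical: each (column, value) pair is filtered only when the value is not 0
filters = [
    (Transaction.branch, Sbranch),
    (Transaction.trans_by, trans_by),
    (Transaction.trans_to, trans_to),
]

query = Transaction.query
for column, value in filters:
    if value != 0:
        query = query.filter(column == value)

result = query.order_by(Transaction.id.desc()).limit(50).all()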
I think I found a solution; it's not the best, but it reduces the work for the developer (not the processor).
Sbranch = value
branches = []
if Sbranch == 0:
    # append all the values for which rows should be returned,
    # for example:
    branches = [1, 2, 4, 7, 3, 8]
else:
    branches.append(Sbranch)
result = (Transaction.query.filter(Transaction.branch.in_(branches))
          .order_by(desc(Transaction.id)).limit(50).all())
