Passing a Python variable to R using rpy2 - python

I have basic R script that performs a GLM on a MySQL dataset. This runs fine using Rscript in bash. However I would like to call it within a python script so I can add it to a loop, I can create the sql statement but I can't seem to pass it to R using rpy2;
for word in words:
sql_scores = "select a.article_id, response, score from scores as a join profile as b on a.article_id = b.article_id where response in (1,0) and keyword = '%s';" % (word[0])
robjects.r("library(RMySQL)")
robjects.r("mydb = dbConnect(MySQL(), user='me', password='xxxx', host='aws.host', dbname='mydb')")
robjects.r("results = fetch(dbSendQuery(mydb, '%s'))") % (sql_scores)
robjects.r("model <- glm(response ~ score , data=results, family=binomial)")
robjects.r("summary(model)")
If I print sql_scores I can run this fine directly in MySQL. However Python produces this error;
Loading required package: DBI
Traceback (most recent call last):
File "keyword_searcher.py", line 30, in <module>
robjects.r("results = fetch(dbSendQuery(mydb, '%s'))") % (sql_scores)
File "/usr/local/lib/python2.7/dist-packages/rpy2/robjects/__init__.py", line 268, in __call__
p = rinterface.parse(string)
ValueError: Error while parsing the string.
I can't figure out the proper syntax for:
robjects.r("results = fetch(dbSendQuery(mydb, %s))") % (sql_scores)

Use double quotes around the "%s" and single quotes around the robjects.r string:
robjects.r('results = fetch(dbSendQuery(mydb, "%s"))') % (sql_scores)
or use the format() method:
robjects.r('fetch(dbSendQuery(mydb, {0}))'.format(sql_scores))

You can access variables in the R environment with robjects.globalenv['varname'].
So an alternative way is:
robjects.globalenv['sql_scores'] = sql_scores
robjects.r("results = fetch(dbSendQuery(mydb, sql_scores))")

Related

CiscoConfigParser & Functions

I am trying to make the code cleaner and better manageable and i wanted to start with reading a cisco file. However when i try to put it in a function, it is not able to give me the outputs. The same works perfectly out of a function
Working model
parse = CiscoConfParse("C:\\python\\mydata\\TestConfigFile.txt")
TCPSrv = parse.find_objects("service\stcp\sdestination\seq")
UDPSrv = parse.find_objects("service\sudp\sdestination\seq")
ObjectNetwork = parse.find_objects("^object\snetwork\s")
ObjectGroupSrv = parse.find_objects("^object-group\sservice")
ObjectGroupNetwork = parse.find_objects("^object-group\snetwork\s")
This creates a list for all the above like the one below
TCPSrv = [<IOSCfgLine # 83 ' service tcp destination eq https' (parent is # 82)>,<IOSCfgLine # 97 ' service tcp destination eq www '(parent is # 102)>]
But when i put this into a function, it does not work. This is the first time i am trying out to use functions, and I know that i am doing something wrong.
This is my code for Functions
def cisco(filename):
parse = CiscoConfParse(filename)
TCPSrv = parse.find_objects("service\stcp\sdestination\seq")
UDPSrv = parse.find_objects("service\sudp\sdestination\seq")
ObjectNetwork = parse.find_objects("^object\snetwork\s")
ObjectGroupSrv = parse.find_objects("^object-group\sservice")
ObjectGroupNetwork = parse.find_objects("^object-group\snetwork\s")
return TCPSrv, UDPSrv, ObjectNetwork, ObjectGroupSrv, ObjectGroupNetwork
file = C:\\python\\mydata\\TestConfigFile.txt
cisco(file)
This does not give any output.
>>> TCPSrc
Traceback (most recent call last):
File "<input>", line 1, in <module>
NameError: name 'TCPSrc' is not defined
I have tried putting it like this below also
cisco("C:\\python\\mydata\\TestConfigFile.txt")
Can someone kindly assist what I am doing wrong.
This does not give any output
>>> TCPSrc
Traceback (most recent call last):
File "", line 1, in
NameError: name 'TCPSrc' is not defined
You have not assigned the return values to anything. When you call cisco(), you need to assign the return values to something... please use:
from ciscoconfparse import CiscoConfParse
def cisco(filename):
parse = CiscoConfParse(filename)
TCPSrv = parse.find_objects("service\stcp\sdestination\seq")
UDPSrv = parse.find_objects("service\sudp\sdestination\seq")
ObjectNetwork = parse.find_objects("^object\snetwork\s")
ObjectGroupSrv = parse.find_objects("^object-group\sservice")
ObjectGroupNetwork = parse.find_objects("^object-group\snetwork\s")
return TCPSrv, UDPSrv, ObjectNetwork, ObjectGroupSrv, ObjectGroupNetwork
values = cisco("C:\\python\\mydata\\TestConfigFile.txt")
TCPsrv = values[0]
UDPsrv = values[1]
# ... etc unpack the remaining values as illustrated above

Cannot generate subsets of feature class with arcpy (ArcGIS library in Python 2.7)

I'm having a hard time here on processing GIS data in Python, using library ArcPy.
I've been trying to generate independent features from a feature class based on a field of the attribute table which is a unique code representing productive forest units, but I can't get it done.
I've already done this in other situations, but this time I don't know what I am missing.
Here is the code and the error I get:
# coding utf-8
import arcpy
arcpy.env.overwriteOutput = True
ws = r'D:\Projeto_VANT\SIG\proc_parc.gdb'
arcpy.env.workspace = ws
talhoes = r'copy_talhoes'
estados = ('SP', 'MG')
florestas = ('PROPRIA', 'PARCERIA')
arcpy.MakeFeatureLayer_management(talhoes,
'talhoes_layer',
""" "ESTADO" IN {} AND "FLORESTA" IN {} """.format(estados, florestas),
ws)
arcpy.FeatureClassToFeatureClass_conversion(in_features = 'talhoes_layer',
out_path = ws,
out_name = 'talhoes1')
talhoes1 = r'talhoes1'
arcpy.AddField_management(talhoes1, 'CONCAT_T', 'TEXT')
arcpy.CalculateField_management(talhoes1, 'CONCAT_T', """ [ESTADO] & "_" & [CODIGO] & "_" & [TALHAO] """, 'VB')
with arcpy.da.SearchCursor(talhoes1, ['CONCAT_T', 'AREA']) as tal_cursor:
for x in tal_cursor:
print(x[0] + " " + str(x[1])) # This print is just to check if the cursor works and it does!
arcpy.MakeFeatureLayer_management(x,
'teste',
""" CONCAT_T = '{}' """.format(str(x[0]))) # Apparently the problem is here!
arcpy.CopyFeatures_management('teste',
'Layer{}'.format(x[0]))
Here is the error:
Traceback (most recent call last):
File "D:/ArcPy_Classes/Scripts/sampling_sig.py", line 32, in <module>
""" CONCAT_T = '{}' """.format(str(x[0])))
File "C:\Program Files (x86)\ArcGIS\Desktop10.5\ArcPy\arcpy\management.py", line 6965, in MakeFeatureLayer
raise e
RuntimeError: Object: Error in executing tool
I think the issue is with your In feature. you will want your in feature to be talhoes1 since x is the cursor object and not a feature.
arcpy.MakeFeatureLayer_management(talhoes1,'teste',""" CONCAT_T =
'{}'""".format(str(x[0])))

Python scripting with ete3 to query NCBI's Taxonomy: "sqlite3 Warning (can only execute one statement at a time)"

I am using this script:
import csv
import time
import sys
from ete3 import NCBITaxa
ncbi = NCBITaxa()
def get_desired_ranks(taxid, desired_ranks):
lineage = ncbi.get_lineage(taxid)
names = ncbi.get_taxid_translator(lineage)
lineage2ranks = ncbi.get_rank(names)
ranks2lineage = dict((rank,taxid) for (taxid, rank) in lineage2ranks.items())
return{'{}_id'.format(rank): ranks2lineage.get(rank, '<not present>') for rank in desired_ranks}
if __name__ == '__main__':
file = open(sys.argv[1], "r")
taxids = []
contigs = []
for line in file:
line = line.split("\n")[0]
taxids.append(line.split(",")[0])
contigs.append(line.split(",")[1])
desired_ranks = ['superkingdom', 'phylum']
results = list()
for taxid in taxids:
results.append(list())
results[-1].append(str(taxid))
ranks = get_desired_ranks(taxid, desired_ranks)
for key, rank in ranks.items():
if rank != '<not present>':
results[-1].append(list(ncbi.get_taxid_translator([rank]).values())[0])
else:
results[-1].append(rank)
i = 0
for result in results:
print(contigs[i] + ','),
print(','.join(result))
i += 1
file.close()
The script takes taxids from a file and fetches their respective lineages from a local copy of NCBI's Taxonomy database. Strangely, this script works fine when I run it on small sets of taxids (~70, ~100), but most of my datasets are upwards of 280k taxids and these break the script.
I get this complete error:
Traceback (most recent call last):
File "/data1/lstout/blast/scripts/getLineageByETE3.py", line 31, in <module>
ranks = get_desired_ranks(taxid, desired_ranks)
File "/data1/lstout/blast/scripts/getLineageByETE3.py", line 11, in get_desired_ranks
lineage = ncbi.get_lineage(taxid)
File "/data1/lstout/.local/lib/python2.7/site-packages/ete3/ncbi_taxonomy/ncbiquery.py", line 227, in get_lineage
result = self.db.execute('SELECT track FROM species WHERE taxid=%s' %taxid)
sqlite3.Warning: You can only execute one statement at a time.
The first two files from the traceback are simply the script I referenced above, the third file is one of ete3's. And as I stated, the script works fine with small datasets.
What I have tried:
Importing the time module and sleeping for a few milliseconds/hundredths of a second before/after my offending lines of code on lines 11 and 31. No effect.
Went to line 227 in ete3's code...
result = self.db.execute('SELECT track FROM species WHERE taxid=%s' %merged_conversion[taxid])
and changed the "execute" function to "executescript" in order to be able to handle multiple queries at once (as that seems to be the problem). This produced a new error and led to a rabbit hole of me changing minor things in their script trying to fudge this to work. No result. This is the complete offending function:
def get_lineage(self, taxid):
"""Given a valid taxid number, return its corresponding lineage track as a
hierarchically sorted list of parent taxids.
"""
if not taxid:
return None
result = self.db.execute('SELECT track FROM species WHERE taxid=%s' %taxid)
raw_track = result.fetchone()
if not raw_track:
#perhaps is an obsolete taxid
_, merged_conversion = self._translate_merged([taxid])
if taxid in merged_conversion:
result = self.db.execute('SELECT track FROM species WHERE taxid=%s' %merged_conversion[taxid])
raw_track = result.fetchone()
# if not raise error
if not raw_track:
#raw_track = ["1"]
raise ValueError("%s taxid not found" %taxid)
else:
warnings.warn("taxid %s was translated into %s" %(taxid, merged_conversion[taxid]))
track = list(map(int, raw_track[0].split(",")))
return list(reversed(track))
What bothers me so much is that this works on small amounts of data! I'm running these scripts from my school's high performance computer and have tried running on their head node and in an interactive moab scheduler. Nothing has helped.

Interpolating R objects into R code strings

I am using rpy2 to run some R commands. Dont ask why. It's necessary at this moment. So here's a part of the code.
import pandas.rpy.common as com
from rpy2.robjects import r
#Load emotionsCART decision tree. Successful.
r_dataframe = com.convert_to_r_dataframe(data)
print type(r_dataframe)
(<class 'rpy2.robjects.vectors.DataFrame'>)
r('pred = predict(emotionsCART, newdata = %s)') %(r_dataframe)
Here what I want to do is pass this r_dataframe into the calculation. I'm using the decision tree that I'd loaded earlier to predict the values. But the last line gives me an error. It says
Traceback (most recent call last):
File "<pyshell#38>", line 1, in <module>
r('pred = predict(emotionsCART, newdata = %s)') %(r_dataframe)
File "C:\Python27\lib\site-packages\rpy2\robjects\__init__.py", line 245, in __call__
p = rinterface.parse(string)
ValueError: Error while parsing the string.
Ideas why this is happening?
I think that:
r('pred = predict(emotionsCART, newdata = %s)') %(r_dataframe)
SHould be:
r('pred = predict(emotionsCART, newdata = %s)' % (r_dataframe) )
The %(r_dataframe) was associated with the r() part, while it should be associated with the '' (string).
But it is hard to check without a reproducible example.

Weird TypeError in jython/python?

I'm bukkit jython/python plugin programmer. Last few days, I'm struggling with this problem. I have to add an potion effect to an user, it's not problem if I enter effect name, duration and amplifier manually in code, but when I want to get them from config, I get this error:
13:38:20 [SEVERE] Could not pass event PlayerInteractEvent to ItemEffect v1.0
Traceback (most recent call last):
File "<iostream>", line 126, in onPlayerInteractEvent
TypeError: addPotionEffect(): 1st arg can't be coerced to org.bukkit.potion.Poti
onEffect
Here's that part of code:
effectname = section.getString("%s.effect"%currentKey)
duration = section.getInt("%s.duration"%currentKey)
durationinticks = duration * 20
geteffectname = "PotionEffectType.%s"%effectname
getpotioneffect = "PotionEffect(%s, %i, 1)"%(geteffectname, durationinticks)
geteffectname = "PotionEffectType.%s"%effectname
if iteminhand == currentKey:
event.getPlayer().addPotionEffect(getpotioneffect)
When I print getpotioneffect out, I get:
13:38:20 [INFO] PotionEffect(PotionEffectType.SPEED, 600, 1)
which is okay, and should work. I tested it without getting informations from config, and it works perfectly... To sum up, code above is not working, but below one works:
getpotioneffect = PotionEffect(PotionEffectType.SPEED, 600, 1)
if iteminhand == currentKey:
event.getPlayer().addPotionEffect(getpotioneffect)
Link to javadocs of this event!
http://jd.bukkit.org/rb/apidocs/org/bukkit/entity/LivingEntity.html#addPotionEffect(org.bukkit.potion.PotionEffect)
Thanks!
In your first snippet, getpotioneffect is a string. You can check it adding print type(getpotioneffect) somewhere.
What you want is to replace this :
geteffectname = "PotionEffectType.%s"%effectname
getpotioneffect = "PotionEffect(%s, %i, 1)"%(geteffectname, durationinticks)
with this:
effect_type = getattr(PotionEffectType, effectname)
potion_effect = PotionEffect(effect_type, durationinticks, 1)

Categories