Parametrized complex string working with gpkg in python - python

I need to construct a string like this:
out = 'ogr:dbname=\'C:\\output\\2020.gpkg\' table=\"2020\" (geom) sql='
Here is my code:
import glob, time, sys, threading, os
from datetime import date, timedelta, datetime
import time, threading
#Parameters
layer = 'C:\\layer.gpkg'
n ='2020'
outdir = 'C:\\output'
#Process
l = os.path.realpath(layer)
pn = os.path.realpath(outdir + '/' + n + '.gpkg')
p = f"'{pn}'"
f = f"'{n}'"
o = f'ogr:dbname={p} table={f} (geom) sql='
#Test
out = 'ogr:dbname=\'C:\\output\\2020.gpkg\' table=\"2020\" (geom) sql='
o == out
The goal is to get o == out.
What do I need to change in the #Process part in order to get this as True ?
Moreover I need to run this either in linux or windows.
My final goal is to create a function that, given 3 strings, returns the complex string shown above.

Assuming you are using Python 3.6 or above, you should use formatted string literals (also known as f-strings) to construct strings from variables. Start the string with the letter "f" and then put whatever variables you want in curly brackets {}. Also, if you use single quotes as the outer quotes, you don't have to escape double quotes inside the string, and vice versa.
Code:
db_name = "'home/user/output/prueba.gpkg'"
table_name = '"prueba"'
outputlayer = f'ogr:dbname={db_name} table={table_name} (geom) sql='
outputlayer
Output:
'ogr:dbname=\'home/user/output/prueba.gpkg\' table="prueba" (geom) sql='

I think one reason this isn't working is the path in pn = os.path.realpath(outdir + '/' + n + '.gpkg'). It combines the UNIX path separator / with the Windows separator \\. A more portable solution between Linux and Windows is to use the os.path.join function.
Additionally, inside a string literal you only need to escape the quote character that delimits it (' or "); the other kind can be written as-is. Since the quotes around both values are required and are of different kinds (single around the path, double around the table name), you're probably better off writing them directly into one f-string instead of setting 2 new variables with different quote types.
import os
#Parameters
layer = 'C:\\layer.gpkg'
n ='2020'
outdir = 'C:\\output'
#Process
l = os.path.realpath(layer)
pn = os.path.realpath(os.path.join(outdir, f"{n}.gpkg"))
o = f'ogr:dbname=\'{pn}\' table=\"{n}\" (geom) sql='
#Test
out = 'ogr:dbname=\'C:\\output\\2020.gpkg\' table=\"2020\" (geom) sql='
o == out
A version of this (different path) has been tested to work on my linux machine.
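Since the stated goal is a function that returns this string, here is a minimal sketch of one. The function name build_ogr_output, its parameters, and the pathmod argument are my own assumptions, not anything required by OGR or QGIS. Passing ntpath as pathmod forces Windows-style joining on any platform, which makes it possible to check the result against the expected Windows string even on Linux; by default the function uses os.path, so the separator follows the machine it runs on.

```python
import ntpath
import os


def build_ogr_output(outdir, name, geom_column="geom", pathmod=os.path):
    """Build the 'ogr:dbname=... table=... (geom) sql=' string.

    pathmod is a hypothetical hook: os.path by default, or ntpath/posixpath
    to force one platform's path rules.
    """
    gpkg_path = pathmod.join(outdir, f"{name}.gpkg")
    # Single quotes around the path, double quotes around the table name.
    return f"ogr:dbname='{gpkg_path}' table=\"{name}\" ({geom_column}) sql="


expected = 'ogr:dbname=\'C:\\output\\2020.gpkg\' table="2020" (geom) sql='
result = build_ogr_output("C:\\output", "2020", pathmod=ntpath)
print(result == expected)  # True
```

I left os.path.realpath out on purpose: on Linux it would prepend the current directory to a path such as 'C:\\output', so the comparison with the Windows string could never succeed there.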

Another option is to use a triple quoted string:
dbname = """/home/user/output/prueba.gpkg"""
outputlayer = """ogr:dbname='"""+dbname+"""' table="prueba" (geom) sql="""
which gives:
'ogr:dbname=\'/home/user/output/prueba.gpkg\' table="prueba" (geom) sql='

Related

Is the for loop in my code the speed bottleneck?

The following code looks through 2500 markdown files with a total of 76475 lines, to check each one for the presence of two strings.
#!/usr/bin/env python3
# encoding: utf-8
import re
import os
zettelkasten = '/Users/will/Dropbox/zettelkasten'
def zsearch(s, *args):
    for x in args:
        r = (r"(?=.* " + x + ")")
        p = re.search(r, s, re.IGNORECASE)
        if p is None:
            return None
    return s
for filename in os.listdir(zettelkasten):
    if filename.endswith('.md'):
        with open(os.path.join(zettelkasten, filename),"r") as fp:
            for line in fp:
                result_line = zsearch(line, "COVID", "vaccine")
                if result_line != None:
                    UUID = filename[-15:-3]
                    print(f'›[[{UUID}]] OR', end=" ")
This correctly gives output like:
›[[202202121717]] OR ›[[202003311814]] OR
However, it takes almost two seconds to run on my machine, which I think is much too slow. What, if anything, can be done to make it faster?
The main bottleneck is the regular expressions you're building.
If we print(f"{r=}") inside the zsearch function:
>>> zsearch("line line covid line", "COVID", "vaccine")
r='(?=.* COVID)'
r='(?=.* vaccine)'
The (?=.*) lookahead is what is causing the slowdown - and it's also not needed.
You can achieve the same result by searching for:
r=' COVID'
r=' vaccine'
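As a rough sketch of that change, zsearch could look like the following. It keeps the leading space from the original pattern and only drops the lookahead. The re.escape call is my own addition, a precaution in case a search term ever contains regex metacharacters; it makes no difference for "COVID" or "vaccine".

```python
import re


def zsearch(s, *args):
    """Return s if every term occurs in s (case-insensitively), else None."""
    for x in args:
        # Same " term" pattern as before, minus the (?=.*) lookahead.
        # re.escape is an added precaution, not part of the original code.
        if re.search(" " + re.escape(x), s, re.IGNORECASE) is None:
            return None
    return s


print(zsearch("a new COVID vaccine trial", "COVID", "vaccine"))  # a new COVID vaccine trial
print(zsearch("line line covid line", "COVID", "vaccine"))       # None
```

How much time this saves will depend on the files, so it is worth timing both versions on your own data.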

Python match '.\' at start of string

I need to identify where some powershell path strings cross over into Python.
How do I detect if a path in Python starts with .\ ??
Here's an example:
import re
file_path = ".\reports\dsReports"
if re.match(r'.\\', file_path):
    print "Pass"
else:
    print "Fail"
This Fails, in the debugger it lists
expression = .\\\\\\
string = .\\reports\\\\dsReports
If I try using replace like so:
import re
file_path = ".\reports\dsReports"
testThis = file_path.replace(r'\', '&jkl$ff88')
if re.match(r'.&jkl$ff88', file_path):
    print "Pass"
else:
    print "Fail"
the testThis variable ends up like this:
testThis = '.\\reports&jkl$ff88dsReports'
Quite aggravating.
This happens because \r is an escape sequence (a carriage return), so the string never contains a backslash after the dot. You will need to either escape the backslashes by doubling them, or use a raw string literal like this:
file_path = r".\reports\dsReports"
And then check if it starts with ".\\":
if file_path.startswith('.\\'):
    do_whatever()
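A small sketch of the difference, using made-up variable names. It also shows a regex version: note that in a pattern the dot has to be escaped too, otherwise it matches any character.

```python
import re

escaped = ".\reports"          # "\r" is read as a carriage return
raw = r".\reports\dsReports"   # raw string: the backslashes are kept literally

print(escaped.startswith(".\\"))            # False
print(raw.startswith(".\\"))                # True
# Regex equivalent: a literal dot followed by a literal backslash.
print(re.match(r"\.\\", raw) is not None)   # True
```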

How to use PyParsing's QuotedString?

I'm trying to parse a string which contains several quoted values. Here is what I have so far:
from pyparsing import Word, Literal, printables
package_line = "package: name='com.sec.android.app.camera.shootingmode.dual' versionCode='6' versionName='1.003' platformBuildVersionName='5.0.1-1624448'"
package_name = Word(printables)("name")
versionCode = Word(printables)("versionCode")
versionName = Word(printables)("versionName")
platformBuildVersionName = Word(printables)("platformBuildVersionName")
expression = Literal("package:") + "name=" + package_name + "versionCode=" + versionCode \
+ "versionName=" + versionName + "platformBuildVersionName=" + platformBuildVersionName
tokens = expression.parseString(package_line)
print tokens['name']
print tokens['versionCode']
print tokens['versionName']
print tokens['platformBuildVersionName']
which prints
'com.sec.android.app.camera.shootingmode.dual'
'6'
'1.003'
'5.0.1-1624448'
Note that all the extracted tokens are contained within single quotes. I would like to remove these, and it seems like the QuotedString object is meant for this purpose. However, I'm having difficulty adapting this snippet to use QuotedStrings; in particular, their constructor doesn't seem to take printables.
How might I go about removing the single quotes?
Replacing the expressions with the following (and adding QuotedString to the names imported from pyparsing):
package_name = QuotedString(quoteChar="'")("name")
versionCode = QuotedString(quoteChar="'")("versionCode")
versionName = QuotedString(quoteChar="'")("versionName")
platformBuildVersionName = QuotedString(quoteChar="'")("platformBuildVersionName")
seems to work. Now the script prints the output
com.sec.android.app.camera.shootingmode.dual
6
1.003
5.0.1-1624448
without quotation marks.

How to specify python regular expression that starts with "testing" without declaring the content variable?

I have written following python code, to uninstall "testing_kip-win32" software in windows.
import wmi
import re
c = wmi.WMI()
print ("Searching for matching products...")
for product in c.Win32_Product(Name = "testing_kip-win32"):
    print ("Uninstalling" + product.Name + "...")
    result = product.Uninstall()
But in the above code, instead of giving full name "testing_kip-win32", I want to give a software name that startswith "testing". Then this script should uninstall "testing_kip-win32".
Any ideas please?
Thanks in advance
NOTE : I am using python 2.7
You can use the regular expression module re. Here c is the wmi.WMI() object from the question, and calling c.Win32_Product() without a filter should iterate over all installed products. The code you need will look something like this:
import re
for product in c.Win32_Product():
    m = re.match("^testing", product.Name)
    if m is not None:
        print ("Uninstalling " + product.Name + "...")
        result = product.Uninstall()
You could also consider using the string method startswith(), which works something like this:
for product in c.Win32_Product():
    if product.Name.startswith("testing"):
        print ("Uninstalling " + product.Name + "...")
        result = product.Uninstall()
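The prefix test itself can be tried without WMI. In this small sketch the product names are made up for illustration; both approaches select the same entries, because re.match only matches at the start of the string.

```python
import re

# Hypothetical product names, standing in for product.Name values.
installed = ["testing_kip-win32", "other_tool", "ptesting_kip-win32"]

by_regex = [name for name in installed if re.match("testing", name)]
by_prefix = [name for name in installed if name.startswith("testing")]

print(by_regex)   # ['testing_kip-win32']
print(by_prefix)  # ['testing_kip-win32']
```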

Mac to Windows Python

I have recently moved a set of near identical programs from my mac to my school's windows, and while the paths appear to be the same (or the tail end of them), they will not run properly.
import glob
import pylab
from pylab import *
def main():
    outfnam = "igdata.csv"
    fpout = open(outfnam, "w")
    nrows = 0
    nprocessed = 0
    nbadread = 0
    filenames = [s.split("/")[1] for s in glob.glob("c/Cmos6_*.IG")]
    dirnames = "c an0 an1 an2 an3 an4".split()
    for suffix in filenames:
        nrows += 1
        row = []
        row.append(suffix)
        for dirnam in dirnames:
            fnam = dirnam+"/"+suffix
            lines = [l.strip() for l in open(fnam).readlines()]
            nprocessed += 1
            if len(lines)<5:
                nbadread += 1
                print "warning: file %s contains only %d lines"%(fnam, len(lines))
                tdate = "N/A"
                irrad = dirnam
                Ig_zeroVds_largeVgs = 0.0
            else:
                data = loadtxt(fnam, skiprows=5)
                tdate = lines[0].split(":")[1].strip()
                irrad = lines[3].split(":")[1].strip()
                # pull out last column (column "-1") from second-to-last row
                Ig_zeroVds_largeVgs = data[-2,-1]
            row.append(irrad)
            row.append("%.3e"%(Ig_zeroVds_largeVgs))
        fpout.write(", ".join(row) + "\n")
    print "wrote %d rows to %s"%(nrows, outfnam)
    print "processed %d input files, of which %d had missing data"%( \
        nprocessed, nbadread)
This program worked fine on my Mac, but on Windows the final print statements
print "wrote %d rows to %s"%(nrows, outfnam)
print "processed %d input files, of which %d had missing data"%( \
nprocessed, nbadread)
keep giving me:
wrote 0 row to file name
processed 0 input files, of which 0 had missing data
On my Mac I got 144 rows written to the file.
Does anyone have any suggestions?
If the script doesn't raise any errors, this piece of code is most likely returning an empty list.
glob.glob("c/Cmos6_*.IG")
Seeing as glob.glob works perfectly fine with forward slashes on Windows, the problem is most likely that it's not finding the files, which most likely means that the string you provided has an error somewhere in it. Make sure there isn't any error in "c/Cmos6_*.IG".
If the problem isn't caused by this, then unfortunately, I have no idea why it is happening.
Also, when I tried it, filenames returned by glob.glob have backslashes in them on Windows, so you should probably split by "\\" instead.
Off the top of my head, it looks like a problem with using / in the path. Windows uses \ instead.
os.path contains a number of functions to ease working with paths across platforms.
Your s.split("/") should definitely be s.split(os.sep), or better, os.path.basename(s). I got bitten by this once. :)
In fact, glob returns paths with \ on Windows and / on Mac OS X, so you need to do your splitting with the appropriate path separator, os.sep (not os.pathsep, which separates the entries of PATH-style lists).
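To illustrate, here is a short sketch with hypothetical file names. It uses ntpath and posixpath so that both platforms' behavior can be seen on one machine; in the real script, os.path.basename would pick the right rules automatically.

```python
import ntpath
import posixpath

# Hypothetical results, shaped like what glob returns on each platform.
windows_result = "c\\Cmos6_001.IG"
mac_result = "c/Cmos6_001.IG"

# Splitting on "/" finds nothing to split in the Windows-style path,
# so indexing the result with [1] would raise an IndexError.
print(windows_result.split("/"))         # ['c\\Cmos6_001.IG']

# basename extracts the file name under each platform's rules.
print(ntpath.basename(windows_result))   # Cmos6_001.IG
print(posixpath.basename(mac_result))    # Cmos6_001.IG
```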
