Python module to merge DXF files - python

I am looking for a Python module that will merge DXF files. I have found dxfgrabber and ezdxf, but they seem to be aimed at different applications than what I am after.
I am using ExpressPCB, which outputs each layer of the PCB, the holes, and the silkscreen as separate files. For my application I want to combine all of these individual DXF files into a single file.
As far as I know, the origins, etc are the same so it should all line up as it would in real life.
Currently, neither of these modules has a tutorial for this type of application. Some pseudocode to get the idea across in a Pythonic way:
dxf_file1 = read(file1)
dxf_file2 = read(file2)
dxf_file3 = read(file3)
out_file.append(dxf_file1)
out_file.append(dxf_file2)
out_file.append(dxf_file3)
out_file.save()
In my application, the files will all have the same origin point and will never overlap, so you should be able to easily merge the files somehow. Thank you for the help in advance!

You can use the rewritten Importer add-on in ezdxf v0.10:
import ezdxf
from ezdxf.addons import Importer
def merge(source, target):
    importer = Importer(source, target)
    # import all entities from source modelspace into target modelspace
    importer.import_modelspace()
    # import all required resources and dependencies
    importer.finalize()
base_dxf = ezdxf.readfile('file1.dxf')
for filename in ('file2.dxf', 'file3.dxf'):
    merge_dxf = ezdxf.readfile(filename)
    merge(merge_dxf, base_dxf)
# base_dxf.save() # to save as file1.dxf
base_dxf.saveas('merged.dxf')
This importer supports just basic graphics like LINE, CIRCLE, ARC and DIMENSION (without dimension style override) and so on.
All XDATA and 3rd party data will be ignored, but your files seem to be simple enough for this to work.
Documentation of the Importer add-on can be found here.

import ezdxf
from ezdxf.addons import geo
from shapely.geometry import shape
from shapely.ops import unary_union
def dxf2shapley(filename):
    # read the LINE entities of the modelspace and convert them to a shapely geometry
    doc = ezdxf.readfile(filename)
    msp = doc.modelspace()
    entities = msp.query('LINE')
    proxy = geo.proxy(entities)
    shapley_polygon = shape(proxy)
    if not shapley_polygon.is_valid:
        raise Exception('polygon is not valid')
    return shapley_polygon
h0 = dxf2shapley('test-0.dxf')
h1 = dxf2shapley('test-1.dxf')
polygons = [h0, h1]
# merge the geometries of both files
polyout = unary_union(polygons)
# convert the merged geometry back to DXF entities
result = geo.dxf_entities(polyout, polygon=2)
doc = ezdxf.new('R2010')
msp = doc.modelspace()
for entity in result:
    msp.add_entity(entity)
doc.saveas('test_merged.dxf')

Related

Interactive Xarray dataset raster visualisation app using Panel and hvplot

I am trying to replicate the Glaciers Demo using an Xarray Dataset of geospatial data. I can create pretty much exactly what I want in a notebook, but I am trying to build a Panel app that lets the user select one of the data_vars, each of which has different dimensions that I want to make interactive, and visualize it on an interactive map with at least the continent outlines. Here is what my Xarray Dataset looks like:
def plot(field):
    return xds[field].hvplot.image().opts(cmap='jet',height=650,width=1300,data_aspect=1)
interact(plot, field = list(xds.data_vars))
and here is what the code above produces in a notebook:
I would like to integrate the selector for the data_vars and then, depending on the selected variable's dimensions, show interactive maps with controls for all of its dimensions (ES has (time, pres1, lat, lon) while P0 has only (time, lat, lon)). I would like the controls in the sidebar and the plots in the main area of the following template:
from turtle import width
from matplotlib.pyplot import title
import panel as pn
import numpy as np
import holoviews as hv
from panel.template import DefaultTheme
from pathlib import Path
import fstd2nc
import hvplot.xarray
import xarray as xr
from unicodedata import name
import hvplot
import param
from panel.interact import interact
pn.extension(sizing_mode='stretch_width')
bootstrap = pn.template.MaterialTemplate(title='Material Template', theme=DefaultTheme, )
glob_path = Path(r"C:\Users\spart\Documents\Anaconda-Work-Dir")
file_list = [str(pp).split('\\')[-1] for pp in glob_path.glob("2022*")]
phase = pn.widgets.FloatSlider(name="Phase", start=0, end=np.pi)
fileSel = pn.widgets.Select(name='Select File', options=file_list)
@pn.depends(fileSel=fileSel)
def selectedFile(fileSel):
    base_path = r"C:\Users\spart\Documents\Anaconda-Work-Dir\{}".format(fileSel)
    return pn.widgets.StaticText(name='Selected', value=base_path)
@pn.depends(fileSel=fileSel)
def dataXArray(fileSel):
    base_path = r"C:\Users\spart\Documents\Anaconda-Work-Dir\{}".format(fileSel)
    xds = fstd2nc.Buffer(base_path).to_xarray()
    return xds.ES.hvplot( width=500)
bootstrap.sidebar.append(fileSel)
bootstrap.sidebar.append(selectedFile)
bootstrap.main.append(
    pn.Row(
        pn.Card(hv.DynamicMap(dataXArray), title='Plot'),
    )
)
bootstrap.show()
EDIT : Here is a link to an example dataset which can be loaded with the following code
xds = fstd2nc.Buffer(PATH_TO_FILE).to_xarray()
Without the data file I can't easily run the code, but some observations:
If using bare functions like this rather than classes, I'd recommend using pn.bind rather than pn.depends; it really helps get the code organized better.
For a simple application like this, I'd use hvPlot .interactive: https://hvplot.holoviz.org/user_guide/Interactive.html
I can't seem to find this in the docs, but you can pull out the widgets from the result of dataXArray (or any other hvplot or holoviews object) using .widgets(), and you can then put that in the sidebar. You can then pull out just the plot using .panel(), and put that in the main area.
If that helps you get it done, then great; if not please post a sample data file or two so that it's runnable, and I can look into it further. And please submit a PR to the docs once you get it working so that future users have less trouble!

PyQGIS changing basemap crs when importing a csv

I am using PyQGIS to import a CSV file with latitude and longitude columns, using the appropriate CRS of EPSG:4326.
I'm plotting this onto Google Maps.
I load my basemap, then import my CSV. The issue is that my basemap projection then changes to 4326 and I need it to remain on 3857.
I've tried importing the basemap after the CSV and moving it down in the layers, however this still changes the projections.
import requests
from PyQt5.QtGui import *
from PyQt5.QtCore import *
from qgis.core import *
from qgis.utils import iface
from qgis import core
#Use Google Street Map as QGIS basemap.
service_url = "mt1.google.com/vt/lyrs=m&x={x}&y={y}&z={z}"
service_uri = "type=xyz&zmin=0&zmax=21&url=https://"+requests.utils.quote(service_url)
tms_layer = iface.addRasterLayer(service_uri, "GoogleSat", "wms")
#Import CSV and plot.
uri = 'file:///home/user/fred.csv?type=csv&xField=%s&yField=%s&crs=%s' % ("Site Longitude", "Site Latitude", "EPSG:4326")
layer_csv = QgsVectorLayer(uri, 'fred', 'delimitedtext')
layer_csv.isValid()
QgsProject.instance().addMapLayer(layer_csv)
I'll be the first to admit I'm a novice with QGIS!
It seems this has something to do with the application not refreshing properly as mentioned in this answer on gis stack. You may want to look into it for details.
To answer your question in brief, you can add QApplication.instance().processEvents() after QgsProject.instance().addMapLayer(layer_csv) and then use setCrs() to set your basemap CRS to whatever value you need. It will hold.
from qgis.PyQt.QtWidgets import QApplication
proj = QgsProject.instance()
proj.addMapLayer(layer_csv)
# This line makes the difference
QApplication.instance().processEvents()
# This sets the project CRS back to 3857
proj.setCrs(QgsCoordinateReferenceSystem("EPSG:3857"))

imread_collection() Load Patterns (scikit-images Python)

I am having an issue (likely my own mistake) using imread_collection() to load a set of 480 .tif images from a folder.
I have an external drive with 480 images in it such that the path name for each image is:
'D:\img_channel000_position000_time000000000_z000.tif',
'D:\img_channel000_position000_time000000001_z000.tif',
'D:\img_channel000_position000_time000000002_z000.tif'
and so on. The 480 images are the only objects on the external drive. I know this is the path name as I have successfully used
import skimage
from skimage import io
image = skimage.io.imread('D:\img_channel000_position000_time000000000_z000.tif')
to import an image and perform a first-pass at the analysis I was looking to accomplish. I, perhaps naively, then attempted to use the following code to import the entirety of the collection
import skimage
from skimage import io
ic = skimage.io.imread_collection('D:\*.tif')
However, the variable ic is never even created. The code runs successfully without error, but nothing occurs. Is this a problem with how I have implemented the load pattern? I have also tried the more complete D:\img_channel000_position000_*_z000.tif, but nothing occurred. Any advice would be greatly appreciated!
The issue might be, as @Juan pointed out in the comments, that Linux and Windows use different directory separators. One way to make your code platform-independent is to build the pattern with os.path.join, like this:
In [18]: import os
In [19]: from skimage import io
In [20]: external_drive = 'D:'
In [21]: file_spec = '*.tif'
In [22]: load_pattern = os.path.join(external_drive, file_spec)
In [23]: ic = io.imread_collection(load_pattern)
In [24]: ic
Out[24]: <skimage.io.collection.ImageCollection at 0x1f94f27f080>
In [25]: ic.files
Out[25]: ['D:\\img_001.tif', 'D:\\img_002.tif', 'D:\\img_003.tif']
Test performed on a machine with Windows 10 and Python 3.6.3 (Anaconda).
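One caveat worth checking, sketched below with only the standard library: according to the Python documentation for os.path.join, joining a bare drive letter such as 'D:' with a file name yields a drive-relative path (D:*.tif) rather than a path at the drive root (D:\*.tif). The snippet uses ntpath (the Windows flavour of os.path, importable on any platform) to show the difference. The helper matching_files is hypothetical and not part of scikit-image; it simply lists what a pattern matches before the pattern is handed to imread_collection.

```python
import glob
import ntpath
import os

# ntpath is the Windows implementation of os.path and can be used on any OS.
drive_relative = ntpath.join("D:", "*.tif")   # relative to the current directory on D:
drive_root = ntpath.join("D:\\", "*.tif")     # anchored at the root of D:


def matching_files(folder, file_spec="*.tif"):
    """Return the sorted list of paths in folder that match file_spec.

    Hypothetical helper for illustration; an empty list means the
    pattern matched nothing.
    """
    return sorted(glob.glob(os.path.join(folder, file_spec)))


if __name__ == "__main__":
    print(drive_relative)
    print(drive_root)
    print(matching_files("."))
```

If matching_files returns an empty list for the folder in question, the load pattern itself is the problem rather than imread_collection.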

Remove all images from docx files

I've searched the documentation for python-docx and other packages, as well as Stack Overflow, but could not find how to remove all images from docx files with Python.
My exact use case: I need to convert hundreds of Word documents to a "draft" format to be viewed by clients. Those drafts should be identical to the original documents, but all the images must be deleted or redacted from them.
Sorry for not including an example of things I tried; all I have is hours of research that didn't turn up anything. I found this question on how to extract images from Word files, but that doesn't delete them from the actual document: Extract pictures from Word and Excel with Python
From there and other sources I've found out that docx files can be read as simple zip files. I don't know whether it's possible to "re-zip" the file without the images without affecting the integrity of the docx file (edit: simply deleting the images works, but prevents python-docx from continuing to work with the file because of the missing references to images), but I thought this might be a path to a solution.
Any ideas?
If your goal is to redact images, maybe this code I used for a similar use case could be useful:
import sys
import zipfile
from PIL import Image, ImageFilter
import io
blur = ImageFilter.GaussianBlur(40)
def redact_images(filename):
    outfile = filename.replace(".docx", "_redacted.docx")
    with zipfile.ZipFile(filename) as inzip:
        with zipfile.ZipFile(outfile, "w") as outzip:
            for info in inzip.infolist():
                name = info.filename
                print(info)
                content = inzip.read(info)
                if name.endswith((".png", ".jpeg", ".gif")):
                    fmt = name.split(".")[-1]
                    img = Image.open(io.BytesIO(content))
                    img = img.convert().filter(blur)
                    outb = io.BytesIO()
                    img.save(outb, fmt)
                    content = outb.getvalue()
                    info.file_size = len(content)
                    info.CRC = zipfile.crc32(content)
                outzip.writestr(info, content)
Here I used PIL to blur the images in some files, but any other suitable operation could be used instead of the blur filter. This worked quite nicely for my use case.
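If Pillow is not available, the same re-zip idea can be sketched with the standard library alone: instead of blurring, every PNG under word/media/ is overwritten with a tiny blank placeholder, so the image references inside the document stay intact. The names tiny_png and blank_png_images are hypothetical, only .png entries are handled, and the assumption that Word stretches the 1x1 placeholder to the original frame size has not been verified here.

```python
import struct
import zipfile
import zlib


def _chunk(tag, data):
    """Build one PNG chunk: length, tag, data, CRC over tag + data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def tiny_png():
    """Return the bytes of a valid 1x1 white, 8-bit grayscale PNG."""
    # width, height, bit depth, colour type, compression, filter, interlace
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    pixel_row = b"\x00\xff"  # filter type 0, then one white pixel
    return (b"\x89PNG\r\n\x1a\n"
            + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", zlib.compress(pixel_row))
            + _chunk(b"IEND", b""))


def blank_png_images(src, dst):
    """Copy the docx at src to dst, replacing PNGs in word/media/.

    Returns the list of archive members that were replaced. Every other
    member, including the XML that references the images, is copied as-is.
    """
    placeholder = tiny_png()
    replaced = []
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for info in zin.infolist():
            name = info.filename
            data = zin.read(name)
            if name.startswith("word/media/") and name.lower().endswith(".png"):
                data = placeholder
                replaced.append(name)
            zout.writestr(name, data)
    return replaced
```

Because no archive member is removed, the relationships that python-docx relies on still resolve; JPEG or GIF images would need their own placeholder bytes.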
I don't think it's currently implemented in python-docx.
Pictures in the Word Object Model are defined as either floating shapes or inline shapes. The docx documentation states that it only supports inline shapes.
The Word Object Model for Inline Shapes supports a Delete() method, which should be accessible. However, it is not listed in the examples of InlineShapes and there is also a similar method for paragraphs. For paragraphs, there is an open feature request to add this functionality - which dates back to 2014! If it's not added to paragraphs it won't be available for InlineShapes as they are implemented as discrete paragraphs.
You could do this with win32com if you have a machine with Word and Python installed.
This would allow you to call the Word Object Model directly, giving you access to the Delete() method. In fact you could probably cheat - rather than scrolling through the document to get each image, you can call Find and Replace to clear the image. This SO question talks about win32com find and replace:
import win32com.client
from os import getcwd, listdir
docs = [i for i in listdir('.') if i[-3:]=='doc' or i[-4:]=='docx'] #All Word file
FromTo = {"First Name":"John",
          "Last Name":"Smith"} #You can insert as many as you want
word = win32com.client.DispatchEx("Word.Application")
word.Visible = True #Keep comment after tests
word.DisplayAlerts = False
for doc in docs:
    word.Documents.Open('{}\\{}'.format(getcwd(), doc))
    for From in FromTo.keys():
        word.Selection.Find.Text = From
        word.Selection.Find.Replacement.Text = FromTo[From]
        word.Selection.Find.Execute(Replace=2, Forward=True) #You made the mistake here=> Replace must be 2
    name = doc.rsplit('.',1)[0]
    ext = doc.rsplit('.',1)[1]
    word.ActiveDocument.SaveAs('{}\\{}_2.{}'.format(getcwd(), name, ext))
word.Quit() # releases Word object from memory
In this case, since we want images, we would need to use the short-code ^g as the Find.Text and an empty string as the replacement.
find = word.Selection.Find
find.Text = "^g"
find.Replacement.Text = ""
find.Execute(Replace=2, Forward=True) # Replace=2 replaces every match, as in the loop above
I don't know this library, but looking through the documentation I found this section about images. It mentions that it is currently not possible to insert images other than inline ones. If that is what you currently have in your documents, I assume you can also retrieve them by looking in the Document object and then remove them?
The Document is explained here.
Although not a duplicate, you might also want to look at this question's answer where user "scanny" explains how he finds images using the library.

Python activity logging

I have a question regarding logging for somescript.py.
The script performs some actions to find matches for words the user is looking for in pages that have become unreadable due to re-formatting and printing of the pages.
Because of this, OCR techniques don't work for us anymore, so I've come up with a script that compares contours of words to find matches.
The script looks something like this:
import sys
import cv2
import numpy as np
method = cv2.TM_SQDIFF_NORMED
template_name = "this.png"
image_name = "3.tif"
needle = cv2.imread(template_name)
haystack = cv2.imread(image_name)
# Convert to gray:
needle_g = cv2.cvtColor(needle, cv2.COLOR_BGR2GRAY)
haystack_g = cv2.cvtColor(haystack, cv2.COLOR_BGR2GRAY)
# Attempt match (the image to search comes first, then the template)
d = cv2.matchTemplate(haystack_g, needle_g, method)
# We want the minimum squared difference
mn, _, mnLoc, _ = cv2.minMaxLoc(d)
print(mnLoc)
# Draw the rectangle
MPx, MPy = mnLoc
trows, tcols = needle_g.shape[:2]
# Normed methods (method values 1, 3, 5) give better results; the others sometimes show errors
cv2.rectangle(haystack, (MPx, MPy), (MPx + tcols, MPy + trows), (0, 0, 255), 2)
cv2.imshow('output', haystack)
cv2.waitKey(0)
sys.exit(0)
Now I want to log the various tasks that the script performs, like
converting the image to grayscale
attempting a match
drawing the rectangle
I have seen a few scripts on Stack Overflow explaining how to log an entire script or its entire output, but I haven't found anything that logs just a few actions.
I would also like to add the date and time each activity was performed.
Furthermore, I have written a function that calculates an MD5 and a SHA1 hash of the input files, in this particular case 'this.png' and '3.tif'. I have yet to integrate this piece of code, but would it be easy to log that as well?
I am a Python noob, so if the answers are obvious to you, you know why I couldn't figure it out myself.
I hope you can help me out on this one!
