How to get true raw data input from a webcam, preferably with Python?

I'm interested in getting the raw data that is streamed from a webcam, before it is transformed into binary -> pixels, or whatever the sequence is.
From my understanding, libraries like OpenCV won't help with this.
Edit: kiss

To see the image output from a webcam using Python, you need to convert the raw image data (the pixels) into an image that can be displayed. One way to do this is with matplotlib; more info here: https://matplotlib.org/stable/tutorials/introductory/images.html
This will display the image but will not convert it into other image formats such as .png, .jpeg, etc. The data you are getting is probably pixel data, which is the rawest image data you will get.

Related

Extracting small region of image from BMP file in Python

I have a BMP image and I want to extract a small portion of it and save it as a new BMP image file.
I was able to load and read the image, but I was not able to extract a small portion of it, as I am new to manipulating BMP images with Python, and it is not the same as reading a text file.
What I need to do:
Load the image.
Extract a small portion of the image, e.g. a 40x40 pixel region from a 900x900 image file.
Save the extracted image as a new file, e.g. new.bmp.
I have been trying to do this for the last 3 days and have searched a lot online, but the only solutions I found use the Pillow library, and I need to do this without any external Python module. Stack Overflow is my last hope; I would appreciate some guidance from the experts here.
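The three steps listed above can be sketched with only the standard library. This is a minimal sketch, not a general BMP reader: it assumes an uncompressed, bottom-up, 24-bit BMP with a standard header layout, and the function name crop_bmp and its parameters are made up for illustration.

```python
import struct

def crop_bmp(data, left, top, width, height):
    """Crop a region out of an uncompressed 24-bit BMP held in a bytes object.

    left/top are measured from the top-left corner of the image.
    Returns the bytes of a new, complete BMP file.
    """
    if data[:2] != b"BM":
        raise ValueError("not a BMP file")
    # Offset of the pixel array is stored at byte 10 of the file header.
    pixel_offset = struct.unpack_from("<I", data, 10)[0]
    # Width, height, planes, bits per pixel, compression start at byte 18.
    src_w, src_h, _planes, bpp, compression = struct.unpack_from("<iiHHI", data, 18)
    if bpp != 24 or compression != 0 or src_h < 0:
        raise ValueError("only bottom-up, uncompressed 24-bit BMPs are handled")
    if (width <= 0 or height <= 0 or left < 0 or top < 0
            or left + width > src_w or top + height > src_h):
        raise ValueError("crop region lies outside the image")

    # Each row is padded to a multiple of 4 bytes.
    src_stride = (src_w * 3 + 3) & ~3
    dst_stride = (width * 3 + 3) & ~3
    padding = b"\x00" * (dst_stride - width * 3)

    rows = []
    # Rows are stored bottom-up, so emit the crop from its bottom row upward.
    for y in range(top + height - 1, top - 1, -1):
        file_row = src_h - 1 - y
        start = pixel_offset + file_row * src_stride + left * 3
        rows.append(data[start:start + width * 3] + padding)
    pixels = b"".join(rows)

    # 14-byte file header followed by a 40-byte BITMAPINFOHEADER.
    file_header = struct.pack("<2sIHHI", b"BM", 54 + len(pixels), 0, 0, 54)
    info_header = struct.pack("<IiiHHIIiiII", 40, width, height, 1, 24, 0,
                              len(pixels), 2835, 2835, 0, 0)
    return file_header + info_header + pixels
```

For the case in the question, one would read the source file in 'rb' mode, call crop_bmp(data, 0, 0, 40, 40), and write the returned bytes to new.bmp in 'wb' mode. BMPs with a palette, compression, or another bit depth would need extra handling.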

How to read RGB image (JPG) without additional module in python

I am trying to read image.jpg (RGB) into an array in Python without any additional module, but it doesn't work.
pic = open('image.jpg')
array=[]
with open(p, 'rb') as inf:
    jpgdata = inf.read()
values=jpgdata.split()
array=array.append(values[:][:])
print (array)
Can anyone help me read a 3-band (RGB) image in Python without using an external module?
A JPEG image is not just a series of pixels, unlike some other formats like BMP.
In order to get the pixel data from a JPEG image you need to decompress it, which involves reading its header data, then rebuilding the data from 8x8px blocks which contain information regarding the brightness and color (YCbCr).
You need to:
Build the Huffman tables and decode the blocks
Invert the discrete cosine transform with the given parameters
Convert the YCbCr values back into RGB
Place each block into its corresponding location in the image
Building a simple decoder from scratch is certainly possible, but it's not going to be done in a few lines.
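To give a sense of one of the smaller steps, the YCbCr to RGB conversion can be sketched as below. It assumes the full-range JFIF convention (8-bit samples, chroma centered on 128); the function names are illustrative, and the Huffman decoding and inverse DCT stages are not shown.

```python
def clamp(value):
    """Round to the nearest integer and limit the result to 0..255."""
    return max(0, min(255, int(round(value))))

def ycbcr_to_rgb(y, cb, cr):
    """Convert one full-range (JFIF) YCbCr sample to an (R, G, B) tuple."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp(r), clamp(g), clamp(b)
```

A decoder would apply this to every sample of each reconstructed 8x8 block after upsampling the chroma channels, if they were subsampled.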

OCR for Bank Receipts

I am working on an OCR problem for bank receipts, and I need to extract details such as the date and account number. After processing the input, I am using Tesseract OCR (via pytesseract in Python). I have obtained the hOCR output file, but I am not able to make sense of it. How do we extract information from the hOCR output file? Note that the receipt has numbers filled in boxes, as on normal forms.
I used the code below for extraction. Should I use a different encoding?
import os
if os.path.isfile('output.hocr'):
    with open('output.hocr', 'r', encoding='UTF-8') as fp:
        text = fp.read()
Note: The attached image is one example of the data. These images are available in PDF files, which I am converting programmatically into images.
Personally, I would use something like Tesseract to do the OCR, and then perhaps something like OpenCV with SURF for the tick boxes...
or even do edge detection with OpenCV and SURF for each section and OCR that specific area, which is more robust than analyzing the whole document.
You can simply provide the image as input, instead of processing and creating an HOCR output file.
Try:
from PIL import Image
import pytesseract
im = Image.open("reciept.jpg")
text = pytesseract.image_to_string(im, lang = 'eng')
print(text)
This program takes the location of the image to run through OCR, extracts the text from it, stores it in the variable text, and prints it. You can also store the text in a separate file if you want.
P.S.: The image you are trying to process is far more complex than the images Tesseract is designed to handle, so you may get incorrect results. I would definitely recommend optimizing before use, e.g. reducing the character set, preprocessing the image before passing it to OCR, upsampling the image, using a DPI over 250, etc.
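If the hOCR file from the question is still needed, for instance to know where each word sits on the page, it can be read with the standard library. The sketch below assumes the usual hOCR layout in which each word is a span with class ocrx_word and a title attribute containing "bbox x0 y0 x1 y1". The class and function names are illustrative.

```python
from html.parser import HTMLParser

def parse_bbox(title):
    """Return (x0, y0, x1, y1) from an hOCR title attribute, or None."""
    for prop in title.split(";"):
        parts = prop.split()
        if len(parts) == 5 and parts[0] == "bbox":
            return tuple(int(p) for p in parts[1:])
    return None

class HocrWordParser(HTMLParser):
    """Collects (text, bbox) pairs for every ocrx_word span."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._in_word = False
        self._bbox = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            self._in_word = True
            self._bbox = parse_bbox(attrs.get("title") or "")
            self._text = []

    def handle_data(self, data):
        # Text may be wrapped in <strong> or <em> inside the word span.
        if self._in_word:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._in_word:
            self.words.append(("".join(self._text).strip(), self._bbox))
            self._in_word = False

def extract_words(hocr_text):
    """Parse hOCR markup and return a list of (text, bbox) tuples."""
    parser = HocrWordParser()
    parser.feed(hocr_text)
    return parser.words
```

Calling extract_words(text) on the contents of output.hocr would give a list of words with their pixel coordinates, which could then be filtered by position to pick out fields such as the date or account number.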

Loading and saving raw images

I'm looking to be able to read in pixel values as captured in a raw NEF image, process the data for noise removal, and then save the new values back into the raw image format maintaining all the metadata for later use. I've seen dcraw can read in raw format and output the Bayer pattern data as a tiff or other image but I can't save it back to my NEF. I've also been attempting to read in and save the image with simple python file open or numpy memmap but have no clue how to handle the binary data. Any help would be appreciated. Thanks!

Making a 3 Colour FITS file using aplpy

I am trying to make a three colour FITS image using the aplpy.make_rgb_image function. I use three separate FITS images in RGB to do so and am able to save a colour image in PNG, JPEG, and similar formats, but I would prefer to save it as a FITS file.
When I try that I get the following error.
IOError: FITS save handler not installed
I've searched the web for a solution for a few days but was unable to get any good results.
Would anyone know how to get such a handler installed, or perhaps any other approach I could use to get this done?
I don't think there is enough information for me to answer your question completely; for example, I don't know what call you are making to perform the "image" "save", but I can guess:
FITS does not store RGB data like you wish it to. FITS can store multi-band data as individual monochromatic data layers in a multi-extension data "cube". Software, including ds9 and aplpy, can read that FITS data cube and author RGB images in RGB formats (png, jpg...). The error you see comes from PIL, which has no backend to author FITS files (I think, but the validity of that point doesn't matter).
So I think that you should use aplpy.make_rgb_cube to save a 3 HDU FITS cube based on your 3 input FITS files, then import that FITS cube back into aplpy and use aplpy.make_rgb_image to output RGB compatible formats. This way you have the saved FITS cube in a near-native astronomy format, and a means to create RGB formats from a variety of tools that can import that cube.
