CSV credential python one-liner

I have a csv file like this :
name,username
name2,username2
etc...
And I need to extract each column into lists so I can create an account (admin script).
I am hoping the result would look like this:
NAMES=( name name2 )
MAILS=( username username2 )
LENGTH=3 # the number of lines in the csv file
I would like to do it in Python (because I use it elsewhere in my script and would like to convert my colleagues to the dark side). Except that I am not really a Python user...
Something like this would do the trick (I assume) :
NAMES=( $(echo "$csv" | pythonFooMagic) )
MAILS=( $(echo "$csv" | python -c "import sys,csv; pythonFooMagic2") )
LENGTH=$(echo "$csv" | pythonFooMagic3)
I kind of found tutorials for doing it across several lines, but glued together it was ugly.
There must be some cool way to do it. Otherwise I will resign myself to using sed... Any ideas?
EDIT: OK, bad idea; for future reference, see the comments.

You could use a temporary file, like this:
tmpfile=$(mktemp)
# The Python script produces the variables you want
pythonFooMagic < csv > "$tmpfile"
# Then pick up the variables somehow. For example...
source "$tmpfile"
rm "$tmpfile"
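For reference, a non-golfed sketch of what those pythonFooMagic steps could do with the csv module (the sample data is inlined from the question; in the real one-liners it would come from stdin):

```python
# Sketch of the "pythonFooMagic" steps: read the CSV once, then emit the
# space-separated words that bash's NAMES=( ... ) and MAILS=( ... ) capture.
import csv
import io

data = "name,username\nname2,username2\n"
rows = list(csv.reader(io.StringIO(data)))

names = [r[0] for r in rows]
mails = [r[1] for r in rows]
length = len(rows)

print(" ".join(names))  # -> name name2
print(" ".join(mails))  # -> username username2
print(length)           # -> 2
```

Note this assumes no embedded spaces in the values; word-splitting into a bash array breaks otherwise.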

Related

Formatting a rpm query output with a separator

I am trying to get a list of all packages that are installed on my system. For this I call 'rpm -qai' from within a Python script, where further transformations on the output take place.
The problem I ran into is that the output of the above query does not separate the different packages. It looks something like this:
$ rpm -qai
Name : PackageName
Version : 1.0
...
LastEntry: Something
Name : NextPackageName
Version : 1.1
...
What I want is something along the line of
Name : PackageName
Version : 1.0
...
LastEntry: Something
//empty line or some other kind of separator
Name : NextPackageName
Version : 1.1
...
My script reads everything line by line and saves the lines in a dictionary. My workaround as of now checks whether the current line starts with 'Name' and, if so, appends the dictionary to a list and clears the dictionary; this step is skipped for the very first line.
This solution is pretty ugly. Unfortunately, a fixed number of lines does not work as not all packages provide the same amount of information.
I also thought about running 'rpm -qai' first, retrieving a list of all package names from this, then iterating over the list while calling 'rpm -qi current_item'. Then one could grab the output from each single query. But since this requires two runs, I deem it unnecessary extra work.
So, does RPM (or some other tool) provide a feature which would allow the desired output?
There are Python bindings for "proper" RPMDB interfacing, instead of parsing rpm output. Think of it as git's porcelain vs. plumbing. In fact, yum is all Python (last time I checked). I think that will be better for you in the long run.
This documentation could be a good start.
You can use rpm's --qf|--queryformat flag with a format string to set the output format.
For example, rpm -qa --qf "%{NAME} %{VERSION}\n" prints just the fields you are interested in for every package, separated however you like.
Or, for your case, you can use something like rpm -qai --qf "\n####\n". You will still get all fields for every package, but with the separator you set between them. Note that the Description field may contain multiline text, so using \n alone as a separator may go wrong.
You can read about this in more detail via man rpm.
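To illustrate the splitting step on the Python side, here is a small sketch; the "####" separator and the sample text are assumptions modeled on the answer's example, not captured rpm output:

```python
# Split `rpm -qai --qf "...####\n"` style output into per-package records.
sample = """Name: PackageName
Version: 1.0
LastEntry: Something
####
Name: NextPackageName
Version: 1.1
####
"""

def split_records(text, sep="####"):
    # Discard empty chunks (e.g. after the trailing separator)
    return [chunk.strip() for chunk in text.split(sep) if chunk.strip()]

records = split_records(sample)
print(len(records))  # -> 2
```

Each record can then be fed to the existing line-by-line dictionary logic without the "does this line start with Name" workaround.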

Intermittent syntax error in a bash script involving echo, a python script, grep, sed, bc, and date

I have written a python script to take magnetic field measurements from a Raspberry Pi Astro "Sense Hat" sensor. It is called "mag-AstroPi.py":
#!/usr/bin/python
from sense_hat import SenseHat
sense = SenseHat()
raw = sense.get_compass_raw()
#print("x: {x}, y: {y}, z: {z}".format(**raw))
#alternatives
print(sense.compass_raw)
This is the script provided by element14, the manufacturer of the Sense Hat.
The script outputs magnetic field data in three axes (X, Y, and Z) in microteslas, along with a bunch of extra characters:
pi@raspberrypi ~ $ python mag-AstroPi.py
{'y': 13.895279884338379, 'x': -1.1642401218414307, 'z': -0.4132799804210663}
I need to remove the extra characters, multiply the values by 1,000 in order to convert them into nanoteslas (standard SI unit for my particular application), and then log the multiplied value alongside the date and time into a file. This needs to happen every two seconds.
I want there to be three separate log files - one for the X axis, one for the Y axis, and one for the Z axis. However, for now, I am just working with the Y-axis data. Once I get the Y-axis data logging working, I can then duplicate and alter for the two other axes.
So I wrote a bash script, AstroPiMagLogger.sh, which runs at boot via a cron job:
#!/bin/bash
while true
do
echo $(python mag-AstroPi.py -n | grep "y" | cut -d " " -f2 | cut -c 1-18 | sed 's/$/*1000/' | bc; date +"%Y,%m,%d,%T,%Z") >> rawysecnT.txt
sleep 2
done
This should extract the Y-axis value only, multiply it by 1,000, and then save it alongside the current date, time, and time zone into a new text file, rawysecnT.txt.
It works, sorta... here are the contents of rawysecnT.txt:
13703.761100769043000 2015,09,14,08:56:41,UTC
13703.761100769043000 2015,09,14,08:56:44,UTC
13613.041877746582000 2015,09,14,08:56:46,UTC
13794.480323791504000 2015,09,14,08:56:49,UTC
13804.560661315918000 2015,09,14,08:56:52,UTC
13875.120162963867000 2015,09,14,08:56:55,UTC
13633.201599121094000 2015,09,14,08:56:58,UTC
2015,09,14,08:57:00,UTC
2015,09,14,08:57:03,UTC
13744.080543518066000 2015,09,14,08:57:06,UTC
14016.241073608398000 2015,09,14,08:57:09,UTC
As you can see, it works most of the time. But every now and then, it doesn't log the magnetic field measurement to the file; it only logs the date and time.
Earlier today, I had the logging working perfectly, but that was before I added the code to multiply the magnetic data by 1000 (i.e., earlier today the script was only logging the original magnetic data in microteslas, along with the date/time). I have several hours' worth of data like that without any errors at all, so it's apparent I've stuffed something up when adding in the multiplication code.
I decided to run the following directly in the command line (rather than through the script), in order to debug.
echo $(python mag-AstroPi.py -n |grep "y" | cut -d " " -f2 | cut -c 1-18 | sed 's/$/*1000/' | bc; date +"%Y,%m,%d,%T,%Z")
Predictably, this worked about a dozen times, with the following output printed to the terminal, which is exactly how I want it:
14167.440414428711000 2015,09,14,09:07:30,UTC
and then, one last time, it returned the following error:
(standard_in) 1: syntax error
2015,09,14,09:07:59,UTC
Given that the error is intermittent, and I'm fairly new to programming (I've only been at it about a month), I've got no idea what could possibly be the issue.
I would appreciate any thoughts anyone may have as to why this is working most of the time but not all of the time.
The two sample outputs requested in the comments are as follows:
pi@raspberrypi ~ $ python mag-AstroPi.py -n | grep "y" | cut -d " " -f2 | cut -c 1-18 | sed 's/$/*1000/' | bc; date +"%Y,%m,%d,%T,%Z"
14076.720237731934000 2015,09,14,09:53:33,UTC
pi@raspberrypi ~ $ python mag-AstroPi.py -n
{'y': 13.935601234436035, 'x': -1.506960153579712, 'z': 0.24192002415657043}
It looks like what you're trying to do could (and almost certainly should) be done within the Python script itself. get_compass_raw() is returning you a dictionary, so you can extract the y value (and multiply by 1000) directly:
raw = sense.get_compass_raw()
y_component_nT = raw['y'] * 1000
To add your timestamp, I'd use the built-in datetime module:
from datetime import datetime
now = datetime.now()
You can then format the time however you want, using now.strftime(format), where format is a format string built up as shown in the docs.
I'll leave the challenges of writing to a file in python and pausing execution to you - they're already covered in many good answers on this site and elsewhere.
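Putting those two pieces together, a minimal sketch (no Sense Hat needed here; the raw dictionary below is copied from the question's sample output, standing in for get_compass_raw()):

```python
from datetime import datetime

# Stand-in for sense.get_compass_raw(); values copied from the question.
raw = {'y': 13.895279884338379, 'x': -1.1642401218414307, 'z': -0.4132799804210663}

# Convert microteslas to nanoteslas
y_component_nT = raw['y'] * 1000

# Same comma-separated date format the bash script used with `date`
stamp = datetime.now().strftime("%Y,%m,%d,%H:%M:%S")
print(y_component_nT, stamp)
```

This sidesteps the whole grep/cut/sed/bc pipeline, which is where the intermittent failures come from.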
I agree that this is better done in the Python script itself, and if you're unfamiliar with Python, this is a good opportunity to learn some small parts of it. As for your pipeline, I think the key issue is in cut -c 1-18 | sed 's/$/*1000/' | bc. The floating point numbers are not guaranteed to be a specific width or even format since you just printed a dictionary without requesting any formatting, so sometimes this will include the comma (or final brace for the last component), or be in scientific notation such as 2.34e-07. bc does not understand those forms. Also, as the script only prints the one line with all values, grep does nothing.
If I were to use a pipeline like this to extract a value, I would probably use something like sed -e "s/.*'y': \([-Ee.0-9]*\).*/\1/" -e "s/[Ee]\(.*\)/*10^\1/" instead of the cuts (the latter substitution converts e forms to bc compatible expressions). On top of that, bc has some specific rules regarding precision, which mean the exponent handling requires you to set scale or everything becomes zero.
Obviously I can't test this code properly without a Sense Hat sensor, but you should find it helpful. It runs an infinite loop, so you need to press Ctrl-C to kill it.
#!/usr/bin/env python
import time
from sense_hat import SenseHat

sense = SenseHat()
# Un-comment the following line to ensure the compass is on and the gyro & accelerometer are off
#sense.set_imu_config(True, False, False)
while True:
    raw_field = sense.get_compass_raw()
    now = time.strftime("%Y,%m,%d,%X,%Z")
    print(raw_field['y'] * 1000, now)
    time.sleep(2)
If you're using Python 2 you'll need to remove the parentheses in the print line, or add from __future__ import print_function before the other import lines.
It's not hard to adapt this code to write directly to a named file rather than having to pipe its output, but I'll leave that as an exercise for the reader. :) It's also easy to modify it to write to 3 files for x, y, and z.
I suspect that sometimes the SenseHat simply doesn't return the expected magnetic field data, but it's hard to tell from the info you posted what it does return when it does that. It possibly returns a dictionary containing empty data values, or it could return a totally empty dictionary.
So if you still get bad output, let me know what it looks like & any associated error message, and I should be able to show you how to fix it.
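If the sensor really does occasionally return an empty or incomplete dictionary (an assumption about the failure mode, as noted above), a small defensive helper would let the loop skip the bad sample instead of crashing:

```python
def read_y_nT(raw):
    """Return the y component in nanoteslas, or None if the sample is unusable."""
    y = raw.get('y')
    if y is None:
        return None  # caller can skip logging this sample
    return y * 1000

print(read_y_nT({'y': 13.9, 'x': -1.1, 'z': -0.4}))
print(read_y_nT({}))  # -> None
```

In the logging loop you would then only write the line when the helper returns a value.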

Powershell - how to pass csv fields as parameters

I have a problem that I'm trying to solve but can't seem to figure the key part out. We have tons of records that we process every day using a .jar file, and the problem is that we have to go one by one, which is time consuming. I think we can cut a tremendous amount of time if we use a PowerShell script.
The problem is that I don't know how to pass the parameters from a csv to a function in powershell.
My csv looks like this
NAME,ID
-------
John,18
Dave,19
Carmen,20
Eric,21
Tom,22
Lisa,23
Kyle,24
The function is
function CreateUser
& java -jar --create -user $name -id $id -file D:/HR/$name-$id-form.pdf
I imported the csv file using
$dataCSV = "D:\HR\Input\20150303NewUsers.csv"
$data = Import-Csv $dataCSV
So I need something that will go down the file systematically and pass the NAME field from the csv as $name and the ID field as $id, over and over, until it is done. But I can't figure out how to pass those two down in a ForEach-Object loop :(
I'm stuck... I've been fighting this all weekend, but nothing.
Any help or guidance will be greatly appreciated! Or if anyone knows how to do this in Python, that would be cool too! :)
I have written a tool that steps through a table (imported from a csv file) and generates an expansion of a template for each row in the table. One thing I do is to copy each of the values in the row to a powershell variable of the same name as the column. This may help you.
Here is the tool that I wrote:
<# This scriptlet is a table driven template tool.
It's a refinement of an earlier attempt.
It generates an output file from a template and
a driver table. The template file contains plain
text and embedded variables. The driver table has
one column for each variable, and one row for each
expansion to be generated.
2/15/2015
#>
param ($driver, $template, $out);
$OFS = "`r`n"
$list = Import-Csv $driver
[string]$pattern = Get-Content $template
Clear-Content $out -ErrorAction SilentlyContinue
foreach ($item in $list) {
foreach ($key in $item.psobject.properties) {
Set-variable -name $key.name -value $key.value
}
$ExecutionContext.InvokeCommand.ExpandString($pattern) >> $out
}
The part that may interest you is the inner loop, where I do a Set-Variable that matches the column name with the actual value.
Not sure if it's a typo, but your D:\... path needs to be enclosed in quotation marks; you haven't closed it off.
Once $data holds the list of imported values simply do foreach ($item in $data) {do something}
Where $item is any word (variable) you want, it simply refers to each row in the CSV.
So...
$data = Import-Csv "D:\importfile.csv"
foreach( $item in $data )
{
# Do-whatever
}
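Since the asker said Python would be welcome too, here is a sketch of the same loop there. The java invocation is commented out because the .jar name ("tool.jar" here) and its flags are specific to the question's environment; the CSV data is inlined for illustration, where the real script would open the file:

```python
import csv
import io
# import subprocess  # needed once the java call below is enabled

# Inline stand-in for Import-Csv "D:\HR\Input\20150303NewUsers.csv"
sample = "NAME,ID\nJohn,18\nDave,19\nCarmen,20\n"

processed = []
for row in csv.DictReader(io.StringIO(sample)):
    name, uid = row["NAME"], row["ID"]
    # subprocess.run(["java", "-jar", "tool.jar", "--create", "-user", name,
    #                 "-id", uid, "-file",
    #                 "D:/HR/{0}-{1}-form.pdf".format(name, uid)])
    processed.append((name, uid))

print(processed)
```

csv.DictReader plays the role of Import-Csv: each row is a mapping from column header to value, so the NAME/ID lookup mirrors $item.NAME and $item.ID in the PowerShell loop.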

Parsing specific keywords in Select Statements and formatting

I have a sample select statement:
Select D.account_csn, D.account_key, D.industry_id, I.industry_group_nm, I.industry_segment_nm From ecs.DARN_INDUSTRY I JOIN ecs.DARN_ACCOUNT D
ON I.SRC_ID=D.INDUSTRY_ID
WHERE D.ACCOUNT_CSN='5070000240'
I would like to parse the select statement into separate files. The first file is called ecs.DARN_INDUSTRY
and inside the file it should look like this:
industry_group_nm
industry_segment_nm
Similarly another file called ecs.DARN_ACCOUNT and the content looks like this:
account_csn
account_key
industry_id
How do I do this in Bash or Python?
I doubt you will find a truly simple answer (maybe someone can prove otherwise). However, you might find python-sqlparse useful.
Parsing general SQL statements will be complicated, and it is difficult to guess exactly what you are trying to accomplish. However, I think you are trying to extract the tables and their corresponding column references via SQL parsing, in which case look at this question, which asks basically that very thing.
Here is a long but working awk command:
awk 'NR==1{gsub(/^.*\./,"",$5);gsub(/^.*\./,"",$6);gsub(/.$/,"",$5); printf $5"\n"$6"\n" > "DARN_INDUSTRY"; gsub(/^.*\./,"",$2);gsub(/^.*\./,"",$3);gsub(/^.*\./,"",$4);gsub(/.$/,"",$2);gsub(/.$/,"",$3);gsub(/.$/,"",$4); printf $2"\n"$3"\n"$4"\n" > "DARN_ACCOUNT"}' file
Explanation:
gsub(/^.*\./,"",$5) removes all the characters up to and including the last . in column 5 (the .* is greedy).
printf $5"\n"$6"\n" > "DARN_INDUSTRY" redirects the printf output to the file named DARN_INDUSTRY.
gsub(/.$/,"",$4) removes the last character of column 4 (the trailing comma).
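For comparison, a rough Python sketch that does the same split. It is regex-based and tied to this exact query shape (a single JOIN, alias.column references), not a general SQL parser; the final loop prints the per-table column lists where the real script would write one file per table:

```python
import re

sql = ("Select D.account_csn, D.account_key, D.industry_id, "
       "I.industry_group_nm, I.industry_segment_nm "
       "From ecs.DARN_INDUSTRY I JOIN ecs.DARN_ACCOUNT D "
       "ON I.SRC_ID=D.INDUSTRY_ID WHERE D.ACCOUNT_CSN='5070000240'")

# Map each alias to its table, from the FROM/JOIN clause
from_part = re.search(r'\bFrom\b(.*?)\bWhere\b', sql, re.I).group(1)
aliases = {}
for part in re.split(r'\bJOIN\b', from_part, flags=re.I):
    tokens = re.split(r'\bON\b', part, flags=re.I)[0].split()
    if len(tokens) >= 2:
        aliases[tokens[1]] = tokens[0]

# Group the selected columns by the table their alias points at
select_part = re.search(r'\bSelect\b(.*?)\bFrom\b', sql, re.I).group(1)
columns = {}
for ref in select_part.split(','):
    alias, col = ref.strip().split('.')
    columns.setdefault(aliases[alias], []).append(col)

for table, cols in sorted(columns.items()):
    print(table, cols)  # one output file per table in the real script
```

For anything beyond this one query shape, python-sqlparse (mentioned above) is the safer route.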

Sorting a file by a specific field

I have a file that has the following format:
12345 TAB_HERE Name : The Actual Name TAB_HERE 6785
e.g.
1001020 Name : SMITH S ANNALOLA 14570
5701061 Name : MATTHEW SANDY HILL 6440
7001083 Name : TANYA MORRISON MILLER 14406
I want to sort by the last field of numbers.
I'd prefer a simple one line python solution or a linux tool based solution.
I tried using sort -k 3,3n but it did not work.
And I can't seem to write a single line python code that I can run as python -c "code here"
I looked at the following but to no avail:
http://www.unix.com/unix-dummies-questions-answers/18359-how-do-i-specify-tab-field-separator-sort.html
http://www.unix.com/unix-dummies-questions-answers/30450-sort-third-column-n-command.html
http://www.linuxquestions.org/questions/programming-9/unix-sort-on-multiple-fields-598813/
Quick solution:
import sys
print("".join(sorted(sys.stdin.readlines(), key=lambda x: int(x.split()[-1]))))
This solution has some disadvantages. For example, it will not work if you have lines without a number in the last field, or if you want to sort the data by something other than the last field. In that case you must use regular expressions (the re module) and describe the field you want to sort by in the key function.
Python one liner:
cat file | python -c 'import sys; print("".join(sorted(sys.stdin.readlines(), key=lambda x: int(x.split()[-1]))))'
My guess as to why the other Python example won't work as a one-liner is that it uses " both to quote the code on the command line and inside the code for the join()...
The --key parameter for sort counts whitespace-separated fields, and in the sample lines the final number is the 7th field, so
sort -k7n
worked for me. (Note this relies on each name being exactly three words; with a tab separator you could use sort -t$'\t' -k3n instead.)
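As a sanity check, the Python answer's key function applied to the question's sample lines (spaces stand in for the tabs here):

```python
lines = [
    "1001020 Name : SMITH S ANNALOLA 14570",
    "5701061 Name : MATTHEW SANDY HILL 6440",
    "7001083 Name : TANYA MORRISON MILLER 14406",
]

# Sort by the last whitespace-separated field, numerically
ordered = sorted(lines, key=lambda x: int(x.split()[-1]))
for line in ordered:
    print(line)
```

The 6440 line comes out first, then 14406, then 14570, which is the numeric order the question asked for.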
