I've been trying to automate the Cinebench window with Python using pywinauto, as this is the best library I came across. I've made a few projects with it that worked well, but with Cinebench I don't get any control identifiers (except for the title and the usual three top buttons). My main objective is to be able to automatically start benchmarks and read the final score.
I didn't come here to bother you all as soon as I hit an issue, so here's everything I tried:
Switching backend="uia" to backend="win32". Result: the code stopped working.
Waiting for the window to load with time.sleep(). Result: no difference.
Adding timeout=10 to the .connect() call. Result: no difference.
Researching whether Cinebench has an API. Result: of course it doesn't (as far as I could find).
Researching whether another library could do it. Result: I didn't find any.
I really don't want to do this by clicking at fixed coordinates, and even then I wouldn't be able to read the score, so it would be useless.
The code I used:
import os
from pywinauto import Application

app = Application(backend="uia").start(rf"C:/Users/{os.getlogin()}/Desktop/MasterBench/Benchmarks/Cinebench.exe")
app = Application(backend="uia").connect(title=CINEBENCH_WINDOW_NAME, timeout=10)
app.CINEBENCHR23200.print_control_identifiers()
What I got (my Windows locale is Italian: 'Riduci a icona' = Minimize, 'Ripristino' = Restore, 'Chiudi' = Close, 'Sistema' = System):
Control Identifiers:
Dialog - 'CINEBENCH R23.200' (L-8, T-8, R1928, B1088)
['CINEBENCH R23.200', 'CINEBENCH R23.200Dialog', 'Dialog']
child_window(title="CINEBENCH R23.200", control_type="Window")
|
| TitleBar - '' (L16, T-5, R1920, B23)
| ['TitleBar']
| |
| | Menu - 'Sistema' (L0, T0, R22, B22)
| | ['SistemaMenu', 'Sistema', 'Menu', 'Sistema0', 'Sistema1']
| | child_window(title="Sistema", auto_id="MenuBar", control_type="MenuBar")
| | |
| | | MenuItem - 'Sistema' (L0, T0, R22, B22)
| | | ['Sistema2', 'SistemaMenuItem', 'MenuItem']
| | | child_window(title="Sistema", control_type="MenuItem")
| |
| | Button - 'Riduci a icona' (L1779, T8, R1826, B22)
| | ['Button', 'Riduci a iconaButton', 'Riduci a icona', 'Button0', 'Button1']
| | child_window(title="Riduci a icona", control_type="Button")
| |
| | Button - 'Ripristino' (L1826, T8, R1872, B22)
| | ['Button2', 'Ripristino', 'RipristinoButton']
| | child_window(title="Ripristino", control_type="Button")
| |
| | Button - 'Chiudi' (L1872, T8, R1928, B22)
| | ['Button3', 'Chiudi', 'ChiudiButton']
| | child_window(title="Chiudi", control_type="Button")
In Spark, with PySpark, I have a dataframe with duplicates. I want to deduplicate them with multiple rules, such as email and mobile_phone.
This is my code in Python 3:
from pyspark.sql import Row
from pyspark.sql.functions import collect_list
df = sc.parallelize(
    [
        Row(raw_id='1001', first_name='adam', mobile_phone='0644556677', email='adam@gmail.fr'),
        Row(raw_id='2002', first_name='adam', mobile_phone='0644556688', email='adam@gmail.fr'),
        Row(raw_id='3003', first_name='momo', mobile_phone='0644556699', email='momo@gmail.fr'),
        Row(raw_id='4004', first_name='momo', mobile_phone='0644556600', email='mouma@gmail.fr'),
        Row(raw_id='5005', first_name='adam', mobile_phone='0644556688', email='adama@gmail.fr'),
        Row(raw_id='6006', first_name='rida', mobile_phone='0644556688', email='rida@gmail.fr')
    ]
).toDF()
My original dataframe is:
+--------------+----------+------------+------+
|         email|first_name|mobile_phone|raw_id|
+--------------+----------+------------+------+
| adam@gmail.fr|      adam|  0644556677|  1001|
| adam@gmail.fr|      adam|  0644556688|  2002|
| momo@gmail.fr|      momo|  0644556699|  3003|
|mouma@gmail.fr|      momo|  0644556600|  4004|
|adama@gmail.fr|      adam|  0644556688|  5005|
| rida@gmail.fr|      rida|  0644556688|  6006|
+--------------+----------+------------+------+
Then I apply my deduplication rules:
df_mobile = df \
    .groupBy('mobile_phone') \
    .agg(collect_list('raw_id').alias('raws'))

df_email = df \
    .groupBy('email') \
    .agg(collect_list('raw_id').alias('raws'))
This is the result I get:
df_mobile.select('raws').show(10, False)
+------------------+
|raws              |
+------------------+
|[2002, 5005, 6006]|
|[1001]            |
|[4004]            |
|[3003]            |
+------------------+
df_email.select('raws').show(10, False)
+------------+
|raws        |
+------------+
|[3003]      |
|[4004]      |
|[1001, 2002]|
|[5005]      |
|[6006]      |
+------------+
So, the final result I want is to regroup the common elements of the raws column, like this:
+------------------------+
|raws                    |
+------------------------+
|[3003]                  |
|[4004]                  |
|[2002, 5005, 6006, 1001]|
+------------------------+
Do you know how I can do it with PySpark?
Thank you very much!
So it seems, as @pault hints at, that you could model this as a graph where your original dataframe df is the list of vertices, and df_email and df_mobile are lists of connected vertices. Unfortunately GraphX is not available for Python, but GraphFrames is!
GraphFrames has a function called connectedComponents that will return the list of connected raw_ids, or vertices. To use it we must do two things: raw_id must simply be called id, and the edges must be source (src) and destination (dst) pairs, not simply lists of vertices.
from pyspark.sql import SparkSession
from pyspark.sql.functions import *
from pyspark.sql.types import *
from graphframes import GraphFrame
spark = SparkSession \
    .builder \
    .appName("example") \
    .getOrCreate()
spark.sparkContext.setCheckpointDir("checkpoints")
# graphframes requires a checkpoint dir since v0.3.0
# https://graphframes.github.io/user-guide.html#connected-components
spark.sparkContext.setLogLevel("WARN") # make it easier to see our output
vertices = spark.createDataFrame([
    ('1001', 'adam', '0644556677', 'adam@gmail.fr'),
    ('2002', 'adam', '0644556688', 'adam@gmail.fr'),
    ('3003', 'momo', '0644556699', 'momo@gmail.fr'),
    ('4004', 'momo', '0644556600', 'mouma@gmail.fr'),
    ('5005', 'adam', '0644556688', 'adama@gmail.fr'),
    ('6006', 'rida', '0644556688', 'rida@gmail.fr')
]).toDF("id", "first_name", "mobile_phone", "email")
# Pair each group's list of ids with a rotation of itself, so every id in a
# group is connected to its neighbour in a cycle (singletons get a self-loop).
mk_edges = udf(
    lambda a: [{'src': src, 'dst': dst} for (src, dst) in zip(a, a[-1:] + a[:-1])],
    returnType=ArrayType(StructType([
        StructField('src', StringType(), nullable=False),
        StructField('dst', StringType(), nullable=False)])))
def edges_by_group_key(df, group_key):
    return df.groupBy(group_key) \
        .agg(collect_list('id').alias('ids')) \
        .select(mk_edges('ids').alias('edges')) \
        .select(explode('edges').alias('edge')) \
        .select("edge.*")
mobileEdges = edges_by_group_key(vertices, 'mobile_phone')
print('mobile edges')
mobileEdges.show(truncate=False)
# mobile edges
# +----+----+
# |src |dst |
# +----+----+
# |2002|6006|
# |5005|2002|
# |6006|5005|
# |1001|1001|
# |4004|4004|
# |3003|3003|
# +----+----+
emailEdges = edges_by_group_key(vertices, 'email')
print('email edges')
emailEdges.show(truncate=False)
# email edges
# +----+----+
# |src |dst |
# +----+----+
# |3003|3003|
# |4004|4004|
# |1001|2002|
# |2002|1001|
# |5005|5005|
# |6006|6006|
# +----+----+
g = GraphFrame(vertices, mobileEdges.union(emailEdges))
result = g.connectedComponents()
print('connectedComponents')
result.select("id", "component") \
    .groupBy("component") \
    .agg(collect_list('id').alias('ids')) \
    .select('ids').show(truncate=False)
# connectedComponents
# +------------------------+
# |ids                     |
# +------------------------+
# |[1001, 2002, 5005, 6006]|
# |[4004]                  |
# |[3003]                  |
# +------------------------+
There might be a cleverer way to do the union between the mobile and email dataframes, maybe deduplicating the edges with distinct, but you get the idea; a sketch of that follows.
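For instance, a minimal sketch of that cleanup (reusing the vertices, mobileEdges and emailEdges dataframes built above; dropping duplicate pairs and self-loops is safe here because connectedComponents only needs each link once, and isolated vertices still get their own component):

# Deduplicate the combined edge list and drop self-loops like (1001, 1001).
edges = mobileEdges.union(emailEdges).distinct()
edges = edges.filter(edges.src != edges.dst)

g = GraphFrame(vertices, edges)
result = g.connectedComponents()  # connectivity, and so the components, are unchanged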
I would like to use WPScan to scan my WordPress website for new plugins.
I want a Python script that shows me, every day:
a list of vulnerable plugins
a list of new plugins.
Writing a parser that gives me only the vulnerable plugins from the output is not complicated. But how can I write a parser (and in what way) so that I get only a list of new plugins?
Example (source of the example, which I modified a little: http://www.blackmoreops.com/2013/10/14/wpscan-and-quick-wordpress-security/).
First day:
___________________________________________________
__ _______ _____
\ \ / / __ \ / ____|
\ \ /\ / /| |__) | (___ ___ __ _ _ __
\ \/ \/ / | ___/ \___ \ / __|/ _` | '_ \
\ /\ / | | ____) | (__| (_| | | | |
\/ \/ |_| |_____/ \___|\__,_|_| |_| v2.1rNA
WordPress Security Scanner by the WPScan Team
Sponsored by the RandomStorm Open Source Initiative
_____________________________________________________
| URL: http://www.blackmoreops.com/
| Started on Sun Oct 13 13:39:25 2013
[!] The WordPress 'http://www.blackmoreops.com/readme.html' file exists
[!] Full Path Disclosure (FPD) in 'http://www.blackmoreops.com/wp-includes/rss-functions.php'
[+] XML-RPC Interface available under http://www.blackmoreops.com/xmlrpc.php
[+] WordPress version 3.6.1 identified from meta generator
[+] The WordPress theme in use is twentyten v1.6
| Name: twentyten v1.6
| Location: http://www.blackmoreops.com/wp-content/themes/twentyten/
[+] Enumerating plugins from passive detection ...
2 plugins found :
| Name: add-to-any v1.2.5
| Location: http://www.blackmoreops.com/wp-content/plugins/add-to-any/
| Directory listing enabled: Yes
| Readme: http://www.blackmoreops.com/wp-content/plugins/add-to-any/README.txt
| Name: captcha v3.8.4
| Location: http://www.blackmoreops.com/wp-content/plugins/captcha/
| Directory listing enabled: Yes
| Readme: http://www.blackmoreops.com/wp-content/plugins/captcha/readme.txt
[+] Finished at Sun Oct 13 13:39:51 2013
[+] Elapsed time: 00:00:26
On the next day:
___________________________________________________
__ _______ _____
\ \ / / __ \ / ____|
\ \ /\ / /| |__) | (___ ___ __ _ _ __
\ \/ \/ / | ___/ \___ \ / __|/ _` | '_ \
\ /\ / | | ____) | (__| (_| | | | |
\/ \/ |_| |_____/ \___|\__,_|_| |_| v2.1rNA
WordPress Security Scanner by the WPScan Team
Sponsored by the RandomStorm Open Source Initiative
_____________________________________________________
| URL: http://www.blackmoreops.com/
| Started on Sun Oct 13 13:39:25 2013
[!] The WordPress 'http://www.blackmoreops.com/readme.html' file exists
[!] Full Path Disclosure (FPD) in 'http://www.blackmoreops.com/wp-includes/rss-functions.php'
[+] XML-RPC Interface available under http://www.blackmoreops.com/xmlrpc.php
[+] WordPress version 3.6.1 identified from meta generator
[+] The WordPress theme in use is twentyten v1.6
| Name: twentyten v1.6
| Location: http://www.blackmoreops.com/wp-content/themes/twentyten/
[+] Enumerating plugins from passive detection ...
3 plugins found :
| Name: add-to-any v1.2.5
| Location: http://www.blackmoreops.com/wp-content/plugins/add-to-any/
| Directory listing enabled: Yes
| Readme: http://www.blackmoreops.com/wp-content/plugins/add-to-any/README.txt
| Name: captcha v3.8.4
| Location: http://www.blackmoreops.com/wp-content/plugins/captcha/
| Directory listing enabled: Yes
| Readme: http://www.blackmoreops.com/wp-content/plugins/captcha/readme.txt
| Name: google-analyticator v6.4.5
| Location: http://www.blackmoreops.com/wp-content/plugins/google-analyticator/
| Directory listing enabled: Yes
| Readme: http://www.blackmoreops.com/wp-content/plugins/google-analyticator/readme.txt
[+] Finished at Sun Oct 14 13:39:51 2013
[+] Elapsed time: 00:00:26
Should I always split the string after a [+] and compare them all? (I don't know how the output list is sorted; I think alphabetically, so I can't just take the last plugins and say those are my new plugins.) Is that efficient? Making the problem simple:
first string:
Hallo
Pet
Me
second string:
Hallo
World
Pet
Me
How do I find out what the new word is in an efficient way?
First you split each string into a list, then print every word from the second list that is not in the first list.
str1 = "Hallo Pet Me"
str2 = "Hallo World Pet Me"
split1 = str1.split()
split2 = str2.split()
print([word for word in split2 if word not in split1])  # ['World']
If you want to ignore differences in lower/uppercase:
str1 = "Hallo Pet Me"
str2 = "Hallo World Pet Me"
split1 = str1.lower().split()
split2 = str2.lower().split()
print([word for word in split2 if word not in split1])  # ['world']
Solving your simplified example:
str1 = "Hallo Pet Me"
str2 = "Hallo World Pet Me"
set1 = set(str1.split())
set2 = set(str2.split())
print(set2 - set1)  # {'World'}
You have two sets of strings, and you want the strings that are in the second set but not in the first.
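Applied back to the WPScan reports, a minimal sketch along these lines (the file names are hypothetical; it assumes each day's output has been saved to a text file) could collect the | Name: lines and diff them as sets:

def plugin_names(path):
    # Grab everything after 'Name:' on lines like '| Name: captcha v3.8.4'.
    # Note: the theme's '| Name:' line matches too; filter it out if needed.
    with open(path) as f:
        return set(line.split('Name:', 1)[1].strip()
                   for line in f if line.strip().startswith('| Name:'))

day1 = plugin_names('scan_day1.txt')  # hypothetical file names
day2 = plugin_names('scan_day2.txt')
print(day2 - day1)  # e.g. {'google-analyticator v6.4.5'}

Since these are sets, it doesn't matter how WPScan sorts its output.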
My question is similar to this unanswered question: SQLAlchemy commits makes float to be rounded
I have a text file of data that looks like this:
#file camera date mjd focus error
ibcy02blq UVIS1 08/03/09 55046.196630 0.57857 0.55440
ibcy02bnq UVIS1 08/03/09 55046.198330 -0.15000 0.42111
ibcy03j8q UVIS1 08/11/09 55054.041650 -0.37143 0.40802
ibcy03jaq UVIS1 08/11/09 55054.043350 -0.91857 0.51859
ibcy04m4q UVIS1 08/18/09 55061.154900 -0.32333 0.52327
ibcy04m6q UVIS1 08/18/09 55061.156600 -0.24867 0.66651
ibcy05b7q UVIS1 09/05/09 55079.912670 0.64900 0.58423
ibcy05b9q UVIS1 09/05/09 55079.914370 0.82000 0.50202
ibcy06meq UVIS1 10/02/09 55106.909840 -0.09667 0.24016
But once I read it into my MySQL database it looks like this:
+------+-----------+--------+------------+---------+----------+
| id   | filename  | camera | date       | mjd     | focus    |
+------+-----------+--------+------------+---------+----------+
| 1026 | ibcy02blq | UVIS1  | 2009-08-03 | 55046.2 |  0.57857 |
| 1027 | ibcy02bnq | UVIS1  | 2009-08-03 | 55046.2 |    -0.15 |
| 1028 | ibcy03j8q | UVIS1  | 2009-08-11 |   55054 | -0.37143 |
| 1029 | ibcy03jaq | UVIS1  | 2009-08-11 |   55054 | -0.91857 |
| 1030 | ibcy04m4q | UVIS1  | 2009-08-18 | 55061.2 | -0.32333 |
| 1031 | ibcy04m6q | UVIS1  | 2009-08-18 | 55061.2 | -0.24867 |
| 1032 | ibcy05b7q | UVIS1  | 2009-09-05 | 55079.9 |    0.649 |
| 1033 | ibcy05b9q | UVIS1  | 2009-09-05 | 55079.9 |     0.82 |
| 1034 | ibcy06meq | UVIS1  | 2009-10-02 | 55106.9 | -0.09667 |
| 1035 | ibcy06mgq | UVIS1  | 2009-10-02 | 55106.9 |  -0.1425 |
+------+-----------+--------+------------+---------+----------+
The mjd column is being truncated and I'm not sure why. I understand that there are floating-point precision errors for something like 1/3, but this looks more like some type of rounding is being applied.
Here is the code I use to ingest the data into the database:
import glob
import logging
from datetime import datetime

def make_focus_table_main():
    """The main controller for the make_focus_table
    module."""
    logging.info('Process Starting')
    filename_list = glob.glob('/grp/hst/OTA/focus/source/FocusModel/UVIS*FocusHistory.txt')
    logging.info('Found {} files'.format(len(filename_list)))
    for filename in filename_list:
        logging.info('Reading data from {}'.format(filename))
        output_list = []
        with open(filename, 'r') as f:
            data = f.readlines()
        for line in data[1:]:
            line = line.split()
            output_dict = {}
            output_dict['filename'] = line[0]
            output_dict['camera'] = line[1]
            output_dict['date'] = datetime.strptime(line[2], '%m/%d/%y')
            output_dict['mjd'] = float(line[3])
            output_dict['focus'] = float(line[4])
            output_list.append(output_dict)
        logging.info('Beginning bulk insert of records.')
        engine.execute(Focus.__table__.insert(), output_list)
        logging.info('Database insert complete.')
    logging.info('Process Complete')
I've used pdb to check that the values are not being truncated prior to being passed to the database (i.e. Python/SQLAlchemy is not performing the rounding). I can verify this in the INSERT command SQLAlchemy issues:
2014-04-11 13:08:20,522 INFO sqlalchemy.engine.base.Engine INSERT INTO focus (filename, camera, date, mjd, focus) VALUES (%s, %s, %s, %s, %s)
2014-04-11 13:08:20,602 INFO sqlalchemy.engine.base.Engine (
('ibcy02blq', 'UVIS2', datetime.datetime(2009, 8, 3, 0, 0), 55046.19663, 1.05778),
('ibcy02bnq', 'UVIS2', datetime.datetime(2009, 8, 3, 0, 0), 55046.19833, 1.32333),
('ibcy03j8q', 'UVIS2', datetime.datetime(2009, 8, 11, 0, 0), 55054.04165, 1.57333),
('ibcy03jaq', 'UVIS2', datetime.datetime(2009, 8, 11, 0, 0), 55054.04335, 0.54333),
('ibcy04m4q', 'UVIS2', datetime.datetime(2009, 8, 18, 0, 0), 55061.1549, -1.152),
('ibcy04m6q', 'UVIS2', datetime.datetime(2009, 8, 18, 0, 0), 55061.1566, -1.20733),
('ibcy05b7q', 'UVIS2', datetime.datetime(2009, 9, 5, 0, 0), 55079.91267, 2.35905),
('ibcy05b9q', 'UVIS2', datetime.datetime(2009, 9, 5, 0, 0), 55079.91437, 1.84524)
... displaying 10 of 1025 total bound parameter sets ...
('ichl05qwq', 'UVIS2', datetime.datetime(2014, 4, 2, 0, 0), 56749.05103, -2.98),
('ichl05qxq', 'UVIS2', datetime.datetime(2014, 4, 2, 0, 0), 56749.05177, -3.07))
2014-04-11 13:08:20,959 INFO sqlalchemy.engine.base.Engine COMMIT
Here is how the column is defined in my SQLAlchemy classes:
class Focus(Base):
    """ORM for the table storing the focus measurement information."""
    __tablename__ = 'focus'
    id = Column(Integer(), primary_key=True)
    filename = Column(String(17), index=True, nullable=False)
    camera = Column(String(5), index=True, nullable=False)
    date = Column(Date(), index=True, nullable=False)
    mjd = Column(Float(precision=20, scale=10), index=True, nullable=False)
    focus = Column(Float(15), nullable=False)
    __table_args__ = (UniqueConstraint('filename', 'camera',
                                       name='focus_uniqueness_constraint'),)
Here is the SQL that's logged from SQLAlchemy with echo=True when I create the table:
CREATE TABLE focus (
        id INTEGER NOT NULL AUTO_INCREMENT,
        filename VARCHAR(17) NOT NULL,
        camera VARCHAR(5) NOT NULL,
        date DATE NOT NULL,
        mjd FLOAT(20) NOT NULL,
        focus FLOAT(15) NOT NULL,
        PRIMARY KEY (id),
        CONSTRAINT focus_uniqueness_constraint UNIQUE (filename, camera)
)
So far, so good. But here's what I see in MySQL with SHOW CREATE TABLE focus;:
CREATE TABLE `focus` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `filename` varchar(17) NOT NULL,
  `camera` varchar(5) NOT NULL,
  `date` date NOT NULL,
  `mjd` float NOT NULL,
  `focus` float NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `focus_uniqueness_constraint` (`filename`,`camera`),
  KEY `ix_focus_filename` (`filename`),
  KEY `ix_focus_mjd` (`mjd`),
  KEY `ix_focus_date` (`date`),
  KEY `ix_focus_camera` (`camera`)
) ENGINE=InnoDB AUTO_INCREMENT=1193 DEFAULT CHARSET=latin1
Somehow the FLOAT definition changed! Is this some kind of MySQL configuration setting? I'm just running this on localhost right now, but if this is a configuration setting, I'm concerned about the portability of this code to a production server if I continue to use floats. I could just switch to a DECIMAL column type, as I've seen in other SO questions, since I need exact values, but I would like to understand what's going on here.
Update: Just to expand a little on two-bit-alchemist's answer, here is how it changes my query:
> SELECT ROUND(mjd,10) FROM focus LIMIT 10;
+------------------+
| ROUND(mjd,10) |
+------------------+
| 55046.1953125000 |
| 55046.1992187500 |
| 55054.0429687500 |
| 55054.0429687500 |
| 55061.1562500000 |
| 55061.1562500000 |
| 55079.9140625000 |
| 55079.9140625000 |
| 55106.9101562500 |
| 55106.9101562500 |
+------------------+
10 rows in set (0.00 sec)
Notice that all the decimal precision is still there. I had no idea SELECT was rounding values, but I guess this makes sense if you think about how a floating-point representation works: the full bytes allocated for that number are always used, and how many decimals you display is arbitrary, up to the full length of the float: https://stackoverflow.com/a/20482699/1216837
Specifying the precision only seems to affect whether it's stored as a double or a single: http://dev.mysql.com/doc/refman/5.0/en/floating-point-types.html
But what's also interesting/annoying is that I have to worry about the same thing when issuing a SELECT from the SQLAlchemy layer:
query = psf_session.query(Focus).first()
print(query.filename, query.mjd, query.focus)
This gives me ibcy02blq 55046.2 1.05778, so the values are still being rounded. Again, this makes sense, because SQLAlchemy is just issuing SQL commands anyway. All in all, this is motivating me to switch to a DECIMAL column type: http://dev.mysql.com/doc/refman/5.0/en/fixed-point-types.html
It looks like all your values were printed with exactly six significant digits (except where a trailing .0 was dropped in a couple of places). While I can't find any documentation on this, I suspect it is simply MySQL's default behavior for displaying float values in the context of a SELECT statement.
Based on the CREATE TABLE statement you provided, the internal representation is correct, so you need only add something like ROUND(mjd, 3) to your statement, with the first argument being the field to round and the second being the number of digits to round to (which can be more than what is displayed now).
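And if exact values matter more than display, a minimal sketch of the DECIMAL switch the questioner mentions (using SQLAlchemy's generic Numeric type, which renders as DECIMAL on MySQL; the precision and scale here are assumptions sized for MJD values):

from sqlalchemy import Column, Numeric

# DECIMAL(16, 10) stores the value exactly, so 55046.196630 round-trips
# without binary floating-point rounding on either INSERT or SELECT.
mjd = Column(Numeric(precision=16, scale=10), index=True, nullable=False)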
I'm new to Python. I'm trying to write code that will print out this ASCII-art traffic light; here is the actual ASCII:
##
_[]_
[____]
.----' '----.
.===| .==. |===.
\ | /####\ | /
/ | \####/ | \
'===| `""` |==='
.===| .==. |===.
\ | /::::\ | /
/ | \::::/ | \
'===| `""` |==='
.===| .==. |===.
\ | /&&&&\ | /
/ | \&&&&/ | \
'===| `""` |==='
jgs '--.______.--'
And the code I'm trying to use is this:
print ("##"),
print (" _[]_"),
print (".----' '----."),
print (" .===| .==. |===."),
print (" \ | /####\ | /"),
print (" / | \####/ | \\"),
print ("'===| `""` |==='"),
print (" .===| .==. |===."),
print ("\ | /::::\ | /"),
print (" / | \::::/ | \"),
print ("'===| `""` |==='"),
print (".===| .==. |===."),
print (" \ | /&&&&\ | /"),
print (" / | \&&&&/ | \"),
print (" '===| `""` |==='"),
print ("'--.______.--'")
You need to escape the \ characters by doubling them:
print (" / | \::::/ | \"),
should be:
print(" / | \\::::/ | \\")
You want to get rid of all the commas too.
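For instance, two of the lines from the question, fixed up that way (backslashes doubled, trailing commas removed):

print(" \\ | /####\\ | /")
print(" / | \\####/ | \\")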
Note that you can create a multiline string using triple quotes; make it a raw string (using r'') and you don't have to escape anything either:
print(r''' _[]_
[____]
.----' '----.
.===| .==. |===.
\ | /####\ | /
/ | \####/ | \
'===| `""` |==='
.===| .==. |===.
\ | /::::\ | /
/ | \::::/ | \
'===| `""` |==='
.===| .==. |===.
\ | /&&&&\ | /
/ | \&&&&/ | \
'===| `""` |==='
jgs '--.______.--'
''')