I'm trying to download some tweets with snscrape. After installing, I can run a command like the following to download a few tweets:
snscrape --jsonl --max-results 4 twitter-search "#SherlockHolmes since:2015-01-01 until:2015-01-15" > sherlock_tweets.json
Now I want to run this command from within a python script. As I understand it, the way to do this is using the subprocess.run method. I use the following code to run the command from python:
import subprocess
# Running this in a terminal works
cmd = '''snscrape --jsonl --max-results 4 twitter-search "#SherlockHolmes since:2015-01-01 until:2015-01-15" > sherlock_tweets.json'''
arglist = cmd.split(" ")
process = subprocess.run(arglist, shell=True)
Running this, however, gives the following error.
usage: snscrape [-h] [--version] [-v] [--dump-locals] [--retry N] [-n N] [-f FORMAT | --jsonl] [--with-entity] [--since DATETIME] [--progress]
{telegram-channel,weibo-user,vkontakte-user,instagram-user,instagram-hashtag,instagram-location,twitter-thread,twitter-search,reddit-user,reddit-subreddit,reddit-search,facebook-group,twitter-user,twitter-hashtag,twitter-list-posts,facebook-user,facebook-community,twitter-profile}
...
snscrape: error: the following arguments are required: scraper
Why is the behaviour not the same in these two cases? How do I accomplish running the command from a python script, getting the exact same behaviour as I would entering it in a terminal?
I don't know whether you found a solution, but the following code, which uses snscrape as a Python module instead of calling the command line, worked for me:
import pandas as pd
import snscrape.modules.twitter as sntwitter
# date_beg, date_end and twitter_account must be defined before this point
query = f'since:{date_beg} until:{date_end} from:{twitter_account}'
# Collect one dict per tweet, then build the DataFrame in a single step
rows = []
for tweet in sntwitter.TwitterSearchScraper(query).get_items():
    rows.append({
        'Username': tweet.user.username,
        'Date': tweet.date,
        'Likes': tweet.likeCount,
        'Content': tweet.content})
tweet_collection = pd.DataFrame(rows, columns=['Username', 'Date', 'Likes', 'Content'])
tweet_collection.to_csv('Path/file.csv')
You can find more detail in the code on GitHub:
Twitter snscrape arguments
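As for why the terminal and subprocess.run behave differently, here is a minimal sketch, assuming a POSIX system. Splitting on spaces breaks the quoted search string into three pieces and leaves the quote characters in place. In addition, when a list is combined with shell=True, only the first item is used as the command string, so the shell runs snscrape with no arguments, which matches the "arguments are required: scraper" error. The names naive and tokens are just illustrative, and the redirection is left out because > is shell syntax, not an argument.

```python
import shlex

cmd = 'snscrape --jsonl --max-results 4 twitter-search "#SherlockHolmes since:2015-01-01 until:2015-01-15"'

# Naive splitting: the quoted query is cut into three items, quotes included.
naive = cmd.split(" ")

# shlex.split follows shell quoting rules: the query stays a single item
# and the surrounding quotes are removed.
tokens = shlex.split(cmd)

print(naive)
print(tokens)
```

A list like tokens can then be passed to subprocess.run without shell=True. Alternatively, the whole command can be passed as one string with shell=True, in which case the shell also handles the > redirection.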
Related
How to call the following command using subprocess in python " python -m xport C:/abc.xpt > C:/abc.csv "?
The command works properly in command prompt.
But gives an error when tried to execute via subprocess in python.
subprocess.call(["python", "-m", "xport", "C:/abc.xpt" , ">" , "C:/abc.csv"])
The above command gives an error saying,
usage: xport.py [-h] [input]
xport.py: error: unrecognized arguments: C:/abc.csv
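> C:/abc.csv is a shell redirection of the output to a file and not part of the command. Without a shell, subprocess passes ">" and the file name to xport as ordinary arguments, which is why it complains about unrecognized arguments.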
But if you are inside a python script already, why don't you call the function or module directly? There is no need to use a subprocess, but if you want to use it you need to catch the output and store it somewhere (in a variable or file)
>>> proc = subprocess.Popen('ls', stdout=subprocess.PIPE)
>>> output = proc.stdout.read()
>>> print(output.decode())
bar
baz
foo
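If the goal is to reproduce the redirection itself, one option is to open the file in Python and hand it to the child process as its stdout. This is only a sketch: the child command below is a stand-in (a small Python one-liner) so that it runs anywhere, and out_path is a hypothetical location in a temporary directory.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for "python -m xport C:/abc.xpt" (assumption for illustration):
# a child process that simply prints two CSV-like lines.
command = [sys.executable, "-c", "print('a,b'); print('1,2')"]

# Hypothetical output path inside a fresh temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "abc.csv")

# Equivalent of "command > out_path": the open file becomes the child's stdout.
with open(out_path, "w") as out_file:
    subprocess.run(command, stdout=out_file, check=True)

# Read the file back to confirm what was written.
with open(out_path) as result_file:
    contents = result_file.read()
```

With check=True, a non-zero exit status raises subprocess.CalledProcessError instead of silently leaving a partial file.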
I tried to pass multiple arguments to my python script (opsgit) using the click module like this:
import click
@click.command()
@click.argument('arguments', nargs=-1)
def cli(arguments):
    """CLI for git"""
    cmd = create_command(arguments)
    _execute_command(cmd)
When I execute this command line:
$ opsgit git checkout -b pvt_test
I get this error:
Usage: opsgit git [OPTIONS] [ARGUMENTS]...
Try "opsgit git --help" for help.
Error: no such option: -b
Can anyone let me know how to solve this one?
You are missing the ignore_unknown_options flag. Here is your example with the flag added. Check out the docs for more info on how to use nargs.
import click
@click.command(context_settings=dict(
    ignore_unknown_options=True,
))
@click.argument('arguments', nargs=-1)
def cli(arguments):
    """CLI for git"""
    # create_command and _execute_command are the asker's own helpers
    cmd = create_command(arguments)
    _execute_command(cmd)
I am seeing a peculiar behavior with osmfilter (https://wiki.openstreetmap.org/wiki/Osmfilter) which can be installed with the following command:
$ sudo apt-get install osmctools
Let's assume I exported map.osm for a region from https://www.openstreetmap.org and I want to filter only highways from that file. The command I can use is:
$ osmfilter map.osm --keep='highway' > highways_terminal.osm
The file highways_terminal.osm contains info about the highways. I then tried to use Python to do the same with subprocess.run():
import subprocess
cmd = ["osmfilter", "map.osm", "--keep='highway'"]
resp = subprocess.run(cmd, capture_output=True, text=True)
with open("highways_subprocess.osm", "w") as fp:
fp.write(resp.stdout)
But, highways_subprocess.osm contains no information other than "bounds".
Am I handling the quotes incorrectly?
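I ran into the same problem (10 months later) and fixed it by running the command through the shell and letting osmfilter write the output file itself:
import subprocess
cmd = "osmfilter map.osm --keep='highway' -o=highways_subprocess.osm"
subprocess.check_call(cmd, shell=True)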
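A likely explanation for the difference, offered here as a sketch rather than a confirmed diagnosis of osmfilter: in a terminal the shell strips the single quotes before the program sees the argument, whereas a list passed to subprocess.run is delivered unchanged, quotes included. The child process below is a stand-in that echoes its first argument.

```python
import shlex
import subprocess
import sys

# Stand-in child (assumption): prints the first argument exactly as received.
echo_first_arg = [sys.executable, "-c", "import sys; print(sys.argv[1])"]

# Without a shell, the quotes in the list item reach the program literally.
literal = subprocess.run(
    echo_first_arg + ["--keep='highway'"],
    capture_output=True, text=True,
).stdout.strip()

# shlex.split mimics what a POSIX shell does: the quotes are removed.
parsed = shlex.split("osmfilter map.osm --keep='highway'")

print(literal)
print(parsed)
```

Under that assumption, passing "--keep=highway" (no inner quotes) in the original list would be the shell-free equivalent of the terminal command.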
I have an .R file saved locally at the following path:
Rfilepath = "C:\\python\\buyback_parse_guide.r"
The command for RScript.exe is:
RScriptCmd = "C:\\Program Files\\R\\R-2.15.2\\bin\\Rscript.exe --vanilla"
I tried running:
subprocess.call([RScriptCmd,Rfilepath],shell=True)
But it returns 1 -- and the .R script did not run successfully. What am I doing wrong? I'm new to Python so this is probably a simple syntax error... I also tried these, but they all return 1:
subprocess.call('"C:\Program Files\R\R-2.15.2\bin\Rscript.exe"',shell=True)
subprocess.call('"C:\\Program Files\\R\\R-2.15.2\\bin\\Rscript.exe"',shell=True)
subprocess.call('C:\Program Files\R\R-2.15.2\bin\Rscript.exe',shell=True)
subprocess.call('C:\\Program Files\\R\\R-2.15.2\\bin\\Rscript.exe',shell=True)
Thanks!
The RScriptCmd needs to be just the executable, no command line arguments. So:
RScriptCmd = "\"C:\\Program Files\\R\\R-2.15.2\\bin\\Rscript.exe\""
Then the Rfilepath can actually be all of the arguments - and renamed:
RArguments = "--vanilla \"C:\\python\\buyback_parse_guide.r\""
It looks like you have a similar problem to mine. I had to reinstall RScript to a path which has no spaces.
See: Running Rscript via Python using os.system() or subprocess()
This is how I worked out the communication between Python and Rscript:
part in Python:
import subprocess
# Replace the placeholder paths and "Arg1" with your own values
p = subprocess.Popen(["path/to/RScript.exe", "path/to/Script.R", "Arg1"], stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=subprocess.PIPE)
out = p.communicate()  # returns a (stdout, stderr) tuple of bytes
outValue = out[0]
outValue contains the output-Value after executing the Script.R
part in the R-Script:
args <- commandArgs(TRUE)
argument1 <- as.character(args[1])
...
write(output, stdout())
output is the variable to send to Python
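On the spaces-in-path issue: when the command is passed as a list and no shell is involved, each item is handed to the program as-is, so a path containing spaces should not need extra quoting or a reinstall. The sketch below uses the Python interpreter as a stand-in for Rscript.exe and a temporary directory whose name contains spaces; both are assumptions for illustration.

```python
import os
import subprocess
import sys
import tempfile

# Directory and file names with spaces, standing in for "C:\Program Files\...".
workdir = tempfile.mkdtemp(prefix="dir with spaces ")
script = os.path.join(workdir, "echo arg.py")

# A tiny stand-in script that reports the first argument it receives.
with open(script, "w") as fp:
    fp.write("import sys\nprint('got', sys.argv[1])\n")

# One list item per argument: executable, script path, then script arguments.
result = subprocess.run(
    [sys.executable, script, "--vanilla"],
    capture_output=True, text=True,
)
print(result.stdout)
```

The same layout would apply to the R case: the executable path as the first item, then "--vanilla", then the script path, each as a separate list item.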
I read this somewhere a while ago but can't seem to find it. I am trying to find a command that will execute commands in the terminal and then output the result.
For example: the script will be:
command 'ls -l'
It will output the result of running that command in the terminal.
There are several ways to do this:
A simple way is using the os module:
import os
os.system("ls -l")
More complex things can be achieved with the subprocess module:
for example:
import subprocess
test = subprocess.Popen(["ping","-W","2","-c", "1", "192.168.1.70"], stdout=subprocess.PIPE)
output = test.communicate()[0]
I prefer using the subprocess module:
from subprocess import call
call(["ls", "-l"])
The reason is that it makes passing a variable to the command very easy. For example:
abc = "a.c"
call(["vim", abc])
import os
os.system("echo 'hello world'")
This should work. I do not know how to print the output into the python Shell.
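To get the output back into Python rather than only seeing it in the terminal, subprocess.run can capture it. A minimal sketch, assuming Python 3.7 or later (for capture_output); a Python child process stands in for echo so the example runs on any platform.

```python
import subprocess
import sys

# Stand-in for: echo 'hello world'
command = [sys.executable, "-c", "print('hello world')"]

# By default the captured output is bytes.
as_bytes = subprocess.run(command, capture_output=True)

# With text=True the output is decoded to str.
as_text = subprocess.run(command, capture_output=True, text=True)

print(as_text.stdout)
```

os.system only returns the exit status, which is why it cannot be used to collect the output in a variable.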
Custom standard input for python subprocess
In fact any question on subprocess will be a good read
https://stackoverflow.com/questions/tagged/subprocess
For Python 3, use subprocess:
import subprocess
s = subprocess.getstatusoutput('ps -ef | grep python3')
print(s)
You can also check for errors:
import subprocess
s = subprocess.getstatusoutput('ls')
if s[0] == 0:
    print(s[1])
else:
    print('Custom Error {}'.format(s[1]))
# >>> Applications
# >>> Desktop
# >>> Documents
# >>> Downloads
# >>> Library
# >>> Movies
# >>> Music
# >>> Pictures
import subprocess
s = subprocess.getstatusoutput('lr')
if s[0] == 0:
    print(s[1])
else:
    print('Custom Error: {}'.format(s[1]))
# >>> Custom Error: /bin/sh: lr: command not found
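The same kind of error check can be done without a shell. One difference worth knowing: when the command is given as a list and the program does not exist, Python raises FileNotFoundError instead of returning a "command not found" message. The helper name run_checked and its return convention are hypothetical, chosen only for this sketch.

```python
import subprocess
import sys

def run_checked(args):
    """Return (status, text): stdout on success, an error description otherwise."""
    try:
        done = subprocess.run(args, capture_output=True, text=True, check=True)
    except FileNotFoundError:
        # The executable itself could not be found.
        return (None, "command not found: " + args[0])
    except subprocess.CalledProcessError as exc:
        # The program ran but exited with a non-zero status.
        return (exc.returncode, exc.stderr)
    return (0, done.stdout)

ok = run_checked([sys.executable, "-c", "print('ok')"])
failed = run_checked([sys.executable, "-c", "import sys; sys.exit(3)"])
missing = run_checked(["definitely-not-a-real-command-xyz"])
```

Here ok holds the captured output, failed carries the exit status 3, and missing reports that the executable could not be found.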
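You should also look into commands.getstatusoutput (Python 2 only; the commands module was removed in Python 3, where subprocess.getstatusoutput is the replacement).
This returns a tuple of length 2.
The first item is the integer return code (0 when the command is successful);
the second is the whole output as it would be shown in the terminal.
For ls: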
import commands
s = commands.getstatusoutput('ls')
print s
>> (0, 'file_1\nfile_2\nfile_3')
s[1].split("\n")
>> ['file_1', 'file_2', 'file_3']
In Python 3 the standard way is to use subprocess.run:
import subprocess
res = subprocess.run(['ls', '-l'], capture_output=True, text=True)  # text=True decodes stdout to str
print(res.stdout)
os.popen() is pretty simple to use, but it was marked as deprecated in Python 2.6.
You should use the subprocess module instead.
Read here: reading a os.popen(command) into a string
Jupyter
In a jupyter notebook you can use the magic function !
!echo "execute a command"
files = !ls -a /data/dir/ #get the output into a variable
ipython
To execute this as a .py script you would need to use ipython
files = get_ipython().getoutput('ls -a /data/dir/')
execute script
$ ipython my_script.py
You could import the 'os' module and use it like this :
import os
os.system('#DesiredAction')
Running: subprocess.run
Output: subprocess.PIPE
Error: raise RuntimeError
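#! /usr/bin/env python3
import subprocess

def runCommand(command):
    # Run the command and capture both output streams
    output = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE)
    # Turn a non-zero exit status into an exception carrying stderr
    if output.returncode != 0:
        raise RuntimeError(
            output.stderr.decode("utf-8"))
    return output

# Example invocation; replace the list with your own command and arguments
output = runCommand(["ls", "-l"])
print(output.stdout.decode("utf-8"))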