Semeval twitter data download is not working - python

I want to download the training set of the following dataset: http://alt.qcri.org/semeval2017/task4/index.php?id=data-and-tools
I am required to use a script to download the tweets from their respective ids and the script is on this github: https://github.com/seirasto/twitter_download
When I run the following command in powershell:
python download_tweets_user_api.py --dist input.txt --output output.txt --user
I get
https://alt.qcri.org/semeval2017/task4/
^
SyntaxError: invalid syntax
When trying to run this command:
python download_tweets_api.py --dist=tweeti-a.dist.tsv
I get the following error:
usage: download_tweets_api.py [-h] [--partial PARTIAL] --dist DIST --output OUTPUT
download_tweets_api.py: error: argument --dist: can't open 'tweeti-a.dist.tsv': [Errno 2] No such file or directory: 'tweeti-a.dist.tsv'
What am I doing wrong?

The syntax error can be avoided by removing or commenting out line 7 of download_tweets_user_api.py, which contains a bare URL:
https://alt.qcri.org/semeval2017/task4/
For the second error, use:
python download_tweets_api.py --dist={name-of-file} --output output.txt
instead of:
python download_tweets_api.py --dist=tweeti-a.dist.tsv
where {name-of-file} is the filename of the file that you downloaded from https://alt.qcri.org/semeval2017/task4/index.php?id=data-and-tools
Make sure the file is in the directory you run the command from, which should also be the directory that contains download_tweets_api.py.
Assuming you have authorized twitter, change line 19:
MY_TWITTER_CREDS = os.path.expanduser('~/.my_app_credentials')
to:
MY_TWITTER_CREDS = os.path.expanduser('~/.twitter_oauth')
If you have not authorized Twitter yet, open a command prompt, install the twitter package with pip install twitter, and then run twitter authorize. Once you are redirected to a page, allow the authorization and copy the PIN back into the command prompt.
Once this is done, your .twitter_oauth file will be created and you will be able to run the command that caused the error.
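The wording of the second error ("can't open ...: [Errno 2]") matches what argparse prints when an argument declared with argparse.FileType points at a file that does not exist, so the script fails before any download code runs. Below is a minimal sketch of that behaviour. The parser is a hypothetical reconstruction based only on the usage line in the question, not the script's actual code, and parses_ok is a made-up helper name.

```python
import argparse


def build_parser():
    # Hypothetical parser modelled on the usage line shown in the question.
    parser = argparse.ArgumentParser(prog="download_tweets_api.py")
    parser.add_argument("--partial")
    # FileType("r") makes argparse open the file while parsing the arguments.
    parser.add_argument("--dist", type=argparse.FileType("r"), required=True)
    parser.add_argument("--output", required=True)
    return parser


def parses_ok(argv):
    """Return True if argv parses; False if argparse rejects it and exits."""
    try:
        args = build_parser().parse_args(argv)
    except SystemExit:
        # argparse prints the usage and error message, then calls sys.exit(2).
        return False
    args.dist.close()
    return True
```

With this sketch, parses_ok(["--dist", "missing.tsv", "--output", "output.txt"]) returns False whenever missing.tsv is not in the current working directory, which is why the file location matters.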

Related

Why am I getting "Permission denied" when trying to get the file path to a txt file?

I'm trying to exchange a string between a Python file and a Shell script. Here is how I'm doing it:
#! /bin/bash
# This gets the file location of the script:
SCRIPT=$(readlink -f "$0")
SCRIPTPATH=$(dirname "$SCRIPT")
TEXT=$("$SCRIPTPATH/text.txt")
ENTRY_VAR=$(zenity --entry --text="$TEXT")
echo "$ENTRY_VAR" > "$SCRIPTPATH/text.txt"
I'm writing to the text file from the Python script, but on line 7 of the shell script (TEXT=$("$SCRIPTPATH/text.txt")) I get the error code:
/mnt/chromeos/MyFiles/ZenityPy/ZenityFiles/entry.sh: line 7: /mnt/chromeos/MyFiles/ZenityPy/ZenityFiles/text.txt: Permission denied
Why does this happen?
TEXT=$("$SCRIPTPATH/text.txt") means "execute the command called $SCRIPTPATH/text.txt and store its stdout in TEXT". You get the "Permission denied" error because that file is not executable.
Instead you probably want TEXT=$(cat "$SCRIPTPATH/text.txt").

Run python program in jenkins job

I am trying to run the code below in a Jenkins job. It is meant to delete files older than 30 days from a directory on an FTP server.
I created a freestyle project job in Jenkins, selected "Execute shell" in the build section, and added the code below.
#! /usr/bin/python
import time
import ftputil
host = ftputil.FTPHost('host', 'user', 'pass')
mypath = '/path/directory'
now = time.time()
host.chdir(mypath)
names = host.listdir(host.curdir)
for name in names:
    if host.path.isfile(name):
        host.remove(name)
host.close()
I get the following error on build:
Building remotely on docker-4 (maven linux docker) in workspace /var/lib/jenkins/workspace/Capacity/folder/Test_job
[Test_job] $ /usr/bin/python /tmp/jenkins8422988908580909797.sh
File "/tmp/jenkins8422988908580909797.sh", line 6
SyntaxError: Non-ASCII character '\xc2' in file /tmp/jenkins8422988908580909797.sh on line 6, but no encoding declared;
I also tried the "Execute python script" build option and got a similar error:
Building remotely on docker-4 (maven linux docker) in workspace /var/lib/jenkins/workspace/Capacity/folder/Test_job
[Test_job] $ python /tmp/jenkins5375363980435767190.py
File "/tmp/jenkins5375363980435767190.py", line 6
SyntaxError: Non-ASCII character '\xc2' in file /tmp/jenkins5375363980435767190.py on line 6, but no encoding declared;
I am new to Jenkins and Python. How can I resolve this issue?
Also, if I use a Jenkins pipeline job, how can I call this Python code from a Jenkinsfile?
Try to install ftputil globally.
pip install ftputil
or
sudo pip install ftputil
The following shell script works with no error, once I installed ftputil globally with sudo.
#!/usr/bin/env python
import time
import datetime
import ftputil
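Note that the loop in the question removes every file in the directory; the now variable is never used, so nothing checks the 30-day age. A minimal sketch of the age check is below. It assumes the host object offers listdir, remove, path.join, path.isfile and path.getmtime as ftputil's FTPHost does; remove_old_files and its parameters are hypothetical names introduced for illustration.

```python
import time

SECONDS_PER_DAY = 24 * 60 * 60


def remove_old_files(host, directory, max_age_days=30, now=None):
    """Delete regular files in `directory` last modified more than
    `max_age_days` days ago. Returns the list of removed paths."""
    if now is None:
        now = time.time()
    cutoff = now - max_age_days * SECONDS_PER_DAY
    removed = []
    for name in host.listdir(directory):
        path = host.path.join(directory, name)
        # Skip directories, and keep files modified at or after the cutoff.
        if host.path.isfile(path) and host.path.getmtime(path) < cutoff:
            host.remove(path)
            removed.append(path)
    return removed


# Possible usage (needs a reachable FTP server, so it is left commented out):
# import ftputil
# host = ftputil.FTPHost('host', 'user', 'pass')
# remove_old_files(host, '/path/directory')
# host.close()
```

Taking the host as a parameter keeps the function easy to try against a stand-in object before pointing it at a real server.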
There is nothing special about this Python program. To execute it from a shell, create a shell script file, e.g. python_prog.sh, then make both files executable with chmod +x python_prog.sh python_prog.py
Contents of python_prog.sh:
#!/bin/sh
python python_prog.py
Finally, run the script from a terminal with . python_prog.sh or ./python_prog.sh
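The error message itself says that the script contains a non-ASCII byte (\xc2) on line 6 and no encoding declaration. A byte like that often comes from a non-breaking space pasted in from a web page or document, though that cause is only an assumption here. Below is a small sketch for locating such bytes so they can be retyped as plain spaces; find_non_ascii is a hypothetical helper name.

```python
def find_non_ascii(data):
    """Return (line_number, column, byte_value) for every non-ASCII byte
    in `data`, which must be a bytes object (e.g. a file read in 'rb' mode)."""
    hits = []
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        # bytearray yields integers on both Python 2 and Python 3.
        for col, byte in enumerate(bytearray(line), start=1):
            if byte > 0x7F:
                hits.append((line_no, col, byte))
    return hits


# Possible usage on a saved copy of the script (file name is an example):
# with open("cleanup.py", "rb") as handle:
#     for line_no, col, byte in find_non_ascii(handle.read()):
#         print("line %d, column %d: byte 0x%02x" % (line_no, col, byte))
```

Once the offending characters are replaced with ordinary ASCII spaces, the encoding error should no longer be raised for that line.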

Calling curl from python script

I am trying to use curl to connect to the Splunk API and run a search and store the results into a .csv file. I am able to make this work with the subprocess module and powershell.exe, shown in the code block below
powershell_path = "C:\\WINDOWS\\system32\\WindowsPowerShell\\v1.0\\powershell.exe"
search_string = '<search to be used>'
curl_command = '\ncurl -k -u <other args>'
subprocess.call([powershell_path, search_string, curl_command])
However, I would like to be able to do this without using the powershell.exe. I tried just substituting the path to the powershell.exe with the path to the curl.exe, but then I get errors telling me I have illegal characters in my search_string and curl_command variables (escaping them with a \ doesn't help), and when I try to directly use the paths in the subprocess.call:
subprocess.call(['<path to curl.exe>', '<full search string>', 'curl -k -u <other args>'])
I get the error WindowsError: [Error 2] The system cannot find the file specified (I checked path to file and tried os.join, got nowhere). How can I use curl in Python to connect to the Splunk API, run a search, save the results to a csv, and do all this without calling external applications like Powershell?
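One thing that stands out in the call above is that the word curl and all of its options are packed into a single list element. When subprocess is given a list, the first element is the program and every option or value needs its own element. Below is a minimal sketch of building such a list. The /services/search/jobs/export endpoint, the output_mode=csv parameter, and all host, user and file names are assumptions for illustration, not taken from the question; build_curl_command is a hypothetical helper.

```python
def build_curl_command(curl_path, base_url, user, password, search, output_file):
    """Build an argument list for subprocess: one list element per argument."""
    return [
        curl_path,                          # program to run, e.g. full path to curl.exe
        "-k",                               # skip TLS certificate verification
        "-u", "{}:{}".format(user, password),
        base_url + "/services/search/jobs/export",  # assumed Splunk export endpoint
        "--data-urlencode", "search=" + search,
        "-d", "output_mode=csv",            # assumed parameter for CSV results
        "-o", output_file,                  # let curl write the response to a file
    ]


# Possible usage (needs curl and a reachable server, so it is commented out):
# import subprocess
# cmd = build_curl_command(r"C:\path\to\curl.exe", "https://localhost:8089",
#                          "user", "password", "search index=main | head 5",
#                          "results.csv")
# subprocess.call(cmd)
```

Because no shell is involved, characters such as | or & inside the search string are passed to curl unchanged and need no escaping.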

Python copy file with Fabric - Windows to Debian

I am trying to use Python Fabric to copy a file from Windows to a Debian system.
SOURCE: The Windows folder C:\Users\UserN\Downloads contains the file test_celsius.out.
DESTINATION: The Debian folder is /mnt/Reado/RoTempValC.
I can move other files from the SOURCE to the DESTINATION using WinSCP. However, I need to use Fabric to move this particular file.
I can use Fabric to change into this directory and list its current contents:
ls /mnt/Reado/RoTempValC
Here is what I have tried - in a Fabric task named move() I have this
run('mv C:\Users\UserN\Downloads\test_celsius.out /mnt/Reado/RoTempValC')
Now, here is the output:
.
.
.
.
[10.10..] Executing task 'move'
[10.10..] run: mv C:\Users\UserN\Downloads\test_celsius.out /mnt/Reado/RoTempValC
[10.10..] out: mv: rename C:/Users/UserN/Downloads/test_celsius.out to /mnt/Reado/RoTempValC/test_celsius.out: No such file or directory
[10.10..] out:
Disconnecting from 10.10.. done.
Fatal error: run() received nonzero return code 1 while executing!
Requested: mv C:/Users/UserN/Downloads/test_celsius.out /mnt/Reado/RoTempValC
Executed: /bin/bash -l -c "mv C:/Users/UserN/Downloads/test_celsius.out /mnt/Reado/RoTempValC"
Aborting.
I am not sure why it is doing this. I can correctly list the contents of the directory in Debian by using the ls command above.
Is there a way to copy this file?
EDIT:
Additional Information:
I am running the above fab move command from the Windows command prompt.
I opened the command prompt and typed cd Python27\SGTemp, since this is where fabfile.py is located.
Then I ran fab move.
EDIT 2:
I replaced /mnt/Reado/RoTempVal by /mnt/Reado/RoTempValC/ but got the same output as above.
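run() executes its command on the remote Debian host, where the Windows path does not exist, so mv cannot find the file. To upload a local file, try fabric.operations.put(*args, **kwargs), with a raw string (r prefix) for the Windows path so the backslashes stay literal:
put(r'C:\Users\UserN\Downloads\test_celsius.out', '/mnt/Reado/RoTempValC')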
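One detail to watch in the Windows path: in an ordinary Python string literal, the \t in \test_celsius.out is read as a tab character, so the path that reaches Fabric is not the one typed. A short sketch of the difference, using a shortened version of the path purely for illustration:

```python
# Ordinary literal: the two characters '\' and 't' collapse into one tab.
plain = 'Downloads\test_celsius.out'

# Raw literal: the backslash is kept as typed.
raw = r'Downloads\test_celsius.out'


def contains_tab(path):
    """Return True if the string contains a tab character."""
    return '\t' in path


print(contains_tab(plain))
print(contains_tab(raw))
```

Writing the path with forward slashes or doubled backslashes avoids the problem as well.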

windows wget & being cut off

System Info:
Windows 7,
GNU Wget 1.11.4,
Python 2.6
The problem:
I'm running a Python script that launches wget through a shortcut. The problem is that wget (even when run directly from the command line via the exe) cuts the URL off at each '&'. For example, when I run the command below, this is what I get:
C:\Program Files\GnuWin32\bin>wget.exe http://www.imdb.com/search/title?genres=action&sort=alpha,asc&start=51&title_type=feature
SYSTEM_WGETRC = c:/progra~1/wget/etc/wgetrc
syswgetrc = C:\Program Files\GnuWin32/etc/wgetrc
--2013-01-18 12:48:43-- http://www.imdb.com/search/title?genres=action
Resolving www.imdb.com... 72.21.215.52
Connecting to www.imdb.com|72.21.215.52|:80... failed: Connection refused.
=alpha,ascThe system cannot find the file specified.
The system cannot find the file 51.
'title_type' is not recognized as an internal or external command, operable program or batch file.
As you can see, wget treats all the text before the first '&' as the URL, and Windows takes the rest as new commands.
There has to be some way to make wget take the whole string as the URL.
Thanks in advance.
EDIT:
When I run the command on the command line with brackets around it, it works great. However, when I run it through Python:
subprocess.Popen(['start /B wget.lnk --directory-prefix=' + output_folder + ' --output-document=' + output_folder + 'this.html "http://www.imdb.com/search/title?genres=action&sort=alpha,asc&start=51&title_type=feature"'], shell=True)
I get the following error:
SYSTEM_WGETRC = c:/progra~1/wget/etc/wgetrc
syswgetrc = C:\Program Files (x86)\GnuWin32/etc/wgetrc
"http://www.imdb.com/search/title?genres=action&sort=alpha,asc&start=51&title_type=feature": Unsupported scheme.
It's not Wget that cuts off the URL but the command interpreter, which treats & as a separator between two commands, much like ; in a Unix shell. This is indicated by the "=alpha,ascThe system cannot find the file specified." error that follows the wget output.
To prevent this from happening, quote the entire URL:
wget.exe "http://www.imdb.com/search/title?genres=action&sort=alpha,asc&start=51&title_type=feature"
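From Python, another option is to skip the command interpreter altogether: pass the arguments as a list and leave out shell=True, so nothing interprets the & characters. The sketch below assumes wget.exe can be called directly rather than through the start /B wget.lnk shortcut; build_wget_args and echo_last_argument are hypothetical helpers. The second helper runs a child Python process in place of wget, only to show that the URL arrives intact.

```python
import subprocess
import sys

URL = ("http://www.imdb.com/search/title"
       "?genres=action&sort=alpha,asc&start=51&title_type=feature")


def build_wget_args(url, output_file, wget="wget"):
    """Argument list for subprocess: program first, then one item per argument."""
    return [wget, "--output-document=" + output_file, url]


def echo_last_argument(args):
    """Run a child Python process with the same arguments (minus the program
    name) and return the last argument exactly as the child received it."""
    child = [sys.executable, "-c", "import sys; print(sys.argv[-1])"]
    child += list(args[1:])
    return subprocess.check_output(child).decode().strip()


print(echo_last_argument(build_wget_args(URL, "this.html")) == URL)

# To run wget itself (assumes wget is on PATH), something like:
# subprocess.call(build_wget_args(URL, "this.html"))
```

With a list and no shell, the URL needs no extra quotes of its own.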
