[UPDATE]Google Search Python Module - python

I have a problem with the google search module.
I'm trying to use this module to run multiple queries, but I can't get it to run one query for each word in my word list.
alpha = input(colored ("[{}*{}] Enter Path of you're Word : ",'yellow'))
word = open(alpha, 'r')
Lines = word.readlines()
query = Lines
try:
    print(colored("[{}+{}] Scan started! Please wait... :)",'red'))
    for gamma in search(query, start=0, tld=beta, num=1000 , pause=2):
        print(colored ('[+] Found > ' ,'yellow') + (gamma) )
        with open("googleurl.txt","a") as f:
            f.write(gamma + "/" + "\n")
except:
    print("[{}-{}] Word Liste not found!")
I think it's not possible to run multiple queries,
because my dorks are loaded into my Python program but the queries are not run. If I change it to
query = "test"
I get about 100 results for the word test. I think I'm doing something wrong when I build the query from the text file.
I'm sorry for my bad English. I'm a beginner with English and also with Python.
I hope you can help me.
I now have this program:
alpha = input(colored ("[{}*{}] Wordlist : ",'yellow'))
Word = open(alpha, 'r')
Lines = Word.readlines()
query = Lines
beta = random.choice(TLD)
Word_number = 0
for line in Lines:
    Word_number+=1
for query in Lines:
    print("Nombre de Word: "+str(Word_number))
    for i in search(query, start=0, tld=beta, num=1000 , pause=2, stop=None):
        print(colored ('[+] Found > ' ,'yellow') +(i))
        URL_number+=1
        with open("googleurl.txt","a") as f:
            f.write(i + "/" + "\n")
            f.close()
print(colored("[{}+{}] Total Google URL : ",'red') + str(URL_number))
And this is what my program does:
It finds just 98 websites and stops, and it only checks the first word.

word.readlines() returns a list of strings, where each item is the next line in the file. This means that query is a list.
The search() function wants query to be a string, so you'll have to loop through Lines to get each individual query:
for query in Lines:
    # perform search with this query
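A minimal sketch of that loop, assuming a hypothetical stand-in `fake_search()` in place of the real `search()` so the example runs offline. It also strips the trailing newline that `readlines()` leaves on each line and skips blank lines; both steps are additions that the answer above does not mention.

```python
import io

def fake_search(query):
    # Hypothetical stand-in for search(); it only echoes the query back.
    return ["https://example.com/?q=" + query]

# Stands in for open(alpha, 'r'); a real run would read the word list file.
word_file = io.StringIO("first word\nsecond word\n\n")

results = []
for line in word_file.readlines():
    query = line.strip()      # readlines() keeps the trailing "\n"
    if not query:             # skip empty lines
        continue
    for url in fake_search(query):
        results.append(url)

print(results)
```

Each line becomes its own string query, which is the shape `search()` expects according to the answer.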

Hey, I finally updated my code, and now I have a problem with proxies.
The code is fixed for requests with dorks, but I can't find out how to add a proxy. My code is:
alpha = input(colored ("[{}*{}] Dorklist : ",'yellow'))
dorks = open(alpha, 'r')
Lines = dorks.readlines()
query = Lines
beta = random.choice(TLD)
ceta = input(colored ("[{}*{}] Proxylist :",'yellow'))
prox = open(ceta, 'r')
Lines2 = prox.readlines()
proxy = Lines2
Dorks_number = 0
Proxy_number = 0
for line in Lines:
    Dorks_number+=1
for line in Lines2:
    Proxy_number+=1
print("Nombre de dorks: "+str(Dorks_number))
print("Nombre de Proxy: "+str(Proxy_number))
s = requests.Session(proxies=proxy)
s.cookies.set_policy(BlockAll())
for query in Lines:
    for i in search(query, start=0, tld=beta, num=1000 , pause=2, stop=None):
        print(colored ('[+] Found > ' ,'yellow') +(i))
        URL_number+=1
        with open("googleurl.txt","a") as f:
            f.write(i + "/" + "\n")
            f.close()
print(colored("[{}+{}] Total Google URL : ",'red') + str(URL_number))
My error:
s = requests.Session(proxies=proxy)
TypeError: __init__() got an unexpected keyword argument 'proxies'
Does anyone have an idea how to do this?
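For reference, `requests.Session()` takes no constructor arguments, which is why passing `proxies=` raises this TypeError; a session's proxies are set afterwards through its `proxies` dictionary. The sketch below builds such a dictionary from one line of a proxy list. The helper name `build_proxies` and the `host:port` line format are assumptions, and whether `search()` would route its traffic through this session is not something the code above establishes.

```python
def build_proxies(line):
    """Turn one 'host:port' line into a requests-style proxies mapping."""
    address = line.strip()
    if not address:
        return {}
    proxy_url = "http://" + address
    # The same HTTP proxy is used for both schemes in this sketch.
    return {"http": proxy_url, "https": proxy_url}

proxy_lines = ["10.0.0.1:8080\n", "\n"]   # assumed file contents
proxies = build_proxies(proxy_lines[0])
print(proxies)

# With requests installed, the mapping would be applied after construction:
#   s = requests.Session()
#   s.proxies.update(proxies)
```

Note that the whole list returned by `readlines()` cannot be used as the proxies value directly; one entry has to be picked and converted first.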

Related

need help regarding this error: can only concatenate list (not "str") to list

I'm learning Python, so I'm pretty new to it.
I've been working on a class assignment and I've been running into some errors, such as the one in the title.
This is my code:
import random
def getWORDS(filename):
    f = open(filename, 'r')
    templist = []
    for line in f:
        templist.append(line.split("\n"))
    return tuple(templist)
articles = getWORDS("articles.txt")
nouns = getWORDS("nouns.txt")
verbs = getWORDS("verbs.txt")
prepositions = getWORDS("prepositions.txt")
def sentence():
    return nounphrase() + " " + verbphrase()
def nounphrase():
    return random.choice(articles) + " " + random.choice(nouns)
def verbphrase():
    return random.choice(verbs) + " " + nounphrase() + " " + \
        prepositionalphrase()
def prepositionalphrase():
    return random.choice(prepositions) + " " + nounphrase()
def main():
    number = int(input("enter the number of sentences: "))
    for count in range(number):
        print(sentence())
main()
However, whenever I run it I get this error:
TypeError: can only concatenate list (not "str") to list.
Now, I know there are tons of questions like this, but I have tried many times and I am not able to fix it. I'm new to programming, so I've only been learning the basics since last week.
Thank you
Here I've modified the function slightly so that it fetches every word into a tuple. Use with to open the files: it closes the file once the values have been fetched.
I hope this will work for you!
def getWORDS(filename):
    result = []
    with open(filename) as f:
        file = f.read()
        texts = file.splitlines()
        for line in texts:
            result.append(line)
    return tuple(result)
I think the problem is in this line:
templist.append(line.split("\n"))
split() will return a list that is then appended to templist. If you're wanting to remove the newline character from the end of the line use rstrip() as this will return a string.
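A small sketch of the difference, using an in-memory string in place of a line read from the file (the sample word is an assumption):

```python
line = "the\n"

as_list = line.split("\n")     # a list: ['the', '']
as_text = line.rstrip("\n")    # a string: 'the'

print(as_list, repr(as_text))

# Concatenating the string works; doing the same with the list raises TypeError.
phrase = as_text + " " + "cat"
try:
    as_list + " " + "cat"
    error = None
except TypeError as exc:
    error = str(exc)
print(phrase, "|", error)
```

Because `random.choice` returns whatever was stored, storing lists leads to the list-plus-string concatenation that the error message complains about.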
When working with a file, you can read its whole contents with the read() method:
file = f.read()
To split the file into lines and add them to a list, you first split, then append line by line.
file = f.read()
lines = file.split("\n")
for line in lines:
    templist.append(line)
In your case, you are using the list of lines as-is, so I would write:
file = f.read()
templist = file.split("\n")
Edit 1:
Another useful tool when working with files is f.readline(), which returns the first line the first time you call it, the second line on the next call, and so on, although the approaches shown above would be more efficient here.
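A short sketch of that behaviour, with `io.StringIO` standing in for an open text file (the contents are an assumption):

```python
import io

f = io.StringIO("the\na\n")   # stands in for an open text file

first = f.readline()    # 'the\n'
second = f.readline()   # 'a\n'
third = f.readline()    # '' signals end of file
print(repr(first), repr(second), repr(third))
```

Each call keeps the trailing newline, and an empty string means there is nothing left to read.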
Edit 2:
When you are done using the file, call its close() method, or open the file with a with ... as statement, which closes the file at the end of the code block.
Code example using with ... as (the best written code in this answer):
def getWORDS(filename):
    with open(filename, 'r') as f:
        file = f.read()
        templist = file.split("\n")
    return tuple(templist)
Code example using close():
def getWORDS(filename):
    f = open(filename, 'r')
    file = f.read()
    templist = file.split("\n")
    f.close()
    return tuple(templist)
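One caveat with `split("\n")`, shown here as a small sketch: when the text ends with a newline, the result has a trailing empty string, which `random.choice` could then pick. `str.splitlines()` avoids that. The in-memory text below is an assumption standing in for a file's contents.

```python
text = "the\na\nan\n"   # typical file contents, ending with a newline

with_split = text.split("\n")        # ['the', 'a', 'an', '']
with_splitlines = text.splitlines()  # ['the', 'a', 'an']

print(with_split)
print(with_splitlines)
```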
This is how I would write the full code.
(fixed file opening and reading + fixed capitalization)
import random
def getWORDS(filename):
    with open(filename, 'r') as f:
        file = f.read()
        templist = file.split("\n")
    return tuple(templist)
articles = getWORDS("articles.txt")
nouns = getWORDS("nouns.txt")
verbs = getWORDS("verbs.txt")
prepositions = getWORDS("prepositions.txt")
def sentence():
    sentence = nounphrase() + " " + verbphrase()
    sentence = sentence.split(" ")
    sentence[0] = sentence[0].capitalize()
    sentence = " ".join(sentence)
    return sentence
def nounphrase():
    return random.choice(articles).lower() + " " + random.choice(nouns).capitalize()
def verbphrase():
    return random.choice(verbs).lower() + " " + nounphrase() + " " + \
        prepositionalphrase()
def prepositionalphrase():
    return random.choice(prepositions).lower() + " " + nounphrase()
def main():
    number = int(input("enter the number of sentences: "))
    for count in range(number):
        print(sentence())
main()

TypeError: cannot concatenate 'str' and 'NoneType' objects?

I have this large script (I will post the whole thing if I have to, but it is very big) which starts off okay when I run it but immediately gives me 'TypeError: cannot concatenate 'str' and 'NoneType' objects' when it gets to this last bit of the code:
with open("self.txt", "a+") as f:
    f = open("self.txt", "a+")
    text = f.readlines()
    text_model = markovify.Text(text)
    for i in range(1):
        tool = grammar_check.LanguageTool('en-GB')
        lin = (text_model.make_sentence(tries=800))
        word = ('' + lin)
        matches = tool.check (word)
        correct = grammar_check.correct (word, matches)
        print ">",
        print correct
        print ' '
        f = open("self.txt", "a+")
        f.write(correct + "\n")
I have searched everywhere but gotten nowhere. It seems to have something to do with word = ('' + lin), but no matter what I do I can't fix it. What am I doing wrong?
I'm not sure how I did it, but with a bit of fiddling and Google I came up with a solution. The corrected code is here (if you're interested):
with open("self.txt", "a+") as f:
    f = open("self.txt", "a+")
    text = f.readlines()
    text_model = markovify.Text(text)
    for i in range(1):
        tool = grammar_check.LanguageTool ('en-GB')
        lin = (text_model.make_sentence(tries=200))
        matches = tool.check (lin)
        correct = grammar_check.correct (lin, matches)
        lowcor = (correct.lower())
        print ">",
        print str (lowcor)
        print ' '
        f = open("self.txt", "a+")
        f.write(lowcor + "\n")
Thanks for all the replies, they had me thinking and that's how I fixed it!
You can't concatenate a string and a NoneType object. In your code, it appears your variable lin is not getting assigned the value you think it is. You might try an if block that starts like this:
if type(lin) == str:
    some code
else:
    raise Exception('lin is not the correct datatype')
to verify that lin is the correct datatype before printing.
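A minimal sketch of that check. `make_sentence` here is a hypothetical stand-in that returns None when it fails to build a sentence, so the example runs without markovify; testing `is None` is a slightly narrower alternative to comparing `type(lin)` with `str`.

```python
def make_sentence(ok):
    # Hypothetical stand-in: returns a sentence, or None on failure.
    return "a generated sentence" if ok else None

def describe(lin):
    if lin is None:
        # '' + None would raise TypeError, so handle the failure first.
        return "no sentence generated"
    return "> " + lin

print(describe(make_sentence(True)))
print(describe(make_sentence(False)))
```

Handling the None case explicitly lets the script skip or retry instead of crashing on the concatenation.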

Python - searching if string is in file

I want to search for a string in a file and, if the string is there, perform one action, and if it isn't, perform another action. But with this code:
itcontains = self.textCtrl2.GetValue()
self.textCtrl.AppendText("\nTY: " + itcontains)
self.textCtrl2.Clear()
pztxtflpath = "TCM/Zoznam.txt"
linenr = 0
with open(pztxtflpath) as f:
    found = False
    for line in f:
        if re.search("\b{0}\b".format(itcontains),line):
            hisanswpath = "TCM/" + itcontains + ".txt"
            hisansfl = codecs.open(hisanswpath, "r")
            textline = hisansfl.readline()
            linenr = 0
            ans = ""
            while textline <> "":
                linenr += 1
                textline = hisansfl.readline()
            hisansfl.close()
            rnd = random.randint(1, linenr) - 1
            hisansfl = codecs.open(pztxtflpath, "r")
            textline = hisansfl.readline()
            linenr = 0
            pzd = ""
            while linenr <> rnd:
                textline = hisansfl.readline()
                linenr += 1
            ans = textline
            hisansfl.close()
            self.textCtrl.AppendText("\nTexter: " + ans)
    if not found:
        self.textCtrl.AppendText("\nTexter: " + itcontains)
        wrtnw = codecs.open(pztxtflpath, "a")
        wrtnw.write("\n" + itcontains)
        wrtnw.close
If the string is not in the file, it works correctly, but if the string I am searching for is there, it still runs the if not found action. I really don't know how to fix it. I have already tried some code from other sites, but it doesn't work in my code. Can somebody help, please?
Are you saying that the code underneath the following if statement executes if the string contains what you're looking for?
if re.search("\b{0}\b".format(itcontains),line):
If so, then you just need to add the following to the code block underneath this statement:
found = True
This will keep your if not found clause from running. If the string you are looking for should only be found once, I would also add a break statement at the end of that first if block to break out of the loop.
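A sketch of the flag-and-break pattern described above, using an in-memory list of lines instead of the file. It also writes the pattern as a raw string and escapes the search term: in a normal Python string literal `"\b"` is a backspace character rather than a regex word boundary, so the pattern in the question would not match whole words. That point and the name `contains_word` are additions, not part of the original answer.

```python
import re

def contains_word(lines, word):
    # r"\b" is a word boundary; re.escape guards against regex metacharacters.
    pattern = re.compile(r"\b{0}\b".format(re.escape(word)))
    found = False
    for line in lines:
        if pattern.search(line):
            found = True
            break          # stop at the first match
    return found

lines = ["hello there\n", "good morning\n"]
print(contains_word(lines, "hello"))
print(contains_word(lines, "hell"))

# Without the r prefix, "\b" is a backspace and this search finds nothing.
print(re.search("\b{0}\b".format("hello"), "hello there"))
```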

Python flow control with Flag?

I'm processing a file in this form. Each record always begins with InvNo, and ~EOR~ marks the End Of Record.
InvNo: 123
Tag1: rat cake
Media: d234
Tag2: rat pudding
~EOR~
InvNo: 5433
Tag1: strawberry tart
Tag5: 's got some rat in it
~EOR~
InvNo: 345
Tag2: 5
Media: d234
Tag5: rather a lot really
~EOR~
It should become
IN 123
UR blabla
**
IN 345
UR blibli
**
Where UR is a URL. I want to keep the InvNo as first tag. ** is now the end of record marker. This works:
impfile = filename[:4]
media = open(filename + '_earmark.dat', 'w')
with open(impfile, 'r') as f:
    HASMEDIA = False
    recordbuf = ''
    for line in f:
        if 'InvNo: ' in line:
            InvNo = line[line.find('InvNo: ')+7:len(line)]
            recordbuf = 'IN {}'.format(InvNo)
        if 'Media: ' in line:
            HASMEDIA = True
            mediaref = line[7:len(line)-1]
            URL = getURL(mediaref) # there's more to it, but that's not important now
            recordbuf += 'UR {}\n'.format(URL)
        if '~EOR~' in line:
            if HASMEDIA:
                recordbuf += '**\n'
                media.write(recordbuf)
            HASMEDIA = False
            recordbuf = ''
media.close()
Is there a better, more Pythonic way? Working with the recordbuffer and the HASMEDIA flag seems, well, old hat. Any examples or tips for good or better practice?
(Also, I'm open to suggestions for a more to-the-point title to this post)
You could set InvNo and URL initially to None, and only write a record when InvNo and URL are both truthy:
impfile = filename[:4]
with open(filename + '_earmark.dat', 'w') as media, open(impfile, 'r') as f:
    InvNo = URL = None
    for line in f:
        if line.startswith('InvNo: '):
            InvNo = line[line.find('InvNo: ')+7:len(line)]
        if line.startswith('Media: '):
            mediaref = line[7:len(line)-1]
            URL = getURL(mediaref)
        if line.startswith('~EOR~'):
            if InvNo and URL:
                recordbuf = 'IN {}\nUR {}\n**\n'.format(InvNo, URL)
                media.write(recordbuf)
            InvNo = URL = None
Note: I changed 'InvNo: ' in line to line.startswith('InvNo: ') based on the assumption that InvNo always occurs at the beginning of the line. It appears to be true in your example, but the fact that you use line.find('InvNo: ') suggests that 'InvNo:' might appear anywhere in the line.
If InvNo: appears only at the beginning of the line, then use line.startswith(...) and remove line.find('InvNo: ') (since it would equal 0).
Otherwise, you'll have to retain 'InvNo:' in line and line.find (and of course, the same goes for Media and ~EOR~).
The problem with using code like 'Media' in line is that if the Tags can contain anything, it might contain the string 'Media' without being a true field header.
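A tiny sketch of that pitfall, using a made-up tag line (the content is an assumption) that mentions a media reference inside its value:

```python
line = "Tag5: see Media: d234 for details\n"

uses_in = 'Media: ' in line                  # True, although this is a Tag5 line
uses_startswith = line.startswith('Media: ')  # False

print(uses_in, uses_startswith)
```

With `in`, this Tag5 line would be treated as a Media field; `startswith` only accepts a real field header at the beginning of the line.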
Here is a version that avoids slicing. If you ever need to write to the same output file again (you may not), open it with 'a' instead of 'w', as done here.
with open('input_file', 'r') as f, open('output.dat', 'a') as media:
    write_to_file = False
    lines = f.readlines()
    for line in lines:
        if line.startswith('InvNo:'):
            first_line = 'IN ' + line.split()[1] + '\n'
        if line.startswith('Media:'):
            write_to_file = True
        if line.startswith('~EOR~') and write_to_file:
            url = 'blabla' #Put getUrl() here
            media.write(first_line + url + '\n' + '**\n')
            write_to_file = False
            first_line = ''
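Another way to drop the flag, sketched under some assumptions: collect each record into a dict first, then decide what to write once the ~EOR~ marker is reached. The names `iter_records`, `convert` and `get_url` are hypothetical (`get_url` is only a placeholder for the getURL() mentioned in the question), and the sample lines are taken from the question's example input.

```python
def iter_records(lines):
    """Yield one dict per record, splitting on the ~EOR~ marker."""
    record = {}
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("~EOR~"):
            yield record
            record = {}
        elif ": " in line:
            key, value = line.split(": ", 1)
            record[key] = value

def get_url(mediaref):
    # Hypothetical placeholder for the real URL lookup.
    return "http://example.com/" + mediaref

def convert(lines):
    out = []
    for rec in iter_records(lines):
        # Only records that have both an InvNo and a Media field are kept.
        if "InvNo" in rec and "Media" in rec:
            out.append("IN {}\nUR {}\n**\n".format(rec["InvNo"], get_url(rec["Media"])))
    return "".join(out)

sample = [
    "InvNo: 123\n", "Tag1: rat cake\n", "Media: d234\n", "~EOR~\n",
    "InvNo: 5433\n", "Tag1: strawberry tart\n", "~EOR~\n",
]
print(convert(sample))
```

Separating "read a record" from "write a record" means no state has to be reset by hand between records.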

list / string in python

I'm trying to parse tweet data.
My data looks like this:
59593936 3061025991 null null <d>2009-08-01 00:00:37</d> <s><a href="http://help.twitter.com/index.php?pg=kb.page&id=75" rel="nofollow">txt</a></s> <t>honda just recalled 440k accords...traffic around here is gonna be light...win!!</t> ajc8587 15 24 158 -18000 0 0 <n>adrienne conner</n> <ud>2009-07-23 21:27:10</ud> <t>eastern time (us & canada)</t> <l>ga</l>
22020233 3061032620 null null <d>2009-08-01 00:01:03</d> <s><a href="http://alexking.org/projects/wordpress" rel="nofollow">twitter tools</a></s> <t>new blog post: honda recalls 440k cars over airbag risk http://bit.ly/2wsma</t> madcitywi 294 290 9098 -21600 0 0 <n>madcity</n> <ud>2009-02-26 15:25:04</ud> <t>central time (us & canada)</t> <l>madison, wi</l>
I want to get the total number of tweets and the number of keyword-related tweets. I prepared the keywords in a text file. In addition, I want to get the tweet text contents and the total number of tweets that contain a mention (#), a retweet (RT), or a URL (I want to save every URL in another file).
So I coded it like this.
import time
import os
total_tweet_count = 0
related_tweet_count = 0
rt_count = 0
mention_count = 0
URLs = {}
def get_keywords(filepath, mode):
    with open(filepath, mode) as f:
        for line in f:
            yield line.split().lower()
for line in open('/nas/minsu/2009_06.txt'):
    tweet = line.strip().lower()
    total_tweet_count += 1
    with open('./related_tweets.txt', 'a') as save_file_1:
        keywords = get_keywords('./related_keywords.txt', 'r')
        if keywords in line:
            text = line.split('<t>')[1].split('</t>')[0]
            if 'http://' in text:
                try:
                    url = text.split('http://')[1].split()[0]
                    url = 'http://' + url
                    if url not in URLs:
                        URLs[url] = []
                    URLs[url].append('\t' + text)
                    save_file_3 = open('./URLs_in_related_tweets.txt', 'a')
                    print >> save_file_3, URLs
                except:
                    pass
            if '#' in text:
                mention_count +=1
            if 'RT' in text:
                rt_count += 1
            related_tweet_count += 1
            print >> save_file_1, text
save_file_2 = open('./info_related_tweets.txt', 'w')
print >> save_file_2, str(total_tweet_count) + '\t' + srt(related_tweet_count) + '\t' + str(mention_count) + '\t' + str(rt_count)
save_file_1.close()
save_file_2.close()
save_file_3.close()
Following is the sample keywords
Depression
Placebo
X-rays
X-ray
HIV
Blood preasure
Flu
Fever
Oral Health
Antibiotics
Diabetes
Mellitus
Genetic disorders
I think my code has many problems, but the first error is as follows:
line 23, in <module>
if keywords in line:
TypeError: 'in <string>' requires string as left operand, not generator
I wrote the "def ..." part myself, and I think it has a problem. When I put "print keywords" under the line (keywords = get_keywords('./related_keywords.txt', 'r')), it prints something strange, not the words. Please help me out!
Maybe change if keywords in line: to use a regular expression search instead. For example, something like:
import re
...
keywords = "|".join(get_keywords('./related_keywords.txt', 'r'))
matcher = re.compile(keywords)
if matcher.search(line):  # search() looks anywhere in the line; match() only checks the start
    text = ...
... and change get_keywords to something like this instead:
def get_keywords(filepath, mode):
    keywords = []
    with open(filepath, mode) as f:
        for line in f:
            sp = line.split()
            for w in sp:
                keywords.append(w.lower())
    return keywords
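A self-contained sketch of the regular-expression approach, with the keyword list and tweets inlined instead of read from files. It uses `search()` so that a keyword can match anywhere in the tweet, since `match()` only matches at the start of the string. Keeping each keyword line whole (so 'Oral Health' stays one phrase) and passing it through `re.escape` are assumptions of this sketch, not something the answer prescribes.

```python
import re

keyword_lines = ["Depression\n", "X-rays\n", "Oral Health\n", "Flu\n"]

# Keep each line as one phrase, lowercased, and escape regex metacharacters.
keywords = [k.strip().lower() for k in keyword_lines if k.strip()]
matcher = re.compile("|".join(re.escape(k) for k in keywords))

tweets = [
    "honda just recalled 440k accords",
    "new post about oral health and the flu",
]
related = [t for t in tweets if matcher.search(t.lower())]
print(len(tweets), len(related))
```

Compiling the pattern once, outside the loop over tweets, also avoids re-reading the keyword file for every line.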
