I am creating a bot to scrape images from a discord channel. The images can come in two ways:
1) A link such as: https://cdn.discordapp.com/attachments/XXXXXXXX
In this case I download the image directly from the URL and there is no problem.
2) In the second case, there is no URL and the images simply come directly as an attachment.
I am using the Python API. Is there a simple way to download any attachments that are sent in a channel?
The code I am using for part 1 is:
import shutil
import requests

if string.startswith("https://cdn.discordapp.com"):
    r = requests.get(string, stream=True)
    with open("image1.png", "wb") as out_file:
        shutil.copyfileobj(r.raw, out_file)
Is there a way to extract a URL from an attachment when the URL isn't shown in chat, so I can plug it into the first method? If not, which commands do I use to iterate through a message's attachments and download them?
You can use
url = ctx.message.attachments[0].url
if url[0:26] == 'https://cdn.discordapp.com':
    await ctx.send(url)
to get the first attachment in the message, read its cdn.discordapp.com link, and send it in chat.
I haven't tested it, but you should be able to get multiple attachments from a message by changing the list index in ctx.message.attachments[0].url.
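Iterating over every attachment, rather than changing the index by hand, could look roughly like the sketch below. It relies on the `attachments`, `filename`, and `save` members that discord.py exposes on messages and attachments; the helper name `save_attachments`, the `downloads` folder, and the file naming scheme are assumptions made for illustration.

```python
from pathlib import Path


async def save_attachments(message, folder="downloads"):
    """Save every attachment of a discord.py message into `folder`.

    Returns the list of paths that were written.
    """
    target = Path(folder)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, attachment in enumerate(message.attachments):
        # Prefix with the message id and index so equal file names don't collide.
        path = target / f"{message.id}_{index}_{attachment.filename}"
        # Attachment.save downloads the file and writes it to the given path.
        await attachment.save(path)
        saved.append(path)
    return saved
```

Inside a command this would be called as `await save_attachments(ctx.message)`, and inside an `on_message` handler as `await save_attachments(message)`. Each attachment also has a `url` attribute, so the requests-based download from the question should work on it as well.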
Related
I am trying to figure out a way to scan through all the unread messages in a Telegram channel and download the videos and images that have more than a certain number of reactions.
I got to the point where the script will download all the unread videos and images, but I am stuck at how to filter those messages based on reactions. For example: only download the videos and images that have at least 3👍 reactions and/or 3❤️ reactions.
My code is included below. The script downloads all the unread videos and images from the channel title and stores them in a sub-folder called title. As I said, I don't want to download all the videos and images; I only want the ones that meet certain reaction thresholds. Any suggestions or ideas would be greatly appreciated!
from telethon import functions
from telethon.sync import TelegramClient
from telethon.tl.functions.messages import GetDialogsRequest

title = 'channel name'
with TelegramClient(username, api_id, api_hash) as client:
    chat_names = client(GetDialogsRequest(
        offset_date=None,
        offset_id=0,
        offset_peer='username',
        limit=0,
        hash=0
    ))
    result = client(functions.messages.GetPeerDialogsRequest(
        peers=[title]
    ))
    for chat in chat_names.chats:
        if chat.title == title:
            for message in client.iter_messages(title, limit=result.dialogs[0].unread_count):
                if message.photo or message.video:
                    message.download_media('./' + str(title) + '/')
There is an easier way to fetch dialogs than GetDialogsRequest, which is raw API. Use client.get_dialogs instead:
# Fetch the first 100 dialogs (remove the 100 to fetch all of them)
# I have renamed `chat_names` to `dialogs` to make it clearer.
dialogs = client.get_dialogs(100)
The second call to GetPeerDialogsRequest is also unnecessary. It's meant to be used when you want to fetch dialog information about a particular user or chat - but you already fetched all dialog information before.
(Technically, we could remove client.get_dialogs and only use GetPeerDialogsRequest, but for simplicity I won't do that.)
To check reactions given a message you can access the message.reactions field (as seen in the raw API for Message):
print(message.reactions.stringify())
This will give you a feel for what the object looks like with your Telethon version, in my case:
MessageReactions(
results=[
ReactionCount(
reaction=ReactionEmoji(
emoticon='❤'
),
count=11,
chosen_order=None
),
...
]
)
So all that's left is checking if a message meets your needs. I'll make a separate function to make it easier to read:
def should_download(message):
    if not (message.photo or message.video):
        return False  # no photo or video, no need to download
    if not message.reactions:
        return False  # no reactions
    for reaction in message.reactions.results:
        # It might be a ReactionCustomEmoji which doesn't have an emoticon
        # Use getattr to read the emoticon field or return None if it doesn't exist
        emoticon = getattr(reaction.reaction, 'emoticon', None)
        if emoticon in ('❤', '👍') and reaction.count >= 3:
            return True  # has enough reactions
    return False  # did not find the reactions we wanted
And finally the loop can use the function:
for dialog in dialogs:
    if dialog.title == title:
        # We can use the dialog instead of the title as the chat
        for message in client.iter_messages(dialog, dialog.unread_count):
            if should_download(message):
                message.download_media('./' + str(title) + '/')
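If different emoji need different minimum counts, the reaction check can be parameterized. The following is only a sketch: `meets_reaction_threshold`, `thresholds`, and `_normalize` are hypothetical names, and it reads only the `results`, `reaction.emoticon`, and `count` fields visible in the stringify output above. Stripping the emoji variation selector is an assumption, added so that a heart typed with the selector still matches one returned without it. The `SimpleNamespace` object stands in for a real Telethon message.

```python
from types import SimpleNamespace


def _normalize(emoji):
    # Drop the variation selector so both spellings of the heart compare equal.
    return emoji.replace("\ufe0f", "")


def meets_reaction_threshold(message, thresholds):
    """Return True if any emoji listed in `thresholds` reaches its minimum count."""
    wanted = {_normalize(emoji): minimum for emoji, minimum in thresholds.items()}
    reactions = getattr(message, "reactions", None)
    if not reactions or not reactions.results:
        return False
    for result in reactions.results:
        emoticon = getattr(result.reaction, "emoticon", None)
        if emoticon is None:
            continue  # e.g. a custom emoji without an emoticon field
        minimum = wanted.get(_normalize(emoticon))
        if minimum is not None and result.count >= minimum:
            return True
    return False


# Stand-in for a Telethon message carrying 11 heart reactions.
fake_message = SimpleNamespace(
    reactions=SimpleNamespace(
        results=[SimpleNamespace(reaction=SimpleNamespace(emoticon="\u2764"), count=11)]
    )
)
print(meets_reaction_threshold(fake_message, {"\u2764\ufe0f": 3, "\U0001F44D": 3}))
```

This covers only the reaction part; the photo-or-video check from should_download still has to be applied before downloading.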
I am building a meme generator telegram bot that sends a meme whenever the user commands /meme. I am using an API, say "https://......meme-api.....". It gives me a URL for the meme. I want to send the media in that URL provided by the API, as a reply to the /meme command.
def meme(update, context):
    response = requests.get('https://meme-api.herokuapp.com/gimme').json()
    url = response.get('url')
    print(url)
    update.message.reply_text(url)

dispatcher.add_handler(telegram.ext.CommandHandler("meme", meme))
I can't wrap my head around how I can write the meme function. What message reply type should I use to send the media without saving it?
As explained in the documentation of Message.reply_text, that method is a shortcut for Bot.send_message, which corresponds to sendMessage in the Telegram docs. This method sends text messages.
To send photos and animations, you'll have to use
Message.reply_photo / Bot.send_photo / sendPhoto
Message.reply_animation / Bot.send_animation / sendAnimation,
respectively.
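One way to choose between the two shortcuts is sketched below. It assumes the synchronous handler style used in the question and passes the URL straight to the shortcut, so nothing is saved locally. Picking the method from the file extension is an assumption, and `reply_method_for`, `send_media_reply`, and `ANIMATION_EXTENSIONS` are hypothetical names.

```python
from urllib.parse import urlparse

# Assumption: URLs ending in these extensions are sent as animations.
ANIMATION_EXTENSIONS = (".gif", ".mp4")


def reply_method_for(url):
    """Pick the Message shortcut name based on the URL's file extension."""
    path = urlparse(url).path.lower()
    return "reply_animation" if path.endswith(ANIMATION_EXTENSIONS) else "reply_photo"


def send_media_reply(message, url):
    """Reply to `message` with the media at `url`, passing the URL as-is."""
    getattr(message, reply_method_for(url))(url)
```

In the meme handler, the `update.message.reply_text(url)` line would then become `send_media_reply(update.message, url)`.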
Hi stalkers
Is there a way to access a site's data directly?
I need it for my code :
@commands.command(aliases=['isitsafe', 'issafe', 'scanlink'])
async def isthissafe(self, ctx, link: str):
    try:
        link = 'https://transparencyreport.google.com/safe-browsing/search?url=' + link.replace('/', '%2F')
        embed = discord.Embed(
            color=discord.Color.dark_red(),
            title='',
            description=f"[Transparency Report verification]({link})")
        await self.emb(embed, ctx.author.name, 'https://cwatch.comodo.com/images-new/check-my-site-security.png')
        await ctx.send(embed=embed)
    except Exception:
        await ctx.send('An error has occurred')
        print('\nERROR')
Basically, I made a command that should tell whether a link is safe. I did it using Google's Transparency Report site, but the problem is that I only reformat the link so the bot sends it in an embed and you open the report from there.
My question, now that you understand what I need: is there a way to have the bot directly output the message from the website that indicates whether the site is malicious or safe?
Please help me.
I provided an image as well with the message I want to get from the site.
You might want to try scraping the site with bs4, or just look for the string "No unsafe content found". However, it looks like Google populates that field based on a separate request.
Your best bet would be to use transparencyreport.google.com/transparencyreport/api/v3/safebrowsing/status?site=SITE_HERE. It returns a JSON response, but I don't understand it, so play around and figure out what the keys mean.
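A starting point for playing around with that endpoint could look like the sketch below, using only the standard library. The helper names are hypothetical, and stripping a leading `)]}'` guard is an assumption (some Google endpoints prepend it to their JSON); the parser also accepts a body without it. The meaning of the returned values is not interpreted here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

STATUS_ENDPOINT = (
    "https://transparencyreport.google.com/transparencyreport/api/v3/safebrowsing/status"
)
XSSI_PREFIX = ")]}'"  # assumption: the body may start with this guard


def build_status_url(site):
    """Return the status URL with `site` safely percent-encoded."""
    return f"{STATUS_ENDPOINT}?{urlencode({'site': site})}"


def parse_status_body(text):
    """Parse the response body, dropping the guard prefix if it is present."""
    text = text.lstrip()
    if text.startswith(XSSI_PREFIX):
        text = text[len(XSSI_PREFIX):]
    return json.loads(text)


def fetch_status(site, timeout=10):
    """Fetch and parse the status for `site` (blocking network call)."""
    with urlopen(build_status_url(site), timeout=timeout) as response:
        return parse_status_body(response.read().decode("utf-8"))
```

Printing `fetch_status("example.com")` for a few known-safe and known-unsafe sites is probably the quickest way to work out which fields change. Because `fetch_status` blocks, a bot command would want to run it off the event loop, for example with `asyncio.to_thread`.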
Title says it all. I have a Discord bot that uploads cat GIFs whenever a certain keyword or command is used. With my current code, I have to manually add each Tenor GIF link to the output list so the bot can display it. Instead, I want the bot to post any cat GIFs from Tenor or any other GIF website. I'm pretty sure those websites have a tag feature that assigns, for example, the tag "cat" to a cat GIF. I want to find out which GIFs are tagged "cat" and add them to the bot's output list. Is there a way I can do this?
import discord
import os
import random
client = discord.Client()
cat_pictures = ["cats", "cat"]
cat_encouragements = [
"https://tenor.com/view/dimden-cat-cute-cat-cute-potato-gif-20953746", "https://tenor.com/view/dimden-cute-cat-cute-cat-potato-gif-20953747", "https://tenor.com/view/cute-cat-cute-cat-dimden-gif-19689251", "https://tenor.com/view/dimden-cute-cat-cute-cat-potato-gif-21657791",
"https://tenor.com/view/cats-kiss-gif-10385036",
"https://tenor.com/view/cute-kitty-best-kitty-alex-cute-pp-kitty-omg-yay-cute-kitty-munchkin-kitten-gif-15917800",
"https://tenor.com/view/cute-cat-oh-yeah-awesome-cats-amazing-gif-15805236",
"https://tenor.com/view/cat-broken-cat-cat-drinking-cat-licking-cat-air-gif-20661740",
"https://tenor.com/view/funny-animals-cute-chicken-cat-fight-dinner-time-gif-8953000"]
@client.event
async def on_ready():
    print('We have logged in as catbot '.format(client))

@client.event
async def on_message(message):
    if message.author == client.user:
        return
    if message.content.startswith('!help'):
        await message.channel.send('''
I only have two commands right now which are !cat which posts an image of a cat. !cats which gives you a video/gif
''')
    if any(word in message.content for word in cat_pictures):
        await message.channel.send(random.choice(cat_encouragements))
    # cat_apology, cat_sad, cat_dog and cat_dogs are not defined in this excerpt
    if any(word in message.content for word in cat_apology):
        await message.channel.send(random.choice(cat_sad))
    if any(word in message.content for word in cat_dog):
        await message.channel.send(random.choice(cat_dogs))

client.run(os.getenv('TOKEN'))
If you want to get data from another page, you have to learn how to "scrape".
For some pages you may need requests (or urllib) to get the HTML from the server and beautifulsoup (or lxml) to search for data in the HTML. Pages often use JavaScript to add elements, so you may need Selenium to control a real web browser that can run JavaScript (because requests, urllib, beautifulsoup, and lxml can't run JavaScript).
But first you should check whether the page has an API for developers that returns the data in a simpler form, as JSON, so you don't have to search in HTML.
As @ChrisDoyle noticed, there is documentation for the Tenor API.
This documentation even shows an example in Python (using requests) that gets JSON data. The example only needs an extra step to extract the URLs from the JSON, because the response also contains other information, such as image sizes and several formats (gif, small gif, animated gif, mp4, etc.).
This is my version, based on the example from the documentation:
import requests

# set the apikey and limit
API_KEY = "LIVDSRZULELA"  # test value
search_term = "cat"

def get_urls(search, limit=8):
    payload = {
        'key': API_KEY,
        'limit': limit,
        'q': search,
    }
    # our test search
    # get the top 8 GIFs for the search term
    r = requests.get("https://g.tenor.com/v1/search", params=payload)
    results = []
    if r.status_code == 200:
        data = r.json()
        #print('[DEBUG] data:', data)
        for item in data['results']:
            #print('[DEBUG] item:', item)
            for media in item['media']:
                #print('[DEBUG] media:', media)
                #for key, value in media.items():
                #    print(f'{key:10}:', value['url'])
                #print('----')
                if 'tinygif' in media:
                    results.append(media['tinygif']['url'])
    else:
        results = []
    return results

# --- main ---
cat_encouragements = get_urls('cat')
for url in cat_encouragements:
    print(url)
This gives URLs that point directly to the tiny GIF images:
https://media.tenor.com/images/eff22afc2220e9df92a7aa2f53948f9f/tenor.gif
https://media.tenor.com/images/e0f28542d811073f2b3d223e8ed119f3/tenor.gif
https://media.tenor.com/images/75b3c8eca95d917c650cd574b91db7f7/tenor.gif
https://media.tenor.com/images/80aa0a25bee9defa1d1d7ecaab75f3f4/tenor.gif
https://media.tenor.com/images/042ef64f591bdbdf06edf17e841be4d9/tenor.gif
https://media.tenor.com/images/1e9df4c22da92f1197b997758c1b3ec3/tenor.gif
https://media.tenor.com/images/6562518088b121eab2d19917b65ee793/tenor.gif
https://media.tenor.com/images/eafc0f0bef6d6fd135908eaba24393ac/tenor.gif
If you uncomment some of the print() calls in the code, you can see more information.
For example, these are the links from media.items() for a single image:
nanowebm : https://media.tenor.com/videos/513b211140bedc05d5ab3d8bc3456c29/webm
tinywebm : https://media.tenor.com/videos/7c1777a988eedb267a6b7d7ed6aaa858/webm
mp4 : https://media.tenor.com/videos/146935e698960bf723a1cd8031f6312f/mp4
loopedmp4 : https://media.tenor.com/videos/e8be91958367e8dc4e6a079298973362/mp4
nanomp4 : https://media.tenor.com/videos/4d46f8b4e95a536d2e25044a0a288968/mp4
tinymp4 : https://media.tenor.com/videos/390f512fd1900b47a7d2cc516dd3283b/mp4
tinygif : https://media.tenor.com/images/eff22afc2220e9df92a7aa2f53948f9f/tenor.gif
mediumgif : https://media.tenor.com/images/c90bf112a9292c442df9310ba5e140fd/tenor.gif
nanogif : https://media.tenor.com/images/6f6eb54b99e34a8128574bd860d70b2f/tenor.gif
gif : https://media.tenor.com/images/8ab88b79885ab587f84cbdfbc3b87835/tenor.gif
webm : https://media.tenor.com/videos/926d53c9889d7604da6745cd5989dc3c/webm
In the code I use API_KEY = "LIVDSRZULELA" from the documentation, but you should register on the page to get your own unique API_KEY.
API keys from documentation usually have restrictions or always return the same data; they are created only for tests, not for use in a real application.
The documentation shows more methods to get and filter images, e.g. to get trending images.
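To choose a different format than tinygif, the extraction step can be separated from the HTTP request. The sketch below works on the already-parsed JSON and assumes the same `results` / `media` / format / `url` nesting that the code above reads. The function names are hypothetical, and the sample data reuses two URLs from the listing above.

```python
import random


def extract_urls(data, media_format="tinygif"):
    """Collect one URL per media entry for the requested format."""
    urls = []
    for item in data.get("results", []):
        for media in item.get("media", []):
            if media_format in media:
                urls.append(media[media_format]["url"])
    return urls


def pick_random_url(data, media_format="tinygif", fallback=None):
    """Return a random URL for the format, or `fallback` if none was found."""
    urls = extract_urls(data, media_format)
    return random.choice(urls) if urls else fallback


# Sample shaped like the parsed response, with URLs taken from the listing above.
sample = {
    "results": [
        {
            "media": [
                {
                    "tinygif": {"url": "https://media.tenor.com/images/eff22afc2220e9df92a7aa2f53948f9f/tenor.gif"},
                    "gif": {"url": "https://media.tenor.com/images/8ab88b79885ab587f84cbdfbc3b87835/tenor.gif"},
                }
            ]
        }
    ]
}
print(extract_urls(sample, "gif"))
```

In the bot, `pick_random_url(r.json(), "gif")` could replace the hand-maintained cat_encouragements list, with `fallback` covering the case where the search returns nothing.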
How can I send a document in Telegram with Python under a filename other than "document"?
I'm using Python 3. "doc" is plain text that I want to send as a .txt file.
url = baseurl + token + '/sendDocument'
dict = {'chat_id':chat_id}
r = requests.post(url, params=dict, files={'document':doc})
The received file is named "document", without a file extension. When I rewrite the files to
files={'document.txt':doc}
Telegram replies
{"ok":false,"error_code":400,"description":"Bad Request: there is no document in the request"}
Does anyone know how to set a file name for the file?
According to the API page, you cannot rename that form field. You must use the predefined field names for these API methods (document, photo, audio, etc.).
And even if you could, you wouldn't be able to use your custom name to find your files, because the Telegram Bot API uses its own file_id identifier for this purpose.
As a workaround, you can store the mapping from file_id to your_custom_document.txt on your backend and use file_id to communicate with Telegram servers.
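Separately from the form field name, the file name carried in the multipart upload can be set on the client side. This is a sketch and not part of the answer above: requests accepts a `(filename, content, content_type)` tuple as the value of a `files` entry, so the field stays `document` while the uploaded part gets its own name. The helper names and the `text/plain` content type are assumptions; `baseurl`, `token`, and `chat_id` are the values from the question.

```python
def build_document_files(text, filename="document.txt"):
    """Build the `files` mapping for sendDocument with an explicit file name.

    The field name stays 'document'; only the multipart file name changes.
    """
    return {"document": (filename, text.encode("utf-8"), "text/plain")}


def send_text_document(baseurl, token, chat_id, text, filename="document.txt"):
    """Upload `text` as a document named `filename` and return the parsed reply."""
    import requests  # imported lazily so build_document_files works without it

    url = baseurl + token + "/sendDocument"
    response = requests.post(
        url,
        data={"chat_id": chat_id},
        files=build_document_files(text, filename),
    )
    return response.json()
```

A call such as `send_text_document(baseurl, token, chat_id, doc, "report.txt")` should then arrive as report.txt instead of document.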