Get more than 50 members of a group from Soundcloud - python

I'm trying to get all the members of a SoundCloud group using the Python API.
So far I can get the first 50, but providing the mentioned "linked_partitioning=1" argument doesn't seem to move on to the next set of members.
My code is:
# Login and authenticate
client = soundcloud.Client(client_id=clientId,
                           client_secret=clientSecret,
                           username=username,
                           password=password)

# print authenticated user's username
print client.get('/me').username

# Get members
count = 1
total = 0
while count > 0 and total < max:
    members = client.get('/resolve', url=group_url, linked_partitioning=1)
    count = len(members)
    total += count

    # Debug Output
    print "Total: " + str(total) + ". Retrieved another " + str(count) + " members."
    print members[0].username
I've been looking at: https://developers.soundcloud.com/docs/api/guide#pagination but still haven't managed to find a solution.

Here is a snippet of working PHP code using linked_partitioning and limit (the maximum value is 200). The default result set size is 50.
I use limit with all the endpoints, but have not yet touched groups, so I can't verify that it works there.
$qryString = self::API_ENDPOINT . self::SEARCH_API_ARTISTS
    . "/" . $userId . "/tracks?" . $this->getClientId(true);
$qryString .= "&limit=" . self::MAX_API_PAGE_SIZE;
$qryString .= "&linked_partitioning=1";

Related

Reduce time complexity for a working model (python3)

I have a working model for a chat application. The requirement is that, on service restart, we build an in-memory mapper and then fetch the first-page details for each DM / group from that mapper based on the ID.
The working model is as follows:
'''
RECEIVER_SENDER_MAPPER = {
    "61e7dbcf9edba13755a4eb07": {"61e7a5559edba13755a4ea65": [{}, {}, {}, {}, first page entries (25)],
                                 "61de751742fc165ec8b729c9": [{}, {}, {}, {}, first page entries (25)]},
    "61e7a5559edba13755a4ea65": {"61e7dbcf9edba13755a4eb07": [{}, {}, {}, {}, first page entries (25)],
                                 "61de751742fc165ec8b729c9": [{}, {}, {}, {}, first page entries (25)]}
}
'''
RECEIVER_SENDER_MAPPER = {}

def sync_to_inmem_from_db():
    global RECEIVER_SENDER_MAPPER
    message = db.messages.find_one()
    if message:
        # Fetch all users from db
        users = list(db.users.find({}, {"_id": 1, "username": 1}))
        prepared_data = {}
        counter = 0
        for user in users:
            counter += 1
            # Find all message groups which the user is a part of
            user_channel_ids = list(db.message_groups.find(
                {"channel_members": {"$in": [user["_id"]]}},
                {"_id": 1, "channel_name": 1}))
            combined_list = user_channel_ids + users
            users_mapped_data = {}
            for x in combined_list:
                counter += 1
                if x["_id"] == user["_id"]:
                    continue
                find_query = {"receiver_type": "group", "receiver_id": x["_id"], "pid": "0"}
                if x.get("username"):
                    find_query = {"pid": "0", "receiver_type": "user"}
                messages = list(db.messages.find(find_query).sort("created_datetime", -1).limit(50))
                if messages:
                    users_mapped_data[x["_id"]] = messages
            prepared_data[user["_id"]] = users_mapped_data
        RECEIVER_SENDER_MAPPER = prepared_data

if not RECEIVER_SENDER_MAPPER:
    sync_to_inmem_from_db()
With 70 users and 48 message groups the counter reaches 5484, and it takes close to 9 minutes to create the RECEIVER_SENDER_MAPPER.
I have to reduce this to at least a quarter of that.
One optimization I found: since group messages will be the same for all users of a particular group, I can just create a dictionary this way:
all_channels = list(db.message_groups.find())
channels_data = {channel["_id"] : list(db.messages.find({"receiver_id":channel["_id"]}).limit(50)) for channel in all_channels}
But here again, while looping over the users, I have to loop over the groups again to find whether the user is a part of that group or not!
Any idea how to reduce the complexity of this? Thanks in advance.
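One way to avoid that second loop, sketched below (untested, and assuming the same collections and field names as the code above), is to invert channel_members once into a user-to-channels index; the per-user step then becomes a dictionary lookup instead of another scan over the groups.
from collections import defaultdict

# Precompute each channel's first page once, and build a user -> [channel_id] index.
all_channels = list(db.message_groups.find({}, {"_id": 1, "channel_members": 1}))

channels_data = {
    channel["_id"]: list(db.messages.find({"receiver_type": "group",
                                           "receiver_id": channel["_id"],
                                           "pid": "0"})
                         .sort("created_datetime", -1).limit(50))
    for channel in all_channels
}

user_channels = defaultdict(list)
for channel in all_channels:
    for member_id in channel.get("channel_members", []):
        user_channels[member_id].append(channel["_id"])

prepared_data = {}
for user in users:
    users_mapped_data = {}
    for channel_id in user_channels[user["_id"]]:
        if channels_data.get(channel_id):
            users_mapped_data[channel_id] = channels_data[channel_id]
    # DM conversations would still be fetched per user, as in the original code.
    prepared_data[user["_id"]] = users_mapped_data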

Reverse-Geo Coding Fails when run in Loop

I am trying to do reverse geocoding and extract pincodes for lat-long pairs. The .csv file has around 1 million records.
Below is my problem:
1. The Google API fails to give addresses for large numbers of records and takes a huge amount of time. I will later move this to a batch process, though.
2. I tried splitting the file into chunks and ran a few files manually one by one (1000 records in each file after splitting); surprisingly, I then get a 100% result.
3. Later, I ran them in a loop one by one and, again, the Google API fails to give the result.
Note: right now we are looking for free APIs only.
Below is my code:
def reverse_geocode(latlng):
    result = {}
    url = 'https://maps.googleapis.com/maps/api/geocode/json?latlng={}'
    request = url.format(latlng)
    key = '&key=' + api_key
    request = request + key
    data = requests.get(request).json()
    if len(data['results']) > 0:
        result = data['results'][0]
    return result

def parse_postal_code(geocode_data):
    if (geocode_data is not None) and ('formatted_address' in geocode_data):
        for component in geocode_data['address_components']:
            if 'postal_code' in component['types']:
                return component['short_name']
    return None

dfinal = pd.DataFrame(columns=colnames)
dmiss = pd.DataFrame(columns=colnames)

for fl in files:
    df = pd.read_csv(fl)
    print ('Processing file : ' + fl[36:])
    df['geocode_data'] = ''
    df['Pincode'] = ''
    df['geocode_data'] = df['latlng'].map(reverse_geocode)
    df['Pincode'] = df['geocode_data'].map(parse_postal_code)
    if (len(df[df['Pincode'].isnull()]) > 0):
        d0 = df[df['Pincode'].isnull()]
        print("Missing Pincodes : " + str(len(df[df['Pincode'].isnull()])) + " / " + str(len(df)))
        dmiss.append(d0)
        d0 = df[~df['Pincode'].isnull()]
        dfinal.append(d0)
    else:
        dfinal.append(df)
Can anybody help me figure out what the problem in my code is? If any additional info is required, please let me know.
You've run into Google API usage limits.
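Not part of the original answer, but if the limit you are hitting is a rate limit rather than the daily quota, pacing the requests and retrying on the OVER_QUERY_LIMIT status in the JSON response can help. This is an untested sketch, and the pause/retry values are arbitrary:
import time
import requests

def reverse_geocode_throttled(latlng, api_key, pause=0.1, max_retries=3):
    """Geocode one lat-lng with a small delay per call and a retry on OVER_QUERY_LIMIT."""
    url = 'https://maps.googleapis.com/maps/api/geocode/json'
    for attempt in range(max_retries):
        time.sleep(pause)  # simple pacing between requests
        data = requests.get(url, params={'latlng': latlng, 'key': api_key}).json()
        status = data.get('status')
        if status == 'OK':
            return data['results'][0]
        if status == 'OVER_QUERY_LIMIT':
            time.sleep(2 ** attempt)  # back off; a daily quota will keep failing regardless
            continue
        return {}  # ZERO_RESULTS, REQUEST_DENIED, etc.
    return {}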

Extract emails from the entered values in python

My objective is: I need to ask the user to enter the number of emails and then start a for loop to register each input email. Then the emails will be segregated based on '@professor.com' and '@student.com', counted, and appended to a list. Following is what I have tried:
email_count = int(input('how many emails you want'))
student_email_count = 0
professor_email_count = 0
student_email = '@student.com'
professor_email = '@professor.com'

for i in range(email_count):
    email_list = str(input('Enter the emails'))
    if email_list.index(student_email):
        student_email_count = student_email_count + 1
    elif email_list.index(professor_email):
        professor_email_count = professor_email_count + 1
Can someone help shorten this and write it more professionally, with explanations for future reference? The appending part is also missing here. Could someone throw some light on that too?
Thanks
prof_email_count, student_email_count = 0, 0

for i in range(int(input("Email count # "))):
    email = input("Email %s # " % (i + 1))
    if email.endswith("@student.com"):  # str.endswith(s) checks whether `str` ends with s, returns a boolean
        student_email_count += 1
    elif email.endswith("@professor.com"):
        prof_email_count += 1
This is what a (somewhat) shortened rendition of your code would look like. The main differences are that I use str.endswith(...) instead of str.index(...), and that I removed the email_count, student_email and professor_email variables, which didn't seem to be used anywhere else in the context.
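A quick illustration of the difference, with made-up addresses: str.index raises ValueError when the substring is absent, whereas str.endswith simply returns a boolean, which is what this check needs.
"bob@professor.com".index("@student.com")     # raises ValueError: substring not found
"bob@professor.com".endswith("@student.com")  # False
"ann@student.com".endswith("@student.com")    # True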
EDIT:
To answer your comment on scalability, you could implement a system such as this:
domains = {
    "student.com": 0,
    "professor.com": 0,
    "assistant.com": 0
}

for i in range(int(input("Email count # "))):
    email = input("Email %s # " % (i + 1))
    try:
        domain = email.split('@')[1]
    except IndexError:
        print("No/invalid domain passed")
        continue
    if domain not in domains:
        print("Domain: %s, invalid." % domain)
        continue
    domains[domain] += 1
This allows for further scalability, as you can add more domains to the domains dictionary and access each count via domains[<domain>].
It seems your iteration accepts one email at a time and executes email_count times. You can use this simple code to count students and professors:
st = '@student.com'
prof = '@professor.com'

for i in range(email_count):
    email = str(input('Enter the email'))
    if st in email:
        student_email_count += 1
    elif prof in email:
        professor_email_count += 1
    else:
        print('invalid email domain')
If you are using Python 2.7, you should change input to raw_input.
Here's a scalable version of your code, using defaultdict to support unlimited domains.
email_count = int(input('how many emails you want'))
student_email_count = 0
professor_email_count = 0

from collections import defaultdict
domains = defaultdict(int)

for i in range(email_count):
    email = str(raw_input('Enter the email\n'))
    try:
        email_part = email.split('@')[1]
    except IndexError:
        print('invalid email syntax')
    else:
        domains[email_part] += 1
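A side note, not from the original answers: collections.Counter does the same tallying with a little less code; the prompts below just mirror the snippets above.
from collections import Counter

domains = Counter()
for i in range(int(raw_input('how many emails you want'))):
    email = raw_input('Enter the email\n')
    if '@' in email:
        domains[email.split('@')[1]] += 1  # Counter returns 0 for missing keys
    else:
        print('invalid email syntax')

print(domains)  # e.g. Counter({'student.com': 3, 'professor.com': 1})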

Getting Twitter Followers using Twitter's REST API

I am using a Python script to get the followers of a specific user. The script runs fine and returns the IDs of the followers, but when I use the user lookup API it only returns a few screen names. The script is like this:
#!/usr/bin/python
from twitter import *
import sys
import csv
import json

config = {}
execfile("/home/oracle/Desktop/twitter-1.17.1/config.py", config)

twitter = Twitter(
    auth = OAuth(config["access_key"], config["access_secret"], config["consumer_key"], config["consumer_secret"]))

username = "#####"
query = twitter.followers.ids(screen_name = username)
print "found %d followers" % (len(query["ids"]))

for n in range(0, len(query["ids"]), 100):
    ids = query["ids"][n:n+100]

subquery = twitter.users.lookup(user_id = ids)
for user in subquery:
    print " [%s] %s" % ("*" if user["verified"] else " ", user["screen_name"])
    # print json.dumps(user)
And it returns the output like this:
{u'next_cursor_str': u'0', u'previous_cursor': 0, u'ids': [2938672765, 1913345678, 132150958, 2469504797, 2162312397, 737550671029764097, 743699723786158082, 743503916885737473, 742612685632770048, 742487358826811392, 742384945121878020, 741959985127665664, 1541162424, 739102973830254592, 740198523724038144, 542050890, 739971273934176256, 2887662768, 738922874011013120, 738354749045669888, 737638395711791104, 737191937061584896, 329618583, 3331556957, 729645523515396096, 2220176421, 162387597, 727099914635874304, 726665274737475584, 725406360406470657, 938760691, 715260034335305729, 723912842320158720, 538208881, 2188791158, 723558257541828608, 1263571466, 720182865275842564, 719947801598259200, 636067084, 719412219168038912, 719199478260043776, 715921761158574080........ ], u'next_cursor': 0, u'previous_cursor_str': u'0'}
When I use the user lookup API, it only returns 4 screen names, like this:
found 1106 followers
[ ] In_tRu_dEr
[ ] amanhaider3
[ ] SaaddObaid
[ ] Soerwer
I want the screen names of all the IDs present, but it returns only 4. Can anyone help?
Your issue is in these two lines (I assume the second line is meant to be inside the loop, although it is not indented that way in the question):
for n in range(0, len(query["ids"]), 100):
    ids = query["ids"][n:n+100]
These lines build the ids chunks one after another, and each iteration overwrites the previous one: the first iteration puts IDs 0 to 100 into ids, the next overwrites it with IDs 100 to 200, and so on until the last iteration, which covers 1100 to 1106. So by the time the lookup runs, ids only holds those last 6 IDs, and apparently only 4 of those 6 are returned by twitter.users.lookup.
To fix it, keep everything under the for n loop, like this:
for n in range(0, len(query["ids"]), 100):
    ids = query["ids"][n:n+100]
    subquery = twitter.users.lookup(user_id = ids)
    for user in subquery:
        print " [%s] %s" % ("*" if user["verified"] else " ", user["screen_name"])
This will work.
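If you want to keep the names instead of just printing them, a small variation on the same loop (untested, same assumptions as above) collects them while chunking; users/lookup accepts at most 100 IDs per call, which is why the loop steps by 100:
followers = []
for n in range(0, len(query["ids"]), 100):
    ids = query["ids"][n:n+100]                  # at most 100 IDs per lookup call
    for user in twitter.users.lookup(user_id = ids):
        followers.append(user["screen_name"])

print "collected %d screen names" % len(followers)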

Bad search filter

I'm trying to filter a few attributes from the LDAP server, but I get this error:
ldap.FILTER_ERROR: {'desc': 'Bad search filter'}
Code:
import ldap

ldap.OPT_REFERRALS = 0
ldap_server = "ldapps.test.com"
username = "testuser"
password = ""  # your password
connect = ldap.open(ldap_server)
dn = 'uid=' + username
print 'dn =', dn
try:
    result = connect.simple_bind_s(username, password)
    print 'connected == ', result
    filter1 = "(|(uid=" + username + "\*))"
    result = connect.search("DC=cable,DC=com,DC=com", ldap.SCOPE_SUBTREE, filter1)
    print result
except ldap.INVALID_CREDENTIALS as e:
    connect.unbind_s()
    print "authentication error == ", e
Your search filter is, in fact, bad.
The | character is for joining several conditions together in an OR statement. For example, if you wanted to find people with a last name of "smith", "jones", or "baker", you would use this filter:
(|(lastname=smith)(lastname=jones)(lastname=baker))
However, your filter only has one condition, so there's nothing for the | character to join together. Change your filter to this and it should work:
"(uid=" + username + "\*)"
By the way, what are you trying to do with the backslash and asterisk? Are you looking for people whose usernames actually end with an asterisk?
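If the asterisk is meant as a wildcard, here is a small sketch building on the code in the question (untested; connect and username come from the snippet above, and ldap.filter.escape_filter_chars escapes characters that are special in filters):
from ldap.filter import escape_filter_chars

# Escape only the user-supplied part, then append the wildcard yourself,
# so stray parentheses or backslashes in `username` cannot break the filter.
filter1 = "(uid=%s*)" % escape_filter_chars(username)
result = connect.search("DC=cable,DC=com,DC=com", ldap.SCOPE_SUBTREE, filter1)
print result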
