I'm trying to grab all the jobs available in the UK from this site: https://www.ubisoft.com/en-us/company/careers/search?countries=gb and, looking at the browser's network tab, there is a JSON response with the data I need from https://avcvysejs1-dsn.algolia.net/1/indexes/*/queries?x-algolia-agent=Algolia%20for%20JavaScript%20(4.8.4)%3B%20Browser%20(lite)%3B%20JS%20Helper%20(3.3.4)%3B%20react%20(16.12.0)%3B%20react-instantsearch%20(6.8.3)&x-algolia-api-key=1291fd5d5cd5a76a225fc6b00f7b296a&x-algolia-application-id=AVCVYSEJS1 and it uses Request Method: POST
However, I wrote a script to get that data
import requests

data = []
url = "https://avcvysejs1-dsn.algolia.net/1/indexes/*/queries?x-algolia-agent=Algolia%20for%20JavaScript%20(4.8.4)%3B%20Browser%20(lite)%3B%20JS%20Helper%20(3.3.4)%3B%20react%20(16.12.0)%3B%20react-instantsearch%20(6.8.3)&x-algolia-api-key=1291fd5d5cd5a76a225fc6b00f7b296a&x-algolia-application-id=AVCVYSEJS1"
r = requests.post(url)
json = r.json()
print(json)
but I get the result
{'message': 'No content in POST request', 'status': 400}
and when I change it to r = requests.post(url) I get the result
{'message': 'indexName is not valid', 'status': 400}
To get a correct response from the server, send a payload with the request:
import requests
from bs4 import BeautifulSoup
api_url = "https://avcvysejs1-dsn.algolia.net/1/indexes/*/queries"
params = {
    "x-algolia-agent": "Algolia for JavaScript (4.8.4); Browser (lite); JS Helper (3.3.4); react (16.12.0); react-instantsearch (6.8.3)",
    "x-algolia-api-key": "1291fd5d5cd5a76a225fc6b00f7b296a",
    "x-algolia-application-id": "AVCVYSEJS1",
}
payload = """{"requests":[{"indexName":"jobs_en-us_default","params":"highlightPreTag=%3Cais-highlight-0000000000%3E&highlightPostTag=%3C%2Fais-highlight-0000000000%3E&query=&maxValuesPerFacet=100&page=0&facets=%5B%22jobFamily%22%2C%22team%22%2C%22countryCode%22%2C%22city%22%2C%22contractType%22%2C%22graduateProgram%22%5D&tagFilters=&facetFilters=%5B%5B%22countryCode%3Agb%22%5D%5D"},{"indexName":"jobs_en-us_default","params":"highlightPreTag=%3Cais-highlight-0000000000%3E&highlightPostTag=%3C%2Fais-highlight-0000000000%3E&query=&maxValuesPerFacet=100&page=0&hitsPerPage=1&attributesToRetrieve=%5B%5D&attributesToHighlight=%5B%5D&attributesToSnippet=%5B%5D&tagFilters=&analytics=false&clickAnalytics=false&facets=countryCode"}]}"""
data = requests.post(api_url, params=params, data=payload).json()
for h in data["results"][0]["hits"]:
    print(h["title"])
    print(
        BeautifulSoup(h["additionalInformation"], "html.parser").get_text(
            strip=True, separator=" "
        )
    )
    print("-" * 80)
Prints:
Lead Concept Artist [New IP] (433)
Benefits & Relocation Flexible working, 22 days annual leave + Christmas shutdown, private healthcare (with option to add immediate family), life insurance & income protection, workplace pension scheme, paid volunteering days, annual fitness & well-being allowance, games, technology & merchandise, subsidised travel and many more... Relocation assistance is available to anyone currently living 50 miles or more from the studio location. Please contact a member of the talent acquisition team to find out what we have to offer and how we can support with your move here... relocation really doesn't have to be a daunting prospect. Find out more about Ubisoft Reflections: https://reflections.ubisoft.com/about/ubisoft-reflections/ Facebook: https://www.facebook.com/pg/Ubisoft.Reflections Twitter: https://twitter.com/UbiReflections Ubisoft offers the same job opportunities to all, without any distinction of gender, ethnicity, religion, sexual orientation, social status, disability or age. Ubisoft ensures the development of an inclusive work environment which mirrors the diversity of our gamers community.
--------------------------------------------------------------------------------
Player Support Product Lead
Benefits With Ubisoft CRC, you'll receive a competitive salary along with: Personal performance bonus Private Health Insurance (including eye care and dental) Life Assurance Long Term Disability Insurance Pension Significant discount on the world’s best video games Access to Ubisoft's back catalogue on PC Perks: We work in the heart of Newcastle city centre, right on top of Haymarket metro station in a lively, international and creative space. We have a kitchen stocked with cereals, fruits, unlimited filtered water, teas, coffee Regular professional and social events Monthly Ubidrinks Flexible working hours A casual dress code Fun, we like to work hard but have a laugh too! For the safety of all our teams we are currently working remotely. We hope to return to our CRC home very soon and anticipate a blended working pattern combining office and home based working in the future Ubisoft is committed to creating an inclusive work environment that reflects the diversity of our player community. We are an equal opportunity employer. Qualified applicants will receive consideration for employment without regard to their race, ethnicity, religion, gender, sexual orientation, age or disability status.
--------------------------------------------------------------------------------
...and so on.
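If you would rather not hard-code the long payload string, here is a minimal sketch of how the same kind of body could be built with json and urllib.parse. The helper name build_payload and the reduced set of search parameters are my own assumptions; only the index name and the countryCode facet filter come from the payload above, and I have not verified that the server accepts the shortened parameter list.

```python
import json
from urllib.parse import urlencode


def build_payload(country_code, page=0, index_name="jobs_en-us_default"):
    """Return the JSON body for one query filtered by country (hypothetical helper)."""
    search_params = urlencode(
        {
            "query": "",
            "page": page,
            # facetFilters is itself a JSON array, sent as a URL-encoded string
            "facetFilters": json.dumps([[f"countryCode:{country_code}"]]),
        }
    )
    # The request body is JSON; each request's "params" is a URL-encoded string
    return json.dumps(
        {"requests": [{"indexName": index_name, "params": search_params}]}
    )


payload = build_payload("gb")
print(payload)
```

The resulting string can be passed as data=payload in the requests.post() call above; changing page should let you walk through further result pages, assuming the API paginates the way the page=0 parameter suggests.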
I am scraping some websites and storing the data in my database. Sometimes I get a "character maps to <undefined>" error, which I think is due to non-ASCII characters. Since I am scraping many websites with texts in different languages, I could not solve my issue in a general and efficient way.
An example of the error:
Message: 'commit exception GRANTS.GOV'
Arguments: (UnicodeEncodeError('charmap', 'The Embassy of the United States in Nur-Sultan and the Consulate General of the United States in Almaty announces an open competition for past participants (“alumni”) of U.S. government-funded and U.S. government-sponsored exchange programs to submit applications to the 2021 Alumni Engagement Innovation Fund (AEIF) 2021.\xa0\xa0We seek proposals from teams of at least two alumni that meet all program eligibility requirements below. Exchange alumni interested in participating in AEIF 2021 should submit proposals to KazakhstanAlumni#state.gov\xa0by March 31, 2021, 18:00 Nur-Sultan time.\xa0\nAEIF provides alumni of U.S. sponsored and facilitated exchange programs with funding to expand on skills gained during their exchange experience to design and implement innovative solutions to global challenges facing their community. Since its inception in 2011, AEIF has funded nearly 500 alumni-led projects around the world through a competitive global competition.\n\nThis year, the U.S. Mission to Kazakhstan will accept proposals managed by teams of at least two (2) alumni that support the following theme:\n\u25cf\xa0\xa0\xa0\xa0\xa0\xa0Mental health awareness, promotion of mental wellbeing and resiliency.\nGoals. Projects may support one or more of the following goals:\nGoal 1: Increase in public understanding of mental health issues,\xa0its signs and strategies for providing timely help;\nGoal 2: Increase in public understanding of resources, methods, and tools that promote mental health and resiliency, especially among at-risk audiences; American best practices to promote mental health.\nGoal 3: Combatting stigma around mental health issues and dispelling common myths.\n\nFor full package of required forms please Related Documents section.', 1098, 1099, 'character maps to <undefined>'),)
My code:
title = '..............'
description = '......'
op = Op(
    website='',
    op_link='',
    title=title,  # may be a long text coming from websites
    description=description,  # may be a long text coming from websites
    organization_id=org_id,
    close_date='',
    checksum=singleData['checksum'],
    published_date='',
    language_id=lang_id,
    is_open=1)
try:
    session.add(op)
    session.commit()
    session.flush()
    ....
....
Please note: it should work on a Linux system; my database (Mysql) is in a Linux system.
I mostly face the issue with title and description, which can be in many languages and of any length. How can I encode them correctly so that I don't get an error while committing to the database?
Thank you
I have a gz format file. The file is very big and the first line is as follow:
{"originaltitle":"Leasing Specialist - WPM Real Estate Management","workexperiences":[{"company":"Home Properties","country":"US","customizeddaterange":"","daterange":{"displaydaterange":"","startdate":null,"enddate":null},"description":"Responsibilities: Inspect tour routes, models and show apartments daily to ensure cleanliness. Greeting prospective residents; determining the needs and preferences of the prospect and professionally present specific apartments while providing information regarding features and benefits. Answering incoming calls in a cheerful and professional manner. Handle each call accordingly whether it is a prospect call or an irate resident that just moved in. Develop and maintain Resident relations through the courtesy of on-site personnel, promptness of maintenance calls, and knowledge of community policies. Learn to develop professional sales and closing techniques. Accompany prospects to model apartments and discusses size and layout of rooms, available facilities, such as swimming pool and saunas, location of shopping centers, services available, and terms of lease. Demonstrate thorough knowledge and use of lead tracking system. Make follow-up calls to prospective Residents who did not fill out an application. Compile and update listings of available rental units.","location":"Baltimore, MD","normalizedtitle":"leasing specialist","title":"Leasing Specialist"},{"company":"WPM Real Estate Management","country":"US","customizeddaterange":"1 year, 3 months","daterange":{"displaydaterange":"July 2017 to October 2018","startdate":{"displaydate":"July 2017","granularity":"MONTH","isodate":{"date":null}},"enddate":{"displaydate":"October 2018","granularity":"MONTH","isodate":{"date":null}}},"description":"Responsibilities: Inspect tour routes, models and show apartments daily to ensure cleanliness. 
Greeting prospective residents; determining the needs and preferences of the prospect and professionally present specific apartments while providing information regarding features and benefits. Answering incoming calls in a cheerful and professional manner. Handle each call accordingly whether it is a prospect call or an irate resident that just moved in. Develop and maintain Resident relations through the courtesy of on-site personnel, promptness of maintenance calls, and knowledge of community policies. Learn to develop professional sales and closing techniques. Accompany prospects to model apartments and discusses size and layout of rooms, available facilities, such as swimming pool and saunas, location of shopping centers, services available, and terms of lease. Demonstrate thorough knowledge and use of lead tracking system. Make follow-up calls to prospective Residents who did not fill out an application. Compile and update listings of available rental units.","location":"Baltimore, MD","normalizedtitle":"leasing specialist","title":"Leasing Specialist"},{"company":"Westminster Management","country":"US","customizeddaterange":"1 year","daterange":{"displaydaterange":"June 2016 to June 2017","startdate":{"displaydate":"June 2016","granularity":"MONTH","isodate":{"date":null}},"enddate":{"displaydate":"June 2017","granularity":"MONTH","isodate":{"date":null}}},"description":"Responsibilities: Tour vacant units and model with future prospects.Process applications. Answer emails and incoming phone calls. Prepare lease agreement for signing. Collect all monies that is due on dateof move-in. Enter resident repair orders for resident. Walk vacant units to ensure that the unit is ready for show. Complete residency and employment verifications. 
Income qualify all applicants.","location":"Baltimore, MD","normalizedtitle":"leasing consultant","title":"Leasing Consultant"},{"company":"MARYLAND MANAGEMENT COMPANY","country":"US","customizeddaterange":"1 year, 1 month","daterange":{"displaydaterange":"April 2015 to May 2016","startdate":{"displaydate":"April 2015","granularity":"MONTH","isodate":{"date":null}},"enddate":{"displaydate":"May 2016","granularity":"MONTH","isodate":{"date":null}}},"description":"Responsibilities: Lease apartments, sign lease agreements, complete residence maintenance repairrequest, answer phones, customer service, processed prospects applications, opened and closedinventory, responded to Level One emails Accomplishments: I was able to successfully finish FairHousing requirements. The first month I was able to properly and accurately process a application and move-in documents. Skills Used: The skills I used while at Americana were strong team work, strongcommunication, interpersonal, and leadership.","location":"Glen Burnie, MD","normalizedtitle":"leasing agent","title":"Leasing Agent"},{"company":"Amazon.com","country":"US","customizeddaterange":"1 year, 5 months","daterange":{"displaydaterange":"September 2014 to February 2016","startdate":{"displaydate":"September 2014","granularity":"MONTH","isodate":{"date":null}},"enddate":{"displaydate":"February 2016","granularity":"MONTH","isodate":{"date":null}}},"description":"Responsibilities: I assure customers are receiving the correct merchandise in a timely fashion.And evaluate inventoryAccomplishments:I exceeded Amazon expectations of receiving 2800 items per hour, which allowed me to train otherassociates, building confidence and skills.Skills Used:The skills i used while performing my task were strong leadership, strong communications, and beingdetailed orientated.","location":"Baltimore, MD","normalizedtitle":"customer service representative","title":"Customer Service Representative"},{"company":"Carmax 
Superstore","country":"US","customizeddaterange":"1 year, 2 months","daterange":{"displaydaterange":"February 2014 to April 2015","startdate":{"displaydate":"February 2014","granularity":"MONTH","isodate":{"date":null}},"enddate":{"displaydate":"April 2015","granularity":"MONTH","isodate":{"date":null}}},"description":"Responsibilities:Greet customersSearch for the right vehicle that best suits the customers needs and wantsSubmit financial applicationsAssist customer with the purchasing process and document signingEnter customers information for appraisal offerAssist customer with purchasing Car Max extended warrantiesConducted follow- up on a daily, weekly, and monthy basisAccomplishments:I was acknowledged by the district for having 100% in Car Max extended warranties. Also I wasacknowledged by the district for having one of the highest Voice Of Customer survey scores. I passedthe 6 week training, obtaining my sales licenseSkills Used:I demonstrate strong communication, interpersonal and listening skills. 
I also have strongorganizational skills.","location":"Nottingham, MD","normalizedtitle":"sales consultant","title":"Certified Sales Consultant"},{"company":"rue21","country":"US","customizeddaterange":"1 year, 8 months","daterange":{"displaydaterange":"June 2011 to February 2013","startdate":{"displaydate":"June 2011","granularity":"MONTH","isodate":{"date":null}},"enddate":{"displaydate":"February 2013","granularity":"MONTH","isodate":{"date":null}}},"description":"Responsibilities: Managed profit goals on a daily basisCustomer ServiceReceived Incoming shipmentDelivered daily bank depsoitsMaintained store appearanceOverlooked sales associates performanceCreated daily goals for each sales associateAccomplishments:The impact that I was able to have during my time at Rue21, I was able to build a strong team of individuals who were scored top in the region for Customer Service.Skills Used:I demonstrated strong leadership and verbal communication.","location":"Dundalk, MD","normalizedtitle":"assistant store manager","title":"Assistant Store Manager"},{"company":"Shaws Jewelers","country":"US","customizeddaterange":"1 year, 5 months","daterange":{"displaydaterange":"November 2009 to April 2011","startdate":{"displaydate":"November 2009","granularity":"MONTH","isodate":{"date":null}},"enddate":{"displaydate":"April 2011","granularity":"MONTH","isodate":{"date":null}}},"description":"Responsibilities: Customer serviceGeneral office( typing, faxing, )Made outgoing calls to valued customersCleaned and maintained show cases and lunch roomPrepared jewlery repair tickets for outgoing shipmentAccomplishments:During my time at Shaws Jewelers I was able to demonstrate excellent customer service.Also I wasable to achieve personal profit goals and credit application goals on a daily basis. I was acknowledged and rewarded by my DM for excellent team participation and over achieving the 6 standards on a dailybasis.Skills Used:I demonstrated strong verbal and listening skills. 
Also I have excellent interpersonal skills.","location":"Dundalk, MD","normalizedtitle":"sales associate","title":"Sales Associate"}],"skillslist":[{"monthsofexperience":0,"text":"yardi"},{"monthsofexperience":0,"text":"marketing"},{"monthsofexperience":0,"text":"outlook"},{"monthsofexperience":0,"text":"receptionist"},{"monthsofexperience":0,"text":"management"}],"url":"/r/Lashannon-Felton/1062d3b8cbb13886","additionalinfo":""}\n'
I am not familiar with gzip.GzipFile format.
Is there a way to make it a dictionary?
You will want to make use of the json module and the gzip module in Python, both of which are part of the Python Standard Library.
The gzip module provides the GzipFile class, as well as the open(),
compress() and decompress() convenience functions. The GzipFile class
reads and writes gzip-format files, automatically compressing or
decompressing the data so that it looks like an ordinary file object.
To read the compressed file, you can call gzip.open().
Opening the file with the default rb mode returns a gzip.GzipFile object, from which you can obtain a bytes object by calling read().
Then, using json.loads(), you can convert the raw data into a usable Python object -- a dictionary.
The snippet below is a simple demonstration of this in action:
import gzip
import json
with gzip.open('gzipped_file.json.gz', 'rb') as f:
    raw_json = f.read()
data = json.loads(raw_json)
print(type(data))
# Prints <class 'dict'>
print(data)
# Prints {'originaltitle': 'Leasing Specialist - WPM Real Estate Management', 'workexperience ...
print(data['workexperiences'][0]['company'])
# Prints Home Properties
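Since the question mentions that the file is very big and describes its "first line", it may hold one JSON object per line rather than a single document. In that case json.loads() on the whole content would fail, and reading everything at once could use a lot of memory. Below is a minimal sketch for that situation; the one-object-per-line layout is an assumption about your file, and iter_records is just a name I made up.

```python
import gzip
import json


def iter_records(path):
    """Yield one dict per non-empty line of a gzip-compressed file (assumed JSON Lines)."""
    # 'rt' opens the file in text mode, so each line arrives as an already decoded str
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)
```

You would use it as for record in iter_records('gzipped_file.json.gz'): ..., which decompresses and parses one line at a time instead of loading the whole file.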
Let's say I have the following text:
Steps toward this goal include: Increasing efficiency of mobile networks, data centers, data transmission, and spectrum allocation Reducing the amount of data apps have to pull from networks through caching, compression, and futuristic technologies like peer-to-peer data transfer Making investments in accessibility profitable by educating people about the uses of data, creating business models that thrive when free data access is offered initially, and building out credit card infrastructure so carriers can move from pre-paid to post-paid models that facilitate investment If the plan works, mobile operators will gain more customers and invest more in accessibility; phone makers will see people wanting better devices; Internet providers will get to connect more people; and people will receive affordable Internet so they can join the knowledge economy and connect with the people they care about.
As you can tell by reading the text, these are multiple sentences (a list of points). How can I split this text into sentences? I've tried using python NLTK but no luck. Checking for uppercase letters won't work either, as it isn't very reliable.
Any ideas on how to solve this problem?
Thanks.
If I understood you correctly, this little snippet could help (note: tested on Python 2.7.5):
paragraph = 'Steps toward this goal include: Increasing efficiency of mobile networks, data centers, data transmission, and spectrum allocation Reducing the amount of data apps have to pull from networks through caching, compression, and futuristic technologies like peer-to-peer data transfer Making investments in accessibility profitable by educating people about the uses of data, creating business models that thrive when free data access is offered initially, and building out credit card infrastructure so carriers can move from pre-paid to post-paid models that facilitate investment If the plan works, mobile operators will gain more customers and invest more in accessibility; phone makers will see people wanting better devices; Internet providers will get to connect more people; and people will receive affordable Internet so they can join the knowledge economy and connect with the people they care about.'
words = []
separators = ['.',',',':',';']
oldValue = 0
for value in range(len(paragraph)):
    if paragraph[value] in separators:
        words.append(paragraph[oldValue:value+1])
        oldValue = value+2
for word in words:
    print(word)
[EDIT]
You could also easily add an uppercase-letter check with
if paragraph[value].isupper():
    words.append(paragraph[oldValue:value+1])
...
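For what it's worth, the separator loop can also be sketched with the re module. This is not a different technique, just the same "cut after . , : ;" idea in one expression; the function name split_after_separators is made up for illustration. Unlike the loop, it keeps a trailing piece that has no separator after it, and it does not split where a separator is not followed by whitespace.

```python
import re


def split_after_separators(text, separators=".,:;"):
    """Split text after each separator character, dropping the whitespace that follows it."""
    # Lookbehind: split on whitespace only when it comes right after a separator
    pattern = r"(?<=[" + re.escape(separators) + r"])\s+"
    return [chunk for chunk in re.split(pattern, text) if chunk]


for chunk in split_after_separators("Steps include: one, two; three. Done"):
    print(chunk)
```

Like the loop, this relies on punctuation only, so the list items in the original text that run together without any separator will still come out as one chunk.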
I am trying to fix my condition. If any forbidden keyword is found in string or string_2, the string should be skipped; if no forbidden keyword is found but a word from skills is found, it should be saved. However, the else part prints the result 10 times.
string = "opportunity: this opportunity would suit a budding hacker who is seeking a first step into a commercial role or a tester with 1-3 years of experience. this is a great opportunity to utilise your experience in penetration testing, vulnerability assessments and delivering outcomes while also expanding your knowledge and skillset. benefits: perform red team engagements excellent training & development budget attendance at local and international conferences responsibilities include: working with a diverse range of customers identify and solve security problems perform penetration testing and vulnerability assessments maintain and improve penetration testing and methodologies delivery of technical reports and documentation ideally you will have: ideally current security clearance or minimum australian citizenship certifications such as oscp, sans, crest highly regarded fluent with linux command line and windows powershell experience performing assessments on client networks ability to clearly communicate vulnerability details and risks for a confidential discussion about this opportunity or to discuss other opportunities within it security & risk please contact specialist infosec recruiter john smith on 0123 456 789 or email johnsmith#example.com. australian citizens only – ideally already with a security clearance. want to know more about me? connect with me on linkedin"
string_2 = "your new company this melbourne based consultancy boasts a unique depth and breadth of capabilities across cyber security, application security, data & analytics, cloud and digital transformations. they continue to deliver rich insight, innovative strategies and solutions that help their clients reach their potential. about the opportunity this is an outstanding opportunity to utilise your experience in penetration testing and vulnerability assessments. you will use your skills to prepare high quality reports detailing security issues, making recommendations and identifying solutions. the types of testing can include vulnerability assessment, penetration testing and application security assessment. what you’ll need to succeed passion, drive and enthusiasm! demonstrated experience performing internal and external penetration testing, web application penetration testing and mobile application penetration testing industry certifications such as sans, oscp, crest crt/cct or osce strong knowledge of common vulnerabilities such as owasp top 10 and sans top 25 scripting experience - javascript, objective c and python a very strong technical background and a passion for security the ability to think outside the box what you'll get in return our client is looking for an individual that is seeking longevity in their next role and in return offers the chance to join an equal opportunity employer that is passionate about diversity. also on offer is ongoing personal and professional development, providing you with the right tools and support to thrive. what you need to do now if you’re interested in this role, click ‘apply now’ or for more information and a confidential discussion on this role or any others within it security contact john smith at johnsmith#example.com"
forbidden = ['clearance','TS/SCI','4+ years','5+ years','6+ years','7+ years','8+ years','9+ years','10+ years','11+ years','12+ years']
skills = ['owasp']
for s_prefix in forbidden:
    if s_prefix in string:
        print(s_prefix)
    else:
        print("save it")
skill_match = [s_prefix for s_prefix in forbidden if s_prefix in string]
print(skill_match)
if len(skill_match) > 0 :
    pass
I get "save it" printed multiple times. Once clearance is found, the string should be marked as flagged; it should be saved only if it contains no red-flagged keyword and does contain a keyword from skills.
clearance
save it
save it
save it
save it
save it
save it
save it
save it
save it
save it
['clearance']
[Finished in 0.0s]
sample:
string = "snip active cleared snip..." # skip or remove because contains cleared
string2 = "snip owasp..... php , devops" # save it because contains owasp
If you only want one line to be printed in your for loop, you probably need to change its logic, since currently it always prints something on every iteration.
One approach might be to print and break out of the loop if you spot one of the strings you're searching for, and to attach the else clause to the for loop instead of to the if. An else on a loop gets run only if the loop ended normally, not if it was escaped early by a break:
for s_prefix in forbidden:
    if s_prefix in string:
        print(s_prefix)
        break
else:
    print("save it")
If you don't need to print the matching prefix string, you could also play around with any or all.
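Here is a rough sketch of what the any version might look like, folding in the skills check from the question. The function name classify and the three return values are my own choices, and lower-casing both sides is an assumption (your sample strings are lower-case while entries such as 'TS/SCI' are not).

```python
def classify(text, forbidden, skills):
    """Return 'skip' if a forbidden keyword occurs, 'save' if a skill occurs, else 'ignore'."""
    lowered = text.lower()
    # any() stops at the first match, so each check yields a single decision
    if any(word.lower() in lowered for word in forbidden):
        return "skip"
    if any(word.lower() in lowered for word in skills):
        return "save"
    return "ignore"


forbidden = ["clearance", "TS/SCI", "5+ years"]
skills = ["owasp"]
print(classify("ideally already with a security clearance", forbidden, skills))  # skip
print(classify("strong knowledge of owasp top 10", forbidden, skills))  # save
```

Because the forbidden check runs first, a text that contains both a forbidden keyword and a skill is skipped, which matches the rule described in the question.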