I have a big text file and I want to read it in Python and save the data in lists.
The structure of the file looks like this:
[{"address":"office1","id":"3311"},{"address":"office2","id":"3322"}]
[{"address":"office3","id":"3312"},{"address":"office4","id":"3323"}]
I want to save the first line in one list and the second line in a different list. Can you please explain how to do it?
file.txt
[{"address":"office1","id":"3311"},{"address":"office2","id":"3322"}]
[{"address":"office3","id":"3312"},{"address":"office4","id":"3323"}]
code:
import ast

lists = []
for line in open('file.txt'):
    # Each line is a valid Python literal, so literal_eval parses it safely
    lists.append(ast.literal_eval(line.strip()))
>>> lists
[[{'id': '3311', 'address': 'office1'}, {'id': '3322', 'address': 'office2'}], [{'id': '3312', 'address': 'office3'}, {'id': '3323', 'address': 'office4'}]]
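Since each line shown above is also valid JSON, json.loads is an equally workable alternative (a sketch assuming the file matches the structure above):

import json

lists = []
with open('file.txt') as f:
    for line in f:
        # Each line is a JSON array of objects
        lists.append(json.loads(line))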
I have a soup response text with multiple groups and sub-groups, and I want to automatically get all groups and their values. How can I do it?
In the end, I want to get the title and the value for each group, ideally with each group's values kept separate.
OrderedDict([('#id', 'boic'),
             ('mc:id', 'boic'),
             ('mc:ocb-conditions',
              OrderedDict([('mc:rule-deactivated', 'true'),
                           ('mc:international', 'true')])),
             ('mc:cb-actions',
              OrderedDict([('mc:allow', 'false')]))])
My goal is to get to a state where I get the following output:
'#id','boic'
'mc:id','boic'
'mc:ocb-conditions'
'mc:rule-deactivated','true'
'mc:international', 'true'
'mc:cb-actions'
'mc:allow','false'
I tried to use
' '.join(BeautifulSoup(soup_response, "html.parser").findAll(text=True))
and got all the values, but I'm missing the titles of the values.
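Since the parsed data is already a nested OrderedDict (as in the dump above), one option is to walk it recursively instead of going through BeautifulSoup. A minimal sketch, assuming the dict is named parsed and that nesting only involves dicts:

from collections import OrderedDict

def walk(d, indent=0):
    # Print each key; recurse into nested dicts, otherwise print key and value
    for key, value in d.items():
        if isinstance(value, dict):  # OrderedDict is a dict subclass
            print(' ' * indent + repr(key))
            walk(value, indent + 2)
        else:
            print(' ' * indent + f'{key!r}, {value!r}')

parsed = OrderedDict([('#id', 'boic'),
                      ('mc:id', 'boic'),
                      ('mc:ocb-conditions',
                       OrderedDict([('mc:rule-deactivated', 'true'),
                                    ('mc:international', 'true')])),
                      ('mc:cb-actions',
                       OrderedDict([('mc:allow', 'false')]))])
walk(parsed)

This prints every title, with a group's values indented beneath it, matching the desired output above.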
I am trying to validate the header column names of input.csv against an existing schema_info.csv file.
input.csv
emp_id,emp_name,salary
1,siva,1000
2,ravi,200
3,kiran,800
schema_info.csv
file_name,column_name,column_sequence
input.csv,EMP_ID,1
input.csv,EMP_NAME,2
input.csv,SALARY,3
I tried to read the header and compare the input.csv header column names and their sequence against the schema info data, but I am unable to get the sequence order from the input file header and unable to compare it with the schema file data. Any suggestions?
input = sc.textFile("examples/src/main/resources/people.txt")
# first() returns a plain string, so split it directly instead of calling .map() on it
header = input.first()
# Pair each column name with its 1-based position in the header
header_data = [(name.strip(), pos + 1) for pos, name in enumerate(header.split(","))]
schema_info = spark.read.option("header", "true").option("inferSchema", "true").csv("/schema_info.csv")
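From there, the comparison can be done by checking the header positions against the schema rows. A minimal sketch, assuming the variables above and that a case-insensitive match on name plus position is sufficient:

# Collect the schema rows for this file as (column_name, column_sequence) pairs
expected = [(row['column_name'].lower(), row['column_sequence'])
            for row in schema_info.where(schema_info.file_name == 'input.csv').collect()]

# Normalize the actual header the same way
actual = [(name.lower(), pos) for name, pos in header_data]

if sorted(actual) == sorted(expected):
    print('header matches schema_info')
else:
    print('header mismatch:', set(actual) ^ set(expected))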
I wanted to scrape some data from a JSON response. Here is the link.
I need the values in lessonTypes, and I want to export all the values separated by commas.
So Theorieopleidingen has 4 values, Beroepsopleidingen has 8, and so on.
I want to scrape dynamically, so that even if the number of values changes, it always scrapes all of them, comma-separated.
Sorry if my explanation is weak.
Since it's a JSON object, why not just use requests and do whatever you want with the data?
For example:
import requests

url = "https://www.cbr.nl/web/show?id=289168&langid=43&channel=json&cachetimeout=-1&elementHolder=289170&ssiObjectClassName=nl.gx.webmanager.cms.layout.PagePart&ssiObjectId=285674&contentid=3780&examtype=B"

for value in requests.get(url).json()['lessonTypes'].values():
    print(value)
Output:
['Motor', 'Auto', 'Bromfiets', 'Tractor']
['Bus', 'Aanhangwagen achter bus', 'Vrachtauto', 'Aanhangwagen achter vrachtauto', 'Heftruck', 'ADR', 'Taxi', 'Tractor']
['Aangepaste auto', 'Automaat personenauto']
['Motor', 'Auto', 'Aanhangwagen achter auto', 'Bromfiets', 'Brommobiel']
EDIT:
To access individual keys and their values, you might for example try this:
import requests
url = "https://www.cbr.nl/web/show?id=289168&langid=43&channel=json&cachetimeout=-1&elementHolder=289170&ssiObjectClassName=nl.gx.webmanager.cms.layout.PagePart&ssiObjectId=285674&contentid=3780&examtype=B"
lesson_types = requests.get(url).json()['lessonTypes']
print(list(lesson_types.keys()))
print("\n".join(lesson_types['Theorieopleidingen']))
Output:
['Theorieopleidingen', 'Beroepsopleidingen', 'Bijzonderheden', 'Praktijkopleidingen']
Motor
Auto
Bromfiets
Tractor
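And since the goal was one comma-separated line per group regardless of how many values it has, the two ideas combine naturally. A short sketch along the same lines:

import requests

url = "https://www.cbr.nl/web/show?id=289168&langid=43&channel=json&cachetimeout=-1&elementHolder=289170&ssiObjectClassName=nl.gx.webmanager.cms.layout.PagePart&ssiObjectId=285674&contentid=3780&examtype=B"

# One line per group: the group title, then all its values joined by commas
for title, values in requests.get(url).json()['lessonTypes'].items():
    print(f"{title}: {', '.join(values)}")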
data = [{'name':'Albert','rel':'Head','unique_number': 101},
{'name':'Sheen','rel':'Head','unique_number': 201},
{'name':'Peter','rel':'Son','unique_number': 101},
{'name':'Chloe','rel':'Daughter','unique_number': 101}]
Can you help me get the data into the following shape, grouped on unique_number?
updated_data = [
{'house_head':'Albert','members':['Peter','Chloe']},
{'house_head':'Sheen','members':[]}
]
The following should work:
# Collect the distinct unique_numbers
numbers = {i['unique_number'] for i in data}
# Map each unique_number to its head and its members
households = {n: {'Head': '', 'members': []} for n in numbers}
for i in data:
    if i['rel'] == 'Head':
        households[i['unique_number']]['Head'] = i['name']
    else:
        households[i['unique_number']]['members'].append(i['name'])
new_data = [{'house_head': v['Head'], 'members': v['members']} for v in households.values()]
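With the sample data, new_data comes out in exactly the requested shape (the order of the two entries depends on set iteration):
[{'house_head': 'Albert', 'members': ['Peter', 'Chloe']}, {'house_head': 'Sheen', 'members': []}]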
I used pandas to read a lot of datasets from Bloomberg.
When I tested the reading program, I noticed that pandas wasn't reading all the rows; it skipped some of them.
The code is the following:
import pandas as pnd

def data_read(data_files):
    data = {}
    # Read all data and add it to a dictionary: filename -> content
    for file in data_files:
        file_key = file.split('/')[-1][:-5]
        data[file_key] = {}
        # For each sheet, add data: sheet -> content
        # (data_to_take is a list of sheet names defined elsewhere)
        for sheet_key in data_to_take:
            data[file_key][sheet_key] = pnd.read_excel(file, sheet_name=sheet_key)
    return data
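A quick way to check whether rows are really being dropped is to compare the normal read against a raw, header-less read of the same sheet. A diagnostic sketch with hypothetical file and sheet names (skipped leading rows often come from header detection or blank lines):

import pandas as pnd

# Hypothetical file/sheet names for illustration
df = pnd.read_excel('dataset.xlsx', sheet_name='Sheet1')
raw = pnd.read_excel('dataset.xlsx', sheet_name='Sheet1', header=None)

# If raw has more rows than df (beyond the single header row),
# some rows are being consumed by header or blank-line handling
print(len(df), 'rows parsed vs', len(raw), 'raw rows')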