Append csv by row from two lists in Python?

I have two lists in Python: 'away' and 'home'. I want to append them to an already existing csv file so that I write a row containing only the first element of away, then a row with the first element of home, then the second element of away, then the second element of home, and so on, with empty spaces in between them, so it will look like this:
away1
home1
away2
home2
away3
home3
and so on. The away and home lists are always the same length, but that length might change from day to day. How can I do this?
Thanks

Looks like you just want the useful and flexible zip built-in.
>>> away = ["away1", "away2", "away3"]
>>> home = ["home1", "home2", "home3"]
>>> list(zip(away, home))
[('away1', 'home1'), ('away2', 'home2'), ('away3', 'home3')]

import csv

away = ["away1", "away2", "away3"]
home = ["home1", "home2", "home3"]

# record_list = [['away1', 'home1'], ['away2', 'home2'], ['away3', 'home3']]
record_list = [list(item) for item in zip(away, home)]
print(record_list)

with open("sample.csv", "a", newline="") as fp:
    writer = csv.writer(fp)
    writer.writerows(record_list)
You should use the writerows method to write multiple lists at a time, one list per row.
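If the single-column layout from the question is what is wanted (away1 on one row, then home1 on the next, and so on), one option is to flatten the zipped pairs before writing. A minimal sketch, assuming the same away and home lists as above:
import csv
from itertools import chain

away = ["away1", "away2", "away3"]
home = ["home1", "home2", "home3"]

# chain.from_iterable flattens the pairs into away1, home1, away2, home2, ...
interleaved = chain.from_iterable(zip(away, home))

with open("sample.csv", "a", newline="") as fp:
    writer = csv.writer(fp)
    # Wrap each value in a list so it becomes a one-column row
    writer.writerows([value] for value in interleaved)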

Related

Need to Get 4 URLs in Output After Removing Letter S but Getting Only the Last URL

The four URLs below contain the letter 's', and I need to remove it and print all four URLs, but the problem is that only the last site is printed, not all four.
Note: the language used is Python.
file1 = ['https:/www.google.com\n', 'https:/www.yahoo.com\n', 'https:/www.stackoverflow.com\n',
         'https:/www.pythonhow.com\n']

file1_remove_s = []
for line in file1:
    file1_remove_s = line.replace('s', '', 1)

print(file1_remove_s)
You are reassigning file1_remove_s from the list to each modified string, so only the last value is left. You want to use append instead:
file1 = ['https:/www.google.com\n', 'https:/www.yahoo.com\n', 'https:/www.stackoverflow.com\n',
         'https:/www.pythonhow.com\n']

file1_remove_s = []
for line in file1:
    file1_remove_s.append(line.replace('s', '', 1))

print(file1_remove_s)
You are keeping only the last item by assigning with the = operator. This is actually a perfect place to use a list comprehension, so your code could look like:
file1 = [file1_remove_s.replace('s','',1) for file1_remove_s in file1]
This builds a list of the formatted strings (with the first 's' removed) and, by assigning it to the name of the initial list, overwrites that list with a new one holding the texts in the format you want.
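For reference, this is what the comprehension produces for inputs like those above; only the first 's' in each string is removed, so 'https:' becomes 'http:':
>>> file1 = ['https:/www.google.com\n', 'https:/www.yahoo.com\n']
>>> [url.replace('s', '', 1) for url in file1]
['http:/www.google.com\n', 'http:/www.yahoo.com\n']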

Using Zip Function to Create Columns in CSV with non-identical lengths of data

I have a large number of files that are named according to a gradually more specific set of criteria.
Each part of the filename, separated by '_', relates to a drilled-down categorization of that file.
The naming convention looks like this:
TEAM_STRATEGY_ATTRIBUTION_TIMEFRAME_DATE_FILEVIEW
What I am trying to do is iterate through all these files and pull out how many different values occur for each part of the naming convention.
So essentially this is what I've done so far: I iterated through all the files and made a list of each name, then split each name on '_' and appended each part to its respective category list.
Now I'm trying to export them to a CSV file, separated by columns, and this is where I'm running into problems:
L = [teams, strategies, attributions, time_frames, dates, file_types]
columns = zip(*L)
list(columns)

with open(_outputfolder_, 'w') as f:
    writer = csv.writer(f)
    for column in columns:
        print(column)
This is a rough estimation of the list I'm getting out:
[{'TEAM1'},
{'STRATEGY1', 'STRATEGY2', 'STRATEGY3', 'STRATEGY4', 'STRATEGY5', 'STRATEGY6', 'STRATEGY7', 'STRATEGY8', 'STRATEGY9', 'STRATEGY10','STRATEGY11', 'STRATEGY12', 'STRATEGY13', 'STRATEGY14', 'STRATEGY15'},
{'ATTRIBUTION1','ATTRIBUTION1','Attribution3','Attribution4','Attribution5', 'Attribution6', 'Attribution7', 'Attribution8', 'Attribution9', 'Attribution10'},
{'TIME_FRAME1', 'TIME_FRAME2', 'TIME_FRAME3', 'TIME_FRAME4', 'TIME_FRAME5', 'TIME_FRAME6', 'TIME_FRAME7'},
{'DATE1'},
{'FILE_TYPE1', 'FILE_TYPE2'}]
What I want the final result to look like is something like:
Team1 STRATEGY1 ATTRIBUTION1 TIME_FRAME1 DATE1 FILE_TYPE1
STRATEGY2 ATTRIBUTION2 TIME_FRAME2 FILE_TYPE2
... ... ...
etc. etc. etc.
But only the first line actually gets stored in the CSV file.
Can anyone help me understand how to iterate past the first line? I'm sure this is happening because the Team category has only one option, but I don't want that to hold the rest back.
Referring to the post below, you have to transpose the result and then use it:
Python - Transposing a list (rows with different length) using numpy fails.
I have used natural sorting to sort the entries and padded the lists with blanks to get the expected outcome.
Natural sorting is slower for larger lists; you can also use third-party libraries, see:
Does Python have a built in function for string natural sort?
import re
import csv

def natural_sort(l):
    convert = lambda text: int(text) if text.isdigit() else text.lower()
    alphanum_key = lambda key: [convert(c) for c in re.split('([0-9]+)', key)]
    return sorted(l, key=alphanum_key)

# columns is the list of sets built from the file names (see the question)
res = [[] for _ in range(max(len(sl) for sl in columns))]
count = 0
for sl in columns:
    sorted_sl = natural_sort(sl)
    for x, res_sl in zip(sorted_sl, res):
        res_sl.append(x)

# Every row except the first gets a leading blank, since TEAM has only one value
for result in res:
    if count > 0:
        result.insert(0, '')
    count = count + 1

with open("test.csv", 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(res)
The columns should be converted into lists before being written to the csv file, and the writerows method can be leveraged to write multiple rows at once. You can find more information here: https://docs.python.org/2/library/csv.html
The resulting test.csv looks like this:
TEAM1,STRATEGY1,ATTRIBUTION1,TIME_FRAME1,DATE1,FILE_TYPE1
,STRATEGY2,Attribution3,TIME_FRAME2,FILE_TYPE2
,STRATEGY3,Attribution4,TIME_FRAME3
,STRATEGY4,Attribution5,TIME_FRAME4
,STRATEGY5,Attribution6,TIME_FRAME5
,STRATEGY6,Attribution7,TIME_FRAME6
,STRATEGY7,Attribution8,TIME_FRAME7
,STRATEGY8,Attribution9
,STRATEGY9,Attribution10
,STRATEGY10
,STRATEGY11
,STRATEGY12
,STRATEGY13
,STRATEGY14
,STRATEGY15
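If the natural-sort ordering is not essential, itertools.zip_longest gets the same padded, column-wise layout with less code, since it transposes columns of unequal length and fills the gaps. A minimal sketch, with shortened stand-in lists in place of the question's data:
import csv
from itertools import zip_longest

# Shortened stand-ins for the category lists from the question
teams = ['TEAM1']
strategies = ['STRATEGY1', 'STRATEGY2', 'STRATEGY3']
file_types = ['FILE_TYPE1', 'FILE_TYPE2']

columns = [teams, strategies, file_types]

# zip_longest pads the shorter columns with '' so every row has the same width
rows = zip_longest(*columns, fillvalue='')

with open('test.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(rows)

# test.csv now contains:
# TEAM1,STRATEGY1,FILE_TYPE1
# ,STRATEGY2,FILE_TYPE2
# ,STRATEGY3,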

Unable to write extracted text as individual rows in csv

This may be considered the second part of the question Finding an element within an element using Selenium Webdriver.
What I'm doing here is writing each text extracted from the table into a csv file.
Here is the code:
from selenium import webdriver
import os
import csv

chromeDriver = "/home/manoj/workspace2/RedTools/test/chromedriver"
os.environ["webdriver.chrome.driver"] = chromeDriver
driver = webdriver.Chrome(chromeDriver)
driver.get("https://www.betfair.com/exchange/football/coupon?id=2")
list2 = driver.find_elements_by_xpath('//*[@data-sportid="1"]')

couponlist = []
finallist = []
for game in list2[1:]:
    coup = game.find_element_by_css_selector('span.home-team').text
    print(coup)
    couponlist.append(coup)

print(couponlist)
print('its done')

outfile = open("./footballcoupons.csv", "wb")
writer = csv.writer(outfile)
writer.writerow(["Games"])
writer.writerows(couponlist)
Results of 3 print statements:
Santos Laguna
CSMS Iasi
AGF
Besiktas
Malmo FF
Sirius
FCSB
Eibar
Newcastle
Pescara
[u'Santos Laguna', u'CSMS Iasi', u'AGF', u'Besiktas', u'Malmo FF', u'Sirius', u'FCSB', u'Eibar', u'Newcastle', u'Pescara']
its done
Now, you can see the code where I write these values into the csv, but it ends up written weirdly into the csv (please see the snapshot). Can someone help me fix this, please?
According to the documentation, writerows takes as parameter a list of rows, and
A row must be an iterable of strings or numbers for Writer objects
You are passing a list of strings, so writerows iterates over your strings, making a row out of each character.
You could use a loop:
for team in couponlist:
    writer.writerow([team])
or turn your list into a list of lists, then use writerows :
couponlist = [[team] for team in couponlist]
writer.writerows(couponlist)
But anyway, there's no need to use csv if you only have one column...
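For completeness, a minimal sketch of the full writing step using the list-of-lists fix (Python 3 style, hence text mode with newline='' rather than 'wb'); the shortened couponlist here stands in for the scraped list above:
import csv

couponlist = ['Santos Laguna', 'CSMS Iasi', 'AGF']  # shortened stand-in data

with open('footballcoupons.csv', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    writer.writerow(['Games'])                       # header row
    writer.writerows([team] for team in couponlist)  # one team per row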

Modifying a txt file in Python 3

I am working on a school project to make a video club management program and I need some help. Here is what I am trying to do:
I have a txt file with the client data, in which there is this:
clientId:clientFirstName:clientLastName:clientPhoneNumber
The ':' is the separator in every data file.
And in the movie title data file I got this:
movieid:movieKindFlag:MovieName:MovieAvalaible:MovieRented:CopieInTotal
Where this is going: in the rentedData file there should be this:
idClient:IdMovie:DateOfReturn
I am able to do this part. Where I fail due to lack of experience:
I need to make a container with 3 levels for the movie data file, because I want to track the available and rented numbers (changing them when I rent a movie and when I return one).
The first level represents the whole file (calling it should print the whole file), the second level should hold each line in its own container, and the third one should hold every word of a line in a container.
Here is an example of what I mean:
dataMovie = [[[movie id],[movie title],[MovieAvailable],[MovieRented],[CopieInTotal]],[[movie id],[movie title],[MovieAvailable],[MovieRented],[CopieInTotal]]]
I actually know that I can do this for two layers in this way:
DataMovie = []
MovieInfo = open('Data_Movie', 'r')

# Reading the file and putting it into a container
for ligne in MovieInfo:
    print(ligne, end='')
    words = ligne.split(":")
    DataMovie.append(words)

print(DataMovie)
MovieInfo.close()
It separates all the words into this:
[[MovieID],[MovieTitle],[movie id],[movie title],[MovieAvailable],[MovieRented],[CopieInTotal], [MovieID],[MovieTitle],[movie id],[movie title],[MovieAvailable],[MovieRented],[CopieInTotal]]
Each line is in the same container (second layer), but the lines are not separated. That is not very helpful, since I need to change specific information (the quantity available and the number rented) so that a movie cannot be rented when all of its copies are already out.
I think you should be using dictionaries to store your data, rather than just embedding lists on top of one another.
Here is a quick page about dictionaries.
http://www.network-theory.co.uk/docs/pytut/Dictionaries.html
So your data might look like:
movieDictionary = {"movie_id": 234234, "movie title": "Iron Man", "MovieAvailable": True, "MovieRented": False, "CopieInTotal": 20}
Then when you want to retrieve a value,
movieDictionary["movie_id"]
would yield the value
234234
You can also embed lists inside of a dictionary value.
Does this help answer your question?
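Extending that idea a little, here is a minimal sketch (the field names and helper functions are assumptions, not part of the question) that keys the movies by id, so the available/rented counts can be updated when a movie goes out or comes back:
# Hypothetical in-memory structure: one dict per movie, keyed by movie id
movies = {
    1: {"kind": "action", "name": "Iron Man", "available": 2, "rented": 0, "total": 2},
    2: {"kind": "horror", "name": "Alien",    "available": 1, "rented": 1, "total": 2},
}

def rent_movie(movie_id):
    movie = movies[movie_id]
    if movie["available"] == 0:
        print("All copies of", movie["name"], "are already rented")
        return False
    movie["available"] -= 1
    movie["rented"] += 1
    return True

def return_movie(movie_id):
    movie = movies[movie_id]
    movie["available"] += 1
    movie["rented"] -= 1

rent_movie(1)  # 'available' drops to 1, 'rented' goes up to 1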
If you have to use a txt file, storing it in XML format might make the task easier, since there are already several good XML parsers for Python.
For example, ElementTree.
You could structure your data like this:
<?xml version="1.0"?>
<movies>
<movie id = "1">
<type>movieKind</type>
<name>firstmovie</name>
<MovieAvalaible>True</MovieAvalaible>
<MovieRented>False</MovieRented>
<CopieInTotal>2</CopieInTotal>
</movie>
<movie id = "2">
<type>movieKind</type>
<name>firstmovie2</name>
<MovieAvalaible>True</MovieAvalaible>
<MovieRented>False</MovieRented>
<CopieInTotal>3</CopieInTotal>
</movie>
</movies>
and then access and modify it like this:
import xml.etree.ElementTree as ET

tree = ET.parse('data.xml')
root = tree.getroot()

search = root.findall('.//movie[@id="2"]')
for element in search:
    rented = element.find('MovieRented')
    rented.text = "False"

tree.write('data.xml')
What you are actually doing is creating three databases:
one for clients
one for movies
one for rentals
A relatively easy way to read text files with one record per line and a : separator is to create a csv.reader object. For storing the databases into your program I would recommend using lists of collections.namedtuple objects for the clients and the rentals.
import csv
from collections import namedtuple

Rental = namedtuple('Rental', ['client', 'movie', 'returndate'])

with open('rentals.txt', newline='') as rentalsfile:
    rentalsreader = csv.reader(rentalsfile, delimiter=':')
    rentals = [Rental(int(row[0]), int(row[1]), row[2]) for row in rentalsreader]
And a list of dictionaries for the movies:
with open('movies.txt', newline='') as moviesfile:
    moviesreader = csv.reader(moviesfile, delimiter=':')
    movies = [{'id': int(row[0]), 'kind': row[1], 'name': row[2],
               'rented': int(row[3]), 'total': int(row[4])} for row in moviesreader]
The main reason for using a list of dictionaries for the movies is that a named tuple is a tuple and therefore immutable, and presumably you want to be able to change rented.
Referring to your comment on Daniel Rasmuson's answer: since you only put the values of the fields in the text files, you will have to hardcode the names of the fields into your program one way or another.
An alternative solution is to store the data in json files. Those are easily mapped to Python data structures.
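A minimal sketch of that alternative, assuming a hypothetical movies.json file and field names mirroring the ':'-separated layout from the question:
import json

# Hypothetical data mirroring movieid:kind:name:available:rented:total
movies = [
    {"id": 1, "kind": "action", "name": "firstmovie", "available": 2, "rented": 0, "total": 2},
]

# Write the data out ...
with open('movies.json', 'w') as f:
    json.dump(movies, f, indent=2)

# ... and read it straight back into plain Python lists and dicts
with open('movies.json') as f:
    movies = json.load(f)

movies[0]["rented"] += 1  # modify in memory, then dump again to save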
This might be what you were looking for:
# Using OrderedDict so we always get the items in the right order when iterating,
# so the values match up with the categories/headers
from collections import OrderedDict as Odict

class DataContainer(object):
    def __init__(self, fileName):
        '''
        Loads the text file into a list. The first line is assumed to be a header line
        and is used to set the dictionary keys. Using OrderedDict fixes the iteration
        order of the dict, so values match up with the headers again when written back.
        '''
        self.file = fileName
        self.data = []
        with open(self.file, 'r') as content:
            self.header = next(content).split('\n')[0].split(':')
            for line in content:
                words = line.split('\n')[0].split(':')
                self.data.append(Odict(zip(self.header, words)))

    def __call__(self):
        '''Outputs the contents as a string that can be written back to the file'''
        lines = []
        lines.append(':'.join(self.header))
        for i in self.data:
            this_line = ':'.join(i.values())
            lines.append(this_line)
        newContent = '\n'.join(lines)
        return newContent

    def __getitem__(self, index):
        '''Allows index access: self[index]'''
        return self.data[index]

    def __setitem__(self, index, value):
        '''Allows editing of values: self[index]'''
        self.data[index] = value

d = DataContainer('data.txt')
d[0]['MovieAvalaible'] = 'newValue'  # Example of how to set a value

# Will print out a string with the contents
print(d())

Writing a csv file in Python

So I have a list:
>>> print references
['Reference xxx-xxx-xxx-007 ', 'Reference xxx-xxx-xxx-001 ', 'Reference xxx-xxx-xxxx-00-398 ', 'Reference xxx-xxx-xxxx-00-399']
(The list is much longer than that.)
I need to write a CSV file which would look like this:
Column 1:
Reference xxx-xxx-xxx-007
Reference xxx-xxx-xxx-001
[...]
I tried this:
c = csv.writer(open("file.csv", 'w'))
for item in references:
    c.writerows(item)
Or:
for i in range(0, len(references)):
    c.writerow(references[i])
But when I open the csv file created, I get a window asking me to choose the delimiter.
No matter what, I get something like:
R,e,f,e,r,e,n,c,es
writerows takes a sequence of rows, each of which is a sequence of columns, and writes them out.
But you only have a single list of values. So, you want:
for item in references:
    c.writerow([item])
Or, if you want a one-liner:
c.writerows([item] for item in references)
The point is, each row has to be a sequence; as it is, each row is just a single string.
So, why are you getting R,e,f,e,r,e,n,c,e,… instead of an error? Well, a string is a sequence of characters (each of which is itself a string). So, if you try to treat "Reference" as a sequence, it's the same as ['R', 'e', 'f', 'e', 'r', 'e', 'n', 'c', 'e'].
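You can see the same thing directly in the interpreter:
>>> list("Reference")
['R', 'e', 'f', 'e', 'r', 'e', 'n', 'c', 'e']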
In a comment, you asked:
Now what if I want to write something in the second column ?
Well, then each row has to be a list of two items. For example, let's say you had this:
references = ['Reference xxx-xxx-xxx-007 ', 'Reference xxx-xxx-xxx-001 ']
descriptions = ['shiny thingy', 'dull thingy']
You could do this:
c.writerows(zip(references, descriptions))
Or, if you had this:
references = ['Reference xxx-xxx-xxx-007 ', 'Reference xxx-xxx-xxx-001 ', 'Reference xxx-xxx-xxx-001 ']
descriptions = {'Reference xxx-xxx-xxx-007 ': 'shiny thingy',
                'Reference xxx-xxx-xxx-001 ': 'dull thingy'}
You could do this:
c.writerows((reference, descriptions[reference]) for reference in references)
The key is, find a way to create that list of lists—if you can't figure it out all in your head, you can print all the intermediate steps to see what they look like—and then you can call writerows. If you can only figure out how to create each single row one at a time, use a loop and call writerow on each row.
But what if you get the first column values, and then later get the second column values?
Well, you can't add a column to a CSV; you can only write by row, not column by column. But there are a few ways around that.
First, you can just write the table in transposed order:
c.writerow(references)
c.writerow(descriptions)
Then, after you import it into Excel, just transpose it back.
Second, instead of writing the values as you get them, gather them up into a list, and write everything at the end. Something like this:
rows = [[item] for item in references]
# now rows is a 1-column table

# ... later
for i, description in enumerate(descriptions):
    rows[i].append(description)
# and now rows is a 2-column table

c.writerows(rows)
If worst comes to worst, you can always write the CSV, then read it back and write a new one to add the column:
with open('temp.csv', 'w') as temp:
    writer = csv.writer(temp)
    # write out the references

# later
with open('temp.csv') as temp, open('real.csv', 'w') as f:
    reader = csv.reader(temp)
    writer = csv.writer(f)
    writer.writerows(row + [description] for (row, description) in zip(reader, descriptions))
writerow writes the elements of an iterable in different columns. This means that if you provide a tuple, each element will go in one column. If you provide a string, each letter will go in one column. If you want all the content in the same column, do the following:
c = csv.writer(open("file.csv", 'wb'))
c.writerows([item] for item in references)
or
for item in references:
    c.writerow([item])
c = csv.writer(open("file.csv", 'w'))
c.writerows(["Reference"])
# cat file.csv
R,e,f,e,r,e,n,c,e
but
c = csv.writer(open("file.csv", 'w'))
c.writerow(["Reference"])
# cat file.csv
Reference
Would work as others have said.
My original answer was flawed due to confusing writerow and writerows.
