How do I join strings together in a list - Python

gene=[]
x=1
path = '/content/drive/MyDrive/abc.txt'
file = open(path,'r').readlines()
for i in file[x]:
    i.split('\t')[0]
    gene.append(i)
    x+=1
print(gene)
My output is:
['A', 'H', 'Y', '3', '9', '2', '7', '8', '\t', '1', '4', '5', '.', '5', '4', '4', '\t', '1', '3', '5', '.', '2', '4', '\t', '5', '6', '.', '5', '1', '3', '8']
but I want to split each line at '\t', take the first element (which is AHY39278), and append it to the list gene, like this:
['AHY39278', 'AHY39278']
Any idea how to join the elements together?

Add this code to your project:
string = ''
for i in range(len(gene)):
    string += gene[i]
You can then read your data from string.
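The single characters appear because `for i in file[x]` iterates over one line, character by character. A minimal sketch of what the question seems to be after, assuming the file is tab-separated and that its first line is a header to skip (suggested by `x=1`). The function name `first_fields` and the sample `lines` are made up for illustration:

```python
def first_fields(lines, skip=1):
    """Return the first tab-separated field of each line after the first `skip` lines."""
    genes = []
    for line in lines[skip:]:
        # Iterate over whole lines, not over the characters of a single line.
        field = line.rstrip("\n").split("\t")[0]
        genes.append(field)
    return genes


# Hypothetical file content; in the question it would come from open(path).readlines().
lines = [
    "id\tv1\tv2\tv3\n",
    "AHY39278\t145.544\t135.24\t56.5138\n",
    "AHY39278\t145.544\t135.24\t56.5138\n",
]
gene = first_fields(lines)
print(gene)
```

Because `split('\t')[0]` already returns the whole first field as one string, no joining step is needed afterwards.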

Related

List out of range whilst using recursion

I'm currently making Sudoku in my CompSci class. I'm using recursion to check my rows and make sure there are no duplicates (I'll do columns and 3x3 boxes later), but my index keeps going out of range. I know that means I'm indexing outside the list, but I can't figure out how.
Here is my recursive method for checking all of the rows. I pass in 0 as the initial row, so it goes from 0 up to 8, and when it reaches 8 it returns (essentially a break).
def winCheck(self, row):
    if row == 8:
        return ''
    for num in range (1,10): #for EACH values 1-9
        count = 0
        for j in range(9):
            if (self.board[row][j] == str(num)):
                count += 1
        if (count!=1): #MUST have a count of 1
            print("wrong!")
            return False
        elif count == 1:
            print("good!")
            row= row + 1
            game.winCheck(row)
    return True
Here is my list for my playing board and the original board (those can't be modified)
for i in range(9):
    self.board[i] = [' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ' ]
    self.origBoard[i] = [' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ' ]
# [[' ', '1', '9', '3', '7', '4', '6', '5', '2'],
# ['5', '7', '6', '1', '8', '2', '9', '4', '3'],
# ['3', '4', '2', '5', '9', '6', '7', '1', '8'],
# ['9', '2', '1', '7', '5', '3', '8', '6', '4'],
# ['6', '3', '8', '4', '1', '9', '5', '2', '7'],
# ['4', '5', '7', '6', '2', '8', '1', '3', '9'],
# ['1', '8', '5', '2', '3', '7', '4', '9', '6'],
# ['7', '6', '3', '9', '4', '1', '2', '8', '5'],
# ['2', '9', '4', '8', '6', '5', '3', '7', ' ']]
#raw data set for testing win conditions
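Change game.winCheck(row) to return game.winCheck(row). Without the return, the for loop carries on after the recursive call comes back and keeps incrementing row, so self.board[row] eventually indexes past the last row.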
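A minimal sketch of the same row check with the recursive result returned directly. The function name `rows_valid`, the standalone `board` argument, and the base case `row == len(board)` are assumptions made for illustration. Note that with `return` added, the original `row == 8` base case would stop before the last row (index 8) is examined.

```python
def rows_valid(board, row=0):
    """Return True if every row from `row` onward holds each digit 1-9 exactly once."""
    if row == len(board):  # walked past the last row: nothing failed
        return True
    for num in range(1, 10):
        if board[row].count(str(num)) != 1:
            return False
    # Return the recursive result so no code runs after the call comes back.
    return rows_valid(board, row + 1)


# The test board from the question, with the two blank cells filled in.
solved = [list(r) for r in (
    "819374652", "576182943", "342596718",
    "921753864", "638419527", "457628139",
    "185237496", "763941285", "294865371",
)]
print(rows_valid(solved))
```

Each call checks exactly one row and then hands off to the next, so `row` never changes inside the loop.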

All string list to a numpy float array

I am reading a CSV file with pandas, and one of its columns holds lists of shape (3, 3).
An example list is as follows:
[[45.70345721, -0.00014686, -1.679e-05], [-0.00012219, 45.70271889, 0.00012527], [-1.161e-05, 0.00013083, 45.70306778]]
I tried to convert this list to a numpy float array with np.array(arr).astype(np.float), but it gives the following error:
ValueError: could not convert string to float:
When I searched for the root cause, I found that the whole list is stored as a string. print [i for i in arr] gives the following, where every element is a single character:
['[', '[', '4', '5', '.', '7', '0', '3', '4', '5', '7', '2', '1', ',', ' ', '-', '0', '.', '0', '0', '0', '1', '4', '6', '8', '6', ',', ' ', '-', '1', '.', '6', '7', '9', 'e', '-', '0', '5', ']', ',', ' ', '[', '-', '0', '.', '0', '0', '0', '1', '2', '2', '1', '9', ',', ' ', '4', '5', '.', '7', '0', '2', '7', '1', '8', '8', '9', ',', ' ', '0', '.', '0', '0', '0', '1', '2', '5', '2', '7', ']', ',', ' ', '[', '-', '1', '.', '1', '6', '1', 'e', '-', '0', '5', ',', ' ', '0', '.', '0', '0', '0', '1', '3', '0', '8', '3', ',', ' ', '4', '5', '.', '7', '0', '3', '0', '6', '7', '7', '8', ']', ']']
How do I convert this list to a numpy float array?
EDIT
Here is a snap of a part of my data frame.
When loaded, the data frame is in the below format. df here is a small example data frame.
df = pd.DataFrame(columns=["e_total"], data=[[['[', '[', '4', '5', '.', '7', '0', '3', '4', '5', '7', '2', '1', ',', ' ', '-', '0', '.', '0', '0', '0', '1', '4', '6', '8', '6', ',', ' ', '-', '1', '.', '6', '7', '9', 'e', '-', '0', '5', ']', ',', ' ', '[', '-', '0', '.', '0', '0', '0', '1', '2', '2', '1', '9', ',', ' ', '4', '5', '.', '7', '0', '2', '7', '1', '8', '8', '9', ',', ' ', '0', '.', '0', '0', '0', '1', '2', '5', '2', '7', ']', ',', ' ', '[', '-', '1', '.', '1', '6', '1', 'e', '-', '0', '5', ',', ' ', '0', '.', '0', '0', '0', '1', '3', '0', '8', '3', ',', ' ', '4', '5', '.', '7', '0', '3', '0', '6', '7', '7', '8', ']', ']']]])
Could someone give it a try and help me convert this to a float array?
You can probably use eval() to turn the entire string into an actual list. eval() is generally not good to use, but in this case it might be your best bet.
What you listed as your "example" is not what is actually stored. You are showing the result of your print statement and list comprehension; the entry stored in that column is a string.
You should be able to simply take each entry and wrap it in eval:
eval(arr)
That should return a Python list of shape (3, 3). From there you can convert it to a numpy array and change the types as necessary.
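As an alternative to eval(), the standard library's `ast.literal_eval` parses only Python literals, which avoids executing arbitrary code from the CSV. The sketch below swaps it in for eval(); the helper name `parse_matrix` is hypothetical, and it assumes each cell is either the string itself or, as in the question's example frame, a list of its characters.

```python
import ast


def parse_matrix(cell):
    """Turn a stringified nested list (or a list of its characters) into nested floats."""
    text = cell if isinstance(cell, str) else "".join(cell)
    rows = ast.literal_eval(text)  # accepts literals only, unlike eval()
    return [[float(value) for value in row] for row in rows]


raw = ("[[45.70345721, -0.00014686, -1.679e-05], "
       "[-0.00012219, 45.70271889, 0.00012527], "
       "[-1.161e-05, 0.00013083, 45.70306778]]")
matrix = parse_matrix(list(raw))  # list(raw) mimics the character list from the question
print(matrix[0])
```

With NumPy available, `np.array(parse_matrix(cell), dtype=float)` should then give the (3, 3) float array, and something like `df["e_total"].apply(parse_matrix)` would apply it to the whole column.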
Aren't the numbers in the lists already floats? If so, making the list an np.array does what you are asking. You only need:
np.array(lst)
If the numbers are actually strings, as in the second part of your question, you will have to go through the list and convert each number individually, using either a nested loop or a nested list comprehension.
The loop looks like this:
for i in lst:
    for k, j in enumerate(i):
        i[k] = float(j)  # assign back into the row; rebinding j alone changes nothing
The list comprehension looks like this:
new_list = [[float(j) for j in i] for i in lst]

How to add each number to list in python?

I have the following variable:
number = "456367"
I need to add it to a list one digit at a time, like this:
list = [['4', '5', '6', '3', '6', '7']]
How can I achieve this?
Since a string is iterable, pass it to the list constructor:
>>> list("456367")
['4', '5', '6', '3', '6', '7']
This is also why you should not name the result list: doing so would shadow that convenient built-in function ;-)
If you really want [['4', '5', '6', '3', '6', '7']] vs just a list of characters:
>>> s="456367"
>>> list(map(list, zip(*s)))
[['4', '5', '6', '3', '6', '7']]
Or,
>>> [list(s)]
[['4', '5', '6', '3', '6', '7']]
But I am assuming with the first answer that the extra brackets were a typo.
>>> number = "456367"
>>> [char for char in number]
['4', '5', '6', '3', '6', '7']

Printing a variable and appending it with a line from a file in a loop

I've researched this already but can't find an exact answer. I've found answers for appending when two lists are involved, but this is different, so here goes.
I'm creating a really basic fuzzer, but I'm having trouble appending the directory names to the end of the address. Here's what I have so far.
The expected output is as follows:
www.website.com/1
www.website.com/2
www.website.com/3
www.website.com/4
etc. But I'm getting something completely different. Here's the first piece of code I tested.
>>> host = "www.website.com/"
>>> path = [line.strip() for line in open("C:/Users/Public/Documents/tester.txt", 'r')]
>>> print str(host) + str(path)
which returns the following
www.website.com/['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
The second attempt was this:
>>> host = "www.website.com/"
>>> path = [line.strip() for line in open("C:/Users/Public/Documents/tester.txt", 'r')]
>>> for line in path:
...     print str(host) + str(path)
which returned this:
www.website.com/['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
www.website.com/['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
www.website.com/['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
www.website.com/['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
www.website.com/['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
www.website.com/['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
www.website.com/['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
www.website.com/['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
www.website.com/['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
www.website.com/['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
I can see exactly what's happening and why, but I can't figure out how to arrive at the expected output. I'll also need to filter out the special characters, which I didn't think print would pick up. Maybe there are different rules for print when it's given a list.
I've thought of needlessly complex methods, such as counting the lines in the file and feeding that count into a while loop, but I'm sure there's something simpler I can use or something I've done wrong. My Python knowledge isn't fantastic.
Can anybody help with this?
You were nearly there. Inside the for loop, concatenate the current element of the list (the variable line in your case), not the complete list again.
Code:
for line in path:
    print str(host) + str(line)
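For Python 3, the same idea might look like the sketch below. The helper name `build_urls` and the in-memory `sample_lines` are assumptions standing in for the lines read from tester.txt; blank lines are skipped, which is also an assumption about what the file contains.

```python
def build_urls(host, names):
    """Join the host with each non-empty, stripped directory name."""
    urls = []
    for name in names:
        name = name.strip()
        if name:  # ignore blank lines
            urls.append(host + name)
    return urls


# Stand-in for the lines of tester.txt.
sample_lines = ["1\n", "2\n", "3\n", "4\n"]
for url in build_urls("www.website.com/", sample_lines):
    print(url)
```

With a real file, `names` could be the open file object itself, for example inside `with open(...) as handle:`, since iterating a file yields its lines.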

Python - group files by modified date [hour]

I'm using the following script to grab all the files in a directory, then filtering them based on their modified date.
dir = '/tmp/whatever'
dir_files = os.listdir(dir)
dir_files.sort(key=lambda x: os.stat(os.path.join(dir, x)).st_mtime)
files = []
for f in dir_files:
    t = os.path.getmtime(dir + '/' + f)
    c = os.path.getctime(dir + '/' + f)
    mod_time = datetime.datetime.fromtimestamp(t)
    created_time = datetime.datetime.fromtimestamp(c)
    if mod_time >= form.cleaned_data['start'].replace(tzinfo=None) and mod_time <= form.cleaned_data['end'].replace(tzinfo=None):
        files.append(f)
return by_hour
I need to go one step further and group the files by the hour in which they were modified. Does anyone know how to do this off the top of their head?
UPDATE: I'd like to have them in a dictionary ({date,hour,files})
UPDATED:
Thanks for all your replies! I tried using the response from David, but when I output the result it looks like this (i.e. it's breaking up the filenames):
defaultdict(<type 'list'>, {datetime.datetime(2013, 1, 9, 15, 0): ['2', '8', '-', '2', '0', '1', '3', '0', '1', '0', '9', '1', '5', '1', '8', '4', '3', '.', 'a', 'v', 'i', '2', '9', '-', '2', '0', '1', '3', '0', '1', '0', '9', '1', '5', '2', '0', '2', '4', '.', 'a', 'v', 'i', '3', '0', '-', '2', '0', '1', '3', '0', '1', '0', '9', '1', '5', '3', '8', '5', '9', '.', 'a', 'v', 'i', '3', '1', '-', '2', '0', '1', '3', '0', '1', '0', '9', '1', '5', '4', '1', '2', '4', '.', 'a', 'v', 'i', '3', '2', '-', '2', '0', '1', '3', '0', '1', '0', '9', '1', '5', '5', '3', '1', '0', '.', 'a', 'v', 'i', '3', '3', '-', '2', '0', '1', '3', '0', '1', '0', '9', '1', '5', '5', '5', '5', '8', '.', 'a', 'v', 'i'], datetime.datetime(2013, 1, 9, 19, 0): ['6', '1', '-', '2', '0', '1', '3', '0', '1', '0', '9', '1', '9', '0', '1', '1', '8', '.', 'a', 'v', 'i', '6', '2', '-', '2', '0', '1', '3', '0', '1', '0', '9', '1', '9', '0', '6', '3', '1', '.', 'a', 'v', 'i', '6', '3', '-', '2', '0', '1', '3', '0', '1', '0', '9', '1', '9', '1', '4', '1', '5', '.', 'a', 'v', 'i', '6', '4', '-', '2', '0', '1', '3', '0', '1', '0', '9', '1', '9', '2', '2', '3', '3', '.', 'a', 'v', 'i']})
I was hoping to get it to store the complete file names. Also how would I loop over it and grab the files in each hour and the hour they belong to?
I managed to sort the above out by just changing it to append. However it's not sorted from the oldest hour to the most recent.
Many thanks,
Ben
You can truncate a datetime object to the start of its hour with the line:
mod_hour = datetime.datetime(*mod_time.timetuple()[:4])
This works because mod_time.timetuple()[:4] returns a tuple like (2013, 1, 8, 21). You can then use a collections.defaultdict to keep a dictionary of lists:
import collections
by_hour = collections.defaultdict(list)
for f in dir_files:
    t = os.path.getmtime(dir + '/' + f)
    mod_time = datetime.datetime.fromtimestamp(t)
    mod_hour = datetime.datetime(*mod_time.timetuple()[:4])
    # for example, datetime.datetime(2013, 1, 8, 21, 0)
    by_hour[mod_hour].append(f)
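On the follow-up questions about looping over the result and getting the hours in order, here is a minimal sketch under a few assumptions: the function name `group_by_hour` is hypothetical, the hour is truncated with `datetime.replace` rather than the timetuple trick, and the demo builds throwaway files in a temporary directory with made-up timestamps. Iterating over `sorted(grouped)` visits the hours from oldest to newest, since a plain dict is not ordered by key.

```python
import collections
import datetime
import os
import tempfile


def group_by_hour(dirpath):
    """Map each modification hour to the list of file names modified within it."""
    by_hour = collections.defaultdict(list)
    for name in os.listdir(dirpath):
        mtime = os.path.getmtime(os.path.join(dirpath, name))
        mod_time = datetime.datetime.fromtimestamp(mtime)
        mod_hour = mod_time.replace(minute=0, second=0, microsecond=0)
        # append keeps the name whole; `+= name` would add its characters one by one
        by_hour[mod_hour].append(name)
    return by_hour


# Demo with made-up files and timestamps.
with tempfile.TemporaryDirectory() as demo_dir:
    stamps = {
        "a.avi": datetime.datetime(2013, 1, 9, 15, 18),
        "b.avi": datetime.datetime(2013, 1, 9, 15, 20),
        "c.avi": datetime.datetime(2013, 1, 9, 19, 1),
    }
    for name, when in stamps.items():
        path = os.path.join(demo_dir, name)
        open(path, "w").close()
        os.utime(path, (when.timestamp(), when.timestamp()))
    grouped = group_by_hour(demo_dir)

for hour in sorted(grouped):  # oldest hour first
    print(hour, sorted(grouped[hour]))
```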
import os, datetime, operator
dir = "Your_dir_path"
by_hour =sorted([(f,datetime.datetime.fromtimestamp(os.path.getmtime(os.path.join(dir , f)))) for f in os.listdir(dir)],key=operator.itemgetter(1), reverse=True)
The code above sorts the files by their full modification timestamp (year, month, day, hour, minute, second), newest first.
Building on David's excellent answer, you can use itertools.groupby to simplify the work a little bit:
import os, itertools, datetime
dir = '/tmp/whatever'
mtime = lambda f : datetime.datetime.fromtimestamp(os.path.getmtime(dir + '/' + f))
mtime_hour = lambda f: datetime.datetime(*mtime(f).timetuple()[:4])
dir_files = sorted(os.listdir(dir), key=mtime)
dir_files = filter(lambda f: datetime.datetime(2012,1,2,4) < mtime(f) < datetime.datetime(2012,12,1,4), dir_files)
by_hour = dict((k,list(v)) for k,v in itertools.groupby(dir_files, key=mtime_hour)) #python 2.6
#by_hour = {k:list(v) for k,v in itertools.groupby(dir_files, key=mtime_hour)} #python 2.7
Build entries lazily, use the UTC timezone, and read the modification time only once:
#!/usr/bin/env python
import os
from collections import defaultdict
from datetime import datetime
HOUR = 3600 # seconds in an hour
dirpath = "/path/to/dir"
start, end = datetime(...), datetime(...)
# get full paths for all entries in dirpath
entries = (os.path.join(dirpath, name) for name in os.listdir(dirpath))
# add modification time truncated to hour
def date_and_hour(path):
    return datetime.utcfromtimestamp(os.path.getmtime(path) // HOUR * HOUR)
entries = ((date_and_hour(path), path) for path in entries)
# filter by date range: [start, end)
entries = ((mtime, path) for mtime, path in entries if start <= mtime < end)
# group by hour
result = defaultdict(list)
for dt, path in entries:
    result[dt].append(path)
from pprint import pprint
pprint(dict(result))
