Creating array from CSV in Python while limiting columns used - python

I am working with a CSV file with the following format,
ST 1 2 3 4
WA 10 10 5 2
OR 0 7 3 9
CA 11 5 4 12
AZ -999 0 0 11
The first row gives the day numbers 1-4. I want to take the data for each state, for example WA, 10, 10, 5, 2, and create a sorted array containing just the numbers in that row. If I omit the first element, which is WA, I can do this using:
sorted(list, key=int)
Doing so would give me a list, [2,5,10,10].
What I want to do is
Read each line of the CSV.
Create an array of numbers using the numerical data.
Run some calculations using the array (percent rank).
Combine the calculated values with the correct state fields. For instance if I want to add a value of 3 to the array for WA.
b.insert(list[4]), 3)
to get
[2,3,5,10,10]
so I can calculate rank. (Note: I am unable to use scipy so I must calculate rank using a function which I've already figured out.)
End by writing State and rank value to new csv, something like.
ST Rank
WA 30
CA 26
OR 55
where Rank is the rank of the given value in the array.
I am pretty new to Python, so any help or pointers would be greatly appreciated. I am also limited to using basic Python modules (numpy, csv, etc.).
UPDATE CODE:
with open(outputDir+"needy.csv", 'rb') as f:
    first = {row[0]: sorted(row[1:], key=int) for row in list(csv.reader(f))}
for key, value in first.items():
    if addn in first:
        g= "yes"
        print key, addn, g
        #print d
    else:
        g= "no"
        print key, addn, g
    value.append(300)
    value.append(22)
    value = sorted(value, key=int)
    print "State:", key, value
When I do this, the values I append are properly added and the dict is properly sorted, but when I define addn as a value, it is not found. Example below.
{'WA': ['1', '1', '1', '2', '2', '2', '3', '4', '4', '4', '5', '5', '5', '5', '6', '6', '7', '7', '8', '8', '8', '8', '9', '10', '10', '10', '10', '11', '11']}
The above line is what I get if I simply print out first.
If I use the for loop and define addn as 11 as a global variable, I get:
WA 11 no
State: WA ['1', '1', '1', '2', '2', '2', '3', '4', '4', '4', '5', '5', '5', '5', '6', '6', '7', '7', '8', '8', '8', '8', '9', '10', '10', '10', '10', '11', '11',..]
Since 11 is among the values for that key, it should print yes.

You can use simple commands and a dictionary to organize your data:
fid = open('out.txt')  # Just copy what you put in your question into a file.
l = fid.readlines()  # Read the whole file into a list of lines.
d = {}  # Create a dictionary.
for i in l:
    s = i.split()  # Split the line on whitespace (the default).
    d[s[0]] = [int(s[j]) for j in range(1, len(s))]  # List comprehension that converts the strings into a list of integers.
print(d)
The result is:
{'CA': [11, 5, 4, 12], 'ST': [1, 2, 3, 4], 'OR': [0, 7, 3, 9], 'WA': [10, 10, 5, 2], 'AZ': [-999, 0, 0, 11]}
From this point you can do whatever you wish to your entries in the dictionary including append.
d['CA'].append(3)
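As a rough sketch of the remaining steps from the question (insert a value, rank it, write the result), something like the following might work using only the standard library. The percent_rank function here is a hypothetical stand-in for the rank function the asker already has (it returns the percentage of values strictly below x), and the data is inlined instead of read from out.txt so the example is self-contained.

```python
import bisect
import csv
import io

def percent_rank(sorted_values, x):
    """Hypothetical rank: percentage of values strictly below x."""
    below = bisect.bisect_left(sorted_values, x)
    return 100.0 * below / len(sorted_values)

# Same layout as the file in the question (whitespace separated).
raw = """ST 1 2 3 4
WA 10 10 5 2
OR 0 7 3 9
CA 11 5 4 12
AZ -999 0 0 11
"""

rows = [line.split() for line in raw.splitlines()]
# Skip the header row and keep each state's numbers as sorted integers.
data = {r[0]: sorted(int(v) for v in r[1:]) for r in rows[1:]}

new_value = 3  # the value to insert and rank, as in the WA example
out = io.StringIO()  # stands in for a real output file
writer = csv.writer(out, delimiter=" ", lineterminator="\n")
writer.writerow(["ST", "Rank"])
for state, values in data.items():
    bisect.insort(values, new_value)  # insert while keeping the list sorted
    writer.writerow([state, round(percent_rank(values, new_value), 1)])

print(out.getvalue())
```

bisect.insort keeps the list sorted on insertion, so there is no need to append and re-sort each time.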
EDIT: @J.R.W., building the dictionary the way I recommended, followed by your code (plus the correction I gave):
fid = open('out.txt')  # Just copy what you put in your question into a file.
l = fid.readlines()  # Read the whole file into a list of lines.
first = {}  # Create a dictionary.
for i in l:
    s = i.split()  # Split the line on whitespace (the default).
    first[s[0]] = [int(s[j]) for j in range(1, len(s))]  # List comprehension that converts the strings into a list of integers.
print(first)
addn = 11
for key, value in first.items():
    if addn in value:
        g= "yes"
        print(key, addn, g)
        #print d
    else:
        g= "no"
        print(key, addn, g)
    value.append(300)
    value.append(22)
    value = sorted(value, key=int)
    print("State:", key, value)
This results in:
{'ST': [1, 2, 3, 4], 'CA': [11, 5, 4, 12], 'OR': [0, 7, 3, 9], 'AZ': [-999, 0, 0, 11], 'WA': [10, 10, 5, 2]}
ST 11 no
State: ST [1, 2, 3, 4, 22, 300]
CA 11 yes
State: CA [4, 5, 11, 12, 22, 300]
OR 11 no
State: OR [0, 3, 7, 9, 22, 300]
AZ 11 yes
State: AZ [-999, 0, 0, 11, 22, 300]
WA 11 no
State: WA [2, 5, 10, 10, 22, 300]
This prints yes when 11 exists (your own test) and no when it doesn't.
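To spell out why the original test always printed no, here is a small illustration with made-up data in the same shape as the question's dict. Two things are going on: `in` on a dict checks the keys, and csv.reader yields strings, so an int never matches them.

```python
# Values as csv.reader would give them: strings, not numbers.
first = {"CA": ["11", "5", "4", "12"]}
addn = 11

in_dict = addn in first            # tests the keys ("CA"), not the numbers
in_str_list = addn in first["CA"]  # compares the int 11 with the string "11"
as_text = str(addn) in first["CA"]

# Converting the values to int makes the integer test work.
numbers = {key: [int(v) for v in value] for key, value in first.items()}
in_int_list = addn in numbers["CA"]

print(in_dict, in_str_list, as_text, in_int_list)
```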

Related

how do i convert results to list python

I have this code that takes two columns of a CSV file, which I've converted into lists, and does a < and > comparison between the numbers in them. Now I want to collect the results of this comparison into multiple lists, displaying six results per list.
e.g. I get
1
2
3
4
5
6
7
8
9
10
11
12
and i want to display this as
[1,2,3,4,5,6]
[7,8,9,10,11,12]
this is the code I'm using for comparing the lists
'''
for i in range(len(fsa)):
    if fsa[i] < ghf[i]:
        print('1')
    else:
        print('0')
'''
the code that's not working which is the one for showing results in an intervalled list format is this one
'''
print()
start = 0
end = len(''' i want the length of my results from the previous code, the 1's and 0's here. ''')
for x in range(start,end,6):
    print('''i want the results here as my list'''[x:x+6])
'''
I'm a beginner, please help, how do i make the results a list?
I got the answer I wanted. In case someone else was struggling with this as well, here's my solution:
'''
kol = []
for i in range(len(fsa)):
    if fsa[i] < ghf[i]:
        kol.append('1')
    else:
        kol.append('0')
start = 0
end = len(fsa)
for x in range(start,end,6):
    print(kol[x:x+6])
'''
outcome
'''
['1', '1', '0', '0', '1', '1']
['1', '0', '0', '1', '0', '0']
['0', '0', '0', '0', '1', '1']
['1', '1', '1', '0', '1', '1']
'''
You just need to make a new list and append to it instead of printing.
...
...
temp = []
for x in range(start,end,6):
    temp.append(fsa[x:x+6])
print(temp)
#[[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]]
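A more compact variant of the same idea, as a sketch: build the comparison results with zip and a list comprehension, then slice them into groups of six. The contents of fsa and ghf below are made-up sample values, since the original CSV columns are not shown.

```python
# Hypothetical sample columns; the real ones come from the CSV file.
fsa = [3, 8, 1, 9, 4, 7, 2, 6, 5, 0, 3, 8]
ghf = [5, 2, 6, 9, 8, 1, 7, 3, 5, 4, 9, 0]

# 1 where the fsa value is smaller than the ghf value, otherwise 0.
flags = [1 if a < b else 0 for a, b in zip(fsa, ghf)]

# Split the results into consecutive lists of six.
chunks = [flags[i:i + 6] for i in range(0, len(flags), 6)]

for chunk in chunks:
    print(chunk)
```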

Select either Max or Min value from each of combined files in Python

# File 1
Column = ['1', '2', '3']
# File 2
Column = ['-2', '-6', '-7', '-6', '-7']
# File 3
Column=['0', '3', '4', '6', '5']
# File 4
Column = ['-1', '-2', '-3', '-3', '-3']
# Combined files
Column = ['1', '2', '3', '-2', '-6', '-7', '-6', '-7', '0', '3', '4', '6', '5', '-1', '-2', '-3', '-3', '-3']
Guys, I want to select either max or min value from each file in the combined files.
Expected output:
Column = ['3', '-7', '6', '-3']
Any help will be appreciated!
I think you are asking for the value with the largest absolute value in each column. Try the code below:
Column1 = [1, 2, 3]
Column2 = [-2, -6, -7, -6, -7]
Column3 = [0, 3, 4, 6, 5]
Column4 = [-1, -2, -3, -3, -3]
print(max(Column1, key=abs))
print(max(Column2, key=abs))
print(max(Column3, key=abs))
print(max(Column4, key=abs))
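The same approach can be applied directly to the string lists from the question, as a sketch. It assumes, like the answer above, that "max or min" means the entry with the largest magnitude, and it keeps the per-file lists separate (the name files is made up for this example).

```python
# One list per file, still as strings, exactly as in the question.
files = [
    ['1', '2', '3'],
    ['-2', '-6', '-7', '-6', '-7'],
    ['0', '3', '4', '6', '5'],
    ['-1', '-2', '-3', '-3', '-3'],
]

# Compare by absolute integer value but keep the original string.
column = [max(values, key=lambda s: abs(int(s))) for values in files]
print(column)  # ['3', '-7', '6', '-3']
```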
Within your lists are strings and not integers so you should first convert them into integers:
--> https://www.geeksforgeeks.org/python-converting-all-strings-in-list-to-integers/
It's the same as asking a person "What's the biggest value of apples, oranges, pears".
After that what you simply do is use the max and min function within python.
Column = [1, 2, 3]
print(max(Column))
--> 3
print(min(Column))
--> 1
I hope I could help a little bit. :)
Use this method:
import random
column=[sorted(column1)[random.randint(-1,0)]]
Use one of these.
This method first sorts each list, then picks index -1 (the max) or 0 (the min) at random:
column=[]
column.append(sorted(column1)[random.randint(-1,0)])
column.append(sorted(column2)[random.randint(-1,0)])
column.append(sorted(column3)[random.randint(-1,0)])
column.append(sorted(column4)[random.randint(-1,0)])
column.append(sorted(column5)[random.randint(-1,0)])
This one uses the random.choice function, which takes a single sequence:
column=[]
column.append(random.choice([max(column1),min(column1)]))
column.append(random.choice([max(column2),min(column2)]))
column.append(random.choice([max(column3),min(column3)]))
column.append(random.choice([max(column4),min(column4)]))
column.append(random.choice([max(column5),min(column5)]))

Algorithm to divide key value pairs into n groups of sum y or smaller

I have a key-value pair like this:
'NANOUSDT.csv.gz': 15,
'ENJUSDT.csv.gz': 19,
'DGBBTC.csv.gz': 0,
'BTSUSDT.csv.gz': 1,
'BLZBTC.csv.gz': 42,
'BANDUSDT.csv.gz': 14,
'ETCUSDT.csv.gz': 202
It contains around 300 items. Some are big, some are small. I want to create a list of lists of keys with the following conditions:
The sum of the values in a list cannot exceed 10,000
A list cannot contain more than 8 elements
No single item has a size over 10,000.
How can I achieve this?
Here's an example:
kvs = {'1': 9999,
'2': 19,
'3': 0,
'4': 1,
'5': 42,
'6': 14,
'7': 14,
'8': 14,
'9': 14,
'10': 14,
'11': 10000}
buckets = []
cur_bucket = []
weight_so_far = 0
for k, v in kvs.items():
    if len(cur_bucket) == 8 or weight_so_far + v > 10000:
        buckets.append(cur_bucket)
        cur_bucket = []
        weight_so_far = 0
    weight_so_far += v
    cur_bucket.append(k)
if cur_bucket:
    buckets.append(cur_bucket)
print(buckets)
print(buckets)
Output
[['1'], ['2', '3', '4', '5', '6', '7', '8', '9'], ['10'], ['11']]
Is this what you're looking for?
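If fewer groups matter, a different heuristic, first-fit decreasing, may pack tighter than the single pass above. This is only a sketch of that alternative, not the method from the answer: it sorts the items from largest to smallest and puts each one into the first group that still has room. The function name and parameters are made up for this example.

```python
def pack_first_fit_decreasing(sizes, max_weight=10000, max_items=8):
    """Group keys so each group has at most max_items keys and a
    total size of at most max_weight (first-fit decreasing heuristic)."""
    buckets = []  # each entry is [list_of_keys, total_size]
    by_size = sorted(sizes.items(), key=lambda kv: kv[1], reverse=True)
    for key, size in by_size:
        for bucket in buckets:
            if len(bucket[0]) < max_items and bucket[1] + size <= max_weight:
                bucket[0].append(key)
                bucket[1] += size
                break
        else:
            # No existing group has room, so start a new one.
            buckets.append([[key], size])
    return [keys for keys, _ in buckets]

kvs = {'1': 9999, '2': 19, '3': 0, '4': 1, '5': 42, '6': 14,
       '7': 14, '8': 14, '9': 14, '10': 14, '11': 10000}
groups = pack_first_fit_decreasing(kvs)
print(groups)
```

On the example data this yields three groups instead of four. It is still a heuristic, so it does not guarantee the minimum number of groups.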

Convert int values of an array into a string? [duplicate]

This question already has answers here:
Converting int arrays to string arrays in numpy without truncation
(6 answers)
Closed 4 years ago.
I have the following array
([[ 1, 1, 2, 2],
[ 1, 1, 3, 3],
[ 1, 1, 4, 4]])
I want to convert values from int to str, like this:
([[ '1', '1', '2', '2'],
[ '1', '1', '3', '3'],
[ '1', '1', '4', '4']])
How can I do this?
With arr being your array of ints, you can do it with a one-liner:
list(map(lambda i: list(map(str,i)), arr))
The result:
[['1', '1', '2', '2'], ['1', '1', '3', '3'], ['1', '1', '4', '4']]
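An equivalent form, shown here as a sketch for a plain list of lists, is a nested list comprehension, which some readers find easier to follow than nested map calls:

```python
arr = [[1, 1, 2, 2],
       [1, 1, 3, 3],
       [1, 1, 4, 4]]

# Convert every element of every row to a string.
as_strings = [[str(value) for value in row] for row in arr]
print(as_strings)
```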
The following might work:
def stringify_nested_containers(obj):
    if hasattr(obj, '__iter__') and not isinstance(obj, str):
        obj = list(obj)
        for idx in range(0, len(obj)):
            obj[idx] = stringify_nested_containers(obj[idx])
    else:
        obj = str(obj)
    return obj
Example Usage:
lyst = [
[ 1, 1, 2, 2],
[ 1, 1, 3, 3],
[ 1, 1, 4, 4]
]
slyst = stringify_nested_containers(lyst)
print(slyst)
print(type(slyst[0][0]))
Warning:
For a string named stryng, stryng[0], stryng[539], or stryng[whatever] are not characters, they are strings.
Thus,
"h" == "hello world"[0]
and
"h" == "hello world"[0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0]
Suppose you try to dig down into nested containers until you reach an object that does not have an __iter__ method (i.e. a non-container), and suppose there is a string in your nested container. Then you will end up in infinite recursion, because a one-character string is itself iterable: "a" has one element, "a"[0], and "a"[0] == "a". This is why the function above checks isinstance(obj, str) before recursing.

Problems with iterating over a string

I have a string of numbers I'm trying to iterate through. Say for example the string is 20 characters long, I'm trying to find the product of the first 5 numbers, then the second 5, the third, and so on.
So far I have converted the number to a string, then used an iterating index to produce the numbers I want to find the product of as strings.
I've then split the strings of numbers into an array of characters, then converted the characters to integers. I've then used a function to find the product of those numbers, then add it to an array.
The idea is that once I have the full array, I can find the largest of the products.
The problem I'm having is that after the first iteration, the product is coming back as 0, when it should be much higher.
My code looks like this:
def product(list):
    p = 1
    for i in list:
        p *= i
    return p
products = []
count = 1
testno = 73167176531330624919225119674426574742355349194934969835203127745063262395783180169848018694788518438586156078911294949545950173795833195285320880551112540698747158523863050715693290
startno = 0
endno = 13
end = (len(str(testno)))-1
print("the end is",end)
while count < 4:
    teststring = (str(testno))[startno:endno]
    print("teststring is", teststring)
    strlist = (list(teststring))
    print("strlist is", strlist)
    numlist = list(map(int, strlist))
    print("numlist is",numlist)
    listproduct = (product(numlist))
    print("listproduct is",listproduct)
    products.append(listproduct)
    print("products is now",products)
    startno = startno + 1
    endno = endno + 1
    print("startno is now", startno)
    print("endno is now", endno)
    count += 1
print("the list of products is", products)
print("the biggest product is", max(products))
I have not done this as elegantly as I wanted to, perhaps because I don't properly understand the problem.
The offending output I'm getting looks like this:
the end is 999
teststring is 7316717653133
strlist is ['7', '3', '1', '6', '7', '1', '7', '6', '5', '3', '1', '3', '3']
numlist is [7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3]
listproduct is 5000940
products is now [5000940]
startno is now 1
endno is now 14
teststring is 3167176531330
strlist is ['3', '1', '6', '7', '1', '7', '6', '5', '3', '1', '3', '3', '0']
numlist is [3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0]
listproduct is 0
products is now [5000940, 0]
startno is now 2
endno is now 15
teststring is 1671765313306
strlist is ['1', '6', '7', '1', '7', '6', '5', '3', '1', '3', '3', '0', '6']
numlist is [1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6]
listproduct is 0
products is now [5000940, 0, 0]
startno is now 3
endno is now 16
the list of products is [5000940, 0, 0]
the biggest product is 5000940
I would be most grateful if someone could explain to me what is going wrong, how I can rectify it, and if there are any more elegant ways I could solve this problem.
Many thanks in advance for your help!
@Axtract, just modify your product function as below.
def product(list):
    p = 1
    for i in list:
        if i == 0:  # Skip zeros so they do not reset the product to 0
            pass
        else:
            p *= i
    return p
You have zeros in your products. The first one happens not to contain a zero, but all the others do.
So, your function is working properly --- just a problem with the input data.
The product of zero and any number is always zero.
Notice that when your numlist has a zero, the product is zero.
Your first iteration doesn't have a zero, which is why you have a nonzero product.
