Given some code:
keyword=re.findall(r'ke\w+ = \S+',s)
score=re.findall(r'sc\w+ = \S+',s)
print '%s,%s' %(keyword,score)
The output of the above code is:
['keyword = NORTH', 'keyword = GUESS', 'keyword = DRESSES', 'keyword = RALPH', 'keyword = MATERIAL'],['score = 88466', 'score = 83965', 'score = 79379', 'score = 74897', 'score = 68168']
But I want each keyword and its score on a separate line, like this:
NORTH,88466
GUESS,83935
DRESSES,83935
RALPH,73379
MATERIAL,68168
Instead of the last line, do this instead:
>>> for k, s in zip(keyword, score):
...     kw = k.partition('=')[2].strip()
...     sc = s.partition('=')[2].strip()
...     print '%s,%s' % (kw, sc)
...
NORTH,88466
GUESS,83965
DRESSES,79379
RALPH,74897
MATERIAL,68168
Here is how it works:
The zip brings the corresponding elements together pairwise.
The partition splits a string like 'keyword = NORTH' into three parts (the part before the equal sign, the equal sign itself, and the part after). The [2] keeps only the last part.
The strip removes leading and trailing whitespace.
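To make those three steps concrete, here is a small Python 3 sketch that shows the intermediate values. The two short sample lists are cut down from the output shown in the question; the variable names pairs, parts, raw_value, clean_value and lines are only for illustration.

```python
keyword = ['keyword = NORTH', 'keyword = GUESS']
score = ['score = 88466', 'score = 83965']

# zip pairs the first keyword with the first score, and so on
pairs = list(zip(keyword, score))

# partition returns a 3-tuple: (before, separator, after)
parts = 'keyword = NORTH'.partition('=')

# index 2 is the text after the '=', still carrying a leading space
raw_value = parts[2]
clean_value = raw_value.strip()

# the same steps applied to every pair
lines = ['%s,%s' % (k.partition('=')[2].strip(), s.partition('=')[2].strip())
         for k, s in pairs]
print('\n'.join(lines))
```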
Alternatively, you can modify your regexes to do much of the work for you by using groups to capture the keywords and scores without the surrounding text:
keywords = re.findall(r'ke\w+ = (\S+)',s)
scores = re.findall(r'sc\w+ = (\S+)',s)
for keyword, score in zip(keywords, scores):
    print '%s,%s' %(keyword,score)
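A self-contained Python 3 version of the capture-group approach might look like the sketch below. The question never shows the original string s, so the sample text here is only a guess at its shape.

```python
import re

# Hypothetical input; the real s is not shown in the question
s = "keyword = NORTH score = 88466 keyword = GUESS score = 83965"

# With one capturing group, findall returns only the captured text
keywords = re.findall(r'ke\w+ = (\S+)', s)
scores = re.findall(r'sc\w+ = (\S+)', s)

for keyword, score in zip(keywords, scores):
    print('%s,%s' % (keyword, score))
```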
Hope this helps:
keyword = ['NORTH','GUESS','DRESSES','RALPH']
score = [88466,83935,83935,73379]
for key,value in zip(keyword,score):
    print "%s,%s" %(key,value)
One way would be to zip() the two lists together (to iterate over them pairwise) and use str.partition() to grab the data after the =, like this:
def after_equals(s):
    return s.partition(' = ')[-1]
for k,s in zip(keyword, score):
    print after_equals(k) + ',' + after_equals(s)
If you don't want to call after_equals() twice, you could refactor to:
for pair in zip(keyword, score):
    print ','.join(after_equals(data) for data in pair)
If you want to write to a text file (you really should have mentioned this in the question, not in your comments on my answer), then you can take this approach...
with open('output.txt', 'w+') as output:
    for pair in zip(keyword, score):
        output.write(','.join(after_equals(data) for data in pair) + '\n')
Output:
% cat output.txt
NORTH,88466
GUESS,83965
DRESSES,79379
RALPH,74897
MATERIAL,68168
I know how to write strings in reverse
txt = "Hello World"[::-1]
print(txt)
but I don't know how to do it while keeping one character in the same place.
For example, when I type world the result should be wdlro
thanks
Just prepend the first character to the remainder of the string (reversed using slice notation, but stopping just before we reach index 0, which is the first character):
>>> s = "world"
>>> s[0] + s[:0:-1]
'wdlro'
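Wrapped in a function, the same slicing idea could look like this. The name reverse_keep_first is made up for the example, and taking s[:1] instead of s[0] is a small change so that an empty string does not raise an IndexError.

```python
def reverse_keep_first(s):
    """Reverse s but leave its first character in place."""
    # s[:1] is '' for an empty string, so no IndexError is raised
    # s[:0:-1] walks backwards from the end and stops before index 0
    return s[:1] + s[:0:-1]


print(reverse_keep_first("world"))
```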
rest = 'world'.replace('w', '', 1)  # drop the first 'w'; list.remove() returns None, so it cannot be chained
word = 'w' + rest[::-1]
If you want to reverse every word in the text by the same rule (keeping the first character of each word in place):
txt = "Hello World"
result = []
for word in txt.split():
    result.append(word[0]+word[1:][::-1])
print (result)
This is a more generic answer that lets you pick any position within the string whose character should stay in place:
txt = "Hello World"
position = 3
lock_char = txt[position]
new_string = list((txt[:position] + txt[position+1:])[::-1])
new_string.insert(position, lock_char)
listToStr = ''.join([str(elem) for elem in new_string])
print(listToStr)
Result: dlrloW oleH
A simple way using only slicing:
txt = "Hello World"
position = 10
[first, lock, last] = txt[:position], txt[position], txt[position+1:]
new_string = (first + last)[::-1]
[first, last] = new_string[:position], new_string[position:]
new_txt = first + lock + last
print(new_txt)
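Both of the position-based answers follow the same recipe: take the locked character out, reverse the rest, and put the character back at its old index. A compact sketch of that recipe as a reusable function (reverse_keep_position is a name chosen for this example):

```python
def reverse_keep_position(txt, position):
    """Reverse txt while the character at `position` stays at that index."""
    lock_char = txt[position]
    # Reverse everything except the locked character
    rest = (txt[:position] + txt[position + 1:])[::-1]
    # Re-insert the locked character at its original index
    return rest[:position] + lock_char + rest[position:]


print(reverse_keep_position("Hello World", 3))
```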
I have a string and a list:
src = 'ways to learn are read and execute.'
temp = ['ways to','are','and']
What I want is to split the string on the values in the list temp and produce:
['learn','read','execute']
at the same time.
I tried a for loop:
for x in temp:
    print(src.split(x))
This is what it produced:
['', ' learn are read and execute.']
['ways to learn ', ' read and execute.']
['ways to learn are read ', ' execute.']
What I want is to use all the values in the list together to split the string.
Does anyone have a solution?
re.split is the conventional solution for splitting on multiple separators:
import re
src = 'ways to learn are read and execute.'
temp = ['ways to','are','and']
pattern = "|".join(re.escape(item) for item in temp)
result = re.split(pattern, src)
print(result)
Result:
['', ' learn ', ' read ', ' execute.']
You can also filter out blank items and strip the spaces+punctuation with a simple list comprehension:
result = [item.strip(" .") for item in result if item]
print(result)
Result:
['learn', 'read', 'execute']
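If you need this in more than one place, the two steps can be folded into one helper. This is only a sketch: split_on_all and its strip_chars parameter are names invented here, and sorting the separators longest-first is an extra precaution (not needed for this input) so that a short separator cannot shadow a longer one that starts with it.

```python
import re


def split_on_all(text, separators, strip_chars=" ."):
    """Split text on every separator at once and drop empty pieces."""
    # Longest first, so a separator that is a prefix of another cannot shadow it
    ordered = sorted(separators, key=len, reverse=True)
    pattern = "|".join(re.escape(sep) for sep in ordered)
    pieces = (piece.strip(strip_chars) for piece in re.split(pattern, text))
    return [piece for piece in pieces if piece]


src = 'ways to learn are read and execute.'
temp = ['ways to', 'are', 'and']
print(split_on_all(src, temp))
```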
This method uses only plain string methods and does not rely on regular expressions. It is more verbose and more complex:
result = []
for current, part in enumerate(temp):
    # everything after the current separator
    too_long_result = src.split(part)[1]
    if current + 1 < len(temp):
        # cut the text off at the next separator
        result.append(too_long_result.split(temp[current + 1])[0].strip())
    else:
        result.append(too_long_result.strip())
print(result)
You can remove the .strip() calls if you don't want to remove the leading and trailing whitespace from the list entries.
Loop solution. You can add conditions such as strip if you need them.
src = 'ways to learn are read and execute.'
temp = ['ways to','are','and']
copy_src = src
result = []
for x in temp:
    left, right = copy_src.split(x, 1)  # split on the first occurrence only
    if left:
        result.append(left)  # or left.strip()
    copy_src = right
result.append(copy_src)  # or copy_src.strip()
Just keep it simple:
src = 'ways to learn are read and execute.'
temp = ['ways','to','are','and']
res = ''
for w1 in src.split():
    if w1 not in temp:
        if w1 not in res.split():
            res = res + w1 + " "
print(res)
import re
string = "is2 Thi1s T4est 3a"
def order(sentence):
    res = ''
    count = 1
    list = sentence.split()
    for i in list:
        for i in list:
            a = re.findall('\d+', i)
            if a == [str(count)]:
                res += " ".join(i)
                count += 1
    print(res)
order(string)
Above is the code I have a problem with. The output I should get is:
"Thi1s is2 3a T4est"
Instead I'm getting the correct order but with spaces in the wrong places:
"T h i 1 si s 23 aT 4 e s t"
Any idea how to make it work while keeping this approach?
You are joining the characters of each word:
>>> " ".join('Thi1s')
'T h i 1 s'
You want to collect your words into a list and join that instead:
def order(sentence):
    number_words = []
    count = 1
    words = sentence.split()
    for word in words:
        for word in words:
            matches = re.findall(r'\d+', word)
            if matches == [str(count)]:
                number_words.append(word)
                count += 1
    result = ' '.join(number_words)
    print(result)
I used more verbose and clear variable names. I also removed the list variable; don't use list as a variable name if you can avoid it, as that masks the built-in list name.
What you implemented comes down to an O(N^2) (quadratic time) sort. You could instead use the built-in sorted() function to bring this to O(NlogN); you'd extract the digits and sort on their integer value:
def order(sentence):
    digit = re.compile(r'\d+')
    return ' '.join(
        sorted(sentence.split(),
               key=lambda w: int(digit.search(w).group())))
This differs a little from your version in that it'll only look at the first (consecutive) digits, it doesn't care about the numbers being sequential, and will break for words without digits. It also uses a return to give the result to the caller rather than print. Just use print(order(string)) to print the return value.
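If words without digits can occur, one possible way around the crash is sketched below. It assumes that un-numbered words should simply go to the end, which the question does not specify; order_tolerant and DIGITS are names made up for this example.

```python
import re

DIGITS = re.compile(r'\d+')


def order_tolerant(sentence):
    """Sort words by their first run of digits; words with no digits go last."""
    def sort_key(word):
        match = DIGITS.search(word)
        # (0, n) sorts before (1, 0), so numbered words come first
        return (0, int(match.group())) if match else (1, 0)
    return ' '.join(sorted(sentence.split(), key=sort_key))


print(order_tolerant("is2 Thi1s T4est 3a"))
```

Because sorted() is stable, words without digits keep their original relative order at the end.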
If you assume the words are numbered consecutively starting at 1, then you can sort them in O(N) time even:
def order(sentence):
    digit = re.compile(r'\d+')
    words = sentence.split()
    result = [None] * len(words)
    for word in words:
        index = int(digit.search(word).group())
        result[index - 1] = word
    return ' '.join(result)
This works by creating a list of the same length, then using the digits from each word to put the word into the correct index (minus 1, as Python lists start at 0, not 1).
I think the bug is simply the misuse of join(). Calling " ".join(i) on a single word puts a space between its characters. i is already a whole token, so just append it to the end of the result string. Code untested.
import re
string = "is2 Thi1s T4est 3a"
def order(sentence):
    res = ''
    count = 1
    list = sentence.split()
    for i in list:
        for i in list:
            a = re.findall(r'\d+', i)
            if a == [str(count)]:
                res = res + " " + i  # the bug was here: join() split the word into characters
                count += 1
    print(res.strip())  # strip the leading space added before the first word
order(string)
I have a variety of values in a text field of a CSV
Some values look something like this
AGM00BALDWIN
AGM00BOUCK
however, some have duplicates, changing the names to
AGM00BOUCK01
AGM00COBDEN01
AGM00COBDEN02
My goal is to write a specific ID to values NOT containing a numeric suffix
Here is the code so far
prov_count = 3000
prov_ID = 0
items = (name, x, y)
xy_tup = tuple(items)
if "*1" not in name and "*2" not in name:
    prov_ID = prov_count + 1
else:
    prov_ID = ""
It seems that the wildcard isn't the right tool here, but I can't find an appropriate solution.
Using regular expressions seems appropriate here:
import re
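pattern = re.compile(r'\d+$')  # one or more digits at the end of the string
prov_count = 3000
prov_ID = 0
items = (name, x, y)
xy_tup = tuple(items)
# search() scans the whole string; match() would only test the beginning
if pattern.search(name) is None:
    prov_ID = prov_count + 1
else:
    prov_ID = ""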
There are different ways to do it, one with the isdigit function:
a = ["AGM00BALDWIN", "AGM00BOUCK", "AGM00BOUCK01", "AGM00COBDEN01", "AGM00COBDEN02"]
for i in a:
    if i[-1].isdigit(): # can use i[-1] and i[-2] for both numbers
        print (i)
Using regex:
import re
a = ["AGM00BALDWIN", "AGM00BOUCK", "AGM00BOUCK01", "AGM00COBDEN01", "AGM00COBDEN02"]
pat = re.compile(r"^.*\d$") # can use "\d\d" instead of "\d" for 2 numbers
for i in a:
    if pat.match(i):
        print (i)
Another:
for i in a:
    if i[-1:] in map(str, range(10)):
        print (i)
All of the above methods print the inputs with a numeric suffix:
AGM00BOUCK01
AGM00COBDEN01
AGM00COBDEN02
You can use slicing to find the last 2 characters of the element and then check if it ends with '01' or '02':
l = ["AGM00BALDWIN", "AGM00BOUCK", "AGM00BOUCK01", "AGM00COBDEN01", "AGM00COBDEN02"]
for i in l:
    if i[-2:] in ('01', '02'):
        print('{} is a duplicate'.format(i))
Output:
AGM00BOUCK01 is a duplicate
AGM00COBDEN01 is a duplicate
AGM00COBDEN02 is a duplicate
Or another way would be using the str.endswith method:
l = ["AGM00BALDWIN", "AGM00BOUCK", "AGM00BOUCK01", "AGM00COBDEN01", "AGM00COBDEN02"]
for i in l:
    if i.endswith('01') or i.endswith('02'):
        print('{} is a duplicate'.format(i))
So your code would look like this:
prov_count = 3000
prov_ID = 0
items = (name, x, y)
xy_tup = tuple(items)
if name[-2:] not in ('01', '02'):  # [-2:] takes the last two characters; no suffix means it gets an ID
    prov_ID = prov_count + 1
else:
    prov_ID = ""
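Pulled into a helper, assigning IDs only to names without a numeric suffix might look like the sketch below. The assign_ids name, the running counter that starts at 3000, and the empty string for suffixed names are assumptions based on the question's snippet, not something the question spells out. A regex is used here so that any trailing digits count as a suffix, not just 01 and 02.

```python
import re

TRAILING_DIGITS = re.compile(r'\d+$')


def assign_ids(names, start=3000):
    """Give a new ID to each name without trailing digits, '' otherwise."""
    ids = {}
    next_id = start
    for name in names:
        if TRAILING_DIGITS.search(name) is None:
            next_id += 1          # assumed: each un-suffixed name gets the next number
            ids[name] = next_id
        else:
            ids[name] = ""        # assumed: suffixed duplicates get no ID
    return ids


names = ["AGM00BALDWIN", "AGM00BOUCK", "AGM00BOUCK01", "AGM00COBDEN01"]
print(assign_ids(names))
```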
Here is my question
count += 1
num = 0
num = num + 1
obs = obs_%d%(count)
mag = mag_%d%(count)
while num < 4:
obsforsim = obs + mag
mylist.append(obsforsim)
for index in mylist:
print index
The above code gives the following results
obs1 = mag1
obs2 = mag2
obs3 = mag3
and so on.
obsforrbd = parentV = {0},format(index)
cmds.dynExpression(nPartilce1,s = obsforrbd,c = 1)
However, when I run the code above it only gives me
parentV = obs3 = mag3
and not the whole list. It only gives me the last element of the list. Why is that?
Thanks.
I'm having difficulty interpreting your question, so I'm just going to base this on the question title.
Let's say you have a list of items (they could be anything, numbers, strings, characters, etc)
myList = [1,2,3,4,"abcd"]
If you do something like:
for i in myList:
    print(i)
you will get:
1
2
3
4
abcd
If you want to convert this to a string:
myString = ' '.join(str(item) for item in myList)
This should give:
print(myString)
> 1 2 3 4 abcd
Now for some explanation:
' ' is a string in Python, and strings have methods associated with them (functions that can be applied to strings). In this instance, we're calling the .join() method. This method takes an iterable of strings as an argument and joins its elements using ' ' as the separator. It does not convert the elements for you, so non-string items such as the numbers here have to be passed through str() first; otherwise join() raises a TypeError. If you want a comma-separated representation, just replace ' ' with ','.
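A quick way to check how join() treats the mixed list from this answer: it accepts only strings, so the items are converted with str() first. The variable names below are just for the demonstration.

```python
my_list = [1, 2, 3, 4, "abcd"]

# join() raises TypeError when an element is not a string
try:
    ' '.join(my_list)
    join_failed = False
except TypeError:
    join_failed = True

# Converting each element first makes it work
spaced = ' '.join(str(item) for item in my_list)
comma_separated = ','.join(map(str, my_list))

print(spaced)
print(comma_separated)
```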
I think your indentation is wrong. It should be:
while num < 4:
    obsforsim = obs + mag
    mylist.append(obsforsim)
for index in mylist:
but I'm not sure if that's your problem or not.
The reason it did not work before is:
while num < 4:
    obsforsim = obs + mag
# all loop iterations finish before this point
mylist.append(obsforsim)  # appends only the last value
The usual Pythonic way to produce a list of numbered items is to use the range function:
results = []
for item in range(1, 4):
    results.append("obs%i = mag_%i" % (item, item))
> ['obs1 = mag_1', 'obs2 = mag_2', 'obs3 = mag_3']
and so on (note that in this example you have to pass in the item variable twice, because the format string has two placeholders).
If that's to be formatted into something like an expression you could use
'\n'.join(results)
as in the other example to create a single string with the obs = mag pairs on their own lines.
Finally, you can do all that in one line with a list comprehension.
'\n'.join([ "obs%i = mag_%i" % (item, item) for item in range (1, 4)])
As other people have pointed out, while loops are easy to get wrong (a forgotten increment loops forever); it's easier to use range.
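For comparison, here is a sketch of the same list built both ways. The while version only terminates because of the manual increment, while range manages the counter for you. The names results_while and results_range are made up for the example.

```python
# while loop: the counter must be advanced by hand inside the loop body
results_while = []
num = 1
while num < 4:
    results_while.append("obs%i = mag_%i" % (num, num))
    num += 1  # forgetting this line makes the loop run forever

# range: the counter is managed for you
results_range = ["obs%i = mag_%i" % (n, n) for n in range(1, 4)]

print('\n'.join(results_range))
```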