Pymongo Regex match with list - python

I have a list of distinct strings.
lst = ['.*123.*','.*252.*','.*812.*','.*135.*']
I want to perform an aggregation operation such that my $match looks like the following:
{"$match":{"Var":{"$in":lst}}}
The Var field in the MongoDB records is a string containing numbers, e.g. "abc123", "haaofalfa812", etc. I want to match Var against regular expressions. If lst had fewer elements, I could have written this:
{"$match":{"Var":{"$in":[re.compile('.*123.*'),re.compile('.*252.*'),re.compile('.*812.*')]}}}
But since lst is already initialized and contains a lot of elements, what should I do?
I tried the following, but it doesn't work either:
{"$match":{"Var":{"$in":[re.compile(x for x in lst)]}}}
I get the following error for obvious reasons.
TypeError: first argument must be string or compiled pattern

Your list comprehension is wrong. I reckon what you wanted to do is this:
[re.compile(x) for x in lst]
The TypeError comes from passing a generator expression (x for x in lst) to re.compile(), which expects a single string or compiled pattern.
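A minimal sketch of the corrected pipeline, assuming the field is named Var as in the question. No database is needed to run it: the sample values are checked locally with the same patterns, and the name collection in the final comment is hypothetical.

```python
import re

lst = ['.*123.*', '.*252.*', '.*812.*', '.*135.*']

# Compile each string separately; re.compile() takes one pattern at a time.
patterns = [re.compile(p) for p in lst]

# The $match stage from the question, now holding compiled patterns.
pipeline = [{"$match": {"Var": {"$in": patterns}}}]

# Local sanity check that mimics the match on sample values.
samples = ["abc123", "haaofalfa812", "nomatch"]
matched = [s for s in samples if any(p.search(s) for p in patterns)]
print(matched)  # ['abc123', 'haaofalfa812']

# With a real PyMongo collection (hypothetical name `collection`):
# results = list(collection.aggregate(pipeline))
```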

Related

python string to list (special list)

I'm trying to convert this string into a list. How can I do that, please?
My string :
x = "[(['xyz1'], 'COM95'), (['xyz2'], 'COM96'), (['xyz3'], 'COM97'), (['xyz4'], 'COM98'), (['xyz5'], 'COM99'), (['xyz6'], 'COM100')]"
I want to convert it to a list, so that:
print(list[0])
Output : (['xyz1'], 'COM95')
If you have this string instead of a list, that presumes it is coming from somewhere outside your control (otherwise you'd just make a proper list). If the string comes from a source outside your program, eval() is dangerous: it will gladly run any code passed to it. In this case you can use ast.literal_eval(), which is safer (but make sure you understand the warning in the docs):
import ast
x = "[(['xyz1'], 'COM95'), (['xyz2'], 'COM96'), (['xyz3'], 'COM97'), (['xyz4'], 'COM98'), (['xyz5'], 'COM99'), (['xyz6'], 'COM100')]"
l = ast.literal_eval(x)
Which gives an l of:
[(['xyz1'], 'COM95'),
(['xyz2'], 'COM96'),
(['xyz3'], 'COM97'),
(['xyz4'], 'COM98'),
(['xyz5'], 'COM99'),
(['xyz6'], 'COM100')]
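To illustrate why ast.literal_eval() is the safer choice, here is a small sketch with a shortened version of the string. The hostile input is made up for the example; literal_eval() refuses it because a function call is not a literal.

```python
import ast

x = "[(['xyz1'], 'COM95'), (['xyz2'], 'COM96')]"
parsed = ast.literal_eval(x)
print(parsed[0])  # (['xyz1'], 'COM95')

# A string that eval() would execute; literal_eval() rejects it instead.
hostile = "__import__('os').getcwd()"
try:
    ast.literal_eval(hostile)
    rejected = False
except (ValueError, SyntaxError):
    rejected = True
print(rejected)  # True
```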
If the structure is uniformly a list of tuples with a one-element list of strings and an individual string, you can manually parse it using the single quote as a separator. This will give you one string value every other component of the split (which you can access using a striding subscript). You can then build the actual tuple from pairing of two values:
tuples = [([a],s) for a,s in zip(*[iter(x.split("'")[1::2])]*2)]
print(tuples[0])
(['xyz1'], 'COM95')
Note that this does not cover the case where an individual string contains a single quote that needs escaping.
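The one-liner above packs several steps together. A sketch that spells them out on a shortened string (the intermediate names values, it and pairs are only for illustration):

```python
x = "[(['xyz1'], 'COM95'), (['xyz2'], 'COM96'), (['xyz3'], 'COM97')]"

# Splitting on the single quote leaves the quoted values at the odd positions.
values = x.split("'")[1::2]
print(values)  # ['xyz1', 'COM95', 'xyz2', 'COM96', 'xyz3', 'COM97']

# One shared iterator passed twice: zip takes two consecutive items per tuple.
it = iter(values)
pairs = list(zip(it, it))

# Rebuild the original shape: a one-element list plus a plain string.
tuples = [([a], s) for a, s in pairs]
print(tuples[0])  # (['xyz1'], 'COM95')
```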
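Do you mean converting a list-like string into a list? You could use eval(), but only if the string comes from a source you trust, since eval() runs any code it is given.
For example:
a="[1,2,3,4]"
a=eval(a)
Then a becomes a list.
To convert the string to a list, use x = eval(x).
print(list[0]) will not give you what you want, because list is the name of Python's built-in list type rather than your variable.
You should use print(x[0]) instead.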

Build Dictionary in Python Loop - List and Dictionary Comprehensions

I'm playing with some loops in python. I am quite familiar with using the "for" loop:
for x in y:
    do something
You can also create a simple list using a loop:
i = []
for x in y:
    i.append(x)
and then I recently discovered a nice efficient type of loop, here on Stack, to build a list (is there a name for this type of loop? I'd really like to know so I can search on it a little better):
[x.name for x in y]
Ok, that being said, I wanted to go further with the last type of loop and I tried to build a python dictionary using the same type of logic:
{x[row.SITE_NAME] = row.LOOKUP_TABLE for row in cursor}
instead of using:
x = {}
for row in cursor:
    x[row.SITE_NAME] = row.LOOKUP_TABLE
I get an error message on the equal sign telling me it's an invalid syntax. I believe in this case, it's basically telling me that equal sign is a conditional clause (==), not a declaration of a variable.
My second question is, can I build a python dictionary using this type of loop or am I way off base? If so, how would I structure it?
The short form is as follows (called a dict comprehension, by analogy with list comprehensions, set comprehensions, etc.):
x = { row.SITE_NAME : row.LOOKUP_TABLE for row in cursor }
In general, given some _container of elements and a function _value that returns, for a given element, the value you want to store under that key in the dictionary:
{ _key : _value(_key) for _key in _container }
What you're using is called a list comprehension. They're pretty awesome ;)
They have a cousin called a generator expression that works like a list comprehension but instead of building the list all at once, they generate one item at a time. Hence the name generator. You can even build functions that are generators - there are plenty of questions and sites to cover that info, though.
You can do one of two things:
x = dict(((row.SITE_NAME, row.LOOKUP_TABLE) for row in cursor))
Or, if you have a sufficiently new version of Python (2.7 or 3.x), there is something called a dictionary comprehension, which works like a list comprehension but produces a dictionary instead.
x = {row.SITE_NAME : row.LOOKUP_TABLE for row in cursor}
You can do it like this:
x = dict((row.SITE_NAME, row.LOOKUP_TABLE) for row in cursor)
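A runnable sketch comparing the three forms discussed above. The cursor here is a stand-in list of namedtuple rows, and the row values are invented for the example; only the attribute names SITE_NAME and LOOKUP_TABLE come from the question.

```python
from collections import namedtuple

# Stand-in for database rows that expose columns as attributes.
Row = namedtuple("Row", ["SITE_NAME", "LOOKUP_TABLE"])
cursor = [Row("north", "tbl_north"), Row("south", "tbl_south")]

# Explicit loop.
looped = {}
for row in cursor:
    looped[row.SITE_NAME] = row.LOOKUP_TABLE

# Dict comprehension: key and value are separated by a colon, not "=".
comprehended = {row.SITE_NAME: row.LOOKUP_TABLE for row in cursor}

# dict() fed by a generator expression of (key, value) pairs.
from_pairs = dict((row.SITE_NAME, row.LOOKUP_TABLE) for row in cursor)

print(looped == comprehended == from_pairs)  # True
```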

Check if an int/str is in the list and its location. Python 3.3.2

Say that I have a list that looks something like this:
MyList = [1,2,3,4,5,"z","x","c","v","b"]
Now the user inputs "5z1b3". How would you replace each int/str with its location in the list? I'm thinking of using something like this:
for x in MyList.... if located in list, replace with letter/number with its location.
Not entirely sure how to do it though. Help would be much appreciated.
Edit: It's something I'm working on and I must use both ints and strs in the list. Also, I misstated the output I need; thanks for mentioning it, avarnert. Commas between each letter/number in the output would make it work for me. Any ideas how to do it?
Use a list comprehension:
[MyList.index(c) for c in inputstring]
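As written, this only finds the string entries: the characters "5", "1" and "3" from the input are strings and do not equal the ints in MyList, so MyList.index() raises ValueError for them. It also has to scan through MyList for each character. A dictionary mapping each character to its position fixes both problems, since the keys can be converted to strings: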
index = {str(c): i for i, c in enumerate(MyList)}
[index[c] for c in inputstring]
If you then need a formatted string, turn the indices to strings and join the final output:
index = {str(c): str(i) for i, c in enumerate(MyList)}
','.join([index[c] for c in inputstring])
I would use the list.index() method. For example:
positions = [MyList.index(ch) for ch in user_input]
EDIT:
This assumes, however, that every character from the user input is found in MyList (the ints in MyList would first have to be stored as strings, since "5" != 5), and that each entry in MyList appears only once.
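A sketch that relaxes both assumptions. The helper name positions and the choice of -1 for characters that are not in the list are illustrative decisions, not part of the original answers.

```python
MyList = [1, 2, 3, 4, 5, "z", "x", "c", "v", "b"]

# Map the string form of each entry to its first position in the list.
index = {}
for i, item in enumerate(MyList):
    index.setdefault(str(item), i)  # keeps the first position on duplicates

def positions(user_input):
    """Return the list position of each character, or -1 if it is absent."""
    return [index.get(ch, -1) for ch in user_input]

found = positions("5z1b3")
print(found)  # [4, 5, 0, 9, 2]

# Comma-separated output, as requested in the question's edit.
print(",".join(str(p) for p in found))  # 4,5,0,9,2

print(positions("q"))  # [-1]
```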

Doing the wrong list comprehension

I've been trying every variation of a list comprehension that I can think of in this context.
I am getting a result from a database and converting it to a list of [['item', long integer]].
I want to convert the long integer to a regular one, because the rest of my math uses regular integers.
I'm trying this:
catnum = c.fetchall()
catnum = [list(x) for x in catnum]
for x in catnum:
    [int(y) for y in x]
I've also tried x[1], and a few other things (it is always in position 1 inside the list)
No luck. How do I convert only the second value in the list to a regular integer?
Does this work?
catnum=[[x,int(y)] for x,y in catnum]
But, I think it's worth asking why you need to do this conversion. Python should handle long integers just fine anywhere a regular integer would work. There's a slight performance penalty to leaving them as long ints, but in most cases I don't think that would justify the extra work to convert to regular integers.
EDIT: for people reading the comments, my first answer was incorrect and did not involve a list comprehension. It relied on mutating the elements in catnum, but since those elements are tuples, they can't be mutated.
[[x[0],int(x[1])] for x in catnum]
This will return a list of lists, where the first entry is the name and the second is the value cast down to a normal integer.
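A small sketch of why the loop in the question has no effect and how the comprehension fixes it. The rows are a stand-in for c.fetchall(); they use decimal.Decimal counts purely so the conversion is visible in Python 3, where the separate long type no longer exists.

```python
from decimal import Decimal

# Stand-in for c.fetchall(): a list of (name, count) tuples.
rows = [("apples", Decimal("3")), ("pears", Decimal("12"))]

# The loop from the question builds a new list each time and discards it,
# so rows is left unchanged.
for x in rows:
    [y for y in x]

# Unpack each row and convert only the second value.
catnum = [[name, int(count)] for name, count in rows]
print(catnum)  # [['apples', 3], ['pears', 12]]
```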

Add string to another string

I recently ran into a problem:
I want to add strings to other strings very efficiently, so I looked up many methods and techniques and found the "fastest" method.
But I can't quite understand how it actually works:
def method6():
    return ''.join([`num` for num in xrange(loop_count)])
From source (Method 6)
Especially the ([`num` for num in xrange(loop_count)]) confused me totally.
It's a list comprehension that uses backticks for repr conversion. Don't do this. Backticks are deprecated and were removed in Python 3, and a more efficient and Pythonic way is not to build the intermediate list at all, but to use a generator expression:
''.join(str(num) for num in xrange(loop_count)) # use range in py3k
xrange() is the lazy counterpart of Python 2's range(): it produces the numbers on demand instead of building the whole list in memory first.
Backtick notation, `num`, converts a value to a string and is the same as repr(num), which for plain ints gives the same result as str(num).
[x for x in y] is called a list comprehension, and is basically a one-line for loop that returns a list as its result. So, all together, your code is semantically equivalent to the following, but faster, because list comprehensions and xrange are faster than for loops and range:
z = []
for i in range(loop_count):
    z.append(str(i))
return "".join(z)
That bit in the brackets is a list comprehension, arguably one of the most powerful elements of Python. It produces a list from an iteration. You may want to look up its documentation. Using backticks to convert num to a string is not recommended; use str(num) or similar instead.
join() is a method of the string class. It takes a list of strings and returns a single string consisting of each component string separated by "self" (i.e. the calling string). The trick here is that join() is being called directly on the string literal '', which is allowed in Python. What this code will do is produce a string consisting of the string form of each element of xrange(loop_count) with no separation.
First of all: while this code is still correct in the 2.x series of Python, it is a bit confusing and can be written differently:
def method6a():
    return ''.join(str(num) for num in xrange(loop_count))
In Python 2.x, the backticks can be used instead of the repr function. The expression within the square brackets [] is a list comprehension. In case you are new to list comprehensions: they work like a combination of a loop and a list append-statement, only that you don't have to invent a name for a variable:
Those two are equivalent:
a = [repr(num) for num in xrange(loop_count)]
# <=>
a = []
for num in xrange(loop_count):
    a.append(repr(num))
As a result, the list comprehension produces a list of the string representations of all numbers from 0 to loop_count (exclusive).
Finally, string.join(iterable) concatenates all of the strings in iterable, using string as the separator between elements. If you use the empty string as the separator, all elements are concatenated with nothing between them. This is exactly what you wanted: a concatenation of all of the numbers from 0 to loop_count.
As for my modifications:
I used str instead of repr because the result is the same for all ints and it is easier to read.
I am using a generator expression instead of a list comprehension because the list built by the list comprehension is unnecessary and gets garbage collected anyway. Generator expressions are iterable, but they don't need to store all elements of the list. Of course, if you already have a list of strings, then simply pass the list to the join.
Generally, the ''.join(iterable) idiom is well understood by most Python programmers to mean "string concatenation of any list of strings", so understandability shouldn't be an issue.
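For readers on Python 3, a short sketch of the same idiom using range() in place of xrange(). The value of loop_count is arbitrary, and the variable names are only for illustration.

```python
loop_count = 5

# List comprehension: builds the list of strings first, then joins it.
with_list = ''.join([str(num) for num in range(loop_count)])

# Generator expression: no explicit intermediate list in your own code.
with_generator = ''.join(str(num) for num in range(loop_count))

# The string that join() is called on is placed between the elements.
with_separator = ','.join(str(num) for num in range(loop_count))

print(with_generator)  # 01234
print(with_separator)  # 0,1,2,3,4
```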
