removing duplicate strings from the list?


I wrote a program that extracts all email addresses from a text file, taking the lines that start with 'From:'. I store every extracted address in one list, then build a second list that holds only the unique addresses by removing the duplicates. The program produces the output, but the word 'set' is printed in front of the new list, i.e. after "print Unique_list".
Note: the original text file is not attached, as I don't know how to attach it.
Thank you.
print "Please enter the file path:\n"
text_file = raw_input("Enter the file name:")
print "Opening File\n"
# Using try and except to print an error message if the file is not found
try:
    open_file = open(text_file)
    print "Text file " + text_file + " is opened \n"
except:
    # Printing error message
    print "File not found"
    # Using "raise SystemExit" for program termination if file not found
    raise SystemExit
# Creating a list to store the Email addresses starting from 'From:'
Email_list = []
# Using a loop to extract the Email addresses starting from 'From:'
for line in open_file:
    if 'From:' in line:
        # Adding each extracted Email address at the end of the list
        Email_list.append(line)
print "Printing extracted Email addresses\n"
print Email_list, "\n"
print "Before removing duplicate Email addresses, the total no. of Email addresses starting with 'From:'", len(Email_list), "\n"
# Removing duplicate Email addresses
Unique_list = set(Email_list)
#print Email_list.count()
print "Printing Unique Email addresses\n"
print (Unique_list)
print "After removing duplicate Email addresses, the total no. of Email addresses starting with 'From:'", len(Unique_list), "\n"

"getting output which shows 'set' before printing the new list, i.e. after 'print Unique_list'"
Python 2 prints a set as set([...]), which is where the word 'set' comes from. Just convert it back to a list again.
Unique_list = set(Email_list)
Unique_list = list(Unique_list)
#print Email_list.count()
print "Printing Unique Email addresses\n"
print (Unique_list)
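One caveat with the set round trip is that a set does not keep the original order of the lines. If order matters, a minimal sketch in Python 3 syntax could look like the following; the sample addresses are made up for illustration and stand in for the lines read from the file.

```python
# Hypothetical sample data standing in for the lines read from the file.
email_list = [
    "From: alice@example.com",
    "From: bob@example.com",
    "From: alice@example.com",
]

# dict keys are unique and, from Python 3.7 on, keep insertion order,
# so this drops duplicates while preserving first-seen order.
unique_list = list(dict.fromkeys(email_list))

print(unique_list)
print(len(email_list), "addresses before,", len(unique_list), "after")
```

The result is a plain list, so printing it shows no 'set' prefix.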

The answer may depend on the goal. It is not clear from the question whether you need the addresses printed in one specific format or just in some readable form. If you need a particular format, you are better served by controlling the output yourself than by relying on the built-in string representation of the objects you print.
An example:
Instead of print Email_list,"\n"
use print (','.join(Email_list))
If you would like to emulate the representation of a list, you could use something like print ('[\'{list}\']'.format(list='\', \''.join(Email_list)))
or maybe something more cohesive.
In any case, you control how the output looks.
If you rely on the internal representations of objects for printing, output concerns start to drive your coding decisions, which gets in the way of making the best choices for the program logic itself.
Or, did I misunderstand your question?
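As a rough illustration of building the output explicitly, here is a small sketch in Python 3 syntax. The sample lines are invented, and stripping the trailing newline is an assumption about how the lines were read from the file.

```python
# Hypothetical sample data; in the question these would be lines from the file.
email_list = ["From: alice@example.com\n", "From: bob@example.com\n"]

# Lines read from a file carry a trailing newline, so strip it first.
cleaned = [line.strip() for line in email_list]

# One address per line, independent of how list or set objects print themselves.
report = "\n".join(cleaned)
print(report)
```

Because the string is assembled by hand, switching the container from a list to a set later would not change what the user sees.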

Related

Reading list from text file and converting it to string

Hello, I recently started learning Python, so this is the best explanation I can give; my English is not perfect.
I made a script that reads a list from a text file. My problem is converting that list to a string so that I can display it with the print function. The list has already been read from the text file by the time the user types their "nickname". I also don't know whether I used the split(',') function correctly; it should separate the words in the text file at the commas so they can be used as a list. Here are some pictures of my code.
https://gyazo.com/db797ca0998286248bf846ac70c94067 (Main code)
https://gyazo.com/918aaba9b749116d842fccb78f6204a8 (Text file - list of usernames which are "BANNED")
The text file is named Listas_BAN.txt.
I tried to do all of this myself and did some research before posting, but many of the methods I found are outdated.
# Name
name = input("~ Please enter Your name below\n")
print("Welcome " + str(name))
def clear(): return os.system('cls')
clear() # Clearina viska.
# define empty list
Ban_Listo_Read = open('Listas_BAN.txt', mode='r')
Ban_Listo_Read = Ban_Listo_Read.readlines()
Ban = Ban_Listo_Read.list(ban)
# Print the function (LIST) in string .
print("Your'e Banned. You'r nickname is - ", + Ban_Listo_Read).Select (' %s ', %s str(name)) # Select the User nickname from
# The input which he typed. (Check for BAN, In the List.)
# Text file is the List Location . - Listas_BAN.txt
I'm getting a syntax error.

ll = open('untitled.txt', mode='r').readlines()
print("".join(ll).replace('\n', '.'))
name = input("~ Please enter Your name below\n")
# readlines() keeps the trailing newline on each line, so strip it before comparing
if name in [line.strip() for line in ll]:
    print('your name {n} is in the list'.format(n=name))
EDIT:
Also, consider using string formatting:
var1 = ...
var2 = ...
print("{x}...{y}".format(x=var1, y=var2))
or, in Python 3.6 and later,
print(f"{var1}...{var2}")
EDIT:
f.readlines()
https://docs.python.org/3.7/tutorial/inputoutput.html
If you want to read all the lines of a file in a list you can also use
list(f) or f.readlines().
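Putting the pieces together, a ban check could be sketched as below. This assumes the file holds one name per line (the screenshots are not available, so the real layout may differ); `is_banned` and the sample names are hypothetical, and a temporary file stands in for Listas_BAN.txt so the snippet runs on its own.

```python
import os
import tempfile

# Throwaway file standing in for Listas_BAN.txt (contents are made up).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Tom\nJerry\nSpike\n")
    ban_path = tmp.name


def is_banned(name, path):
    """Return True if name matches a line of the file, ignoring surrounding whitespace."""
    with open(path) as f:
        # Strip the newline from each line and skip blank lines.
        banned = {line.strip() for line in f if line.strip()}
    return name.strip() in banned


jerry_banned = is_banned("Jerry", ban_path)
butch_banned = is_banned("Butch", ban_path)
print(jerry_banned, butch_banned)

os.remove(ban_path)  # clean up the temporary file
```

If the names were comma-separated on a single line instead, splitting the file contents with `split(',')` and stripping each piece would be the analogous step.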

Python giving list out of bounds but it shows both elements in the list [duplicate]

I'm trying to make a simple log in program using python (still fairly new to it), and I have the log in information stored in a text file for the user to match in order to successfully log in. Whenever I run it it says "list index out of range" but I'm able to print out the value of that element in the list, which is where I'm confused. I am trying to add the first line (username) and the second line (password) in the file to the list to compare to the user inputted values for each field, but am unable to compare them.
def main():
    username = getUsername()
    password = getPassword()
    authenticateUser(username, password)
def getUsername():
    username = input("Please enter your username: ")
    return username
def getPassword():
    password = input("Please enter your password: ")
    return password
def authenticateUser(username, password):
    credentials = []
    with open("account_information.txt") as f:
        content = f.read()
        credentials.append(content)
    if(username == credentials[0] and password == credentials[1]):
        print("Login Successful!")
    else:
        print("Login Failed")
main()

You should use readlines to get your information into a list:
def authenticateUser(username, password):
    credentials = []
    with open("account_information.txt") as f:
        content = f.readlines()
With this, given what you describe, content[0] will hold your username and content[1] your password. Each line keeps its trailing newline, so strip it (for example with content[0].strip()) before comparing.

Depending on what is in your account_information.txt, file.read might not be the function you want to use.
Indeed, read returns a single string containing the whole contents of the file. So if you have your username and password on two separate lines, for instance
foo
H0weSomeP4ssworD
you may instead use readlines to parse your file into a list of strings, where each element is one line of the file.

If you look at the Python documentation at 7.2.1. Methods of File Objects, you'll see
To read a file's contents, call f.read(size), which reads some quantity of data and returns it as a string or bytes object
You will see that f.read() is not the function you are looking for. Try f.readlines() instead to put the content in a list of lines.
If you want to read all the lines of a file in a list you can also use list(f) or f.readlines().
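A sketch of how the comparison might look once the lines are split properly, in Python 3. The function name `authenticate_user`, its `path` parameter, and the temporary file are assumptions for illustration; the sample credentials are the ones used in an answer above.

```python
import os
import tempfile

# Throwaway file standing in for account_information.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("foo\nH0weSomeP4ssworD\n")
    account_path = tmp.name


def authenticate_user(username, password, path):
    """Compare the input against the first two lines of the file."""
    with open(path) as f:
        # splitlines() drops the line endings that readlines() would keep.
        credentials = f.read().splitlines()
    if len(credentials) < 2:
        return False  # file does not contain both a username and a password
    return username == credentials[0] and password == credentials[1]


ok = authenticate_user("foo", "H0weSomeP4ssworD", account_path)
bad = authenticate_user("foo", "wrong", account_path)
print(ok, bad)

os.remove(account_path)  # clean up the temporary file
```

Checking the length first avoids the "list index out of range" error when the file is shorter than expected.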

After having "your problem solved", I invite you to come back to your problem description:
Python giving list out of bounds but it shows both elements in the list
What have you learned besides the fact that you used the wrong method? Hopefully also something about diagnosing problems. How should you print a list to see how many items it contains?
If you do it like this
items = []
items.append("hello\nworld")
for i in items:
    print(i)
you'll see:
hello
world
If you conclude from this that the list has 2 items, is that correct?
Python definitely reports that you're accessing the list out of bounds, so your view and Python's view are in conflict.
I think you should at least have learned today to use len() for out-of-bounds diagnostics.
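For this kind of diagnosis, two small checks tend to help: len() for the element count, and repr() to make hidden characters such as newlines visible. A minimal sketch using the same sample list:

```python
items = []
items.append("hello\nworld")

# len() reports the real number of elements in the list.
count = len(items)
print(count)

# repr() shows the embedded newline as an escape instead of a line break.
shown = repr(items[0])
print(shown)
```

Printing the list itself (rather than each element) has a similar effect, because a list displays its elements using their repr.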

How to account for string formatting in Python variable assignment?

I am parsing text to check for the presence of a sentence such as:
u'Your new contact email thedude@gmail.com has been confirmed.'
...where the text either side of the email address will be constant, and the email address won't be constant, but will be known before parsing.
Assume the sentence is contained in a variable called response and the email address in address. I could do:
'Your new contact email ' + address + ' has been confirmed.' in response
This is a little untidy, and downright inconvenient if the text of the sentence ever changes. Is it possible to take advantage of string formatting in a variable assignment, e.g.
sentence = 'Your new contact email %s has been confirmed'
And somehow pass address into the variable at runtime?

Of course you can! Try this out...
sentence = 'Your new contact email {} has been confirmed'.format(address)
There's also this other (rather hacky) alternative...
sentence = 'Your new contact email %s has been confirmed' % address
This alternative has its limitations too, such as requiring the use of a tuple for passing more than one argument...
sentence = 'Hi, %s! Your new contact email %s has been confirmed' % ('KemyLand', address)
Edit: According to comments from the OP, he's asking how to do this if the format string happens to exist before address does. Actually, this is very simple. May I show you the last three examples with this?...
# At this moment, `address` does not exist yet.
firstFormat = 'Your new contact email address {} has been confirmed'
secondFormat = 'Your new contact email address %s has been confirmed'
thirdFormat = 'Hi, %s! Your new contact email %s has been confirmed'
# Now, somehow, `address` exists.
firstSentence = firstFormat.format(address)
secondSentence = secondFormat % address
thirdSentence = thirdFormat % ('Pyderman', address)
I hope this has shed some light on it for you!
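Applied to the original task of checking a response, the deferred template could be wrapped roughly like this. `TEMPLATE`, `is_confirmed`, and the second address are hypothetical names and data chosen for the sketch.

```python
# The template can be defined long before the address is known.
TEMPLATE = "Your new contact email {} has been confirmed."


def is_confirmed(response, address):
    """Return True if the filled-in sentence occurs in response."""
    return TEMPLATE.format(address) in response


response = "Your new contact email thedude@gmail.com has been confirmed."
match = is_confirmed(response, "thedude@gmail.com")
no_match = is_confirmed(response, "walter@example.com")
print(match, no_match)
```

If the wording of the sentence changes, only `TEMPLATE` needs to be edited.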

This is what I usually do with my SQL queries, output lines and whatever:
sentence = 'Blah blah {0} blah'
...
if sentence.format(address) in response:
    foo()
    bar()
So basically you get to keep all your I/O-related strings defined in one place instead of hardcoded all over the program. At the same time, you can edit them whenever you please, within limits: str.format() raises an IndexError when the template has more positional fields than you pass arguments, while extra arguments are silently ignored.
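It is worth checking how str.format() actually reacts to a mismatched argument count before relying on it as a safeguard. A short sketch of the behaviour in CPython:

```python
sentence = "Blah blah {0} blah"

# Surplus positional arguments are ignored without complaint.
with_extra = sentence.format("a", "b")
print(with_extra)

# A template with more fields than arguments raises IndexError.
try:
    "{0} and {1}".format("only one")
    too_few_raised = False
except IndexError:
    too_few_raised = True
print(too_few_raised)
```

So only the "too few arguments" case is caught automatically; a template that silently drops an argument needs a test of its own.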

This may be a hacky way of doing it, but if I understand you correctly, here's how you can do it.
At the beginning, declare the string, but where the address would go, put in something that will never normally occur in the text, like ||||| (5 pipe characters).
Then, when you have the address and want to pop it in, do:
myString = myString.replace('|||||', address)
That will slot your address right where you need it :) Note that replace returns a new string, so the result has to be assigned.
My understanding was that you are trying to create a string and then add a piece in later. Sorry if I misunderstood you :)

"ValueError: list.index(x): x not in list" in Python, but it exists

I want to use Python to pull the username from an email address. The solution I thought of was to append the email address into a list, find the index of the @ symbol, and then slice the list up to that index.
My code is:
#!/usr/bin/env python
email = raw_input("Please enter your e-mail address: ")
email_list = []
email_list.append(email)
at_symbol_index = email_list.index("@")
email_username = email_list[0:at_symbol_index]
print email_username
But every time I run the script, it returns the error:
ValueError: list.index(x): x not in list
What's wrong with my code?

The reason for this is that you are making a list containing the whole string as its only element, so unless the string entered is exactly "@", "@" is not in the list.
To fix it, simply don't add the email address to a list. You can perform these operations on a string directly.
As a note, you might want to check out str.split() instead, or str.partition:
email_username, _, email_host = email.partition("@")

You add the entire e-mail address to a list as a single element. Did you perhaps mean
email_list = list(email)
which adds the characters of email one by one (as opposed to the whole string at once)? Even this is unnecessary, as strings can be sliced and indexed much like lists, so you don't need a list at all in this case.
An easier way to determine the username of an e-mail address would probably be
email.split("@")[0]
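As a small self-contained sketch of the string-based approach in Python 3, assuming a conventional address with an `@` separator; the helper name `username_of` is made up for illustration.

```python
def username_of(email):
    """Return the part before the first '@', or the whole string if there is none."""
    username, _sep, _host = email.partition("@")
    return username


print(username_of("thedude@gmail.com"))
print(username_of("no-at-sign"))
```

Unlike `index`, `partition` does not raise when the separator is missing, so malformed input falls through as the whole string and can be validated separately.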

Trying to create a list of users in AD

So, I've created a script that searches AD for a list of users in a specific OU, and outputs this to a text file. I need to format this text file. The top OU I'm searching contains within it an OU for each location of this company, containing the user accounts for that location.
Here's my script:
import active_directory
import sys
sys.stdout = open('output.txt', 'w')
users = active_directory.AD_object("LDAP://ou=%company%,dc=%domain%,dc=%name%")
for user in users.search(objectCategory='Person'):
    print user
sys.stdout.close()
Here's what my output looks like, and there's just 20-something lines of this for each different user:
LDAP://CN=%username%,OU=%location%,OU=%company%,dc=%domain%,dc=%name%
So, what I want to do is just to put this in plain English, make it easier to read, just by showing the username and the subset OU. So this:
LDAP://CN=%username%,OU=%location%,OU=%company%,dc=%domain%,dc=%name%
Becomes THIS:
%username%, %location%
If there's any way to export this to .csv or a .xls to put into columns that can be sorted by location or just alphabetical order, that would be GREAT. I had one hell of a time just figuring out the text file.

If you have a string like this
LDAP://CN=%username%,OU=%location%,OU=%company%,dc=%domain%,dc=%name%
Then manipulating it is quite easy. If the format is standard and doesn't change, the fastest way to manipulate it is to use str.split()
>>> splitted = "LDAP://CN=%username%,OU=%location%,OU=%company%,dc=%domain%,dc=%name%".split('=')
yields a list
>>> splitted
['LDAP://CN', '%username%,OU', '%location%,OU', '%company%,dc', '%domain%,dc', '%name%']
Now we can access the items of the list
>>> splitted[1]
'%username%,OU'
To get rid of the ",OU", we'll need to do another split.
>>> username = splitted[1].split(",OU")[0]
>>> username
'%username%'
CSV is just a text file, so all you have to do is change your file ending. Here's a full example.
import active_directory
output = open("output.csv", "w")
users = active_directory.AD_object("LDAP://ou=%company%,dc=%domain%,dc=%name%")
for user in users.search(objectCategory='Person'):
    # Because AD_object.search() returns another AD_object
    # we cannot split it. We need the string representation
    # of this AD object, and thus have to wrap the user in str()
    splitteduser = str(user).split('=')
    username = splitteduser[1].split(",OU")[0]
    location = splitteduser[2].split(",OU")[0]
    # \n is a line ending
    output.write("%s, %s\n" % (username, location))
    # The above is the old way to format strings, but it looks simpler.
    # The newer way would be:
    # output.write("{0}, {1}\n".format(username, location))
output.close()
It's not the prettiest solution around, but it should be easy enough to understand.
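For sortable columns, the standard csv module can do the writing. The sketch below (Python 3) works on plain strings shaped like the question's output, so it does not touch Active Directory at all; `parse_dn` and the sample paths are hypothetical, and the parsing assumes no component contains an escaped comma.

```python
import csv
import io


def parse_dn(ldap_path):
    """Return (username, location) from a path shaped like the one in the question."""
    dn = ldap_path.split("://", 1)[-1]
    # Naive split: assumes every component is key=value with no escaped commas.
    parts = [part.split("=", 1) for part in dn.split(",")]
    # The first CN is the user, the first OU is the location.
    username = next(value for key, value in parts if key.strip().upper() == "CN")
    location = next(value for key, value in parts if key.strip().upper() == "OU")
    return username, location


# Made-up sample paths in the same shape as the question's output.
paths = [
    "LDAP://CN=jsmith,OU=Boston,OU=Acme,dc=corp,dc=example",
    "LDAP://CN=adoe,OU=Austin,OU=Acme,dc=corp,dc=example",
]

buffer = io.StringIO()  # stands in for open("output.csv", "w", newline="")
writer = csv.writer(buffer)
writer.writerow(["username", "location"])
# Sort the rows by location before writing them.
writer.writerows(sorted((parse_dn(p) for p in paths), key=lambda row: row[1]))
print(buffer.getvalue())
```

In the real script, each `str(user)` from the search loop would take the place of the sample strings, and the in-memory buffer would be replaced by the output file.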
