Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 days ago.
Hey Stack Overflow community! I have a dataset of company names, some of which are misspelled (for example gogle, amacon, steckoverflaw). The dataset is enormous, so correcting each name by hand would take a lot of time.
I tried fuzzywuzzy and the results were OK but not perfect. Does anyone have any thoughts on how to solve this?
from textblob import TextBlob
with open("text.txt", "r") as f:  # Opening the test file for reading
    text = f.read()  # Reading the file
textBlb = TextBlob(text) # Making our first textblob
textCorrected = textBlb.correct() # Correcting the text
print(textCorrected)
from textblob import TextBlob
a = "fuzzywuzzy"
print("original text: "+str(a))
b = TextBlob(a)
print("corrected text: "+str(b.correct()))
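If you happen to have (or can build) a list of the correct company names, matching each entry against that list may work better than a general English spell checker. Below is a minimal sketch using difflib from the standard library, which is a different technique from TextBlob. The KNOWN_NAMES list, the correct_name helper and the 0.8 cutoff are all assumptions made for illustration, so tune them for your data.

```python
import difflib

# Hypothetical reference list of correctly spelled names (an assumption).
KNOWN_NAMES = ["google", "amazon", "stackoverflow"]


def correct_name(name, known_names=KNOWN_NAMES, cutoff=0.8):
    """Return the closest known name, or the input unchanged if nothing is close enough."""
    matches = difflib.get_close_matches(name.lower(), known_names, n=1, cutoff=cutoff)
    return matches[0] if matches else name


misspelled = ["gogle", "amacon", "steckoverflaw", "xyzzy"]
corrected = [correct_name(n) for n in misspelled]
print(corrected)
```

Names with no sufficiently similar entry in the list are left untouched, so you can review those by hand.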
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 months ago.
I have the input like this
8.8.8.8,678,fog,hat
8.8.4.4,5674,rat,fruit
www.google.com,1234,can,zone
I want to split this input in Python and fetch only the first field (the IP address or hostname), which I will then use for pinging. I would really appreciate your help with this.
I'm guessing that your input is in a .csv file.
You could use pandas or the built-in csv library to parse it. The example below uses the latter.
import csv
with open("my_input.csv", newline="") as file:
    csv_reader = csv.reader(file, delimiter=",")
    ip_addresses = [ip for ip, *_ in csv_reader]  # keep only the first column
print(ip_addresses)
>> ['8.8.8.8', '8.8.4.4', 'www.google.com']
I collect the first item of each row by using iterable unpacking.
Let's say one line of your data is in a variable called line. Then you can get the ping address this way:
ping = line.split(',')[0]
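To apply the same split to the whole input at once, something like the sketch below might do. The helper names first_fields and is_ip are made up for this example, and the sample string just repeats the input from the question. The ipaddress check is optional and only there in case you want to separate real IP addresses from hostnames such as www.google.com.

```python
import ipaddress


def first_fields(text):
    """Return the first comma-separated field of every non-empty line."""
    return [line.split(",")[0].strip() for line in text.splitlines() if line.strip()]


def is_ip(value):
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


sample = """8.8.8.8,678,fog,hat
8.8.4.4,5674,rat,fruit
www.google.com,1234,can,zone"""

targets = first_fields(sample)                # every ping target, hostnames included
ips_only = [t for t in targets if is_ip(t)]   # literal IP addresses only
print(targets)
print(ips_only)
```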
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 10 months ago.
My text file looks like this
{} /n {}/n
When I use readlines, I get a list of strings, like ['{}', '{}'].
How do I get rid of the strings and read them as sets?
This code should work for you.
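with open('yourfile.txt', 'r') as file:
    data = file.read().splitlines()
# Strip the braces, split on commas, and drop empty items so '{}' becomes an empty set.
# The elements stay strings.
data = [set(x.strip() for x in a.strip()[1:-1].split(',') if x.strip()) for a in data]
print(data)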
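You can also use eval here, but eval is dangerous on untrusted input because it executes arbitrary code. Also note that eval('{}') returns an empty dict, not an empty set.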
with open('yourfile.txt', 'r') as file:
    data = file.read().splitlines()
data = [eval(a) for a in data]
print(data)
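A safer option than eval could be ast.literal_eval, which only accepts Python literals and does not run arbitrary code. Here is a minimal sketch. The parse_set_line helper and the sample lines are assumptions for illustration, and it assumes each line holds a set literal such as {1, 2, 3} or an empty {}.

```python
import ast


def parse_set_line(line):
    """Parse one line containing a set literal into a real set."""
    value = ast.literal_eval(line.strip())
    # '{}' is parsed as an empty dict, so treat it as an empty set here.
    if isinstance(value, dict) and not value:
        return set()
    if not isinstance(value, set):
        raise ValueError(f"not a set literal: {line!r}")
    return value


# Hypothetical file contents, one literal per line.
lines = ["{1, 2, 3}", "{}", "{'a', 'b'}"]
parsed = [parse_set_line(line) for line in lines]
print(parsed)
```

Unlike the split-based version, the elements keep their real types (integers stay integers).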
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
I have a string
x='125mg'
I want to detect whether a number and text are joined together and, if they are, separate them into 125 and mg.
Try this:
import re
a = '125mg switch'
' '.join(re.findall(r'[A-Za-z]+|\d+', a))
Output:
'125 mg switch'
This is a very simple task using Python's regular expression module, re. Here is code that extracts the number from the string:
Python code:
import re
a = '125msg'
result = re.findall(r'\d+', a)
for i in result:
    print(i)
You could simply do it using a regular expression in Python. I don't know whether pandas can do that.
Read more about it from this link
import re
test_str = "125mg"
res = re.findall(r'[A-Za-z]+|\d+', test_str)
print(str(res))
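If you also need the detection step (is this token a number glued to text or not), a sketch along these lines might help. The split_number_unit helper is a made-up name, and it assumes the token is digits followed directly by ASCII letters, as in the question's example.

```python
import re

# Digits immediately followed by letters, e.g. "125mg".
NUMBER_UNIT = re.compile(r"(\d+)([A-Za-z]+)")


def split_number_unit(token):
    """Return (number, text) if token is digits followed by letters, else None."""
    match = NUMBER_UNIT.fullmatch(token)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


print(split_number_unit("125mg"))
print(split_number_unit("mg"))
```

A return value of None means the number and text were not together, so the token can be left as it is.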
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
Welcome to demofile.txt
This file is for testing purposes.
Good Luck!
The content above is in my text file.
How do I read the last 10 bytes of that text file? The expected output is:
Good Luck!
in_file.seek(-10, 2) # where 2 denotes the reference point being the end of the file
So then:
with open("demofile.txt", "rb") as in_file:  # opening for [r]eading as [b]inary
    in_file.seek(-10, 2)  # move to 10 bytes before the end of the file
    s = in_file.read(10)
print(s)
OUTPUT:
b'Good Luck!'
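Note that seeking to -10 from the end raises an OSError if the file is shorter than 10 bytes. A slightly more defensive sketch is below. The read_last_bytes helper is a made-up name, and the temporary file only stands in for demofile.txt so the example is self-contained.

```python
import os
import tempfile


def read_last_bytes(path, n=10):
    """Return the last n bytes of the file, or the whole file if it is shorter."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)      # jump to the end to learn the file size
        size = f.tell()
        f.seek(max(size - n, 0))    # never seek before the start of the file
        return f.read()


# Stand-in for demofile.txt with the content from the question.
content = b"Welcome to demofile.txt\nThis file is for testing purposes.\nGood Luck!"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(content)

tail = read_last_bytes(tmp.name, 10)
short_tail = read_last_bytes(tmp.name, 10_000)   # asks for more than the file holds
os.remove(tmp.name)
print(tail.decode("ascii"))
```

Calling decode on the result gives a normal string instead of a bytes object, assuming the last bytes are plain ASCII as they are here.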
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
Hi, I was wondering how I could format a large text file by adding line breaks after certain characters or words. For instance, every time a comma appears in the paragraph, could I use Python to add an extra line break after it?
You can do this using str.replace() in Python. Check out the code below, which replaces every , with ,\n.
string = ""
with open('test.txt', 'r') as myfile:
    for line in myfile:
        string += line.replace(",", ",\n")
# The with block closes the file; reopen it for writing to overwrite the contents
with open('test.txt', 'w') as myfile:
    myfile.write(string)
File before execution:
Hello World and again HelloWorld,sdjakljsljsfs,asdgrwcfdssaasf,sdfoieunvsfaf,asdasdafjslkj,
After Execution:
Hello World and again HelloWorld,
sdjakljsljsfs,
asdgrwcfdssaasf,
sdfoieunvsfaf,
asdasdafjslkj,
You can use the str.replace() method like so:
'roses can be blue, red, white'.replace(',', ',\n') gives
'roses can be blue,\n red,\n white', effectively inserting '\n' after every ,
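If you want to break after several different characters or words in one pass, and drop the space that would otherwise start the next line, re.sub may be handy. This is only a sketch: break_after is a made-up helper name and the token lists are examples.

```python
import re


def break_after(text, tokens):
    """Insert a newline after each token, swallowing spaces and tabs that follow it."""
    pattern = "|".join(re.escape(t) for t in tokens)
    # Group 1 is the matched token; the trailing [ \t]* is dropped.
    return re.sub(rf"({pattern})[ \t]*", lambda m: m.group(1) + "\n", text)


formatted = break_after("roses can be blue, red, white", [","])
print(formatted)
multi = break_after("a and b; c", ["and", ";"])
print(multi)
```

For a large file you could apply break_after line by line while reading, the same way the str.replace() loop does.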