Related
I had removed .text and had just written peter_pan.split() but I was not sure whether the output was correct.
import requests
from nltk import FreqDist
url = "https://www.gutenberg.org/files/16/16-0.txt"
peter_pan = requests.get(url).text
peter_pan_words = peter_pan.text.split()
word_frequency = FreqDist(peter_pan_words)
freq = word_frequency.most_common(3)[2][1]
print(freq)
Where you are doing:
peter_pan = requests.get(url).text
The text that you are using is a method of the response object that is returned by requests.get. This method returns a string (containing the contents of the page). All this is fine.
However, you are then doing:
peter_pan_words = peter_pan.text.split()
This time what you are trying to do is access the text method of a str object, but a str does not have any method called text. If you do print(dir(peter_pan)), you will see the attributes (methods/properties) that are available for you to access. You should see something similar to:
['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
You will see that text is not one of them (which is why you are getting the error), but you will also see that split is there -- and in fact all you need to do is to invoke split directly:
peter_pan_words = peter_pan.split()
This part of your code:
peter_pan = requests.get(url).text
When you call .text, you are converting the object into a string. So, since it's now a string, you cannot call .text on it again. Simply change:
peter_pan_words = peter_pan.text.split()
to
peter_pan_words = peter_pan.split()
I am having trouble getting a list of all variables in a file, and I have tried all methods that I can find, but there isn't a way that worked for me. dir() would return this:
['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isascii', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
While working on this, I also had trouble importing a file from a variable containing the file name. I have tried importlib.import_module(filename), with open(filename, 'r') as infile: needed_file = infile.read(), and __import__() was deprecated, in the end, I'm just using with open().
Back to my first problem, I have tried using getattr(), however I got the same results as with dir(). These are what I have tried:
named_objects = dir(self.filedat)
print(named_objects)
attrs = [attr for attr in dir(self.filename) if not attr.startswith('__')]
file_objects = [getattr(self.filedat, attr) for attr in attrs]
named_objects = []
for attr in file_objects:
if isfunction(attr):
named_objects.append(attr.__name__)
elif not isfunction(attr) and not isclass(attr) and not ismodule(attr) and not ismethod(attr):
named_objects.append(attr.__name__)
print(file_objects)
I have also tried many variations of those, and none of which have succeeded.
Edit:
I am trying to get a list of all manually defined variables in said file.
I'm using pyshark to parse pcap files. I want to access layer fields using variable as shown in simple example below:
For example to access ntp server ip:
p = cap[0]
print(p.bootp.option_ntp_server)
However, I want to access it like this:
option_list = {
"12": "option_hostname",
"60": "option_vendor_class_id",
"43": "option_ntp_server"
}
p = cap[0]
print(p.bootp.%s %(option_list["43"]))
Of course this kind of access is not possible. So I tried to use getattr() like this:
getOption = getattr(p.bootp, option_list["43"])
getOption()
And it gave me following error:
'LayerFieldsContainer' object is not callable
It seems pyshark packet or layer classes is not callable.
How I can access layer fields using variable? Or can you suggest me another method to access options using option type numbers? Because I want to access this fields using option type numbers (Like option 12, option 43), not using titles.
i think OBJ pyshark.packet.fields.LayerFieldsContainer have format special, you can't use it as string.
dir (LayerFieldsContaniner) have:
['__add__', '__class__', '__class__', '__contains__', '__delattr__', '__delattr__', '__dict__', '__dir__', '__doc__', '__doc__', '__eq__', '__format__', '__format__', '__ge__', '__getattr__', '__getattribute__', '__getattribute__', '__getitem__', '__getnewargs__', '__getslice__', '__getstate__', '__getstate__', '__gt__', '__hash__', '__hash__', '__init__', '__init__', '__le__', '__len__', '__lt__', '__mod__', '__module__', '__module__', '__mul__', '__ne__', '__new__', '__new__', '__reduce__', '__reduce__', '__reduce_ex__', '__reduce_ex__', '__repr__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__setattr__', '__setstate__', '__setstate__', '__sizeof__', '__sizeof__', '__slots__', '__str__', '__str__', '__subclasshook__', '__subclasshook__', '__weakref__', '_formatter_field_name_split', '_formatter_parser', 'add_field', 'all_fields', 'alternate_fields', 'base16_value', 'binary_value', 'capitalize', 'center', 'count', 'decode', 'encode', 'endswith', 'expandtabs', 'fields', 'find', 'format', 'get_default_value', 'hex_value', 'hide', 'index', 'int_value', 'isalnum', 'isalpha', 'isdigit', 'islower', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'main_field', 'name', 'partition', 'pos', 'raw_value', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'show', 'showname', 'showname_key', 'showname_value', 'size', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'unmaskedvalue', 'upper', 'zfill']
you can use it.
Initially I was trying to find dir(re) but realized I had quotes.
Edit for clarity:
Wouldn't "re" be a string? So dir("re") == dir(string)? The output isn't the same. That is essentially what I am wondering.
Edit for comments:
I might misunderstand but doesn't dir return a list of all the functions within the module? Also when I call dir on "re" it gives me a list of functions. No error is returned.
edit2: yes dir. Been switching between a ruby and python project and for some reason I had def on the brain. sorry xD
I think what you need is a clarification on what dir does.
dir is a Python built-in function that does either one of two things, depending on whether an argument is supplied or not:
Without an argument, it returns a list of the names in the current scope. These names are represented as strings.
With an argument, it returns a list of the attributes and methods that belong to that argument. Once again, it will be a list of strings.
Below is a demonstration of the first action:
>>> num = 1 # Just to demonstrate
>>> dir()
['__builtins__', '__doc__', '__name__', '__package__', 'num']
>>>
And here is a demonstration of the second:
>>> dir(str)
['__add__', '__class__', '__contains__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__getslice__', '__gt__', '__hash__', '__init__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '_formatter_field_name_split', '_formatter_parser', 'capitalize', 'center', 'count', 'decode', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'index', 'isalnum', 'isalpha', 'isdigit', 'islower', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
>>>
Moreover, if its argument is a Python module, then dir will list the names contained in that module:
>>> import sys
>>> dir(sys)
['__displayhook__', '__doc__', '__excepthook__', '__name__', '__package__', '__stderr__', '__stdin__', '__stdout__', '_clear_type_cache', '_current_frames', '_getframe', '_mercurial', 'api_version', 'argv', 'builtin_module_names', 'byteorder', 'call_tracing', 'callstats', 'copyright', 'displayhook', 'dllhandle', 'dont_write_bytecode', 'exc_clear', 'exc_info', 'exc_type', 'excepthook', 'exec_prefix', 'executable', 'exit', 'flags', 'float_info', 'float_repr_style', 'getcheckinterval', 'getdefaultencoding', 'getfilesystemencoding', 'getprofile', 'getrecursionlimit',
'getrefcount', 'getsizeof', 'gettrace', 'getwindowsversion', 'hexversion', 'long_info', 'maxint', 'maxsize', 'maxunicode', 'meta_path', 'modules', 'path', 'path_hooks', 'path_importer_cache', 'platform', 'prefix', 'ps1', 'ps2', 'py3kwarning', 'setcheckinterval', 'setprofile', 'setrecursionlimit', 'settrace', 'stderr', 'stdin', 'stdout', 'subversion', 'version', 'version_info',
'warnoptions', 'winver']
>>>
Now, you said in your question that the output of dir("re") was not equal to dir(string). There are a few things I would like to say about this:
If string is a string literal, as in dir("string"), then it should work:
>>> dir("re") == dir("string")
True
>>>
If "re" is actually the re module, then the behavior you are seeing is expected since the module is not a string.
If you were expecting dir("re") to list the names contained in the re module, then you are mistaken. You cannot reference the module with a string. Instead, you have to explicitly import it first:
>>> dir("re") # The output is the same as doing dir(str)
['__add__', '__class__', '__contains__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__getslice__', '__gt__', '__hash__', '__init__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '_formatter_field_name_split', '_formatter_parser', 'capitalize', 'center', 'count', 'decode', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'index', 'isalnum', 'isalpha', 'isdigit', 'islower', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
>>>
>>> import re
>>> dir(re) # The output is for the re module, not str
['DEBUG', 'DOTALL', 'I', 'IGNORECASE', 'L', 'LOCALE', 'M', 'MULTILINE', 'S', 'Scanner', 'T', 'TEMPLATE', 'U', 'UNICODE', 'VERBOSE', 'X', '_MAXCACHE', '__all__', '__builtins__', '__doc__', '__file__', '__name__', '__package__', '__version__', '_alphanum', '_cache', '_cache_repl', '_compile', '_compile_repl', '_expand', '_pattern_type', '_pickle', '_subx', 'compile', 'copy_reg', 'error', 'escape', 'findall', 'finditer', 'match', 'purge', 'search', 'split', 'sre_compile', 'sre_parse', 'sub', 'subn', 'sys', 'template']
>>>
What are the following names when I first start my python shell? They do not look like functions from __builtins__:
>>> dir(__name__)
['__add__', '__class__', '__contains__', '__delattr__', '__doc__', '__eq__',
'__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__',
'__getslice__', '__gt__', '__hash__', '__init__', '__le__', '__len__', '__lt__',
'__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__',
'__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__',
'__subclasshook__', '_formatter_field_name_split', '_formatter_parser',
'capitalize', 'center', 'count', 'decode', 'encode', 'endswith', 'expandtabs',
'find', 'format', 'index', 'isalnum', 'isalpha', 'isdigit', 'islower', 'isspace',
'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'partition', 'replace',
'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split',
'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper',
'zfill']
__name__ is a string and those a string methods. It's the name of the module or '__main__' on the toplevel. Hence this idiom appears often:
if __name__ == '__main__':
# this file was called directly, not imported
main()
dir(__name__) shows the attributes of __name__, and since __name__ is a string, it shows the attributes of the str class. Most of the listed attributes are methods. You can get more information using help():
>>> help(str.index)
Help on method_descriptor:
index(...)
S.index(sub [,start [,end]]) -> int
Like S.find() but raise ValueError when the substring is not found.