Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 2 years ago.
I just finished reading some books about pandas, numpy & matplotlib and thought about getting some practice.
Unfortunately, I am a bit lost on how to start.
Can anyone recommend a site, which provides csv-files to practice on?
I have found some sites like kaggle or data.gov, but they just have files, which are already cleaned etc.
I'm also open to other ways on how to practice those libraries.
Grateful for every answer.
Best regards
Just search the web. Use e.g. Google to search for "Pandas tutorial",
"NumPy tutorial" and so on.
Even the home site of matplotlib contains a couple of introductory
tutorials. See e.g. https://matplotlib.org/tutorials/index.html
A good source of knowledge is also Stack Overflow itself.
You could try finding some interesting datasets on: icpsr.umich.edu
In my experience, these sets often contain missing values and values that need formatting, which could be good for practice.
Some datasets are public, but many require special credentials. If you are a university student, this often qualifies.
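If the files you find are already clean, another way to practice is to start from a small, deliberately messy table. The snippet below is only a sketch: the CSV content and the column names (`name`, `age`, `joined`) are made up for illustration, and the cleaning steps are just one reasonable order, not the only one.

```python
import io

import pandas as pd

# Hypothetical messy CSV, invented for this example:
# stray spaces, a missing value, and values of the wrong type.
raw = io.StringIO(
    "name,age,joined\n"
    " Alice ,34,2021-01-05\n"
    "bob,,2021-02-17\n"
    "Carol,forty,not a date\n"
)
df = pd.read_csv(raw)

# Normalise the text column: trim whitespace, unify capitalisation.
df["name"] = df["name"].str.strip().str.title()

# Convert columns to proper types; anything unparseable becomes NaN/NaT.
df["age"] = pd.to_numeric(df["age"], errors="coerce")
df["joined"] = pd.to_datetime(df["joined"], errors="coerce")

# Count what is still missing in each column after the conversion.
missing_per_column = df.isna().sum()
print(missing_per_column)
```

From here you can decide per column whether to drop, fill, or flag the missing entries, which is most of what data cleaning practice comes down to.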
Closed 2 years ago.
I've been studying Python for data science for about 5 months now, but I get really stuck when it comes to matplotlib. There are always so many options for doing anything, and I can't see a well-defined path for any task. Does anyone else have this problem and know how to deal with it?
I had the same problem some time back. I just picked the Boston Housing Prices dataset and kept practicing on that. If you work on it enough, you will be able to create all types of plots for the EDA and get good practice. Of course, after a certain point it can get boring; that's when you jump to a dataset in an area that interests you. In my case it was movie reviews.
Below is the link to the housing prices data.
https://www.kaggle.com/c/house-prices-advanced-regression-techniques
I think your question is saying that you are bored and do not have any projects to work on. If that is correct, sites like Kaggle offer many open datasets that programmers can practice on.
In programming in general, "there's always so many options to do anything".
I recommend that you read through the library and get a feel for its functions and classes at a glance, then go and solve some problems from websites or take on a real project if you can. If your code works, do not worry and move on.
After this trial and error you will have a lot of practical insight into various problems, and you will recognize the differences between these options and their pros and cons. That was me three years ago.
Closed 3 years ago.
Does anyone know of any useful resources to learn the structure of python code? When I say structure, I'm referring to the ingredients necessary to make the code correct, so a combination of syntax and the order of terms etc.
For example, to understand the type of a pandas dataframe, df, (I know, I've already identified that it's a dataframe, but just hypothetically), we type: type(df). For the shape, we write df.shape.
So, in the latter, why do we put df first, as in df.shape? Yet in the former, we put df last, and in parentheses, as in type(df). So, mixing these commands up, one may write df.type and expect to get an answer, but they wouldn't. So how do we learn these rules? I have searched high and low for some guidance, but I find very little; a lot of the guides assume you have an appreciation of these rules and don't explain them.
I'm not asking for help on this particular problem. Rather, I'm looking for a resource that would enable me to understand this, and the multitude of other rules better.
I don't have a Computer Science background, so my answer might be unorthodox or incomplete. There are things which come across as inconsistent when you use Python, and these things often trick me when I write new code. So you can easily find yourself looking up the same stuff time and time again.
However, I would say that the more you practice, the less likely you are to make mistakes or be confused. One good tip is to create a OneNote page with all the snippets that usually trip you up. By going back to the same snippets you may end up saving yourself time and fixing those concepts in memory. It will cost you nothing, so give it a try.
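For the type(df) versus df.shape example from the question, a snippet like the following may be worth keeping on such a page. It is only a sketch: `Table` is a hypothetical stand-in class, not pandas, but the same distinction applies. `type` is a built-in function that accepts any object, while `shape` and `head` are things the class itself defines, so they are reached through the object with a dot.

```python
class Table:
    """Hypothetical stand-in for a DataFrame, for illustration only."""

    def __init__(self, rows):
        self.rows = rows

    @property
    def shape(self):
        # An attribute computed by the class: read it without parentheses.
        n_cols = len(self.rows[0]) if self.rows else 0
        return (len(self.rows), n_cols)

    def head(self, n=1):
        # A method defined by the class: call it with parentheses.
        return Table(self.rows[:n])


t = Table([[1, 2, 3], [4, 5, 6]])

kind = type(t)      # built-in function, works on any object
size = t.shape      # attribute that belongs to Table objects
first = t.head(1)   # method that belongs to Table objects

# Table never defined anything called "type", so t.type would raise
# AttributeError; hasattr reports that without raising.
has_type_attr = hasattr(t, "type")

print(kind, size, first.shape, has_type_attr)
```

In an interactive session, `dir(t)` or `help(t)` lists what an object defines, which is a quick way to check whether something is reached with a dot or is a standalone function.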
Closed 8 years ago.
NLTK is something I've been interested in for a fair while, but so far I have been unaware of an efficient way to become (relatively) acquainted with it.
I have done some preliminary research, which has resulted in my awareness of two online courses:
statistics.com/nlp-using-nltk/
statistics.com/nlp-using-nltk/
and this ebook:
nltk.org/book/ch00.html
I would appreciate a recommendation of an online tutorial or course, preferably in video format, that would be a good tool for me to use. If you have used one of the things that I linked to, please give me your impressions, if possible.
I strongly recommend working your way through the NLTK book, chapter by chapter. That's by far the best way to learn how to use NLTK, and it doesn't require much Python knowledge to get started.
Earlier this year, I put together a very basic introduction to NLTK that some people have found useful: A Smattering of NLP in Python.
I'm not aware of any good video tutorials for NLTK.
Closed 8 years ago.
I stumbled upon the wikidump python library, which I think suits me just fine.
I could get by by looking at the source code, but I'm new at python and I don't want to write BS code as the project I need it for is kind of important to me.
I got the 'wiki-SPECIFICDATE-pages-articles.xml.bz2' file and I would need to use that as my source for single article fetching. Can anyone give me some pointers as to properly achieve this or, even better, point at some documentation? I couldn't find any!
(p.s. if you got any better and properly doc'd lib, please tell me)
Not sure if I understand the question, but if you have the Wikipedia dump and you need to parse the wikicode, I would suggest mwparserfromhell lib.
Another powerful framework is Pywikibot, which is the long-standing framework for bot users on Wikipedia (thus, it has many scripts dedicated to writing pages rather than reading and parsing articles). It has a lot of documentation (though it is sometimes obsolete) and it uses the MediaWiki API.
You can use them both, of course: PWB for fetching articles and mwparserfromhell for parsing.
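If you would rather read a single article straight from the downloaded dump file, a standard-library-only approach is also possible. The following is a minimal sketch, not the wikidump library's own method: it assumes the dump is bz2-compressed MediaWiki export XML with `page`, `title` and `text` elements, and `find_article` and `local_name` are hypothetical helper names. The returned wikicode could then be handed to a parser such as mwparserfromhell.

```python
import bz2
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def find_article(dump_path, wanted_title):
    """Stream the dump and return the wikitext of the first page whose
    title matches, or None if no such page is found."""
    with bz2.open(dump_path, "rb") as stream:
        # iterparse reads incrementally, so the whole dump is never
        # decompressed into memory at once.
        for _event, elem in ET.iterparse(stream, events=("end",)):
            if local_name(elem.tag) != "page":
                continue
            title, text = None, None
            for child in elem.iter():
                name = local_name(child.tag)
                if name == "title":
                    title = child.text
                elif name == "text":
                    text = child.text
            if title == wanted_title:
                return text
            # Discard the contents of pages we do not need.
            elem.clear()
    return None
```

A call would look like `find_article("wiki-SPECIFICDATE-pages-articles.xml.bz2", "Some title")`. This scans the file from the start every time, so for many lookups it is probably better to build an index of titles first.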
Closed 4 years ago.
I'm just starting with mwclient. I'm going to create bots to query our MediaWiki database and make small revisions.
But I cannot find anywhere a simple list of python commands like how to get ages of pages, contents of categories, contents of pages, etc.
Does anyone know a good starters resource?
The official docs at https://github.com/mwclient/mwclient/wiki have some introductory tutorials. I'm in charge of documentation for mwclient but haven't had enough time to really expand it. I could use help from anyone who is willing.
One of my colleagues just sent me a link to the MediaWiki API wiki page.
I currently use python+urllib for API queries, and mwclient whenever I need to edit/create a page.
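For the read-only side, a query with urllib might look roughly like the sketch below. The endpoint `https://wiki.example.org/w/api.php` is a placeholder and the function names are my own; the request parameters (`action=query`, `prop=revisions`, `list=categorymembers` and so on) are the standard MediaWiki API ones, but check them against your wiki's version.

```python
import json
import urllib.parse
import urllib.request

# Placeholder endpoint: replace with your own wiki's api.php URL.
API_URL = "https://wiki.example.org/w/api.php"


def page_content_url(title, api_url=API_URL):
    """Build a query URL for the current wikitext of one page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content|timestamp",
        "rvslots": "main",
        "titles": title,
        "format": "json",
    }
    return api_url + "?" + urllib.parse.urlencode(params)


def category_members_url(category, limit=50, api_url=API_URL):
    """Build a query URL listing the pages in a category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": "Category:" + category,
        "cmlimit": str(limit),
        "format": "json",
    }
    return api_url + "?" + urllib.parse.urlencode(params)


def fetch_json(url):
    """Send the request and decode the JSON reply (needs network access)."""
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_json(page_content_url("Main Page"))` would return the parsed reply as a dictionary. The `timestamp` in the revision data gives the time of the latest edit, which is one way to get at the age of a page.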
A useful place to get started with mwclient (read/edit/create a page):
http://brianna.modernthings.org/article/134/write-api-enabled-on-wikimedia-sites
The Bot Manual also has tons of good info and links, e.g. creating a bot.