Transposing first column in dataframe in pandas - python

I am fairly new to Python and I just do pretty basic stuff. So when I managed to sort this out by myself I was really chuffed, although I am not sure if this is the most pythonic way to do it.
I had a csv file that contained this information when read the normal way:
I wanted the items in the first row to be the column headers.

So I used Transpose:
df_t=df.T
And saved it as a new file, without the header, to remove the existing headings.
df_t.to_csv('employment.csv', header=False)
When opening the new file, this is what I have:
As I said, not the most pythonic, but it seems to work. Any suggestions on how to improve this will be most appreciated.
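One possible way to skip the intermediate file is to tell pandas which column holds the labels before transposing. This is only a minimal sketch: the sample data below is made up, since the original CSV is not shown, and it assumes the labels sit in the first column.

```python
import io

import pandas as pd

# Hypothetical stand-in for the original CSV (the real file is not shown).
raw = io.StringIO(
    "Year,2019,2020\n"
    "Employed,100,110\n"
    "Unemployed,10,12\n"
)

# index_col=0 makes the first column the row labels instead of ordinary data.
df = pd.read_csv(raw, index_col=0)

# After transposing, those row labels become the column headers.
df_t = df.T

# Carry the corner label ("Year") over so it is written to the output.
df_t.index.name = df.index.name

# With no path given, to_csv returns the CSV text instead of writing a file.
out = df_t.to_csv()
print(out)
```

Passing a filename to to_csv here would write the transposed table directly, with the old first column as the header row.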

Related

How can I format google sheets so I can export my data properly?

I plan to make an educational web game. I have thousands of trivia questions I need to write down in a way that can be easily transferred out and automatically organized based on their column, at a later date.
I was advised to use Google Sheets so I can later export as a .csv, which should be easy for a developer to work with. When I exported a .csv and opened it with pandas in Python, a column was cut off and 1 column was used as a 'header', not as a normal entry: https://imgur.com/a/olcpVO8. This obviously won't work and seems to be an issue.
Should I just leave the first row and column empty and work around the issue? I don't want to write thousands of sets only to find out I did this the wrong way. Can anyone give any insight into whether this is my best option and how I should best format it?
I have to write Questions(1), Answers(4), Explanations(1) per entry
I hope this makes sense, thanks for your time.
I tried doing this and have no issue at all using the exported CSV from Google Sheets, using the same data as in your example.
In my opinion, whatever software you're using in your second screenshot is the issue. It seems to be removing numbers from the first row because it treats that row as the header row. Check your software for options like "First column contains headers" or "Use row 1 as Header" and make sure they aren't enabled.
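If the file is read with pandas, the same behaviour can be seen and switched off there. A minimal sketch, assuming the sheet has no header row and one entry per line; the sample rows and the column names are invented for illustration:

```python
import io

import pandas as pd

# Hypothetical export with no header row: every line is a trivia entry.
csv_text = (
    "What is 2+2?,4,3,5,22,Basic addition\n"
    "Capital of France?,Paris,Rome,Oslo,Bern,It is Paris\n"
)

# Default behaviour: the first line is consumed as the column names.
with_header = pd.read_csv(io.StringIO(csv_text))

# header=None keeps every line as data; names= supplies the labels instead.
cols = ["question", "answer_1", "answer_2", "answer_3", "answer_4", "explanation"]
no_header = pd.read_csv(io.StringIO(csv_text), header=None, names=cols)

print(len(with_header), len(no_header))
```

The other option is to type the column names into row 1 of the sheet, in which case the default read already does the right thing.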

Python Processing large amount of data from excel

I've got a very hard task to do. I need to process an Excel file with 6336 rows x 53 columns. My task is to create a program which will:
Read data from input Excel file.
Sort all rows by specific column data, for eg. Sort by A1:A(last)
Place columns in new output Excel file by given order, for eg.
New column    Source
SaleCity      Old file Col[A1:A(last)]
Branch        For eg. SaleCity='Oklahoma', Branch='OKL GamesShop'
CustomerID    Old file Col[M1:M(last)]
InvoiceNum    Merge old file cols Col[K1:K(last)] & Col[B1:B(last)]
Save new excel File.
Excel Sample: (image not included)
(All data in this post is not real so don't try to hack someone or something :D)
I know that I did not provide any code but to be honest I tried solving it by myself and I don't even know which module I should use. I tried using OpenPyXl and Pandas but there's too much data for my capabilities.
Thank you in advance for any help. If I asked the question in the wrong place, please direct me to the right one.
Edit:
To be clear: I'm not asking for a full solution here. What I am asking for is guidance and mentoring.
I would recommend using PySpark. It is more difficult than pandas, but the parallelization it provides will help with your large Excel files.
Or you could use the multiprocessing library from Python to parallelize pandas functions.
https://towardsdatascience.com/make-your-own-super-pandas-using-multiproc-1c04f41944a1
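For what it's worth, 6336 rows x 53 columns fits comfortably in memory, so plain pandas may already be enough for the steps listed in the question. A rough sketch under stated assumptions: the column names DocPrefix and DocNumber, the sample rows, and the city-to-branch lookup are all hypothetical stand-ins, and the frame is built in memory where the real program would call pd.read_excel.

```python
import pandas as pd

# Hypothetical stand-in for pd.read_excel("input.xlsx"); names are assumptions.
old = pd.DataFrame({
    "SaleCity": ["Tulsa", "Oklahoma", "Austin"],
    "DocPrefix": ["B", "A", "C"],
    "DocNumber": [17, 42, 5],
    "CustomerID": [301, 302, 303],
})

# Hypothetical lookup from city to branch name.
branch_by_city = {"Oklahoma": "OKL GamesShop"}

# Sort all rows by one column, then renumber the rows.
srt = old.sort_values("SaleCity", kind="stable").reset_index(drop=True)

# Build the output with columns in the required order.
new = pd.DataFrame({
    "SaleCity": srt["SaleCity"],
    "Branch": srt["SaleCity"].map(branch_by_city),  # NaN where no branch is known
    "CustomerID": srt["CustomerID"],
    # Merge two old columns into one text column.
    "InvoiceNum": srt["DocPrefix"] + srt["DocNumber"].astype(str),
})

# new.to_excel("output.xlsx", index=False)  # needs an Excel writer such as openpyxl
print(new)
```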

Import Only Necessary CSV Columns In IDL

I am struggling to find a function in IDL that will replicate something I have done in Python with Pandas. I am new to IDL and there is next to nothing resource wise that I can find.
In Python, I use the following:
pd.read_csv('<csv filepath>', usecols=[n])
The usecols part will only pull in the columns of a CSV I would like in my data frame. Is there a way to do this in IDL?
I hope this makes sense - my first post here!
Thanks.
There is a READ_CSV routine that can read CSV files, but it does not have a way to pull out specific columns. It will give you a structure with one field for each column of the CSV file, so you could just grab the column you need from the structure and throw away the rest. Something like:
csv = read_csv('somefile.csv')
col_n = csv.(n)
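For comparison, the pandas behaviour the question is trying to replicate can be sketched as follows. The sample CSV text and the variable names are made up for illustration; usecols accepts either zero-based positions or column names.

```python
import io

import pandas as pd

# Hypothetical three-column CSV used only for illustration.
csv_text = "a,b,c\n1,2,3\n4,5,6\n"

# Select a single column by its zero-based position.
n = 1
only_b = pd.read_csv(io.StringIO(csv_text), usecols=[n])

# Select several columns by name.
a_and_c = pd.read_csv(io.StringIO(csv_text), usecols=["a", "c"])

print(only_b)
print(a_and_c)
```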

How do I write to one sheet in an already existing excel sheet in Python?

I have an Excel file that has four sheets. One sheet, sheet 4, contains data in simple CSV form, and the others read the data from this sheet and make different calculations and graphs. In my Python application I would like to open the Excel file, open sheet 4, and replace the data. I know you technically can't open and edit Excel however you like with Python, due to the complex file structure of XLS (previous relevant answer), but is there a workaround for this specific case? Remember, the only thing I want to do is open the data sheet, write to it, and ignore the others...
Note: Previous answers to relevant questions have suggested using the copy function in xlutils. But that doesn't work in this case, as the rest of the sheets are rather complex. The graphs, for example, can't be preserved with the copy function.
I used to use pyExcelerator. It certainly did a good job, but I'm not sure if it is still maintained.
https://pypi.python.org/pypi/pyExcelerator/
hth.

Python. Enthought Traits. How to get a multi-column from a spreadsheet to display on the TabularEditor?

I am trying to understand how enthought.traits and enthought.traitsui work, and so far I have found it very easy to work with. I have also looked at the example https://svn.enthought.com/enthought/browser/Traits/trunk/enthought/traits/ui/demo/Advanced/Tabular_editor_demo.py?rev=17881
That example shows how to put data into a TabularEditor, but the column names have to be listed in the Adapter class. If I want to load all my data, with loads of columns and rows, from a spreadsheet into the table, how would I go about it? Is there a demo file that I have missed?
