Nested dictionary groups from excel - python

I'm new to Python and openpyxl. I started learning them to make my everyday tasks at work easier and faster.
Task:
There is an Excel file with a lot of rows; it looks like this:
excel file
I want to create a daily report based on this Excel file. In my example, today is 2019/05/08.
Expected result:
Only show the info where the date matches today's date.
Expected structure:
required outcome
My solution
In my solution I create a list of the rows that contain today's date. After that I read only those rows and create dictionaries, but the result is empty. I'm also unsure how to work with multiple identical keys, because the same issue numbers appear several times in the list.
from datetime import datetime
import openpyxl

# Open the Excel file
excel_path = "\\REE.xlsx"
wb = openpyxl.load_workbook(excel_path, data_only=True)
ws_1 = wb.worksheets[1]

# Today's date; needs some formatting due to Excel's date handling
today = datetime.today()
today = today.replace(hour=0, minute=0, second=0, microsecond=0)

# Create a list of the rows where only today's values are present
issue_line_list = []
for cell in ws_1["B"]:
    if cell.value == today:
        issue_line = cell.row
        issue_line_list.append(issue_line)

# Create a txt file for the output
file = open("daily_report.txt", "w")

# The dict that I want to use
dict = []
issue_numbers_list = []
issue = []

# Create a dict for the issues
for line in issue_line_list:
    issue_number_value = ws_1.cell(row=line, column=3).value
    issue_numbers_list.append(issue_number_value)

# Create a dict for the other information
for line in issue_line_list:
    issue_number_value = ws_1.cell(row=line, column=3).value
    by_value = ws_1.cell(row=line, column=2).value
    group_value = ws_1.cell(row=line, column=4).value
    events_value = ws_1.cell(row=line, column=5).value
    deadline_value = ws_1.cell(row=line, column=6).value
    try:
        deadline_value = deadline_value.strftime('%Y.%m.%d')
    except AttributeError:
        deadline_value = ""
    issue.append(issue_number_value)
    issue.append(by_value)
    issue.append(group_value)
    issue.append(events_value)
    issue.append(deadline_value)

# Append the two lists
dict.append(issue_numbers_list)
dict.append(issue)

# Save it to the txt file
file.write(str(dict))
file.close()
Questions
- How can I handle multiple identical keys?
- How can I create nested groups?
- What should I add to or delete from my code to get the expected result?
Remark
Openpyxl is not the only option. If you have a better/easier/faster way, I'm open to any idea.
Thank you in advance for your support!

Can you try the following:
import pandas as pd

cols = ['date', 'by', 'issue_number', 'group', 'events', 'deadline']
req_cols = ['events', 'deadline']
data = [
    ['2019-05-07', 'john', '113140', '#issue_closed', 'something different', ''],
    ['2019-05-08', 'david', '113140', '#task', 'something different', ''],
    ['2019-05-08', 'victor', '114761', '#task_result', 'something different', ''],
    ['2019-05-08', 'john', '114761', '#task', 'something different', '2019-05-10'],
    ['2019-05-08', 'david', '114761', '#task', 'something different', '2019-05-08'],
    ['2019-05-08', 'victor', '113140', '#task_result', 'something different', ''],
    ['2019-05-07', 'john', '113140', '#issue_created', 'something different', '2019-05-09'],
    ['2019-05-07', 'david', '113140', '#location', 'something different', ''],
    ['2019-05-07', 'victor', '113140', '#issue_closed', 'something different', 'done'],
    ['2019-05-07', 'john', '113140', '#task_result', 'something different', ''],
    ['2019-05-07', 'david', '113140', '#task', 'something different', '2019-05-10'],
]
df = pd.DataFrame(data, columns=cols)
df1 = df.groupby(['issue_number', 'group']).describe()[req_cols].droplevel(0, axis=1)['top']
df1.columns = req_cols
print(df1)
Output:
                                          events    deadline
issue_number group
113140       #issue_closed   something different        done
             #issue_created  something different  2019-05-09
             #location       something different
             #task           something different  2019-05-10
             #task_result    something different
114761       #task           something different  2019-05-08
             #task_result    something different
To open an Excel file, you can do the following:
df = pd.read_excel(excel_path, sheet_name=my_sheet)
req_cols = ['EVENTS', 'DEADLINE']
df1 = df.groupby(['ISSUE NUMBER', 'GROUP']).describe()[req_cols].droplevel(0, axis=1)['top']
df1.columns = req_cols
print(df1)
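If you also need to restrict the report to today's rows before grouping, here is a minimal sketch; the sheet and column names are assumptions based on the question, and comparing against a normalized Timestamp avoids string/datetime mismatches:
import pandas as pd

excel_path = "REE.xlsx"   # path and sheet name assumed from the question
my_sheet = 'Events'
req_cols = ['EVENTS', 'DEADLINE']

df = pd.read_excel(excel_path, sheet_name=my_sheet)
# normalize() gives midnight today, which compares cleanly to datetime cells
today = pd.Timestamp.today().normalize()
df_today = df[df['DATE'] == today]

df1 = df_today.groupby(['ISSUE NUMBER', 'GROUP']).describe()[req_cols].droplevel(0, axis=1)['top']
df1.columns = req_cols
print(df1)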

The task is almost solved, but I've run into a new issue.
The code:
import pandas as pd
from datetime import datetime

excel_path = "\\REE.xlsx"
my_sheet = 'Events'
cols = ['DATE', 'BY', 'ISSUE NUMBER', 'GROUP', 'EVENTS', 'DEADLINE']
req_cols = ['EVENTS', 'DEADLINE']
df = pd.read_excel(excel_path, sheet_name=my_sheet, columns=cols)
today = datetime.today().strftime('%Y-%m-%d')
today_filter = df[df['DATE'] == today]
df = pd.DataFrame(today_filter, columns=cols)
df1 = df.groupby(['ISSUE NUMBER', 'GROUP']).describe()[req_cols].droplevel(0, axis=1)['top']
df1.columns = req_cols
print(df1)
When several rows share the same value, e.g. '#task', the script prints only one of them in this case.
Required result:
114761
#task Jane another words 2019-05-10
#task result John something
#task John something else 2019-05-08
...
...
...
...
My code result:
114761
#task Jane another words 2019-05-10
#task result John something
...
...
...
The row "#task John something else 2019-05-08" does not get printed. Why?
The same thing happens in the other cases too: if several rows share the same value, the script prints out only the first and skips the rest.
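That is expected with this approach: describe() summarizes each ('ISSUE NUMBER', 'GROUP') group, and its 'top' statistic keeps only the single most frequent value per group, so rows that share a group but differ in 'BY' collapse into one. A minimal sketch of keeping every row, reusing part of the sample data from the answer above, is to index instead of aggregate:
import pandas as pd

cols = ['date', 'by', 'issue_number', 'group', 'events', 'deadline']
data = [
    ['2019-05-08', 'victor', '114761', '#task_result', 'something different', ''],
    ['2019-05-08', 'john', '114761', '#task', 'something different', '2019-05-10'],
    ['2019-05-08', 'david', '114761', '#task', 'something different', '2019-05-08'],
]
df = pd.DataFrame(data, columns=cols)

# set_index keeps one output row per input row; sort_index groups them visually
df1 = df.set_index(['issue_number', 'group'])[['by', 'events', 'deadline']].sort_index()
print(df1)
Both '#task' rows for 114761 survive, each with its own 'by' value.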

Related

How to use the input with pandas to get all the value_counts linked to this input

My dataframe looks like this:
Index(['#Organism/Name', 'TaxID', 'BioProject Accession', 'BioProject ID',
       'Group', 'SubGroup', 'Size (Mb)', 'GC%', 'Replicons', 'WGS',
       'Scaffolds', 'Genes', 'Proteins', 'Release Date', 'Modify Date',
       'Status', 'Center', 'BioSample Accession', 'Assembly Accession',
       'Reference', 'FTP Path', 'Pubmed ID', 'Strain'],
      dtype='object')
I ask the user to enter the name of the species with this script:
print("bacterie species?")
species = input()
I want to look for the rows where "#Organism/Name" equals the species entered by the user (input), then compute the value_counts() of the Status column, and finally retrieve 'FTP Path'.
Here is the code that I could do but that does not work:
if (data.loc[(data["Organism/Name"]==species)
    print(Data['Status'].value_counts())
else:
    print("This species not found")
if (data.loc[(data["Organism/Name"]==species)
    print(Data['Status'].value_counts())
else:
    print(Data.get["FTP Path"]
If I understand your question correctly, this is what you're trying to achieve:
import wget
import numpy as np
import pandas as pd

URL = 'https://ftp.ncbi.nlm.nih.gov/genomes/GENOME_REPORTS/prokaryotes.txt'
data = pd.read_csv(wget.download(URL), sep='\t', header=0)
species = input("Enter the bacteria species: ")
if data["#Organism/Name"].str.contains(species, case=False).any():
    print(data.loc[data["#Organism/Name"].str.contains(species, case=False)]['Status'].value_counts())
    FTP_list = data.loc[data["#Organism/Name"].str.contains(species, case=False)]["FTP Path"].values
else:
    print("This species not found")
To write all the FTP_Path urls into a txt file, you can do this:
with open('/path/urls.txt', mode='wt') as file:
    file.write('\n'.join(FTP_list))
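One caveat worth noting: str.contains treats its pattern as a regular expression by default, so user input containing characters like '(' or '.' can error out or over-match; passing regex=False requests a plain substring match:
mask = data["#Organism/Name"].str.contains(species, case=False, regex=False)
print(data.loc[mask, 'Status'].value_counts())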

Pandas - EmptyDataError: No columns to parse from file when reading stock .csv file

Let me first start by saying I have gone through and done my due diligence trying to find a solution based on questions previously asked on the web.
I've run into an odd bug in my code that I really cannot explain...
So far my code executes the following:
take stock symbols and write OHLC data to a CSV file
loop through the directory that contains the CSV files and use that data to calculate technical indicators
add the technical indicator data to the same CSV file
So the bug is that it executes everything perfectly (99 stocks) EXCEPT for ZM.csv (Zoom). The error that it prints is:
pandas.errors.EmptyDataError: No columns to parse from file.
So, to troubleshoot, I copied and pasted the data from ZM.csv into a CSV that I know ran fine (I used AAPL), and it actually executed fine. Next, I took the working data from AAPL.csv, pasted it into ZM.csv and ran it again. It throws the same error. I also tried renaming the file to ZMI (randomly) and it worked.
This led me to believe that, for some unknown reason, the FILENAME is the root issue. In the part where I first create the CSV files, I changed the name of the file to {symbol}1.csv, {symbol}_.csv, and {symbol}I.csv, to no avail. Lastly, I combined the two files together and did not mess with anything else. It worked. Does anyone know why?
The flow is to first run bars.py, check the data/ohlc/ directory CSV files (should only have the OHLC data), run technical_analysis.py, and then check the CSV files again (now with technical indicators).
[bars.py]
from config import *
from datetime import datetime
import requests, json

holdings = open('data/qqq.csv').readlines()
symbols_list = [holding.split(',')[2].strip() for holding in holdings][1:]
symbols = ','.join(symbols_list)

minute_bars_url = '{}/1Min?symbols={}&limit=100'.format(BARS_URL, symbols)
r = requests.get(minute_bars_url, headers=HEADERS)
ohlc_data = r.json()

for symbol in ohlc_data:
    filename = 'data/ohlc/{}.csv'.format(symbol)
    f = open(filename, 'w+')
    f.write('Timestamp,Open,High,Low,Close,Volume\n')
    for bar in ohlc_data[symbol]:
        t = datetime.fromtimestamp(bar['t'])
        timestamp = t.strftime('%I:%M:%S%p-%Z%Y-%m-%d')
        line = '{},{},{},{},{},{}\n'.format(timestamp, bar['o'], bar['h'], bar['l'], bar['c'], bar['v'])
        f.write(line)
The variables symbols_list and symbols print as follows:
symbols_list = ['AAPL', 'MSFT', 'AMZN', 'FB', 'GOOGL', 'GOOG', 'TSLA', 'NVDA', 'PYPL', 'ADBE', 'INTC', 'NFLX', 'CMCSA', 'PEP', 'COST', 'CSCO', 'AVGO', 'QCOM', 'TMUS', 'AMGN', 'TXN', 'CHTR', 'SBUX', 'ZM', 'AMD', 'INTU', 'ISRG', 'MDLZ', 'JD', 'GILD', 'BKNG', 'FISV', 'MELI', 'ATVI', 'ADP', 'CSX', 'REGN', 'MU', 'AMAT', 'ADSK', 'VRTX', 'LRCX', 'ILMN', 'ADI', 'BIIB', 'MNST', 'EXC', 'KDP', 'LULU', 'DOCU', 'WDAY', 'CTSH', 'KHC', 'NXPI', 'BIDU', 'XEL', 'DXCM', 'EBAY', 'EA', 'IDXX', 'CTAS', 'SNPS', 'ORLY', 'SGEN', 'SPLK', 'ROST', 'WBA', 'KLAC', 'NTES', 'PCAR', 'CDNS', 'MAR', 'VRSK', 'PAYX', 'ASML', 'ANSS', 'MCHP', 'XLNX', 'MRNA', 'CPRT', 'ALGN', 'PDD', 'ALXN', 'SIRI', 'FAST', 'SWKS', 'VRSN', 'DLTR', 'CERN', 'MXIM', 'INCY', 'TTWO', 'CDW', 'CHKP', 'CTXS', 'TCOM', 'BMRN', 'ULTA', 'EXPE', 'FOXA', 'LBTYK', 'FOX', 'LBTYA']
symbols = AAPL,MSFT,AMZN,FB,GOOGL,GOOG,TSLA,NVDA,PYPL,ADBE,INTC,NFLX,CMCSA,PEP,COST,CSCO,AVGO,QCOM,TMUS,AMGN,TXN,CHTR,SBUX,ZM,AMD,INTU,ISRG,MDLZ,JD,GILD,BKNG,FISV,MELI,ATVI,ADP,CSX,REGN,MU,AMAT,ADSK,VRTX,LRCX,ILMN,ADI,BIIB,MNST,EXC,KDP,LULU,DOCU,WDAY,CTSH,KHC,NXPI,BIDU,XEL,DXCM,EBAY,EA,IDXX,CTAS,SNPS,ORLY,SGEN,SPLK,ROST,WBA,KLAC,NTES,PCAR,CDNS,MAR,VRSK,PAYX,ASML,ANSS,MCHP,XLNX,MRNA,CPRT,ALGN,PDD,ALXN,SIRI,FAST,SWKS,VRSN,DLTR,CERN,MXIM,INCY,TTWO,CDW,CHKP,CTXS,TCOM,BMRN,ULTA,EXPE,FOXA,LBTYK,FOX,LBTYA
So ZM is not listed last.
[technical_analysis.py]
import btalib
import pandas as pd
from datetime import datetime
from bars import ohlc_data
from bars import symbols_list as symbols

for symbol in symbols:
    try:
        file_path = f'data/ohlc/{symbol}.csv'
        dataframe = pd.read_csv(file_path,
                                parse_dates=True,
                                index_col='Timestamp')
        sma6 = btalib.sma(dataframe, period=6)
        sma10 = btalib.sma(dataframe, period=10)
        rsi = btalib.rsi(dataframe)
        macd = btalib.macd(dataframe)
        dataframe['SMA-6'] = sma6.df
        dataframe['SMA-10'] = sma10.df
        dataframe['RSI'] = rsi.df
        dataframe['MACD'] = macd.df['macd']
        dataframe['Signal'] = macd.df['signal']
        dataframe['Histogram'] = macd.df['histogram']
        dataframe.to_csv(file_path, sep=',', index=True)
    except:
        print(f'{symbol} is not writing the technical data.')
I think the error might be that, since 'ZM' is the last symbol in holdings, it contains some whitespace, because in [bars.py] you created holdings the following way (instead of just the normal pd.read_csv):
holdings = open('data/qqq.csv').readlines()
symbols_list = [holding.split(',')[2].strip() for holding in holdings][1:]
symbols = ','.join(symbols_list)
You can probably reduce the code further to get a minimal reproducible example. I suspect there is something funny in the qqq.csv file and the split/strip code that makes the last entry not quite what you want.
Hopefully that will become clear by printing the variable values as below,
with data/qqq.csv like
xname,yname,symbol
xxx,yyy,ZM
and py example
import pandas as pd

def write_OHLC(fname):
    "write example data to a file"
    f = open(fname, 'w+')
    f.write('Timestamp,Open,High,Low,Close,Volume\n')
    # IRL, would parse the json and spit out meaningful values
    f.write('2020-10-13 16:30,1,10,5,8,100\n')

def all_symbols():
    "get list of all symbols from qqq.csv"
    holdings = open('data/qqq.csv').readlines()
    symbols_list = [holding.split(',')[2].strip() for holding in holdings][1:]
    return symbols_list

# issue saving/reading last(?) symbol
symbols = all_symbols()
print(symbols)

# check just zoom
zm_sym = symbols[-1]
fname = f'data/ohlc/{zm_sym}.csv'

# inspect
print(zm_sym)
print(fname)

# write and read back
write_OHLC(fname)
ZM = pd.read_csv(fname,
                 parse_dates=True,
                 index_col='Timestamp')
print(ZM)
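If the last symbol really does carry hidden characters (a trailing '\r' from Windows line endings is a common culprit), printing with repr() makes them visible where a plain print() hides them; a quick check on the symbols list from all_symbols() above:
for s in symbols:
    print(repr(s))  # stray '\r', spaces, or BOM characters show up in the output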

Reading a dynamic table with pandas

I'm using conda 4.5.11 and python 3.6.3 to read a dynamic list, such as this:
[['Results:',
'2',
'Time:',
'16',
'Register #1',
'Field1:',
'999999999999999',
'Field2:',
'name',
'Field3:',
'some text',
'Field4:',
'number',
'Fieldn:',
'other number',
'Register #2',
'Field1:',
'999999999999999',
'Field2:',
'name',
'Field3:',
'type',
'Field4:',
'some text'
'FieldN:',
'some text',
'Register #N',
...
]]
Here is the code for my best try:
data = []
header = []
data_text = []
for data in res:
    part = data.split(":")
    header_text = part[1]
    data_t = part[2]
    header.append(header_text)
    data_text.append(data_t)
df_data = pd.DataFrame(data_text)
df_header = pd.DataFrame(header)
Output
Field1 Field2 Field3 Field4 Fieldn1 Fieldn2 Fieldn
999999999999999 name sometext number number text number
999999999999999 name sometext number number number NAN
999999999999999 name number NAN number text number
Is it possible to read from a list and concat in one DataFrame?
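One way to get there is to walk the flat list once, starting a new record at each 'Register #N' marker and pairing each 'FieldX:' label with the value that follows it. A minimal sketch, assuming res has the shape shown above (shortened here):
import pandas as pd

# a shortened version of the flat list shown above
res = ['Results:', '2', 'Time:', '16',
       'Register #1', 'Field1:', '999999999999999', 'Field2:', 'name',
       'Register #2', 'Field1:', '999999999999999', 'Field3:', 'type']

records = []
current = None
key = None
for item in res:
    if item.startswith('Register #'):
        current = {}            # a new register starts a new record
        records.append(current)
    elif current is None:
        continue                # skip the 'Results:'/'Time:' preamble
    elif item.endswith(':'):
        key = item.rstrip(':')  # remember the field name
    else:
        current[key] = item     # pair the value with the last field name

df = pd.DataFrame(records)      # one row per register, one column per field
print(df)
pd.DataFrame(records) concatenates everything into one DataFrame, filling fields a register lacks with NaN.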

How to iterate through pandas columns and replace cells with information with the next row down

I am trying to loop through a pandas dataframe column and, if the next row down does not contain "Property Address", add the information from that next row to the row above it. For example, if a column goes from top to bottom ["Property Address", "Alternate Address", "Property Address"], I would like to take the information from "Alternate Address" and add it to the "Property Address" row above it. I have already double-checked that there are no trailing or leading spaces and that everything is lower case so that all comparisons will work. However, I still get this error:
if i == "Property Address" and df.loc[i+1, :] != "Property Address":
TypeError: must be str, not int
Does anyone have ideas on what I can do so that this will work? I am new to Python, and I am really lost. Please let me know if there is any more information I should provide to make answering this question easier. Thanks!
Here is my code so far:
import pandas as pd
import time

df = pd.read_excel('BRH.xls')  # Reads the Excel file and creates a dataframe

# Column headers
df = df[['street', 'state', 'zip', 'Address Type', 'mStreet', 'mState', 'mZip']]

propertyAddress = "Property Address"

# Iterates thru the column and replaces the current row with info from the next row down
for i in df['Address Type']:
    if i == "Property Address" and df.loc[i+1, :] != "Property Address":
        df['mStreet'] == df.loc[i + 1, 'street']
        df['mState'] == df.loc[i + 1, 'state']
        df['mZip'] = df.loc[i + 1, 'zip']

df.to_excel('BRHOut.xls')
print('operation complete in:', time.process_time(), 'ms')
You can use pd.Series.shift to construct an appropriate mask.
Here's some untested pseudo-code:
m1 = df['AddressType'].shift() == 'Property Address'
m2 = df['AddressType'] != 'Property Address'
mask = m1 & m2
for col in ['Street', 'State', 'Zip']:
    df.loc[mask, 'm'+col] = df.loc[mask, col.lower()].shift(-1)
Your TypeError is happening because i is a string. When you call df.loc[i+1, :], you are attempting to do something like "Property Address" + 1. Once you resolve that, you will still have some indexing issues in the body of your for loop.
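To see the failure in isolation: iterating over df['Address Type'] yields the cell values, so i is a string such as "Property Address", and i + 1 attempts a str + int addition:
i = "Property Address"
i + 1  # TypeError: must be str, not int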
@jpp gave a very succinct answer, but I believe it pulls information from the intended destination and writes it to the intended source. In other words, the roles of "Property Address" and "Alternate Address" are reversed. I believe this will deliver the correct result:
Setup
import pandas as pd

df = pd.DataFrame(data={
    'street': [
        '123 Main Street',
        '1600 Pennsylvania Ave',
        '567 Fake Ave',
        '1 University Ave'
    ],
    'state': ['CA', 'DC', 'DC', 'CA'],
    'zip': ['95126', '20500', '20500', '94301'],
    'Address Type': [
        'Property Address',
        'Alternate Address',
        'Property Address',
        'Alternate Address'
    ],
    'mStreet': [None, None, None, None],
    'mState': [None, None, None, None],
    'mZip': [None, None, None, None],
},
    columns=[
        'street',
        'state',
        'zip',
        'Address Type',
        'mStreet',
        'mState',
        'mZip'
    ])

# Create a new dataframe with all address attributes shifted UP one row
next_address_attributes = df[['Address Type', 'street', 'state', 'zip']].shift(-1)

# Create a series to indicate whether information should be drawn from the next row
# All the decision-making is right here
get_attributes_from_next_address = ((df['Address Type'] == 'Property Address')
                                    & (next_address_attributes['Address Type'] != 'Property Address'))
Using For Loop
for i, getting_attributes_is_necessary in get_attributes_from_next_address.iteritems():
    if getting_attributes_is_necessary:
        df.at[i, 'mStreet'] = next_address_attributes.at[i, 'street']
        df.at[i, 'mState'] = next_address_attributes.at[i, 'state']
        df.at[i, 'mZip'] = next_address_attributes.at[i, 'zip']
Loopless
df.loc[get_attributes_from_next_address, 'mStreet'] = next_address_attributes.loc[get_attributes_from_next_address, 'street']
df.loc[get_attributes_from_next_address, 'mState'] = next_address_attributes.loc[get_attributes_from_next_address, 'state']
df.loc[get_attributes_from_next_address, 'mZip'] = next_address_attributes.loc[get_attributes_from_next_address, 'zip']
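A quick check of the result (both variants fill the same cells, given the setup frame above):
print(df[['Address Type', 'mStreet', 'mState', 'mZip']])
# Rows holding 'Property Address' now carry the following row's address:
# row 0 -> 1600 Pennsylvania Ave, DC, 20500
# row 2 -> 1 University Ave, CA, 94301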

How do I select only certain rows based on label in pandas?

Here is my function:
def get_historical_closes(ticker, start_date, end_date):
    my_dir = '/home/manish/Desktop/Equity/subset'
    os.chdir(my_dir)
    dfs = []
    for files in glob.glob('*.txt'):
        dfs.append(pd.read_csv(files, names=['Ticker', 'Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Null'], parse_dates=[1]))
    p = pd.concat(dfs)
    d = p.reset_index(['Date', 'Ticker', 'Close'])
    pivoted = d.pivot_table(index=['Date'], columns=['Ticker'])
    pivoted.columns = pivoted.columns.droplevel(0)
    return pivoted

closes = get_historical_closes(['LT' or 'HDFC' or 'ACC'], '1999-01-01', '2014-12-31')
My problem is that I just want data for a few rows, namely LT, HDFC and ACC, for all the dates; but when I execute the function, I get data for all the rows (approx. 1500 of them).
How can I slice the dataframe so that I get only the selected rows and not the entire dataframe?
Raw input data is a collection of text files like so:
20MICRONS,20150401,36.5,38.95,35.8,37.35,64023,0
3IINFOTECH,20150401,5.9,6.3,5.8,6.2,1602365,0
3MINDIA,20150401,7905,7905,7850,7879.6,310,0
8KMILES,20150401,710.05,721,706,712.9,20196,0
A2ZINFRA,20150401,15.5,16.55,15.2,16,218219,0
AARTIDRUGS,20150401,648.95,665.5,639.65,648.25,42927,0
AARTIIND,20150401,348,349.4,340.3,341.85,122071,0
AARVEEDEN,20150401,42,42.9,41.55,42.3,627,0
ABAN,20150401,422,434.3,419,429.1,625857,0
ABB,20150401,1266.05,1284,1266,1277.45,70294,0
ABBOTINDIA,20150401,3979.25,4009.95,3955.3,3981.25,2677,0
ABCIL,20150401,217.8,222.95,217,221.65,11583,0
ABGSHIP,20150401,225,225,215.3,220.2,237737,0
ABIRLANUVO,20150401,1677,1677,1639.25,1666.7,106336,0
ACC,20150401,1563.7,1591.3,1553.2,1585.9,176063,0
ACCELYA,20150401,932,953.8,923,950.5,4297,0
ACE,20150401,40.1,41.7,40.05,41.15,356130,0
ACROPETAL,20150401,2.75,3,2.7,2.85,33380,0
ADANIENT,20150401,608.8,615.8,603,612.4,868006,0
ADANIPORTS,20150401,308.45,312.05,306.1,310.95,1026200,0
ADANIPOWER,20150401,46.7,48,46.7,47.75,3015649,0
ADFFOODS,20150401,60.5,60.5,58.65,59.75,23532,0
ADHUNIK,20150401,20.95,21.75,20.8,21.2,149431,0
ADORWELD,20150401,224.9,224.9,215.65,219.2,2743,0
ADSL,20150401,19,20,18.7,19.65,35053,0
ADVANIHOTR,20150401,43.1,43.1,43,43,100,0
ADVANTA,20150401,419.9,430.05,418,428,16206,0
AEGISCHEM,20150401,609,668,600,658.4,264828,0
AFL,20150401,65.25,70,65.25,68.65,9507,0
AGARIND,20150401,95,100,87.25,97.45,14387,0
AGCNET,20150401,91.95,93.75,91.4,93,2453,0
AGRITECH,20150401,5.5,6.1,5.5,5.75,540,0
AGRODUTCH,20150401,2.7,2.7,2.6,2.7,451,0
AHLEAST,20150401,196,202.4,185,192.25,357,0
AHLUCONT,20150401,249.5,258.3,246,251.3,44541,0
AHLWEST,20150401,123.9,129.85,123.9,128.35,688,0
AHMEDFORGE,20150401,229.5,237.35,228,231.45,332680,0
AIAENG,20150401,1268,1268,1204.95,1214.1,48950,0
AIL,20150401,735,747.9,725.1,734.8,31780,0
AJANTPHARM,20150401,1235,1252,1207.05,1223.3,126442,0
AJMERA,20150401,118.7,121.9,117.2,118.45,23005,0
AKSHOPTFBR,20150401,14.3,14.8,14.15,14.7,214028,0
AKZOINDIA,20150401,1403.95,1412,1392,1400.7,17115,0
ALBK,20150401,99.1,101.65,99.1,101.4,2129046,0
ALCHEM,20150401,27.9,32.5,27.15,31.6,32338,0
ALEMBICLTD,20150401,34.6,36.7,34.3,36.45,692688,0
ALICON,20150401,280,288,279.05,281.05,5937,0
ALKALI,20150401,31.6,34.2,31.6,33.95,4663,0
ALKYLAMINE,20150401,314,334,313.1,328.8,1515,0
ALLCARGO,20150401,317,323.5,315,319.15,31056,0
ALLSEC,20150401,21.65,22.5,21.6,21.6,435,0
ALMONDZ,20150401,10.6,10.95,10.5,10.75,23600,0
ALOKTEXT,20150401,7.5,8.2,7.4,7.95,8145264,0
ALPA,20150401,11.85,11.85,10.75,11.8,3600,0
ALPHAGEO,20150401,384.3,425.05,383.95,419.75,13308,0
ALPSINDUS,20150401,1.85,1.85,1.85,1.85,1050,0
ALSTOMT&D,20150401,585.85,595,576.65,588.4,49234,0
AMARAJABAT,20150401,836.5,847.75,831,843.9,121150,0
AMBIKCO,20150401,790,809,780.25,802.6,4879,0
AMBUJACEM,20150401,254.95,261.4,253.4,260.25,1346375,0
AMDIND,20150401,20.5,22.75,20.5,22.3,693,0
AMRUTANJAN,20150401,480,527.05,478.35,518.3,216407,0
AMTEKAUTO,20150401,144.5,148.45,144.2,147.45,552874,0
AMTEKINDIA,20150401,55.6,58.3,55.1,57.6,700465,0
AMTL,20150401,13.75,14.45,13.6,14.45,2111,0
ANANTRAJ,20150401,39.9,40.3,39.35,40.05,376564,0
ANDHRABANK,20150401,78.35,80.8,78.2,80.55,993038,0
ANDHRACEMT,20150401,8.85,9.3,8.75,9.1,15848,0
ANDHRSUGAR,20150401,92.05,98.95,91.55,96.15,11551,0
ANGIND,20150401,36.5,36.9,35.6,36.5,34758,0
ANIKINDS,20150401,22.95,24.05,22.95,24.05,1936,0
ANKITMETAL,20150401,2.85,3.25,2.85,3.15,29101,0
ANSALAPI,20150401,23.45,24,23.45,23.8,76723,0
ANSALHSG,20150401,29.9,29.9,28.75,29.65,7748,0
ANTGRAPHIC,20150401,0.1,0.15,0.1,0.15,23500,0
APARINDS,20150401,368.3,375.6,368.3,373.45,2719,0
APCOTEXIND,20150401,505,505,481.1,495.85,3906,0
APLAPOLLO,20150401,411.5,434,411.5,428.65,88113,0
APLLTD,20150401,458.9,464,450,454.7,72075,0
APOLLOHOSP,20150401,1351,1393.85,1351,1390,132827,0
APOLLOTYRE,20150401,169.65,175.9,169,175.2,3515274,0
APOLSINHOT,20150401,195,197,194.3,195.2,71,0
APTECHT,20150401,57.6,61,57,59.7,206475,0
ARCHIDPLY,20150401,32.95,35.8,32.5,35.35,103036,0
ARCHIES,20150401,19.05,19.4,18.8,19.25,46840,0
ARCOTECH,20150401,342.5,350,339.1,345.2,44142,0
ARIES,20150401,106.75,113.9,105,112.7,96825,0
ARIHANT,20150401,43.5,50,43.5,49.3,1647,0
AROGRANITE,20150401,61.5,62,59.55,60.15,2293,0
ARROWTEX,20150401,25.7,27.8,25.1,26.55,17431,0
ARSHIYA,20150401,39.55,41.5,39,40,69880,0
ARSSINFRA,20150401,34.65,36.5,34.6,36.3,71442,0
ARVIND,20150401,260.85,268.2,259,267.2,1169433,0
ARVINDREM,20150401,15.9,17.6,15.5,17.6,5407412,0
ASAHIINDIA,20150401,145,145,141,142.45,16240,0
ASAHISONG,20150401,113,116.7,112.15,115.85,5475,0
ASAL,20150401,45.8,45.8,38,43.95,7429,0
ASHAPURMIN,20150401,74,75.4,74,74.05,36406,0
ASHIANA,20150401,248,259,246.3,249.5,21284,0
ASHIMASYN,20150401,8.4,8.85,8.05,8.25,3253,0
ASHOKA,20150401,175.1,185.4,175.1,183.75,1319134,0
ASHOKLEY,20150401,72.7,74.75,72.7,74.05,17233199,0
ASIANHOTNR,20150401,104.45,107.8,101.1,105.15,780,0
ASIANPAINT,20150401,810,825.9,803.5,821.7,898480,0
ASIANTILES,20150401,116.25,124.4,116.25,123.05,31440,0
ASSAMCO,20150401,4.05,4.3,4.05,4.3,476091,0
ASTEC,20150401,148.5,154.5,146,149.2,322308,0
ASTRAL,20150401,447.3,451.3,435.15,448.6,64889,0
ASTRAMICRO,20150401,146.5,151.9,145.2,150.05,735681,0
ASTRAZEN,20150401,908,940.95,908,920.35,3291,0
ATFL,20150401,635,648,625.2,629.25,6202,0
ATLANTA,20150401,67.2,71,67.2,68.6,238683,0
ATLASCYCLE,20150401,203.9,210.4,203,208.05,25208,0
ATNINTER,20150401,0.2,0.2,0.2,0.2,1704,0
ATUL,20150401,1116,1160,1113,1153.05,32969,0
ATULAUTO,20150401,556.55,576.9,555.9,566.25,59117,0
AURIONPRO,20150401,192.3,224.95,191.8,217.55,115464,0
AUROPHARMA,20150401,1215,1252,1215,1247.4,1140111,0
AUSOMENT,20150401,22.6,22.6,21.7,21.7,2952,0
AUSTRAL,20150401,0.5,0.55,0.5,0.5,50407,0
AUTOAXLES,20150401,834.15,834.15,803,810.2,4054,0
AUTOIND,20150401,60,65,59.15,63.6,212036,0
AUTOLITIND,20150401,36,39,35.2,37.65,14334,0
AVTNPL,20150401,27,28,26.7,27.9,44803,0
AXISBANK,20150401,557.7,572,555.25,569.65,3753262,0
AXISCADES,20150401,335.4,345,331.4,339.65,524538,0
AXISGOLD,20150401,2473.95,2493,2461.1,2483.15,138,0
BAFNAPHARM,20150401,29.95,31.45,29.95,30.95,21136,0
BAGFILMS,20150401,3.05,3.1,2.9,3,31278,0
BAJAJ-AUTO,20150401,2027.05,2035,2002.95,2019.8,208545,0
BAJAJCORP,20150401,459,482,454,466.95,121972,0
BAJAJELEC,20150401,230,234.8,229,232.4,95432,0
BAJAJFINSV,20150401,1412,1447.5,1396,1427.55,44811,0
BAJAJHIND,20150401,14.5,14.8,14.2,14.6,671746,0
BAJAJHLDNG,20150401,1302.3,1329.85,1285.05,1299.9,24626,0
BAJFINANCE,20150401,4158,4158,4062.2,4140.05,12923,0
BALAJITELE,20150401,65.75,67.9,65.3,67.5,47063,0
BALAMINES,20150401,81.5,83.5,81.5,83.45,6674,0
BALKRISIND,20150401,649,661,640,655,16919,0
BALLARPUR,20150401,13.75,13.95,13.5,13.9,271962,0
BALMLAWRIE,20150401,568.05,580.9,562.2,576.75,17423,0
BALPHARMA,20150401,68.9,74.2,67.1,68.85,84178,0
BALRAMCHIN,20150401,50.95,50.95,49.3,50,84400,0
BANARBEADS,20150401,33,39.5,33,39.25,1077,0
BANARISUG,20150401,834.7,855,820,849.85,618,0
BANCOINDIA,20150401,105,107.5,103.25,106.8,11765,0
BANG,20150401,6.2,6.35,6.1,6.35,9639,0
BANKBARODA,20150401,162.75,170.4,162.05,168.9,2949846,0
BANKBEES,20150401,1813.45,1863,1807,1859.78,19071,0
BANKINDIA,20150401,194.6,209.8,194.05,205.75,3396490,0
BANSWRAS,20150401,65,65,60.1,63.9,6238,0
BARTRONICS,20150401,11.45,11.85,11.35,11.6,109658,0
BASF,20150401,1115,1142,1115,1124.65,14009,0
BASML,20150401,184,192,183.65,191.6,642,0
BATAINDIA,20150401,1095,1104.9,1085,1094.7,137166,0
BAYERCROP,20150401,3333,3408.3,3286.05,3304.55,8839,0
BBL,20150401,627.95,641.4,622.2,629.8,5261,0
BBTC,20150401,441,458,431.3,449.15,141334,0
BEDMUTHA,20150401,16.85,18,16.25,17.95,16412,0
BEL,20150401,3355,3595,3350,3494.2,582755,0
BEML,20150401,1100,1163.8,1086,1139.2,631231,0
BEPL,20150401,22.1,22.45,21.15,22.3,5459,0
BERGEPAINT,20150401,209.3,216.9,208.35,215.15,675963,0
BFINVEST,20150401,168.8,176.8,159.5,172.7,113352,0
BFUTILITIE,20150401,707.4,741,702.05,736.05,1048274,0
BGLOBAL,20150401,2.9,3.05,2.9,3.05,16500,0
BGRENERGY,20150401,117.35,124,117.35,122.3,207979,0
BHAGYNAGAR,20150401,17.9,17.9,16.95,17.5,1136,0
BHARATFORG,20150401,1265.05,1333.1,1265.05,1322.6,704419,0
BHARATGEAR,20150401,73.5,77.7,72.7,75.9,13730,0
BHARATRAS,20150401,810,840,800,821.4,981,0
BHARTIARTL,20150401,393.3,404.85,393.05,402.3,5494883,0
BHEL,20150401,235.8,236,229.6,230.7,3346075,0
BHUSANSTL,20150401,65.15,67.9,63.65,64,1108540,0
BIL,20150401,401.3,422,401.3,419.35,2335,0
BILENERGY,20150401,0.8,0.95,0.8,0.95,8520,0
BINANIIND,20150401,90.55,93.95,90.2,93.3,27564,0
BINDALAGRO,20150401,23.4,23.4,22.25,22.8,111558,0
BIOCON,20150401,472.5,478.85,462.7,466.05,1942983,0
BIRLACORPN,20150401,415,420,402.8,414.7,11345,0
BIRLACOT,20150401,0.05,0.1,0.05,0.1,439292,0
BIRLAERIC,20150401,52.3,54.45,52.15,53.7,9454,0
BIRLAMONEY,20150401,24.35,28.85,23.9,28.65,78710,0
BLBLIMITED,20150401,3.7,3.7,3.65,3.65,550,0
BLISSGVS,20150401,128,132.55,124.3,126.15,261958,0
BLKASHYAP,20150401,13.7,15.15,13.7,14.15,118455,0
BLUEDART,20150401,7297.35,7315,7200,7285.55,2036,0
BLUESTARCO,20150401,308.75,315,302,311.35,19046,0
BLUESTINFO,20150401,199,199.9,196.05,199.45,1268,0
BODALCHEM,20150401,34.5,34.8,33.05,34.65,65623,0
BOMDYEING,20150401,64,66.3,63.7,65.95,1168851,0
BOSCHLTD,20150401,25488,25708,25201,25570.7,16121,0
BPCL,20150401,810.95,818,796.5,804.2,1065969,0
BPL,20150401,30.55,32.5,30.55,31.75,116804,0
BRFL,20150401,146,147.9,142.45,144.3,7257,0
BRIGADE,20150401,143.8,145.15,140.25,144.05,36484,0
BRITANNIA,20150401,2155.5,2215.3,2141.35,2177.55,245908,0
BROADCAST,20150401,3.35,3.5,3.3,3.3,4298,0
BROOKS,20150401,38.4,39.5,38.4,39.3,19724,0
BSELINFRA,20150401,1.9,2.15,1.85,2.05,97575,0
BSL,20150401,29.55,31.9,27.75,31,9708,0
BSLGOLDETF,20150401,2535,2535,2501.5,2501.5,122,0
BSLIMITED,20150401,27.5,27.5,25.45,27.15,728818,0
BURNPUR,20150401,9.85,9.85,9.1,9.15,144864,0
BUTTERFLY,20150401,190.95,194,186.1,192.35,25447,0
BVCL,20150401,17.25,17.7,16.5,17.7,9993,0
CADILAHC,20150401,1755,1796.8,1737.05,1790.15,302149,0
CAIRN,20150401,213.85,215.6,211.5,213.35,841463,0
CAMLINFINE,20150401,89.5,91.4,87.5,91.1,32027,0
CANBK,20150401,366.5,383.8,365.15,381,1512605,0
CANDC,20150401,20.6,24.6,20.6,23.25,9100,0
CANFINHOME,20150401,611.1,649.95,611.1,644.7,72233,0
CANTABIL,20150401,47.6,50.5,47.6,50.25,5474,0
CAPF,20150401,398.85,427,398,421.75,224074,0
CAPLIPOINT,20150401,1020,1127.8,1020,1122.65,108731,0
CARBORUNIV,20150401,191.05,197,188.35,190,42681,0
CAREERP,20150401,151.9,156.6,149,153.25,26075,0
CARERATING,20150401,1487,1632.75,1464,1579.2,65340,0
CASTROLIND,20150401,476,476.25,465.1,467.3,185850,0
CCCL,20150401,4.2,4.7,4.2,4.65,47963,0
CCHHL,20150401,10.8,11,10.4,10.8,69325,0
CCL,20150401,178.35,185.9,176,184.3,244917,0
CEATLTD,20150401,805.25,830.8,785.75,826.7,501415,0
CEBBCO,20150401,18.3,20.25,18.1,19.85,40541,0
CELEBRITY,20150401,11.5,12.5,11.5,12.1,5169,0
CELESTIAL,20150401,59.9,61.8,59.5,60.05,128386,0
CENTENKA,20150401,152,159.9,148.2,157.1,16739,0
CENTEXT,20150401,1.5,1.5,1.2,1.25,19308,0
CENTRALBK,20150401,106,107.2,104.3,106.3,992782,0
CENTUM,20150401,756.85,805,756.8,801.9,26848,0
CENTURYPLY,20150401,234,245,234,243.45,367540,0
CENTURYTEX,20150401,633.6,682.4,631,675.35,3619413,0
CERA,20150401,2524.75,2524.75,2470,2495.3,6053,0
CEREBRAINT,20150401,15.6,16.2,14.65,14.8,348478,0
CESC,20150401,604.95,613.4,595.4,609.75,294334,0
CGCL,20150401,173,173,173,173,9,0
CHAMBLFERT,20150401,70.2,73.4,70.2,72.65,2475030,0
CHEMFALKAL,20150401,72.8,77,72,76.3,1334,0
CHENNPETRO,20150401,69,70.35,68.3,68.95,160576,0
CHESLINTEX,20150401,10.1,10.1,8.75,9.4,1668,0
CHOLAFIN,20150401,599.85,604,582.15,598.2,23125,0
CHROMATIC,20150401,3.4,4.05,3,3.3,63493,0
CIGNITITEC,20150401,433,444.95,432,440,32923,0
CIMMCO,20150401,92,94.05,91,94.05,19931,0
CINELINE,20150401,14.5,14.95,14.5,14.9,4654,0
CINEVISTA,20150401,3.3,3.3,3.3,3.3,10,0
CIPLA,20150401,714,716.5,703.85,709.6,1693796,0
CLASSIC,20150401,1.5,1.55,1.45,1.45,7770,0
CLNINDIA,20150401,824.7,837.9,819,828.8,6754,0
CLUTCHAUTO,20150401,13.75,13.75,13.6,13.6,1414,0
CMAHENDRA,20150401,9.35,9.5,8.9,9.15,1005172,0
CMC,20150401,1925.85,1925.85,1891,1907.25,153068,0
CNOVAPETRO,20150401,20,22.75,17.1,22.75,1656,0
COALINDIA,20150401,362.9,364.25,358,363,1428949,0
COLPAL,20150401,2003.4,2009.9,1990.05,2002.5,92909,0
COMPUSOFT,20150401,9.4,10.05,9,9.7,15083,0
CONCOR,20150401,1582.35,1627.3,1561,1582.85,182280,0
CONSOFINVT,20150401,36.55,40,36.5,40,439,0
CORDSCABLE,20150401,25.55,28,24.1,25.8,15651,0
COREEDUTEC,20150401,8,8.85,7.6,8.4,890455,0
COROMANDEL,20150401,268.5,271.35,266.15,268.35,42173,0
CORPBANK,20150401,52.5,55,52.05,54.1,1141752,0
COSMOFILMS,20150401,76.9,80,76.2,79.25,21020,0
COUNCODOS,20150401,1.2,1.2,1.2,1.2,2850,0
COX&KINGS,20150401,323,324.85,316.5,317.8,76998,0
CPSEETF,20150401,24.2,24.37,24.08,24.34,180315,0
CREATIVEYE,20150401,3.4,3.6,2.8,3.45,8545,0
CRISIL,20150401,2049,2052.45,2000,2030.7,3928,0
CROMPGREAV,20150401,164.85,167.4,163.2,166.1,2739478,0
CTE,20150401,18.55,18.55,16.85,17.05,8260,0
CUB,20150401,97.35,98.75,96.4,98.3,182702,0
CUMMINSIND,20150401,879,900.95,874.75,889.9,358652,0
CURATECH,20150401,10.8,11,9.75,10,755,0
CYBERTECH,20150401,28.5,33.45,28.1,33.4,103549,0
CYIENT,20150401,509.9,515,495.1,514.1,30415,0
DAAWAT,20150401,105,112.25,99.5,108.4,26689,0
DABUR,20150401,266.5,268.5,264.65,266.55,642177,0
DALMIABHA,20150401,428.15,439.9,422.5,432.65,9751,0
DALMIASUG,20150401,17.5,17.5,16.45,17.15,12660,0
DATAMATICS,20150401,66.5,75,66,72.15,119054,0
DBCORP,20150401,378,378,362.6,369.45,8799,0
DBREALTY,20150401,67,67.15,65.8,66.3,212297,0
DBSTOCKBRO,20150401,47.6,47.65,47.45,47.55,24170,0
DCBBANK,20150401,110.95,114.95,110.15,114.45,935858,0
DCM,20150401,84.5,88.75,84.1,87,34747,0
DCMSHRIRAM,20150401,107.95,114.3,107.95,112.8,29474,0
DCW,20150401,16.75,17.2,16.65,17.15,270502,0
DECCANCE,20150401,310.05,323.9,310.05,321.55,446,0
DECOLIGHT,20150401,1.45,1.45,1.4,1.4,1100,0
DEEPAKFERT,20150401,140,144,138.25,139.95,162156,0
DEEPAKNTR,20150401,68,70.65,66.4,69.95,8349,0
DEEPIND,20150401,46.6,54.4,46.3,51.9,52130,0
DELTACORP,20150401,79.95,82.75,79.75,82.35,889247,0
DELTAMAGNT,20150401,36.6,37.45,36.6,37.45,60,0
DEN,20150401,121.45,127,121.2,122.4,59512,0
DENABANK,20150401,50.8,51.5,50.1,51.35,376680,0
DENORA,20150401,136.7,136.7,131.05,133.6,743,0
DHAMPURSUG,20150401,36.8,36.95,34.85,36.35,38083,0
DHANBANK,20150401,30.8,32.1,30.5,31.75,195779,0
DHANUKA,20150401,690,690,652,660.15,24958,0
DHARSUGAR,20150401,14.15,14.7,13.8,14.45,1748,0
DHFL,20150401,468.9,474.9,461.6,467.85,448551,0
DHUNINV,20150401,97.15,103,94.5,99.85,15275,0
DIAPOWER,20150401,44.9,45.95,43.3,45.55,126085,0
DICIND,20150401,343,347,341,341.95,7745,0
DIGJAM,20150401,8,8.15,7.75,8.05,96467,0
DISHMAN,20150401,168,172.65,164.7,171.8,778414,0
DISHTV,20150401,82.2,84.85,81.35,84.15,5845850,0
DIVISLAB,20150401,1770.1,1809,1770.1,1802.35,68003,0
DLF,20150401,157,160.9,156.2,159.7,3098216,0
DLINKINDIA,20150401,165.05,168,162.2,164.75,22444,0
DOLPHINOFF,20150401,120.8,134.4,119.5,130.2,190716,0
DONEAR,20150401,15,15.95,14.5,15.35,679,0
DPL,20150401,46.6,49,44,45.45,25444,0
DPSCLTD,20150401,17.15,17.15,16.55,16.85,916,0
DQE,20150401,24.3,24.8,22.75,23.1,57807,0
DRDATSONS,20150401,5.8,6.1,5.7,6,2191357,0
DREDGECORP,20150401,374.9,403,372.65,393.4,106853,0
DRREDDY,20150401,3541,3566.8,3501.7,3533.65,282785,0
DSKULKARNI,20150401,77.6,77.6,74,77.1,3012,0
DSSL,20150401,9.5,9.5,9.5,9.5,50,0
DTIL,20150401,206.95,231.75,205.95,219.05,1437,0
DUNCANSLTD,20150401,15.55,16.3,15.3,15.85,740,0
DWARKESH,20150401,21,21,19.85,20.7,9410,0
DYNAMATECH,20150401,3868,4233,3857.1,3920.55,59412,0
DYNATECH,20150401,2.85,3,2.85,3,3002,0
EASTSILK,20150401,1.55,1.85,1.55,1.75,9437,0
EASUNREYRL,20150401,40.05,43,40.05,42.55,21925,0
ECEIND,20150401,136,148,127,133.85,43034,0
ECLERX,20150401,1603.8,1697,1595,1600.65,123468,0
EDELWEISS,20150401,63.65,67.5,63,66.6,451255,0
EDL,20150401,23.9,25,23.9,24.4,7799,0
EDUCOMP,20150401,12.45,13.55,12.35,13.55,499009,0
EICHERMOT,20150401,15929,16196.95,15830.05,16019.5,45879,0
EIDPARRY,20150401,174.05,175.8,168.65,171.2,56813,0
EIHAHOTELS,20150401,228,232.8,225,228,85,0
EIHOTEL,20150401,107.25,110,107.25,109.5,57306,0
EIMCOELECO,20150401,399,409.5,399,409.5,184,0
EKC,20150401,9.35,11.15,9.35,11.05,350782,0
ELAND,20150401,14.3,16.45,14.3,16.25,191406,0
ELDERPHARM,20150401,90.5,91.5,89.45,91.5,23450,0
ELECON,20150401,66.5,76.2,66.25,74.45,6045416,0
ELECTCAST,20150401,19.8,20.55,18.9,19.4,1956889,0
ELECTHERM,20150401,25.9,25.9,22.2,24,14611,0
ELGIEQUIP,20150401,147.5,150.4,146.4,150,9475,0
....
ZENITH, 20150401,...
I used EdChum's code from his comment and added some clarification. I think the main problem is that d is the output DataFrame built from all *.txt files, so it cannot be assembled inside the for loop if you need one output from all of them.
import pandas as pd
import glob

def get_historical_closes(ticker, start_date, end_date):
    dfs = []
    # glob can use a path with *.txt - see http://stackoverflow.com/a/3215392/2901002
    for files in glob.glob('/home/manish/Desktop/Equity/subset/*.txt'):
        # added index_col for a multiindex df
        dfs.append(pd.read_csv(files,
                               index_col=['Date', 'Ticker', 'Close'],
                               names=['Ticker', 'Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Null'],
                               parse_dates=[1]))
    # d is the output from all .txt files, so it cannot be built inside the for loop
    p = pd.concat(dfs)
    d = p.reset_index(['Date', 'Ticker', 'Close'])
    d = d[(d['Ticker'].isin(ticker)) & (d['Date'] > start_date) & (d['Date'] < end_date)]
    pivoted = d.pivot_table(index=['Date'], columns=['Ticker'])
    pivoted.columns = pivoted.columns.droplevel(0)
    return pivoted

# isin needs a list of tickers, so 'or' has to be replaced by ','
# arguments changed for testing: 'HDFC' to 'AGCNET' and end_date '2014-12-31' to '2015-12-31'
closes = get_historical_closes(['LT', 'AGCNET', 'ACC'], '1999-01-01', '2015-12-31')
print(closes)
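As a side note on the original call: the first version of get_historical_closes never used its ticker argument to filter, which is why every ticker came back; and ['LT' or 'HDFC' or 'ACC'] would not have helped anyway, because or returns its first truthy operand:
print('LT' or 'HDFC' or 'ACC')  # -> LT, so ['LT' or 'HDFC' or 'ACC'] == ['LT']
isin therefore needs an explicit list, as in the corrected call above.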
