I trying to load a bunch of ExcelSheets into Oracle table. The excelsheets are not consistent. So the tables in Oracle need to be dynamically generated based on the column in the Excelsheets. This is the easiest solution I found in the internet, so I went on with this approach. I'm kinda Novice in Python.
The code I've used to load the excel is below
xls_file = pd.ExcelFile("Excel File")
df = xls_file.parse('Sheet1',parse_dates=True)
df = df.replace(np.nan, '') #replace NaN with blank string
df = df.replace('NA', '') #replace "NA" with blank string
df = df.replace('_ND', '') #replace "NA" with blank string
df.to_sql('MEM_TABLE', conn_ora, if_exists='append', index = False )
The table that to_sql method created has following structure
MEM_TABLE(
date CLOB NULL,
emp_nbr NUMBER(19,0) NULL
)
In dataframe it reads something like this
DATE EMP_NBR
0 2
1 2018-01-04 00:00:00 1
2 2018-01-01 00:00:00 2
3 2018-01-03 00:00:00 1
4 1
5 1
Looks like Pandas is auto casting 1/4/2018 to date when creating the DataFrame. The error I'm getting below, which I believe is because the DATE column in table MEM_TABLE is CLOB and in dataFrame the Date field has DATE object. Looks likes the date object can't be insert into a CLOB cell.
TypeError: expecting string or bytes object
If the Dates are in string(i.e "1/4/2018"), I'm not getting any errors.
My question here is Is there anyway to disable the auto Parse on the date field, while the dataframe is getting created?
You are facing this issue because on line 2 you are setting parse_dates argument as true, it needs to be set as false.
df = xls_file.parse('Sheet1',parse_dates=False)