It seems like it should be a simple thing: create an empty DataFrame in the Pandas Python Data Analysis Library. But if you want to create a DataFrame that
- is empty (has no records)
- has datatypes
- has columns in a specific order
...i.e. the equivalent of SQL's CREATE TABLE, then it's not obvious how to do it in Pandas, and I wasn't able to find any one web page that laid it all out. The trick is to use an empty Numpy ndarray in the DataFrame constructor:
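import numpy as np
import pandas as pd
# Zero-length structured array: one (name, dtype) pair per column, in order
df = pd.DataFrame(np.zeros(0, dtype=[
    ('ProductID', 'i4'),
    ('ProductName', 'S50')]))  # 'S50' is the current spelling of the old 'a50' code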
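To check that the result really behaves like an empty table with a schema, here is a minimal sketch. The helper name empty_frame and the Price column are hypothetical, made up for illustration; a numeric column is used instead of a string one because how pandas stores strings varies between versions.

```python
import numpy as np
import pandas as pd

def empty_frame(schema):
    """Build a zero-row DataFrame from a list of (column_name, numpy_dtype) pairs.

    Hypothetical helper: it only wraps the structured-array trick shown above.
    """
    return pd.DataFrame(np.zeros(0, dtype=schema))

# 'Price' is an illustrative column, not part of the original example
df = empty_frame([('ProductID', 'i4'), ('Price', 'f8')])

print(len(df))           # number of rows
print(list(df.columns))  # column order follows the schema list
print(df.dtypes)         # per-column dtypes taken from the schema
```

The schema list plays the role of the column definitions in CREATE TABLE: its order fixes the column order, and each dtype code fixes the column type.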
Then, to insert a single record:
df = df.append({'ProductID':1234, 'ProductName':'Widget'})
UPDATE 2013-07-18: The append call above is missing a parameter. When appending a dict, ignore_index=True is required:
df = df.append({'ProductID': 1234, 'ProductName': 'Widget'}, ignore_index=True)
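Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on current versions the insert has to be written with pd.concat instead. The sketch below is one way to do that, not the only one; the helper name append_record is hypothetical, and it casts the new row to the existing column dtypes so the schema survives the insert.

```python
import pandas as pd

def append_record(df, record):
    """Return a new DataFrame with `record` (a dict) added as the last row.

    Hypothetical helper standing in for the removed DataFrame.append.
    """
    # Build a one-row frame with the same column order and dtypes as df
    row = pd.DataFrame([record], columns=df.columns).astype(df.dtypes.to_dict())
    if df.empty:
        # Nothing to concatenate with yet; the typed row is the whole table
        return row
    return pd.concat([df, row], ignore_index=True)

# An empty typed frame, built here from empty Series so the block runs on its own
products = pd.DataFrame({
    'ProductID': pd.Series(dtype='int32'),
    'ProductName': pd.Series(dtype='object'),
})
products = append_record(products, {'ProductID': 1234, 'ProductName': 'Widget'})
products = append_record(products, {'ProductID': 5678, 'ProductName': 'Gadget'})
```

Appending row by row copies the frame each time, so for more than a handful of records it is usually better to collect the dicts in a list and build the DataFrame once.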
 
1 comment:
Thanks, this was helpful. You can also use the pandas columns argument to name the columns, e.g.
columns = ['price', 'item']
pd.DataFrame(data=np.zeros((0,len(columns))), columns=columns)
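One caveat with the columns approach: np.zeros defaults to float64, so every column comes out as float64 rather than with its own type. A minimal sketch of fixing that up afterwards with astype, assuming 'price' should be a float and 'item' a generic object column (those type choices are an assumption, not something the comment states):

```python
import numpy as np
import pandas as pd

columns = ['price', 'item']
df = pd.DataFrame(data=np.zeros((0, len(columns))), columns=columns)
# At this point both columns are float64, inherited from np.zeros

# Assign per-column types after the fact (assumed types, for illustration)
df = df.astype({'price': 'float64', 'item': 'object'})
print(df.dtypes)
```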