Creating a Pandas DataFrame from a Numpy array: How do I specify the index column and column headers?


I have a NumPy array built from a list of lists, representing a two-dimensional array with row labels and column names, as shown below:

data = np.array([['','Col1','Col2'],['Row1',1,2],['Row2',3,4]])

I'd like the resulting DataFrame to have Row1 and Row2 as index values, and Col1 and Col2 as column headers.

I can specify the index as follows:

df = pd.DataFrame(data,index=data[:,0]),

however, I am unsure how to best assign column headers.






To use the first row as column headers and the first column as the index, pass them explicitly to the DataFrame constructor through its index and columns arguments, and pass only the remaining cells as data.

pd.DataFrame(data=data[1:,1:], index=data[1:,0], columns=data[0,1:])

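data[1:,1:] selects the actual values, which start at the second row and second column. data[1:,0] takes the row labels from the first column, and data[0,1:] takes the column names from the first row; both skip the empty corner cell.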
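For reference, here is a minimal self-contained sketch of the same approach, using the array from the question. The final astype(int) step is an addition that is not part of the original answer; it assumes you want numeric values, since a NumPy array that mixes strings and numbers stores every element as a string.

```python
import numpy as np
import pandas as pd

# Mixing strings and numbers in one NumPy array stores everything as strings.
data = np.array([['', 'Col1', 'Col2'],
                 ['Row1', 1, 2],
                 ['Row2', 3, 4]])

df = pd.DataFrame(
    data=data[1:, 1:],    # values: skip the header row and the label column
    index=data[1:, 0],    # row labels: first column, without the corner cell
    columns=data[0, 1:],  # column names: first row, without the corner cell
)

# Assumption: the values should be integers. Without this step they stay strings.
df = df.astype(int)

print(df)
```

If the values are not all integers, something like pd.to_numeric applied per column may be a better fit than astype(int).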



