Tuesday, May 21, 2019

In R, how do I sample rows when the data frame is indexed by column?

I need to sample 800 out of 1000 rows for a training set, but single-bracket indexing on the data frame goes by column. For example, df[1] returns the first column.



Q2dat = read.csv("Q2in.csv")

Q2dat = as.data.frame(Q2dat)


Q2datTrain = sample(Q2dat,0.8*nrow(Q2dat)) # this only lets me sample columns, so 800 is too many

Q2datTrain = sample(nrow(Q2dat),0.8*nrow(Q2dat)) # this gives me 800 individual numbers, but not whole rows


I'm not sure how to change the data frame so that it indexes by rows instead of columns, or how to sample whole rows.
Converting the data frame to a matrix just produces 8000 values, and when I specify the number of rows for the matrix, it is reported as unused.
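
For reference, one common way to sample whole rows is sketched below. It assumes the numbers returned by sample(nrow(...)) are used as row positions, and that a comma inside the brackets selects rows rather than columns. The generated data frame is a hypothetical stand-in for Q2in.csv, and the names n, trainIdx, and Q2datTest are made up for illustration.

```r
# Hypothetical stand-in for read.csv("Q2in.csv"): 1000 rows, 2 columns
set.seed(42)  # only so the example is reproducible
Q2dat <- data.frame(x = rnorm(1000), y = runif(1000))

n <- nrow(Q2dat)

# Draw 800 distinct row numbers from 1..n (sampling without replacement)
trainIdx <- sample(n, size = floor(0.8 * n))

# The comma matters: df[i, ] selects rows, while df[i] selects columns
Q2datTrain <- Q2dat[trainIdx, ]

# Negative indices drop the sampled rows, leaving the other 200
Q2datTest <- Q2dat[-trainIdx, ]
```

If the data frame had only one column, Q2dat[trainIdx, ] would collapse to a plain vector. Writing Q2dat[trainIdx, , drop = FALSE] keeps it a data frame in that case.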
