R : Create dataframe row per row with named and typed columns

Question

I would like to create a data frame with 4 columns: character, character, numeric, numeric

and to fill in my data line by line, for maintainability, so that the data for one row stays together as c("site1", "Site 1", 6.943890, -6.0557).

This works, but it is awful. How can I make it more "R beautiful"?

S <- data.frame( t(data.frame(
  "s1" = c("None", "None", 0, 0 ),
  "s2" = c("site1", "Site 1", 6.943890, -6.0557),
  "s3" = c("site2", "Site 2", 43.943890, -3.055796)
) ) , stringsAsFactors = F)
colnames(S) <- c("id", "name", "lat", "lng")
S$lat <- as.double(S$lat)
S$lng <- as.double(S$lng)

This results in a correct data frame with named columns of the right types.

David Mas · Accepted Answer · 2020-04-07 09:30:58Z

I am not sure I understand your question. If you want to fill in a data.frame row by row, I would use a matrix.

library(magrittr)
rnames = c("s1","s2","s3")
values = c(c("None", "None", 0, 0 ),
           c("site1", "Site 1", 6.943890, -6.0557),
           c("site2", "Site 2", 43.943890, -3.055796))
matrix(values,nrow = length(rnames),
       ncol = length(values)/length(rnames),
       byrow = T) %>% as.data.frame() -> S
colnames(S) <- c("id", "name", "lat", "lng")

S
#>      id   name      lat       lng
#> 1  None   None        0         0
#> 2 site1 Site 1  6.94389   -6.0557
#> 3 site2 Site 2 43.94389 -3.055796
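Note that c() coerces everything to character here, so lat and lng come out of the matrix as strings rather than numbers. A minimal sketch of converting just those two columns afterwards, assuming the same values as in the example above (base R only; the name num_cols is introduced purely for illustration):

```r
# Rebuild S as in the matrix example above (no pipe needed).
values <- c("None",  "None",   0,         0,
            "site1", "Site 1", 6.943890,  -6.0557,
            "site2", "Site 2", 43.943890, -3.055796)
S <- as.data.frame(matrix(values, ncol = 4, byrow = TRUE),
                   stringsAsFactors = FALSE)
colnames(S) <- c("id", "name", "lat", "lng")

# Every column is character at this point, because c() coerced the numbers.
sapply(S, class)

# Convert only the coordinate columns to numeric (num_cols is an illustrative name).
num_cols <- c("lat", "lng")
S[num_cols] <- lapply(S[num_cols], as.numeric)
str(S)
```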

The other option is to build the data frame from a list. I think this is closer to what you would need in a real-world scenario.

options(stringsAsFactors = FALSE)


df_list <- list(
  s1 = data.frame(id = "None", name = "None", lat = 0, lng = 0 ),
  s2 = data.frame(id = "site1", name =  "Site 1", lat = 6.943890,  lng = -6.0557),
  s3 = data.frame(id = "site2", name ="Site 2",lat = 43.943890,lng = -3.055796)
) 

S = dplyr::bind_rows(df_list)
colnames(S) <- c("id", "name", "lat", "lng")

# Probably not needed: lat and lng are already numeric here
S$lat <- as.double(S$lat)
S$lng <- as.double(S$lng)

S
#>      id   name      lat       lng
#> 1  None   None  0.00000  0.000000
#> 2 site1 Site 1  6.94389 -6.055700
#> 3 site2 Site 2 43.94389 -3.055796

As others have said, building a data frame row by row is not very common in R, and for large datasets the list option will likely be a bit slow. Working column by column with vectors is the best option most of the time.
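If the goal is to keep one short literal per row, as the question asks, one possible base R variant of the list idea is sketched below. It passes each row's values as separate arguments instead of through c(), so the numbers are never coerced to strings. The helper name make_row is hypothetical, and the same caveat about speed on large data applies.

```r
cols <- c("id", "name", "lat", "lng")

# Hypothetical helper: turn one row's values into a one-row data frame.
make_row <- function(...) {
  as.data.frame(setNames(list(...), cols), stringsAsFactors = FALSE)
}

# One line per row; rbind() stacks the one-row data frames.
S <- rbind(
  make_row("None",  "None",   0,         0),
  make_row("site1", "Site 1", 6.943890,  -6.0557),
  make_row("site2", "Site 2", 43.943890, -3.055796)
)
str(S)
```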

Created on 2020-04-07 by the reprex package (v0.3.0)

Ronak Shah · Accepted Answer · 2020-04-07 08:47:30Z

You could use type.convert which converts data to its appropriate class.

str(S)
#'data.frame':  3 obs. of  4 variables:
# $ X1: chr  "None" "site1" "site2"
# $ X2: chr  "None" "Site 1" "Site 2"
# $ X3: chr  "0" "6.94389" "43.94389"
# $ X4: chr  "0" "-6.0557" "-3.055796"

S <- type.convert(S, as.is = TRUE)
str(S)

#'data.frame':  3 obs. of  4 variables:
# $ X1: chr  "None" "site1" "site2"
# $ X2: chr  "None" "Site 1" "Site 2"
# $ X3: num  0 6.94 43.94
# $ X4: num  0 -6.06 -3.06

There is also readr::type_convert(S) which does the same thing.

2 Comments

Stéphane V Over a year ago

And can I get rid of the nested data.frame(t(data.frame(...))) by using a better data frame definition?

Ronak Shah Over a year ago

@StéphaneV You are creating the data frame in a row-wise fashion. That is not the standard R way; you should do it column-wise.

S <- data.frame(id = c('None', 'site1', 'site2'),
                name = c('None', 'Site1', 'Site2'),
                lat = c(0, 6.943890, 43.943890),
                long = c(0, -6.0557, -3.055796),
                stringsAsFactors = FALSE)

and you don't need to change their type explicitly afterwards.
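For readers who still want the source to read one row per line, a possible middle ground (not proposed in this thread, so treat it as an assumption about what would suit the use case) is to write the rows as inline CSV text and let read.csv() build typed columns from it. This sketch uses base R only:

```r
# Rows are written one per line; read.csv() parses them into typed columns.
# strip.white = TRUE drops the padding used to align the text.
S <- read.csv(
  text = "id,    name,   lat,       lng
None,  None,   0,         0
site1, Site 1, 6.943890,  -6.0557
site2, Site 2, 43.943890, -3.055796",
  strip.white = TRUE,
  stringsAsFactors = FALSE
)
str(S)
```

The trade-off is that the values live inside a string, so typos are only caught when the text is parsed.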
