Character value to numeric vector in dataframe

Question

From a dataframe I build a character vector from a column by arranging by date and time, grouping by date, and collapsing the values:

fall_hc %>% arrange(a_dat,AZeit) %>% group_by(a_dat) %>%
        mutate(time_vec = str_c(snzeit,collapse= ",")) %>% 
        ungroup() %>% 
        filter(!is.na(time_vec))

This results in character vectors like:

x <- c("5, 31, 16, 64, 9, 10, 31")

Within the dataframe, I need a numeric vector from this x: 5 31 16 64 9 10 31,
because I want to calculate diff(x): 26 -15 48 -55 1 21,
and then process this further to count the number of negative differences.
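
For a single string, the target can be sketched in base R as follows. This is only an illustration of the desired result for one value, not the dataframe solution; the names x_num, d and n_neg are made up for the example.

```r
# Single comma-separated string, as in the example above
x <- c("5, 31, 16, 64, 9, 10, 31")

# strsplit() returns a list with one element per input string; take the first.
# as.numeric() tolerates the leading spaces left after splitting on ",".
x_num <- as.numeric(strsplit(x, ",")[[1]])

d <- diff(x_num)      # successive differences: 26 -15 48 -55 1 21
n_neg <- sum(d < 0)   # number of negative differences: 2
```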

Here are some data:

tibble::tribble(
              ~a_dat,  ~AZeit, ~snzeit,
        "2019-01-02", "24180",      31,
        "2019-01-02", "24360",      27,
        "2019-01-02", "24480",      16,
        "2019-01-02", "24780",      64,
        "2019-01-02", "30420",       9,
        "2019-01-02", "30840",      10,
        "2019-01-02", "35280",      31,
        "2019-01-03", "24120",      40,
        "2019-01-03", "24120",      27,
        "2019-01-03", "24480",       6,
        "2019-01-03", "24480",       4,
        "2019-01-03", "24780",       9,
        "2019-01-03", "25380",      25,
        "2019-01-03", "26460",      33,
        "2019-01-04", "24000",       5,
        "2019-01-04", "24360",       2,
        "2019-01-04", "24900",       1,
        "2019-01-04", "27180",      29,
        "2019-01-04", "30600",       8,
        "2019-01-07", "24780",      25,
        "2019-01-07", "24840",       4,
        "2019-01-07", "28920",       3,
        "2019-01-07", "31620",      11,
        "2019-01-08", "24060",      46,
        "2019-01-08", "24480",       7,
        "2019-01-08", "25260",       4,
        "2019-01-08", "27900",       5,
        "2019-01-08", "29820",       5,
        "2019-01-08", "30060",      74,
        "2019-01-08", "33360",       5,
        "2019-01-08", "33600",      28,
        "2019-01-08", "34200",      15,
        "2019-01-08", "35520",      13,
        "2019-01-08", "36000",      19,
        "2019-01-08", "44100",      24
        )

They are already sorted, so you can leave out %>% arrange(a_dat,AZeit).

To clarify the purpose: I need to know how the operations are ordered with respect to snzeit, that is, the time from cut to suture. I want to identify the days on which they are ordered from short to long.

So for each date group, you want to calculate the number of negative differences in the snzeit column?
– Shafee, Commented Jul 7, 2022 at 9:08
At the moment I want to identify the groups with increasing snzeit. My theory is that ordering operations by increasing snzeit reduces the patients' waiting time, and I want to test that with my data.
– Peter Hahn, Commented Jul 7, 2022 at 9:23
I am quite lost about what you want to do, but I suppose you should be able to solve it now from the answers given to this question. Thanks.
– Shafee, Commented Jul 7, 2022 at 10:05

Shafee · Accepted Answer · 2022-07-07 09:24:46Z

2

If I have understood your problem correctly, you do not need to create that character vector in the first place. You just need a list-column; then apply diff to each list element and count the number of negative values for each a_dat.


library(dplyr)
library(tibble)


fall_hc <- tibble::tribble(
  ~a_dat,  ~AZeit, ~snzeit,
  "2019-01-02", "24180",      31,
  "2019-01-02", "24360",      27,
  "2019-01-02", "24480",      16,
  "2019-01-02", "24780",      64,
  "2019-01-02", "30420",       9,
  "2019-01-02", "30840",      10,
  "2019-01-02", "35280",      31,
  "2019-01-03", "24120",      40,
  "2019-01-03", "24120",      27,
  "2019-01-03", "24480",       6,
  "2019-01-03", "24480",       4,
  "2019-01-03", "24780",       9,
  "2019-01-03", "25380",      25,
  "2019-01-03", "26460",      33,
  "2019-01-04", "24000",       5,
  "2019-01-04", "24360",       2,
  "2019-01-04", "24900",       1,
  "2019-01-04", "27180",      29,
  "2019-01-04", "30600",       8,
  "2019-01-07", "24780",      25,
  "2019-01-07", "24840",       4,
  "2019-01-07", "28920",       3,
  "2019-01-07", "31620",      11,
  "2019-01-08", "24060",      46,
  "2019-01-08", "24480",       7,
  "2019-01-08", "25260",       4,
  "2019-01-08", "27900",       5,
  "2019-01-08", "29820",       5,
  "2019-01-08", "30060",      74,
  "2019-01-08", "33360",       5,
  "2019-01-08", "33600",      28,
  "2019-01-08", "34200",      15,
  "2019-01-08", "35520",      13,
  "2019-01-08", "36000",      19,
  "2019-01-08", "44100",      24
)


fall_hc %>%
    arrange(a_dat, AZeit) %>%
    group_by(a_dat) %>%
    summarise(
        time_vec = list(snzeit)
    ) %>%
    group_by(a_dat) %>%
    summarise(
        time_vec = diff(unlist(time_vec)) # calculate diffs for each list
    ) %>%
    group_by(a_dat) %>%
    summarise(
        time_vec_neg = sum(time_vec < 0) # count number of negative values
    )

#> `summarise()` has grouped output by 'a_dat'. You can override using the
#> `.groups` argument.
#> # A tibble: 5 × 2
#>   a_dat      time_vec_neg
#>   <chr>             <int>
#> 1 2019-01-02            3
#> 2 2019-01-03            3
#> 3 2019-01-04            3
#> 4 2019-01-07            2
#> 5 2019-01-08            5

Created on 2022-07-07 by the reprex package (v2.0.1)
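
As a hedged variant of the same idea, the count can also be computed in a single summarise() call without the intermediate list-column. The sketch below uses only two days of the sample data to stay short; the names ops, res, n_neg and ascending are made up for illustration. The ascending column is an assumption about what "sorted from short to long" means (no negative differences). This form returns one row per group, so it also avoids the deprecation warning that dplyr 1.1.0 and later emit when summarise() returns several rows per group.

```r
library(dplyr)
library(tibble)

# Two days taken from the sample data in the question
ops <- tribble(
  ~a_dat,       ~AZeit,  ~snzeit,
  "2019-01-04", "24000",  5,
  "2019-01-04", "24360",  2,
  "2019-01-04", "24900",  1,
  "2019-01-04", "27180", 29,
  "2019-01-04", "30600",  8,
  "2019-01-07", "24780", 25,
  "2019-01-07", "24840",  4,
  "2019-01-07", "28920",  3,
  "2019-01-07", "31620", 11
)

res <- ops %>%
  arrange(a_dat, AZeit) %>%
  group_by(a_dat) %>%
  summarise(
    n_neg     = sum(diff(snzeit) < 0),   # number of negative differences per day
    ascending = all(diff(snzeit) >= 0)   # TRUE if snzeit never decreases that day
  )
```

For these two days, n_neg is 3 and 2, matching the rows for 2019-01-04 and 2019-01-07 in the output above, and ascending is FALSE for both.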

edited Jul 7, 2022 at 9:24

answered Jul 6, 2022 at 12:33


5 Comments

Peter Hahn Over a year ago

Thanks, I tried that already. It works only on single vectors. Within a dataframe I get an error: must be size 8350 or 1, not 87908.

Shafee Over a year ago

@PeterHahn, can you then share a sample of your fall_hc data, so that I can give it a try?

Peter Hahn Over a year ago

I will prepare it. It will take some time, because I must remove any data by which someone could identify patients.

Peter Hahn Over a year ago

Here are the data:

Shafee Over a year ago

@PeterHahn, I have updated the code. Does this work for you?

zx8754 · Answer · 2022-07-06 13:33:08Z

0

Keep numbers as numeric in a list:

library(dplyr)

mtcars %>% 
  group_by(cyl) %>% 
  summarise(x = list(mpg))
# # A tibble: 3 x 2
#     cyl x
#   <dbl> <list>
# 1     4 <dbl [11]>
# 2     6 <dbl [7]>
# 3     8 <dbl [14]>

Then we can operate on the list of numbers, for example summing the differences within each group:

mtcars %>% 
  group_by(cyl) %>% 
  summarise(x = list(mpg)) %>% 
  group_by(cyl) %>% 
  summarise(xDiffSum = sum(diff(unlist(x))))
# # A tibble: 3 x 2
#      cyl xDiffSum
#   <dbl>    <dbl>
# 1     4    -1.40
# 2     6    -1.3 
# 3     8    -3.7
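
If the list-column should be kept alongside the results, one possible sketch is to apply a function to each list element with sapply() inside mutate(), instead of unlisting in a second summarise(). The column name nNeg and the object res are made up for illustration.

```r
library(dplyr)

res <- mtcars %>%
  group_by(cyl) %>%
  summarise(x = list(mpg)) %>%   # one numeric vector per cyl, stored as a list-column
  mutate(
    xDiffSum = sapply(x, function(v) sum(diff(v))),     # sum of differences per group
    nNeg     = sapply(x, function(v) sum(diff(v) < 0))  # count of negative differences
  )
```

Here xDiffSum reproduces the values shown above, and nNeg gives the per-group count of negative differences that the question asks for.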

answered Jul 6, 2022 at 13:33

