
I have two data frames like this:

df1

       date item 
 02/01/2017    A 
 09/01/2017    B
 14/01/2017    C

df2

      date1       date2  item    prm
 01/01/2017  03/01/2017     A    YES
 08/01/2017  10/01/2017     B    YES
 15/01/2017  17/01/2017     C    YES

Purpose

The prm variable is constant: it has just one value. I'd like to add prm to df1 under this condition:

df1$date is between df2$date1 and df2$date2, and df1$item == df2$item

If the condition doesn't match, prm should get the value "NO".

4 Comments

  • This is a simple non-equi join in data.table. Commented Oct 24, 2017 at 8:22
  • I tried with data.table, but my data frame went from 1 800 000 rows to 800 000 rows. I don't know why, because when I apply unique I still have 1 800 000 rows. Commented Oct 24, 2017 at 9:05
  • Try library(data.table) ; setDT(df1)[setDT(df2), on = .(item, date >= date1, date <= date2), prm := i.prm] (assuming the date formats are correct) Commented Oct 24, 2017 at 9:27
  • Nice, thank you, it works fine, like all the solutions. Commented Oct 24, 2017 at 9:39

4 Answers

Using non-equi joins and update on join, both of which are available in data.table, this becomes:

library(data.table)
setDT(df1)[setDT(df2), on = .(item, date>=date1, date<= date2), prm := i.prm][
  is.na(prm), prm := "NO"]
df1
         date item prm
1: 2017-01-02    A YES
2: 2017-01-09    B YES
3: 2017-01-14    C  NO
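The join above only behaves as intended when the date columns are real dates rather than "dd/mm/yyyy" strings (the "assuming the date formats are correct" caveat from the comments). Below is a minimal, self-contained sketch of the same update join that rebuilds the question's sample data and converts the dates first. The use of data.table() and as.Date() to construct the inputs is an assumption made for illustration, not part of the original answer.

```r
library(data.table)

# Sample data from the question, with dates parsed from "dd/mm/yyyy"
fmt <- "%d/%m/%Y"
df1 <- data.table(
  date = as.Date(c("02/01/2017", "09/01/2017", "14/01/2017"), format = fmt),
  item = c("A", "B", "C")
)
df2 <- data.table(
  date1 = as.Date(c("01/01/2017", "08/01/2017", "15/01/2017"), format = fmt),
  date2 = as.Date(c("03/01/2017", "10/01/2017", "17/01/2017"), format = fmt),
  item  = c("A", "B", "C"),
  prm   = "YES"
)

# Update join: df1 keeps all of its rows; only matching rows receive prm
df1[df2, on = .(item, date >= date1, date <= date2), prm := i.prm]

# Rows without a match are still NA at this point
df1[is.na(prm), prm := "NO"]

df1
```

Because := modifies df1 by reference, the row count of df1 cannot change here, whatever df2 contains.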

Here is a solution using dplyr:

library(tidyverse)

df1 = tribble(~date, ~item,
             "02/01/2017",    "A",
             "09/01/2017",    "B",
             "16/01/2017",    "C")

df2 = tribble(~date1, ~date2, ~item,
"01/01/2017",  "03/01/2017",     "A",
"08/01/2017",  "10/01/2017",     "B",
"15/01/2017",  "15/01/2017",     "C")

df3 = merge(x = df1, y = df2)


df4 = as.data.frame(cbind(df3[1], lapply(df3[2:4], as.Date, format = "%d/%m/%Y")))


# Inclusive bounds, so a date equal to date1 or date2 also counts as "between"
df5 <- df4 %>%
  mutate(prm = if_else((date >= date1) & (date <= date2), "YES", "NO"))

df5
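One thing to watch with merge(): by default it performs an inner join, so any df1 row whose item has no counterpart in df2 is dropped. That may be one source of the shrinking row count mentioned in the comments, although the thread does not confirm it. The sketch below uses hypothetical toy data (item "D" is invented for illustration) and base R only to show the difference that all.x = TRUE makes.

```r
# Toy data (hypothetical): item "D" has no counterpart in df2
df1 <- data.frame(
  date = as.Date(c("2017-01-02", "2017-01-09", "2017-01-16", "2017-01-20")),
  item = c("A", "B", "C", "D")
)
df2 <- data.frame(
  date1 = as.Date(c("2017-01-01", "2017-01-08", "2017-01-15")),
  date2 = as.Date(c("2017-01-03", "2017-01-10", "2017-01-15")),
  item  = c("A", "B", "C")
)

inner <- merge(df1, df2, by = "item")                # inner join: "D" is dropped
left  <- merge(df1, df2, by = "item", all.x = TRUE)  # left join: every df1 row kept

# Unmatched rows have NA bounds, so the comparison is NA; count that as "NO"
in_range <- left$date >= left$date1 & left$date <= left$date2
left$prm <- ifelse(!is.na(in_range) & in_range, "YES", "NO")

nrow(inner)  # 3
nrow(left)   # 4
left
```

Note that if df2 held several intervals for the same item, either join would return one row per df1/df2 pair, so the result could also grow.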


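You can use ifelse here, provided df1 and df2 have the same number of rows in the same order, because the comparison is done row by row: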

df1 <- read.table(text = "      date item
 02/01/2017    A
 09/01/2017    B
 16/01/2017    C", header = TRUE)

df2 <- read.table(text = "      date1       date2  item
 01/01/2017  03/01/2017     A
 08/01/2017  10/01/2017     B
 15/01/2017  17/01/2017     C", header = TRUE)

df1$date <- as.Date(df1$date, format = "%d/%m/%Y")
df2$date1 <- as.Date(df2$date1, format = "%d/%m/%Y")
df2$date2 <- as.Date(df2$date2, format = "%d/%m/%Y")


df1$prm <- ifelse(df1$date >= df2$date1 & df1$date <= df2$date2 & df1$item == df2$item, "YES" , "NO")

        date item prm
1 2017-01-02    A YES
2 2017-01-09    B YES
3 2017-01-16    C YES
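The ifelse() call compares row i of df1 with row i of df2, so it depends on both data frames being aligned. When they are not, one option is to look up the matching df2 row by item first. The following is a minimal base R sketch, assuming each item has exactly one interval in df2; the toy rows are hypothetical and deliberately out of order.

```r
# Toy data (hypothetical): df1 is not in the same order as df2 and has an extra row
df1 <- data.frame(
  date = as.Date(c("2017-01-16", "2017-01-02", "2017-01-09", "2017-01-02")),
  item = c("C", "A", "B", "C")
)
df2 <- data.frame(
  date1 = as.Date(c("2017-01-01", "2017-01-08", "2017-01-15")),
  date2 = as.Date(c("2017-01-03", "2017-01-10", "2017-01-17")),
  item  = c("A", "B", "C")
)

# Row of df2 holding each df1 item (NA if the item is absent from df2)
idx <- match(df1$item, df2$item)

# Compare each date with the bounds of its own item, not with the same row number
in_range <- df1$date >= df2$date1[idx] & df1$date <= df2$date2[idx]
df1$prm <- ifelse(!is.na(in_range) & in_range, "YES", "NO")

df1
```

match() returns only the first matching row, so this does not cover items with several intervals; a join is the better fit in that case.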

3 Comments

Yes, it works, but I think my explanation was not good enough. I'll edit my question.
Orhan, is there anything missing from the answer?
Does 0002-01-20 look like a valid date to you? This should be df1$date <- as.Date(df1$date, format = "%d/%m/%Y") etc.

[EDIT]

In case the number of rows in df1 and df2 is different, you can use sqldf: do a LEFT JOIN on df1.date between df2.date1 and df2.date2 and df1.item = df2.item, and use a CASE WHEN expression to create the column prm:

options("stringsAsFactors" = FALSE)

df1 <- read.table(text = 
"date item 
02/01/2017    A 
09/01/2017    B
16/01/2017    C 
02/01/2017    C",
header = TRUE)
df2 <- read.table(text =
"date1       date2  item
01/01/2017  03/01/2017     A 
08/01/2017  10/01/2017     B
15/01/2017  17/01/2017     C",
header = TRUE)

library(sqldf)


sqldf("
  SELECT df1.*, CASE WHEN df1.item = df2.item THEN 'yes' ELSE 'no' END AS prm
  FROM df1 
  LEFT JOIN df2 
   ON df1.date BETWEEN df2.date1 AND df2.date2
   AND df1.item = df2.item
  ")

        date item prm
1 02/01/2017    A yes
2 09/01/2017    B yes
3 16/01/2017    C yes
4 02/01/2017    C  no
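A caveat on the BETWEEN condition: the date columns here are "dd/mm/yyyy" character strings, so, as far as I can tell, the comparison is done on text rather than on dates. That happens to work for this sample because every date falls in the same month and year, but it can accept dates outside the interval. The base R sketch below reproduces the effect with hypothetical values; it illustrates the pitfall and is not a change to the query above.

```r
# Hypothetical values: 2 February lies outside 1-3 January
d  <- "02/02/2017"
lo <- "01/01/2017"
hi <- "03/01/2017"

# Compared as text, the leading day digits decide, so the test passes by accident
as_text <- d >= lo & d <= hi

# Compared as dates, the value is correctly rejected
fmt <- "%d/%m/%Y"
as_date <- as.Date(d, format = fmt) >= as.Date(lo, format = fmt) &
  as.Date(d, format = fmt) <= as.Date(hi, format = fmt)

as_text  # TRUE
as_date  # FALSE
```

One way around this, under the same assumption about text comparison, is to reformat the columns as "yyyy-mm-dd" strings before running the query, since that format sorts in chronological order.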

2 Comments

It works, but I have the same problem as with data.table and I don't know why: before the sqldf call my data frame had 1 800 000 rows, and afterwards it has 800 000 rows.
Is it because only 800,000 rows meet the date AND item criteria? I have changed the join to LEFT JOIN, is that what you are looking for?
