
I think this might be a very simple question, but I can't for the life of me find the answer online or in the book I've been using to learn R.

I'm trying to create a table with variables named based on the values in a vector of an existing matrix. Here is an example of how the vectors of interest appear in the table (named "thresholds") that holds the variable names:

   varname threshold_1_name threshold_1_value
1   BMXBMI             high                25
2  BMXCALF              low                40
3    BMXHT             high               180
4   BMXLEG              low                40
5   BMXSUB             high                32
6 BMXTHICR             high                65

The table has 81 records in it, and I want to do something like this:

for (i in 1:81) {
  varname1 <- paste(thresholds$varname[i], thresholds$threshold_1_name[i], sep = "_")
  newtable$[varname1] <- ifelse((bigTable$[thresholds$varname[i]] < thresholds$threshold_1_value[i]),1,0)
}

which would create 'newtable' with 81 columns with names where the first six columns would be named BMXBMI_high, BMXCALF_low, BMXHT_high, BMXLEG_low, BMXSUB_high, BMXTHICR_high. My ifelse statement seems to be fine- I tested it outside of the loop and it worked. I think I'm using incorrect syntax to create the variable names.

Any advice on what I should do or how I should search for an answer would be greatly appreciated. I think part of my inability to find an answer is because I'm using incorrect vocabulary/search terms. Thanks!

@Ben- as you seem to have predicted, I'm now having issues with my ifelse line. Here is a sample of bigTable (and I added a column to the 'thresholds' sample above) to help you provide some advice on how to sort out the issue there. I am trying to code values in the new variables as 0/1 depending on whether the value in bigTable is above or below the value in 'thresholds'.

Sample of bigTable:

  BMXHT BMXBMI BMXLEG BMXCALF BMXWAIST BMXTHICR BMXTRI BMXSUB
1 174.0  24.90     NA    37.5     98.0       NA   12.8   20.4
2 178.3  29.10   45.2    42.6     99.9     56.2   17.4   38.6
3 162.0  22.56   39.7    34.0     81.6     47.0   20.3   16.8
4 162.9  29.39   43.0    37.2     90.7     55.7   26.4   34.2
5 190.1  30.94   46.6    43.7    108.0     64.0   15.5   26.6
6 180.0  30.62   46.0    40.5    112.8     57.1   26.2   NA

When I tried to code everything in one line, I kept getting an error saying the code was only reading the first entry, so I am now trying the following segment of code, which is both horribly inefficient and still not working (the first two lines are what you previously sent):

varname1 <- paste(thresholds$varname, thresholds$threshold_1_name, sep = "_")
bigTable[varname1[1:5]] <- NA

for (i in 1:5) {
  value <- thresholds$threshold_1_value[i]
  var <- thresholds$varname[i]
  newvar <- varname1[i]
  for(j in 1:10) {
    if(bigTable[var[j]] > value) {bigTable[newvar] = 1}
    else if (bigTable[var[j]] <= value) {bigTable[newvar] = 0}
  }
}

Again, any help you can provide is greatly appreciated!

  • How about varname1[i] and newtable$varname1[i] instead of varname1 and newtable$[varname1] on the LHS of the assignment in the loop? Commented Dec 20, 2012 at 22:35
  • Can you post bigTable as well. There are far quicker ways to do this, but you also need to understand the difference between $ and [[. Look at ?Extract. Commented Dec 20, 2012 at 22:38
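
To illustrate the `$` versus `[[` distinction raised in the comment above, here is a minimal sketch. The data frame `df` and the variables `col` and `newname` are made up for illustration and are not from the question. The point is only that `$` takes its argument literally, while `[[` evaluates it first, so `[[` is the one to use when the column name is stored in a variable.

```r
# Hypothetical toy data, not from the question
df <- data.frame(BMXBMI = c(24.9, 29.1, 22.56))
col <- "BMXBMI"          # column name held in a variable

df$col                   # NULL: `$` looks for a column literally named "col"
df[[col]]                # the BMXBMI values: `[[` evaluates `col` first

# Assignment works the same way: `[[<-` creates a column named by the string
newname <- paste(col, "high", sep = "_")
df[[newname]] <- ifelse(df[[col]] > 25, 1, 0)
names(df)
```

This is why `newtable$[varname1]` in the question is a syntax error: `$` cannot be combined with `[`, and it would not evaluate `varname1` in any case.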

1 Answer


Here's an answer to the question of 'how do I add many columns to a dataframe using variables in the dataframe', which seems to be part of the original problem (can't do much about the rest until we see what bigTable looks like):

# prepare data
thresholds <- read.table(text = "varname       threshold_1_name 
   BMXBMI    high 
  BMXCALF    low
    BMXHT    high
   BMXLEG    low
   BMXSUB    high
 BMXTHICR    high", header = TRUE)

To pursue the loop in the question, we can use it to create new column names based on existing data:

# preallocate, then build one name per row of thresholds
varname1 <- character(nrow(thresholds))
for (i in seq_len(nrow(thresholds))) {
  varname1[i] <- paste(thresholds$varname[i], thresholds$threshold_1_name[i], sep = "_")
}

But note that a loop isn't needed here. Because paste is vectorized, a basic vector operation gives the same result as the loop:

varname1 <- paste(thresholds$varname, thresholds$threshold_1_name, sep = "_")

Anyway, whichever way you do it, then you can add the names as column names like so:

# add new (empty, numeric) columns to a new dataframe
newtable <- data.frame(setNames(replicate(length(varname1), numeric(0), simplify = FALSE), varname1))

And here's the output, new columns with names that are a function of existing variables:

 str(newtable)
'data.frame':   0 obs. of  6 variables:
 $ BMXBMI_high  : num 
 $ BMXCALF_low  : num 
 $ BMXHT_high   : num 
 $ BMXLEG_low   : num 
 $ BMXSUB_high  : num 
 $ BMXTHICR_high: num 
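
As a possible next step, here is a hedged sketch of how the new columns could be filled with 0/1 values from bigTable without an explicit loop. It rebuilds small versions of `thresholds` and `bigTable` from the sample rows in the question so that it runs on its own. The coding direction is an assumption: 1 when the value is above the threshold and 0 otherwise, since the question uses `<` in one place and `>` in another. The helper name `cols` is made up for illustration.

```r
# Sample rows copied from the question (first three thresholds, matching columns)
thresholds <- data.frame(
  varname           = c("BMXBMI", "BMXCALF", "BMXHT"),
  threshold_1_name  = c("high", "low", "high"),
  threshold_1_value = c(25, 40, 180),
  stringsAsFactors  = FALSE
)
bigTable <- data.frame(
  BMXHT   = c(174.0, 178.3, 162.0, 162.9, 190.1, 180.0),
  BMXBMI  = c(24.90, 29.10, 22.56, 29.39, 30.94, 30.62),
  BMXCALF = c(37.5, 42.6, 34.0, 37.2, 43.7, 40.5)
)

varname1 <- paste(thresholds$varname, thresholds$threshold_1_name, sep = "_")

# Assumption: code 1 when the value exceeds the threshold, 0 otherwise.
# The comparison is vectorized over the whole column; NA values stay NA.
cols <- Map(function(v, cutoff) as.numeric(bigTable[[v]] > cutoff),
            thresholds$varname, thresholds$threshold_1_value)
newtable <- as.data.frame(setNames(cols, varname1))
```

The comparison is done on a whole column at once, which is what `ifelse` is designed for. A plain `if` expects a single TRUE/FALSE value, which is probably why the loop in the question complains about only the first entry being used.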

3 Comments

the columns shouldn't be added to thresholds they should be added to a new table called newtable :)
@Ben- I just edited the post above to include a sample of bigTable. Any help you can provide is greatly appreciated.
@Struggling_with_R, can you add to your question a description of what the loop involving bigTable is supposed to do? It's not really clear what you're after with that...
