12

I want to query an LDAP directory to see how employees are distributed across departments and groups.

Something like: "Give me the department name of all the members of a group", and then use R to run a frequency analysis. However, I cannot find any examples of how to connect to a server and run an LDAP query from R.

RCurl seems to have some kind of support ( http://cran.r-project.org/web/packages/RCurl/index.html ):

Additionally, the underlying implementation is robust and extensive, supporting FTP/FTPS/TFTP (uploads and downloads), SSL/HTTPS, telnet, dict, ldap, and also supports cookies, redirects, authentication, etc.

But I am no expert in R and have not been able to find a single example that uses RCurl (or any other R library) to do this.

Right now I am using curl like this to obtain the members of a group:

curl "ldap://ldap.replaceme.com/o=replaceme.com?memberuid?sub?(cn=group-name)"

Does anyone here know how to do the same in R with RCurl?

5
  • 2
    We'd need to know a bit more about the LDAP server config. An example LDAP query via curl: curl -u USERNAME 'ldap://192.168.0.66/CN=Users,DC=training,DC=local\?sAMAccountName?sub?(ObjectClass=*)' (that's from an IBM example). It won't work for you as-is, since you need to know the proper search parameters. It's pretty straightforward to run that via RCurl and then process the results, but you should get the query working from curl on the command line first. Commented Apr 1, 2014 at 18:38
  • 1
    Right now I am retrieving the list of members of a group like this: ldapsearch -t -h ldap.replaceme.com -x -b "o=replaceme.com" "(cn=group-name)" memberuid Commented Apr 1, 2014 at 18:42
  • @hrbrmstr if you can translate my ldapsearch to curl and then to R with RCurl, that would be the exact answer I am looking for... Commented Apr 1, 2014 at 18:49
  • Hi @hrbrmstr I have translated my ldapsearch query to curl... Can you tell me how to run it with RCurl? Commented Apr 1, 2014 at 19:45
  • Already did it myself... but thanks a lot for your guidance @hrbrmstr :-) Commented Apr 1, 2014 at 22:14

5 Answers 5

12

Found the answer myself:

First run these commands to make sure RCurl is installed and loaded (as described in http://www.programmingr.com/content/webscraping-using-readlines-and-rcurl/ ):

install.packages("RCurl", dependencies = TRUE)
library("RCurl")

Then use getURL with an LDAP URL (as described in http://www.ietf.org/rfc/rfc2255.txt, although I couldn't understand it until I read http://docs.oracle.com/cd/E19396-01/817-7616/ldurl.html and saw the format ldap[s]://hostname:port/base_dn?attributes?scope?filter):

getURL("ldap://ldap.replaceme.com/o=replaceme.com?memberuid?sub?(cn=group-name)")
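
To get from that call to the frequency analysis asked about in the question, the text that getURL returns still has to be parsed. The following is only a minimal sketch. It assumes the response has the layout libcurl usually emits for LDAP (one `DN: ...` line per entry, followed by tab-indented `attribute: value` lines). `build_ldap_url` and `ldap_values` are hypothetical helper names, the `ou` attribute for the department is an assumption about the directory schema, and the sample strings are made up to stand in for real server responses:

```r
# Hypothetical helper: assemble an RFC 2255 style URL from its parts,
# ldap[s]://host/base_dn?attributes?scope?filter
build_ldap_url <- function(host, base_dn, attributes, scope = "sub",
                           filter = "(objectClass=*)", secure = FALSE) {
  scheme <- if (secure) "ldaps" else "ldap"
  paste0(scheme, "://", host, "/", base_dn,
         "?", paste(attributes, collapse = ","),
         "?", scope, "?", filter)
}

# Hypothetical helper: pull every value of one attribute out of the raw text.
# Attribute names are matched case-insensitively.
ldap_values <- function(raw, attribute) {
  lines <- unlist(strsplit(raw, "\r?\n"))
  prefix <- paste0("\t", attribute, ": ")
  hits <- lines[startsWith(tolower(lines), tolower(prefix))]
  substring(hits, nchar(prefix) + 1)
}

url <- build_ldap_url("ldap.replaceme.com", "o=replaceme.com",
                      "memberuid", filter = "(cn=group-name)")

# raw <- RCurl::getURL(url)   # the real call; needs access to the server
# Made-up sample response used instead of a live query:
raw <- "DN: cn=group-name,ou=groups,o=replaceme.com\n\tmemberuid: alice\n\tmemberuid: bob\n\n"
members <- ldap_values(raw, "memberuid")

# Made-up sample of a second response that lists a department per person
# (the attribute name "ou" is an assumption)
dept_raw <- paste0("DN: uid=alice,o=replaceme.com\n\tou: Sales\n\n",
                   "DN: uid=bob,o=replaceme.com\n\tou: Sales\n\n",
                   "DN: uid=carol,o=replaceme.com\n\tou: IT\n\n")
dept_counts <- table(ldap_values(dept_raw, "ou"))
```

With a live server, the commented-out getURL call would replace the sample strings, and `dept_counts` can be passed to `barplot()` or inspected directly.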

1 Comment

On a related note this is an excellent guide on the usage of RCurl omegahat.org/RCurl/RCurlJSS.pdf
5

I've written a function to parse LDAP output into a data frame, using the examples provided here as a reference for getting everything going.

I hope it helps someone!

library(RCurl)
library(gtools)

parseldap<-function(url, userpwd=NULL)
{
  ldapraw<-getURL(url, userpwd=userpwd)
  # separate the DN line from its attributes with two newlines
  ldapraw<-gsub("(DN: .*?)\n", "\\1\n\n", ldapraw)
  ldapsplit<-strsplit(ldapraw, "\n\n")
  ldapsplit<-unlist(ldapsplit)
  # init list and count
  mylist<-list()
  count<-0
  for (ldapline in ldapsplit) {
    # if this is the beginning of the entry
    if(grepl("^DN:", ldapline)) {
      count<-count+1
      # after the first entry, flush the collected list into the data frame
      if(count == 2 ) {
        df<-data.frame(mylist)
        mylist<-list()
      }
      if(count > 2) {
        df<-smartbind(df, mylist)
        mylist<-list()
      }
      mylist["DN"] <-gsub("^DN: ", "", ldapline)
    } else {
      linesplit<-unlist(strsplit(ldapline, "\n"))
      if(length(linesplit) > 1) {
        for(line in linesplit) {
          linesplit2<-unlist(strsplit(line, "\t"))
          linesplit2<-unlist(strsplit(linesplit2[2], ": "))
          if(!is.null(unlist(mylist[linesplit2[1]]))) {
            x<-strsplit(unlist(mylist[linesplit2[1]]), "|", fixed=TRUE)

            x<-append(unlist(x), linesplit2[2])
            x<-paste(x, sep="", collapse="|")
            mylist[linesplit2[1]] <- x
          } else {
            mylist[linesplit2[1]] <- linesplit2[2]  
          }
        }
      } else {
        ldaplinesplit<-unlist(strsplit(ldapline, "\t"))
        ldaplinesplit<-unlist(strsplit(ldaplinesplit[2], ": "))
        mylist[ldaplinesplit[1]] <- ldaplinesplit[2]
      }

    }

  }
  if(count == 1 ) {
    df<-data.frame(mylist)
  } else {
    df<-smartbind(df, mylist)
  }
  return(df)
}
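
Because parseldap joins repeated attributes into one cell with "|", a multi-valued column such as memberuid has to be split again before counting. A minimal sketch under that assumption; the `groups` data frame is made-up sample data in the shape the function returns, not real output:

```r
# Made-up data frame in the shape parseldap() returns:
# one row per entry, repeated attributes joined with "|"
groups <- data.frame(
  DN = c("cn=dev,o=replaceme.com", "cn=ops,o=replaceme.com"),
  memberuid = c("alice|bob|carol", "bob|dave"),
  stringsAsFactors = FALSE
)

# Split each cell on the literal "|"; as.character() guards against
# factor columns produced by older R versions
values <- strsplit(as.character(groups$memberuid), "|", fixed = TRUE)

# Long format: repeat each DN once per value it holds
long <- data.frame(
  DN = rep(groups$DN, lengths(values)),
  memberuid = unlist(values),
  stringsAsFactors = FALSE
)

# Frequency analysis: in how many groups does each member appear?
member_counts <- table(long$memberuid)
```

The long format also makes it easy to merge the members with a second query result, for example one that maps each uid to a department.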


2
loginLDAP <- function(username, password) {

  ldap_url <- "ldap://SERVER-NAME-01.companyname.com"

  handle <- curl::new_handle(timeout = 10)

  curl::handle_setopt(handle = handle, userpwd = paste0("companyname\\", username, ":", password))

  tryCatch(
    {
      response <- curl::curl_fetch_memory(url = ldap_url, handle = handle)

      if (response$status_code == 0) {
        return(list(success = TRUE, username = username))
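      } else {
        # report the failure and return the same structure as the other branches
        message("Invalid login credentials.")
        return(list(success = FALSE, username = username))
      }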
    },
    error = function(e) {
      return(list(success = FALSE, username = username))
    }
  )
}


1

I wrote an R library for accessing LDAP servers using the OpenLDAP library. In detail, the function searchldap is a wrapper for the OpenLDAP method searchldap. https://github.com/LukasK13/ldapr

1 Comment

Another related package can be found here.
0

I followed this strategy:

  1. Run a Perl script with an LDAP query and write the data to disk as JSON.
  2. Read the JSON structure into R and create a data frame.

For step (1), I used this script:

#use Modern::Perl;
use strict;
use warnings;
use feature 'say';
use Net::LDAP;
use JSON;
# Perl does not expand "~", so build the path from $ENV{HOME}
chdir("$ENV{HOME}/git/_my/R_one-offs/R_grabbag") or die "Cannot chdir: $!";
my $ldap = Net::LDAP->new( 'ldap.mydomain.de' ) or die "$@";
my $outfile = "ldapentries_mydomain_ldap.json";
my $mesg = $ldap->bind ;    # an anonymous bind
# get all cn's (= all names)
$mesg = $ldap->search(
                base   => "ou=People,dc=mydomain,dc=de",
                filter => "(cn=*)"
              );

my $json_text = "";
my @entries;

foreach my $entry ($mesg->entries){
 my %entry;
 foreach my $attr ($entry->attributes) {
    foreach my $value ($entry->get_value($attr)) {
      $entry{$attr} = $value;
    }
  }
  push @entries, \%entry;
}

$json_text = to_json(\@entries);
say "Length json_text: " . length($json_text);


open(my $FH, ">", $outfile) or die "Cannot open $outfile: $!";
print $FH $json_text;
close($FH);
$mesg = $ldap->unbind;

You might need to check the maximum number of entries the LDAP server returns per query. See https://serverfault.com/questions/328671/paging-using-ldapsearch

For step (2), I used this R code:

setwd("~/git/_my/R_one-offs/R_grabbag")
library(rjson)
# read into R list, from file, created from perl script
json <- rjson::fromJSON(file="ldapentries_mydomain_ldap.json",method = "C")
head(json)

# create a data frame from list
library(reshape2)
library(dplyr)
library(tidyr)

# not really efficient, maybe there's a better way to do it
df.ldap <- json %>% melt %>% spread( L2,value)

# optional:
# turn factors into characters
i <- sapply(df.ldap, is.factor)
df.ldap[i] <- lapply(df.ldap[i], as.character)
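
As one possible alternative to the melt/spread step, the list of records can be turned into a data frame with base R only. This is a sketch, not a drop-in replacement: it assumes each attribute holds a single value per entry (which is what the Perl script above writes), and both the sample records and the `departmentNumber` attribute are made up for illustration:

```r
# Made-up records in the shape rjson::fromJSON() returns for the file above:
# a list with one named list per LDAP entry
json <- list(
  list(cn = "Alice A", departmentNumber = "Sales", mail = "alice@mydomain.de"),
  list(cn = "Bob B", departmentNumber = "Sales"),
  list(cn = "Carol C", departmentNumber = "IT", mail = "carol@mydomain.de")
)

# Collect every attribute name that occurs in any record
fields <- unique(unlist(lapply(json, names)))

# Build one single-row data frame per record, filling missing attributes with NA
rows <- lapply(json, function(rec) {
  vals <- lapply(fields, function(f) {
    if (is.null(rec[[f]])) NA_character_ else as.character(rec[[f]])
  })
  names(vals) <- fields
  as.data.frame(vals, stringsAsFactors = FALSE)
})

# Stack the rows; all of them share the same columns by construction
df.ldap <- do.call(rbind, rows)

# Frequency analysis by department
dept_counts <- table(df.ldap$departmentNumber)
```

Since `stringsAsFactors = FALSE` is set explicitly, the optional factor-to-character conversion is not needed with this variant.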
