40

I am trying to write an R script that will access an Excel file stored on my company's SharePoint page so that I can make a few calculations and plot the results. I've tried various ways to do this (download.file, RCurl getURL(), gdata), but I can't figure out how. The URL is HTTPS, and a username and password should be required. I've gotten the closest with this code:

require(RCurl)
URL<-"https://companyname.sharepoint.com/sites/folder/_layouts/15/WopiFrame.aspx?sourcedoc={2DCC2ED7-1C13-4910-AFAD-4A9ACFF1C797}&file=myfile.xlsx&action=default"
f<-getURL(URL,verbose=T,ssl.verifyhost=F,ssl.verifypeer=F,userpwd="mylogin:mypw") 

This seems to connect (although the username and password don't seem to matter) and returns

> f  
[1] "<html><head><title>Object moved</title></head><body>\r\n<h2>Object moved to <a href=\"https://companyname.sharepoint.com/sites/_layouts/15/WopiFrame2.aspx?sourcedoc={2DCC2ED7-1C13-4910-AFAD-4A9ACFF1C797}&amp;file=MyFile.xlsx&amp;action=default\">here</a>.</h2>\r\n</body></html>\r\n"

However, I'm not sure what to do at this point, or even if I'm on the right track. Any help will be greatly appreciated.

2
  • 1
    Have you had any luck accessing it, yet? I have a similar question. Commented Feb 29, 2016 at 17:10
  • No, I wasn't able to figure it out. Commented Mar 1, 2016 at 17:22

11 Answers

20

I use

library(readxl)
read_excel('//companySharepointSite/project/.../ExcelFilename.xlsx', 'Sheet1', skip=1)

Note, no https:, and sometimes I have to open the file first (i.e., cut and paste //companySharepointSite/project/.../ExcelFilename.xlsx into my browser's address bar)


1 Comment

This does not work for me; it simply returns an HTML page stating that I need to log in. Thoughts?
8

I found that other answers did not work for me, perhaps because I am on a Mac, which obviously does not play as well with Microsoft products such as Sharepoint.

Ended up having to split it into two pieces: first download the Excel file to disk and then separately read that Excel file.

library(httr)
library(readxl)

# the URL of your sharepoint file
file_url <- "https://yoursharepointsite/Documents/yourfile.xlsx"

# save the excel file to disk
GET(file_url, 
    authenticate(active_directory_username, active_directory_password, "ntlm"),
    write_disk("tempfile.xlsx", overwrite = TRUE))

# save to dataframe
df <- read_excel("tempfile.xlsx")
df

# remove excel file from disk
file.remove("tempfile.xlsx")

This gets the job done, though I would be interested if anyone knows how to avoid the interim step of writing to disk.

N.B. Depending on your specific machine/network/Sharepoint configuration, you may also be able to just use authenticate(":",":","ntlm") per this answer.
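On the interim write to disk: as far as I know, read_excel() only accepts a file path, so some file is needed, but it can at least be a temporary one that is cleaned up automatically. A minimal sketch, assuming the same NTLM setup as above; read_sharepoint_excel is a hypothetical helper name, not part of httr or readxl:

```r
# Hypothetical helper: download to a temporary file, read it, then clean up.
read_sharepoint_excel <- function(file_url, username, password, ...) {
  tmp <- tempfile(fileext = ".xlsx")
  on.exit(unlink(tmp), add = TRUE)   # delete the temp file when the function exits

  resp <- httr::GET(
    file_url,
    httr::authenticate(username, password, type = "ntlm"),
    httr::write_disk(tmp, overwrite = TRUE)
  )
  httr::stop_for_status(resp)        # stop on an HTTP error status instead of reading an error page

  readxl::read_excel(tmp, ...)       # extra arguments (sheet, skip, ...) are passed through
}
```

It would be called as read_sharepoint_excel(file_url, active_directory_username, active_directory_password), and nothing is left behind in the working directory.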

1 Comment

Did not work for me.
6

I was unable to accomplish this using hints from answers above in R (I tried many approaches found on this site). However, just to highlight the response by @RyanBradley above and especially the response by @ZS27:

I instead had to use the OneDrive Desktop client (Windows) to allow me to sync the folder to my computer. Newer versions of SharePoint (like that found in MS Teams) have a sync button or feature in the document libraries / folders that interfaces with OneDrive.

This is the functional equivalent of mounting the folder as a network drive, so R interfaces with the file as if it was a part of your file system. Works for me.
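Once the library is synced, reading the file is an ordinary local read. A small sketch, assuming readxl is installed; read_synced_excel, sync_root and the example paths are hypothetical, and the actual sync location depends on how OneDrive is set up on your machine:

```r
# Hypothetical helper: read an Excel file from a locally synced SharePoint library.
read_synced_excel <- function(sync_root, relative_path, ...) {
  path <- file.path(sync_root, relative_path)
  if (!file.exists(path)) {
    stop("File not found at ", path, "; has the library finished syncing?")
  }
  readxl::read_excel(path, ...)
}

# Example call (placeholder paths, adjust to your own sync folder):
# df <- read_synced_excel("C:/Users/me/MyCompany/MyTeam - Documents",
#                         "foldername/filename.xlsx")
```

The explicit file.exists() check is there because a missing or not-yet-synced file is the most likely failure with this approach.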

1 Comment

Yes - this is easy and quick. I just synced from within Teams actually. Now I can "see" the file on my computer and can open it just like any other file. Thanks!
3

I recently downloaded a file from my company sharepoint by using the Microsoft365R package which facilitates authentication through the browser.

The first line of code, get_business_onedrive(), opened a new tab in my browser. I briefly saw the Windows login screen, then it reused my authentication from earlier in the day. Then the website read, "Authenticated with Azure Active Directory. Please close this page and return to R."

I returned to R to download my file to my working directory, read the relevant data into my environment as an object, then deleted the file from the working directory.

Ideally there would be a way to skip the download part, as mentioned by Nick Kastango in a separate post. I used my company's usual browser-facilitated authentication. I specifically did not want to map my Team drive to my local OneDrive because I want my colleagues with access to the file to be able to use the script without having to map files to a local drive. Rather, they'll use the script and log in with their own credentials.

# load relevant packages
library(Microsoft365R) #for accessing onedrive
library(readxl) #for reading excel files

get_business_onedrive() #I ran this first and the browser logged me in

list_sharepoint_sites() #list my teams in the Rstudio console

site <- get_sharepoint_site("MyTeamName") #written exactly as it was listed in the console
  
site$list_drives() #list the various drive libraries, like "documents" and "wiki"

drv <- site$get_drive()# default is the document library, so I didn't need to specify anything

#downloads the file to the project working directory
drv$download_file("foldername/filename.xlsx")

# Review the sheet names in order to select the correct one.  
excel_sheets("filename.xlsx")

#read the dataframe to my Rstudio environment
df <-read_excel("filename.xlsx", 
                       sheet = "sheetname")

#Removes the file from my working directory. I don't want the downloaded excel file to stay. I always want to be working with the current version of the spreadsheet.
file.remove("filename.xlsx")
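The download, read and remove steps above can also be wrapped so the file never lands in the working directory. A sketch, assuming drv is the drive object returned by site$get_drive() above and that its download_file() method accepts dest and overwrite arguments (worth checking against your installed Microsoft365R version); read_excel_from_drive is a hypothetical name:

```r
# Hypothetical helper: download from the drive into a temp file, read it, clean up.
read_excel_from_drive <- function(drv, src, ...) {
  tmp <- tempfile(fileext = ".xlsx")
  on.exit(unlink(tmp), add = TRUE)   # remove the temp copy when the function exits

  drv$download_file(src, dest = tmp, overwrite = TRUE)
  readxl::read_excel(tmp, ...)       # e.g. sheet = "sheetname"
}

# Example call, using the placeholders from above:
# df <- read_excel_from_drive(drv, "foldername/filename.xlsx", sheet = "sheetname")
```

A download still happens, but colleagues running the script always get the current version and nothing needs to be removed by hand.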

Comments

2

I had exactly the same situation: I wanted to access an Excel file on a SharePoint site from R.

I searched the Internet extensively and didn't find anything that matched my requirement.

Then I tried the following: I mapped the SharePoint folder as a network drive on my local system.

After that, I could access the Excel file (on the SharePoint site) from my machine without using a web browser.

Next, I copied the network path shown on my system. It is the same as your SharePoint site address, but without https/http, and it starts with "\\", like the following: "\\sharepoint.test.com\folder\path".

Launch RStudio and select the Import Dataset option under the Environment section.

Choose 'From Excel'. The 'Import Excel Data' form will open.

In the File/URL field, paste the network path of the SharePoint file (copied from your machine).

Click Import, and the Excel file on SharePoint will be imported into R.

Make sure the path does not contain URL-encoded characters (like %20) and that backslashes are used as the separator. When importing the file, enter the folder names exactly as you see them.
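The conversion described here (drop https/http, decode %20 and the like, switch to backslashes) can be sketched as a small function. sharepoint_url_to_unc is a hypothetical helper and the URL in the example is a placeholder; whether the resulting path opens still depends on the folder being reachable as a network location from your machine:

```r
# Hypothetical helper: turn a SharePoint web address into a Windows network path.
sharepoint_url_to_unc <- function(url) {
  path <- sub("^https?://", "", url)   # drop the http:// or https:// prefix
  path <- utils::URLdecode(path)       # "%20" becomes a space, etc.
  path <- chartr("/", "\\", path)      # forward slashes become backslashes
  paste0("\\\\", path)                 # network paths start with two backslashes
}

unc <- sharepoint_url_to_unc("https://sharepoint.test.com/folder/my%20file.xlsx")
cat(unc, "\n")
```

Note that inside R source code each backslash is written twice, which is why the leading "\\" of the path appears as "\\\\" in the function.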

For example:

Sharepoint.microsoft.com - SharePoint's domain
Department name - name of the folder
Project name - name of the folder
Sample.xlsx - name of the file

So, your path to import the dataset should be:

"\\Sharepoint.microsoft.com\Department name\Project name\Sample.xlsx".

Thank you!

Comments

1

You may need to map a network drive to the SharePoint library so that you can connect to it directly. Or if you don't want to map a network drive you could also place a shortcut to the folder in your startup folder.

Example file path: \\company_sharepoint_site\ssp\site_name\sub_site_name\library_name

Example start up folder location (Windows 10): C:\Users\USER_NAME\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup

Note direction of the slashes ("\" rather than "/") is important so that your file path is interpreted as a file location, not an internet browser location. By placing such a path in a network drive or as a shortcut in your startup folder your PC should connect to it when it boots.

# Load readxl, installing it first if it is missing
if (!requireNamespace("readxl", quietly = TRUE)) {
  install.packages("readxl")
}
library(readxl)  # stops with an error if the package still cannot be loaded

# Define path to data 
data_path <- "\\\\company_sharepoint_site\\ssp\\site_name\\sub_site_name\\library_name\\Example.xlsx"

# Pull data
df_employees <- read_xlsx(data_path)

1 Comment

You can sync the SharePoint folder to your computer via OneDrive by clicking on the "Sync" button located on the SharePoint documents toolbar. My SharePoint is synced to my C drive and enables me to connect to the document locally (C:/Users/OneDrive/my_data.xlsx).
0

Try using the link in this format: http://site/_layouts/download.aspx?SourceUrl=url-of-document-in-library
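A sketch of how such a link could be assembled in R. It assumes that "url-of-document-in-library" means the full address of the file as it appears in the document library, which is my reading of the format rather than something stated here; sharepoint_download_url and the example addresses are hypothetical:

```r
# Hypothetical helper: build a download.aspx link from a site URL and a document URL.
sharepoint_download_url <- function(site_url, document_url) {
  paste0(
    sub("/+$", "", site_url),                        # strip any trailing slash
    "/_layouts/download.aspx?SourceUrl=",
    utils::URLencode(document_url, reserved = TRUE)  # percent-encode spaces, ":" and "/"
  )
}

link <- sharepoint_download_url(
  "http://site",
  "http://site/Shared Documents/myfile.xlsx"
)
print(link)
```

The resulting link can then be passed to whichever download function handles authentication in your setup.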

3 Comments

It is considered good practice to not just paste a link, but also summarise the essential points from the link. This prevents your answer from becoming useless if the link dies.
I think that the link is the format. Not a link to a document that has the format.
What does the "url of document in library" part mean? Where can we get that?
0

If the above doesn't work, try this syntax (note the slash directions):

"\\gov.sharepoint.com@SSL/DavWWWRoot/sites/SomePath/SomePath/SomePath/SomeFile"

See this for more info about syntax and what's going on:

Connect to a site via SSL/DavWWWRoot not usual URL? Why does this make a difference?
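The syntax above can be derived from the ordinary https URL. A sketch that follows the slash layout shown in this answer; sharepoint_url_to_webdav is a hypothetical helper and the example URL is a placeholder. Whether Windows then resolves the path depends on your machine and network setup:

```r
# Hypothetical helper: rewrite an https SharePoint URL in the @SSL/DavWWWRoot form.
sharepoint_url_to_webdav <- function(url) {
  stopifnot(grepl("^https://", url))      # the @SSL form corresponds to https
  rest <- sub("^https://", "", url)
  host <- sub("/.*$", "", rest)           # everything before the first "/"
  path <- sub("^[^/]*", "", rest)         # everything from the first "/" onward
  paste0("\\\\", host, "@SSL/DavWWWRoot", utils::URLdecode(path))
}

dav <- sharepoint_url_to_webdav("https://gov.sharepoint.com/sites/SomePath/SomeFile.xlsx")
cat(dav, "\n")
```

The result can be passed to read_excel() or file.copy() like any other file path.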

Comments

0

I have been able to download files from SharePoint Online with the following code:

library(Microsoft365R)
library(AzureGraph)

az <- create_graph_login()
gr <- get_graph_login()

# You can run 
# gr$get_user()$list_group_memberships()
# to get the list of the groups you have access to.
# For my use case, the correct id was the 25th object 
# of the list. This is the reason there is a [25]
# below.

id <- gr$get_user()$list_group_memberships()[25]
obj_Sharepoint <- gr$get_aad_object(id)
obj_Drive <- obj_Sharepoint$get_drive()
obj_Drive$download_folder(src = "directory_On_The_Sharepoint/", dest = "D:/My_Directory_On_My_Computer")

Comments

0

Here is another approach that can be considered to download an Excel file from SharePoint Online:

library(RDCOMClient)
xlApp <- COMCreate("Excel.Application")
xlApp[["DisplayAlerts"]] <- FALSE
xlApp[["Visible"]] <- TRUE

xlWbk <- xlApp$Workbooks()$Open("https://XXXX.sharepoint.com/.../my_Excel_File.xlsx")
xlWbk$SaveAs("C:\\Downloads\\my_Excel_File.xlsx")

Basically, you open the Excel file and you save it in a directory afterwards.

Comments

0

You can also consider the following approach :

library(RDCOMClient)
FSO <- COMCreate("scripting.filesystemobject")

FromPath <- "//sharepoint.xxx/xxx/xxx/test.xlsx"
ToPath <- "C:\\xxx\\test.xlsx"
FSO$CopyFile(FromPath, ToPath)

Comments
