
I am using Microsoft SharePoint. I have a URL, and from that URL I need to retrieve all of the data it contains: photos, videos, folders, subfolders, files, posts, etc. I then need to store that data in a database (SQL Server). I am using Python.

Could anyone please suggest how to do this? I am a beginner at accessing SharePoint and working with this sort of thing.

2 Comments
  • Welcome to Stack Overflow! Can you please explain what you have tried and which methods you started with? For a question to attract a proper answer, you need to describe your own efforts as well. Commented Jan 30, 2020 at 5:27
  • I took the URL and tried to get its data using the Microsoft Graph API, but I could not retrieve all of it. When I open the URL I can see the information I need, but I have no idea how to extract it and store it in my database. Commented Jan 30, 2020 at 6:14

4 Answers


Here's starter code for connecting to SharePoint through Python and accessing the list of files and folders, as well as the contents of individual files. You can build on top of this to suit your needs.

Please note that this method works for public SharePoint sites that are accessible over the internet. I haven't tested this code against organisation-restricted SharePoint sites hosted on a company's intranet.

You will have to modify the link to the SharePoint file slightly, since you cannot access a SharePoint file in Python directly using the URL address copied from the web browser; the library expects a server-relative URL instead.
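As a minimal sketch of that conversion, you can derive the server-relative path from a browser URL with the standard library's urllib.parse (the site and file names below are made up for illustration):

```python
from urllib.parse import urlparse, unquote

def to_server_relative(browser_url):
    """Convert an absolute SharePoint URL copied from the browser
    into the server-relative path expected by the office365 library."""
    parsed = urlparse(browser_url)
    # Keep only the path component and decode %20 etc. back to spaces
    return unquote(parsed.path)

# Illustrative URL (hypothetical site/folder/file names)
url = "https://YourOrganisation.sharepoint.com/sites/YourSite/Shared%20Documents/Reports/plan.xlsx"
print(to_server_relative(url))  # /sites/YourSite/Shared Documents/Reports/plan.xlsx
```

Query strings and fragments that the browser appends are discarded by `urlparse`, which is usually what you want here.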


from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.files.file import File 

####inputs########
# This will be the URL that points to your sharepoint site. 
# Make sure you change only the parts of the link that start with "Your"
url_shrpt = 'https://YourOrganisation.sharepoint.com/sites/YourSharepointSiteName'
username_shrpt = 'YourUsername'
password_shrpt = 'YourPassword'
folder_url_shrpt = '/sites/YourSharepointSiteName/Shared%20Documents/YourSharepointFolderName/'

#######################



### Authentication: sign in to your SharePoint site ###
ctx_auth = AuthenticationContext(url_shrpt)
if ctx_auth.acquire_token_for_user(username_shrpt, password_shrpt):
  ctx = ClientContext(url_shrpt, ctx_auth)
  web = ctx.web
  ctx.load(web)
  ctx.execute_query()
  print('Authenticated into sharepoint as: ',web.properties['Title'])

else:
  print(ctx_auth.get_last_error())
############################


### Function for listing the file names in a SharePoint folder ###
### To list folder names instead of file names, change "sub_folders = folder.files" to "sub_folders = folder.folders" in the function below
def print_folder_contents(ctx, folder_url):
    try:
       
        folder = ctx.web.get_folder_by_server_relative_url(folder_url)
        fold_names = []
        sub_folders = folder.files #Replace files with folders for getting list of folders
        ctx.load(sub_folders)
        ctx.execute_query()
     
        for s_folder in sub_folders:
            
            fold_names.append(s_folder.properties["Name"])

        return fold_names

    except Exception as e:
        print('Problem printing out library contents: ', e)
######################################################
  
  
# Call the function by giving your folder URL as input
filelist_shrpt = print_folder_contents(ctx, folder_url_shrpt)

#Print the list of files present in the folder
print(filelist_shrpt)

Now that we can retrieve and print the list of files in a particular SharePoint folder, below is the code to read the contents of a particular file and save it to local disk, given its name and path in SharePoint.

#Specify the server-relative URL of the SharePoint file. Remember to change only the parts of the link that start with "Your"
file_url_shrpt = '/sites/YourSharepointSiteName/Shared%20Documents/YourSharepointFolderName/YourSharepointFileName'

#Load the sharepoint file content to "response" variable
response = File.open_binary(ctx, file_url_shrpt)

#Save the file to your offline path
with open("Your_Offline_File_Path", 'wb') as output_file:  
    output_file.write(response.content)

You can refer to the following link for connecting to SQL Server and storing the contents in tables: Connecting to Microsoft SQL server using Python

https://datatofish.com/how-to-connect-python-to-sql-server-using-pyodbc/
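As a rough sketch of that last step, the file list retrieved above could be written to a SQL Server table with pyodbc. The connection string, table name, and single-column schema here are assumptions you would adapt to your own database:

```python
def build_insert(table):
    """Build a parameterised INSERT statement.
    Assumes the target table has a single NVARCHAR column named 'file_name'."""
    return "INSERT INTO {} (file_name) VALUES (?)".format(table)

def store_file_names(file_names, conn_str, table="sharepoint_files"):
    """Insert a list of SharePoint file names into a SQL Server table."""
    import pyodbc  # pip install pyodbc
    with pyodbc.connect(conn_str) as conn:
        cursor = conn.cursor()
        cursor.executemany(build_insert(table), [(name,) for name in file_names])
        conn.commit()

# Hypothetical connection string -- adapt the driver, server and credentials
conn_str = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=YourServer;DATABASE=YourDatabase;"
    "UID=YourUser;PWD=YourPassword"
)
# store_file_names(filelist_shrpt, conn_str)
```

Using a parameterised query (`?` placeholders) rather than string formatting for the values avoids SQL injection from odd file names.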


10 Comments

Thank you so much for the information, but my SharePoint has documents under one URL plus a few other sub-sites. When I access those sites, the content is not in the form of folders; it is in the form of posts/discussions. Can you say anything about how to get that data?
If you have only the URL link to your SharePoint documents, you will have to extract the following parameters from it: "YourOrganisation", "YourSharepointSiteName", "YourSharepointFolderName" and "YourSharepointFileName". All of these are embedded in the SharePoint link itself, so parse the URL, extract the parameters, and then run the above script. A simple inspection of your SharePoint link will give you all these details.
Help me extract the data that is laid out as dialog/segments (i.e. in the format of boxes), similar to a Quora topic page (quora.com/topic/Fitness). I can't share my SharePoint data or details, so I attached a link that is similar to my page. Please, can you say how to get that data?
Dear @sai, there is no single solution for extracting both files and posts from a SharePoint link; the two are separate problems and need to be handled differently. For file extraction, the solution I gave would work fine. But for extracting post contents, you will have to use web-scraping techniques with Python's BeautifulSoup package. That is the technique you need for extracting posts or any other content from a web page, and BeautifulSoup has wonderful ways of doing it. You can take a look at dataquest.io/blog/web-scraping-beautifulsoup
Thank you so much for the suggestions and information.
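The comment above recommends BeautifulSoup for scraping posts. As a rough sketch of the same idea using only the standard library's html.parser (BeautifulSoup offers a much friendlier API for this; the `post-body` class and sample HTML below are invented for illustration, not taken from any real SharePoint page):

```python
from html.parser import HTMLParser

class PostExtractor(HTMLParser):
    """Collect the text of elements whose class attribute matches a target."""
    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self._capturing = False
        self.posts = []

    def handle_starttag(self, tag, attrs):
        # Start capturing when an element with the target class opens
        if dict(attrs).get("class") == self.target_class:
            self._capturing = True

    def handle_data(self, data):
        # Record the first non-empty text chunk inside the element
        if self._capturing and data.strip():
            self.posts.append(data.strip())
            self._capturing = False

# Illustrative HTML; a real page's markup and class names will differ
html = '<div class="post-body">First post</div><div class="post-body">Second post</div>'
parser = PostExtractor("post-body")
parser.feed(html)
print(parser.posts)  # ['First post', 'Second post']
```

With BeautifulSoup installed, the same extraction collapses to roughly `[d.get_text() for d in soup.find_all(class_="post-body")]`.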

A simpler solution is to create a shortcut to the SharePoint library in your OneDrive. The synced file is then readable with a plain pd.read_excel, pd.read_csv, etc.

For example:

df = pd.read_excel(r'C:\Users\badgenumber\OneDrive - company\Team folder\Ticketing System\ Inquiries\Inquiry tracker.xlsx')

2 Comments

This is the easiest way to do it if you don't have password auth available
FileNotFoundError: [Errno 2] No such file or directory: [Full path to file]

You might want to consider using Pysharepoint. It provides an easy interface for uploading and downloading files to and from SharePoint in Python.

import pysharepoint as ps

sharepoint_base_url = "https://<abc>.sharepoint.com/"
username = "username"
password = "password"

site = ps.SPInterface(sharepoint_base_url, username, password)

source_path = "Shared Documents/Shared/<Location>"
sink_path = "/full_sink_path/"
filename = "filename.ext"
sharepoint_site = "https://<abc>.sharepoint.com/sites/<site_name>"

site.download_file_sharepoint(source_path, sink_path, filename, sharepoint_site)
site.upload_file_sharepoint(source_path, sink_path, filename, sharepoint_site)

1 Comment

wish pysharepoint supported auth other than username/password

Did you check the Office365-REST-Python-Client?

https://github.com/vgrem/Office365-REST-Python-Client

For examples, see the following link:

https://github.com/vgrem/Office365-REST-Python-Client/tree/master/examples/sharepoint/files

Comments
