3

I have the following code that creates a dataframe based on user input:

import pandas as pd
  from pandas import DataFrame

 publications = 
 pd.read_csv("C:/Users/nkambhal/data/pubmed_search_results_180730.csv", sep= 
 "|")

 publications['title'] = publications['title'].fillna('')

 search_term = input('Enter the term you are looking for: ')

 publications[['title','publication_id']] 
 [publications['title'].str.contains(search_term)]
 title_mask = 
 publications.title.str.lower().str.contains(search_term.lower())
 new = publications.loc[title_mask, ['title', 'publication_ID']]

Now I want to use the publication ID's in the new dataframe to run this SQL query:

SELECT
   author_profile
   pub_lst.* 
FROM
   pub_lst
JOIN
    author_profile
        ON pub_lst.author_id = author_profile.author_id
WHERE
    pub_lst.publication_id IN (67855,65559);

In the where statement, I want the IDs in the new dataframe to be there. So in the data frame there are the publication_ids ( 5, 6, 4) then I want them to be added to the query.

How can I add the appropriate publication_ids to the SQL query and run it through python and save it to a csv file?

1 Answer 1

2

To put data into a string, you can use the python's str.format function. You can read about it a little more here

For your query string, it should work out like so:

query_string = """
SELECT
   author_profile
   pub_lst.* 
FROM
   pub_lst
JOIN
    author_profile
        ON pub_lst.author_id = author_profile.author_id
WHERE
    pub_lst.publication_id IN {};
"""
print(query_string.format(str(tuple(new.publication_ID.values))))

As for the running the query, you will need to use a python module for whichever database you want to connect it. Such as PyMySQL for connecting to a MySQL database. https://pypi.org/project/PyMySQL/

Although, you could use an ORM such as peewee or SqlAlchemy to make your life a little easier while dealing with SQL databases. Pandas and SqlAlchemy mix really well. But Peewee is easier to get started with.

For creating a csv, you could use the inbuild python csv module, pandas or Peewee or SqlAlchemy in ascending order of difficulty.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.