How can I merge two columns into one (final output) (Python/SQLite)?

import sqlite3
import pandas as pd

# load data
df = pd.read_csv('CurriculumAuditReport.csv')

# strip whitespace from headers
df.columns = df.columns.str.strip()

con = sqlite3.connect("sans.db")

# drop data into database
df.to_sql("MyTable", con, if_exists='replace')

qry = """
SELECT department, count(*) as cnt
FROM MyTable
WHERE CompletedTraining = 'Incomplete'
GROUP BY department
"""

qry2 = """
SELECT [Employee Name], Department, [Date Assigned]
FROM MyTable
WHERE CompletedTraining = 'Incomplete'
ORDER BY Department ASC
"""


df = pd.read_sql_query(qry, con)
df2 = pd.read_sql_query(qry2, con)

print(df.to_json())
print(df2)


con.close()

Can I merge department with cnt, so that I have AQPSD: 6, ASD: 8, CO: 2, etc.?

Current output: 2 columns, as expected

   Department  count(*)

0       AQPSD         6
1         ASD         8
2          CO         2
3       ECARS         3
4          ED         6
5          EO         4
6         ISD         4
7        MSCD         5
8         OIS         1
9          RD         2
10        TTD         4

Desired output: 1 column (kind of hard to display, but it's my end goal)

Department

0       AQPSD 6
1         ASD 8
2          CO 2
3       ECARS 3
4          ED 6
5          EO 4
6         ISD 4
7        MSCD 5
8         OIS 1
9          RD 2
10        TTD 4
  • Do I understand it correctly - you want to have a single column containing values like AQPSD: 6, ASD: 8, etc. ? Can you post your desired output? Commented Dec 28, 2017 at 22:24
  • How about df.set_index('Department').to_dict() ? Commented Dec 28, 2017 at 22:29
  • @MaxU I have updated the post to reflect what I would like the output to be. Commented Dec 28, 2017 at 23:08
  • @tarashypka Is that the same format that chart.js uses? Commented Dec 28, 2017 at 23:12

2 Answers

4

You can either do it on the SQLite side or in Pandas.

Option 1 (using SQLite):

qry = """
SELECT department || ' ' || cast(count(*) as text) as col_name
FROM MyTable
WHERE CompletedTraining = 'Incomplete'
GROUP BY department
"""
df = pd.read_sql(qry, con)
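
To try Option 1 without the original CSV, here is a minimal self-contained sketch. The sample rows, the in-memory database, and the alias dept_count are assumptions made for illustration; only the table name MyTable and the CompletedTraining filter come from the question.

```python
import sqlite3

import pandas as pd

# Hypothetical sample rows standing in for CurriculumAuditReport.csv
sample = pd.DataFrame({
    "Department": ["ASD", "ASD", "CO", "CO"],
    "CompletedTraining": ["Incomplete", "Incomplete", "Incomplete", "Complete"],
})

# In-memory database so the sketch leaves no file behind
con = sqlite3.connect(":memory:")
sample.to_sql("MyTable", con, if_exists="replace", index=False)

# || is SQLite's string concatenation operator; the count is cast to text first
qry = """
SELECT Department || ' ' || CAST(COUNT(*) AS TEXT) AS dept_count
FROM MyTable
WHERE CompletedTraining = 'Incomplete'
GROUP BY Department
ORDER BY Department
"""
merged = pd.read_sql_query(qry, con)
con.close()

print(merged)
```

The explicit ORDER BY is there only to make the row order predictable; the concatenation itself does not depend on it.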

Option 2 (using Pandas):

Assuming we have the following DataFrame:

In [79]: df
Out[79]:
   department  cnt
0       AQPSD    6
1         ASD    8
2          CO    2
3       ECARS    3
4          ED    6
5          EO    4
6         ISD    4
7        MSCD    5
8         OIS    1
9          RD    2
10        TTD    4

Let's convert it to a single-column DataFrame:

In [80]: df['department'] = df['department'] + ' ' + df.pop('cnt').astype(str)

In [81]: df
Out[81]:
   department
0     AQPSD 6
1       ASD 8
2        CO 2
3     ECARS 3
4        ED 6
5        EO 4
6       ISD 4
7      MSCD 5
8       OIS 1
9        RD 2
10      TTD 4
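
If the goal is instead the AQPSD: 6, ASD: 8 style mapping mentioned in the question, a dictionary may be a better fit than a concatenated string column. This is only a sketch, assuming the department and cnt column names shown above; the variable names counts and payload are made up for the example.

```python
import json

import pandas as pd

# Small stand-in for the two-column DataFrame shown above
df = pd.DataFrame({"department": ["AQPSD", "ASD", "CO"], "cnt": [6, 8, 2]})

# Build a plain dict; int() makes sure the values are ordinary Python
# integers so that json.dumps can serialize them
counts = {dept: int(n) for dept, n in zip(df["department"], df["cnt"])}

# Serialize the mapping, e.g. to hand it to a front end
payload = json.dumps(counts)
print(payload)
```

Unlike the string version, this keeps the counts numeric, so they can still be summed or sorted later.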

PS: This can easily be done without using SQLite at all, but we would need a small reproducible sample data set in the original format (one that reproduces the data from CurriculumAuditReport.csv).
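
As a rough idea of what the pandas-only route could look like, here is a sketch on made-up data. The sample rows are hypothetical, and the column names Department and CompletedTraining are taken from the queries in the question, so adjust them to the real CSV.

```python
import pandas as pd

# Hypothetical rows in place of pd.read_csv('CurriculumAuditReport.csv')
raw = pd.DataFrame({
    "Department": ["ASD", "ASD", "CO", "CO"],
    "CompletedTraining": ["Incomplete", "Incomplete", "Incomplete", "Complete"],
})
raw.columns = raw.columns.str.strip()

# Equivalent of WHERE CompletedTraining = 'Incomplete'
incomplete = raw[raw["CompletedTraining"] == "Incomplete"]

# Equivalent of GROUP BY Department with count(*) AS cnt
counts = incomplete.groupby("Department").size().reset_index(name="cnt")

# Merge the two columns into a single string column
merged = (counts["Department"] + " " + counts["cnt"].astype(str)).to_frame("Department")
print(merged)
```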

0

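This is a step-by-step solution:

Add a new column, converting the count column to a string with astype(str). Adjust the column names to match your DataFrame; with the query in the question they would likely be department and cnt.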

df['new_column'] = df['Department'] + " " + df['count'].astype(str)

Delete the columns that you no longer need:

del df['Department']
del df['count']

Rename new_column:

df.rename(columns={'new_column': 'Department'}, inplace=True)

I know this takes a lot of steps, but sometimes it is better to break a problem down into small steps to understand it better.
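
For anyone who wants to run the steps end to end, here is a small sketch on a hypothetical three-row DataFrame. It uses the column names from this answer (Department and count), and swaps the two del statements and the inplace rename for drop and a reassigned rename, which is a stylistic choice rather than a requirement.

```python
import pandas as pd

# Hypothetical sample using the column names assumed in this answer
df = pd.DataFrame({"Department": ["AQPSD", "ASD", "CO"], "count": [6, 8, 2]})

# Step 1: build the combined column (the count must be a string to concatenate)
df["new_column"] = df["Department"] + " " + df["count"].astype(str)

# Step 2: drop both original columns in one call
df = df.drop(columns=["Department", "count"])

# Step 3: give the combined column the original name
df = df.rename(columns={"new_column": "Department"})

print(df)
```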
