I need to read data from a CSV file and load it into a database, and I am using the bcp command-line utility for this purpose. My CSV file looks like this:

First_name,Last_name,EmpID,company,languages
"Jack","Thomas","57616","IBM","C
C++
JAVA
COBOL
PERL
SQL
 "
"Tim","Cook","10001","Apple","Python
C++
Java
XML
 "

As you can see, the last column (languages) holds several values, each on its own line. The bcp command is scripted to look for the row delimiter, so it ends the row as soon as it has read the first value of the last column. How can I parse this file using bcp?
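To see why a line-oriented loader stops early on this file, it can help to compare a plain line split with a quote-aware CSV parser. The sketch below uses Python's standard csv module purely as an illustration (it is not bcp, and SAMPLE is simply the data from the question embedded inline); it counts physical lines versus logical records.

```python
import csv
import io

# Sample data copied from the question; the quoted last field spans several lines.
SAMPLE = (
    'First_name,Last_name,EmpID,company,languages\n'
    '"Jack","Thomas","57616","IBM","C\n'
    'C++\n'
    'JAVA\n'
    'COBOL\n'
    'PERL\n'
    'SQL\n'
    ' "\n'
    '"Tim","Cook","10001","Apple","Python\n'
    'C++\n'
    'Java\n'
    'XML\n'
    ' "\n'
)

# A naive split on the row terminator treats every physical line as a row.
physical_lines = SAMPLE.splitlines()

# A quote-aware reader keeps the embedded newlines inside the quoted field.
logical_rows = list(csv.reader(io.StringIO(SAMPLE, newline="")))

print(len(physical_lines))  # 13 physical lines, header included
print(len(logical_rows))    # 3 logical records, header included
```

The gap between the two counts is the mismatch described above: a tool that splits on the row terminator alone sees the continuation lines as separate, malformed rows.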

  • bcp/BULK INSERT doesn't support quoted text; at least not before SQL Server 2019. You'll need to use a different tool, for example SSIS. Commented Dec 14, 2018 at 10:17
  • The "Import and Export Data" tool will help you. It's intuitive to use. Commented Dec 14, 2018 at 10:55
  • What have you tried, and where are you stuck? Note that this site is not a free code/script writing service! Commented Dec 14, 2018 at 12:37

2 Answers

I don't see much evidence that you tried to find a solution or did research on your own, which is what is expected on [SO].

Here is a possible PowerShell solution: it imports the CSV, converts the multiline column into a semicolon-separated one, and exports the result as CSV.

Import-Csv .\old.csv| ForEach-Object {
    $_.Languages=$_.Languages -split "`r?`n" -ne ' ' -join ';'
    $_
} | Export-Csv .\New.csv -NoTypeInformation

This will result in all columns double quoted:

> Get-Content .\new.csv
"First_name","Last_name","EmpID","company","languages"
"Jack","Thomas","57616","IBM","C;C++;JAVA;COBOL;PERL;SQL"
"Tim","Cook","10001","Apple","Python;C++;Java;XML"

Another PowerShell one liner will remedy this:

(Get-Content .\new.csv).trim('"') -replace '","',',' | Set-Content .\new.csv

First_name,Last_name,EmpID,company,languages
Jack,Thomas,57616,IBM,C;C++;JAVA;COBOL;PERL;SQL
Tim,Cook,10001,Apple,Python;C++;Java;XML
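For readers who want to check the same transformation outside PowerShell, here is a minimal Python sketch. It is not part of the original answer. The function name flatten_multiline_column is hypothetical, and the column name is matched case-sensitively (the header in the question spells it `languages`). It trims each line, drops empty ones, and joins the rest with a semicolon; the csv writer's default quoting only quotes fields that need it.

```python
import csv
import io


def flatten_multiline_column(csv_text, column, sep=";"):
    """Return CSV text in which the embedded newlines of `column` become `sep`."""
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    out = io.StringIO(newline="")
    # Default quoting (QUOTE_MINIMAL) leaves plain fields unquoted.
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        parts = [p.strip() for p in row[column].splitlines()]
        # Drop the blank remainder left by the ' ' before the closing quote.
        row[column] = sep.join(p for p in parts if p)
        writer.writerow(row)
    return out.getvalue()


sample = (
    'First_name,Last_name,EmpID,company,languages\n'
    '"Tim","Cook","10001","Apple","Python\nC++\nJava\nXML\n "\n'
)
print(flatten_multiline_column(sample, "languages"))
```

Unlike the `-ne ' '` filter, this sketch strips whitespace from every entry, so it behaves slightly differently if a language name carries leading or trailing spaces.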

EDIT: one combined .ps1 file

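## Q:\Test\2018\12\14\SO_53777634.ps1

$FileIn = '.\old.csv'
$FileOut= '.\new.csv'

# Join the lines of the multiline Languages column with ';' (the lone ' ' entry is dropped)
Import-Csv $FileIn | ForEach-Object {
    $_.Languages=$_.Languages -split "`r?`n" -ne ' ' -join ';'
    $_
} | Export-Csv $FileOut -NoTypeInformation

# Strip the double quotes that Export-Csv puts around every field
(Get-Content $FileOut).trim('"') -replace '","',',' | Set-Content $FileOut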
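One caveat about the quote-stripping step: it works for the sample data because no field contains a comma. The Python sketch below, which uses a hypothetical record whose company name contains a comma, shows how stripping the quotes textually changes the apparent column count, and how choosing a delimiter that never occurs in the data (here `|`, an assumption about the data) avoids the need for quotes altogether.

```python
import csv
import io

# Hypothetical record: the company name contains a comma.
row = ["Tim", "Cook", "10001", "Apple, Inc.", "Python;C++"]

# Every field quoted, similar to the Export-Csv output shown above.
buf = io.StringIO(newline="")
csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n").writerow(row)
fully_quoted = buf.getvalue().rstrip("\n")

# Textual quote removal, the same idea as .trim('"') -replace '","', ','.
naive = fully_quoted.strip('"').replace('","', ",")

# Alternative: a delimiter assumed never to appear in the data, so no quoting is needed.
buf = io.StringIO(newline="")
csv.writer(buf, delimiter="|", lineterminator="\n").writerow(row)
piped = buf.getvalue().rstrip("\n")

print(naive)                    # Tim,Cook,10001,Apple, Inc.,Python;C++
print(len(naive.split(",")))    # 6 fields instead of 5
print(piped)                    # Tim|Cook|10001|Apple, Inc.|Python;C++
print(len(piped.split("|")))    # 5 fields
```

If a different delimiter is used, the field terminator passed to the loader would have to match it.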

6 Comments

When I run the command

Import-Csv .\old.csv| ForEach-Object { $_.Languages=$_.Languages -split "r?n" -ne ' ' -join ';' $_ } | Export-Csv .\New.csv -NoTypeInformation

I get this error message:

At line:1 char:102
+ ... $_.Languages=$_.Languages -split "r?n" -ne ' ' -join ';' $_} | Expo ...
+ ~~
Unexpected token '$_' in expression or statement.
+ CategoryInfo : ParserError: (:) [], ParentContainsErrorRecordException
+ FullyQualifiedErrorId : UnexpectedToken
What PowerShell version do you use? ($PSVersionTable) This looks like a copy/paste error; the lines shouldn't count up to char:102. See the edited answer.
I'm using PS version 5.1. Name : ConsoleHost Version : 5.1.16299.666 InstanceId : 448f3d4a-5248-4bb8-bef0-9a70200435ad
I am running the entire script as a single line, so that is probably why it shows char:102.
Try saving it and running it as a script, .\file.ps1 or whatever you name it.

Here is a SQL solution. It walks through your import file and parses the data into two tables using two loops: one for the 'master' table (emps) and one for the 'detail' table (langs).

Setup

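-- Drop the temp table only if it already exists
-- (selecting from a missing #tempTable would raise an error)
IF OBJECT_ID('tempdb..#tempTable') IS NOT NULL
  DROP TABLE #tempTable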

/*
Create Table emps
(
First_Name Varchar(25),
Last_Name VarChar(25),
EmpID VarChar(10),
Company VarChar(30)
)

Create Table langs
(
EmpID VarChar(10),
Lang VarChar(15)
)
*/

Delete From langs
Delete From emps

CREATE TABLE #tempTable
(
  RowVal VarChar(Max)
)

Query

BULK INSERT #tempTable
FROM 'c:\Downloads\EmpLangs.txt' 
WITH 
(
    FIRSTROW = 2,
    ROWTERMINATOR = '\n'
)

Declare @RowV VarChar(100)
--Use the following to get the location of each delimiter
Declare @f1q1 Int
Declare @f1q2 Int
Declare @f2q1 Int
Declare @f2q2 Int
Declare @f3q1 Int
Declare @f3q2 Int
Declare @f4q1 Int
Declare @f4q2 Int
Declare @f5q1 Int

Declare @empid VarChar(10)

Declare @vHeader Int = 1  --Is header row?

Declare vCursor CURSOR For Select RowVal  From #tempTable

  Open vCursor;
  Fetch Next From vCursor Into @RowV

  While @@FETCH_STATUS = 0  --Walk through rows to parse
  Begin

   If @vHeader = 1
      Begin     
        Set @f1q1 = CHARINDEX('"',@RowV,1)
        Set @f1q2 = CHARINDEX('"',@RowV,@f1q1+1)

        Set @f2q1 = CHARINDEX('"',@RowV,@f1q2+1)
        Set @f2q2 = CHARINDEX('"',@RowV,@f2q1+1)

        Set @f3q1 = CHARINDEX('"',@RowV,@f2q2+1)
        Set @f3q2 = CHARINDEX('"',@RowV,@f3q1+1)

        Set @f4q1 = CHARINDEX('"',@RowV,@f3q2+1)
        Set @f4q2 = CHARINDEX('"',@RowV,@f4q1+1)

        Set @f5q1 = CHARINDEX('"',@RowV,@f4q2+1)

        Insert Into emps Values
        (SUBSTRING(@RowV,@f1q1+1,@f1q2-@f1q1-1),
         SUBSTRING(@RowV,@f2q1+1,@f2q2-@f2q1-1),
         SUBSTRING(@RowV,@f3q1+1,@f3q2-@f3q1-1),
         SUBSTRING(@RowV,@f4q1+1,@f4q2-@f4q1-1) 
        )

        Set @vHeader = 0
        Set @empid = SUBSTRING(@RowV,@f3q1+1,@f3q2-@f3q1-1)
        Insert Into langs Values (@empid,SUBSTRING(@RowV,@f5q1+1,Len(@RowV)- @f5q1 + 1))  -- ADDED to get the trailing language from the header row
      End

     Fetch Next From vCursor Into @RowV
       While @@FETCH_STATUS = 0  And @vHeader = 0 And @RowV <> ' "'
         Begin
            Insert Into langs Values (@empid,@RowV)
            Fetch Next From vCursor Into @RowV
            If @RowV = ' "' 
             Begin
                If @@FETCH_STATUS = 0 
                  Begin
                     Fetch Next From vCursor Into @RowV
                     Set @vHeader = 1
                  End
             End
         End
  End;

  Close vCursor
  Deallocate vCursor

Select e.*,l.lang From emps e
INNER JOIN
langs l ON e.EmpID = l.EmpID

Result

First_Name  Last_Name   EmpID   Company Lang
Jack        Thomas      57616   IBM     C
Jack        Thomas      57616   IBM     C++
Jack        Thomas      57616   IBM     JAVA
Jack        Thomas      57616   IBM     COBOL
Jack        Thomas      57616   IBM     PERL
Jack        Thomas      57616   IBM     SQL
Tim         Cook        10001   Apple   Python
Tim         Cook        10001   Apple   C++
Tim         Cook        10001   Apple   Java
Tim         Cook        10001   Apple   XML
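The master/detail split above can also be sketched in a few lines of Python, which may help when checking the expected row counts before loading. This is an illustration, not the answer's method: the function name split_master_detail is hypothetical, and it relies on a quote-aware CSV reader rather than on a cursor.

```python
import csv
import io


def split_master_detail(csv_text):
    """Return (emps, langs): one emps row per employee, one langs row per language."""
    emps, langs = [], []
    for rec in csv.DictReader(io.StringIO(csv_text, newline="")):
        emps.append((rec["First_name"], rec["Last_name"], rec["EmpID"], rec["company"]))
        for lang in rec["languages"].splitlines():
            lang = lang.strip()
            if lang:  # skip the blank remainder before the closing quote
                langs.append((rec["EmpID"], lang))
    return emps, langs


sample = (
    'First_name,Last_name,EmpID,company,languages\n'
    '"Jack","Thomas","57616","IBM","C\nC++\nJAVA\nCOBOL\nPERL\nSQL\n "\n'
    '"Tim","Cook","10001","Apple","Python\nC++\nJava\nXML\n "\n'
)
emps, langs = split_master_detail(sample)
print(len(emps), len(langs))  # 2 10
```

Joining emps and langs on EmpID gives the same ten rows as the result listed above.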
