0

I have a host information table stored in a database of SQL Server, and the table has a text column storing a string formatted like Ansible inventory. See the text below for a sample item in the text column.

host-001.servers.company.com desc='Production Web Cache' env='Prod' patch_round='Beta' dc='Main' rhel_v='7.6' primary='[email protected]' secondary='[email protected]'

I need to extract certain attributes from the text column, e.g. extract desc='Production Web Cache', and get its value Production Web Cache. I want to use regular expression in a SQL query and hope to get some pointers.

Or, if you know another way of achieving this purpose, I will also highly appreciate your hints. Let me know if you need more clarification.

4
  • T-SQL would be a poor choice of a language to to do this, you could be better off using one with good string manipulation tools. Commented Dec 7, 2021 at 17:40
  • Can any of your key-value-pairs ever contain the characters = or '? If so, don't do this in SQL. Commented Dec 7, 2021 at 17:43
  • Tried running this through String_Split() to see how viable it was... Revoking my suggestion and agreeing with Larnu. T-SQL and built-in functions is the wrong tool for the job. Commented Dec 7, 2021 at 17:56
  • You probably could do this with a string splitter (not STRING_SPLIT), @JNevill , but it would need to run on a couple of assumptions: 1. A domain is present at the start and followed by a space. 2. Values cannot contain =. 3. All values are enclosed in single quotes (') and all names are not. It would, however, be truly ugly. Commented Dec 7, 2021 at 17:59

3 Answers 3

2

A very similar approach to John. I use a JSON splitter firstly to get the data into parts, though this puts the value with the next header. I use CHARINDEX to find the end of the value, and then use that LEFT/STUFF to get the 2 values into their respective parts. Then I use LAG to get the actual header, rather that the next values header. Finally, I remove the surrounding quotes.

This follows on the assumptions from my comment:

  1. A domain is present at the start and followed by a space.
  2. Values cannot contain =.
  3. All values are enclosed in single quotes (') and all names are not

Note I don't include the domain in the results, but the SQL should give you more than enough to work out how to add it:

DECLARE @YourString nvarchar(4000) = N'host-001.servers.company.com desc=''Production Web Cache'' env=''Prod'' patch_round=''Beta'' dc=''Main'' rhel_v=''7.6'' primary=''[email protected]'' secondary=''[email protected]''';

WITH CTE AS(
    SELECT *,
           LAG(ContentHeader) OVER (ORDER BY [Key]) AS ActualHeader
    FROM (VALUES(@YourString))V(YourString)
         CROSS APPLY(VALUES(STUFF(@YourString, 1, CHARINDEX(N' ',@YourString),N'')))S(NewString)
         CROSS APPLY OPENJSON('["' + REPLACE(NewString,'=','","') + '"]')OJ
         CROSS APPLY(VALUES(NULLIF(CHARINDEX('''',OJ.[value],2),0)))CI(I)
         CROSS APPLY(VALUES(LEFT(OJ.[Value],CI.I),STUFF(OJ.[Value],1,ISNULL(CI.I+1,0),'')))P(ContentValue,ContentHeader))
SELECT ActualHeader AS Header,
       REPLACE([ContentValue],'''','') AS [Value]
FROM CTE
WHERE ActualHeader IS NOT NULL;

db<>fiddle

Sign up to request clarification or add additional context in comments.

2 Comments

That's beautiful and I hate it. Bravo.
Neat. Looks like XML is an option also
2

A little ugly, but using a bit of JSON (to GTD the sequence) and the window function lead() over()

Example

Declare @YourTable table (ID int,SomeCol varchar(max))
 Insert Into @YourTable values
 (1,'host-001.servers.company.com desc=''Production Web Cache'' env=''Prod'' patch_round=''Beta'' dc=''Main'' rhel_v=''7.6'' primary=''[email protected]'' secondary=''[email protected]''')

  Select A.ID
       ,Host = left(SomeCol,charindex(' ',SomeCol+' '))
       ,B.*
  From  @YourTable A
  Cross Apply (

                Select Item =  ltrim(rtrim(right(Value,charindex(' ',reverse(Value)+' '))))
                      ,Value = ltrim(rtrim(replace(
                               IsNull(lead( left(Value,nullif(len(Value)+1-charindex(' ',reverse(Value)+' '),0)),1) over (order by [Key])
                                     ,lead(right(Value,charindex(' ',reverse(Value)+' ')),1) over (order by [key])
                                     ),'''','')))
                 From  OpenJSON( '["'+replace(string_escape(SomeCol,'json'),'=','","')+'"]' )

              ) B
 Where B.Value is not null
 

Results

ID  Host                             Item           Value
1   host-001.servers.company.com     desc           Production Web Cache 
1   host-001.servers.company.com     env            Prod 
1   host-001.servers.company.com     patch_round    Beta 
1   host-001.servers.company.com     dc             Main 
1   host-001.servers.company.com     rhel_v         7.6 
1   host-001.servers.company.com     primary        [email protected] 
1   host-001.servers.company.com     secondary      [email protected]

EDIT - Injected "HOST="

Declare @YourTable table (ID int,SomeCol varchar(max))
 Insert Into @YourTable values
 (1,'host-001.servers.company.com desc=''Production Web Cache'' env=''Prod'' patch_round=''Beta'' dc=''Main'' rhel_v=''7.6'' primary=''[email protected]'' secondary=''[email protected]''')

 
 Select A.ID
       ,B.*
  From  @YourTable A
  Cross Apply (
                Select Item =  ltrim(rtrim(right(Value,charindex(' ',reverse(Value)+' '))))
                      ,Value = ltrim(rtrim(replace(
                               IsNull(lead(left(Value,nullif(len(Value)+1-charindex(' ',reverse(Value)+' '),0)),1) over (order by [Key])
                                     ,lead(right(Value,charindex(' ',reverse(Value)+' ')),1) over (order by [key])
                                     ),'''','')))
                 From  OpenJSON( '["'+replace(string_escape('host='+SomeCol,'json'),'=','","')+'"]' )

              ) B
 Where B.Value is not null

Results

ID  Item        Value
1   host        host-001.servers.company.com
1   desc        Production Web Cache
1   env         Prod
1   patch_round Beta
1   dc          Main
1   rhel_v      7.6
1   primary     [email protected]
1   secondary   [email protected]

3 Comments

Very similar route to what I was going down, I'll post shortly anyway.
@Larnu Just my first thought. Looking forward to your approach.
There you go, John.
0

Ideally, your data should be stored in separate columns. But if you are going to cram it into one column, at least use a recognized format such as XML or JSON.

Given that single quotes are valid XML attribute delimiters, you can transform this into XML and use XQuery.

It's not pretty, because the hostname value is not delimited

SELECT
  v3.n.value('@host','varchar(255)'),
  v3.n.value('@desc','varchar(1000)')
FROM t
CROSS APPLY(VALUES(
    CHARINDEX(' ', t.value)
)) v1(space)
CROSS APPLY(VALUES(
    CAST(
        '<x host=''' +
        CASE WHEN v1.space = 0
            THEN t.value
            ELSE LEFT(t.value, v1.space - 1) + '''' + SUBSTRING(t.value, v1.space, LEN(t.value))
            END +
        ' />'
      AS xml)
)) v2(xml)
CROSS APPLY v2.xml.nodes('x') v3(n);

db<>fiddle

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.