How to generate sequence number for a set of rows while Insert?

Question

The original question was - how to attach a sequence number to a set of rows which are extracted from an xml string.

Updated script below. Adopted from the solution provided by user @T N .

I just need to know whether this script is correct or am i doing anything wrong.

Note:- this seems to produce the correct result though!

Update:06/25/24 - I am still looking for an answer.

CREATE TABLE [dbo].[UseSequence](
    [UId] [bigint] NOT NULL,
    [Version] [int] NOT NULL,
    [Speed] [int] NOT NULL,
    [TrainType] [int] NOT NULL,
    [HeadEnd] [int] NOT NULL,
    [Restricted] [int] NOT NULL,
    [Sequence] [int] NOT NULL
);

DECLARE 
    @BulletindataText nvarchar(max)
    ,@BulletinDataXml XML
    ,@BulletinSpeedsXml xml
    ,@DigitalUID    bigint
    ,@Version int;
;

Set @BulletindataText = '<BulletinData>
    <Bul>       
        <BulletinSpeeds>
            <BulletinSpeedRestriction>
                <Speed >30</Speed>
                <TrainType Numeric="2">Passenger</TrainType>
                <HeadEndOnlySpeedRestriction Numeric="2">No</HeadEndOnlySpeedRestriction>
                <RestrictedSpeed Numeric="2">No</RestrictedSpeed>
            </BulletinSpeedRestriction>
            <BulletinSpeedRestriction>
                <Speed >30</Speed>
                <TrainType Numeric="1">Freight</TrainType>
                <HeadEndOnlySpeedRestriction Numeric="2">No</HeadEndOnlySpeedRestriction>
                <RestrictedSpeed Numeric="2">No</RestrictedSpeed>
            </BulletinSpeedRestriction>
            <BulletinSpeedRestriction>
                <Speed >30</Speed>
                <TrainType Numeric="3">Intermodal</TrainType>
                <HeadEndOnlySpeedRestriction Numeric="2">No</HeadEndOnlySpeedRestriction>
                <RestrictedSpeed Numeric="2">No</RestrictedSpeed>
            </BulletinSpeedRestriction>
            <BulletinSpeedRestriction>
                <Speed >30</Speed>
                <TrainType Numeric="6">Commuter</TrainType>
                <HeadEndOnlySpeedRestriction Numeric="2">No</HeadEndOnlySpeedRestriction>
                <RestrictedSpeed Numeric="2">No</RestrictedSpeed>
            </BulletinSpeedRestriction>
            <BulletinSpeedRestriction>
                <Speed >30</Speed>
                <TrainType Numeric="5">TiltTrain</TrainType>
                <HeadEndOnlySpeedRestriction Numeric="2">No</HeadEndOnlySpeedRestriction>
                <RestrictedSpeed Numeric="2">No</RestrictedSpeed>
            </BulletinSpeedRestriction>
            <BulletinSpeedRestriction>
                <Speed >30</Speed>
                <TrainType Numeric="4">HighSpeedPassenger</TrainType>
                <HeadEndOnlySpeedRestriction Numeric="2">No</HeadEndOnlySpeedRestriction>
                <RestrictedSpeed Numeric="2">No</RestrictedSpeed>
            </BulletinSpeedRestriction>
        </BulletinSpeeds>
    </Bul>
</BulletinData>'
;

SET @BulletinDataXml = cast(@BulletinDataText as xml)
SET @BulletinSpeedsXml =  @BulletinDataXml.query('<BulletinSpeeds> {for $x in /BulletinData/Bul/BulletinSpeeds/child::* return $x} </BulletinSpeeds>'); 


-- Insert 1
Set @DigitalUID = 104;
Set @Version = 1;

-- Insert 2
--Set @DigitalUID = 104;
--Set @Version = 2;

INSERT INTO [dbo].[BulletinSpeedRestrictionUseSequence]
(DigitalUID
,[Version]
,Speed
,TrainType
,HeadEndOnlySpeedRestriction
,RestrictedSpeed,
[Sequence]
)
SELECT  @DigitalUID,
        @Version,
        a.c.value('Speed[1]','int') as 'Speed',
        a.c.value('TrainType[1]/@Numeric','int') as 'TrainType',
        a.c.value('HeadEndOnlySpeedRestriction[1]/@Numeric','int') as 'HeadEndOnlySpeedRestriction',
        a.c.value('RestrictedSpeed[1]/@Numeric','int') as 'RestrictedSpeed',
        row_number() over (partition by @DigitalUID, @Version order by a.c) AS SequenceNum
FROM @BulletinSpeedsXml.nodes('/BulletinSpeeds/BulletinSpeedRestriction') a(c)
;

Result

DigitalUID	Version	Speed	TrainType	HeadEndOnlySpeedRestriction	RestrictedSpeed	Sequence
104	1	30	2	2	2	1
104	1	30	1	2	2	2
104	1	30	3	2	2	3
104	1	30	6	2	2	4
104	1	30	5	2	2	5
104	1	30	4	2	2	6
104	2	30	2	2	2	1
104	2	30	1	2	2	2
104	2	30	3	2	2	3
104	2	30	6	2	2	4
104	2	30	5	2	2	5
104	2	30	4	2	2	6

Must you use sequence number? If you are only concerned about the insertion order, how about create a timestamp field with default value as current timestamp? — ray
– ray, Commented Apr 30, 2024 at 22:37
@ray i thought about that. but i have to make sure the rows do get different timestamp - since this is an xml insert. however, it is on me to test this out. I just put this question out such that i have a backup option by the time i figure out how the xml insert works with a timestamp. thanks for the honest feedback though! — Tech with Thiru
– Tech with Thiru, Commented Apr 30, 2024 at 22:45
Probably row_number() over (order by (select 1))? Apparently you have to trick Sql Server into sorting by a constant here. Usually this errors for good reason, but in this case I would assume the order to remain the same as in the file? — BenderBoy
– BenderBoy, Commented Apr 30, 2024 at 22:55
Beware of any answers that use the ROW_NUMBER() function along with an underspecified ORDER BY constraint. These assume that the generated numbers will reflect the original data order, but this is not guaranteed. It may seem true most or all of the time with small data sets, but as data scales up, the optimizer is free to use parallel operations, hash tables, and possibly other processing techniques that no longer preserve this assumed order. — T N
– T N, Commented May 1, 2024 at 4:34

T N · Accepted Answer · 2024-05-01 07:17:56Z

1

Regarding the sequence number, if you need to establish a well-defined order withing each set of values (having the same UID and Version), I believe an explicit SequenceNum or Ordinal column is the best approach. I would avoid using an IDENTITY column that doubles as both a PK and a row order, as this could cause difficulties in the future if you need to modify the data in some way that adjusts that defined order. I think it is best to keep PK and ordinal separate.

Based on the mixed order in your posted results, I am going to assume that this reflects the order of your source data XML and that you wish to guarantee that the original order is reflected in the assigned SequenceNum value.

I also reverse engineered some source data based on your query structure and an assumption that the @UID, @Version, and @BSpeedsXml values actually originate from a table.

CREATE TABLE Data (
    UID INT,
    Version INT,
    BSpeedsXml XML
)

INSERT Data
VALUES
    (115967, 1, '<BSpeed>
      <Restriction><Speed>60</Speed><TrainType Numeric="2" /></Restriction>
      <Restriction><Speed>40</Speed><TrainType Numeric="1" /></Restriction>
      <Restriction><Speed>40</Speed><TrainType Numeric="3" /></Restriction>
      <Restriction><Speed>60</Speed><TrainType Numeric="6" /></Restriction>
      <Restriction><Speed>60</Speed><TrainType Numeric="5" /></Restriction>
      <Restriction><Speed>60</Speed><TrainType Numeric="4" /></Restriction>
    </BSpeed>'),
    (115967, 2, '<BSpeed>
      <Restriction><Speed>60</Speed><TrainType Numeric="1" /></Restriction>
      <Restriction><Speed>40</Speed><TrainType Numeric="2" /></Restriction>
      <Restriction><Speed>40</Speed><TrainType Numeric="4" /></Restriction>
      <Restriction><Speed>60</Speed><TrainType Numeric="5" /></Restriction>
      <Restriction><Speed>60</Speed><TrainType Numeric="6" /></Restriction>
      <Restriction><Speed>60</Speed><TrainType Numeric="3" /></Restriction>
    </BSpeed>')

Based on some of the answers to this question, I have a couple of candidate solutions:

-- Calculate node position by counting preceding sibling nodes.
-- This may become excessively slow as the node count increases.
SELECT  d.UID,
        d.Version,
        a.c.value('Speed[1]','int') as 'Speed',
        a.c.value('TrainType[1]/@Numeric','int') as 'TrainType',
        a.c.value('for $i in . return count(../*[. << $i]) + 1', 'int') AS SequenceNum
FROM Data d
CROSS APPLY d.BSpeedsXml.nodes('/BSpeed/Restriction') a(c)
ORDER BY UID, Version, SequenceNum

-- Generate position numbers using ROW_NUMBER() OVER(ORDER BY xml_node).
-- (This is potentially the best answer, but I cannot find documentation
-- stating that "ORDER BY xml_node" is well-defined.)
SELECT  d.UID,
        d.Version,
        a.c.value('Speed[1]','int') as 'Speed',
        a.c.value('TrainType[1]/@Numeric','int') as 'TrainType',
        row_number() over (partition by UID, Version order by a.c) AS SequenceNum
FROM Data d
CROSS APPLY d.BSpeedsXml.nodes('/BSpeed/Restriction') a(c)
ORDER BY UID, Version, SequenceNum

-- Count the nodes, generate sequence numbers, and then access the data using those
-- sequence numbers as indexes.
-- Note: GENERATE_SERIES() function is only available in SQL Server 2022 and later.
SELECT  d.UID,
        d.Version,
        a.c.value('Speed[1]','int') as 'Speed',
        a.c.value('TrainType[1]/@Numeric','int') as 'TrainType',
        s.value AS SequenceNum
FROM Data d
CROSS APPLY GENERATE_SERIES(1, d.BSpeedsXml.value('count(/BSpeed/Restriction)', 'int')) s
CROSS APPLY d.BSpeedsXml.nodes('/BSpeed/Restriction[sql:column("s.value")]') a(c)
ORDER BY UID, Version, SequenceNum

-- Alternative to GENERATE_SERIES() for SQL Server versions prior to 2022.
-- Count the nodes, generate sequence numbers, and then access the data using those
-- sequence numbers as indexes.
SELECT  d.UID,
        d.Version,
        a.c.value('Speed[1]','int') as 'Speed',
        a.c.value('TrainType[1]/@Numeric','int') as 'TrainType',
        s.value AS SequenceNum
FROM Data d
CROSS APPLY ( -- Alternative to GENERATE_SERIES() for older SQL Server versions
    SELECT TOP (d.BSpeedsXml.value('count(/BSpeed/Restriction)', 'int'))
        ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS value
    FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) n1(value)
    CROSS JOIN (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) n2(value) -- up to 100
    CROSS JOIN (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) n3(value) -- up to 1000
    CROSS JOIN (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) n4(value) -- up to 10,000
    -- Add more CROSS JOINs if needed based on expected max count.
) s
CROSS APPLY d.BSpeedsXml.nodes('/BSpeed/Restriction[sql:column("s.value")]') a(c)
ORDER BY UID, Version, SequenceNum

All produce the same results:

UID	Version	Speed	TrainType	SequenceNum
115967	1	60	2	1
115967	1	40	1	2
115967	1	40	3	3
115967	1	60	6	4
115967	1	60	5	5
115967	1	60	4	6
115967	2	60	1	1
115967	2	40	2	2
115967	2	40	4	3
115967	2	60	5	4
115967	2	60	6	5
115967	2	60	3	6

See this db<>fiddle for a demo. (Alternate demo at SQL Fiddle.)

edited May 1, 2024 at 7:17

answered May 1, 2024 at 6:35

T N

10.6k1 gold badge12 silver badges30 bronze badges

Sign up to request clarification or add additional context in comments.

6 Comments

Tech with Thiru Over a year ago

@T N ; I have updated my post with the script I am using. I am quite not sure how to extract from the original xml (provided in OP) and populate the Data table in your solution. I tried about a day but i can't quite figure out how to do this. Would you be kind enough to update your answer using the script i provided? Thanks!

Tech with Thiru Over a year ago

Oh Nvm. I got my answer. thanks!

Tech with Thiru Over a year ago

Btw, I couldn't understand this comment you made in your solution. could you please elaborate what you meant by cannot find documentation stating that "ORDER BY xml_node" is well-defined

T N Over a year ago

@blogs4t - The SELECT - ORDER BY documentation states "Columns of type ntext, text, image, geography, geometry, and xml cannot be used in an ORDER BY clause." However, it appears that the OVER Clause for a window function follows different rules, allowing an xml column. After finding code that used this ability, I searched for, but could not find documentation that allows this or that defines what xml order is.

Tech with Thiru Over a year ago

@T N - I have updated my post with the complete script which i use for the task, adopted from your solution. However, I won't need the intermediate table Data which you used in your answer. Can you please kindly review the updated script from the post and let me know if it is correct or am i missing anything or doing something wrong? Also, could you please explain what a.c means or how it works in the order by a.c clause from the row_number() over statement?

|

Collectives™ on Stack Overflow

How to generate sequence number for a set of rows while Insert?

1 Answer 1

6 Comments

Your Answer

Linked

Hot Network Questions

DigitalUID	Version	Speed	TrainType	HeadEndOnlySpeedRestriction	RestrictedSpeed	Sequence
104	1	30	2	2	2	1
104	1	30	1	2	2	2
104	1	30	3	2	2	3
104	1	30	6	2	2	4
104	1	30	5	2	2	5
104	1	30	4	2	2	6
104	2	30	2	2	2	1
104	2	30	1	2	2	2
104	2	30	3	2	2	3
104	2	30	6	2	2	4
104	2	30	5	2	2	5
104	2	30	4	2	2	6

DigitalUID	Version	Speed	TrainType	HeadEndOnlySpeedRestriction	RestrictedSpeed	Sequence
104	1	30	2	2	2	1
104	1	30	1	2	2	2
104	1	30	3	2	2	3
104	1	30	6	2	2	4
104	1	30	5	2	2	5
104	1	30	4	2	2	6
104	2	30	2	2	2	1
104	2	30	1	2	2	2
104	2	30	3	2	2	3
104	2	30	6	2	2	4
104	2	30	5	2	2	5
104	2	30	4	2	2	6

Collectives™ on Stack Overflow

1 Answer 1

6 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related

DigitalUID	Version	Speed	TrainType	HeadEndOnlySpeedRestriction	RestrictedSpeed	Sequence
104	1	30	2	2	2	1
104	1	30	1	2	2	2
104	1	30	3	2	2	3
104	1	30	6	2	2	4
104	1	30	5	2	2	5
104	1	30	4	2	2	6
104	2	30	2	2	2	1
104	2	30	1	2	2	2
104	2	30	3	2	2	3
104	2	30	6	2	2	4
104	2	30	5	2	2	5
104	2	30	4	2	2	6