Python add multiple strings to another string with indexes single time

Question

I have a long text and a list of dicts that hold offsets into that text. I want to insert some strings at these offsets. If I do this in a loop, each insertion shifts the later indexes and I have to recalculate them, which I find very confusing. Is there a way to insert different strings at different indexes in a single pass?

My sample data:

main_str = 'Lorem Ipsum is simply dummy text of the printing and typesetting industry.'

My indexes list:

indexes_list = [
    {
      "type": "first_type",
      "endOffset": 5,
      "startOffset": 0,
    },
    {
      "type": "second_type",
      "endOffset": 22,
      "startOffset": 16,
    }
]

My main purpose: I want to wrap the given ranges in <span> tags with color styles based on their types, and then render the result directly in a template. Do you have another suggestion?

For example, I want to create the following from the variables main_str and indexes_list above (please ignore the color part of the styles; I set it dynamically from the type value in indexes_list):

new_str = '<span style="color:#FFFFFF">Lorem</span> Ipsum is <span style="color:#FFFFFF">simply</span> dummy text of the printing and typesetting industry.'

Frank · Accepted Answer · 2020-04-17 23:14:06Z

1

Build a new string so that main_str is left unchanged:

main_str = 'Lorem Ipsum is simply dummy text of the printing and typesetting industry.'
indexes_list = [
    {
      "type": "first_type",
      "startOffset": 0,
      "endOffset": 5,
    },
    {
      "type": "second_type",
      "startOffset": 16,
      "endOffset": 22,
    }
]

new_str = ""
index = 0
for i in indexes_list:
    start = i["startOffset"]
    end = i["endOffset"]
    new_str += main_str[index: start] + "<span>" + main_str[start:end] + "</span>"
    index = end
new_str += main_str[index:]
print(new_str)
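
The same single-pass idea can carry the per-type color the question asks about. The following is only a sketch: the names wrap_spans and TYPE_COLORS and the color values are made up for illustration, and it assumes the spans do not overlap.

```python
main_str = 'Lorem Ipsum is simply dummy text of the printing and typesetting industry.'
indexes_list = [
    {"type": "first_type", "startOffset": 0, "endOffset": 5},
    {"type": "second_type", "startOffset": 16, "endOffset": 22},
]

# Hypothetical mapping from "type" to a CSS color (not from the original post).
TYPE_COLORS = {"first_type": "#FF0000", "second_type": "#0000FF"}


def wrap_spans(text, spans, colors, default="#000000"):
    """Wrap each span of text in a colored <span>; assumes non-overlapping spans."""
    pieces = []
    index = 0
    # Sort by start so the input list does not have to be ordered already.
    for span in sorted(spans, key=lambda s: s["startOffset"]):
        start, end = span["startOffset"], span["endOffset"]
        color = colors.get(span["type"], default)
        pieces.append(text[index:start])  # untouched text before the span
        pieces.append(f'<span style="color:{color}">{text[start:end]}</span>')
        index = end
    pieces.append(text[index:])  # untouched text after the last span
    return "".join(pieces)


result = wrap_spans(main_str, indexes_list, TYPE_COLORS)
print(result)
```

Note that offsets 16 and 22 select 'imply ' in the sample text; offsets 15 and 21 would select 'simply', which is what the expected output in the question shows.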

answered Apr 17, 2020 at 23:14

Frank


1 Comment

kamilyrb Over a year ago

Your solution works correctly. Thanks for your answer. Actually, I was looking for whether this is possible in a single step instead of a loop.

Mad Physicist · Accepted Answer · 2020-04-18 05:33:18Z

1

Here is a solution without any imperative for loops. It still uses plenty of looping for the list comprehensions.

# Get all the indices and label them as starts or ends.
starts = [(o['startOffset'], True) for o in indexes_list]
ends = [(o['endOffset'], False) for o in indexes_list]

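# Sort everything, adding sentinels for the start and end of the string
# so that text before the first span and after the last span is kept...
all_indices = sorted(starts + ends + [(0, False), (len(main_str), False)])

# ...so it is possible to zip together adjacent pairs and extract substrings.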
pieces = [
    (s[1], main_str[s[0]:e[0]])
    for s, e in zip(all_indices, all_indices[1:])
]

# And then join all the pieces together with a bit of conditional formatting.
formatted = ''.join([
    f"<span>{part}</span>" if is_start else part
    for is_start, part in pieces
])

formatted
# '<span>Lorem</span> Ipsum is s<span>imply </span>dummy text of the printing and typesetting industry.'

Also, although you said you do not want for loops, it is important to note that you do not have to do any index modification if you do the updates in reverse order.

def update_str(s, spans):
    for lookup in sorted(spans, reverse=True, key=lambda o: o['startOffset']):
        start = lookup['startOffset']
        end = lookup['endOffset']
        before, span, after = s[:start], s[start:end], s[end:]
        s = f'{before}<span>{span}</span>{after}'
    return s

update_str(main_str, indexes_list)
# '<span>Lorem</span> Ipsum is s<span>imply </span>dummy text of the printing and typesetting industry.'

edited Apr 18, 2020 at 5:33

Mad Physicist


answered Apr 17, 2020 at 23:11

mcskinner


4 Comments

kamilyrb Over a year ago

Thanks for your notes and answer; I updated the indexes. Actually, I need a new string with the new strings inserted at the given indexes. I don't need a dict object.

mcskinner Over a year ago

Could you provide the output you are expecting for your example data?

kamilyrb Over a year ago

I've added the data format I want as output.

mcskinner Over a year ago

Okay, I have implemented this without any loops for you. Or at least, without any procedural for loops. All of the list comprehensions are still technically loops.

Mad Physicist · Accepted Answer · 2020-04-18 03:59:52Z

The unvisited insertion indices won't change if you iterate backwards. This is true for all such problems. It sometimes even lets you modify sequences during iteration if you're careful (not that I'd ever recommend it).
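
A tiny illustration of that point, using a made-up string and made-up positions: inserting front to back shifts the later positions, while inserting back to front leaves the remaining positions valid.

```python
text = "abcdef"
positions = [1, 3]  # hypothetical insertion points, relative to the original text

# Front to back: the first insertion shifts everything after it,
# so the second marker lands one character too early.
forward = text
for pos in positions:
    forward = forward[:pos] + "|" + forward[pos:]

# Back to front: positions not yet visited all lie before the edit,
# so they still refer to the same characters.
backward = text
for pos in sorted(positions, reverse=True):
    backward = backward[:pos] + "|" + backward[pos:]

print(forward)   # a|b|cdef
print(backward)  # a|bc|def
```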

You can find all insertion points from the dict, sort them backwards, and then do the insertion. For example:

items = ['<span ...>', '</span>']
keys = ['startOffset', 'endOffset']
insertion_points = [(d[key], item) for d in indexes_list for key, item in zip(keys, items)]
insertion_points.sort(reverse=True)

for index, content in insertion_points:
    main_str = main_str[:index] + content + main_str[index:]

The reason not to do that is that it's inefficient. For reasonable sized text that's not a huge problem, but keep in mind that you are chopping up and reallocating an ever increasing string with each step.

A much more efficient approach would be to chop up the entire string once at all the insertion points. Adding list elements at the right places with the right content would be much cheaper that way, and you would only have to rejoin the whole thing once:

items = ['<span ...>', '</span>']
keys = ['startOffset', 'endOffset']
insertion_points = [(d[key], item) for d in indexes_list for key, item in zip(keys, items)]
insertion_points.sort()

last = 0
chopped_str = []
for index, content in insertion_points:
    chopped_str.append(main_str[last:index])
    chopped_str.append(content)
    last = index
chopped_str.append(main_str[last:])
main_str = ''.join(chopped_str)
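
For reuse, the chop-and-join approach could be wrapped in a function, sketched below. The name insert_at, the shortened sample text, and the offsets 15 and 21 are assumptions for illustration; those offsets were chosen so that the span covers the word 'simply', as in the output the question asks for.

```python
def insert_at(text, insertions):
    """Insert (index, content) pairs into text; indices refer to the original text."""
    pieces = []
    last = 0
    for index, content in sorted(insertions):
        pieces.append(text[last:index])  # original text up to this insertion point
        pieces.append(content)           # the inserted content itself
        last = index
    pieces.append(text[last:])           # whatever remains after the last point
    return "".join(pieces)


sample = "Lorem Ipsum is simply dummy text"
spans = [
    {"startOffset": 0, "endOffset": 5},
    {"startOffset": 15, "endOffset": 21},  # assumed offsets covering 'simply'
]
keys = ("startOffset", "endOffset")
tags = ("<span>", "</span>")
points = [(d[key], tag) for d in spans for key, tag in zip(keys, tags)]

result = insert_at(sample, points)
print(result)
# <span>Lorem</span> Ipsum is <span>simply</span> dummy text
```

Each piece of the original text is sliced once and the result is joined once, so the work grows with the length of the text plus the number of insertions rather than with their product.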
