2

I have a list like this:

mylist = ['La', 'domestication', "d'un", 'animal', 'ou', "d'un", 'végétal,', 'necessite', "l'acquisition", "d'une", 'ferme']

I want to split every element that contains an apostrophe (') into two elements, keeping the apostrophe on the first part, and put both parts at the position of the original element in the list.

Requested output:

my_new_list = ['La', 'domestication', "d'", 'un', 'animal', 'ou', "d'", 'un', 'végétal,', 'necessite', "l'", 'acquisition', "d'", 'une', 'ferme']

I've tried a few things, but I admit I'm stuck on putting the two new split elements at the correct index. Here is the code I've tried:

for word in mylist:
    if "'" in word:
        new_words = word.split("'")
        mylist[mylist.index(word)] = (new_words[0]+"'")
        mylist.insert(mylist.index((new_words[0]+"'")+1), new_words[1]) 

print(mylist)

Thank you for your time and help :)

6
  • This would be much easier if you actually made a new list for your result, instead of modifying the original list. Commented May 27, 2022 at 12:53
  • What if a word has 2 or more quotes? Commented May 27, 2022 at 12:54
  • @ScottHunter: Sure, but I need to keep the index position and replace the old word with the two new pieces. Commented May 27, 2022 at 12:59
  • And yet you accepted @Nick's answer, which does exactly that. Commented May 27, 2022 at 13:16
  • @ScottHunter: In fairness, any solution that makes a new list can be trivially tweaked to alter the original list (for when you're relying on aliases of the list seeing the modification), so it hardly matters which solution is used (aside from the minor memory expense of having two at once). Just change my_new_list = operation_on(mylist) to mylist[:] = operation_on(mylist) and it swaps out the contents in-place (then discards the temporary). Commented May 27, 2022 at 13:20
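The in-place swap described in the last comment can be sketched as follows. This is only an illustration: `operation_on` is the placeholder name from the comment, and its body here (a per-word `re.split` on a zero-width lookbehind, which needs Python 3.7+) is an assumption, not something the commenter specified. Any function that returns the new list would do.

```python
import re

def operation_on(words):
    # Hypothetical stand-in: split each word right after every apostrophe.
    return [part for word in words for part in re.split(r"(?<=')", word)]

mylist = ['La', "d'un", 'animal']
alias = mylist                       # second name bound to the same list object

# Slice assignment replaces the contents without creating a new list object,
# so every alias sees the change.
mylist[:] = operation_on(mylist)

print(alias)            # ['La', "d'", 'un', 'animal']
print(alias is mylist)  # True
```

With a plain `mylist = operation_on(mylist)` instead, `alias` would still refer to the old, unsplit list.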

6 Answers

7

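Assuming you're happy with creating a new list, one way to achieve that is to join the existing list together with spaces and then split either on a space or at the position immediately after a ' (the lookbehind (?<=') is a zero-width match, which re.split supports from Python 3.7 onwards):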

import re

mylist = [
 'La', 'domestication', "d'un",
 'animal', 'ou', "d'un", 'végétal,',
 'necessite', "l'acquisition",
 "d'une", 'ferme'
]

my_new_list = re.split(r" |(?<=')", ' '.join(mylist))

Output:

[
 'La', 'domestication', "d'", 'un',
 'animal', 'ou', "d'", 'un', 'végétal,',
 'necessite', "l'", 'acquisition',
 "d'", 'une', 'ferme'
]

Note this assumes the words in the list don't have a space in them; if they might, you can just replace the space in the code with a character (or character sequence) which does not occur in the words, e.g. \0:

my_new_list = re.split(r"\0|(?<=')", '\0'.join(mylist))
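As a rough sketch, the separator variant can be wrapped in a small function that also checks the assumption it relies on. The names `split_via_join` and `sep` are made up for this illustration, and the guard against the separator occurring in the input is an addition, not part of the answer above. It needs Python 3.7+ because the pattern can match an empty string.

```python
import re

def split_via_join(words, sep="\0"):
    # The approach only works if the separator never occurs inside a word.
    if any(sep in word for word in words):
        raise ValueError("separator occurs in the input; pick another one")
    # Split on the separator itself, or at the empty position after a '.
    pattern = re.escape(sep) + r"|(?<=')"
    return re.split(pattern, sep.join(words))

words = ["l'acquisition", "New York", "d'une", "ferme"]
print(split_via_join(words))
# ["l'", 'acquisition', 'New York', "d'", 'une', 'ferme']
```

Note that "New York" stays in one piece, which the space-joined version would not preserve.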

9 Comments

Assumes (reasonably) that entries in the original list don't have any spaces in them.
@ScottHunter agreed, what is in the list is whole words; but it would be easy enough to replace space with a character such as #
@ScottHunter I've added a note to the answer about that possibility
@Nick: Or more safely, some character sequence that's guaranteed not to appear in the input. If the input is text, '\0' would be a good choice, or you can just make some long enough string of garbage that's statistically unlikely to occur (assuming you're not worrying about malicious input), e.g. '\0\f\v\a' or the like.
@ShadowRanger yeah, I was trying not to over-complicate...
4
mylist = ['La', 'domestication', "d'un", 'animal', 'ou', "d'un", 'végétal,', 'necessite', "l'acquisition", "d'une", 'ferme']
newlist = []

for word in mylist:
    ele = word.split("'")
    if len(ele) > 1:
        for i in range(len(ele)):
            if i == 0:
                # Only the first piece gets its apostrophe back
                newlist.append(ele[i] + "'")
            else:
                newlist.append(ele[i])
    else:
        newlist.append(ele[0])

print(mylist)
print(newlist)


4

A (slightly) simpler variation on previous answers, that makes no assumptions about the strings in your list:

newlist = []

for word in mylist:
    bits = word.split("'")
    for w in bits[:-1]:
        newlist.append(w+"'")
    newlist.append(bits[-1])
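To show what "no assumptions" buys you, here is the same loop wrapped in a function and run on words with more than one apostrophe. The function name `split_keep_apostrophe` is invented for this sketch, and skipping an empty last piece (for a word that ends in an apostrophe) is an extra tweak that the loop above does not have.

```python
def split_keep_apostrophe(words):
    newlist = []
    for word in words:
        bits = word.split("'")
        # Every piece except the last was followed by an apostrophe.
        newlist.extend(bit + "'" for bit in bits[:-1])
        # Assumed tweak: drop the empty tail left by a trailing apostrophe
        # (this also drops words that are empty strings).
        if bits[-1]:
            newlist.append(bits[-1])
    return newlist

print(split_keep_apostrophe(["qu'aujourd'hui", "d'", "ferme"]))
# ["qu'", "aujourd'", 'hui', "d'", 'ferme']
```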

1 Comment

Simple is always good
3
mylist = ['La', 'domestication', "d'un", 'animal', 'ou', "d'un", 'végétal,', 'necessite', "l'acquisition", "d'une", 'ferme']
new_list = []
for word in mylist:
    if "'" in word:
        parts = word.split("'")
        for i in range(len(parts)-1):
            new_list.append(parts[i]+"'")
        new_list.append(parts[-1])
    else:
        new_list.append(word)


2

I just modified your code.

mylist = ['La', 'domestication', "d'un", 'animal', 'ou', "d'un", 'végétal,', 'necessite', "l'acquisition", "d'une", 'ferme']

lol = ""
for word in mylist:
    if "'" in word:
        # split() drops the apostrophes themselves
        new_words = word.split("'")
        for i in new_words:
            # "0" is the separator; non-ASCII characters are stripped
            lol = lol + "0" + str(i).encode('ascii', 'ignore').decode('ascii')
    else:
        lol = lol + "0" + str(word).encode('ascii', 'ignore').decode('ascii')

# Remove the leading separator
if lol[0] == "0":
    lol = lol[1:]

lol = lol.split("0")
print(lol)

2 Comments

What if 0 appears in any of the words?
As @ShadowRanger suggested use some character sequence that's guaranteed not to appear in the input.
1

You can use enumerate() to get the index of the element you are currently on, then use slice assignment to replace that element with the split parts at the correct position.

mylist = ['La', 'domestication', "d'un", 'animal', 'ou', "d'un", 'végétal,', 'necessite', "l'acquisition", "d'une", 'ferme']

for i, w in enumerate(mylist):
    if "'" in w:
        mylist[i:i+1] = w.split("'") 

Now printing mylist gives us:

['La', 'domestication', 'd', 'un', 'animal', 'ou', 'd', 'un', 'végétal,', 'necessite', 'l', 'acquisition', 'd', 'une', 'ferme']

Edit: This is bad code. Don't do this.

8 Comments

Desired result should include the quotes.
I'm kinda amused this works. It's doing something unsafe (modifying a list as you iterate over it), but the end result is fine, it just does a little unnecessary work checking the results of a split (because the loop picks up on the new element(s) following the replaced element).
@ScottHunter: Assuming only one separator per string, str.partition could be used for the purpose, but otherwise, you're stuck with regex to do that. Boo.
@ShadowRanger I know! Everything about it screams WRONG! to me! :)
@aelmosalamy: It's not enumerate doing it, it's the list's iterator. enumerate happily keeps going as long as the underlying iterator it wraps keeps producing values. list's iterator enables this by (unlike dict's iterator) not checking for (a subset of) unsafe modifications, and rechecking the length of the underlying list as it goes, rather than caching it up front. I doubt the language guarantees this behavior of list iterators, so you definitely shouldn't rely on it (non-CPython interpreters might fail, and future CPython might change the behavior).
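For anyone who does want to modify the list in place, one way to avoid relying on the iterator behavior discussed above is to walk the indices backwards and use str.partition, which keeps the separator available. This is only a sketch: `split_in_place` is a made-up name, and it assumes at most one apostrophe per word, not at the very end of the word.

```python
def split_in_place(words):
    # Going from the last index to the first means that growing the list
    # never shifts an element we have not visited yet.
    for i in range(len(words) - 1, -1, -1):
        head, sep, tail = words[i].partition("'")
        if sep:  # sep is '' when the word has no apostrophe
            # Replace one element with two; the apostrophe stays on the first.
            words[i:i + 1] = [head + sep, tail]

mylist = ['La', "d'un", 'animal', "l'acquisition"]
split_in_place(mylist)
print(mylist)
# ['La', "d'", 'un', 'animal', "l'", 'acquisition']
```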