0

I am trying to write a big script but I am stuck on one part. I want to split an array on "..".

From the script I got this:

print @coordinates;
gene            complement(872..1288)

my desired output:

complement   872  1288

I tried:

   1) my @answer = split(.., @coordinates)
    print("@answer\n");   

   2) my @answer = split /../, @coordinates;

   3) print +(split /\../)[-1],[-2],[-3] while <@coordinates>

   4) foreach my $anwser ( @coordinates )
    {$anwser =~ s/../"\t"/;
    print  $anwser;}

    5) my @answer = split(/../,          "complement(872..1288)"); #to see if the printed array is problematic. 
    which prints:
      )          )          )          )          )          )          )          )          )          
    
    6) my @answer = split /"gene            "/, @coordinates; # I tried to "catch" the entire output's spaces and tabs
    which prints
    0000000000000000000000000000000001000000000100000000

But none of them work. Does anyone have any idea how to get past this issue?
P.S. Unfortunately, I can't run my script on Linux right now, so I used this website to run it. I hope this is not the reason why I didn't get my desired output.

2
  • 1
    I think you need to escape the dots like so: /\.\./. Otherwise, a dot matches "any character". Commented Mar 23, 2020 at 11:44
  • Thanks for the reply. I tried my @answer = split(\.\., @coordinates); and it prints this: 0000000000000000000000000000000001000000000100000000 Commented Mar 23, 2020 at 12:00

3 Answers

2
my $RE_COMPLEMENT = qr{(complement)\((\d+)\.\.(\d+)\)}msx;
for my $item (@coordinates) {
    my ($head, $i, $j) = $item =~ $RE_COMPLEMENT;
    if (defined($head) && defined($i) && defined($j)) {
        print("$head\t$i\t$j\n");
    }
}

3 Comments

Thanks a lot for the reply. I got exactly what I want.
Use qr when compiling a regex outside of a match; after the binding operator, it's more standard to use m.
You use /msx there, but I don't think you need any of them. There is no ^ or $, so /m has no effect. There is no '.' metacharacter, so /s has no effect. And since there's no whitespace in the pattern, /x has no effect. Not a huge deal, but it's often better to leave extraneous bits out of examples for people who are already struggling. Otherwise, they tend to think those bits are important to the solution.
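
Following the two comments above, here is a minimal sketch of the same match written with m and without the /msx flags. The sub name parse_complement and the contents of @coordinates are assumptions for illustration; they are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: in list context, returns (label, start, end),
# or an empty list when the pattern does not match.
sub parse_complement {
    my ($item) = @_;
    return $item =~ m{(complement)\((\d+)\.\.(\d+)\)};
}

# Assumed sample data, modelled on the line shown in the question.
my @coordinates = ("gene            complement(872..1288)\n");

for my $item (@coordinates) {
    my @fields = parse_complement($item);
    print join("\t", @fields), "\n" if @fields;
}
```

Because a failed match returns an empty list in list context, a single check on @fields can stand in for the three defined() tests.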
1

split operates on a scalar, not on an array.

my $string = 'gene    complement(872..1288)';
my @parts  = split /\.\./, $string;
print $parts[0];  # gene    complement(872
print $parts[1];  # 1288)
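
As a sketch of why passing the array itself gives odd output: split evaluates its second argument in scalar context, so an array there is replaced by its element count. Looping over the elements and splitting each one avoids that. The contents of @coordinates below are assumed for illustration.

```perl
use strict;
use warnings;

# Assumed sample data; the real @coordinates comes from the asker's script.
my @coordinates = (
    'gene            complement(872..1288)',
    'gene            complement(1500..2000)',
);

# An array as the second argument is evaluated in scalar context,
# so split sees the element count, not the strings themselves.
my @wrong = split /\.\./, @coordinates;
print "wrong: @wrong\n";

# Splitting each element separately gives the expected pieces.
my @right;
for my $item (@coordinates) {
    push @right, [ split /\.\./, $item ];
}
print "right: $_->[0] | $_->[1]\n" for @right;
```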

To get the desired output, you can use a substitution:

my $string = 'gene    complement(872..1288)';
$string =~ s/gene +|\)//g;
$string =~ s/\.\./ /;
$string =~ s/\(/ /;
print $string;  # complement 872 1288

3 Comments

First of all, thank you for your reply. I typed my @parts = split /\.\./, @coordinates; because my data is in this array, and I got 0000000000000000000000000000000001000000000100000000
The second argument of split can't be an array.
I see. Thanks a lot once again.
0

The desired effect can be achieved by:

  • using the tr operator to replace each of '(', '.', and ')' with a space,
  • splitting the data string into elements on whitespace,
  • keeping only the required part of the resulting array,
  • printing the elements joined with tabs.

use strict;
use warnings;
use feature 'say';

my $data = <DATA>;

chomp $data;
$data =~ tr/(.)/ /;

my @elements = (split ' ', $data)[1..3];

say join "\t", @elements;

__DATA__
gene            complement(872..1288)

Alternatively, using only substitutions (without splitting the data string into an array):

use strict;
use warnings;
use feature 'say';

my $data = <DATA>;

chomp $data;

$data =~ s/gene\s+//;
$data =~ s/\)//;
$data =~ s/[(.]+/\t/g;

say $data;

__DATA__
gene            complement(872..1288)

Output

complement      872     1288

6 Comments

First of all, thank you once again for your help and for the explanations. I tried both ways and they work perfectly. To be honest, the first seems better, because at the end you obtain 3 values in @elements, which will be easier to manipulate when writing a big script. Just one last thing: with your first solution I got the following warning: Use of uninitialized value $elements[2] in join or string at txt.perl line 33, <DATA> line 33. Is it because I am using an online Perl executor?
@KGee -- you need to look at <DATA> line 33: what is there? I guess it could be just an empty line. The code cannot split an empty line into elements, and with no elements there is nothing to join.
@Polar Bear Finally I found it: elements = (split ' ', $data)[1..3]; should be elements = (split ' ', $data)[0..2]; Thanks again for your time!
@KGee -- this would be odd. The elements after split should be [0]gene, [1]complement, [2]872, [3]1288. If you need elements [0..2], then your data before the split does not contain 'gene'. I am puzzled how that could happen. Did you mix code from the 1st and 2nd examples? Otherwise I do not see how it could happen.
@KGee -- Online: test #1, test #2.
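
If the input contains blank or otherwise incomplete lines, as suspected in the comments above, the slice [1..3] yields undefined elements and join warns. Here is a minimal sketch that guards against this, assuming the input arrives as a list of lines. The sub name extract_fields and the sample lines are hypothetical.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: returns the three fields, or an empty list
# when the line has fewer than four whitespace-separated parts.
sub extract_fields {
    my ($line) = @_;
    chomp $line;
    $line =~ tr/(.)/ /;              # '(', '.' and ')' all become spaces
    my @parts = split ' ', $line;
    return @parts >= 4 ? @parts[1..3] : ();
}

# Assumed sample input, including an empty line.
my @lines = ("gene            complement(872..1288)\n", "\n");

for my $line (@lines) {
    my @elements = extract_fields($line);
    next unless @elements;           # skip lines with nothing to join
    say join "\t", @elements;
}
```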