Is it possible to take a very long string and split it by sentence into array items of 5000 characters or fewer?

Here's what I have so far:

<?php
$text = 'VERY LONG STRING';

foreach(explode('. ', $text) as $chunk) {
    $accepted[] = $chunk;
}
?>

This just splits the string into an array of single sentences. I need to group the sentences into sub-arrays, each holding sentences whose combined length is no more than 5000 characters.

I tried this:

<?php
$text = 'VERY LONG STRING';

foreach(explode('. ', $text) as $chunk) {
    $key = strlen(implode('. ', $accepted).'. '.$chunk) / 5000;
    $accepted[$key][] = $chunk;
}
?>

You can probably see what I tried to do here, but it didn't work.

UPDATE: This did the trick:

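<?php
$text = 'VERY LONG STRING';
$result = [];
$partial = [];
$len = 0;

foreach(explode('. ', $text) as $chunk) {
  // Count the sentence plus the '. ' separator that explode() removed.
  $chunkLen = strlen($chunk) + 2;
  // Close the current group once adding this sentence would exceed the limit.
  if ($partial && $len + $chunkLen > 5000) {
    $result[] = $partial;
    $partial = [];
    $len = 0;
  }
  $len += $chunkLen;
  $partial[] = $chunk;
}

// Keep whatever is left in the last, partially filled group.
if($partial) $result[] = $partial;
?>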
  • Not sure if stackoverflow.com/questions/29818232/… is the same. Commented May 3, 2020 at 16:05
  • You could simply use php.net/manual/de/function.chunk-split.php Commented May 3, 2020 at 16:10
  • Is the . in your explode meant to be regex special char or just a variable? Something like preg_split('/(.{1,5000})/', $text) could be close. It might also be inefficient, though, depending on how large the string is. Commented May 3, 2020 at 16:12
  • Hi @Nigel Ren, I was distracted while posting :D - but I assume everyone here should be able to switch languages! ;-) Commented May 3, 2020 at 16:16
  • This question is Unclear because there is no minimal reproducible example. Commented Sep 15, 2023 at 21:14

2 Answers

You could do something like this:

$text = 'VERY LONG STRING';
$result = [];
$partial = [];
$len = 0;

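// Note: this splits on spaces (words); split on '. ' to group by sentence instead.
foreach(explode(' ', $text) as $chunk) {
  // Count the chunk plus the one-character separator that explode() removed.
  $chunkLen = strlen($chunk) + 1;
  // Start a new group once adding this chunk would exceed the limit.
  if ($partial && $len + $chunkLen > 5000) {
    $result[] = $partial;
    $partial = [];
    $len = 0;
  }
  $len += $chunkLen;
  $partial[] = $chunk;
}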

if ($partial) {
    $result[] = $partial;
}

You can test it more easily if you run it with a lower maximum length.
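To make that kind of testing easier, the same loop can be wrapped in a function that takes the limit as a parameter. This is only a sketch under a few assumptions: the name groupSentences and its $maxLen parameter are hypothetical (they are not part of the answer above), sentences are assumed to be separated by '. ', and lengths are counted in bytes with strlen().

```php
<?php
// Hypothetical helper (not from the original answer): groups sentences so
// that each group, re-joined with '. ', stays within $maxLen bytes.
function groupSentences(string $text, int $maxLen = 5000): array
{
    $result = [];
    $partial = [];
    $len = 0;

    foreach (explode('. ', $text) as $sentence) {
        $sentenceLen = strlen($sentence);
        // Joining adds a 2-byte '. ' before every sentence except the first in a group.
        $added = $partial ? $sentenceLen + 2 : $sentenceLen;

        // Close the current group when this sentence would push it over the limit.
        if ($partial && $len + $added > $maxLen) {
            $result[] = $partial;
            $partial = [];
            $len = 0;
            $added = $sentenceLen;
        }

        $len += $added;
        $partial[] = $sentence;
    }

    // Keep the last, partially filled group.
    if ($partial) {
        $result[] = $partial;
    }

    return $result;
}

// A low limit (20 instead of 5000) makes the grouping easy to inspect.
$groups = groupSentences('One two. Three four. Five six. Seven', 20);
print_r($groups);
```

With a limit of 20, the first two sentences end up in one group and the last two in another. Note that a single sentence longer than the limit still gets a group of its own in this sketch, so that group would exceed the limit; such sentences would need separate handling.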

1 Comment

This was a huge help. After some changes, I got it to do exactly what was needed. I've updated this question with the correct code. Thank you.

If I understand your question correctly, you need something like this:

<?php
$text = 'VERY LONG STRING';
$s = chunk_split($text, 3, '|'); // put 5000 instead of 3
$s = substr($s, 0, -1);
$accepted = explode('|', $s);
print_r($accepted);
?>

OR

<?php
$text = 'VERY LONG STRING';
$accepted = str_split($text, 3);
print_r($accepted);
?>

DEMO: https://3v4l.org/H9DAl

DEMO: https://3v4l.org/PN7Aj

2 Comments

Does this split by sentences or just every x characters?
@NigelRen Every x characters, sir.
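As a small illustration of the point made in these comments, here is what str_split() does with a short two-sentence string. The sample text and the chunk size of 8 are made up for the example.

```php
<?php
// Made-up sample text; 8 stands in for the 5000 used in the question.
$text = 'First one. Second one.';

// str_split() cuts every 8 bytes and ignores sentence boundaries.
$pieces = str_split($text, 8);
print_r($pieces);
```

The pieces come out as 'First on', 'e. Secon' and 'd one.', so the cuts land mid-word rather than at sentence ends.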
