I'm a beginner with regular expressions and I'm trying to search JSON-formatted text, but I cannot get it to work right:

SELECT DISTINCT tag, body FROM pages 
WHERE (body REGEXP BINARY '"listeListeOuiNon":".*1.*"')

The results include text with

"listeListeOuiNon":"1" and

"listeListeOuiNon":"1,2" and

"listeListeOuiNon":"0,1" as expected,

but also "listeListeOuiNon":"2" (not expected)

Any idea? Maybe it's because it's greedy, but I'm not sure...

Thanks in advance!

  • Just a word of caution, MySQL REGEX is a very expensive operation. It's often faster to break what you can into several LIKE statements if you can swing it. Also, is this a common query or will the matching text change often? Commented Sep 23, 2011 at 12:31
  • Yes, I know LIKE is faster, but the text to match changes, and trying it with LIKE '"listeListeOuiNon":"%1%"' gives no better results. Commented Sep 23, 2011 at 12:41
  • Can you give us an example row where "listeListeOuiNon":"2" was matched? Commented Sep 23, 2011 at 12:59
  • Yes, for example: SELECT '"listeListeOuiNon":"2", "listeToto":"1"' REGEXP BINARY '"listeListeOuiNon":".*1.*"' Commented Sep 23, 2011 at 13:07

3 Answers

2

Well, it's quite easy to debug:

SELECT '"listeListeOuiNon":"2"' REGEXP BINARY '"listeListeOuiNon":".*1.*"'

returns 0

SELECT '"listeListeOuiNon":"1"' REGEXP BINARY '"listeListeOuiNon":".*1.*"'

returns 1

SELECT '"listeListeOuiNon":"1,2"' REGEXP BINARY '"listeListeOuiNon":".*1.*"'

returns 1

So something is not right on your side, because the query simply cannot return rows where body equals "listeListeOuiNon":"2". It is possible, though, that body contains several of these key/value pairs, something like:

body => '"listeListeOuiNon":"1,2", "listeListeOuiNon":"2"'

So you have to modify your regexp:

'^"listeListeOuiNon":".*1.*"$'

Alternatively, modify your query to exclude the unwanted value:

SELECT DISTINCT tag, body FROM pages WHERE (body REGEXP BINARY '"listeListeOuiNon":".*1.*"') AND NOT (body REGEXP BINARY '"listeListeOuiNon":"2"')
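
The combined condition can be checked outside the database. Below is a minimal sketch in Python, assuming that `re.search` behaves like MySQL's unanchored REGEXP for these simple patterns; the helper name `keep_row` and the sample rows are made up for the illustration.

```python
import re

# Patterns copied from the query above.
WANTED = r'"listeListeOuiNon":".*1.*"'
UNWANTED = r'"listeListeOuiNon":"2"'


def keep_row(body: str) -> bool:
    """Mimic: body REGEXP WANTED AND NOT (body REGEXP UNWANTED)."""
    return bool(re.search(WANTED, body)) and not re.search(UNWANTED, body)


# Hypothetical sample bodies, modeled on the strings in the question.
rows = [
    '"listeListeOuiNon":"1"',
    '"listeListeOuiNon":"1,2"',
    '"listeListeOuiNon":"2", "listeToto":"1"',
]

kept = [body for body in rows if keep_row(body)]
for body in kept:
    print(body)
```

Note that this only excludes the exact value "2": a body such as '"listeListeOuiNon":"3", "listeToto":"1"' would still pass, because the first pattern finds the 1 in the later field.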

1 Comment

Thanks Jauzsika, but in fact I have more complicated data, like: {"bf_titre":"Veille partage","listeListeLogiciel":"BAZ","listeListeEtatDuBug":"CONF","listeListeBugs":"USER","listeListeOuiNon":"2","bf_sauveur":"Florian","id_typeannonce":"31","createur":"Anonyme","categorie_fiche":"Maintenance du site","date_creation_fiche":"2011-08-30 14:20:09","date_debut_validite_fiche":"2011-08-30","date_fin_validite_fiche":"0000-00-00","statut_fiche":"1","id_fiche":"VeillePartagee"} <-- and for this it doesn't work.
1

I would try replacing the two .* with [^"]*. That will only be sufficient, however, if your listeListeOuiNon value cannot contain literal "s; otherwise you would also have to handle the escape sequence. Basically, with the . you match any JSON string that has a 1 anywhere after "listeListeOuiNon":", even if it is in another field, because the greedy .* also runs across quote characters.
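
A quick way to see the difference is to run both patterns against the sample strings from the question. This is a sketch in Python rather than MySQL, on the assumption that `re.search` treats these particular patterns the same way as REGEXP; the names `DOT_STAR`, `NO_QUOTES`, and `matches` are made up for the illustration.

```python
import re

# Original pattern: "." also matches the quote character.
DOT_STAR = r'"listeListeOuiNon":".*1.*"'
# Suggested pattern: [^"]* cannot leave the current string value.
NO_QUOTES = r'"listeListeOuiNon":"[^"]*1[^"]*"'

samples = [
    '"listeListeOuiNon":"1"',
    '"listeListeOuiNon":"1,2"',
    '"listeListeOuiNon":"0,1"',
    '"listeListeOuiNon":"2"',
    '"listeListeOuiNon":"2", "listeToto":"1"',
]


def matches(pattern: str, text: str) -> bool:
    """Return True if the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None


# One (dot_star_result, no_quotes_result) pair per sample.
results = [(matches(DOT_STAR, s), matches(NO_QUOTES, s)) for s in samples]
for sample, (dot_star, no_quotes) in zip(samples, results):
    print(dot_star, no_quotes, sample)
```

Only the last sample separates the two patterns: the .* version reaches the 1 in "listeToto", while the [^"]* version stops at the closing quote of the listeListeOuiNon value.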

1 Comment

Wow! It seems to work! SELECT '"listeListeOuiNon":"2", "listeToto":"1"' REGEXP BINARY '"listeListeOuiNon":"[^"]*1[^"]*"' returns 0! Thank you very much!
1

Returns 0.

1 Comment

My problem is that I have more in my JSON text: SELECT '"listeListeOuiNon":"2", "listeToto":"1"' REGEXP BINARY '"listeListeOuiNon":".*1.*"'
