(?:\d{1,2}[\-\/])?(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec|January|February|March|April|May|June|July|August|September|October|November|December)?[\,\.\s]*(?:\d{1,2}[\-\/\.)\s,]*)+(?:\d{2,4})(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec|January|February|March|April|May|June|July|August|September|October|November|December)?[\,\.\s]*(?:\d{1,2}[\-\/\.),]*)
I am trying to extract dates from text in the following formats:
- January 1, 2020
- January 01, 2020
- JANUARY 1, 2020
- JANUARY 01, 2020
- Jan. 1, 2020
- Jan. 01, 2020
- JAN. 1, 2020
- JAN. 01, 2020
- 2020 January 1
- 2020 January 01
- 2020 Jan. 1
- 2020 Jan. 01
- 2020 JAN. 1
- 2020 JAN. 01
- 01/01/2020
- 2020/01/01
- 01.01.2020
- 2020.01.01
- 01-01-2020
- 2020-01-01
Here's a sample. The problem is that it fails to extract dates written in these formats: 2020 JAN. 1, 2020 JAN. 01, 2020 Jan. 01, and 2020-01-01.
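
One possible reason for the failures, judging only from the pattern itself: after the year digits it expects the month name immediately (no space allowed in between), and the separator class before the final day accepts only `,`, `.` or whitespace, not `-`. A different approach that may be easier to reason about is to write one alternative per layout instead of one long chain of optional pieces. The following is a minimal Python sketch of that idea, not a fix of the original pattern; the names `MONTH`, `DATE_RE` and `find_dates` are made up for illustration, and it assumes four-digit years and only the month spellings listed above.

```python
import re

# Month names and their three-letter abbreviations (hypothetical helper pattern).
MONTH = (
    r"(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|"
    r"Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)"
)

# One alternative per layout; IGNORECASE covers JANUARY / Jan / JAN.
DATE_RE = re.compile(
    rf"""
    \b(?:
        {MONTH}\.?\s+\d{{1,2}},?\s+\d{{4}}                  # January 1, 2020 / Jan. 01, 2020
      | \d{{4}}\s+{MONTH}\.?\s+\d{{1,2}}                    # 2020 January 1 / 2020 JAN. 01
      | \d{{4}}(?P<sep_ymd>[-/.])\d{{1,2}}(?P=sep_ymd)\d{{1,2}}   # 2020-01-01, 2020/01/01, 2020.01.01
      | \d{{1,2}}(?P<sep_dmy>[-/.])\d{{1,2}}(?P=sep_dmy)\d{{4}}   # 01-01-2020, 01/01/2020, 01.01.2020
    )\b
    """,
    re.IGNORECASE | re.VERBOSE,
)


def find_dates(text):
    """Return every substring of `text` that matches one of the layouts above."""
    return [m.group(0) for m in DATE_RE.finditer(text)]


if __name__ == "__main__":
    sample = "Due 2020 JAN. 1 , then 2020 Jan. 01 and 2020-01-01."
    print(find_dates(sample))
```

The named backreferences (`(?P=sep_ymd)`, `(?P=sep_dmy)`) require the same separator on both sides, so mixed forms such as `2020-01/01` are not matched. This sketch only finds the text; it does not check that the day and month values are valid calendar dates.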