0

I have dates in the form given below:

   "1###-##-##", where # denotes an unknown digit. For example:

   "1###-##-##" denotes (1000-00-00 to 1999-12-31)
   "-138 - ## - ##" denotes (0138-01-01 BC, 0138-12-31 BC) 
   "18##-##-##" denotes (1800-01-01, 1899-12-31)
   "1713-##-##" denotes (1713-01-01, 1713-12-31)
   "####-##-##" denotes (0001-01-01, 9999-12-31)

I tried to do this conversion with hard-coded switch cases, which did not turn out to be efficient. Is there some other way to achieve this in Python?

Here, years below zero are treated as BC.

EDIT: My desired output: given a pattern like "1###-##-##", find the minimum and maximum dates of the range it covers.

  • What is your desired output? It's not exactly clear. A tuple with minimum and maximum dates? Commented Sep 25, 2013 at 22:58
  • @OfirIsrael My desired output: given a pattern like "1***-**-**", find the minimum and maximum dates of the range. Commented Sep 25, 2013 at 23:15

2 Answers

4

Given

dateranges = [
    "1***-**-**", 
    "-138 - ## - ##",
    "18##-##-##",
    "1713-##-##",
    "####-##-##"
]

your best parser will probably be re, assuming that you don't want to do this properly:

import re
# Raw string so that \d and \s reach the regex engine unchanged
matcher = re.compile(r"(-?[\d*#]+)\s*-\s*([\d*#][\d*#])\s*-\s*([\d*#][\d*#])")

datetuples = [matcher.match(daterange).groups() for daterange in dateranges]

And then you can just go through the tuples,

for year, month, day in datetuples:

convert each unknown to a digit and cap.

    minyear  = int(year.replace("*", "0").replace("#", "0"))
    minmonth = max(1, int(month.replace("*", "0").replace("#", "0")))
    minday   = max(1, int(day.replace("*", "0").replace("#", "0")))

    mindate = (minyear, minmonth, minday)

    maxyear  = int(year.replace("*", "9").replace("#", "9"))
    maxmonth = min(12, int(month.replace("*", "9").replace("#", "9")))
    ### WARNING! MAXIMUM DAY NUMBER DEPENDS ON BOTH MONTH AND YEAR
    maxday  = min(31, int(day.replace("*", "9").replace("#", "9")))

    maxdate = (maxyear, maxmonth, maxday)

    print(mindate, maxdate)

#>>> (1000, 1, 1) (1999, 12, 31)
#>>> (-138, 1, 1) (-138, 12, 31)
#>>> (1800, 1, 1) (1899, 12, 31)
#>>> (1713, 1, 1) (1713, 12, 31)
#>>> (0, 1, 1) (9999, 12, 31)

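Bear in mind that this accepts false positives, because out-of-range values are clamped rather than rejected. Also heed the warning in the code: the real maximum day depends on both the month and the year, so capping at 31 is only an approximation.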
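One way to address the warning about the maximum day is sketched below. It is only an illustration: the helper names last_day_of_month and cap_max_day are hypothetical (they are not part of the answer above), and it assumes the proleptic Gregorian leap-year rule applies to every year, including negative ones.

```python
import calendar

# Days per month in a non-leap year; index 0 is unused padding.
DAYS_IN_MONTH = [0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

def last_day_of_month(year, month):
    """Last valid day of `month` in `year` (assumes proleptic Gregorian rules)."""
    if month == 2 and calendar.isleap(year):
        return 29
    return DAYS_IN_MONTH[month]

def cap_max_day(maxyear, maxmonth, day_pattern):
    """Upper bound of the day field, capped by the real length of the month."""
    highest = int(day_pattern.replace("*", "9").replace("#", "9"))
    return min(last_day_of_month(maxyear, maxmonth), highest)

print(cap_max_day(1899, 2, "##"))  # 28
print(cap_max_day(1896, 2, "##"))  # 29
print(cap_max_day(1713, 4, "3#"))  # 30
```

In the loop above, maxday could then be computed as cap_max_day(maxyear, maxmonth, day) instead of capping at 31. Using the maximum year and month is enough here, since the latest date in the range always falls in that month.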


0

If I understand correctly you want * to denote a range from 0 to the maximum value, and # to denote a range from 1 to the maximum value. Further, a date that starts with a dash is in BC.

Is this any better than your original approach?

def getdaterange(inputdate):
    inputdate = inputdate.replace(' ', '')

    maxdate = '9999-12-31'
    zerodate = '0000-00-00'
    onedate = '0001-01-01'

    lowdate = list(inputdate)
    highdate = list(inputdate)

    for i, ch in enumerate(inputdate):
        if ch == '*' or ch == '#':
            highdate[i] = maxdate[i]

        if ch == '*':
            lowdate[i] = zerodate[i]
        elif ch == '#':
            lowdate[i] = onedate[i]

    if inputdate[0] == '-':
        lowdate[0], highdate[0] = '0', '0'
        lowdate.append(' BC')
        highdate.append(' BC')

    return (''.join(lowdate), ''.join(highdate))

testins = ('1***-**-**', '-138 - ## - ##', '18##-##-##',
            '1713-##-##', '####-##-##')
for i in testins:
    print(getdaterange(i))

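The outputs mostly look correct. Note that '18##-##-##' gives a lower bound of 1801-01-01 rather than the 1800-01-01 the question asks for, because each # in the year is filled positionally from '0001-01-01'. Also, '1***-**-**' gives 1000-00-00, which is not a valid calendar date.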

('1000-00-00', '1999-12-31')
('0138-01-01 BC', '0138-12-31 BC')
('1801-01-01', '1899-12-31')
('1713-01-01', '1713-12-31')
('0001-01-01', '9999-12-31')

Of course it doesn't validate input to the function. It may be worth adding some regex to the beginning to verify the input is in the right form.
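A minimal sketch of such a check follows. The name is_valid_pattern and the exact field widths are assumptions made for illustration: an optional leading '-', one to four year characters, then two-character month and day fields, with optional spaces around the separators. It only checks the shape of the input, not whether the digits form a real date.

```python
import re

# Hypothetical shape check; the field widths are an assumption, not a given.
PATTERN = re.compile(r"-?[\d*#]{1,4}\s*-\s*[\d*#]{2}\s*-\s*[\d*#]{2}")

def is_valid_pattern(text):
    """Return True if `text` has the shape of a (possibly uncertain) date."""
    return PATTERN.fullmatch(text.strip()) is not None

for candidate in ("1***-**-**", "-138 - ## - ##", "18##-##", "abcd-ef-gh"):
    print(candidate, is_valid_pattern(candidate))
```

getdaterange could call this first and raise ValueError on a mismatch, so that malformed strings fail early instead of producing a misleading range.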
