Finding Patterns In Dataset/Array

Question

I have this Dataset:

    NAME  VALUE1 VALUE2
0  Alpha     100     A1
1  Alpha     100     A1
2  Alpha     200     A2

I want to write a script that finds the patterns in a dataset. For this particular dataset, for example, the rules it should find are:

1) IF NAME = ALPHA & VALUE1 = 100, THEN VALUE2 = A1

2) IF NAME = ALPHA & VALUE1 = 200, THEN VALUE2 = A2

I know that each column and row value will have to be compared like so...

ALPHA 100
ALPHA 100
ALPHA 200

ALPHA A1 
ALPHA A1
ALPHA A2

100 A1
100 A1
200 A2

ALPHA 100 A1
ALPHA 100 A1
ALPHA 200 A2

"ALPHA 100" can't be a rule because "ALPHA 200" also exists; the same goes for "ALPHA A1", since "ALPHA A2" exists.

"100 A1" and "200 A2" are correct, but "ALPHA 100 A1" and "ALPHA 200 A2" are stronger variations, so those are the ones that should be printed out.

How could I go about this?

user184868 · Accepted Answer · 2022-06-15 17:50:04Z


Okay, this is a clustering task over the rows. But I would also like to find some kind of non-stochastic (deterministic) solution for it. First, you could start from the hypothesis that every possible relation holds within each row: if alpha and 100 then A1, if alpha and A1 then 100, and so on. The condition can use an arbitrary number of the row's fields.
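To make the starting hypothesis concrete, here is a minimal sketch of enumerating every candidate rule for a single row. The function name `candidate_rules` and the representation of a rule as a `(condition, consequence)` pair are my own assumptions for illustration, not something from the question.

```python
from itertools import combinations


def candidate_rules(row):
    """Yield every (condition, consequence) pair that one row could support.

    `condition` is a tuple of (column, value) pairs and `consequence` is a
    single (column, value) pair. This representation is a hypothetical choice.
    """
    items = list(row.items())
    for i, target in enumerate(items):
        # Every field except the target may appear in the condition.
        others = items[:i] + items[i + 1:]
        # The condition may use any non-empty subset of the other fields.
        for size in range(1, len(others) + 1):
            for condition in combinations(others, size):
                yield condition, target


row = {"NAME": "Alpha", "VALUE1": 100, "VALUE2": "A1"}
rules = list(candidate_rules(row))
for condition, consequence in rules:
    print(condition, "->", consequence)
```

Note that the number of candidates grows exponentially with the number of columns, so on a wide table you would probably want to cap the condition size.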

Then, as you read each subsequent row, you update the rules. If you find a contradicting entry such as alpha, 300 -> A1, you apply your generalization function. The result might be alpha, 100 or 300 -> A1, or alternatively alpha, interval (100 .. 300) -> A1. There is no generally known approach for this, which is what makes it interesting. If you tell me the exact task you are working on, I would be interested in solving it.
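A rough sketch of the row-by-row update, under my own assumptions: a rule is dropped as soon as the same condition is seen with two different results, only the rules with the most condition fields are reported (the "stronger variations" from the question), and the generalization function simply merges values into a set. The names `mine_rules`, `strongest` and `generalize` are hypothetical, and every row is assumed to have the same columns in the same order.

```python
from collections import defaultdict
from itertools import combinations


def mine_rules(rows, target):
    """Read rows one by one and keep the rules that no row contradicts."""
    seen = defaultdict(set)  # condition -> target values observed with it
    for row in rows:
        others = [(k, v) for k, v in row.items() if k != target]
        for size in range(1, len(others) + 1):
            for condition in combinations(others, size):
                seen[condition].add(row[target])
    # A condition seen with more than one target value has been contradicted.
    return {c: next(iter(v)) for c, v in seen.items() if len(v) == 1}


def strongest(rules):
    """Keep only the rules whose condition uses the most fields."""
    if not rules:
        return {}
    widest = max(len(c) for c in rules)
    return {c: v for c, v in rules.items() if len(c) == widest}


def generalize(rules, column):
    """Merge rules that differ only in `column` into one rule with a value set."""
    merged = defaultdict(set)
    for condition, value in rules.items():
        rest = tuple(kv for kv in condition if kv[0] != column)
        for key, val in condition:
            if key == column:
                merged[(rest, value)].add(val)
    return dict(merged)


rows = [
    {"NAME": "Alpha", "VALUE1": 100, "VALUE2": "A1"},
    {"NAME": "Alpha", "VALUE1": 100, "VALUE2": "A1"},
    {"NAME": "Alpha", "VALUE1": 200, "VALUE2": "A2"},
]
found = strongest(mine_rules(rows, "VALUE2"))

# Adding the hypothetical row alpha, 300 -> A1 and merging on VALUE1.
extended = rows + [{"NAME": "Alpha", "VALUE1": 300, "VALUE2": "A1"}]
merged = generalize(strongest(mine_rules(extended, "VALUE2")), "VALUE1")
```

On the question's three rows, `found` holds exactly the two rules asked for. With the extra row, `merged` maps alpha with VALUE1 in {100, 300} to A1. The interval variant would need one more check: here 200 lies inside (100 .. 300) but maps to A2, so an interval rule is only safe when no other rule's value falls inside it.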


3 Comments

Owen Osagiede Over a year ago

I am trying to find the relationships in any relational dataset, so the rules that are found will be specific to the dataset the script ingested. What you say makes sense: if a contradicting entry is found, the rule is updated. Maybe we can discuss more over email so I can give you a better look.

user184868 Over a year ago

Email: [email protected]. You could set up a git repository or something. Finding relationships in any dataset is too broad to begin with. Remember, we are searching for a method, so it would be better to narrow it down to one specific problem. What I have already figured out I call the n-tuple agreement method, with generalization using a commonsense ontology.

Owen Osagiede Over a year ago

I sent the email
