I have a file that looks like this:

QUERY:1
DATABASE:geoquery
NL:What are the capitals of the states that border the most populated states?
SQL:something

QUERY:2
DATABASE:geoquery
NL:What are the capitals of the states bordering New York?
SQL:SELECT state.Capital FROM state JOIN borderinfo ON state.State_Name = borderinfo.State_Name          
WHERE borderinfo.Border = 'New York'

QUERY:3
DATABASE:geoquery
NL:Show the state capitals and populations.
SQL:SELECT state.Capital, state.Population FROM state

etc...

The person generating this file refuses to give it to me in a usable format such as XML or JSON, so I am parsing it a couple of times with regular expressions to get the results I want.

First pass: strip out the database names to populate a select list (this works fine):

$.get('inputQueryExamples.txt',
        function(data){
            var string = data;
            var dbDynamo ='';
            dbExp = new RegExp('(DATABASE:.*)','gm');
            dbDynamo = string.match(dbExp);
            cleanBreaks = new RegExp('(\r?\n|\r)','gm');
            stringNow = dbDynamo.toString();
            //console.log(dbDynamo);
            dbDynamo = dbDynamo.map(function(el){return el.replace('DATABASE:','');});

            var outArray = [];
            for(i=0; i < dbDynamo.length; i++){
                if ($.inArray(dbDynamo[i],outArray)== -1){
                    outArray.push(dbDynamo[i]);
                    }
                }

            dbDynamo = outArray.sort();

            var options = '';
            for(i=0; i<dbDynamo.length; i++){
                options += '<option value="' + dbDynamo[i] + '">' + dbDynamo[i] + '</option>';
            };
            $(select).append(options);


});

The problem comes when I parse a second time to get all of the strings associated with a specific database. I end up with a line break in front of every string, so when I fill a textarea with the autocomplete text it starts on the second line:

Array [ "

NL:Show company with complaint against Debt collection product.,DATABASE:consumer", "

NL:Show issues and dates of complaints about HSBC companies.,DATABASE:consumer", "

NL:Show companies, issues and dates with consumer concerns.,DATABASE:consumer", "

NL:Show issues and companies with complaints in MA state." ]

$(document).ready(function(){
        $.get('inputQueryExamples.txt',function(data){
            var queryString = data;
            var cleanString = "";
            var db = '';
            $('#database-list').change(function(){
               db = $('#database-list').val();

               // /(^DATABASE:.*\r\n)(^NL.*)/gm
               // http://regex101.com/r/mN4hS2

               regex = new RegExp('(^DATABASE:'+ db +'\r\n)(^NL.*)' ,'gm');

               cleanString = queryString.match(regex);
               //console.log(cleanString);
               //cleanBreaks = new RegExp('(\r\n|\r|\n)(^NL.*)','gm');
               //stringNow = cleanString.toString();
               //var dirtyString
               //dirtyString = stringNow.match(cleanBreaks);
               //console.log(dirtyString); 

               var nlString = cleanString.map(function(el) {return el.replace('DATABASE:' + db,'');});
               nlString = nlString.map(function(el){return el.replace('NL:',''); });
               //nlString = nlString.map(function(el){return      el.replace('\\r','').replace('\\n','');});

               console.log(nlString.pop());
               $('#query-list').autocomplete({
                source:nlString,

                }); 

             }); // end change
        }); // end get
}); // end ready

I have tried everything I can think of to get rid of the line breaks, without success. Any ideas would be appreciated. Unfortunately, having the server side give me the data in a usable format is not an option. There is a lot of extra code here to show what I have tried; I have commented out the parts that did not help. Thanks in advance.

2 Answers

It would make sense to first parse the data you are given into a JavaScript object. After that, the data is much easier to work with.

Here is a jsfiddle that demonstrates parsing your data into a JavaScript object. Below is the relevant code that does the parsing:

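var datasources = {};

// Parse one QUERY/DATABASE/NL/SQL block and file it under its database name.
var parseData = function(block) {
    var dbMatch = block.match(/DATABASE:(.*)/);
    if (!dbMatch) {
        return; // skip blank or malformed blocks
    }
    var db = dbMatch[1].trim();
    var dbdata = {id: 0, nl: "", sql: ""};

    if (!datasources[db]) {
        datasources[db] = [];
    }

    dbdata.id = block.match(/QUERY:(.*)/)[1].trim();
    dbdata.nl = block.match(/NL:(.*)/)[1].trim();
    // SQL can span several lines, so capture to the end of the block.
    dbdata.sql = block.match(/SQL:([\s\S]*)/)[1].trim();

    datasources[db].push(dbdata);
};

// Split the file into blocks at blank lines (handles both \n and \r\n endings).
var parseBlocks = function(data) {
    var result = data.split(/(?:\r?\n){2,}/);
    for (var i = 0; i < result.length; i++) {
        parseData(result[i]);
    }
};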
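
Once the blocks are parsed, both the select list and the autocomplete source come from simple lookups. Here is a minimal sketch, assuming `datasources` has the shape built above (database name mapped to an array of `{id, nl, sql}` records). The helper names `databaseNames` and `nlFor` are made up for illustration, and the literal below is hand-written sample data rather than real parser output:

```javascript
// Hand-written sample in the shape the parser produces (illustrative only).
var datasources = {
    geoquery: [
        {id: '2', nl: 'What are the capitals of the states bordering New York?', sql: ''},
        {id: '3', nl: 'Show the state capitals and populations.',
         sql: 'SELECT state.Capital, state.Population FROM state'}
    ],
    consumer: [
        // The id here is a placeholder; the real number is not known.
        {id: '0', nl: 'Show issues and companies with complaints in MA state.', sql: ''}
    ]
};

// Sorted, de-duplicated database names for the select list.
function databaseNames(sources) {
    return Object.keys(sources).sort();
}

// Natural-language strings for one database; empty array if it is unknown.
function nlFor(sources, db) {
    return (sources[db] || []).map(function (rec) {
        return rec.nl;
    });
}

console.log(databaseNames(datasources));
console.log(nlFor(datasources, 'geoquery'));
```

Inside the `change` handler from the question, the result of `nlFor(datasources, db)` could then be passed as the `source` option of the autocomplete widget. Because each `nl` value is captured without its line terminator, no leading line break needs to be cleaned up afterwards.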

Thanks for the very thoughtful and elegant approach. I stayed with the brute-force approach and fixed it by trimming each string.

replace:

nlString = nlString.map(function(el){return el.replace('NL:',''); });

with:

nlString = nlString.map(function(el){return el.replace('NL:','').trim(); });
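
The stray line break comes from the regex in the question: each match spans the `DATABASE:` line, its `\r\n`, and the `NL:` line, so removing the `DATABASE:...` text leaves the line break at the front, which `trim()` then strips. Below is a sketch of an alternative that avoids the cleanup by reading only the captured `NL:` text. The names `nlForDatabase` and `escapeRegExp` are hypothetical, and it assumes the `NL:` line directly follows the `DATABASE:` line, as in the sample file:

```javascript
// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Collect the NL: text of every block that belongs to the given database.
function nlForDatabase(data, db) {
    var pattern = new RegExp(
        '^DATABASE:' + escapeRegExp(db) + '\\r?\\nNL:(.*)$', 'gm');
    var results = [];
    var match;
    while ((match = pattern.exec(data)) !== null) {
        results.push(match[1].trim()); // group 1 holds only the NL: text
    }
    return results;
}

// Sample built from the blocks in the question, with \r\n line endings.
var sample = [
    'QUERY:2',
    'DATABASE:geoquery',
    'NL:What are the capitals of the states bordering New York?',
    'SQL:SELECT state.Capital FROM state JOIN borderinfo ON state.State_Name = borderinfo.State_Name',
    "WHERE borderinfo.Border = 'New York'",
    '',
    'QUERY:3',
    'DATABASE:geoquery',
    'NL:Show the state capitals and populations.',
    'SQL:SELECT state.Capital, state.Population FROM state'
].join('\r\n');

console.log(nlForDatabase(sample, 'geoquery'));
```

Escaping `db` matters only if a database name ever contains regex metacharacters. The `\r?\n` keeps the pattern working whether the file uses Windows or Unix line endings.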
