[Lunr] Gravity: adding canonical query

Review Request #785 — Created July 23, 2021 and submitted

d.mendes
Lunr
gravity/database/mysql/canonicalquery
cde92c3...
lunr

Gravity: adding canonical query

Thought process,
First approach, i wrote one main loop that it would check each index of the string and choose which path to take depending on character found, my problem with it was that i had to keep verifying previous and next characters with diferent conditions and child loops, so i felt that the code became a bit complex and hard to read.

Second approach, functions like strpos can search for more than a single character and this way i can clean up a lot of the conditions, but the problem is that i we'll have multiples loops instead of one main loop to the entire string, and i'm not sure if the impact is noticible or not and in case it is, if it is better to assume it or not, also for numeric values i still used the first approach (with some cases jumping a few ranges) because there was no clear pattern to search for.

Removes Single line comments "#" or "--"
Removes Block comments "/*" and "*/"
Jumps (Ignores) characters in between:
- Executable comments ("/*M","*/") or ("/*!,*/")
- Backquotes "`"
- Escaped caracters "\"

Replaces in betweeen characters:
- Double quotes
- Single quotes

Replaces finds and replaces numeric characters:
- Numberic "123"
- Negative "-123"
- Decimals "12.34"
- Exponentials "12.34E11" or "12.34E-11"
- Hexadecimals "0x43"

Removes:
- End of Line characters
- Excessive blank spaces



  • 0
  • 0
  • 36
  • 1
  • 37
Description From Last Updated
There are no open issues
smillernl
  1. 
      
  2. The issue has been resolved. Show all issues

    Did you check this with phpcs? For example the uncommented private class variables should have been disallowed by it.

    1. Yes, i used Lunr standard

  3. 
      
d.mendes
d.mendes
d.mendes
smillernl
  1. 
      
  2. The issue has been resolved. Show all issues

    This should indicate the type of variable (as you did below), not the name.

    1. fixed it, also line 20 with the code " * @var $query"

  3. The issue has been resolved. Show all issues

    Please be consistent in usage of return types

    1. Changed this, can you confirm?

    2. This is fixed, but I made some new comments for types.

  4. The issue has been resolved. Show all issues

    There should be no spaces between the name and the ()

    1. fixed this, i also fixed when the function was being called

  5. The issue has been resolved. Show all issues

    You're checking if $end_pos is false above, it'll never be equal to strlen($string ) - 1 - $start_pos and even if it is, you're not using the result of the comparison.

    1. fixed it, and re-tested it

  6. The issue has been resolved. Show all issues

    null isn't allowed in the parameter type

  7. The issue has been resolved. Show all issues

    I don't think this needs to be defined twice?

    1. no need, removed it

  8. The issue has been resolved. Show all issues

    Why not null?

    1. No reason, changed it to null and re-tested

  9. The issue has been resolved. Show all issues

    How does this find any digits? It's just checking string length right?

    1. Artefact from development process old code.
      Comment Removed.

  10. The issue has been resolved. Show all issues

    It doesn't seem to do this

  11. 
      
d.mendes
pprkut
  1. 
      
  2. src/Lunr/Gravity/Database/MySQL/MySQLCanonicalQuery.php (Diff revision 3)
     
     
     
     
     
     
     
     
     
     
    The issue has been resolved. Show all issues

    Don't think this is needed. We can already get the query from other places.

  3. src/Lunr/Gravity/Database/MySQL/MySQLCanonicalQuery.php (Diff revision 3)
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
    The issue has been resolved. Show all issues

    You can combine these two, by checking whether the query canonicalization was already done ($this->canonical_query !== NULL).

    I would also rename the method to get and make a __toString() magic method that calls it.

    That way we can get the canonical query using

    $canonical_query = (string) $result->canonical_query();
    
    1. There is a call to canonical_query() in the __contructor, i also removed it
      this way only executes if __toString() or ->canonical_query() are called.

  4. src/Lunr/Gravity/Database/MySQL/MySQLCanonicalQuery.php (Diff revision 3)
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
    The issue has been resolved. Show all issues

    I don't think we need those wrappers.

  5. src/Lunr/Gravity/Database/MySQL/MySQLQueryResult.php (Diff revision 3)
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
    The issue has been resolved. Show all issues

    I'm afraid, doing it like this isn't good.

    Dependency Injection demands that we can somehow mock the canonical_query() method of the MySQLCanonicalQuery class. But we can't do that if class instantitation and method call are in the same location. So we have to split this.

    The best scenario I could think of is if the just return the instantiated class here (still considering that we should only instantiate it on the first call). Combined with the __toString() comment I made above this should still make for a nice API.

    1. So if i followed right, just remove the line $this->canonical_query = $canonical_query->canonical_query(); (assuming __toString() is implemented) should fix this right?
      also to execute only the first time that canonical_query() from MySQLQueryResult is executed

    2. Yes, that's good :)

  6. 
      
d.mendes
pprkut
  1. 
      
  2. The issue has been resolved. Show all issues

    Some things aren't in the codestyle apparently. We should make a list :D

    There should be an empty line between the open tag and the file comment.

  3. src/Lunr/Gravity/Database/MySQL/MySQLCanonicalQuery.php (Diff revision 4)
     
     
     
     
     
     
     
     
    The issue has been resolved. Show all issues

    empty line between @param and @return

    1. changed, also did it in other locations

  4. The issue has been resolved. Show all issues

    leftover test code

  5. src/Lunr/Gravity/Database/MySQL/MySQLQueryResult.php (Diff revision 4)
     
     
     
     
     
    The issue has been resolved. Show all issues

    common practice for us is that we initialize the property in the constructor. So a simple

    $this->canonical_query = NULL;
    

    should be fine here.

  6. The issue has been resolved. Show all issues

    I prefer a !== NULL check

  7. The issue has been resolved. Show all issues

    Put the expected value in the data provider, even if it's always the same. Makes it easier to extend in the future in case we do find a situation where it isn't.

    1. yes, i also added ReplaceBetweenDataProvider for the same reason

  8. The issue has been resolved. Show all issues

    This shouldn't be needed. $this->class is supposed to be the class we're testing.

  9. src/Lunr/Gravity/Database/MySQL/Tests/MySQLCanonicalQueryTest.php (Diff revision 4)
     
     
     
     
     
     
     
     
    The issue has been resolved. Show all issues

    you can name the test scenarios by setting a string key, like

    $data_provider['first argument not string'] = [['013456789013456789','23','56'], []];
    
  10. src/Lunr/Gravity/Database/MySQL/Tests/MySQLCanonicalQueryTest.php (Diff revision 4)
     
     
     
     
     
     
     
     
     
     
     
     
     
    The issue has been resolved. Show all issues

    Please change this to read input and expected output from test files, and add more possible scenarios.
    There's two possible structure we typically use (in tests/statics/<namespace>/):

    • foo_input.sql and foo_output.sql
    • input/foo.sql and output/foo.sql

    Whichever you prefer.

    Additional cases here that'd be worth considering are INSERT, UPDATE, REPLACE queries. SHOW would also be nice. Perhaps upserts, CTEs, etc.

    1. Added test by file

  11. src/Lunr/Gravity/Database/MySQL/Tests/MySQLQueryResultBaseTest.php (Diff revision 4)
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
    The issue has been resolved. Show all issues

    one test case per test :)

  12. 
      
smillernl
  1. 
      
  2. The issue has been resolved. Show all issues

    Try to avoid empty() since it'll also match things like 0, which you probably don't want here. is_null is a better check I think

    1. yes, also need to check if $to === ''

  3. The issue has been resolved. Show all issues

    missing param type

  4. The issue has been resolved. Show all issues

    missing param types

  5. The issue has been resolved. Show all issues

    this is equal to parent::tearDown() after you implement heinz' comments

  6. 
      
d.mendes
pprkut
  1. 
      
  2. The issue has been resolved. Show all issues

    Perhaps add an example for a query using maxscale query hints: https://mariadb.com/kb/en/mariadb-maxscale-14/maxscale-hint-syntax/

    1. I added a new file with a simple example,
      The code right now removes comments "-- ", but this seems to be something that may have some interest to distinguish
      So i'm not sure if i should change the code to keep this hints in the canonical query, or add it as an option

    2. No that's fine. For all intents and purposes it's a comment, since it doesn't control the query, it controls maxscale.

  3. tests/statics/Gravity/Database/MySQL/input_backup.sql (Diff revision 5)
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
    The issue has been resolved. Show all issues

    I'm not sure about this. On the one hand, it's nice to test that the code would support his.

    On the other hand, we'd never get to a query like this. We don't support either running multiple queries at once nor do we support fetching results for multiple queries issued at once.

    1. i think this probably should be removed, because the code will assume that it is 1 query, and it would probably need some new development and tests to fully support it

    2. Then let's remove it :)

  4. The issue has been resolved. Show all issues

    This is where things get tricky I suppose.

    Are these two queries canonically different?

    INSERT INTO `database`.`table` (`param1`, `param2`, `param3`) VALUES (?, '?', '?'), (?, '?', '?'), (?, '?', '?'), (?, '?', '?'), (?, '?', '?');
    

    INSERT INTO `database`.`table` (`param1`, `param2`, `param3`) VALUES (?, '?', '?'), (?, '?', '?'), (?, '?', '?'), (?, '?', '?');
    
    1. Yes they are different, it will be a different canonical for the same insert, case the number of rows to insert change

  5. 
      
d.mendes
d.mendes
pprkut
  1. 
      
  2. src/Lunr/Gravity/Database/MySQL/MySQLCanonicalQuery.php (Diff revisions 6 - 7)
     
     
     
     
    The issue has been resolved. Show all issues

    You need to handle the case where $offset is an empty array (if I understood the code in find_positions correctly).

    This can be the case in INSERT/REPLACE statements that don't use VALUES, like

    INSERT INTO table SELECT FROM table2

  3. 
      
d.mendes
pprkut
  1. 
      
  2. The issue has been resolved. Show all issues

    I think it would be a better test if we had an actual value here in the WHERE clause :-)

    1. Please check the other input test files as well

    2. i will add new tests files to the test function of canonical_query,
      because this test is for the collapse function, and the rows need to be in canonical formart already

    3. gotcha

  3. 
      
d.mendes
d.mendes
  1. 
      
  2. The issue has been dropped. Show all issues

    this is the same row, but will return failed

    1. I'm not sure what you mean

    2. i was wandering about the case sensitive, for example null versus NULL, if i should treat it as different rows or the same

    3. the same, in this case

  3. 
      
pprkut
  1. Ship It!
  2. 
      
pprkut
  1. 
      
  2. The issue has been resolved. Show all issues

    Branch not pushed

  3. 
      
pprkut
  1. 
      
  2. The issue has been resolved. Show all issues

    Unit tests are failing:

    There was 1 error:
    
    1) Lunr\Gravity\Database\MySQL\Tests\MySQLQueryResultBaseTest::testCanonicalQuery
    Undefined property: Lunr\Gravity\Database\MySQL\Tests\MySQLQueryResultBaseTest::$result_reflection
    
    /mnt/progs/projects/m2mobi/backend/lunr/src/Lunr/Gravity/Database/MySQL/Tests/MySQLQueryResultBaseTest.php:182
    
    --
    
    There was 1 warning:
    
    1) Warning
    No tests found in class "Lunr\Gravity\Database\MySQL\Tests\MySQLCanonicalQueryTest".
    
    --
    
    There were 2 failures:
    
    1) Lunr\Gravity\Filesystem\Tests\PhysicalFilesystemAccessObjectListDirectoriesTest::testGetListOfDirectoriesInAccessibleDirectory
    Failed asserting that two arrays are equal.
    --- Expected
    +++ Actual
    @@ @@
     Array (
    -    0 => 'folder1'
    -    1 => 'folder2'
    +    0 => 'Database'
    +    1 => 'folder1'
    +    2 => 'folder2'
     )
    
    /mnt/progs/projects/m2mobi/backend/lunr/src/Lunr/Gravity/Filesystem/Tests/PhysicalFilesystemAccessObjectListDirectoriesTest.php:40
    
    2) Lunr\Gravity\Filesystem\Tests\PhysicalFilesystemAccessObjectDirectoryTest::testGetListingOfAccessibleDirectory
    Failed asserting that two arrays are equal.
    --- Expected
    +++ Actual
    @@ @@
     Array (
    -    0 => 'file1'
    -    1 => 'file2'
    -    2 => 'file3'
    -    3 => 'folder1'
    -    4 => 'folder2'
    +    0 => 'Database'
    +    1 => 'file1'
    +    2 => 'file2'
    +    3 => 'file3'
    +    4 => 'folder1'
    +    5 => 'folder2'
     )
    
    /mnt/progs/projects/m2mobi/backend/lunr/src/Lunr/Gravity/Filesystem/Tests/PhysicalFilesystemAccessObjectDirectoryTest.php:41
    
  3. 
      
d.mendes
d.mendes
pprkut
  1. 
      
  2. The issue has been resolved. Show all issues

    When I rebase the latest code in the branch against master I still get:

    There was 1 error:
    
    1) Lunr\Gravity\Database\MySQL\Tests\MySQLQueryResultBaseTest::testCanonicalQuery
    Undefined property: Lunr\Gravity\Database\MySQL\Tests\MySQLQueryResultBaseTest::$result_reflection
    
  3. 
      
d.mendes
pprkut
Review request changed

Status: Closed (submitted)

Change Summary:

Merged into master

Loading...