Conversation
core/src/main/kotlin/org/evomaster/core/solver/SMTLibZ3DbConstraintSolver.kt
Outdated
Show resolved
Hide resolved
|
hi @agusaldasoro , thx. first ask @jgaleotti for a review. once he has given it, and you address it, then I can be asked for my review :) |
| * Characters that decompose under NFD (Å→A, and other accented letters like é, ü, ñ) | ||
| * are handled by normalizing to NFD form and stripping the remaining non-ASCII combining marks. | ||
| */ | ||
| fun sanitizeSmtIdentifier(name: String): String { |
There was a problem hiding this comment.
I think this belongs to org.evomaster.core.utils.StringUtils, perhapss a more suitable method name could be transliterateIdentifier()
| val genes = mutableListOf<Gene>() | ||
| for (columnName in columns.fields) { | ||
| var gene: Gene = IntegerGene(columnName, 0) | ||
| // columnName is the sanitized (ASCII) version from Z3; resolve back to original DB column name |
There was a problem hiding this comment.
I am not sure the right term for this should be sanitized
| // Only add GetValueSMTNode for the mentioned tables | ||
| for (table in schema.tables) { | ||
| val tableNameLower = table.id.name.lowercase() | ||
| val sanitizedTableNameLower = sanitizeSmtIdentifier(tableNameLower) |
There was a problem hiding this comment.
I am not sure the right term should be "sanitized"
| val originalColumn = table.columns.firstOrNull { | ||
| SmtLibGenerator.sanitizeSmtIdentifier(it.name).equals(columnName, ignoreCase = true) | ||
| } | ||
| val actualColumnName = originalColumn?.name ?: columnName |
There was a problem hiding this comment.
actualColumnName -> columnName?
| @@ -153,36 +153,41 @@ class SMTLibZ3DbConstraintSolver() : DbConstraintSolver { | |||
| // Create the list of genes with the values | |||
| val genes = mutableListOf<Gene>() | |||
| for (columnName in columns.fields) { | |||
There was a problem hiding this comment.
columnName -> smtColumnName?
| private fun appendTableDefinitions(smt: SMTLib) { | ||
| for (table in schema.tables) { | ||
| val dataTypeName = "${StringUtils.capitalization(table.id.name)}Row" | ||
| val sanitizedTableName = convertToAscii(table.id.name) |
There was a problem hiding this comment.
sanitizedTableName -> smtTableName?
| */ | ||
| private fun appendUniqueConstraints(smt: SMTLib, table: TableDto) { | ||
| val tableName = table.id.name.lowercase() | ||
| val tableName = convertToAscii(table.id.name).lowercase() |
There was a problem hiding this comment.
tableName -> smtTableName?
| */ | ||
| private fun appendPrimaryKeyConstraints(smt: SMTLib, table: TableDto) { | ||
| val tableName = table.id.name.lowercase() | ||
| val tableName = convertToAscii(table.id.name).lowercase() |
There was a problem hiding this comment.
tableName -> smtTableName ?
| */ | ||
| private fun getColumnReference(tableName: String, columnName: String): String { | ||
| return "(${columnName.uppercase()} ${tableName.lowercase()}$rowIndex)" | ||
| return "(${convertToAscii(columnName).uppercase()} ${convertToAscii(tableName).lowercase()}$rowIndex)" |
There was a problem hiding this comment.
add a comment to explain why this needs to be converted to ASCII
| @@ -131,7 +132,7 @@ class SmtLibGenerator(private val schema: DbInfoDto, private val numberOfRows: I | |||
| * @return The corresponding SMT node. | |||
| */ | |||
| private fun parseCheckExpression(table: TableDto, condition: SqlCondition, index: Int): SMTNode { | |||
There was a problem hiding this comment.
maybe not for this PR, but something to discuss as well with @jgaleotti , why is most of the code here relaying on TableDto and not org.evomaster.core.sql.schema.Table. DTOs are only meant as simple POJOs for transfer of data between processes, and not for business logic computation
| private fun appendTimestampConstraints(smt: SMTLib) { | ||
| for (table in schema.tables) { | ||
| val tableName = table.id.name.lowercase() | ||
| val smtTableName = convertToAscii(table.id.name).lowercase() |
There was a problem hiding this comment.
there are so many calls to this convertToAscii, which makes it more difficult to read, and possibly easy to make mistake in future if code needs changing and missing a needed call. maybe create a new SmtTable class with that that you need, and pass that around the code instead of doing the conversions each time?
| fun convertToAscii(name: String): String { | ||
| val replaced = name | ||
| .replace('Ø', 'O').replace('ø', 'o') | ||
| .replace("Æ", "AE").replace("æ", "ae") |
There was a problem hiding this comment.
what about Norwegian å and Å? or Swedish characters like Ä, ä, Ö and ö ? or other special characters in all other languages?
we cannot have ad-hoc solutions like this, only specific to our test data. Need a general solution.
This pull request focuses on improving the handling of database table and column names containing non-ASCII characters (such as Norwegian letters Æ, Ø, Å) in the SMT-LIB generation and parsing logic. The main change is the introduction and consistent application of a sanitization function to ensure all identifiers are valid SMT-LIB symbols. There are also minor improvements to type mapping and debugging output.
Sanitization and identifier handling:
sanitizeSmtIdentifierfunction inSmtLibGeneratorto convert non-ASCII table and column names into valid SMT-LIB identifiers, including explicit replacements for Norwegian characters and normalization for accented letters. This function is now used throughout SMT-LIB generation and parsing. (Fba25d2eL539R539)SmtLibGeneratorandSMTConditionVisitorto use sanitized identifiers for table and column names in SMT-LIB datatypes, constants, constraints, and queries. This ensures consistent and correct SMT-LIB output and parsing for schemas with non-ASCII names. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11]SMT-LIB parsing and gene mapping:
Type mapping improvements:
INT,SERIAL,SMALLSERIAL,BIGSERIAL). (Fba25d2eL539R539)Debugging and output:
printlnto output the generated SMT-LIB string before invoking Z3, aiding in troubleshooting.