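After a reader pointed it out, I found that the regular expression contained duplicate rules, so I removed them and ran a test.
In practice I had already rewritten that method's logic in C, which did not take much time, so I tested both versions together. I was too lazy to type up the test unit & output report by hand ^^"
Let the numbers speak for themselves, just take a look ~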
test unit :
// ------------------------------------------------------------------------------------------------
- ( void ) testRegExObjC1
{
    NSPredicate * predicate;

    predicate = [NSPredicate predicateWithFormat: @"SELF MATCHES %@", @"([^*|:\"<>?]|[ ]|\\w)+@[1-9][0-9]*[xX]$"];
    [predicate evaluateWithObject: @"1234567890123456789012345"];
}
// ------------------------------------------------------------------------------------------------
- ( void ) testRegExObjC2
{
    NSPredicate * predicate;

    predicate = [NSPredicate predicateWithFormat: @"SELF MATCHES %@", @"([^*|:\"<>?])+@[1-9][0-9]*[xX]$"];
    [predicate evaluateWithObject: @"1234567890123456789012345"];
}
// ------------------------------------------------------------------------------------------------
- ( void ) testRegExObjC3
{
    NSPredicate * predicate;

    predicate = [NSPredicate predicateWithFormat: @"SELF MATCHES %@", @"([^*|:\"<>?])+@[1-9][0-9]*[xX]$"];
    [predicate evaluateWithObject: @"123456789012345678901234567890"];
}
// ------------------------------------------------------------------------------------------------
- ( void ) testRegExObjC4
{
    NSPredicate * predicate;

    predicate = [NSPredicate predicateWithFormat: @"SELF MATCHES %@", @"([^*|:\"<>?])+@[1-9][0-9]*[xX]$"];
    [predicate evaluateWithObject: @"123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890"];
}
// ------------------------------------------------------------------------------------------------
- ( void ) testRegExC1
{
    NSString * parseString;

    parseString = @"1234567890123456789012345";
    [parseString compareByRegularExpression: @"([^*|:\"<>?]|[ ]|\\w)+@[1-9][0-9]*[xX]$"];
}
// ------------------------------------------------------------------------------------------------
- ( void ) testRegExC2
{
    NSString * parseString;

    parseString = @"1234567890123456789012345";
    [parseString compareByRegularExpression: @"([^*|:\"<>?])+@[1-9][0-9]*[xX]$"];
}
// ------------------------------------------------------------------------------------------------
- ( void ) testRegExC3
{
    NSString * parseString;

    parseString = @"123456789012345678901234567890";
    [parseString compareByRegularExpression: @"([^*|:\"<>?])+@[1-9][0-9]*[xX]$"];
}
// ------------------------------------------------------------------------------------------------
- ( void ) testRegExC4
{
    NSString * parseString;

    parseString = @"123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890";
    [parseString compareByRegularExpression: @"([^*|:\"<>?])+@[1-9][0-9]*[xX]$"];
}
// ------------------------------------------------------------------------------------------------
output report :
Test Case '-[TechDNSStringTest testRegExC1]' started.
Test Case '-[TechDNSStringTest testRegExC1]' passed (0.001 seconds).
Test Case '-[TechDNSStringTest testRegExC2]' started.
Test Case '-[TechDNSStringTest testRegExC2]' passed (0.000 seconds).
Test Case '-[TechDNSStringTest testRegExC3]' started.
Test Case '-[TechDNSStringTest testRegExC3]' passed (0.000 seconds).
Test Case '-[TechDNSStringTest testRegExC4]' started.
Test Case '-[TechDNSStringTest testRegExC4]' passed (0.000 seconds).
Test Case '-[TechDNSStringTest testRegExObjC1]' started.
Test Case '-[TechDNSStringTest testRegExObjC1]' passed (17.567 seconds).
Test Case '-[TechDNSStringTest testRegExObjC2]' started.
Test Case '-[TechDNSStringTest testRegExObjC2]' passed (0.001 seconds).
Test Case '-[TechDNSStringTest testRegExObjC3]' started.
Test Case '-[TechDNSStringTest testRegExObjC3]' passed (0.001 seconds).
Test Case '-[TechDNSStringTest testRegExObjC4]' started.
Test Case '-[TechDNSStringTest testRegExObjC4]' passed (0.001 seconds).
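As the comparison shows, when the regular expression has not been tuned properly, Objective-C takes far longer than C: testRegExObjC1 needed 17.567 seconds, while every C test finished in 0.001 seconds or less.
( partly because the checks in the C version were also simplified to some degree )
Code that rewrites the regular expression check in C :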
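Why the removed rules were duplicates: the negated class `[^*|:"<>?]` already accepts a space and every word character, so `|[ ]|\w` adds no new matches. It only gives a backtracking matcher more ways to split the same input, which is the likely cause of the 17.567 seconds. Below is a minimal C sketch that checks both patterns give the same verdict. The names `matches_pattern` and `patterns_agree` are hypothetical, and `[[:alnum:]_]` is used as a stand-in for `\w`, which POSIX ERE does not define.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Pattern with the redundant alternatives; [[:alnum:]_] stands in for \w. */
const char *const REDUNDANT =
    "([^*|:\"<>?]|[ ]|[[:alnum:]_])+@[1-9][0-9]*[xX]$";

/* Pattern after the redundant alternatives are removed. */
const char *const SIMPLIFIED =
    "([^*|:\"<>?])+@[1-9][0-9]*[xX]$";

/* Hypothetical helper: true if `pattern` (POSIX ERE) matches somewhere in `text`. */
bool matches_pattern(const char *pattern, const char *text)
{
    regex_t re;
    int     rc;

    /* REG_NOSUB: only a yes/no answer is needed, no match offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;

    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}

/* True if both patterns reach the same verdict for `text`. */
bool patterns_agree(const char *text)
{
    return matches_pattern(REDUNDANT, text) == matches_pattern(SIMPLIFIED, text);
}
```

For example, `patterns_agree("photo name@2x")` and `patterns_agree("1234567890123456789012345")` should both hold: the first input matches under either pattern, the second matches under neither.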
// ------------------------------------------------------------------------------------------------
// Requires #include <regex.h>.
- ( BOOL ) compareByRegularExpression:(NSString *)regularExpression
{
    // Previous NSPredicate-based implementation, kept for reference:
    //  NSParameterAssert( regularExpression );
    //
    //  NSPredicate * predicate;
    //
    //  predicate = [NSPredicate predicateWithFormat: @"SELF MATCHES %@", regularExpression];
    //  NSParameterAssert( predicate );
    //  return [predicate evaluateWithObject: self];

    NSParameterAssert( regularExpression );

    regex_t     regular;
    int         result;
    regmatch_t  matches[1];
    char        errorMsg[BUFSIZ];

    memset( &matches, 0, sizeof(matches) );
    memset( &errorMsg, 0, sizeof( errorMsg ) );

    // compile the pattern as a POSIX extended regular expression.
    result = regcomp( &regular, [regularExpression cStringUsingEncoding: NSASCIIStringEncoding], REG_EXTENDED );
    if ( 0 != result )
    {
        // compile failed: the contents of regular are undefined, so it is not passed to regfree.
        regerror( result, &regular, errorMsg, sizeof( errorMsg ) );
        return NO;
    }

    // a single regexec call; only the overall match ( matches[0] ) is needed.
    result = regexec( &regular, [self cStringUsingEncoding: NSASCIIStringEncoding], 1, matches, 0 );
    if ( REG_NOMATCH == result )
    {
        regerror( result, &regular, errorMsg, sizeof( errorMsg ) );
        regfree( &regular );
        return NO;
    }

    // the match must cover every character of the string.
    if ( 0 != matches[0].rm_so )                        // check start character.
    {
        regfree( &regular );
        return NO;
    }

    if ( ( [self length] ) != matches[0].rm_eo )        // check match length equal or not.
    {
        regfree( &regular );
        return NO;
    }

    regfree( &regular );
    return YES;
}
// ------------------------------------------------------------------------------------------------
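※ About this part:
regmatch_t matches[1];
// ...
result = regexec( &regular, [self cStringUsingEncoding: NSASCIIStringEncoding], 1, matches, 0 );
This method only decides whether the string matches the regular expression, so the whole check needs to run just once; there is no need to call regexec repeatedly.
PS: After I changed the character set used in the method to UTF8, the C processing time went up slightly, by 0.001 seconds XD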
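The single-call, whole-string check can also be sketched in plain C. This is a minimal sketch under stated assumptions, not the method above: the name `full_match` is hypothetical, and it compares `rm_eo` with `strlen` of the same byte string. That choice is an assumption that keeps both sides in bytes when the text is UTF-8, where a character count and a byte offset can differ.

```c
#include <regex.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true only if `pattern` (POSIX ERE) matches all of `text`. */
bool full_match(const char *pattern, const char *text)
{
    regex_t    re;
    regmatch_t m[1];
    int        rc;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return false;               /* invalid pattern */

    /* One regexec call is enough: only the overall match m[0] is inspected. */
    rc = regexec(&re, text, 1, m, 0);
    regfree(&re);
    if (rc != 0)
        return false;               /* REG_NOMATCH or another error */

    /* rm_so and rm_eo are byte offsets, so compare against the byte length. */
    return m[0].rm_so == 0 && (size_t)m[0].rm_eo == strlen(text);
}
```

With this helper, `full_match("[0-9]+", "123")` is true, while `"abc123"` and `"123abc"` are rejected because the match does not start at offset 0 or does not reach the end. Wrapping the pattern as `^(...)$` and passing `REG_NOSUB` would be another way to get the same yes/no answer without inspecting offsets.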